WO2009088442A2 - Isomerases and epimerases and methods of using - Google Patents

Publication number
WO2009088442A2
Authority
WO
WIPO (PCT)
Prior art keywords
seq
tryptophan
polypeptide
racemase
activity
Prior art date
Application number
PCT/US2008/013968
Other languages
French (fr)
Other versions
WO2009088442A3 (en)
Inventor
Mervyn L. De Souza
Wei Niu
Jose M. Laplaza
Christopher Solheid
Sherry R. Kollmann
Joshua M. Lundorff
Paula M. Hicks
Sara C. Mcfarlan
Fernando A. Sanchez-Riera
David P. Weiner
Ellen Burke
Peter Luginbuhl
Analia Bueno
Joslin Cuenca
Original Assignee
Cargill, Incorporated
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Cargill, Incorporated
Publication of WO2009088442A2
Publication of WO2009088442A3
Priority to US12/828,714 (US20110045547A1)

Classifications

    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12P FERMENTATION OR ENZYME-USING PROCESSES TO SYNTHESISE A DESIRED CHEMICAL COMPOUND OR COMPOSITION OR TO SEPARATE OPTICAL ISOMERS FROM A RACEMIC MIXTURE
    • C12P13/00 Preparation of nitrogen-containing organic compounds
    • C12P13/04 Alpha- or beta- amino acids
    • C12P13/22 Tryptophan; Tyrosine; Phenylalanine; 3,4-Dihydroxyphenylalanine
    • C12P13/227 Tryptophan

Definitions

  • This invention relates to nucleic acids and polypeptides, and more particularly to nucleic acids and polypeptides encoding isomerases (e.g., racemases) and epimerases as well as methods of using such isomerases and epimerases.
  • Isomerases such as racemases as well as epimerases can catalyze the interconversion of substrate enantiomers. Isomerases and epimerases can catalyze the stereochemical inversion around an asymmetric carbon atom in a substrate having one or more centers of asymmetry.
  • This disclosure provides for a number of different isomerase (e.g., racemase) and epimerase polypeptides and the nucleic acids encoding such isomerase and epimerase polypeptides. This disclosure also provides for methods of using such isomerase and epimerase nucleic acids and polypeptides.
  • the invention provides methods of isomerizing a substrate.
  • one or more L-amino acids can be converted to the corresponding one or more D-amino acids (or, alternatively, one or more D-amino acids to the corresponding one or more L-amino acids).
  • Such methods generally include combining one or more L-amino acids (or one or more D-amino acids) with a) one or more nucleic acid molecules chosen from the group consisting of SEQ ID NOs: 1, 3, 5, 7, 9, 11, 13, 15, 17, 19, 21, 23, 25, 27, 29, 31, 33, 35, 37, 39, 41, 43, 45, 47, 49, 51, 53, 55, 57, 59, 61, 63, 65, 67, 69, 71, 73, 75, 77, 79, 81, 83, 85, 87, 89, 91, 93, 95, 97, 99, 101, 103, 105, 107, 109, 111,
  • the one or more nucleic acid molecules are chosen from the group consisting of SEQ ID NOs: 1, 3, 5, 7, 9, 11, 13, 15, 17, 19, 21, 23, 25, 27, 29, 31, 33, 35, 37, 39, 41, 43, 45, 47, 49, 51, 53, 55, 57, 59, 61, 63, 65, 67, 69, 71, 73, 75, 77, 79, 81, 83, 85, 87, 89, 91, 93, 95, 97, 99, 101, 103, 105, 107, 109, 111, 113, 115, 117, 119, 121, 123, 125, 127, 129, 131, 133, 135, 137, 139, 141, 143, 145, 147, 149, 151, 153, 155, 157, 159, 161, 163, 165, 167, 169, 171, 173, 175, 177, 179, 181, 183
  • the one or more polypeptides are chosen from the group consisting of SEQ ID NOs: 2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24, 26, 28, 30, 32, 34, 36, 38, 40, 42, 44, 46, 48, 50, 52, 54, 56, 58, 60, 62, 64, 66, 68, 70, 72, 74, 76, 78, 80, 82, 84, 86, 88, 90, 92, 94, 96, 98, 100, 102, 104, 106, 108, 110, 112, 114, 116, 118, 120, 122, 124, 126, 128, 130, 132, 134, 136, 138, 140, 142, 144, 146, 148, 150, 152, 154, 156, 158, 160, 162, 164, 166, 168, 170, 172, 174, 176, 178, 180, 182, 184, 186,
  • the nucleic acid molecule has the sequence shown in SEQ ID NO:411 and the polypeptide has the sequence shown in SEQ ID NO:412.
  • the variant is a nucleic acid molecule that has at least 98% (e.g., at least 99%) sequence identity to SEQ ID NOs: 1, 3, 5, 7, 9, 11, 13, 15, 17, 19, 21, 23, 25, 27, 29, 31, 33, 35, 37, 39, 41, 43, 45, 47, 49, 51, 53, 55, 57, 59, 61, 63, 65, 67, 69, 71, 73, 75, 77, 79, 81, 83, 85, 87, 89, 91, 93, 95, 97, 99, 101, 103, 105, 107, 109, 111, 113, 115, 117, 119, 121, 123, 125, 127, 129, 131, 133, 135, 137, 139,
  • the variant is a polypeptide that has at least 98% (e.g., at least 99%) sequence identity to SEQ ID NOs: 2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24, 26, 28, 30, 32, 34, 36, 38, 40, 42, 44, 46, 48, 50, 52, 54, 56, 58, 60, 62, 64, 66, 68, 70, 72, 74, 76, 78, 80, 82, 84, 86, 88, 90, 92, 94, 96, 98, 100, 102, 104, 106, 108, 110, 112, 114, 116, 118, 120, 122, 124, 126, 128, 130, 132, 134, 136, 138, 140, 142, 144, 146, 148, 150, 152, 154, 156, 158, 160, 162, 164, 166, 168, 170, 172, 174, 176, 178, 180, 182,
  • the variant is a nucleic acid that has at least 45% (e.g., at least 50%, at least 55%, at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or at least 99%) sequence identity to SEQ ID NO:411.
  • the variant is a polypeptide that has at least 25% (e.g., at least 30%, at least 35%, at least 40%, at least 45%, at least 50%, at least 55%, at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or at least 99%) sequence identity to SEQ ID NO:412.
  • the variant is a mutant.
  • a representative mutant has a mutation at the residue that aligns with residue 76 of A. caviae BAR.
  • the variant is a nucleic acid molecule that has been codon optimized.
  • the variant polypeptide is a chimeric polypeptide.
  • the nucleic acid molecule is contained within an expression vector and, for example, can be overexpressed.
  • the isomerase or epimerase polypeptide lacks a signal sequence or a prepro domain.
  • the isomerase or epimerase polypeptide is immobilized on a solid support.
  • the polypeptide fragment is a PFAM domain.
  • Representative polypeptide fragments that include a PFAM domain have the sequence shown in SEQ ID NO:426, 440, or 462.
  • the amino acid is tryptophan. In other embodiments, the amino acid is alanine. In some embodiments, the amino acid is a substituted amino acid. In another aspect, the invention provides for methods of converting L-tryptophan to D-tryptophan (or, alternatively, D-tryptophan to L-tryptophan).
  • Such methods typically include combining L-tryptophan (or D-tryptophan) with a) one or more nucleic acid molecules chosen from the group consisting of SEQ ID NOs: 1, 3, 5, 7, 9, 11, 13, 15, 17, 19, 21, 23, 25, 27, 29, 31, 33, 35, 37, 39, 41, 43, 45, 47, 49, 51, 53, 55, 57, 59, 61, 63, 65, 67, 69, 71, 73, 75, 77, 79, 81, 83, 85, 87, 89, 91, 93, 95, 97, 99, 101, 103, 105, 107, 109, 111, 113, 115, 117, 119, 121, 123, 125, 127, 129, 131, 133, 135, 137, 139, 141, 143, 145, 147, 149, 151, 153, 155, 157, 159, 161, 163, 165, 167, 169, 171, 173,
  • Representative polypeptides include, without limitation, SEQ ID NOs: 172, 178, 180, 182, 184, 140, 144, 188, 190, 112, 148, 156, 120, 162, and 108.
  • Representative polypeptides also include, without limitation, SEQ ID NOs: 136, 174, 138, 296, and 110.
  • Representative polypeptides further include, without limitation, SEQ ID NOs: 150, 192, 152, 118, 194, 154, 196, 158, 160, and 116.
  • Representative polypeptides include, without limitation, SEQ ID NOs:248, 236, 246, 252, 250, 254, and 244.
  • Representative polypeptides include, without limitation, SEQ ID NOs:274, 234, 220, 222, 226, 232, 240, 242, 258, 260, 264, 266, 286, 290, 170, 216, and 288.
  • Representative polypeptides include, without limitation, SEQ ID NOs:208, 210, 228, 230, 270, 272, 278, 280, 282, 284, 292, 198, 212, 214, 114, and 218.
  • Representative polypeptides include, without limitation, SEQ ID NOs:204 and 218.
  • the invention provides methods of converting L-tryptophan to D-tryptophan.
  • Such methods generally include combining L-tryptophan with a) a nucleic acid molecule having the sequence shown in SEQ ID NO:411, wherein the nucleic acid molecule encodes a polypeptide having racemase activity; b) a variant of a), wherein the variant encodes a polypeptide having racemase activity; c) a fragment of a) or b), wherein the fragment encodes a polypeptide having racemase activity; d) one or more polypeptides chosen from the group consisting of SEQ ID NO:411, wherein the one or more polypeptides has racemase activity; e) a variant of d), wherein the variant has racemase activity; or f) a fragment of d) or e), wherein the fragment has racemase activity.
  • the tryptophan is a substituted tryptophan.
  • a representative substituted tryptophan is a chlorinated tryptophan (e.g., 6-chloro-D-tryptophan).
  • the substituted tryptophan is a halogenated tryptophan.
  • the invention provides methods of making monatin.
  • Such methods generally include combining L-tryptophan with a) one or more nucleic acid molecules chosen from the group consisting of SEQ ID NOs: 1, 3, 5, 7, 9, 11, 13, 15, 17, 19, 21, 23, 25, 27, 29, 31, 33, 35, 37, 39, 41, 43, 45, 47, 49, 51, 53, 55, 57, 59, 61, 63, 65, 67, 69, 71, 73, 75, 77, 79, 81, 83, 85, 87, 89, 91, 93, 95, 97, 99, 101, 103, 105, 107, 109, 111, 113, 115, 117, 119, 121, 123, 125, 127, 129, 131, 133, 135, 137, 139, 141, 143, 145, 147, 149, 151, 153, 155,
  • such methods further include adding one or more polypeptides having synthase / lyase (EC 4.1.3.- or EC 4.1.2.-) activity or a nucleic acid encoding such a polypeptide and/or one or more polypeptides having D-aminotransferase activity or a nucleic acid encoding such a polypeptide.
  • the monatin is predominantly R,R monatin.
  • the nucleic acid has the sequence shown in SEQ ID NO:411 and the polypeptide has the sequence shown in SEQ ID NO:412.
  • nucleic acid molecules encoding polypeptides having isomerase activity or epimerase activity
  • Isomerases such as racemases are provided herein that catalyze the racemization of a specific amino acid (e.g., tryptophan) or that catalyze the racemization of more than one amino acid (e.g., broad substrate racemases).
  • the nucleic acids or polypeptides disclosed herein can be used, for example, to convert L-tryptophan to D-tryptophan.
  • the present invention is based, in part, on the identification of nucleic acid molecules encoding polypeptides having isomerase activity, herein referred to as "isomerase" or "racemase" nucleic acid molecules or polypeptides, where appropriate.
  • the present invention also is based, in part, on the identification of nucleic acid molecules encoding polypeptides having epimerase activity, herein referred to as "epimerase" nucleic acid molecules or polypeptides, where appropriate.
  • nucleic acid molecules described herein include the sequences shown in SEQ ID NOs: 1, 3, 5, 7, 9, 11, 13, 15, 17, 19, 21, 23, 25, 27, 29, 31, 33, 35, 37, 39, 41, 43, 45, 47, 49,
  • a nucleic acid molecule can include DNA molecules and RNA molecules, as well as analogs of DNA or RNA generated using nucleotide analogs.
  • a nucleic acid molecule of the invention can be single-stranded or double-stranded, depending upon its intended use.
  • Nucleic acid molecules of the invention include molecules that have at least, for example, 75% sequence identity (e.g., at least 80%, 85%, 90%, 95%, or 99% sequence identity) to any of SEQ ID NOs: 1, 3, 5, 7, 9, 11, 13, 15, 17, 19, 21, 23, 25, 27, 29, 31, 33, 35, 37, 39, 41, 43, 45, 47, 49, 51, 53, 55, 57, 59, 61, 63, 65, 67, 69, 71, 73, 75, 77, 79, 81, 83, 85, 87, 89, 91, 93, 95, 97, 99, 101, 103, 105, 107, 109, 111, 113, 115, 117, 119, 121, 123, 125, 127, 129, 131, 133, 135, 137, 139, 141, 143, 145, 147, 149, 151, 153, 155, 157, 159, 161, 163, 165, 167, 169
  • the number of identical matches is divided by the length of the aligned region (i.e., the number of aligned nucleotides or amino acid residues) and multiplied by 100 to arrive at a percent sequence identity value.
  • the length of the aligned region can be a portion of one or both sequences up to the full-length size of the shortest sequence. It will be appreciated that a single sequence can align differently with other sequences and hence, can have different percent sequence identity values over each aligned region. It is noted that the percent identity value is usually rounded to the nearest integer.
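The percent identity calculation described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the two inputs are taken to be an already-aligned region (for example, from a BLAST alignment), the function name `percent_identity` is hypothetical, and columns in which either sequence has a gap are assumed not to count toward the aligned length, which the text does not specify.

```python
import math


def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over an aligned region, rounded to the nearest integer.

    Assumption: columns where either sequence has a gap are excluded from
    the aligned length; the source text does not say how gaps are treated.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    # Keep only columns in which both sequences have a residue.
    pairs = [
        (x, y)
        for x, y in zip(aligned_a.upper(), aligned_b.upper())
        if x != gap and y != gap
    ]
    if not pairs:
        raise ValueError("no aligned residues")
    matches = sum(1 for x, y in pairs if x == y)
    # identical matches / aligned length * 100, rounded half up
    return math.floor(100 * matches / len(pairs) + 0.5)


if __name__ == "__main__":
    # Toy sequences, not taken from the sequence listing.
    print(percent_identity("ACGTACGTAC", "ACGTTCGTAC"))
```

Because the aligned region can cover only part of either sequence, the same pair of sequences can yield different values depending on which region is passed in.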
  • BLAST searches can be performed to determine percent sequence identity between a DAT nucleic acid described herein and any other sequence or portion thereof aligned using the Altschul et al. algorithm.
  • BLASTN is the program used to align and compare the identity between nucleic acid sequences.
  • BLASTP is the program used to align and compare the identity between amino acid sequences.
  • Nucleic acid molecules of the invention, for example, those between about 10 and about 50 nucleotides in length, can be used, under standard amplification conditions, to amplify an isomerase or epimerase nucleic acid molecule.
  • Amplification of an isomerase or epimerase nucleic acid can be for the purpose of detecting the presence or absence of an isomerase or epimerase nucleic acid molecule or for the purpose of obtaining (e.g., cloning) an isomerase or epimerase nucleic acid molecule.
  • standard amplification conditions refer to the basic components of an amplification reaction mix, and cycling conditions that include multiple cycles of denaturing the template nucleic acid, annealing the oligonucleotide primers to the template nucleic acid, and extension of the primers by the polymerase to produce an amplification product (see, for example, U.S. Patent Nos.
  • the basic components of an amplification reaction mix generally include, for example, each of the four deoxynucleoside triphosphates (e.g., dATP, dCTP, dTTP, and dGTP, or analogs thereof), oligonucleotide primers, template nucleic acid, and a polymerase enzyme.
  • Template nucleic acid is typically denatured at a temperature of at least about 90°C.
  • extension from primers is typically performed at a temperature of at least about 72°C.
  • PCR polymerase chain reaction
  • LCR ligation chain reaction
  • the annealing temperature can be used to control the specificity of amplification.
  • the temperature at which primers anneal to template nucleic acid must be below the Tm of each of the primers, but high enough to avoid non-specific annealing of primers to the template nucleic acid.
  • the Tm is the temperature at which half of the DNA duplexes have separated into single strands, and can be predicted for an oligonucleotide primer using the formula provided in section 11.46 of Sambrook et al. (1989, Molecular Cloning: A Laboratory Manual, 2nd Ed., Cold Spring Harbor Laboratory Press, Cold Spring Harbor, New York). Non-specific amplification products are detected as bands on a gel that are not the size expected for the correct amplification product.
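As a rough illustration of choosing an annealing temperature below the Tm of each primer, the sketch below estimates Tm with the Wallace rule (2 °C per A/T plus 4 °C per G/C), a common rule of thumb for short oligonucleotides. This is a swapped-in approximation, not the formula from Sambrook et al. section 11.46, which is not reproduced in this text. The function names, the 5 °C offset, and the primer sequences are all assumptions made for the example.

```python
def wallace_tm(primer):
    """Estimate Tm in degrees C with the Wallace rule: 2*(A+T) + 4*(G+C).

    Assumption: a rule-of-thumb estimate for short primers; it is not the
    formula referenced in Sambrook et al.
    """
    seq = primer.upper()
    if not seq or set(seq) - set("ACGT"):
        raise ValueError("primer must be a non-empty string of A, C, G, T")
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


def annealing_temperature(forward, reverse, offset_c=5):
    """Pick an annealing temperature below the lower of the two primer Tm values.

    The 5 degree offset is an illustrative default, not a value from the source.
    """
    return min(wallace_tm(forward), wallace_tm(reverse)) - offset_c


if __name__ == "__main__":
    # Hypothetical primers, not taken from SEQ ID NOs:503-548.
    fwd = "ATGCATGCATGCATGCATGC"
    rev = "ATATATATGCGCGCGC"
    print(wallace_tm(fwd), wallace_tm(rev), annealing_temperature(fwd, rev))
```

Raising the annealing temperature toward the lower Tm increases specificity; lowering it tolerates more mismatch at the cost of non-specific products.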
  • Nucleic acid molecules of the invention, for example, those between about 10 and several hundred nucleotides in length (up to several thousand nucleotides in length), can be used, under standard hybridization conditions, to hybridize to an isomerase or epimerase nucleic acid molecule.
  • Hybridization to an isomerase or epimerase nucleic acid molecule can be for the purpose of detecting or obtaining an isomerase or epimerase nucleic acid molecule.
  • standard hybridization conditions between nucleic acid molecules are discussed in detail in Sambrook et al. (1989, Molecular Cloning: A Laboratory Manual, 2nd Ed., Cold Spring Harbor Laboratory Press, Cold Spring Harbor, NY; Sections 7.37-7.57, 9.47-9.57, 11.7-11.8, and 11.45-11.57).
  • For oligonucleotide probes less than about 100 nucleotides, Sambrook et al. discloses suitable Southern blot conditions in Sections 11.45-11.46.
  • the Tm between a sequence that is less than 100 nucleotides in length and a second sequence can be calculated using the formula provided in Section 11.46.
  • Sambrook et al. additionally discloses prehybridization and hybridization conditions for a Southern blot that uses oligonucleotide probes greater than about 100 nucleotides (see Sections 9.47-9.52). Hybridizations with an oligonucleotide greater than 100 nucleotides generally are performed 15-25°C below the Tm.
  • the Tm between a sequence greater than 100 nucleotides in length and a second sequence can be calculated using the formula provided in Sections 9.50-9.51 of Sambrook et al. Additionally, Sambrook et al. recommends the conditions indicated in Section 9.54 for washing a Southern blot that has been probed with an oligonucleotide greater than about 100 nucleotides.
  • the conditions under which membranes containing nucleic acids are prehybridized and hybridized, as well as the conditions under which membranes containing nucleic acids are washed to remove excess and non-specifically bound probe can play a significant role in the stringency of the hybridization.
  • hybridization and washing may be carried out under conditions of low stringency, moderate stringency or high stringency. Such conditions are described, for example, in Sambrook et al., Sections 11.45-11.46.
  • the conditions used to achieve a particular level of stringency will vary, depending on the nature of the nucleic acids being hybridized. For example, the length, degree of complementarity, nucleotide sequence composition (e.g., G/C vs. A/T nucleotide content) and nucleic acid type (e.g., RNA vs. DNA) of the hybridizing regions of the nucleic acids can be considered in selecting hybridization conditions.
  • washing conditions can be made more stringent by decreasing the salt concentration in the wash solutions and/or by increasing the temperature at which the washes are performed.
  • the amount of hybridization can be quantitated directly on a membrane or from an autoradiograph using, for example, a PhosphorImager or a Densitometer (Molecular Dynamics, Sunnyvale, CA). It is understood by those of skill in the art that interpreting the amount of hybridization can be affected by, for example, the specific activity of the labeled oligonucleotide probe, the number of probe-binding sites on the target nucleic acid, and the amount of exposure of an autoradiograph or other detection medium.
  • Although any number of hybridization, washing and detection conditions can be used to examine hybridization of a probe nucleic acid molecule to immobilized target nucleic acids, it is more important to examine hybridization of a probe to target nucleic acids under identical hybridization, washing, and detection conditions.
  • the target nucleic acids are on the same membrane.
  • appropriate positive and negative controls should be performed with every set of amplification or hybridization reactions to avoid uncertainties related to contamination and/or non-specific annealing of oligonucleotide primers or probes.
  • Oligonucleotide primers or probes specifically anneal or hybridize to one or more isomerase or epimerase nucleic acids.
  • a pair of oligonucleotide primers generally anneal to opposite strands of the template nucleic acid, and should be an appropriate distance from one another such that the polymerase can effectively polymerize across the region and such that the amplification product can be readily detected using, for example, electrophoresis.
  • Oligonucleotide primers or probes can be designed using, for example, a computer program such as OLIGO (Molecular Biology Insights Inc., Cascade, CO) to assist in designing oligonucleotides.
  • oligonucleotide primers are 10 to 30 or 40 or 50 nucleotides in length (e.g., 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, or 50 nucleotides in length), but can be longer or shorter if appropriate amplification conditions are used.
  • Non-limiting representative pairs of oligonucleotide primers that were used to amplify isomerase nucleic acid molecules are shown in Tables 16, 26, 35, 37 and 38 (e.g., SEQ ID NOs:503-515, 517-543, and 545-548).
  • the sequences shown in SEQ ID NOs:503-515, 517-543, and 545-548 are non-limiting examples of oligonucleotide primers that can be used to amplify isomerase nucleic acid molecules.
  • Oligonucleotides in accordance with the invention can be obtained by restriction enzyme digestion of an isomerase or epimerase nucleic acid molecule or can be prepared by standard chemical synthesis and other known techniques.
  • an "isolated" nucleic acid molecule is a nucleic acid molecule that is separated from other nucleic acid molecules that are usually associated with the isolated nucleic acid molecule.
  • an "isolated" nucleic acid molecule includes, without limitation, a nucleic acid molecule that is free of sequences that naturally flank one or both ends of the nucleic acid in the genome of the organism from which the isolated nucleic acid is derived (e.g., a cDNA or genomic DNA fragment produced by PCR or restriction endonuclease digestion).
  • an isolated nucleic acid molecule is generally introduced into a vector (e.g., a cloning vector, or an expression vector) for convenience of manipulation or to generate a fusion nucleic acid molecule.
  • an isolated nucleic acid molecule can include an engineered nucleic acid molecule such as a recombinant or a synthetic nucleic acid molecule.
  • isolated nucleic acid molecules described herein having isomerase or epimerase activity can be obtained using techniques routine in the art, many of which are described in the Examples herein.
  • isolated nucleic acids within the scope of the invention can be obtained using any method including, without limitation, recombinant nucleic acid technology, the polymerase chain reaction (e.g., PCR, e.g., direct amplification or site-directed mutagenesis), and/or nucleic acid hybridization techniques (e.g., Southern blotting).
  • General PCR techniques are described, for example in PCR Primer: A Laboratory Manual, Dieffenbach & Dveksler, Eds., Cold Spring Harbor Laboratory Press, 1995.
  • Recombinant nucleic acid techniques include, for example, restriction enzyme digestion and ligation, which can be used to isolate an isomerase or epimerase nucleic acid molecule as described herein.
  • Isolated nucleic acids in accordance with the invention also can be chemically synthesized, either as a single nucleic acid molecule or as a series of oligonucleotides.
  • techniques for manipulating nucleic acids such as, e.g., subcloning, labeling probes (e.g., random-primer labeling using Klenow polymerase, nick translation, amplification), sequencing, hybridization, amplification and the like are well described in the scientific and patent literature, see, e.g., Sambrook et al., Eds., 1989, Molecular Cloning: A Laboratory Manual (2nd Ed.), Vols. 1-3, Cold Spring Harbor Laboratory; Current Protocols in Molecular Biology, 1997, Ausubel, Ed., John Wiley & Sons, Inc., New York; Laboratory Techniques in Biochemistry and Molecular Biology: Hybridization with Nucleic Acid Probes, Part I. Theory and Nucleic Acid Preparation, Tijssen, Ed., Elsevier, N.Y. (1993).
  • Isomerase and epimerase polypeptides refer to polypeptides that catalyze the stereochemical inversion around an asymmetric carbon atom of a substrate.
  • purified polypeptide refers to a polypeptide that has been separated from cellular components that naturally accompany it. Typically, a polypeptide is considered "purified" when it is at least partially free from the proteins and naturally occurring molecules with which it is naturally associated.
  • the extent of enrichment or purity of an isomerase or epimerase polypeptide can be measured using any appropriate method, e.g., column chromatography, polyacrylamide gel electrophoresis, or HPLC analysis.
  • the invention also provides for isomerase and epimerase polypeptides that differ in sequence from any of SEQ ID NOs:2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24, 26, 28, 30, 32, 34, 36, 38, 40, 42, 44, 46, 48, 50, 52, 54, 56, 58, 60, 62, 64, 66, 68, 70, 72, 74, 76, 78, 80, 82, 84, 86, 88, 90, 92, 94, 96, 98, 100, 102, 104, 106, 108, 110, 112, 114, 116, 118, 120, 122, 124, 126, 128, 130, 132, 134, 136, 138, 140, 142, 144, 146, 148, 150, 152, 154, 156, 158, 160, 162, 164, 166, 168, 170, 172, 174, 176, 178, 180, 182, 184, 186, 188, 190,
  • an isomerase or epimerase polypeptide e.g., SEQ ID NOs:2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24, 26, 28, 30, 32, 34, 36, 38, 40, 42, 44, 46, 48, 50, 52, 54, 56, 58, 60, 62, 64, 66, 68, 70, 72, 74, 76, 78, 80, 82, 84, 86, 88, 90, 92, 94, 96, 98, 100, 102, 104, 106, 108, 110, 112, 114, 116, 118, 120, 122, 124, 126, 128, 130, 132, 134, 136, 138, 140, 142, 144, 146, 148, 150, 152, 154, 156, 158, 160, 162, 164, 166, 168, 170, 172, 174, 176, 178, 180, 182, 184,
  • changes can be introduced into an isomerase or epimerase nucleic acid coding sequence that lead to conservative and/or non-conservative amino acid substitutions at one or more amino acid residues in the encoded isomerase or epimerase polypeptide.
  • Polypeptides that differ in sequence from the amino acid sequences shown in SEQ ID NOs:2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24, 26, 28, 30, 32, 34, 36, 38, 40, 42, 44, 46, 48, 50, 52, 54, 56, 58, 60, 62, 64, 66, 68, 70, 72, 74, 76, 78, 80, 82, 84, 86, 88, 90, 92, 94, 96, 98, 100, 102, 104, 106, 108, 110, 112, 114, 116, 118, 120, 122, 124, 126, 128, 130, 132, 134, 136, 138, 140, 142, 144, 146, 148, 150, 152, 154, 156, 158, 160, 162, 164, 166, 168, 170, 172, 174, 176, 178, 180, 182, 184, 186, 188, 190, 192, 194, 196
  • changes in the polypeptide sequence can be introduced randomly along all or part of the isomerase or epimerase polypeptide, such as by saturation mutagenesis of the corresponding nucleic acid molecule.
  • changes can be introduced into a nucleic acid or polypeptide sequence by chemically synthesizing a nucleic acid molecule or polypeptide having such changes.
  • a "conservative amino acid substitution" is one in which one amino acid residue is replaced with a different amino acid residue having a similar side chain. Similarity between amino acid residues has been assessed in the art. For example, Dayhoff et al. (1978, in Atlas of Protein Sequence and Structure, 5(Suppl. 3):345-352) provides frequency tables for amino acid substitutions that can be employed as a measure of amino acid similarity.
  • conservative substitutions include, without limitation, replacement of an aliphatic amino acid with another aliphatic amino acid; replacement of a serine with a threonine or vice versa; replacement of an acidic residue with another acidic residue; replacement of a residue bearing an amide group with another residue bearing an amide group; exchange of a basic residue with another basic residue; or replacement of an aromatic residue with another aromatic residue.
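The substitution categories listed above can be expressed as a simple lookup. The sketch below is illustrative only: the assignment of one-letter residue codes to each category follows common textbook groupings and is an assumption (the text names the categories but not their members), and the name `is_conservative` is hypothetical. A fuller treatment would use substitution frequency tables such as those of Dayhoff et al.

```python
# Side-chain groups named in the text; membership is an assumed textbook grouping.
SIDE_CHAIN_GROUPS = {
    "aliphatic": set("AVLI"),
    "hydroxyl (Ser/Thr)": set("ST"),
    "acidic": set("DE"),
    "amide": set("NQ"),
    "basic": set("KRH"),
    "aromatic": set("FYW"),
}


def is_conservative(original, replacement):
    """Return True if both residues fall in the same side-chain group."""
    original, replacement = original.upper(), replacement.upper()
    if original == replacement:
        return False  # identical residues are not a substitution
    return any(
        original in group and replacement in group
        for group in SIDE_CHAIN_GROUPS.values()
    )


if __name__ == "__main__":
    print(is_conservative("S", "T"))  # serine to threonine
    print(is_conservative("D", "N"))  # acidic to amide
```

Under this grouping, an Asp-to-Asn change counts as non-conservative, because it moves from the acidic group to the amide group.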
  • a non-conservative substitution is one in which an amino acid residue is replaced with an amino acid residue that does not have a similar side chain.
  • Changes in a nucleic acid can be introduced using one or more mutagens.
  • Mutagens include, without limitation, ultraviolet light, gamma irradiation, or chemical mutagens (e.g., mitomycin, nitrous acid, photoactivated psoralens, sodium bisulfite, hydroxylamine, hydrazine or formic acid).
  • Other mutagens are analogues of nucleotide precursors, e.g., nitrosoguanidine, 5-bromouracil, 2-aminopurine, or acridine.
  • Intercalating agents such as proflavine, acriflavine, quinacrine and the like can also be used.
  • Changes also can be introduced into an isomerase or epimerase nucleic acid and/or polypeptide by methods such as error-prone PCR, shuffling, oligonucleotide-directed mutagenesis, assembly PCR, sexual PCR mutagenesis, in vivo mutagenesis, cassette mutagenesis, recursive ensemble mutagenesis, exponential ensemble mutagenesis, gene reassembly (e.g., GeneReassembly, see, e.g., U.S. Patent No. 6,537,776), Gene Site Saturation Mutagenesis (GSSM), synthetic ligation reassembly (SLR), or a combination thereof.
  • Changes also can be introduced into polypeptides by methods such as recombination, recursive sequence recombination, phosphothioate-modified DNA mutagenesis, uracil-containing template mutagenesis, gapped duplex mutagenesis, point mismatch repair mutagenesis, repair-deficient host strain mutagenesis, chemical mutagenesis, radiogenic mutagenesis, deletion mutagenesis, restriction-selection mutagenesis, restriction-purification mutagenesis, artificial gene synthesis, ensemble mutagenesis, chimeric nucleic acid multimer creation, or any combination thereof.
  • An isomerase or epimerase nucleic acid can be codon optimized if so desired.
  • a non-preferred or a less preferred codon can be identified and replaced with a preferred or neutrally used codon encoding the same amino acid as the replaced codon.
  • a preferred codon is a codon over-represented in coding sequences in genes in the host cell and a non-preferred or less preferred codon is a codon under-represented in coding sequences in genes in the host cell, thereby modifying the nucleic acid to increase its expression in a host cell.
  • An isomerase or epimerase nucleic acid can be optimized for particular codon usage from any host cell (e.g., any of the host cells described herein). See, for example, U.S. Patent No. 5,795,737 for a representative description of codon optimization.
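The codon replacement step described above can be sketched as a codon-by-codon substitution. The mapping below is a small hypothetical table written for this example and does not reflect measured codon usage for any host; in practice the table would be derived from the coding sequences of the chosen host cell. Each entry maps a codon to a synonymous codon, so the encoded polypeptide is unchanged. The function name `codon_optimize` is also an assumption.

```python
# Hypothetical "less preferred -> preferred" synonymous codon map (illustration only).
PREFERRED_CODON = {
    "CTA": "CTG",  # Leu
    "AGA": "CGT",  # Arg
    "AGG": "CGT",  # Arg
    "ATA": "ATC",  # Ile
    "CCC": "CCG",  # Pro
}


def codon_optimize(coding_sequence, preferred=PREFERRED_CODON):
    """Replace each less-preferred codon with its preferred synonym."""
    seq = coding_sequence.upper()
    if len(seq) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    codons = [seq[i:i + 3] for i in range(0, len(seq), 3)]
    # Codons absent from the table are left as they are.
    return "".join(preferred.get(codon, codon) for codon in codons)


if __name__ == "__main__":
    # Toy open reading frame: Met-Leu-Arg-Ile-Pro-stop.
    print(codon_optimize("ATGCTAAGAATACCCTAA"))
```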
  • nucleic acid can undergo directed evolution. See, for example, U.S. Patent No. 6,361,974.
  • Other changes also are within the scope of this disclosure.
  • one, two, three, four or more amino acids can be removed from the carboxy- and/or amino- terminal ends of an isomerase or epimerase polypeptide without significantly altering the biological activity.
  • one or more amino acids can be changed to increase or decrease the pI of a polypeptide.
  • a residue can be changed to, for example, a glutamate.
  • chimeric isomerase or epimerase polypeptides are also provided.
  • a chimeric isomerase or epimerase polypeptide can include portions of different binding or catalytic domains. Methods of recombining different domains from different polypeptides and screening the resultant chimerics to find the best combination for a particular application or substrate are routine in the art.
  • One particular change in sequence exemplified herein involves the residue corresponding to residue 76 in A. caviae BAR.
  • the polypeptide sequence of SEQ ID NO:441 was aligned with A. caviae BAR and the residue in SEQ ID NO:441 that aligns with position 76 in A. caviae BAR was identified (residue 56) and changed from Asp to Asn (SEQ ID NO:441 with D56N).
  • Those of skill in the art can readily identify the residue that corresponds to residue 76 of A. caviae BAR in any of the racemases disclosed herein and introduce a change into the polypeptide sequence at that particular residue.
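Finding the residue that corresponds to a reference position can be sketched as a walk along a pairwise alignment. The example below is a sketch under stated assumptions: the two aligned strings are short invented sequences (not A. caviae BAR or any SEQ ID NO), and the function names are hypothetical.

```python
def map_position(aligned_ref, aligned_query, ref_pos):
    """Map a 1-based residue position in the reference onto the query.

    Both inputs are rows of one pairwise alignment (equal length, '-' = gap).
    Returns the 1-based query position, or None if the reference residue
    is aligned to a gap in the query.
    """
    if len(aligned_ref) != len(aligned_query):
        raise ValueError("aligned sequences must have equal length")
    ref_i = query_i = 0
    for r, q in zip(aligned_ref, aligned_query):
        if r != "-":
            ref_i += 1
        if q != "-":
            query_i += 1
        if r != "-" and ref_i == ref_pos:
            return query_i if q != "-" else None
    raise ValueError("ref_pos is beyond the end of the reference")


def substitute(seq, pos, old, new):
    """Replace the residue at 1-based `pos`, checking the expected original."""
    if seq[pos - 1] != old:
        raise ValueError(f"expected {old} at position {pos}, found {seq[pos - 1]}")
    return seq[:pos - 1] + new + seq[pos:]


# Invented toy alignment: reference residue 5 (D) aligns with query residue 3.
query_pos = map_position("MKTAD-LV", "--TADGLV", 5)
mutant = substitute("TADGLV", query_pos, "D", "N")
```

The same pattern applies to any aligned pair: the offset between reference and query numbering comes only from the gaps that precede the position of interest.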
  • the invention provides for racemase polypeptides having the sequence shown in SEQ ID NOs:108, 110, 116, 244, 288 and 218 as well as racemase sequences that differ in sequence from SEQ ID NOs:108, 110, 116, 244, 288 and 218.
  • racemase polypeptides having the sequence shown in SEQ ID NOs:172, 178, 180, 182, 184, 140, 144, 188, 190, 112, 148, 156, 120 and 162 each exhibit 96% or greater sequence identity to the racemase polypeptide having the sequence shown in SEQ ID NO:108; while SEQ ID NOs:136, 174, 138 and 296 each exhibit 97% or greater sequence identity to SEQ ID NO:110.
  • SEQ ID NOs:150, 192, 152, 118, 194, 154, 196, 158 and 160 each exhibit 97% or greater sequence identity to SEQ ID NO:116; and SEQ ID NOs:248, 236, 246, 252, 250 and 254 each exhibit 97% or greater sequence identity to SEQ ID NO:244.
  • SEQ ID NOs:274, 234, 220, 222, 226, 232, 240, 242, 258, 260, 264, 266, 286, 290, 170 and 216 each exhibit 97% or greater identity to SEQ ID NO:288;
  • SEQ ID NOs:208, 210, 228, 230, 270, 272, 278, 280, 282, 284, 292, 198, 212, 214 and 114 each exhibit 97% or greater sequence identity to SEQ ID NO:218; and
  • SEQ ID NO:204 exhibits 96% sequence identity to SEQ ID NO:218.
  • racemase polypeptides are exemplary and are not meant to be exhaustive of all possible sequence identities within or between the isomerase and epimerase nucleic acid and polypeptide sequences disclosed herein. Also as discussed herein, identifying and/or designing nucleic acid or polypeptide sequences that differ in sequence from, for example, one or more isomerase or epimerase sequences are well within the ordinary skill of those in the art.
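A percent-identity figure such as those quoted above can be computed from a pairwise alignment. The sketch below assumes one common convention (identical columns divided by columns in which neither sequence has a gap); the disclosure does not state which convention or alignment program was used, and the sequences shown are invented.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent identity over aligned columns in which neither row has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == "-" or b == "-":
            continue  # gap columns are excluded under this convention
        compared += 1
        if a == b:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


def meets_threshold(aligned_a, aligned_b, minimum_percent):
    """True if the pair reaches at least `minimum_percent` identity."""
    return percent_identity(aligned_a, aligned_b) >= minimum_percent


# Invented toy alignment: 6 identical residues out of 7 compared columns.
identity = percent_identity("MKTA-DLV", "MKSAGDLV")
```

Other conventions (for example, dividing by the full alignment length or by the shorter sequence length) give different numbers for the same alignment, so the denominator should be stated whenever a threshold is applied.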
  • racemase polypeptide having the sequence shown in SEQ ID NO:412 is novel; the closest polypeptide sequence in the public databases exhibits only 23% sequence identity to SEQ ID NO:412. Therefore, polypeptides of the invention include polypeptides that have at least, for example, 25% sequence identity (e.g., at least 30%, 35%, 40%, 45%, 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, or 99% sequence identity) to SEQ ID NO:412 or fragments thereof and that have functional racemase activity.
  • the racemase polypeptide having the sequence shown in SEQ ID NO:412 is encoded by the nucleic acid having the sequence shown in SEQ ID NO:411.
  • SEQ ID NO:411 is a novel nucleic acid, for which the closest nucleic acid sequence in the public databases exhibits only 43% sequence identity to SEQ ID NO:411. Therefore, nucleic acid molecules of the invention include molecules that have at least, for example, 45% sequence identity (e.g., at least 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, or 99% sequence identity) to SEQ ID NO:411 or fragments thereof and that encode a polypeptide that has functional racemase activity.
  • a fragment of an isomerase and epimerase nucleic acid or polypeptide refers to a portion of a full-length isomerase and epimerase nucleic acid or polypeptide.
  • “functional fragments” are those fragments of an isomerase or epimerase polypeptide that retain the respective enzymatic activity.
  • “Functional fragments” also refer to fragments of an isomerase or epimerase nucleic acid that encode a polypeptide that retains the respective enzymatic activity.
  • functional fragments can be used in in vitro or in vivo reactions to catalyze transaminase or oxidation-reduction reactions, respectively.
  • PFAM domain from racemase polypeptides (PF01168 and PF00842; Finn et al., 2006, Nuc. Acids Res., Database Issue, 34:D247-D251).
  • the PFAM domain is a fragment of a full-length racemase polypeptide that lacks about 30 to about 40 amino acids from the N-terminus of the polypeptide and also lacks about 10 to about 20 amino acids from the C-terminus of the polypeptide.
  • sequences of representative PFAM domains are shown in SEQ ID NOs:426, 440 and 462.
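As a simple illustration of the truncation described above, the sketch below removes a chosen number of residues from each terminus. The placeholder sequence and the specific counts (35 and 15) are assumptions picked from within the stated ranges; real domain boundaries would come from a PFAM search rather than fixed offsets.

```python
def trim_termini(seq, n_term, c_term):
    """Return `seq` without its first `n_term` and last `c_term` residues."""
    if n_term < 0 or c_term < 0:
        raise ValueError("trim lengths must be non-negative")
    if n_term + c_term >= len(seq):
        raise ValueError("trimming would remove the whole sequence")
    return seq[n_term:len(seq) - c_term]


# Placeholder polypeptide: 35 N-terminal residues, a 5-residue core,
# and 15 C-terminal residues (not a real racemase sequence).
full_length = "M" * 35 + "WWWWW" + "K" * 15
domain = trim_termini(full_length, n_term=35, c_term=15)
```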
  • This disclosure provides for isomerase and epimerase polypeptides (and the nucleic acids encoding such polypeptides) lacking signal sequences (e.g., signal peptides) or prepro domains, and also provides for isomerases and epimerases that include signal sequences or prepro domains.
  • the signal sequences or prepro domains can be isomerase or epimerase signal sequences or prepro domains, or such signal sequences or prepro domains can be obtained from non-isomerase, non-racemase and non-epimerase sequences.
  • Such signal sequences or prepro domains can be operably linked to an isomerase or epimerase nucleic acid or polypeptide.
  • a prepro domain can be located on the amino terminal or the carboxy terminal end of the polypeptide.
  • SignalP which can be used to identify signal peptides and cleavage sites.
  • Representative signal sequences (also known as leader sequences) for racemase polypeptides include, without limitation, MHKKTLLATLIXGLLAGQAVA (SEQ ID NO:501), wherein X is F or L, and MPFRRTLLAASLALLITGQAPLYA (SEQ ID NO:502).
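The two leader sequences above can be handled programmatically, for example to enumerate the F and L variants of SEQ ID NO:501 and to remove a matching leader from a precursor. This is a minimal sketch: the helper names are hypothetical, the "mature" tail in the example is invented, and simple prefix matching is an assumption that does not model how a signal peptidase or SignalP determines the cleavage site.

```python
LEADER_501 = "MHKKTLLATLIXGLLAGQAVA"     # SEQ ID NO:501, X is F or L
LEADER_502 = "MPFRRTLLAASLALLITGQAPLYA"  # SEQ ID NO:502


def expand_x(pattern, options="FL"):
    """Return one concrete sequence per allowed residue at position X."""
    return [pattern.replace("X", aa) for aa in options]


def strip_leader(protein, leaders):
    """Remove the first matching leader from the N-terminus, if any."""
    for leader in leaders:
        if protein.startswith(leader):
            return protein[len(leader):]
    return protein


leaders = expand_x(LEADER_501) + [LEADER_502]
# "TESTK" stands in for the mature polypeptide; it is not a real sequence.
mature = strip_leader("MHKKTLLATLILGLLAGQAVA" + "TESTK", leaders)
```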
  • Isomerase or epimerase polypeptides can be obtained (e.g., purified) from natural sources (e.g., a biological sample) by known methods such as DEAE ion exchange, gel filtration, and hydroxyapatite chromatography.
  • Natural sources include, but are not limited to, microorganisms such as bacteria and yeast.
  • a purified isomerase or epimerase polypeptide also can be obtained, for example, by cloning and expressing an isomerase or epimerase nucleic acid (e.g., SEQ ID NOs:1, 3, 5, 7, 9, 11, 13, 15, 17, 19, 21, 23, 25, 27, 29, 31, 33, 35, 37, 39, 41, 43, 45, 47, 49, 51, 53, 55, 57, 59, 61, 63, 65, 67, 69, 71, 73, 75, 77, 79, 81, 83, 85, 87, 89, 91, 93, 95, 97, 99, 101, 103, 105, 107, 109, 111, 113, 115, 117, 119, 121, 123, 125, 127, 129, 131, 133, 135, 137, 139, 141, 143, 145, 147, 149, 151, 153, 155, 157, 159, 161, 163,
  • GST glutathione S-transferase
  • pGEX Pharmacia Biotech Inc
  • pMAL New England Biolabs, Beverly, MA
  • pRIT5 Pharmacia, Piscataway, NJ
  • a purified isomerase or epimerase polypeptide can be obtained by chemical synthesis using, for example, solid-phase synthesis techniques (see e.g., Roberge, 1995, Science, 269:202; Merrifield, 1997, Methods Enzymol., 289:3-13).
  • a purified isomerase or epimerase polypeptide or a fragment thereof can be used as an immunogen to generate polyclonal or monoclonal antibodies that have specific binding affinity for one or more isomerase or epimerase polypeptides.
  • Such antibodies can be generated using standard techniques that are used routinely in the art.
  • Full-length isomerase or epimerase polypeptides or, alternatively, antigenic fragments of isomerase or epimerase polypeptides can be used as immunogens.
  • An antigenic fragment of an isomerase or epimerase polypeptide usually includes at least 8 (e.g., 10, 15, 20, or 30) amino acid residues of an isomerase or epimerase polypeptide (e.g., having the sequence shown in SEQ ID NOs:2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24, 26, 28, 30, 32, 34, 36, 38, 40, 42, 44, 46, 48, 50, 52, 54, 56, 58, 60, 62, 64, 66, 68, 70, 72, 74, 76, 78, 80, 82, 84, 86, 88, 90, 92, 94, 96, 98, 100, 102, 104, 106, 108, 110, 112, 114, 116, 118, 120, 122, 124, 126, 128, 130, 132, 134, 136, 138, 140, 142, 144, 146, 148, 150, 152, 154, 156, 158, 160, 162, 164, 166
  • Polypeptides can be detected and quantified by any method known in the art including, but not limited to, nuclear magnetic resonance (NMR), spectrophotometry, radiography (protein radiolabeling), electrophoresis, capillary electrophoresis, high performance liquid chromatography (HPLC), thin layer chromatography (TLC), hyperdiffusion chromatography, various immunological methods, e.g.
  • the isomerase or epimerase polypeptides or the isomerase or epimerase nucleic acids encoding such isomerase and epimerase polypeptides, respectively, can be used in the conversion of one or more L-amino acids to the corresponding D-amino acid(s). It is noted that the reactions described herein are not limited to any particular method, unless otherwise stated. In one embodiment, one or more of the racemase nucleic acids or polypeptides disclosed herein can be used to produce D-tryptophan (or another D-amino acid) from L-tryptophan (or another L-amino acid), or vice versa. The reactions disclosed herein can take place in vivo, in vitro, or a combination thereof.
  • Constructs containing isomerase or epimerase nucleic acid molecules are provided.
  • Constructs, including expression vectors, suitable for expressing an isomerase or epimerase polypeptide are commercially available and/or readily produced by recombinant DNA technology methods routine in the art.
  • Representative constructs or vectors include, without limitation, replicons (e.g., RNA replicons, bacteriophages), autonomous self-replicating circular or linear DNA or RNA, a viral vector (e.g., an adenovirus vector, a retroviral vector or an adeno-associated viral vector), a plasmid, a phage, a phagemid, a cosmid, a fosmid, a bacteriophage or an artificial chromosome.
  • the cloning vehicle can comprise an artificial chromosome comprising a bacterial artificial chromosome (BAC), a bacteriophage P1-derived vector (PAC), a yeast artificial chromosome (YAC), or a mammalian artificial chromosome (MAC).
  • Exemplary vectors include, without limitation, pBR322 (ATCC 37017), pKK223-3, pSVK3, pBPV, pMSG, and pSVL (Pharmacia Fine Chemicals, Uppsala, Sweden), GEM1 (Promega Biotech, Madison, WI, USA), pQE70, pQE60, pQE-9 (Qiagen), pD10, psiX174, pBluescript II KS, pNH8A, pNH16a, pNH18A, pNH46A, pSV2CAT, pOG44, pXT1, pSG (Stratagene), ptrc99a, pKK223-3, pKK233-3, DR540, pRIT5 (Pharmacia), pKK232-8 and pCM7.
  • a vector or construct containing an isomerase or epimerase nucleic acid molecule can have elements necessary for expression operably linked to the isomerase or epimerase nucleic acid.
  • Elements necessary for expression include nucleic acid sequences that direct and regulate expression of nucleic acid coding sequences.
  • One example of an element necessary for expression is a promoter sequence.
  • Promoter sequences are sequences that are capable of driving transcription of a coding sequence.
  • a promoter sequence can be, for example, an isomerase or epimerase promoter sequence, or a non-isomerase or non-epimerase promoter sequence.
  • Non-isomerase and non-epimerase promoters include, for example, bacterial promoters such as lacI, lacZ, T3, T7, gpt, lambda PR, lambda PL and trp as well as eukaryotic promoters such as CMV immediate early, HSV thymidine kinase, early and late SV40, LTRs from retrovirus, and mouse metallothionein I. Promoters also can be, for example, constitutive, inducible, and/or tissue-specific.
  • a representative constitutive promoter is the CaMV 35S; representative inducible promoters include, for example, arabinose, tetracycline-inducible and salicylic acid-responsive promoters.
  • Additional elements necessary for expression can include introns, enhancer sequences (e.g., an SV40 enhancer), response elements, or inducible elements that modulate expression of a nucleic acid.
  • Elements necessary for expression also can include, for example, a ribosome binding site for translation initiation, splice donor and acceptor sites, and a transcription terminator.
  • Elements necessary for expression can be of bacterial, yeast, insect, mammalian, or viral origin, and vectors or constructs can contain a combination of elements from different origins. Elements necessary for expression are described, for example, in Goeddel, 1990, Gene Expression Technology: Methods in Enzymology, 185, Academic Press, San Diego, CA.
  • a vector or construct as described herein further can include sequences such as those encoding a selectable marker (e.g., genes encoding dihydrofolate reductase or genes conferring neomycin resistance for eukaryotic cells; genes conferring tetracycline or ampicillin resistance for E. coli; and the gene encoding TRP1 for S. cerevisiae), sequences that can be used in purification of an isomerase or epimerase polypeptide (e.g., 6xHis tag), and one or more sequences involved in replication of the vector or construct (e.g., origins of replication).
  • a vector or construct can contain, for example, one or two regions that have sequence homology for integrating the vector or construct.
  • Vectors and constructs for genomic integration are well known in the art.
  • operably linked means that a promoter and/or other regulatory element(s) are positioned in a vector or construct relative to an isomerase or epimerase nucleic acid in such a way as to direct or regulate expression of the isomerase or epimerase nucleic acid.
  • promoter and other elements necessary for expression that are operably linked to a transcribed sequence are physically contiguous to the transcribed sequence, i.e., they are cis-acting.
  • Some transcriptional regulatory sequences such as enhancers, however, need not be physically contiguous or located in close proximity to the coding sequences whose expression they enhance.
  • Host cells generally contain a nucleic acid sequence of the invention, e.g., a sequence encoding an isomerase or an epimerase, or a vector or construct as described herein.
  • the host cell may be any of the host cells familiar to those skilled in the art such as prokaryotic cells or eukaryotic cells including bacterial cells, fungal cells, yeast cells, mammalian cells, insect cells, or plant cells.
  • Exemplary bacterial cells include any species within the genera Escherichia, Bacillus, Streptomyces, Salmonella, Pseudomonas and Staphylococcus, including, e.g., E. coli, L. lactis, B. subtilis, B.
  • Exemplary fungal cells include any species of Aspergillus, and exemplary yeast cells include any species of Pichia, Saccharomyces, Schizosaccharomyces, or Schwanniomyces, including P. pastoris, S. cerevisiae, or S. pombe.
  • Exemplary insect cells include any species of Spodoptera or Drosophila, including Drosophila S2 and Spodoptera Sf9.
  • Exemplary animal cells include CHO, COS, Bowes melanoma, C127, 3T3, HeLa and BHK cell lines. See, for example, Gluzman, 1981, Cell, 23:175. The selection of an appropriate host is within the abilities of those skilled in the art.
  • a vector or construct can be introduced into host cells using any of a variety of techniques, including transformation, transfection, transduction, viral infection, gene guns, or Ti-mediated gene transfer.
  • Particular methods include calcium phosphate transfection, DEAE-Dextran mediated transfection, lipofection, or electroporation (Davis et al., 1986, Basic Methods in Molecular Biology).
  • Exemplary methods include CaPO4 precipitation, liposome fusion, lipofection (e.g., LIPOFECTIN™), electroporation, viral infection, etc.
  • the isomerase or epimerase nucleic acids may stably integrate into the genome of the host cell (for example, with retroviral introduction) or may exist either transiently or stably in the cytoplasm (i.e. through the use of traditional plasmids, utilizing standard regulatory sequences, selection markers, etc.).
  • the content of host cells usually is harvested by centrifugation, disrupted by physical or chemical means, and the resulting crude extract retained for further purification.
  • Microbial cells employed for expression of proteins can be disrupted by any convenient method, including freeze-thaw cycling, sonication, mechanical disruption, or the use of cell lysing agents. Such methods are well known to those skilled in the art.
  • the expressed polypeptide or fragment thereof can be recovered and purified from cell cultures by methods including, but not limited to, precipitation (e.g., ammonium sulfate or ethanol), acid extraction, chromatography (e.g., anion or cation exchange, phosphocellulose, hydrophobic interaction, affinity, hydroxylapatite and lectin). If desired, high performance liquid chromatography (HPLC) can be employed for final purification steps.
  • Cell-free translation systems can also be employed to produce a polypeptide of the invention.
  • Cell-free translation systems can use mRNAs transcribed from a DNA construct comprising a promoter operably linked to a nucleic acid encoding the polypeptide or fragment thereof.
  • the DNA construct may be linearized prior to conducting an in vitro transcription reaction.
  • the transcribed mRNA is then incubated with an appropriate cell-free translation extract, such as a rabbit reticulocyte extract, to produce the desired polypeptide or fragment thereof.
  • An isomerase or epimerase polypeptide, a fragment, or a variant thereof can be assayed for activity by any number of methods.
  • Methods of detecting or measuring the activity of an enzymatic polypeptide generally include combining a polypeptide, fragment or variant thereof with an appropriate substrate and determining whether the amount of substrate decreases and/or the amount of product increases.
  • the substrates used to evaluate the activity of a number of racemases disclosed herein typically were one or more L-amino acids (e.g., L-tryptophan) and the products were the corresponding isomerized D-amino acid (e.g., D-tryptophan).
  • racemases may show very little preference for or between substrate amino acids (e.g., broad activity racemases), while other racemases may exhibit a preference for one or more amino acids.
  • polypeptides disclosed herein also will utilize substituted L- or D-amino acid substrates such as, without limitation, chlorinated tryptophan or 5-hydroxytryptophan.
  • D-isomers can be distinguished and/or separated from L-isomers using methods known in the art including, but not limited to, chiral chromatography, simulated moving bed (SMB) continuous chromatography, chiral auxiliaries, and/or enzymatic resolution.
  • an isomerase or epimerase polypeptide exhibits activity in the range of between about 0.05 to 20 units (e.g., about 0.10, 0.20, 0.30, 0.40, 0.50, 0.60, 0.70, 0.80, 0.90, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0, 4.5, 5.0, 5.5, 6.0, 6.5, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19 or 19.5 or more units).
  • a unit equals one µmol of product released per minute per mg of enzyme.
  • one unit of activity for a racemase polypeptide is one µmol of an isomer with inverted configuration (from the starting isomer) produced per minute per mg of enzyme (formed from the respective alpha-amino acid or amine).
  • one unit of activity for an amino acid racemase is one µmol of R-amino acid produced per minute per mg of enzyme formed from the corresponding S-amino acid.
  • one unit of activity for an amino acid racemase is one µmol of S-amino acid produced per minute per mg of enzyme formed from the corresponding R-amino acid.
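The unit definition above is a simple ratio, which can be written out as follows. The function names and the example numbers (3 µmol of product in 10 minutes with 0.5 mg of enzyme) are illustrative assumptions, not measurements from this disclosure.

```python
def activity_units(product_umol, minutes, enzyme_mg):
    """Activity in units: µmol of product per minute per mg of enzyme."""
    if minutes <= 0 or enzyme_mg <= 0:
        raise ValueError("time and enzyme mass must be positive")
    return product_umol / (minutes * enzyme_mg)


def in_stated_range(units, low=0.05, high=20.0):
    """True if the activity falls within the 'about 0.05 to 20 units' range."""
    return low <= units <= high


# Illustrative numbers: 3 µmol of D-amino acid formed in 10 min by 0.5 mg enzyme.
example_units = activity_units(product_umol=3.0, minutes=10.0, enzyme_mg=0.5)
```

With these example inputs the activity is 3 / (10 × 0.5) = 0.6 units, which falls inside the stated range.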
  • the conversion of L-tryptophan to D-tryptophan using one or more of the isomerase or epimerase nucleic acids or polypeptides disclosed herein can be performed in vitro or in vivo, in solution or in a host cell, in series or in parallel.
  • the desired ingredients for the reaction(s) can be combined by admixture in an aqueous reaction medium or solution and maintained for a period of time sufficient for the desired product(s) to be produced.
  • one or more isomerase or epimerase polypeptides used in the one or more of the reactions described herein can be immobilized onto a solid support.
  • solid supports include those that contain epoxy, aldehyde, chelators, or primary amine groups.
  • suitable solid supports include, but are not limited to, Eupergit® C (Rohm and Haas Company, Philadelphia, PA) resin beads and SEPABEADS® EC-EP (Resindion).
  • an isomerase or epimerase nucleic acid can be introduced into any of the host cells described herein.
  • many or all of the co-factors e.g., a metal ion, a coenzyme, a pyridoxal-phosphate, or a phosphopantetheine
  • substrates necessary for the conversion reactions to take place can be provided in the culture medium. After allowing the in vitro or in vivo reaction to proceed, the efficiency of the conversion can be evaluated by determining whether the amount of substrate has decreased or the amount of product has increased.
  • the activity of one or more of the isomerase or epimerase polypeptides disclosed herein can be improved or optimized using any number of strategies known to those of skill in the art.
  • the in vivo or in vitro conditions under which one or more reactions are performed such as pH or temperature can be adjusted to improve or optimize the activity of a polypeptide.
  • the activity of a polypeptide can be improved or optimized by re-cloning the isomerase or epimerase nucleic acid into a different vector or construct and/or by using a different host cell.
  • a host cell can be used that has been genetically engineered or selected to exhibit increased uptake or production of tryptophan (see, for example, U.S. Patent No.
  • an isomerase or epimerase polypeptide can be improved or optimized by ensuring or assisting in the proper folding of the polypeptide (e.g., by using chaperone polypeptides) or in the proper post-translational modifications such as, but not limited to, acetylation, acylation, ADP-ribosylation, amidation, glycosylation, hydroxylation, iodination, methylation, myristoylation, oxidation, pegylation, phosphorylation, prenylation, selenoylation, sulfation, disulfide bond formation, and demethylation as well as covalent attachment of molecules such as flavin, a heme moiety, a nucleotide or nucleotide derivative, a lipid or lipid derivative, and/or a phosphatidylinositol.
  • solubility of a polypeptide can be increased using any number of methods known in the art such as, but not limited to, low temperature expression or periplasmic expression.
  • a number of polypeptides were identified herein that exhibit racemase activity. Specifically, SEQ ID NOs:412, 400, 406, 410, 408, 416, 418, 424, 440, 442, 444, 446, 454, 442, 474, 476, 322, 336, 338 and 442 exhibit isomerase activity.
  • the sequence shown in SEQ ID NO:412 represents a very unique racemase, as the most homologous sequence in the public databases has only 30% sequence identity to SEQ ID NO:412.
  • Although SEQ ID NO:412 exhibited low solubility in its native form, it still exhibits very effective racemase activity.
  • SEQ ID NO:322 also is unique as the encoded polypeptide is only 232 amino acids, making SEQ ID NO:322 the smallest active polypeptide identified.
  • monatin is used to refer to compositions including all four stereoisomers of monatin, compositions including any combination of monatin stereoisomers (e.g., a composition including only the R,R and S,S stereoisomers of monatin), as well as a single isomeric form (or any of the salts thereof).
  • monatin is known by a number of alternative chemical names, including: 2-hydroxy-2-(indol-3-ylmethyl)-4-aminoglutaric acid; 4-amino-2-hydroxy-2-(1H-indol-3-ylmethyl)-pentanedioic acid; 4-hydroxy-4-(3-indolylmethyl)glutamic acid; and 3-(1-amino-1,3-dicarboxy-3-hydroxy-but-4-yl)indole.
  • 61/018,814 can be used to generate a desired percentage or a minimum or maximum desired percentage of one or more particular monatin stereoisomers (e.g., R,R monatin) relative to other monatin stereoisomers (e.g., S,R monatin).
  • amino acid racemases that do not isomerize significant amounts of monatin are used rather than racemases that isomerize monatin as a method of maintaining the desired percentage of one or more particular monatin stereoisomers.
  • Monatin that is produced utilizing one or more of the racemase polypeptides disclosed herein can be at least about 0.5-30% R,R-monatin by weight of the total monatin produced.
  • the monatin produced using one or more of the polypeptides disclosed herein is greater than 30% R,R-monatin, by weight of the total monatin produced; for example, the R,R-monatin is 40%, 50%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95% or 99% of the total monatin produced.
  • various amounts of two or more preparations of monatin can be combined so as to result in a preparation that is a desired percentage of R,R-monatin.
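Combining two preparations to reach a target R,R-monatin percentage is a linear mixing calculation. The sketch below assumes that percentages are by weight of total monatin and that the amounts being blended are expressed as total monatin mass; the example values (90%, 30%, target 60%) are invented.

```python
def blend_fraction(p1, p2, target):
    """Weight fraction of preparation 1 needed to reach `target` % R,R-monatin.

    p1 and p2 are the R,R-monatin percentages of preparations 1 and 2.
    Solves w1*p1 + (1 - w1)*p2 = target for w1.
    """
    if p1 == p2:
        raise ValueError("preparations have the same R,R content")
    w1 = (target - p2) / (p1 - p2)
    if not 0.0 <= w1 <= 1.0:
        raise ValueError("target lies outside the range spanned by the two preparations")
    return w1


def blended_percent(p1, p2, w1):
    """R,R-monatin percentage of a blend containing fraction w1 of preparation 1."""
    return w1 * p1 + (1.0 - w1) * p2


# Invented example: blend a 90% and a 30% R,R preparation to reach 60%.
w1 = blend_fraction(90.0, 30.0, 60.0)
```

A target outside the range spanned by the two preparations cannot be reached by blending alone, which is why the function raises an error in that case.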
  • Monatin produced using one or more of the racemase polypeptides disclosed herein can be, for example, a derivative. "Monatin derivatives" have the following structure:
  • Ra, Rb, Rc, Rd, and Re each independently represent any substituent selected from a hydrogen atom, a hydroxyl group, a C1-C3 alkyl group, a C1-C3 alkoxy group, an amino group, or a halogen atom, such as an iodine atom, bromine atom, chlorine atom, or fluorine atom.
  • Ra, Rb, Rc, Rd, and Re cannot simultaneously all be hydrogen.
  • Rb and Rc, and/or Rd and Re may together form a C1-C4 alkylene group, respectively.
  • “Substituted monatin” refers to, for example, halogenated or chlorinated monatin or monatin derivatives. See, for example, U.S. Publication No. 2005/0118317.
  • Monatin derivatives also can be used as sweeteners.
  • chlorinated D-tryptophan, particularly 6-chloro-D-tryptophan, which has structural similarities to R,R monatin, has been identified as a non-nutritive sweetener.
  • halogenated and hydroxy-substituted forms of monatin have been found to be sweet. See, for example, U.S. Publication No. 2005/0118317.
  • Substituted indoles have been shown in the literature to be suitable substrates and have yielded substituted tryptophans. See, for example, Fukuda et al., 1971, Appl. Environ. Microbiol., 21:841-43.
  • halogens and hydroxyl groups should be substitutable for hydrogen, particularly at positions 1-4 of the benzene ring in the indole of tryptophan, without interfering in subsequent conversions to D- or L-tryptophan, indole-3-pyruvate, MP, or monatin.
  • Part A describes the methodologies used for characterization of the candidate isomerase and epimerase nucleic acids and the encoded polypeptides. Additional characterization of a subset of the isomerase and epimerase nucleic acids and polypeptides, particularly the nucleic acids encoding amino acid racemases, is described in Part B.
  • racemases that were discovered had native signal/leader sequences.
  • the signal sequences and corresponding cleavage sites were identified by SignalP 3.0 (at cbs.dtu.dk/services/SignalP/ on the World Wide Web). It was observed that clones containing racemases with leader sequences tended to be more difficult to grow. The clones grew well with fresh transformations; however, they did not grow well when they were subcultured or inoculated from glycerol stocks. Samples were grown (or, at least, attempted) a minimum of two times. The table below indicates several clones that contained their native signal sequences.
  • the vector pSE420-cHis is a derivative of pSE420 (Invitrogen, Carlsbad, CA).
  • the vector was cut with NcoI and HindIII, and ligated with C-His: 5'-CCA TGG GAG GAT CCA GAT CTC ATC ACC ATC ACC ATC ACT AAG CTT-3' (SEQ ID NO:569).
  • the expression of the His-tag in this vector depends on the choice of host and stop codon.
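As a rough check on the C-His oligonucleotide given above, the sketch below locates four well-known restriction sites in it and translates it from its first ATG using the standard genetic code. The choice of reading frame (starting at the ATG inside the NcoI site) is an assumption made for illustration; the source does not state the frame, and in practice the frame depends on the insert cloned upstream of the tag.

```python
from itertools import product

# C-His oligonucleotide as given in the text, with spaces removed.
C_HIS = "CCATGGGAGGATCCAGATCTCATCACCATCACCATCACTAAGCTT"

# Recognition sequences of four common restriction enzymes.
SITES = {"NcoI": "CCATGG", "BamHI": "GGATCC", "BglII": "AGATCT", "HindIII": "AAGCTT"}

# Standard genetic code, codons ordered T, C, A, G at each position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def site_positions(dna):
    """Return the 0-based index of the first occurrence of each site (-1 if absent)."""
    return {name: dna.find(site) for name, site in SITES.items()}


def translate_from_first_atg(dna):
    """Translate from the first ATG until a stop codon or the end of the sequence."""
    start = dna.find("ATG")
    if start < 0:
        return ""
    peptide = []
    for i in range(start, len(dna) - 2, 3):
        aa = CODON_TABLE[dna[i:i + 3]]
        if aa == "*":
            break
        peptide.append(aa)
    return "".join(peptide)


positions = site_positions(C_HIS)
peptide = translate_from_first_atg(C_HIS)
```

In this frame the linker reads through BamHI and BglII sites into six histidine codons followed by a TAA stop codon, so an upstream insert that ends with its own stop codon would terminate translation before the tag.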
  • the cells could potentially be expelling the plasmids, thereby losing the antibiotic resistance over time.
  • the number of rounds of growth for racemases with leader sequences was minimized. This was done by storing the DNA and performing fresh transformations each time the constructs were used.
  • the host organisms, expression conditions, and post expression cell handling can all affect whether there is detectable tryptophan racemase activity under the conditions of the assay in the presence of the respective leader sequences. Additionally, under optimized conditions, it is expected that racemases could have tryptophan racemase activity with or without leader sequences (native or artificial such as PelB).
  • Example 2 Improvement of SEQ ID NO:412 solubility using ArcticExpress™ hosts
  • SEQ ID NO:411 nucleic acid encoding a racemase polypeptide having the sequence of SEQ ID NO:412 was analyzed by SDS-PAGE.
  • SEQ ID NO:411 nucleic acid expressed well and the resulting SEQ ID NO:412 polypeptide had high activity even though only a portion (~20%) of the corresponding band in the total protein lane was soluble.
  • the racemase was moved into two ArcticExpress™ hosts (Stratagene, La Jolla, CA).
  • the racemase was subcloned into the pET28b vector and the DNA was transformed into ArcticExpress™(DE3) and ArcticExpress™(DE3)RIL and plated on LB kanamycin 50 µg/mL, gentamicin 20 µg/mL, and LB kanamycin 50 µg/mL, gentamicin 20 µg/mL, streptomycin 75 µg/mL, respectively.
  • the pET28b vector was also transformed into each host as a negative control. Samples were grown overnight at 30°C. Four colonies were picked for each construct from each ArcticExpressTM host.
  • the flasks were transferred to an 11°C incubator and allowed to incubate for 10 minutes prior to inducing with a final concentration of 1 mM IPTG. Samples were induced overnight at 11°C at 210 rpm (with the exception of DE3-2 and DE3-4, which were induced at 16°C).
  • Table 5 Racemase activity of SEQ ID NO:412 expressed from constructs in ArcticExpress™(DE3) or ArcticExpress™(DE3)RIL
  • ArcticExpress(DE3)RIL at a 7.5 µg/mL total protein loading. All the constructs were also active when the protein was loaded at 0.75 and 0.075 µg/mL total protein. The vector/host controls had little or no activity compared to the racemase constructs.
  • racemase polypeptide having the sequence of SEQ ID NO:412 was active and soluble expression was improved in ArcticExpress™(DE3) &
  • PFAM domains were selected for subcloning in order to determine if the PFAM domain was sufficient to detect racemase activity.
  • The samples were subcloned into the pSE420-cHis vector (His-tag not expressed) in MB2946 host cells (Strych & Benedik, 2002, J. Bacteriology, 184:4321-5).
  • The subclones were SEQ ID NOs:425, 439 and 461, encoding SEQ ID NOs:426, 440 and 462, respectively.
  • The polypeptide having the sequence shown in SEQ ID NO:426 was selected for activity testing. Flasks containing 50 mL LB, 100 µg/mL carbenicillin and 50 mM D-alanine were inoculated from glycerol stocks and grown overnight at 37°C with shaking. The following morning, flasks containing 400 mL LB, 100 µg/mL carbenicillin and 50 mM D-alanine were inoculated to an OD 600nm of 0.05. Cultures were grown at 37°C with shaking and induced with 1 mM IPTG when the OD 600nm was between 0.5 and 0.8. Cultures were induced overnight at 30°C.
  • Cell pellets were collected by centrifugation at 6000 rpm for 20 minutes. Cell pellets were resuspended in 20 mL of 50 mM sodium phosphate buffer pH 7.5 with 26 U/ml DNase I. Cell pellets were lysed in a microfluidizer (Microfluidics Corporation, Newton, MA) per the manufacturer's instructions. Samples were centrifuged at 12,000 rpm for 30 minutes and the soluble fraction was collected. Protein concentration was determined by comparing the absorbance of cell extract containing the SEQ ID NO:426 polypeptide to known standards in the Bio-Rad Protein Assay reagent (Bio-Rad, Hercules, CA).
  • Racemases were loaded at 10 mg/mL total protein and incubated with 10 mM L-tryptophan and 10 µM PLP at pH 8 and a temperature of 37°C. At the indicated timepoints (0, 2, 4, and 24 hours), 50 µL of the reaction product was added to 150 µL of ice-cold acetonitrile. Samples were vortexed for 30 seconds and passed through a 0.2 µm filter, and the filtrate was then diluted fifty-fold in methanol. Samples were then analyzed by LC/MS/MS (as described in Example 4) to monitor the D-tryptophan formed (Table 6).
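  • The sample work-up above involves two serial dilutions before LC/MS/MS (50 µL of reaction into 150 µL of acetonitrile, then fifty-fold in methanol). The following is a minimal sketch of the resulting overall dilution factor and of how a measured concentration could be scaled back to the undiluted reaction. The function names are hypothetical, and the sketch assumes the volumes are additive and that filtration causes no loss; neither assumption is stated in the text.

```python
# Illustrative sketch only: helper names are hypothetical, volumes come from the text.
# Assumes additive volumes and no analyte loss on the 0.2 um filter.

def quench_dilution(sample_ul: float, solvent_ul: float) -> float:
    """Fold dilution when a sample is added to a quench solvent."""
    return (sample_ul + solvent_ul) / sample_ul

# 50 uL of reaction product added to 150 uL of ice-cold acetonitrile
quench_fold = quench_dilution(50.0, 150.0)

# The filtrate is then diluted fifty-fold in methanol
overall_fold = quench_fold * 50.0

def back_calculate(measured: float, fold: float = overall_fold) -> float:
    """Concentration in the undiluted reaction, in the same units as `measured`."""
    return measured * fold

print(f"quench dilution: {quench_fold:g}-fold, overall: {overall_fold:g}-fold")
```

  • Under these assumptions, any D-tryptophan concentration read from the diluted sample would be multiplied by the overall factor to estimate the concentration in the reaction itself.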
  • E. coli MB2946 host cells (Strych & Benedik, supra) were used as the negative control, while wild type Pseudomonas putida KT2440 BAR was used as a positive control.
  • The racemase having SEQ ID NO:426 was active under the conditions described in Example 4. The results above thereby demonstrate that a racemase PFAM domain could have sufficient structural elements to maintain racemase catalytic activity.
  • Glycerol stocks were used to inoculate flasks containing 50 mL of LB with the appropriate antibiotic.
  • The starter culture was grown overnight at 37°C and 230 rpm.
  • The OD 600nm of the starter culture was checked, and the starter culture was used to inoculate a 400 ml culture to an OD 600nm of 0.05.
  • The culture was incubated at 37°C and 230 rpm, and the OD 600nm was checked periodically.
  • The cultures were induced, typically with 1 mM IPTG, when the OD 600nm reached between 0.5 and 0.8. Induced cultures were incubated overnight at 30°C and 230 rpm.
  • The culture was harvested by pelleting cells at 4000 rpm for 15 minutes. The supernatant was poured off, and cell pellets were either lysed immediately or frozen for later use.
  • Enzymes were prepared for the activity assay by resuspending in 50 mM sodium phosphate (pH 7.5). The racemase assays were usually run with about 10-20 mg/ml total protein.
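  • The inoculation step in the culture procedure above (bringing a 400 ml culture to an OD 600nm of 0.05 from an overnight starter) is a simple C1V1 = C2V2 calculation. The sketch below is illustrative only: the function name and the example starter OD of 4.0 are hypothetical, and it assumes the measured OD scales linearly with cell density, which in practice usually requires diluting a dense starter before reading it.

```python
# Illustrative sketch only: C1 * V1 = C2 * V2 applied to optical density.
# The starter OD used in the example is hypothetical, not a value from the text.

def inoculum_volume_ml(starter_od: float, target_od: float, final_volume_ml: float) -> float:
    """Volume of starter culture needed to reach target_od in final_volume_ml."""
    if starter_od <= target_od:
        raise ValueError("starter culture must be denser than the target OD")
    return target_od * final_volume_ml / starter_od

# Target from the text: OD600 of 0.05 in a 400 ml culture
volume = inoculum_volume_ml(starter_od=4.0, target_od=0.05, final_volume_ml=400.0)
print(f"add about {volume:.1f} ml of starter culture")
```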
  • LC/MS/MS screening was achieved by injecting samples from 96-well plates using a CTC PAL autosampler (LEAP Technologies, Carrboro, NC) into a 70/30 MeOH/H2O (0.25% AcOH) mixture provided by LC-10ADvp pumps (Shimadzu, Kyoto, Japan) at 0.8 mL/min through a Chirobiotic T column (4.6 x 250 mm) and into the API4000 TurboIonSpray triple-quad mass spectrometer (Applied Biosystems, Foster City, CA). Ion spray and Multiple Reaction Monitoring (MRM) were performed for the analytes of interest in the positive ion mode, and each analysis lasted 15.0 minutes. D- and L-tryptophan parent/daughter ions: 205.16/188.20.
  • SEQ ID NOs:401, 385, 395, 413, 419, 421, 425, 437, 427, 433, and 435 are racemase subclones that, when expressed (into polypeptides having the sequence of SEQ ID NOs:402, 386, 396, 414, 420, 422, 426, 438, 428, 434 and 436, respectively), were active under the conditions described in Part A.
  • racemase subclones were in the pSE420-cHis vector / MB2946 host (Strych
  • NO:413 was in pSE420-cHis / TOP10 host (Invitrogen, Carlsbad, CA). The His-tag was not expressed in any of these subclones.
  • Example 4 The subclones were grown, lysed and lyophilized according to the procedures described in Example 4. Samples were tested for activity using a racemase assay (as described in Example 4). Racemases were incubated with 10 mM L-tryptophan and 10 µM PLP at pH 8 and 37°C. All racemases were utilized at 10 mg/mL total protein concentration with the exception of the polypeptide having the sequence of SEQ ID NO:402. The polypeptide having the sequence of SEQ ID NO:402 was used at 5 mg/mL total protein concentration because there was not enough biomass to allow for a higher loading.
  • Tables 7, 8, 9 and 10 show the racemase activity over time. Samples that were assayed together are grouped together in a single table.
  • racemases having the sequence shown in SEQ ID NOs:402, 386,
  • racemase activity may be attributed to differences in host strains, expression conditions, post-expression cell handling and assay protein-loading.
  • Example 3 for activity data for a racemase having SEQ ID NO:426. It is noted that the racemase having SEQ ID NO:428 is not included here because it did not reach an inducible OD 600 and, therefore, was not induced.
  • One skilled in the art can synthesize the genes encoding these racemases using various published techniques, for example, as described in Stemmer et al., 1995, Gene, 164(1):49-53.
  • The plasmids were transformed into E. coli XL-1 Blue (Novagen/EMD Biosciences, San Diego, CA) cells as per manufacturer instructions.
  • Transformants were grown overnight at 37°C and 250 rpm in 5 ml LB containing ampicillin (100 µg/mL). Overnight cultures were used to inoculate 25 mL of the same media in 250 mL baffled shake flasks. Cultures were grown at 30°C and 250 rpm until they reached an OD 600 of 0.6, after which protein expression was induced with 1 mM IPTG for 4.25 h at 30°C. Samples for total protein were taken prior to induction and right before harvesting. Cells were harvested by centrifugation and frozen at -80°C.
  • Cell extracts were typically prepared from the above frozen pellets by adding 5 ml per g of cell pellet of Bugbuster Amine Free (Novagen/EMD Biosciences, San Diego, CA) with 5 µL/mL of Protease Inhibitor Cocktail II (Calbiochem, San Diego, CA) and 1 µL/mL of benzonase nuclease (Novagen/EMD Biosciences, San Diego, CA). Cell solutions were incubated at room temperature with gentle mixing for 15 min; cells were spun out at 14,000 rpm for 20 min (at 4°C) and the supernatant was carefully removed.
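  • The lysis recipe above scales with pellet mass, so the reagent volumes can be worked out per pellet. The sketch below is a hedged illustration: the function name is hypothetical, and it assumes the "per mL" additive amounts refer to millilitres of Bugbuster reagent, which the text implies but does not state outright.

```python
# Illustrative sketch only: scales the lysis recipe in the text to a given pellet mass.
# Assumption: additive amounts (uL/mL) are per mL of Bugbuster reagent.

BUGBUSTER_ML_PER_G = 5.0          # 5 ml Bugbuster Amine Free per g of cell pellet
PROTEASE_INHIBITOR_UL_PER_ML = 5.0
BENZONASE_UL_PER_ML = 1.0

def lysis_mix(pellet_g: float) -> dict:
    """Return reagent volumes for lysing a frozen pellet of the given wet mass."""
    if pellet_g <= 0:
        raise ValueError("pellet mass must be positive")
    bugbuster_ml = BUGBUSTER_ML_PER_G * pellet_g
    return {
        "bugbuster_ml": bugbuster_ml,
        "protease_inhibitor_ul": PROTEASE_INHIBITOR_UL_PER_ML * bugbuster_ml,
        "benzonase_ul": BENZONASE_UL_PER_ML * bugbuster_ml,
    }

# Example with a hypothetical 2 g pellet
for reagent, amount in lysis_mix(2.0).items():
    print(f"{reagent}: {amount:g}")
```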
  • Example 17 For the tryptophan racemase assay, desalted protein was added to target 50 µg racemase protein for each enzyme assay. Calculations were based on Pierce BCA total protein analysis with BSA as the standard (Pierce Biotechnology, Inc., Rockford, IL) and SDS-PAGE visual inspection. Formation of D-tryptophan was measured at 2 hours and 21 hours. A cell-free extract of empty vector pSE420-cHis served as a negative control.
  • Detectable activity was observed for candidates having the polypeptide sequence shown in SEQ ID NO:386, 396 and 402 using conditions described in Part A, but was not observed using conditions described in Part B (see, for example, Example 5). Detectable activity was not observed for the polypeptide shown in SEQ ID NO:394 under the conditions described in Part A, while very low activity (barely detectable at 21 hours) was observed for the racemase polypeptide shown in SEQ ID NO:394 under the conditions described in Part B. [00124] Some constructs were observed, under the conditions described in Part A, to be unstable in expression systems, particularly those with a leader sequence.
  • the host organisms, expression conditions, and post expression cell handling can all affect whether there is detectable tryptophan racemase activity under the conditions of the assay. Additionally, under optimized conditions, it is expected that all racemase candidates could have tryptophan racemase activity. [00125] It is expected that the presence of activity in a polypeptide encoded from a subcloned nucleic acid is predictive of the presence of activity in the corresponding polypeptide encoded from the full-length or wild type nucleic acid, as indicated below in Table 13.
  • Racemases having the polypeptide sequence of SEQ ID NO:412 and 414 were both found to be active when assayed for tryptophan racemase activity under the conditions described in Part A.
  • One skilled in the art can synthesize the genes encoding these racemases using various published techniques for example, as described in Stemmer et al., supra. It should be noted that 10 mg of total protein in the form of lyophilized cell extracts was used in Part A when evaluating racemase activity (see Example 4). In some cases, this was ten times as much total soluble protein as was used in the assays described in Part B. This difference in the amount of protein used in the assays (i.e., of Part A vs. Part B) may explain, at least in part, some of the differences in activity observed with the same polypeptide.
  • The nucleic acid having the sequence of SEQ ID NO:411, which encodes the polypeptide having the sequence of SEQ ID NO:412, was expressed in the MB2946 host and found to be highly active.
  • the host organisms, expression conditions, and post expression cell handling can all affect whether there is detectable tryptophan racemase activity under the conditions of the assay. Additionally, under optimized conditions, it is expected that all racemase candidates could have tryptophan racemase activity.
  • Example 8 Polypeptide having the sequence of SEQ ID NO:412 is more active on tryptophan than alanine
  • SEQ ID NO:412 was PCR-amplified with Nco I and Xho I restriction sites for subcloning into pET28 (Novagen/EMD Chemicals, San Diego, CA).
  • pET28 constructs were created with and without a C-terminal His tag (tagged constructs were created by using a reverse primer without a stop codon in the PCR).
  • pET26b constructs were created with a C-terminal His tag. Constructs were sequenced for accuracy (Agencourt Bioscience Inc., Beverly, MA) and used to transform BL21(DE3) (Novagen/EMD Biosciences, San Diego, CA).
  • Transformants were grown and induced in OvernightExpress™ media and cell-free extracts were prepared as described herein. Proteins were purified from tagged constructs on Novagen/EMD Biosciences His-bind columns (Novagen/EMD Biosciences, San Diego, CA) and desalted on PD-10 columns; for untagged constructs, cell-free extracts were desalted on PD-10 columns. [00137] Protein concentrations were determined by Pierce BCA protein assay and racemase purity was determined by Experion Automated Gel System (Experion, version A.01.10, Biorad, Hercules, CA). Racemase assays were performed on purified and crude protein extracts as described in Example 17.
  • Racemase expression in the pET26b construct was lower than in the pET28 vector; however, active protein having the sequence shown in SEQ ID NO:412 was obtained.
  • Results for SEQ ID NO:411 / pET28 (encoding the polypeptide having the sequence of SEQ ID NO:412) are shown in this example.
  • Purified protein having the sequence shown in SEQ ID NO:412 from the construct in pET28 was further characterized for racemase activity on tryptophan, alanine, and monatin. Tryptophan, monatin, and alanine assays were performed as described in Example 17, with A. caviae D76N serving as positive control for racemization assays. The analytical methodology is described in Example 18. It should be noted that, at the time these analyses were performed, the analysis of D-alanine was less quantitative than the analysis for D-tryptophan. Racemase activity of SEQ ID NO:412 for tryptophan and alanine
  • SEQ ID NO:412 consistently gave higher D-trp activity than the control racemase candidate, BAR, A. caviae D76N.
  • SEQ ID NO:412 appears to have a higher preference for tryptophan versus alanine as a substrate for racemization.
  • A. caviae D76N BAR, while active on tryptophan, has a preference for alanine as a substrate.
  • The ability of purified SEQ ID NO:412 to racemize 7 additional L-amino acids was evaluated and the details are reported in Example 10.
  • The polypeptide having the sequence of SEQ ID NO:412 had little to no impact on L-alanine.
  • The SEQ ID NO:414 polypeptide retained 96%-100% of its tryptophan racemase activity from 20 minutes to the end of the assay at three hours.
  • BAR A. caviae D76N only retained 37%-55% of its tryptophan racemase activity in the presence of L-alanine, during the same time period.
  • The preference of the racemase having SEQ ID NO:414 for tryptophan as a substrate is advantageous in the presence of competing substrates like alanine.
  • Example 9 Racemases lacking monatin racemization activity.
  • Example 10 The polypeptide having the sequence of SEQ ID NO:412 is a broad specificity amino acid racemase
  • The polypeptide having the sequence of SEQ ID NO:412 appears to be an amino acid racemase with broad substrate specificity and seems to prefer bulky, hydrophobic amino acids.
  • Racemase activity for various amino acids as substrates was observed as follows, under the conditions of the assay as described: [Leucine / Phenylalanine / Tryptophan / Methionine] > [Tyrosine / Alanine] > [Lysine / Aspartic Acid] > Glutamate.
  • Analytical methods for detection of all of the above D-amino acids, with the exception of tryptophan, are semi-quantitative, so these results indicate a trend in racemase activity.
  • Example 11 Methods to improve solubility of an insoluble protein and its activity on tryptophan
  • The polypeptide having the sequence of SEQ ID NO:412 showed lower solubility than other racemase candidates described in this application, under the expression conditions tested.
  • The insoluble fraction of the SEQ ID NO:412 polypeptide was tested for racemization activity on tryptophan.
  • Cell-free extracts of SEQ ID NO:411/pET28 were prepared from frozen cell pellets by adding 5 ml of Bugbuster Amine Free (Novagen/EMD Biosciences, San Diego, CA) with 5 µL/mL of Protease Inhibitor Cocktail II (Calbiochem, San Diego, CA) and 1 µL/mL of benzonase nuclease (Novagen/EMD Biosciences, San Diego, CA), per g of cell pellet. Cell pellet suspensions were incubated at room temperature with gentle mixing for 15 min; cell pellets were spun out at 14,000 rpm for 20 min (at 4°C) and retained for assays.
  • SEQ ID NO:412 is not a membrane-associated protein, which might be a possibility given the lack of solubility but the presence of activity.
  • Competent cells of ArcticExpressTM (DE3) were transformed with SEQ ID NO: 1
  • Transformants were grown in LB containing kanamycin (50 mg/L) and gentamicin (20 mg/L) overnight at 37°C and 250 rpm. A 2% inoculum was transferred to 50 mL of Novagen OvernightExpress™ Autoinduction System 2 (EMD Biosciences/Novagen catalog
  • CopyCutter™ EPI400™ cells were transformed with SEQ ID NO:411/pET28 as per manufacturer instructions (Epicentre Biotechnologies, Madison, WI). Liquid cultures of transformants were grown overnight (LB kanamycin 50, 37°C, 250 rpm) and used to inoculate shake flasks containing 25 mL LB media, kanamycin (50 mg/L) and 1X CopyCutter™ induction solution. Cultures were grown at 30°C and 250 rpm for 5 hours. Cultures were harvested and cell extracts were prepared as described herein. SDS-PAGE analysis of total and soluble protein was conducted.
  • E. coli HMS174 (Novagen/EMD Biosciences, San Diego, CA)
  • SEQ ID NO:412 polypeptide was observed based on SDS-PAGE analyses.
  • The nucleic acid encoding the SEQ ID NO:412 polypeptide (i.e., SEQ ID NO:411)
  • SEQ ID NO:411 was subcloned into a derivative of the pET23d vector (Novagen, Madison, WI) containing the E. coli metE gene and promoter inserted at the NgoMIV restriction site and a second Psi I restriction site that was added for facile removal of the beta-lactamase gene (bla).
  • the construction of this vector is described in WO 2006/066072. This construct was transformed into E.
  • racemases were provided as pSE420-cHis clones.
  • One skilled in the art could synthesize the genes encoding these racemases using various published techniques for example, as described in Stemmer et al., supra.
  • The plasmids were transformed into TOP10 chemically competent cells (Invitrogen, Carlsbad, CA). Overnight cultures grown in LB carbenicillin (100 µg/ml) were diluted a hundred-fold in 50 ml LB carbenicillin (100 µg/ml) in a 250 ml baffled flask.
  • Cultures were grown at 30°C with agitation at 250 rpm until they reached an OD 600 of 0.5 to 0.8, after which protein expression was induced with 1 mM IPTG for 4 h at 30°C. Samples for total protein were taken prior to induction and right before harvesting. Cells were harvested by centrifugation. Cells were frozen at -80°C.
  • Cell extracts were typically prepared from the above frozen pellets by adding 5 ml per g of cell pellet of Bugbuster Amine Free (Novagen/EMD Biosciences, San Diego, CA) with 5 µL/mL of Protease Inhibitor Cocktail II (Calbiochem, San Diego, CA) and 1 µL/mL of benzonase nuclease (Novagen/EMD Biosciences, San Diego, CA). Cell solutions were incubated at room temperature with gentle mixing for 15 min; cells were spun out at 14,000 rpm for 20 min (at 4°C) and the supernatant was carefully removed.
  • For the tryptophan racemase assay, a total of 650 µg of desalted protein was added for each enzyme based on Pierce BCA total protein analysis with BSA as the standard (Pierce Biotechnology, Inc., Rockford, IL). Formation of D-tryptophan was measured at 30 minutes, 2 hours, 4 hours and 24 hours.
  • pSE420-cHis cell-free extract of the SEQ ID NO:412 polypeptide served as a positive control for the assay, and cell-free extract of empty vector pSE420-cHis served as a negative control.
  • Racemase candidate proteins were purified from tagged constructs and desalted on PD-10 columns. Untagged racemase candidate cell-free extracts were desalted on PD-10 columns. Protein concentrations were determined by Pierce BCA protein assay (Pierce Biotechnology, Inc., Rockford, IL) and racemase purity was estimated by Experion Automated Gel System (Experion, version A.01.10, Biorad, Hercules, CA).
  • Racemase assays were performed on purified and crude protein extracts as described in Example 17. Purified protein having the sequence shown in SEQ ID NO:412 served as a positive control. For the assay, 5 µg of equivalent BAR protein was added for the positive control, and an estimated 50 µg equivalent BAR protein was added for each of the other enzymes based on Pierce BCA total protein analysis and racemase purity estimation by Experion Automated Gel System (Experion, version A.01.10, Biorad, Hercules, CA). Table 27. D-trp production (µg/mL)
  • nd: not detected under the conditions of the assay as described above; 4 candidates that appeared to have higher activity than the SEQ ID NO:412 polypeptide (the SEQ ID NO:416, 418, 424, and 440 polypeptides) were not replicated in pET30 with BL21(DE3) host; all experiments were conducted with purified protein (approx. 50 µg BAR); it was previously shown in Part A that there are differences in activity when the same construct is in different host backgrounds.
  • the host organisms, expression conditions, and post expression cell handling can all affect whether there is detectable tryptophan racemase activity under the conditions of the assay. Additionally, under optimized conditions, it is expected that all racemase candidates could have tryptophan racemase activity.
  • Racemase candidates were grouped by amino acid sequence homology, with clusters having 95% or greater homology at amino acid level to a reference sequence. One or more representatives was/were chosen from each group for characterization of tryptophan racemase activity under the conditions described in Part B.
  • SEQ ID NO:416 is a non-leadered version of the reference SEQ ID NO:110 sequence.
  • SEQ ID NO:416, the non-leadered version of the reference candidate, SEQ ID NO:110.
  • Using SEQ ID NO:116 as the reference sequence, the following racemase candidates had 97% or greater identity at amino acid level to the above reference sequence: SEQ ID NOs:150, 192, 152, 118, 194, 154, 196, 158, and 160.
  • SEQ ID NO:420 is a non-leadered version of the reference SEQ ID NO:116 sequence.
  • SEQ ID NO:422 is a non-leadered version of the reference SEQ ID NO:118 sequence.
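  • The grouping described above (candidates clustered around a reference sequence by percent identity at the amino acid level) can be illustrated with a small sketch. The text does not say which alignment tool or identity definition was used, so everything here is an assumption for illustration: the function names are hypothetical, the sequences are toy strings rather than any SEQ ID NO, the inputs are assumed to be already aligned to equal length with "-" for gaps, and identity is taken as matches divided by aligned columns.

```python
# Illustrative sketch only. Assumptions: sequences are pre-aligned (equal length,
# '-' for gaps); identity = matching residues / aligned columns. Toy data, not SEQ ID NOs.

def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity between two pre-aligned amino acid sequences."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to the same length")
    # Ignore columns where both sequences have a gap
    columns = [(x, y) for x, y in zip(aligned_a, aligned_b) if not (x == "-" and y == "-")]
    matches = sum(1 for x, y in columns if x == y and x != "-")
    return 100.0 * matches / len(columns)

def group_members(reference: str, candidates: dict, threshold: float = 97.0) -> list:
    """Names of candidates whose identity to the reference meets the threshold."""
    return [name for name, seq in candidates.items()
            if percent_identity(reference, seq) >= threshold]

# Toy 100-residue reference and two hypothetical candidates
reference = "MKANAYGHGA" * 10
candidates = {
    "close": "LL" + reference[2:],          # 2 substitutions, 98% identity
    "distant": "L" * 10 + reference[10:],   # 10 substitutions, 90% identity
}
print(group_members(reference, candidates))
```

  • A real analysis would start from a proper pairwise or multiple alignment; the sketch only shows how a fixed identity threshold partitions candidates once the alignment exists.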
  • Racemase polypeptides having the sequence shown in SEQ ID NO:442, 444, 446, 448, 450, 452, and 454 were provided as pSE420-cHis clones.
  • One skilled in the art can synthesize the genes encoding these racemases using various published techniques, for example, as described in Stemmer et al., supra.
  • The plasmids were transformed into TOP10 chemically competent cells (Invitrogen, Carlsbad, CA). Overnight cultures growing in LB carbenicillin (100 µg/ml) were diluted 100x in 50 ml LB carbenicillin in a 250 ml baffled flask.
  • Cultures were grown at 30°C and 250 rpm until they reached an OD 600 of 0.5 to 0.8, after which protein expression was induced with 1 mM IPTG for 4 h at 30°C. Samples for total protein were taken prior to induction and right before harvesting. Cells were harvested by centrifugation. Cells were frozen at -80°C.
  • Cell extracts were typically prepared from the above frozen pellets by adding 5 ml per g of cell pellet of Bugbuster Amine Free (Novagen/EMD Biosciences, San Diego, CA) with 5 µL/mL of Protease Inhibitor Cocktail II (Calbiochem, San Diego, CA) and 1 µL/mL of benzonase nuclease (Novagen/EMD Biosciences, San Diego, CA). Cell solutions were incubated at room temperature with gentle mixing for 15 min; cells were spun out at 14,000 rpm for 20 min (at 4°C) and the supernatant was carefully removed.
  • Example 17 For the tryptophan racemization assay, a total of 1 mg of soluble protein (based on Pierce BCA total protein analysis with BSA as the standard) was added for each racemase candidate and positive controls. Cell-free extract of the polypeptide having the sequence shown in SEQ ID NO:412 (pSE420/TOP10 construct) served as positive control for the assay, and cell-free extract of TOP10 (Invitrogen, Carlsbad, CA) containing vector pSE420-cHis served as a negative control. Total protein concentration was determined using the Pierce BCA protein assay (Pierce Biotechnology, Inc., Rockford, IL) with bovine serum albumin (BSA) as the standard, per the manufacturer's instructions. Formation of D-tryptophan was measured at 30 minutes, 2 hours and 4 hours as described in Example 18.
  • The racemase candidate extracts tested above, polypeptides having the sequence shown in SEQ ID NO:442, 444, 446, 448, 450, 452 and 454, had detectable tryptophan racemase activity under the conditions described above.
  • Tryptophan racemase activity was detected for the positive control, the SEQ ID NO:412 polypeptide extract, and there was no detectable activity in the case of the pSE420-cHis vector control extracts. It is expected that the homologs of the representative racemase candidates having 95% or greater homology at amino acid level (see Table 31) will also have tryptophan racemase activity. Table 31.
  • Racemase candidates described in this example were grouped by amino acid sequence homology with clusters having 95% or greater homology at amino acid level to a reference sequence. One or more representatives were chosen from each group for characterization of tryptophan racemase activity using the conditions described in Part B. Using SEQ ID NO:244 as the reference sequence, the following racemase candidates had 97% or greater identity at amino acid level to the above reference sequence: SEQ ID NO:248, 236, 246, 252, 250, and 254. SEQ ID NO:448 is a non-leadered version of the reference SEQ ID NO:244 sequence.
  • SEQ ID NO:454 is a non-leadered version of the reference SEQ ID NO:288 sequence.
  • SEQ ID NO:452 is a non-leadered version of the SEQ ID NO:274 sequence.
  • SEQ ID NO:446 is a non-leadered version of the SEQ ID NO:234 sequence.
  • racemase candidates had 97% or greater identity at amino acid level to the above reference sequence: SEQ ID NO:208, 210, 228, 230, 270, 272, 278, 280, 282, 284, 292, 198, 212, 214, and 114.
  • SEQ ID NO:204 had 96% identity with the SEQ ID NO:218 reference sequence.
  • SEQ ID NO:444 is a non-leadered version of the reference SEQ ID NO:218 sequence. Under the conditions described in Part B (e.g., Example 17), tryptophan racemase activity was detected for the non-leadered version (SEQ ID NO:444) of the reference candidate, SEQ ID NO:218. Thus it would be expected that other racemase candidates with 97% or greater sequence identity at the amino acid level would also have tryptophan racemase activity.
  • SEQ ID NO:436 is a non-leadered version of the SEQ ID NO:114 sequence. Under the conditions of the assays described in Part B, tryptophan racemase activity was not detected for the non-leadered version (SEQ ID NO:436) of the racemase candidate SEQ ID NO:114, as shown in Example 12.
  • The SEQ ID NO:441 nucleic acid (encoding the polypeptide having the sequence of SEQ ID NO:442) was subcloned into pET30a with a C-terminal His tag.
  • A D56N mutant (corresponding to the D76N mutation in A. caviae) was created in SEQ ID NO:442. Mutagenesis was done using the QuikChange Multi site-directed mutagenesis kit (Stratagene, La Jolla, CA), using the C-tagged SEQ ID NO:442 gene in pET30a as template. The following mutagenic primer was used to make the D56N change as described in Example 19: 5'-CGCCATCATGAAGGCGAACGCCTACGGTCACG-3' (SEQ ID NO:516). [00186] The site-directed mutagenesis was done as described in the manufacturer's protocol. The resulting mutation was detrimental to tryptophan racemase activity in this candidate, whereas, in A. caviae, the corresponding D76N mutation has a positive effect.
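  • As a sanity check on the mutagenic primer quoted above, the primer can be translated in each forward reading frame using the standard genetic code. The sketch below is illustrative and not part of the described procedure: the function names are hypothetical, and the choice of reading frame is an inference from the translation (the text does not state the frame). In one frame the primer reads through an AAC codon, which encodes asparagine and is consistent with a D-to-N change; the wild-type codon is not given in this section.

```python
# Illustrative sketch only: translate the quoted primer in three forward frames
# with the standard genetic code. Which frame matches the gene is an inference.

BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

def translate(dna: str) -> str:
    """Translate a DNA string codon by codon, ignoring a trailing partial codon."""
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))

PRIMER = "CGCCATCATGAAGGCGAACGCCTACGGTCACG"  # primer sequence quoted in the text

frames = {offset: translate(PRIMER[offset:]) for offset in range(3)}
for offset, peptide in frames.items():
    print(f"frame +{offset + 1}: {peptide}")

# Frame +2 contains Lys-Ala-Asn; the Asn is encoded by AAC at primer positions 17-19
asn_codon = PRIMER[16:19]
```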
  • 465, 467, 469, 471, 473, 475, and 477 (encoding polypeptides having the sequence shown in SEQ ID NO:456, 458, 460, 462, 464, 466, 468, 470, 472, 474, 476, and 478) were provided as pSE420-cHis clones.
  • One skilled in the art can synthesize the genes encoding these racemases using various published techniques for example, as described in Stemmer et al., supra.
  • The plasmids were transformed into TOP10 chemically competent cells (Invitrogen, Carlsbad, CA).
  • Cell extracts were typically prepared from the above frozen pellets by adding 5 ml per g of cell pellet of Bugbuster Amine Free (Novagen/EMD Biosciences, San Diego, CA) with 5 µL/mL of Protease Inhibitor Cocktail II (Calbiochem, San Diego, CA) and 1 µL/mL of benzonase nuclease (Novagen/EMD Biosciences, San Diego, CA). Cell solutions were incubated at room temperature with gentle mixing for 15 min; cells were spun out at 14,000 rpm for 20 min (at 4°C) and the supernatant was carefully removed.
  • ID NO:456, 458, 460, 462, 464, 466, 468, 470, 472, 474, 476, 478 were prepared as described in other examples.
  • Example 17 For the tryptophan racemization assay, a total of 800 µg of soluble protein was added for each racemase candidate and positive controls. pSE420-cHis/TOP10 cell-free extracts of racemase polypeptides having the sequence shown in SEQ ID NO:412 and 442 served as positive controls for the assay, and cell-free extract of TOP10 (Invitrogen, Carlsbad, CA) containing vector pSE420-cHis served as a negative control. Total protein concentration was determined using the Pierce BCA (Pierce Biotechnology, Inc., Rockford, IL) protein assay with bovine serum albumin (BSA) as the standard, per the manufacturer's instructions. Formation of D-tryptophan was measured at 30 minutes, 2 hours and 4 hours as described in Example 18.
  • Racemase polypeptides having the sequence shown in SEQ ID NO:460, 474 and
  • Racemase polypeptides having the sequence shown in SEQ ID NO:456, 458, 462, 464, 466, 468, 470, 472 and 478 showed no detectable tryptophan racemase activity after 4 hours under the conditions tested. In a follow up experiment, a 24-hour sample was evaluated for D-tryptophan production. None of the racemases listed above showed detectable tryptophan racemase activity at 24 hours under the conditions described above. Of the candidates for which no activity was observed, racemase polypeptides having the sequence shown in SEQ ID NO:456, 458, 462, 464, 466, 468, 470, 472 and 478 exhibited poor or questionable soluble protein expression. The host organisms, expression conditions, and post expression cell handling can all affect whether there is detectable tryptophan racemase activity under the conditions of the assay. Additionally, under optimized conditions, it is expected that all racemase candidates could have tryptophan racemase activity.
  • Racemase candidates were grouped by amino acid sequence homology, with clusters having 95% or greater homology at amino acid level to a reference sequence. One or more representatives was/were chosen from each group for characterization of tryptophan racemase activity using the conditions described in Part B. [00194] Using SEQ ID NO:108 as the reference sequence, the following racemase candidates had 96% or greater identity at amino acid level to the above reference sequence: SEQ ID NO:172, 178, 180, 182, 184, 140, 144, 188, 190, 112, 148, 156, 120 and 162. SEQ ID NO:474 is a non-leadered version of the reference SEQ ID NO:108 sequence.
  • Racemase nucleic acid sequences SEQ ID NO:313, 325, 341, 343, 317, 329, 327,
  • Cell extracts were typically prepared from the above frozen pellets by adding 5 ml per g of cell pellet of Bugbuster Amine Free (Novagen/EMD Biosciences, San Diego, CA) with 5 µL/mL of Protease Inhibitor Cocktail II (Calbiochem, San Diego, CA) and 1 µL/mL of benzonase nuclease (Novagen/EMD Biosciences, San Diego, CA). Cell solutions were incubated at room temperature with gentle mixing for 15 min; cells were spun out at 14,000 rpm for 20 min (at 4°C) and the supernatant was carefully removed.
  • Desalted cell-free extracts were evaluated using tryptophan racemase assays under the conditions described in Example 17, with purified SEQ ID NO:442 polypeptides serving as a positive control.
  • For the tryptophan racemase assay, a total of 10 µg and 100 µg BAR-equivalent SEQ ID NO:442 racemase (based on Pierce BCA total protein analysis with BSA as the standard and estimation of percentage of BAR protein expressed from Experion (Experion, version A.01.10, Biorad, Hercules, CA)) were used as positive controls. 1 mg of total protein was added for each racemase candidate being tested (based on Pierce BCA total protein analysis with BSA as the standard). Formation of D-tryptophan was measured at 30 minutes, 1 hour, 2 hours, and 4 hours as described in Example 18. In a follow-up experiment, a 24-hour sample was evaluated for D-tryptophan production.
  • Racemase nucleic acids SEQ ID NO:321, 323 and 347 were provided as PCR products with Nde I and Not I restriction sites at the 5' and 3' ends, respectively. However, all of these sequences had additional Nde I and/or Not I sites internal to the gene sequence, so direct subcloning was not possible.
  • SEQ ID NO:349 was re-amplified by PCR with rTth polymerase (Applied Biosystems, Foster City, CA) and primers adding an Nde I and Xho I restriction site at the 5' and 3' ends, respectively.
  • The PCR fragment was digested with Nde I and Xho I restriction enzymes and ligated into the Nde I and Xho I restriction sites of pET30a. Correct plasmids were verified by digestion with Nde I and Xho I and sequencing (Agencourt, Beverly, MA).
  • One skilled in the art can synthesize the genes encoding these racemases using various published techniques, for example, as described in Stemmer et al., supra.
  • Example 17 with purified A. caviae D76N BAR (see Example 19) serving as a positive control.
  • A total of 50 µg BAR equivalent of positive control (based on Pierce BCA total protein analysis with BSA as the standard and estimation of percentage of BAR protein expressed from Experion (Experion, version A.01.10, Biorad, Hercules, CA)) was added. 1 mg of total protein was added for each racemase candidate being tested (based on Pierce BCA total protein analysis with BSA as the standard). Formation of D-tryptophan was measured at 1 hour, 2 hours, 4 hours, and 21.5 hours as described in Example 18.
  • Tryptophan racemase activity was observed for polypeptides having the sequence shown in SEQ ID NO:322. This enzyme is interesting because it is the smallest racemase protein that was active on tryptophan, with the protein being only 232 amino acids (as compared to 409 amino acids for the A. caviae benchmark, and >300 amino acids for most of the other racemase candidates).
  • Racemase nucleic acids having the sequence of SEQ ID NO:339 and 349
  • Nucleic acid having the sequence of SEQ ID NO:350 was re-amplified by PCR with RTth polymerase (Applied Biosystems, Foster City, CA) and primers adding an Nde I and Xho I restriction site at the 5' and 3' ends, respectively.
  • The PCR fragment was digested with Nde I and Xho I restriction enzymes and ligated into the Nde I and Xho I restriction sites of pET30a. Correct plasmids were verified by digestion with Nde I and Xho I and sequencing (Agencourt, Beverly, MA).
  • One skilled in the art can synthesize the genes encoding these racemases using various published techniques, for example, as described in Stemmer et al., supra.
  • Express™ media Solutions 1-6, Novagen/EMD Biosciences, San Diego, CA
  • Overnight Express™ cultures were grown at 30°C, with agitation at 250 rpm for approximately 20 hours, and cells were harvested by centrifugation when the OD 600nm reached between 6 and 10.
  • ID NO:340 and 350 were prepared as described above.
  • Example 17 with the polypeptide having the sequence of SEQ ID NO:412 serving as a positive control.
  • A total of approximately 5 µg BAR equivalent of control (based on Pierce BCA total protein analysis with BSA as the standard and estimation of percentage of BAR protein expression level from Experion (Experion, version A.01.10, Biorad, Hercules, CA)) was added, and 1 mg of total cell-free protein extract was added for each racemase candidate being tested (based on Pierce BCA total protein analysis with BSA as the standard). Formation of D-tryptophan was measured at 15 minutes, 2 hours, and 21 hours as described in Example 18.
  • SEQ ID NO:340 or 350 under the conditions tested.
  • Positive control polypeptides having the sequence of SEQ ID NO:412 showed tryptophan racemase activity.
  • SDS-PAGE analysis showed low soluble protein expression for SEQ ID NO:340 and 350 polypeptides.
  • The host organisms, expression conditions, and post-expression cell handling can all affect whether there is detectable tryptophan racemase activity under the conditions of the assay. Additionally, under optimized conditions, it is expected that all racemase candidates could have tryptophan racemase activity.
  • Example 16 Analysis of racemases provided as PCR-4-Blunt TOPO clones. [00215] Racemase nucleic acids having the sequence shown in SEQ ID NO:335, 337, 357,
  • Racemases in these plasmids were amplified with RTth polymerase (Applied Biosystems, Foster City, CA) and primers adding an Nde I and Xho I restriction site at the 5' and 3' ends, respectively.
  • Cell extracts were typically prepared from the above frozen pellets by adding 5 mL per g of cell pellet of Bugbuster Amine Free (Novagen/EMD Biosciences, San Diego, CA) with 5 µL/mL of Protease Inhibitor Cocktail II (Calbiochem, San Diego, CA) and 1 µL/mL of benzonase nuclease (Novagen/EMD Biosciences, San Diego, CA). Cell solutions were incubated at room temperature with gentle mixing for 15 min; cells were spun out at 14000 rpm for 20 min (at 4°C) and the supernatant was carefully removed.
  • Example 17 with purified A. caviae D76N BAR (Example 19) serving as a positive control.
  • A total of 100 µg BAR equivalent of control was added (based on Pierce BCA total protein analysis with BSA as the standard and estimation of percentage of BAR protein expressed from Experion, version A.01.10, Biorad, Hercules, CA), and 1 mg of total protein was added for each racemase candidate being tested. Formation of D-tryptophan was measured at 30 minutes, 2 hours, 4 hours and 52 hours as described in Example 18.
  • Racemase polypeptides having SEQ ID NO:336, 338 and 358 were active.
  • Racemase polypeptides having SEQ ID NO:366, 360, and 362 showed no detectable tryptophan racemase activity under the conditions tested.
  • Polypeptides having SEQ ID NO:366, 360, and 362 all had satisfactory soluble protein expression.
  • The host organisms, expression conditions, and post-expression cell handling can all affect whether there is detectable tryptophan racemase activity under the conditions of the assay. Additionally, under optimized conditions, it is expected that all racemase candidates could have tryptophan racemase activity.
  • Example 17 Description of racemase assay conditions
  • Racemase assays were performed starting with the L-amino acid isomer and the formation of corresponding D-amino acid was followed. [00223] Assay conditions:
  • The assays were conducted at 30°C with shaking at 225 rpm. Desalted racemase candidate proteins (cell-free extracts or purified preparations) were evaluated for amino acid racemase activity. Wherever possible, appropriate negative and positive controls were included for the assays. Sample aliquots were taken for analysis at various timepoints and formic acid was added to a final concentration of 2% to stop the reaction.
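The quench step above reduces to a dilution calculation. As a minimal sketch, assuming the 2% figure is volume/volume and that volumes are additive, and using a hypothetical aliquot size and stock strength (the source states neither), the volume of formic acid stock to add can be worked out as follows:

```python
def quench_volume_ul(sample_ul: float, stock_percent: float,
                     final_percent: float = 2.0) -> float:
    """Volume of stock (µL) that brings the mixture to final_percent.

    Solves stock_percent * v = final_percent * (sample_ul + v) for v.
    """
    if stock_percent <= final_percent:
        raise ValueError("stock must be more concentrated than the target")
    return sample_ul * final_percent / (stock_percent - final_percent)


# Hypothetical numbers: a 100 µL aliquot quenched with a 20% stock.
volume = quench_volume_ul(100.0, 20.0)
final = 20.0 * volume / (100.0 + volume)
print(f"add {volume:.1f} µL of stock; final concentration {final:.2f}%")
```

The formula accounts for the extra volume contributed by the stock itself, which a simple "2% of the aliquot volume" shortcut would ignore.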
  • Example 18 Detection of monatin stereoisomers and chiral detection of lysine, alanine, methionine, tyrosine, leucine, phenylalanine, tryptophan, glutamate, and aspartate [00230]
  • This example describes methods used to detect the presence of stereoisomers of monatin, lysine, alanine, methionine, tyrosine, leucine, phenylalanine, tryptophan, glutamate, and aspartate. It also describes a method for the separation and detection of the four stereoisomers of monatin.
  • LC separations capable of separating all four stereoisomers of monatin were performed on a Phenomenex Luna 2.0 x 250 mm (3 µm) C18(2) reversed phase chromatography column at 40°C.
  • the LC mobile phase consisted of A) water containing 0.05%
  • L- and D-amino acids such as lysine, alanine, methionine, tyrosine, leucine, phenylalanine, tryptophan, glutamate, and aspartate from biochemical reaction experiments were first treated with formic acid to denature protein. The sample was then centrifuged and filtered through a 0.2 µm nylon syringe filter prior to LC/MS/MS analysis. Identification of L- and D-amino acids was based on retention time and mass selective detection. LC separation was accomplished by using a Waters 2690 liquid chromatography system and an ASTEC 2.1 mm x 250 mm Chirobiotic TAG chromatography column with column temperature set at 45°C.
  • LC mobile phase A and B were 0.25% acetic acid and 0.25% acetic acid in methanol, respectively. Isocratic elution was used for all methods to separate the L and D isomers. Lysine was eluted using 80% mobile phase A, and 20% B and a flow rate of 0.25 mL/min. Glutamate, alanine, and methionine were separated with elution of 60% mobile phase A and 40% B and a flow rate of 0.25 mL/min.
  • Aspartate, tryptophan, tyrosine, leucine, and phenylalanine were separated isomerically with 30% mobile phase A and 70% B with a flow rate of 0.3 mL/min for aspartate and tryptophan, and 0.25 mL/min for tyrosine, leucine, and phenylalanine.
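The isocratic conditions listed above amount to a per-analyte lookup table. The sketch below simply restates them as data so a method can be selected by analyte name; the `Elution` type and `conditions_for` helper are illustrative names, not part of any instrument software.

```python
from typing import NamedTuple


class Elution(NamedTuple):
    percent_a: int       # mobile phase A: 0.25% acetic acid
    percent_b: int       # mobile phase B: 0.25% acetic acid in methanol
    flow_ml_min: float   # flow rate in mL/min


# Values transcribed from the text above.
ELUTION = {
    "lysine": Elution(80, 20, 0.25),
    "glutamate": Elution(60, 40, 0.25),
    "alanine": Elution(60, 40, 0.25),
    "methionine": Elution(60, 40, 0.25),
    "aspartate": Elution(30, 70, 0.3),
    "tryptophan": Elution(30, 70, 0.3),
    "tyrosine": Elution(30, 70, 0.25),
    "leucine": Elution(30, 70, 0.25),
    "phenylalanine": Elution(30, 70, 0.25),
}


def conditions_for(amino_acid: str) -> Elution:
    """Look up the isocratic conditions for one analyte (case-insensitive)."""
    return ELUTION[amino_acid.lower()]


trp = conditions_for("Tryptophan")
print(f"tryptophan: {trp.percent_a}% A / {trp.percent_b}% B at {trp.flow_ml_min} mL/min")
```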
  • The detection system for analysis of L- and D-amino acids included a Waters 996 Photo-Diode Array (PDA) detector and a Micromass Quattro Ultima triple quadrupole mass spectrometer. The PDA, scanning from 195 to 350 nm, was placed in series between the chromatography system and the mass spectrometer.
  • Parameters for the Micromass Quattro Ultima triple quadrupole mass spectrometer operating in positive electrospray ionization mode (+ESI) were set as the following: Capillary: 3.2 kV; Cone: 20 V; Hex 1: 12 V; Aperture: 0.1 V; Hex 2: 0.2 V; Source temperature: 120°C; Desolvation temperature: 350°C; Desolvation gas: 641 L/h; Cone gas: 39 L/h; Low mass Q1 resolution: 16.0; High mass Q1 resolution: 16.0; Ion energy 1: 0.1; Entrance: -5; Collision: 20; Exit 1: 10; Low mass Q2 resolution: 16.0; High mass Q2 resolution: 16.0; Ion energy 2: 1.0; Multiplier: 650 V.
  • MS/MS experiments with Multiple Reaction Monitoring (MRM) mode were set up to selectively monitor reaction transitions of 147.8 to 84.03, 147.8 to 56.3, and 147.8 to 102.2 for glutamate; 133.85 to 74.03, 133.85 to 69.94 and 133.85 to 87.99 for aspartate; 146.89 to 84.09, 146.89 to 55.97 and 146.89 to 67.23 for lysine; 149.80 to 56.1, 149.8 to 61.01, and 149.80 to 104.15 for methionine; 181.95 to 135.97, 181.95 to 90.88 and 181.95 to 118.87 for tyrosine; 131.81 to 86.04 and 131.81 to 69.31 for leucine; 90.0 to 44.3 for alanine; and 165.83 to 102.96, 165.83 to 93.27 and 165.83 to
  • the monatin racemic mixture was esterified, the free amino group was blocked with Cbz, a lactone was formed, and the S,S lactone was selectively hydrolyzed using an immobilized protease enzyme.
  • the monatin can also be separated as described in Bassoli et al., 2005, Eur. J. Org. Chem., 8: 1652-1658.
  • This example describes the cloning of the A. caviae BAR and a D76N mutant that were used as positive controls in some of the Examples.
  • Degenerate primers were designed (based on conserved regions of known BAR homologs) to obtain the BAR gene from Aeromonas caviae ATCC 14486.
  • Aer deg F2 5'-GCCAGCAACGARGARGCMCGCGT-3' (SEQ ID NO:541);
  • Aer deg R1 5'-TGGCCSTKGATCAGCACA-3' (SEQ ID NO:542)
  • K indicates G or T
  • R indicates A or G
  • S indicates C or G
  • M indicates A or C
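The ambiguity codes defined above determine how many distinct oligonucleotides each degenerate primer represents. The following is a small sketch that expands a degenerate primer into its concrete sequences using only the four codes listed (K, R, S, M); the primer strings are transcribed from the listing above and should be treated as illustrative, and the function names are made up for this example.

```python
from itertools import product

# Ambiguity codes as defined in the text, plus the four plain bases.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "K": "GT", "R": "AG", "S": "CG", "M": "AC",
}


def degeneracy(primer: str) -> int:
    """Number of distinct sequences encoded by a degenerate primer."""
    count = 1
    for base in primer:
        count *= len(IUPAC[base])
    return count


def expand(primer: str) -> list:
    """All concrete sequences encoded by a degenerate primer."""
    return ["".join(bases) for bases in product(*(IUPAC[b] for b in primer))]


AER_DEG_F2 = "GCCAGCAACGARGARGCMCGCGT"  # as transcribed from the listing
AER_DEG_R1 = "TGGCCSTKGATCAGCACA"

print(degeneracy(AER_DEG_F2), degeneracy(AER_DEG_R1))
for sequence in expand(AER_DEG_R1):
    print(sequence)
```

With two R positions and one M position, the forward primer is a pool of 2 x 2 x 2 sequences; the reverse primer, with one S and one K, is a pool of 2 x 2.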
  • A 50 µL reaction contained 0.5 µL template (~100 ng of A. caviae genomic DNA), 1.6 µM of each primer, 0.3 mM each dNTP, 10 U rTth Polymerase XL (Applied Biosystems, Foster City, CA), 1X XL buffer, 1 mM Mg(OAc)2 and 2.5 µL dimethyl sulfoxide.
  • The thermocycler program used included a hot start at 94°C for 3 minutes and 30 repetitions of the following steps: 94°C for 30 seconds, 53°C for 30 seconds, and 68°C for 2 minutes. After the 30 repetitions, the sample was maintained at 68°C for 7 minutes and then stored at 4°C. This PCR protocol produced a product of 715 bp.
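For planning purposes, the programmed hold time of the cycling protocol above can be summed directly. This sketch uses only the times stated in the text; the step labels (denature, anneal, extend) are conventional PCR terms added here for readability, and ramp times between temperatures and the final 4°C hold are not included.

```python
HOT_START_S = 3 * 60
CYCLES = 30
CYCLE_STEPS_S = {
    "denature, 94 C": 30,
    "anneal, 53 C": 30,
    "extend, 68 C": 2 * 60,
}
FINAL_EXTENSION_S = 7 * 60


def program_seconds() -> int:
    """Total programmed hold time, excluding ramping and the 4 C storage step."""
    per_cycle = sum(CYCLE_STEPS_S.values())
    return HOT_START_S + CYCLES * per_cycle + FINAL_EXTENSION_S


total = program_seconds()
print(f"programmed hold time: {total} s ({total / 60:.0f} min)")
```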
  • The PCR product was gel purified from 0.8% TAE-agarose gel using the Qiagen gel extraction kit (Qiagen, Valencia, CA). The product was TOPO cloned and transformed into TOP10 cells according to manufacturer's protocol (Invitrogen, Carlsbad, CA). The plasmid DNA was purified from the resulting transformants using the Qiagen spin miniprep kit (Qiagen, Valencia, CA) and screened for the correct inserts by restriction digest with EcoR I. The sequences of plasmids appearing to have the correct insert were verified by dideoxy chain termination DNA sequencing with universal M13 forward primers.
  • BD GenomeWalker™ Universal Kit (Clontech).
  • Gene-specific primers were designed as per GenomeWalker™ manufacturer's protocols based on sequences obtained using degenerate primer sequences (see above), allowing for a few hundred homologous base pair overlap with original product. These gene-specific primers were subsequently used with GenomeWalker™ adaptor primers for PCR of upstream and downstream sequences to complete the A. caviae BAR ORF.
  • VPEAYFDMVR PGGIIYGDTI PSYTEYKKVM AFKTQVASVN HYPAGNTVGY
  • When 200 µg of purified (tagged) racemase enzymes were used in a tryptophan racemase assay as described in Example 17, at 30 minutes, A. caviae BAR produced 1034 µg/mL of D-tryptophan. The effect of leader sequences on racemase activity
  • The leaderless racemase, when expressed, was found to retain approximately 65% of the activity, as compared with the expression product of the full-length gene.
  • The periplasmic and cytoplasmic protein fractions were isolated for the wild type expression products, as well as the leaderless constructs, as described in the pET System Manual (Novagen, Madison, WI).
  • the majority of expressed wildtype BAR was found in the periplasm, while the leaderless BAR appeared to remain in the cytoplasm.
  • The reduction in activity of the leaderless A. caviae BAR may be due to a change in processing and/or folding when expressed in the cytoplasm.
  • A D76N mutant of A. caviae BAR was made to determine if this position was critical for broad activity. Mutagenesis was done using the QuikChange-Multi site-directed mutagenesis kit (Stratagene, La Jolla, CA), using the C-tagged A. caviae BAR gene in pET30 as template. The following mutagenic primer was used to make a D76N change (nucleotide position 226): 5'-CGC CAT CAT GAA GGC GAA CGC CTA CGG TCA CG-3' (SEQ ID
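The relationship between residue 76 and nucleotide position 226 quoted above follows from codon arithmetic: codon n starts at nucleotide 3(n - 1) + 1. The sketch below checks that and translates the mutagenic primer with the standard genetic code. One point is an inference rather than something the source states: the triplet spacing printed in the primer does not appear to be the reading frame, because an asparagine codon (AAC) only shows up when translation starts at the second base. Function names are illustrative.

```python
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Standard genetic code, built in the conventional TCAG ordering.
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def codon_start(residue_number: int) -> int:
    """1-based nucleotide position of the first base of a codon."""
    return 3 * (residue_number - 1) + 1


def translate(dna: str) -> str:
    """Translate complete codons only; a trailing partial codon is dropped."""
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


PRIMER = "CGCCATCATGAAGGCGAACGCCTACGGTCACG"  # spaces removed from the text above

print(codon_start(76))
for offset in range(3):
    print(offset, translate(PRIMER[offset:]))
```

Only the frame starting at offset 1 contains an asparagine (N), which is consistent with a primer that introduces Asn in place of Asp.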
  • mutant and the wildtype enzyme were produced as described above and assayed as described in Example 17 using 200 micrograms of purified protein (prepared as described herein
  • caviae D76N was C-term His tagged in pET30) and approximately 7 mg/mL of L-tryptophan as substrate.
  • The mutant produced 1929 micrograms per mL of D-tryptophan as compared to 1149 micrograms per mL for the wildtype enzyme.
  • The D76N mutant also reached equilibrium at an earlier time point. The improvement in activity was unexpected.
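The comparison above can be put in relative terms with two simple ratios. This is a sketch using only the figures quoted in the text; the substrate level is stated as "approximately 7 mg/mL", so the conversion fractions are approximate, and the time point of the measurement is not given in this excerpt.

```python
WILDTYPE_UG_PER_ML = 1149.0
D76N_UG_PER_ML = 1929.0
SUBSTRATE_UG_PER_ML = 7000.0  # "approximately 7 mg/mL" of L-tryptophan


def fold_change(mutant: float, wildtype: float) -> float:
    """Ratio of D-tryptophan formed by the mutant to that formed by wildtype."""
    return mutant / wildtype


def fraction_converted(product: float, substrate: float) -> float:
    """Mass fraction of the starting L-tryptophan recovered as D-tryptophan.

    L- and D-tryptophan have the same molar mass, so the mass ratio
    equals the molar ratio.
    """
    return product / substrate


fold = fold_change(D76N_UG_PER_ML, WILDTYPE_UG_PER_ML)
mutant_fraction = fraction_converted(D76N_UG_PER_ML, SUBSTRATE_UG_PER_ML)
wildtype_fraction = fraction_converted(WILDTYPE_UG_PER_ML, SUBSTRATE_UG_PER_ML)
print(f"D76N / wildtype: {fold:.2f}-fold")
print(f"converted: D76N {mutant_fraction:.1%}, wildtype {wildtype_fraction:.1%}")
```

Because a racemase interconverts enantiomers, the D fraction would be expected to level off near one half at equilibrium, so both values here correspond to reactions that had not yet reached that point.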
  • SEQ ID NO:442 was generated. See Example 13.
  • racemase polypeptides had 99% identity to the BAR from A. caviae described in this example: SEQ ID NO:200, 202, 206, 142, 186 and 176.
  • SEQ ID NO: 176 had 97% identity at the amino acid level to the BAR from A. caviae described in this example. It is expected that these candidates would also have tryptophan racemase activity given the high sequence homology to an enzyme with demonstrated tryptophan racemase activity.
  • Appendix I shows a table that describes selected characteristics of exemplary nucleic acids and polypeptides of the invention, including sequence identity comparison of the exemplary sequences to public databases.
  • In the first row, labeled “SEQ ID NO:”, the numbers “1, 2” represent the exemplary polypeptide of the invention having a sequence as set forth in SEQ ID NO:2, encoded by, e.g., SEQ ID NO:1.
  • the sequences described in Appendix I (the exemplary sequences of the invention) have been subject to a BLAST search (as described herein) against two sets of databases.
  • the first database set is available through NCBI (National Center for Biotechnology Information).
  • NR refers to the Non-Redundant nucleotide database maintained by NCBI. This database is a composite of GenBank, GenBank updates, and EMBL updates.
  • the entries in the column “NR Description” refer to the definition line in any given NCBI record, which includes a description of the sequence, such as the source organism, gene name/protein name, or some description of the function of the sequence.
  • the entries in the column “NR Accession Code” refer to the unique identifier given to a sequence record.
  • NR E-value refers to the Expected value (E-value), which represents the probability that an alignment score as good as the one found between the query sequence (the sequences of the invention) and that particular database sequence would be found in the same number of comparisons between random sequences as was done in the present BLAST search.
  • the entries in the column “NR Organism” refer to the source organism of the sequence identified as the closest BLAST hit.
  • The second database set is collectively known as the GENESEQ™ database, which is available through Thomson Derwent (Philadelphia, PA).
  • The results in the “Predicted EC No.” column are determined by a BLAST search against the KEGG (Kyoto Encyclopedia of Genes and Genomes) database. If the top BLAST match has an E-value equal to or less than e-6, the EC number assigned to the top match is entered into the table. [00262]
  • The columns “Query DNA Length” and “Query Protein Length” refer to the number of nucleotides or the number of amino acids, respectively, in the sequence of the invention that was searched or queried against either the NCBI or GENESEQ™ databases.
  • The columns “Subject DNA Length” and “Subject Protein Length” refer to the number of nucleotides or the number of amino acids, respectively, in the sequence of the top match from the BLAST searches. The results provided in these columns are from the search that returned the lower E-value, either from the NCBI databases or the GENESEQ™ database.
  • The columns “%ID Protein” and “%ID DNA” refer to the percent sequence identity between the sequence of the invention and the sequence of the top BLAST match. The results provided in these columns are from the search that returned the lower E-value, either from the NCBI databases or the GENESEQ™ database.
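The two table-filling rules described above (assign an EC number only when the top KEGG match has an E-value of e-6 or lower, and report lengths and percent identity from whichever search returned the lower E-value) can be sketched as follows. The `TopHit` record, the function names, and all example values are hypothetical; the source does not describe how ties or missing hits were handled.

```python
from typing import NamedTuple, Optional


class TopHit(NamedTuple):
    database: str                      # e.g. "NCBI", "GENESEQ" or "KEGG"
    e_value: float
    ec_number: Optional[str] = None
    percent_id: Optional[float] = None


E_VALUE_CUTOFF = 1e-6  # "equal to or less than e-6"


def predicted_ec(kegg_top_hit: Optional[TopHit]) -> Optional[str]:
    """EC number of the top KEGG match, or None if the match is too weak."""
    if kegg_top_hit is None or kegg_top_hit.e_value > E_VALUE_CUTOFF:
        return None
    return kegg_top_hit.ec_number


def reported_hit(ncbi: TopHit, geneseq: TopHit) -> TopHit:
    """Hit whose values are reported: the one with the lower E-value.

    min() returns the first argument on a tie; that tie rule is an
    assumption of this sketch.
    """
    return min(ncbi, geneseq, key=lambda hit: hit.e_value)


# Made-up example values.
strong = TopHit("KEGG", 3e-40, ec_number="5.1.1.1")
weak = TopHit("KEGG", 0.002, ec_number="5.1.1.1")
chosen = reported_hit(
    TopHit("NCBI", 1e-120, percent_id=62.0),
    TopHit("GENESEQ", 1e-150, percent_id=71.0),
)
print(predicted_ec(strong), predicted_ec(weak), chosen.database)
```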

Abstract

The invention provides for isomerase (e.g., racemase) and epimerase polypeptides and nucleic acids encoding such polypeptides. Also provided are methods of using such isomerase (e.g., racemase) and epimerase nucleic acids and polypeptides.

Description

ISOMERASES AND EPIMERASES AND METHODS OF USING
CROSS REFERENCE TO RELATED APPLICATION
[0001] This application claims benefit of US Serial Number 61/018,845 filed 03 January 2008 entitled ISOMERASES AND EPIMERASES AND METHODS OF USING, which is hereby incorporated by reference in its entirety.
INCORPORATION BY REFERENCE
[0002] A Sequence Listing is being filed concurrently with the filing of this application and is hereby incorporated by reference.
TECHNICAL FIELD
[0003] This invention relates to nucleic acids and polypeptides, and more particularly to nucleic acids and polypeptides encoding isomerases (e.g., racemases) and epimerases as well as methods of using such isomerases and epimerases.
BACKGROUND
[0004] Isomerases such as racemases as well as epimerases can catalyze the interconversion of substrate enantiomers. Isomerases and epimerases can catalyze the stereochemical inversion around an asymmetric carbon atom in a substrate having one or more centers of asymmetry.
SUMMARY
[0005] This disclosure provides for a number of different isomerase (e.g., racemase) and epimerase polypeptides and the nucleic acids encoding such isomerase and epimerase polypeptides. This disclosure also provides for methods of using such isomerase and epimerase nucleic acids and polypeptides.
[0006] In one aspect, the invention provides methods of isomerizing a substrate. For example, one or more L-amino acids can be converted to the corresponding one or more D-amino acids (or, alternatively, one or more D-amino acids to the corresponding one or more L-amino acids). Such methods generally include combining one or more L-amino acids (or one or more D-amino acids) with a) one or more nucleic acid molecules chosen from the group consisting of SEQ ID NOs:1, 3, 5, 7, 9, 11, 13, 15, 17, 19, 21, 23, 25, 27, 29, 31, 33, 35, 37, 39, 41, 43, 45, 47, 49, 51, 53, 55, 57, 59, 61, 63, 65, 67, 69, 71, 73, 75, 77, 79, 81, 83, 85, 87, 89, 91, 93, 95, 97, 99, 101, 103, 105, 107, 109, 111, 113, 115, 117, 119, 121, 123, 125, 127, 129, 131, 133, 135, 137, 139, 141, 143, 145, 147, 149, 151, 153, 155, 157, 159, 161, 163, 165, 167, 169, 171, 173, 175, 177, 179, 181, 183, 185, 187, 189, 191, 193, 195, 197, 199, 201, 203, 205, 207, 209, 211, 213, 215, 217, 219, 221, 223, 225, 227, 229, 231, 233, 235, 237, 239, 241, 243, 245, 247, 249, 251, 253, 255, 257, 259, 261, 263, 265, 267, 269, 271, 273, 275, 277, 279, 281, 283, 285, 287, 289, 291, 293, 295, 297, 299, 301, 303, 305, 307, 309, 311, 313, 315, 317, 319, 321, 323, 325, 327, 329, 331, 333, 335, 337, 339, 341, 343, 345, 347, 349, 351, 353, 355, 357, 359, 361, 363, 365, 367, 369, 371, 373, 375, 377, 379, 381, 383, 385, 387, 389, 391, 393, 395, 397, 399, 401, 403, 405, 407, 409, 411, 413, 415, 417, 419, 421, 423, 425, 427, 429, 431, 433, 435, 437, 439, 441, 443, 445, 447, 449, 451, 453, 455, 457, 459, 461, 463, 465, 467, 469, 471, 473, 475, 477, 479, 481, 483, 485, 487, 489, 491, 493, 495, and 497, wherein the one or more nucleic acid molecules encode polypeptides having isomerase or epimerase activity; b) a sequence variant of a), wherein the variant encodes a polypeptide having isomerase or epimerase activity; c) a fragment of a) or b), wherein the fragment encodes a polypeptide having isomerase or epimerase activity; d) one or more polypeptides chosen from the group consisting of SEQ ID NOs:2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24, 26, 28, 30, 32, 34, 36, 38, 40, 42, 44, 46, 48, 50, 52, 54, 56, 58, 60, 62, 64, 66, 68, 70, 72, 74, 76, 78, 80, 82, 84, 86, 88, 90, 92, 94, 96, 98, 100, 102, 104, 106, 108, 110, 112, 114, 116, 118, 120, 122, 124, 126, 128, 130, 132, 134, 136, 138, 140, 142, 144, 146, 148, 150, 152, 154, 156, 158, 160, 162, 164, 166, 168, 170, 172, 174, 176, 178, 180, 182, 184, 186, 188, 190, 192, 194, 196, 198, 200, 202, 204, 206, 208, 210, 212, 214, 216, 218, 220, 222, 224, 226, 228, 230, 232, 234, 236, 238, 240, 242, 244, 246, 248, 250, 252, 254, 256, 258, 260, 262, 264, 266, 268, 270, 272, 274, 276, 278, 280, 282, 284, 286, 288, 290, 292, 294, 296, 298, 300, 302, 304, 306, 308, 310, 312, 314, 316, 318, 320, 322, 324, 326, 328, 330, 332, 334, 336, 338, 340, 342, 344, 346, 348, 350, 352, 354, 356, 358, 360, 362, 364, 366, 368, 370, 372, 374, 376, 378, 380, 382, 384, 386, 388, 390, 392, 394, 396, 398, 400, 402, 404, 406, 408, 410, 412, 414, 416, 418, 420, 422, 424, 426, 428, 430, 432, 434, 436, 438, 440, 442, 442 D56N, 444, 446, 448, 450, 452, 454, 456, 458, 460, 462, 464, 466, 468, 470, 472, 474, 476, 478, 480, 482, 484, 486, 488, 490, 492, 494, 496, and 498, wherein the one or more polypeptides has isomerase or epimerase activity; e) a variant of d), wherein the variant has isomerase or epimerase activity; or f) a fragment of d) or e), wherein the fragment has isomerase or epimerase activity.
[0007] In one embodiment, the one or more nucleic acid molecules are chosen from the group consisting of SEQ ID NOs:1, 3, 5, 7, 9, 11, 13, 15, 17, 19, 21, 23, 25, 27, 29, 31, 33, 35, 37, 39, 41, 43, 45, 47, 49, 51, 53, 55, 57, 59, 61, 63, 65, 67, 69, 71, 73, 75, 77, 79, 81, 83, 85, 87, 89, 91, 93, 95, 97, 99, 101, 103, 105, 107, 109, 111, 113, 115, 117, 119, 121, 123, 125, 127, 129, 131, 133, 135, 137, 139, 141, 143, 145, 147, 149, 151, 153, 155, 157, 159, 161, 163, 165, 167, 169, 171, 173, 175, 177, 179, 181, 183, 185, 187, 189, 191, 193, 195, 197, 199, 201, 203, 205, 207, 209, 211, 213, 215, 217, 219, 221, 223, 225, 227, 229, 231, 233, 235, 237, 239, 241, 243, 245, 247, 249, 251, 253, 255, 257, 259, 261, 263, 265, 267, 269, 271, 273, 275, 277, 279, 281, 283, 285, 287, 289, 291, 293, 295, 297, 299, 301, 303, 305, 307, 309, 311, 313, 315, 317, 319, 321, 323, 325, 327, 329, 331, 333, 335, 337, 339, 341, 343, 345, 347, 349, 351, 353, 355, 357, 359, 361, 363, 365, 367, 385, 387, 389, 391, 393, 395, 397, 399, 401, 403, 405, 407, 409, 411, 413, 415, 417, 419, 421, 423, 425, 427, 429, 431, 433, 435, 437, 439, 441, 443, 445, 447, 449, 451, 453, 455, 457, 459, 461, 463, 465, 467, 469, 471, 473, 475, 477, 479, 481, 483, 485, 487, 489, 491, 493, 495, and 497.
[0008] In one embodiment, the one or more polypeptides are chosen from the group consisting of SEQ ID NOs:2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24, 26, 28, 30, 32, 34, 36, 38, 40, 42, 44, 46, 48, 50, 52, 54, 56, 58, 60, 62, 64, 66, 68, 70, 72, 74, 76, 78, 80, 82, 84, 86, 88, 90, 92, 94, 96, 98, 100, 102, 104, 106, 108, 110, 112, 114, 116, 118, 120, 122, 124, 126, 128, 130, 132, 134, 136, 138, 140, 142, 144, 146, 148, 150, 152, 154, 156, 158, 160, 162, 164, 166, 168, 170, 172, 174, 176, 178, 180, 182, 184, 186, 188, 190, 192, 194, 196, 198, 200, 202, 204, 206, 208, 210, 212, 214, 216, 218, 220, 222, 224, 226, 228, 230, 232, 234, 236, 238, 240, 242, 244, 246, 248, 250, 252, 254, 256, 258, 260, 262, 264, 266, 268, 270, 272, 274, 276, 278, 280, 282, 284, 286, 288, 290, 292, 294, 296, 298, 300, 302, 304, 306, 308, 310, 312, 314, 316, 318, 320, 322, 324, 326, 328, 330, 332, 334, 336, 338, 340, 342, 344, 346, 348, 350, 352, 354, 356, 358, 360, 362, 364, 366, 368, 386, 388, 390, 392, 394, 396, 398, 400, 402, 404, 406, 408, 410, 412, 414, 416, 418, 420, 422, 424, 426, 428, 430, 432, 434, 436, 438, 440, 442, 442 D56N, 444, 446, 448, 450, 452, 454, 456, 458, 460, 462, 464, 466, 468, 470, 472, 474, 476, 478, 480, 482, 484, 486, 488, 490, 492, 494, 496, and 498.
[0009] In certain embodiments, the nucleic acid molecule has the sequence shown in SEQ ID NO:411 and the polypeptide has the sequence shown in SEQ ID NO:412.
[0010] In certain embodiments, the variant is a nucleic acid molecule that has at least 98% (e.g., at least 99%) sequence identity to SEQ ID NOs:1, 3, 5, 7, 9, 11, 13, 15, 17, 19, 21, 23, 25, 27, 29, 31, 33, 35, 37, 39, 41, 43, 45, 47, 49, 51, 53, 55, 57, 59, 61, 63, 65, 67, 69, 71, 73, 75, 77, 79, 81, 83, 85, 87, 89, 91, 93, 95, 97, 99, 101, 103, 105, 107, 109, 111, 113, 115, 117, 119, 121, 123, 125, 127, 129, 131, 133, 135, 137, 139, 141, 143, 145, 147, 149, 151, 153, 155, 157, 159, 161, 163, 165, 167, 169, 171, 173, 175, 177, 179, 181, 183, 185, 187, 189, 191, 193, 195, 197, 199, 201, 203, 205, 207, 209, 211, 213, 215, 217, 219, 221, 223, 225, 227, 229, 231, 233, 235, 237, 239, 241, 243, 245, 247, 249, 251, 253, 255, 257, 259, 261, 263, 265, 267, 269, 271, 273, 275, 277, 279, 281, 283, 285, 287, 289, 291, 293, 295, 297, 299, 301, 303, 305, 307, 309, 311, 313, 315, 317, 319, 321, 323, 325, 327, 329, 331, 333, 335, 337, 339, 341, 343, 345, 347, 349, 351, 353, 355, 357, 359, 361, 363, 365, 367, 369, 371, 373, 375, 377, 379, 381, 383, 385, 387, 389, 391, 393, 395, 397, 399, 401, 403, 405, 407, 409, 411, 413, 415, 417, 419, 421, 423, 425, 427, 429, 431, 433, 435, 437, 439, 441, 443, 445, 447, 449, 451, 453, 455, 457, 459, 461, 463, 465, 467, 469, 471, 473, 475, 477, 479, 481, 483, 485, 487, 489, 491, 493, 495, and 497.
[0011] In certain embodiments, the variant is a polypeptide that has at least 98% (e.g., at least 99%) sequence identity to SEQ ID NOs:2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24, 26, 28, 30, 32, 34, 36, 38, 40, 42, 44, 46, 48, 50, 52, 54, 56, 58, 60, 62, 64, 66, 68, 70, 72, 74, 76, 78, 80, 82, 84, 86, 88, 90, 92, 94, 96, 98, 100, 102, 104, 106, 108, 110, 112, 114, 116, 118, 120, 122, 124, 126, 128, 130, 132, 134, 136, 138, 140, 142, 144, 146, 148, 150, 152, 154, 156, 158, 160, 162, 164, 166, 168, 170, 172, 174, 176, 178, 180, 182, 184, 186, 188, 190, 192, 194, 196, 198, 200, 202, 204, 206, 208, 210, 212, 214, 216, 218, 220, 222, 224, 226, 228, 230, 232, 234, 236, 238, 240, 242, 244, 246, 248, 250, 252, 254, 256, 258, 260, 262, 264, 266, 268, 270, 272, 274, 276, 278, 280, 282, 284, 286, 288, 290, 292, 294, 296, 298, 300, 302, 304, 306, 308, 310, 312, 314, 316, 318, 320, 322, 324, 326, 328, 330, 332, 334, 336, 338, 340, 342, 344, 346, 348, 350, 352, 354, 356, 358, 360, 362, 364, 366, 368, 370, 372, 374, 376, 378, 380, 382, 384, 386, 388, 390, 392, 394, 396, 398, 400, 402, 404, 406, 408, 410, 412, 414, 416, 418, 420, 422, 424, 426, 428, 430, 432, 434, 436, 438, 440, 442, 442 D56N, 444, 446, 448, 450, 452, 454, 456, 458, 460, 462, 464, 466, 468, 470, 472, 474, 476, 478, 480, 482, 484, 486, 488, 490, 492, 494, 496, and 498.
[0012] In some embodiments, the variant is a nucleic acid that has at least 45% (e.g., at least 50%, at least 55%, at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or at least 99%) sequence identity to SEQ ID NO:411. In some embodiments, the variant is a polypeptide that has at least 25% (e.g., at least 30%, at least 35%, at least 40%, at least 45%, at least 50%, at least 55%, at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or at least 99%) sequence identity to SEQ ID NO:412.
[0013] In some embodiments, the variant is a mutant. For example, a representative mutant has a mutation at the residue that aligns with residue 76 of A. caviae BAR. In some embodiments, the variant is a nucleic acid molecule that has been codon optimized. In one embodiment, the variant polypeptide is a chimeric polypeptide.
[0014] In certain embodiments, the nucleic acid molecule is contained within an expression vector and, for example, can be overexpressed. In certain embodiments, the isomerase or epimerase polypeptide lacks a signal sequence or a prepro domain. In some embodiments, the isomerase or epimerase polypeptide is immobilized on a solid support.
[0015] In certain embodiments, the polypeptide fragment is a PFAM domain. Representative polypeptide fragments that include a PFAM domain have the sequence shown in SEQ ID NO:426, 440, or 462.
[0016] In one embodiment, the amino acid is tryptophan. In other embodiments, the amino acid is alanine. In some embodiments, the amino acid is a substituted amino acid.
[0017] In another aspect, the invention provides for methods of converting L-tryptophan to D-tryptophan (or, alternatively, D-tryptophan to L-tryptophan). Such methods typically include combining L-tryptophan (or D-tryptophan) with a) one or more nucleic acid molecules chosen from the group consisting of SEQ ID NOs:1, 3, 5, 7, 9, 11, 13, 15, 17, 19, 21, 23, 25, 27, 29, 31, 33, 35, 37, 39, 41, 43, 45, 47, 49, 51, 53, 55, 57, 59, 61, 63, 65, 67, 69, 71, 73, 75, 77, 79, 81, 83, 85, 87, 89, 91, 93, 95, 97, 99, 101, 103, 105, 107, 109, 111, 113, 115, 117, 119, 121, 123, 125, 127, 129, 131, 133, 135, 137, 139, 141, 143, 145, 147, 149, 151, 153, 155, 157, 159, 161, 163, 165, 167, 169, 171, 173, 175, 177, 179, 181, 183, 185, 187, 189, 191, 193, 195, 197, 199, 201, 203, 205, 207, 209, 211, 213, 215, 217, 219, 221, 223, 225, 227, 229, 231, 233, 235, 237, 239, 241, 243, 245, 247, 249, 251, 253, 255, 257, 259, 261, 263, 265, 267, 269, 271, 273, 275, 277, 279, 281, 283, 285, 287, 289, 291, 293, 295, 297, 299, 301, 303, 305, 307, 309, 311, 313, 315, 317, 319, 321, 323, 325, 327, 329, 331, 333, 335, 337, 339, 341, 343, 345, 347, 349, 351, 353, 355, 357, 359, 361, 363, 365, 367, 385, 387, 389, 391, 393, 395, 397, 399, 401, 403, 405, 407, 409, 411, 413, 415, 417, 419, 421, 423, 425, 427, 429, 431, 433, 435, 437, 439, 441, 443, 445, 447, 449, 451, 453, 455, 457, 459, 461, 463, 465, 467, 469, 471, 473, 475, 477, 479, 481, 483, 485, 487, 489, 491, 493, 495, and 497, wherein the one or more nucleic acid molecules encode polypeptides having racemase activity; b) a variant of a), wherein the variant encodes a polypeptide having racemase activity; c) a fragment of a) or b), wherein the fragment encodes a polypeptide having racemase activity; d) one or more polypeptides chosen from the group consisting of SEQ ID NOs:2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24, 26, 28, 30, 32, 34, 36, 38, 40, 42, 44, 46, 48, 50, 52, 54, 56, 58, 60, 62, 64, 66, 68, 70, 72, 74, 76, 78, 80, 82, 84, 86, 88, 90, 92, 94, 96, 98, 100, 102, 104, 106, 108, 110, 112, 114, 116, 118, 120, 122, 124, 126, 128, 130, 132, 134, 136, 138, 140, 142, 144, 146, 148, 150, 152, 154, 156, 158, 160, 162, 164, 166, 168, 170, 172, 174, 176, 178, 180, 182, 184, 186, 188, 190, 192, 194, 196, 198, 200, 202, 204, 206, 208, 210, 212, 214, 216, 218, 220, 222, 224, 226, 228, 230, 232, 234, 236, 238, 240, 242, 244, 246, 248, 250, 252, 254, 256, 258, 260, 262, 264, 266, 268, 270, 272, 274, 276, 278, 280, 282, 284, 286, 288, 290, 292, 294, 296, 298, 300, 302, 304, 306, 308, 310, 312, 314, 316, 318, 320, 322, 324, 326, 328, 330, 332, 334, 336, 338, 340, 342, 344, 346, 348, 350, 352, 354, 356, 358, 360, 362, 364, 366, 368, 386, 388, 390, 392, 394, 396, 398, 400, 402, 404, 406, 408, 410, 412, 414, 416, 418, 420, 422, 424, 426, 428, 430, 432, 434, 436, 438, 440, 442, 442 D56N, 444, 446, 448, 450, 452, 454, 456, 458, 460, 462, 464, 466, 468, 470, 472, 474, 476, 478, 480, 482, 484, 486, 488, 490, 492, 494, 496, and 498, wherein the one or more polypeptides has racemase activity; e) a variant of d), wherein the variant has racemase activity; or f) a fragment of d) or e), wherein the fragment has racemase activity.
[0018] Representative polypeptides include, without limitation, SEQ ID NOs:172, 178, 180, 182, 184, 140, 144, 188, 190, 112, 148, 156, 120, 162, and 108. Representative polypeptides also include, without limitation, SEQ ID NOs:136, 174, 138, 296, and 110. Representative polypeptides further include, without limitation, SEQ ID NOs:150, 192, 152, 118, 194, 154, 196, 158, 160, and 116. Representative polypeptides include, without limitation, SEQ ID NOs:248, 236, 246, 252, 250, 254, and 244. Representative polypeptides include, without limitation, SEQ ID NOs:274, 234, 220, 222, 226, 232, 240, 242, 258, 260, 264, 266, 286, 290, 170, 216, and 288. Representative polypeptides include, without limitation, SEQ ID NOs:208, 210, 228, 230, 270, 272, 278, 280, 282, 284, 292, 198, 212, 214, 114, and 218. Representative polypeptides include, without limitation, SEQ ID NOs:204 and 218.
[0019] In another aspect, the invention provides methods of converting L-tryptophan to D-tryptophan. Such methods generally include combining L-tryptophan with a) a nucleic acid molecule having the sequence shown in SEQ ID NO:411, wherein the nucleic acid molecule encodes a polypeptide having racemase activity; b) a variant of a), wherein the variant encodes a polypeptide having racemase activity; c) a fragment of a) or b), wherein the fragment encodes a polypeptide having racemase activity; d) one or more polypeptides chosen from the group consisting of SEQ ID NO:411, wherein the one or more polypeptides has racemase activity; e) a variant of d), wherein the variant has racemase activity; or f) a fragment of d) or e), wherein the fragment has racemase activity.
[0020] In one embodiment, the tryptophan is a substituted tryptophan. A representative substituted tryptophan is a chlorinated tryptophan (e.g., 6-chloro-D-tryptophan). In another embodiment, the substituted tryptophan is a halogenated tryptophan.
[0021] In yet another aspect, the invention provides methods of making monatin. Such methods generally include combining L-tryptophan with a) one or more nucleic acid molecules chosen from the group consisting of SEQ ID NOs:1, 3, 5, 7, 9, 11, 13, 15, 17, 19, 21, 23, 25, 27, 29, 31, 33, 35, 37, 39, 41, 43, 45, 47, 49, 51, 53, 55, 57, 59, 61, 63, 65, 67, 69, 71, 73, 75, 77, 79, 81, 83, 85, 87, 89, 91, 93, 95, 97, 99, 101, 103, 105, 107, 109, 111, 113, 115, 117, 119, 121, 123, 125, 127, 129, 131, 133, 135, 137, 139, 141, 143, 145, 147, 149, 151, 153, 155, 157, 159, 161, 163, 165, 167, 169, 171, 173, 175, 177, 179, 181, 183, 185, 187, 189, 191, 193, 195, 197, 199, 201, 203, 205, 207, 209, 211, 213, 215, 217, 219, 221, 223, 225, 227, 229, 231, 233, 235, 237, 239, 241, 243, 245, 247, 249, 251, 253, 255, 257, 259, 261, 263, 265, 267, 269, 271, 273, 275, 277, 279, 281, 283, 285, 287, 289, 291, 293, 295, 297, 299, 301, 303, 305, 307, 309, 311, 313, 315, 317, 319, 321, 323, 325, 327, 329, 331, 333, 335, 337, 339, 341, 343, 345, 347, 349, 351, 353, 355, 357, 359, 361, 363, 365, 367, 385, 387, 389, 391, 393, 395, 397, 399, 401, 403, 405, 407, 409, 411, 413, 415, 417, 419, 421, 423, 425, 427, 429, 431, 433, 435, 437, 439, 441, 443, 445, 447, 449, 451, 453, 455, 457, 459, 461, 463, 465, 467, 469, 471, 473, 475, 477, 479, 481, 483, 485, 487, 489, 491, 493, 495, and 497, wherein the one or more nucleic acid molecules encode polypeptides having racemase activity; b) a variant of a), wherein the variant encodes a polypeptide having racemase activity; c) a fragment of a) or b), wherein the fragment encodes a polypeptide having racemase activity; d) one or more polypeptides chosen from the group consisting of SEQ ID NOs:2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24, 26, 28, 30, 32, 34, 36, 38, 40, 42, 44, 46, 48, 50, 52, 54, 56, 58, 60, 62, 64, 66, 68, 70, 72, 74, 76, 78, 80, 82, 84, 86, 88, 90, 92, 94, 96, 98, 100, 102, 104, 106, 108, 110, 112, 114, 116, 118, 120, 122, 124, 126, 128, 130, 132, 134, 136, 138, 140, 142, 144, 146, 148, 150, 152, 154, 156, 158, 160, 162, 164, 166, 168, 170, 172, 174, 176, 178, 180, 182, 184, 186, 188, 190, 192, 194, 196, 198, 200, 202, 204, 206, 208, 210, 212, 214, 216, 218, 220, 222, 224, 226, 228, 230, 232, 234, 236, 238, 240, 242, 244, 246, 248, 250, 252, 254, 256, 258, 260, 262, 264, 266, 268, 270, 272, 274, 276, 278, 280, 282, 284, 286, 288, 290, 292, 294, 296, 298, 300, 302, 304, 306, 308, 310, 312, 314, 316, 318, 320, 322, 324, 326, 328, 330, 332, 334, 336, 338, 340, 342, 344, 346, 348, 350, 352, 354, 356, 358, 360, 362, 364, 366, 368, 386, 388, 390, 392, 394, 396, 398, 400, 402, 404, 406, 408, 410, 412, 414, 416, 418, 420, 422, 424, 426, 428, 430, 432, 434, 436, 438, 440, 442, 442 D56N, 444, 446, 448, 450, 452, 454, 456, 458, 460, 462, 464, 466, 468, 470, 472, 474, 476, 478, 480, 482, 484, 486, 488, 490, 492, 494, 496, and 498, wherein the one or more polypeptides has racemase activity; e) a variant of d), wherein the variant has racemase activity; or f) a fragment of d) or e), wherein the fragment has racemase activity.
[0022] In some embodiments, such methods further include adding one or more polypeptides having synthase/lyase (EC 4.1.3.- or EC 4.1.2.-) activity or a nucleic acid encoding such a polypeptide and/or one or more polypeptides having D-aminotransferase activity or a nucleic acid encoding such a polypeptide.
[0023] In certain embodiments, the monatin is predominantly R,R monatin. In one embodiment, the nucleic acid has the sequence shown in SEQ ID NO:411 and the polypeptide has the sequence shown in SEQ ID NO:412.
[0024] Unless otherwise defined, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. Although methods and materials similar or equivalent to those described herein can be used in the practice or testing of the present invention, suitable methods and materials are described below. In addition, the materials, methods, and examples are illustrative only and not intended to be limiting. All publications, patent applications, patents, and other references mentioned herein are incorporated by reference in their entirety. In case of conflict, the present specification, including definitions, will control.
[0025] The details of one or more embodiments of the invention are set forth in the accompanying drawings and the description below. Other features, objects, and advantages of the invention, which may be in one or more embodiments of the invention, will be apparent from the drawings and detailed description, and from the claims.
DETAILED DESCRIPTION
[0026] Disclosed herein are a number of different nucleic acid molecules encoding polypeptides having isomerase activity or epimerase activity. Isomerases such as racemases are provided herein that catalyze the racemization of a specific amino acid (e.g., tryptophan) or that catalyze the racemization of more than one amino acid (e.g., broad substrate racemases). In some embodiments, the nucleic acids or polypeptides disclosed herein can be used, for example, to convert L-tryptophan to D-tryptophan.
[0027] Isolated Nucleic Acid Molecules and Purified Polypeptides
[0028] The present invention is based, in part, on the identification of nucleic acid molecules encoding polypeptides having isomerase activity, herein referred to as "isomerase" or "racemase" nucleic acid molecules or polypeptides, where appropriate. The present invention also is based, in part, on the identification of nucleic acid molecules encoding polypeptides having epimerase activity, herein referred to as "epimerase" nucleic acid molecules or polypeptides, where appropriate.
[0029] Particular nucleic acid molecules described herein include the sequences shown in SEQ ID NOs:1, 3, 5, 7, 9, 11, 13, 15, 17, 19, 21, 23, 25, 27, 29, 31, 33, 35, 37, 39, 41, 43, 45, 47, 49, 51, 53, 55, 57, 59, 61, 63, 65, 67, 69, 71, 73, 75, 77, 79, 81, 83, 85, 87, 89, 91, 93, 95, 97, 99, 101, 103, 105, 107, 109, 111, 113, 115, 117, 119, 121, 123, 125, 127, 129, 131, 133, 135, 137, 139, 141, 143, 145, 147, 149, 151, 153, 155, 157, 159, 161, 163, 165, 167, 169, 171, 173, 175, 177, 179, 181, 183, 185, 187, 189, 191, 193, 195, 197, 199, 201, 203, 205, 207, 209, 211, 213, 215, 217, 219, 221, 223, 225, 227, 229, 231, 233, 235, 237, 239, 241, 243, 245, 247, 249, 251, 253, 255, 257, 259, 261, 263, 265, 267, 269, 271, 273, 275, 277, 279, 281, 283, 285, 287, 289, 291, 293, 295, 297, 299, 301, 303, 305, 307, 309, 311, 313, 315, 317, 319, 321, 323, 325, 327, 329, 331, 333, 335, 337, 339, 341, 343, 345, 347, 349, 351, 353, 355, 357, 359, 361, 363, 365, 367, 369, 371, 373, 375, 377, 379, 381, 383, 385, 387, 389, 391, 393, 395, 397, 399, 401, 403, 405, 407, 409, 411, 413, 415, 417, 419, 421, 423, 425, 427, 429, 431, 433, 435, 437, 439, 441, 443, 445, 447, 449, 451, 453, 455, 457, 459, 461, 463, 465, 467, 469, 471, 473, 475, 477, 479, 481, 483, 485, 487, 489, 491, 493, 495, and 497. As used herein, the term "nucleic acid molecule" can include DNA molecules, RNA molecules, and analogs of DNA or RNA generated using nucleotide analogs. A nucleic acid molecule of the invention can be single-stranded or double-stranded, depending upon its intended use. Nucleic acid molecules of the invention include molecules that have at least, for example, 75% sequence identity (e.g., at least 80%, 85%, 90%, 95%, or 99% sequence identity) to any of SEQ ID NOs:1, 3, 5, 7, 9, 11, 13, 15, 17, 19, 21, 23, 25, 27, 29, 31, 33, 35, 37, 39, 41, 43, 45, 47, 49, 51, 53, 55, 57, 59, 61, 63, 65, 67, 69, 71, 73, 75, 77, 79, 81, 83, 85, 87, 89, 91, 93, 95, 97, 99, 101, 103, 105, 107, 109, 111, 113, 115, 117, 119, 121, 123, 125, 127, 129, 131, 133, 135, 137, 139, 141, 143, 145, 147, 149, 151, 153, 155, 157, 159, 161, 163, 165, 167, 169, 171, 173, 175, 177, 179, 181, 183, 185, 187, 189, 191, 193, 195, 197, 199, 201, 203, 205, 207, 209, 211, 213, 215, 217, 219, 221, 223, 225, 227, 229, 231, 233, 235, 237, 239, 241, 243, 245, 247, 249, 251, 253, 255, 257, 259, 261, 263, 265, 267, 269, 271, 273, 275, 277, 279, 281, 283, 285, 287, 289, 291, 293, 295, 297, 299, 301, 303, 305, 307, 309, 311, 313, 315, 317, 319, 321, 323, 325, 327, 329, 331, 333, 335, 337, 339, 341, 343, 345, 347, 349, 351, 353, 355, 357, 359, 361, 363, 365, 367, 369, 371, 373, 375, 377, 379, 381, 383, 385, 387, 389, 391, 393, 395, 397, 399, 401, 403, 405, 407, 409, 411, 413, 415, 417, 419, 421, 423, 425, 427, 429, 431, 433, 435, 437, 439, 441, 443, 445, 447, 449, 451, 453, 455, 457, 459, 461, 463, 465, 467, 469, 471, 473, 475, 477, 479, 481, 483, 485, 487, 489, 491, 493, 495, and 497, and that have functional isomerase or epimerase activity.
[0030] In calculating percent sequence identity, two sequences are aligned and the number of identical matches of nucleotides or amino acid residues between the two sequences is determined. The number of identical matches is divided by the length of the aligned region (i.e., the number of aligned nucleotides or amino acid residues) and multiplied by 100 to arrive at a percent sequence identity value. It will be appreciated that the length of the aligned region can be a portion of one or both sequences up to the full-length size of the shortest sequence. It will be appreciated that a single sequence can align differently with other sequences and hence, can have different percent sequence identity values over each aligned region. It is noted that the percent identity value is usually rounded to the nearest integer. For example, 78.1%, 78.2%, 78.3%, and 78.4% are rounded down to 78%, while 78.5%, 78.6%, 78.7%, 78.8%, and 78.9% are rounded up to 79%. It is also noted that the length of the aligned region is always an integer.
[0031] The alignment of two or more sequences to determine percent sequence identity is performed using the algorithm described by Altschul et al. (1997, Nucleic Acids Res., 25:3389-3402) as incorporated into BLAST (basic local alignment search tool) programs, available at ncbi.nlm.nih.gov on the World Wide Web. BLAST searches can be performed to determine percent sequence identity between a DAT nucleic acid described herein and any other sequence or portion thereof aligned using the Altschul et al. algorithm. BLASTN is the program used to align and compare the identity between nucleic acid sequences, while BLASTP is the program used to align and compare the identity between amino acid sequences. When utilizing BLAST programs to calculate the percent identity between a sequence of the invention and another sequence, the default parameters of the respective programs are used.
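By way of non-limiting illustration, the arithmetic described above for percent sequence identity can be sketched in Python. The sketch assumes that the two sequences have already been aligned (for example, by a BLAST program) and are supplied as equal-length strings in which "-" marks a gap; the function name, the gap symbol, and the choice to count every alignment column (including gap columns) in the length of the aligned region are assumptions made for illustration only. Integer arithmetic is used so that a value ending in .5 is rounded up, consistent with the 78.5% to 79% example.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> int:
    """Percent identity over an aligned region, rounded to the nearest integer.

    Both arguments are rows of a pre-computed pairwise alignment and must
    have the same length. A column is an identical match when both rows
    carry the same residue and that residue is not a gap.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    length = len(aligned_a)  # length of the aligned region (always an integer)
    if length == 0:
        raise ValueError("aligned region is empty")

    matches = sum(
        1
        for x, y in zip(aligned_a.upper(), aligned_b.upper())
        if x == y and x != gap
    )
    # Equivalent to floor(100 * matches / length + 0.5), i.e. halves round up,
    # but computed with integers to avoid floating-point surprises.
    return (200 * matches + length) // (2 * length)


# Hypothetical toy alignment: 8 identical columns out of 10.
identity = percent_identity("ACGTACGTAC", "ACGTACGTTT")
```

In this sketch, 157 matches over a 200-column aligned region (78.5%) yield 79, while 196 matches over 250 columns (78.4%) yield 78.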
[0032] Nucleic acid molecules of the invention, for example, those between about 10 and about 50 nucleotides in length, can be used, under standard amplification conditions, to amplify an isomerase or epimerase nucleic acid molecule. Amplification of an isomerase or epimerase nucleic acid can be for the purpose of detecting the presence or absence of an isomerase or epimerase nucleic acid molecule or for the purpose of obtaining (e.g., cloning) an isomerase or epimerase nucleic acid molecule. As used herein, standard amplification conditions refer to the basic components of an amplification reaction mix, and cycling conditions that include multiple cycles of denaturing the template nucleic acid, annealing the oligonucleotide primers to the template nucleic acid, and extension of the primers by the polymerase to produce an amplification product (see, for example, U.S. Patent Nos. 4,683,195; 4,683,202; 4,800,159; and 4,965,188). The basic components of an amplification reaction mix generally include, for example, each of the four deoxynucleoside triphosphates (e.g., dATP, dCTP, dTTP, and dGTP, or analogs thereof), oligonucleotide primers, template nucleic acid, and a polymerase enzyme. Template nucleic acid is typically denatured at a temperature of at least about 90°C, and extension from primers is typically performed at a temperature of at least about 72°C. In addition, variations to the original PCR methods (e.g., anchor PCR, RACE PCR, or ligation chain reaction (LCR)) have been developed and are known in the art. See, for example, Landegren et al., 1988, Science, 241:1077-1080; and Nakazawa et al., 1994, Proc. Natl. Acad. Sci. USA, 91:360-364.
[0033] The annealing temperature can be used to control the specificity of amplification. The temperature at which primers anneal to template nucleic acid must be below the Tm of each of the primers, but high enough to avoid non-specific annealing of primers to the template nucleic acid. The Tm is the temperature at which half of the DNA duplexes have separated into single strands, and can be predicted for an oligonucleotide primer using the formula provided in section 11.46 of Sambrook et al. (1989, Molecular Cloning: A Laboratory Manual, 2nd Ed., Cold Spring Harbor Laboratory Press, Cold Spring Harbor, New York). Non-specific amplification products are detected as bands on a gel that are not the size expected for the correct amplification product.
[0034] Nucleic acid molecules of the invention, for example, those between about 10 and several hundred nucleotides in length (up to several thousand nucleotides in length), can be used, under standard hybridization conditions, to hybridize to an isomerase or epimerase nucleic acid molecule. Hybridization to an isomerase or epimerase nucleic acid molecule can be for the purpose of detecting or obtaining an isomerase or epimerase nucleic acid molecule. As used herein, standard hybridization conditions between nucleic acid molecules are discussed in detail in Sambrook et al. (1989, Molecular Cloning: A Laboratory Manual, 2nd Ed., Cold Spring Harbor Laboratory Press, Cold Spring Harbor, NY; Sections 7.37-7.57, 9.47-9.57, 11.7-11.8, and 11.45-11.57). For oligonucleotide probes less than about 100 nucleotides, Sambrook et al. discloses suitable Southern blot conditions in Sections 11.45-11.46. The Tm between a sequence that is less than 100 nucleotides in length and a second sequence can be calculated using the formula provided in Section 11.46. Sambrook et al. additionally discloses prehybridization and hybridization conditions for a Southern blot that uses oligonucleotide probes greater than about 100 nucleotides (see Sections 9.47-9.52). Hybridizations with an oligonucleotide greater than 100 nucleotides generally are performed 15-25°C below the Tm. The Tm between a sequence greater than 100 nucleotides in length and a second sequence can be calculated using the formula provided in Sections 9.50-9.51 of Sambrook et al. Additionally, Sambrook et al. recommends the conditions indicated in Section 9.54 for washing a Southern blot that has been probed with an oligonucleotide greater than about 100 nucleotides.
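As a non-limiting illustration of the requirement that the annealing temperature be below the Tm of each primer, the following Python sketch estimates the Tm of short oligonucleotide primers and reports the upper bound that a primer pair places on the annealing temperature. The estimate uses the widely known Wallace rule (2°C for each A or T and 4°C for each G or C), which is assumed here only as a convenient approximation for short oligonucleotides and is not asserted to be the formula of Sambrook et al.; the function names and the example primer sequences are hypothetical and are not sequences of the invention.

```python
def wallace_tm(primer: str) -> int:
    """Approximate Tm (in degrees C) of a short DNA primer by the Wallace rule.

    Tm ~= 2 * (number of A and T) + 4 * (number of G and C).
    This is a rough estimate intended for short oligonucleotides only.
    """
    seq = primer.upper()
    if not seq or set(seq) - set("ACGT"):
        raise ValueError("primer must be a non-empty string of A, C, G, T")
    at_count = seq.count("A") + seq.count("T")
    gc_count = seq.count("G") + seq.count("C")
    return 2 * at_count + 4 * gc_count


def max_annealing_temperature(forward: str, reverse: str) -> int:
    """Upper bound on the annealing temperature for a primer pair.

    The annealing temperature must be below the Tm of each primer,
    so the lower of the two estimated Tm values is the limiting one.
    """
    return min(wallace_tm(forward), wallace_tm(reverse))


# Hypothetical 20-nucleotide primers used purely for illustration.
forward_primer = "ATGCATGCATGCATGCATGC"
reverse_primer = "ATATATATATATATATATAT"
limit = max_annealing_temperature(forward_primer, reverse_primer)
```

The sketch addresses only primers of the length used for amplification; it does not apply to probes greater than about 100 nucleotides, for which a different Tm formula is referenced above.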
[0035] The conditions under which membranes containing nucleic acids are prehybridized and hybridized, as well as the conditions under which membranes containing nucleic acids are washed to remove excess and non-specifically bound probe can play a significant role in the stringency of the hybridization. For example, hybridization and washing may be carried out under conditions of low stringency, moderate stringency or high stringency. Such conditions are described, for example, in Sambrook et al. sections 11.45-11.46. The conditions used to achieve a particular level of stringency will vary, depending on the nature of the nucleic acids being hybridized. For example, the length, degree of complementarity, nucleotide sequence composition (e.g., G/C vs. A/T nucleotide content) and nucleic acid type (e.g., RNA vs. DNA) of the hybridizing regions of the nucleic acids can be considered in selecting hybridization conditions. For example, washing conditions can be made more stringent by decreasing the salt concentration in the wash solutions and/or by increasing the temperature at which the washes are performed.
[0036] The amount of hybridization can be quantitated directly on a membrane or from an autoradiograph using, for example, a PhosphorImager or a Densitometer (Molecular Dynamics, Sunnyvale, CA). It is understood by those of skill in the art that interpreting the amount of hybridization can be affected by, for example, the specific activity of the labeled oligonucleotide probe, the number of probe-binding sites on the target nucleic acid, and the amount of exposure of an autoradiograph or other detection medium. It will be readily appreciated that, although any number of hybridization, washing and detection conditions can be used to examine hybridization of a probe nucleic acid molecule to immobilized target nucleic acids, it is more important to examine hybridization of a probe to target nucleic acids under identical hybridization, washing, and detection conditions. Preferably, the target nucleic acids are on the same membrane. In addition, it can be appreciated by those of skill in the art that appropriate positive and negative controls should be performed with every set of amplification or hybridization reactions to avoid uncertainties related to contamination and/or non-specific annealing of oligonucleotide primers or probes.
[0037] Oligonucleotide primers or probes specifically anneal or hybridize to one or more isomerase or epimerase nucleic acids. For amplification, a pair of oligonucleotide primers generally anneal to opposite strands of the template nucleic acid, and should be an appropriate distance from one another such that the polymerase can effectively polymerize across the region and such that the amplification product can be readily detected using, for example, electrophoresis. Oligonucleotide primers or probes can be designed using, for example, a computer program such as OLIGO (Molecular Biology Insights Inc., Cascade, CO) to assist in designing oligonucleotides. Typically, oligonucleotide primers are 10 to 30 or 40 or 50 nucleotides in length (e.g., 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, or 50 nucleotides in length), but can be longer or shorter if appropriate amplification conditions are used.
[0038] Non-limiting representative pairs of oligonucleotide primers that were used to amplify isomerase nucleic acid molecules are shown in Tables 16, 26, 35, 37 and 38 (e.g., SEQ ID NOs:503-515, 517-543, and 545-548). The sequences shown in SEQ ID NOs:503-515, 517-543, and 545-548 are non-limiting examples of oligonucleotide primers that can be used to amplify isomerase nucleic acid molecules. Oligonucleotides in accordance with the invention can be obtained by restriction enzyme digestion of an isomerase or epimerase nucleic acid molecule or can be prepared by standard chemical synthesis and other known techniques.
[0039] As used herein, an "isolated" nucleic acid molecule is a nucleic acid molecule that is separated from other nucleic acid molecules that are usually associated with the isolated nucleic acid molecule. Thus, an "isolated" nucleic acid molecule includes, without limitation, a nucleic acid molecule that is free of sequences that naturally flank one or both ends of the nucleic acid in the genome of the organism from which the isolated nucleic acid is derived (e.g., a cDNA or genomic DNA fragment produced by PCR or restriction endonuclease digestion). Such an isolated nucleic acid molecule is generally introduced into a vector (e.g., a cloning vector, or an expression vector) for convenience of manipulation or to generate a fusion nucleic acid molecule. In addition, an isolated nucleic acid molecule can include an engineered nucleic acid molecule such as a recombinant or a synthetic nucleic acid molecule. A nucleic acid molecule existing among hundreds to millions of other nucleic acid molecules within, for example, a nucleic acid library (e.g., a cDNA, or genomic library) or a portion of a gel (e.g., agarose, or polyacrylamide) containing restriction-digested genomic DNA is not to be considered an isolated nucleic acid.
[0040] Isolated nucleic acid molecules described herein having isomerase or epimerase activity can be obtained using techniques routine in the art, many of which are described in the Examples herein. For example, isolated nucleic acids within the scope of the invention can be obtained using any method including, without limitation, recombinant nucleic acid technology, the polymerase chain reaction (PCR; e.g., direct amplification or site-directed mutagenesis), and/or nucleic acid hybridization techniques (e.g., Southern blotting). General PCR techniques are described, for example, in PCR Primer: A Laboratory Manual, Dieffenbach & Dveksler, Eds., Cold Spring Harbor Laboratory Press, 1995. Recombinant nucleic acid techniques include, for example, restriction enzyme digestion and ligation, which can be used to isolate an isomerase or epimerase nucleic acid molecule as described herein. Isolated nucleic acids in accordance with the invention also can be chemically synthesized, either as a single nucleic acid molecule or as a series of oligonucleotides.
[0041] Techniques for the manipulation of nucleic acids, such as, e.g., subcloning, labeling probes (e.g., random-primer labeling using Klenow polymerase, nick translation, amplification), sequencing, hybridization, amplification and the like are well described in the scientific and patent literature, see, e.g., Sambrook et al., Eds., 1989, Molecular Cloning: A Laboratory Manual (2nd Ed.), Vols. 1-3, Cold Spring Harbor Laboratory; Current Protocols in Molecular Biology, 1997, Ausubel, Ed., John Wiley & Sons, Inc., New York; Laboratory Techniques in Biochemistry and Molecular Biology: Hybridization with Nucleic Acid Probes, Part I. Theory and Nucleic Acid Preparation, Tijssen, Ed., Elsevier, N.Y. (1993).
[0042] Purified isomerase or epimerase polypeptides, as well as polypeptide fragments having isomerase or epimerase activity, are within the scope of the invention. Isomerase and epimerase polypeptides refer to polypeptides that catalyze the stereochemical inversion around an asymmetric carbon atom of a substrate. The predicted amino acid sequences of non-limiting exemplary isomerase and epimerase polypeptides are shown in SEQ ID NOs:2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24, 26, 28, 30, 32, 34, 36, 38, 40, 42, 44, 46, 48, 50, 52, 54, 56, 58, 60, 62, 64, 66, 68, 70, 72, 74, 76, 78, 80, 82, 84, 86, 88, 90, 92, 94, 96, 98, 100, 102, 104, 106, 108, 110, 112, 114, 116, 118, 120, 122, 124, 126, 128, 130, 132, 134, 136, 138, 140, 142, 144, 146, 148, 150, 152, 154, 156, 158, 160, 162, 164, 166, 168, 170, 172, 174, 176, 178, 180, 182, 184, 186, 188, 190, 192, 194, 196, 198, 200, 202, 204, 206, 208, 210, 212, 214, 216, 218, 220, 222, 224, 226, 228, 230, 232, 234, 236, 238, 240, 242, 244, 246, 248, 250, 252, 254, 256, 258, 260, 262, 264, 266, 268, 270, 272, 274, 276, 278, 280, 282, 284, 286, 288, 290, 292, 294, 296, 298, 300, 302, 304, 306, 308, 310, 312, 314, 316, 318, 320, 322, 324, 326, 328, 330, 332, 334, 336, 338, 340, 342, 344, 346, 348, 350, 352, 354, 356, 358, 360, 362, 364, 366, 368, 370, 372, 374, 376, 378, 380, 382, 384, 386, 388, 390, 392, 394, 396, 398, 400, 402, 404, 406, 408, 410, 412, 414, 416, 418, 420, 422, 424, 426, 428, 430, 432, 434, 436, 438, 440, 442, 444, 446, 448, 450, 452, 454, 456, 458, 460, 462, 464, 466, 468, 470, 472, 474, 476, 478, 480, 482, 484, 486, 488, 490, 492, 494, 496, and 498. The term "purified" polypeptide as used herein refers to a polypeptide that has been separated from cellular components that naturally accompany it. Typically, a polypeptide is considered "purified" when it is at least partially free from the proteins and naturally occurring molecules with which it is naturally associated. The extent of enrichment or purity of an isomerase or epimerase polypeptide can be measured using any appropriate method, e.g., column chromatography, polyacrylamide gel electrophoresis, or HPLC analysis.
[0043] The invention also provides for isomerase and epimerase polypeptides that differ in sequence from any of SEQ ID NOs:2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24, 26, 28, 30, 32, 34, 36, 38, 40, 42, 44, 46, 48, 50, 52, 54, 56, 58, 60, 62, 64, 66, 68, 70, 72, 74, 76, 78, 80, 82, 84, 86, 88, 90, 92, 94, 96, 98, 100, 102, 104, 106, 108, 110, 112, 114, 116, 118, 120, 122, 124, 126, 128, 130, 132, 134, 136, 138, 140, 142, 144, 146, 148, 150, 152, 154, 156, 158, 160, 162, 164, 166, 168, 170, 172, 174, 176, 178, 180, 182, 184, 186, 188, 190, 192, 194, 196, 198, 200, 202, 204, 206, 208, 210, 212, 214, 216, 218, 220, 222, 224, 226, 228, 230, 232, 234, 236, 238, 240, 242, 244, 246, 248, 250, 252, 254, 256, 258, 260, 262, 264, 266, 268, 270, 272, 274, 276, 278, 280, 282, 284, 286, 288, 290, 292, 294, 296, 298, 300, 302, 304, 306, 308, 310, 312, 314, 316, 318, 320, 322, 324, 326, 328, 330, 332, 334, 336, 338, 340, 342, 344, 346, 348, 350, 352, 354, 356, 358, 360, 362, 364, 366, 368, 370, 372, 374, 376, 378, 380, 382, 384, 386, 388, 390, 392, 394, 396, 398, 400, 402, 404, 406, 408, 410, 412, 414, 416, 418, 420, 422, 424, 426, 428, 430, 432, 434, 436, 438, 440, 442, 444, 446, 448, 450, 452, 454, 456, 458, 460, 462, 464, 466, 468, 470, 472, 474, 476, 478, 480, 482, 484, 486, 488, 490, 492, 494, 496, and 498.
For example, the skilled artisan will appreciate that changes can be introduced into an isomerase or epimerase polypeptide (e.g., SEQ ID NOs:2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24, 26, 28, 30, 32, 34, 36, 38, 40, 42, 44, 46, 48, 50, 52, 54, 56, 58, 60, 62, 64, 66, 68, 70, 72, 74, 76, 78, 80, 82, 84, 86, 88, 90, 92, 94, 96, 98, 100, 102, 104, 106, 108, 110, 112, 114, 116, 118, 120, 122, 124, 126, 128, 130, 132, 134, 136, 138, 140, 142, 144, 146, 148, 150, 152, 154, 156, 158, 160, 162, 164, 166, 168, 170, 172, 174, 176, 178, 180, 182, 184, 186, 188, 190, 192, 194, 196, 198, 200, 202, 204, 206, 208, 210, 212, 214, 216, 218, 220, 222, 224, 226, 228, 230, 232, 234, 236, 238, 240, 242, 244, 246, 248, 250, 252, 254, 256, 258, 260, 262, 264, 266, 268, 270, 272, 274, 276, 278, 280, 282, 284, 286, 288, 290, 292, 294, 296, 298, 300, 302, 304, 306, 308, 310, 312, 314, 316, 318, 320, 322, 324, 326, 328, 330, 332, 334, 336, 338, 340, 342, 344, 346, 348, 350, 352, 354, 356, 358, 360, 362, 364, 366, 368, 370, 372, 374, 376, 378, 380, 382, 384, 386, 388, 390, 392, 394, 396, 398, 400, 402, 404, 406, 408, 410, 412, 414, 416, 418, 420, 422, 424, 426, 428, 430, 432, 434, 436, 438, 440, 442, 444, 446, 448, 450, 452, 454, 456, 458, 460, 462, 464, 466, 468, 470, 472, 474, 476, 478, 480, 482, 484, 486, 488, 490, 492, 494, 496, and 498) or into an isomerase or epimerase nucleic acid molecule (e.g., SEQ ID NOs:1, 3, 5, 7, 9, 11, 13, 15, 17, 19, 21, 23, 25, 27, 29, 31, 33, 35, 37, 39, 41, 43, 45, 47, 49, 51, 53, 55, 57, 59, 61, 63, 65, 67, 69, 71, 73, 75, 77, 79, 81, 83, 85, 87, 89, 91, 93, 95, 97, 99, 101, 103, 105, 107, 109, 111, 113, 115, 117, 119, 121, 123, 125, 127, 129, 131, 133, 135, 137, 139, 141, 143, 145, 147, 149, 151, 153, 155, 157, 159, 161, 163, 165, 167, 169, 171, 173, 175, 177, 179, 181, 183, 185, 187, 189, 191, 193, 195, 197, 199, 201, 203, 205, 207, 209, 211, 213, 215, 217, 219, 221, 223, 225, 227, 229, 231, 233, 235, 237, 239, 241, 243, 245, 247, 249, 251, 253, 255, 257, 259, 261, 263, 265, 267, 269, 271, 273, 275, 277, 279, 281, 283, 285, 287, 289, 291, 293, 295, 297, 299, 301, 303, 305, 307, 309, 311, 313, 315, 317, 319, 321, 323, 325, 327, 329, 331, 333, 335, 337, 339, 341, 343, 345, 347, 349, 351, 353, 355, 357, 359, 361, 363, 365, 367, 369, 371, 373, 375, 377, 379, 381, 383, 385, 387, 389, 391, 393, 395, 397, 399, 401, 403, 405, 407, 409, 411, 413, 415, 417, 419, 421, 423, 425, 427, 429, 431, 433, 435, 437, 439, 441, 443, 445, 447, 449, 451, 453, 455, 457, 459, 461, 463, 465, 467, 469, 471, 473, 475, 477, 479, 481, 483, 485, 487, 489, 491, 493, 495, and 497), thereby leading to changes in the amino acid sequence of the encoded polypeptide. Isomerase and epimerase polypeptides that differ in sequence from SEQ ID NOs:2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24, 26, 28, 30, 32, 34, 36, 38, 40, 42, 44, 46, 48, 50, 52, 54, 56, 58, 60, 62, 64, 66, 68, 70, 72, 74, 76, 78, 80, 82, 84, 86, 88, 90, 92, 94, 96, 98, 100, 102, 104, 106, 108, 110, 112, 114, 116, 118, 120, 122, 124, 126, 128, 130, 132, 134, 136, 138, 140, 142, 144, 146, 148, 150, 152, 154, 156, 158, 160, 162, 164, 166, 168, 170, 172, 174, 176, 178, 180, 182, 184, 186, 188, 190, 192, 194, 196, 198, 200, 202, 204, 206, 208, 210, 212, 214, 216, 218, 220, 222, 224, 226, 228, 230, 232, 234, 236, 238, 240, 242, 244, 246, 248, 250, 252, 254, 256, 258, 260, 262, 264, 266, 268, 270, 272, 274, 276, 278, 280, 282, 284, 286, 288, 290, 292, 294, 296, 298, 300, 302, 304, 306, 308, 310, 312, 314, 316, 318, 320, 322, 324, 326, 328, 330, 332, 334, 336, 338, 340, 342, 344, 346, 348, 350, 352, 354, 356, 358, 360, 362, 364, 366, 368, 370, 372, 374, 376, 378, 380, 382, 384, 386, 388, 390, 392, 394, 396, 398, 400, 402, 404, 406, 408, 410, 412, 414, 416, 418, 420, 422, 424, 426, 428, 430, 432, 434, 436, 438, 440, 442, 444, 446, 448, 450, 452, 454, 456, 458, 460, 462, 464, 466, 468, 470, 472, 474, 476, 478, 480, 482, 484, 486, 488, 490, 492, 494, 496, and 498 and that retain stereo-inverting activity readily can be identified by screening methods routinely used in the art.
[0044] For example, changes can be introduced into an isomerase or epimerase nucleic acid coding sequence that lead to conservative and/or non-conservative amino acid substitutions at one or more amino acid residues in the encoded isomerase or epimerase polypeptide. Polypeptides that differ in sequence from the amino acid sequences shown in SEQ ID NOs:2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24, 26, 28, 30, 32, 34, 36, 38, 40, 42, 44, 46, 48, 50, 52, 54, 56, 58, 60, 62, 64, 66, 68, 70, 72, 74, 76, 78, 80, 82, 84, 86, 88, 90, 92, 94, 96, 98, 100, 102, 104, 106, 108, 110, 112, 114, 116, 118, 120, 122, 124, 126, 128, 130, 132, 134, 136, 138, 140, 142, 144, 146, 148, 150, 152, 154, 156, 158, 160, 162, 164, 166, 168, 170, 172, 174, 176, 178, 180, 182, 184, 186, 188, 190, 192, 194, 196, 198, 200, 202, 204, 206, 208, 210, 212, 214, 216, 218, 220, 222, 224, 226, 228, 230, 232, 234, 236, 238, 240, 242, 244, 246, 248, 250, 252, 254, 256, 258, 260, 262, 264, 266, 268, 270, 272, 274, 276, 278, 280, 282, 284, 286, 288, 290, 292, 294, 296, 298, 300, 302, 304, 306, 308, 310, 312, 314, 316, 318, 320, 322, 324, 326, 328, 330, 332, 334, 336, 338, 340, 342, 344, 346, 348, 350, 352, 354, 356, 358, 360, 362, 364, 366, 368, 370, 372, 374, 376, 378, 380, 382, 384, 386, 388, 390, 392, 394, 396, 398, 400, 402, 404, 406, 408, 410, 412, 414, 416, 418, 420, 422, 424, 426, 428, 430, 432, 434, 436, 438, 440, 442, 444, 446, 448, 450, 452, 454, 456, 458, 460, 462, 464, 466, 468, 470, 472, 474, 476, 478, 480, 482, 484, 486, 488, 490, 492, 494, 496, and 498 can be generated by standard techniques such as site-directed or PCR-mediated mutagenesis of a nucleic acid encoding such a polypeptide, or directed evolution. In addition, changes in the polypeptide sequence can be introduced randomly along all or part of the isomerase or epimerase polypeptide, such as by saturation mutagenesis of the corresponding nucleic acid molecule.
Alternatively, changes can be introduced into a nucleic acid or polypeptide sequence by chemically synthesizing a nucleic acid molecule or polypeptide having such changes.
[0045] A "conservative amino acid substitution" is one in which one amino acid residue is replaced with a different amino acid residue having a similar side chain. Similarity between amino acid residues has been assessed in the art. For example, Dayhoff et al. (1978, in Atlas of Protein Sequence and Structure, 5(Suppl. 3):345-352) provides frequency tables for amino acid substitutions that can be employed as a measure of amino acid similarity. Examples of conservative substitutions include, without limitation, replacement of an aliphatic amino acid with another aliphatic amino acid; replacement of a serine with a threonine or vice versa; replacement of an acidic residue with another acidic residue; replacement of a residue bearing an amide group with another residue bearing an amide group; exchange of a basic residue with another basic residue; or replacement of an aromatic residue with another aromatic residue. A non-conservative substitution is one in which an amino acid residue is replaced with an amino acid residue that does not have a similar side chain.
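The side-chain classes named in the preceding paragraph can be illustrated with a minimal Python sketch. The class memberships below (for example, placing glycine with the aliphatic residues and histidine with the basic residues) and the function names are illustrative assumptions, not definitions taken from this disclosure; a substitution matrix such as the Dayhoff tables cited above would give a graded rather than a yes/no answer.

```python
# Side-chain classes following the examples listed in the paragraph above.
# The exact membership of each class is an assumption made for illustration.
SIDE_CHAIN_CLASSES = {
    "aliphatic": set("GAVLI"),
    "hydroxyl": set("ST"),      # serine <-> threonine
    "acidic": set("DE"),
    "amide": set("NQ"),
    "basic": set("KRH"),
    "aromatic": set("FYW"),
}


def side_chain_class(residue):
    """Return the class name for a one-letter residue code, or None if unclassified."""
    for name, members in SIDE_CHAIN_CLASSES.items():
        if residue in members:
            return name
    return None  # e.g., C, M and P are left unclassified in this sketch


def is_conservative(original, replacement):
    """True if both residues differ and fall in the same side-chain class."""
    if original == replacement:
        return False  # identical residues are not a substitution
    cls = side_chain_class(original)
    return cls is not None and cls == side_chain_class(replacement)
```

Under these assumed classes, a change from Asp to Glu is classified as conservative, whereas a change from Asp to Asn is classified as non-conservative.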
[0046] Changes in a nucleic acid can be introduced using one or more mutagens. Mutagens include, without limitation, ultraviolet light, gamma irradiation, or chemical mutagens (e.g., mitomycin, nitrous acid, photoactivated psoralens, sodium bisulfite, hydroxylamine, hydrazine or formic acid). Other mutagens are analogues of nucleotide precursors, e.g., nitrosoguanidine, 5-bromouracil, 2-aminopurine, or acridine. Intercalating agents such as proflavine, acriflavine, quinacrine and the like can also be used.
[0047] Changes also can be introduced into an isomerase or epimerase nucleic acid and/or polypeptide by methods such as error-prone PCR, shuffling, oligonucleotide-directed mutagenesis, assembly PCR, sexual PCR mutagenesis, in vivo mutagenesis, cassette mutagenesis, recursive ensemble mutagenesis, exponential ensemble mutagenesis, gene reassembly (e.g., GeneReassembly, see, e.g., U.S. Patent No. 6,537,776), Gene Site Saturation Mutagenesis (GSSM), synthetic ligation reassembly (SLR), or a combination thereof. Changes also can be introduced into polypeptides by methods such as recombination, recursive sequence recombination, phosphorothioate-modified DNA mutagenesis, uracil-containing template mutagenesis, gapped duplex mutagenesis, point mismatch repair mutagenesis, repair-deficient host strain mutagenesis, chemical mutagenesis, radiogenic mutagenesis, deletion mutagenesis, restriction-selection mutagenesis, restriction-purification mutagenesis, artificial gene synthesis, ensemble mutagenesis, chimeric nucleic acid multimer creation, or any combination thereof.
[0048] An isomerase or epimerase nucleic acid can be codon optimized if so desired. For example, a non-preferred or a less preferred codon can be identified and replaced with a preferred or neutrally used codon encoding the same amino acid as the replaced codon. A preferred codon is a codon over-represented in coding sequences in genes in the host cell, and a non-preferred or less preferred codon is a codon under-represented in coding sequences in genes in the host cell; replacing the latter with the former modifies the nucleic acid to increase its expression in the host cell. An isomerase or epimerase nucleic acid can be optimized for particular codon usage from any host cell (e.g., any of the host cells described herein). See, for example, U.S. Patent No. 5,795,737 for a representative description of codon optimization. In addition to codon optimization, a nucleic acid can undergo directed evolution. See, for example, U.S. Patent No. 6,361,974.
[0049] Other changes also are within the scope of this disclosure. For example, one, two, three, four or more amino acids can be removed from the carboxy- and/or amino-terminal ends of an isomerase or epimerase polypeptide without significantly altering the biological activity. In addition, one or more amino acids can be changed to increase or decrease the pI of a polypeptide. In some embodiments, a residue can be changed to, for example, a glutamate. Also provided are chimeric isomerase or epimerase polypeptides. For example, a chimeric isomerase or epimerase polypeptide can include portions of different binding or catalytic domains. Methods of recombining different domains from different polypeptides and screening the resultant chimerics to find the best combination for a particular application or substrate are routine in the art.
[0050] One particular change in sequence exemplified herein involves the residue corresponding to residue 76 in A. caviae BAR. In one instance, the polypeptide sequence of SEQ ID NO:441 was aligned with A. caviae BAR and the residue in SEQ ID NO:441 that aligns with position 76 in A. caviae BAR was identified (residue 56) and changed from Asp to Asn (SEQ ID NO:441 with D56N). Those of skill in the art can readily identify the residue that corresponds to residue 76 of A. caviae BAR in any of the racemases disclosed herein and introduce a change into the polypeptide sequence at that particular residue.
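The residue-mapping step described for residue 76 of A. caviae BAR can be sketched in Python as follows. This is only an illustration under stated assumptions: the two sequences are taken to be already aligned (gaps written as "-"), the function names are hypothetical, and the toy sequences in the usage note are invented placeholders rather than any sequence of this disclosure.

```python
def column_of_residue(aligned_seq, residue_number):
    """Return the 0-based alignment column holding the given 1-based residue."""
    count = 0
    for column, char in enumerate(aligned_seq):
        if char != "-":
            count += 1
            if count == residue_number:
                return column
    raise ValueError("residue number exceeds the ungapped sequence length")


def residue_number_at_column(aligned_seq, column):
    """Return the 1-based residue number at an alignment column, or None for a gap."""
    if aligned_seq[column] == "-":
        return None
    return sum(1 for char in aligned_seq[: column + 1] if char != "-")


def corresponding_residue(aligned_ref, aligned_query, ref_residue_number):
    """Map a reference residue number to the aligned residue number in the query."""
    column = column_of_residue(aligned_ref, ref_residue_number)
    return residue_number_at_column(aligned_query, column)


def substitute(sequence, residue_number, expected, new):
    """Replace one residue (1-based) after checking the expected original residue."""
    if sequence[residue_number - 1] != expected:
        raise ValueError("unexpected residue at the requested position")
    return sequence[: residue_number - 1] + new + sequence[residue_number:]
```

For a toy alignment of reference "MKTAYDLG" against query "--TAYDLG", reference residue 6 maps to query residue 4, and `substitute("TAYDLG", 4, "D", "N")` yields the Asp-to-Asn variant of the toy query.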
[0051] By way of example, the invention provides for racemase polypeptides having the sequence shown in SEQ ID NOs:108, 110, 116, 244, 288 and 218 as well as racemase sequences that differ in sequence from SEQ ID NOs:108, 110, 116, 244, 288 and 218. For example, racemase polypeptides having the sequence shown in SEQ ID NOs:172, 178, 180, 182, 184, 140, 144, 188, 190, 112, 148, 156, 120 and 162 each exhibit 96% or greater sequence identity to the racemase polypeptide having the sequence shown in SEQ ID NO:108; while SEQ ID NOs:136, 174, 138 and 296 each exhibit 97% or greater sequence identity to SEQ ID NO:110. In addition, SEQ ID NOs:150, 192, 152, 118, 194, 154, 196, 158 and 160 each exhibit 97% or greater sequence identity to SEQ ID NO:116; and SEQ ID NOs:248, 236, 246, 252, 250 and 254 each exhibit 97% or greater sequence identity to SEQ ID NO:244. Also, SEQ ID NOs:274, 234, 220, 222, 226, 232, 240, 242, 258, 260, 264, 266, 286, 290, 170 and 216 each exhibit 97% or greater identity to SEQ ID NO:288; SEQ ID NOs:208, 210, 228, 230, 270, 272, 278, 280, 282, 284, 292, 198, 212, 214 and 114 each exhibit 97% or greater sequence identity to SEQ ID NO:218; and SEQ ID NO:204 exhibits 96% sequence identity to SEQ ID NO:218. These sequence identities between racemase polypeptides are exemplary and are not meant to be exhaustive of all possible sequence identities within or between the isomerase and epimerase nucleic acid and polypeptide sequences disclosed herein. Also as discussed herein, identifying and/or designing nucleic acid or polypeptide sequences that differ in sequence from, for example, one or more isomerase or epimerase sequences is well within the ordinary skill of those in the art.
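As an illustration of how a percent sequence identity such as the 96% and 97% figures above might be computed, the following Python sketch counts identical positions between two pre-aligned sequences. It is an assumption-laden sketch, not the method used to obtain the values in this disclosure: the function name is hypothetical, the inputs must already be aligned to equal length, and columns containing a gap in either sequence are excluded from the denominator, which is only one of several conventions in use.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent of identical residues over the columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == "-" or b == "-":
            continue  # assumption: gap columns are not counted
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared
```

A threshold test such as `percent_identity(a, b) >= 97.0` could then be applied to a candidate sequence aligned against a reference sequence.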
[0052] In addition, the racemase polypeptide having the sequence shown in SEQ ID NO:412 is novel; the closest polypeptide sequence in the public databases exhibits only 23% sequence identity to SEQ ID NO:412. Therefore, polypeptides of the invention include polypeptides that have at least, for example, 25% sequence identity (e.g., at least 30%, 35%, 40%, 45%, 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, or 99% sequence identity) to SEQ ID NO:412 or fragments thereof and that have functional racemase activity. The racemase polypeptide having the sequence shown in SEQ ID NO:412 is encoded by the nucleic acid having the sequence shown in SEQ ID NO:411. Similarly, SEQ ID NO:411 is a novel nucleic acid, for which the closest nucleic acid sequence in the public databases exhibits only 43% sequence identity to SEQ ID NO:411. Therefore, nucleic acid molecules of the invention include molecules that have at least, for example, 45% sequence identity (e.g., at least 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, or 99% sequence identity) to SEQ ID NO:411 or fragments thereof and that encode a polypeptide that has functional racemase activity.
[0053] A fragment of an isomerase and epimerase nucleic acid or polypeptide refers to a portion of a full-length isomerase and epimerase nucleic acid or polypeptide. As used herein, "functional fragments" are those fragments of an isomerase or epimerase polypeptide that retain the respective enzymatic activity. "Functional fragments" also refer to fragments of an isomerase or epimerase nucleic acid that encode a polypeptide that retains the respective enzymatic activity. For example, functional fragments can be used in in vitro or in vivo reactions to catalyze transaminase or oxidation-reduction reactions, respectively. One example of a fragment, without limitation, is the PFAM domain from racemase polypeptides (PF01168 and PF00842; Finn et al., 2006, Nuc. Acids Res., Database Issue, 34:D247-D251). The PFAM domain is a fragment of a full-length racemase polypeptide that lacks about 30 to about 40 amino acids from the N-terminus of the polypeptide and also lacks about 10 to about 20 amino acids from the C-terminus of the polypeptide. The sequences of representative PFAM domains are shown in SEQ ID NOs:426, 440 and 462.
[0054] This disclosure provides for isomerase and epimerase polypeptides (and the nucleic acids encoding such polypeptides) lacking signal sequences (e.g., signal peptides) or prepro domains, and also provides for isomerases and epimerases that include signal sequences or prepro domains. The signal sequences or prepro domains can be isomerase or epimerase signal sequences or prepro domains, or such signal sequences or prepro domains can be obtained from non-isomerase, non-racemase and non-epimerase sequences. Such signal sequences or prepro domains can be operably linked to an isomerase or epimerase nucleic acid or polypeptide. A prepro domain can be located on the amino terminal or the carboxy terminal end of the polypeptide. Those in the art are familiar with SignalP, which can be used to identify signal peptides and cleavage sites. Representative signal sequences (also known as leader sequences) for racemase polypeptides include, without limitation, MHKKTLLATLIXGLLAGQAVA (SEQ ID NO:501), wherein X is F or L, and MPFRRTLLAASLALLITGQAPLYA (SEQ ID NO:502).
[0055] Isomerase or epimerase polypeptides can be obtained (e.g., purified) from natural sources (e.g., a biological sample) by known methods such as DEAE ion exchange, gel filtration, and hydroxyapatite chromatography. Natural sources include, but are not limited to, microorganisms such as bacteria and yeast.
A purified isomerase or epimerase polypeptide also can be obtained, for example, by cloning and expressing an isomerase or epimerase nucleic acid (e.g., SEQ ID NOs:1, 3, 5, 7, 9, 11, 13, 15, 17, 19, 21, 23, 25, 27, 29, 31, 33, 35, 37, 39, 41, 43, 45, 47, 49, 51, 53, 55, 57, 59, 61, 63, 65, 67, 69, 71, 73, 75, 77, 79, 81, 83, 85, 87, 89, 91, 93, 95, 97, 99, 101, 103, 105, 107, 109, 111, 113, 115, 117, 119, 121, 123, 125, 127, 129, 131, 133, 135, 137, 139, 141, 143, 145, 147, 149, 151, 153, 155, 157, 159, 161, 163, 165, 167, 169, 171, 173, 175, 177, 179, 181, 183, 185, 187, 189, 191, 193, 195, 197, 199, 201, 203, 205, 207, 209, 211, 213, 215, 217, 219, 221, 223, 225, 227, 229, 231, 233, 235, 237, 239, 241, 243, 245, 247, 249, 251, 253, 255, 257, 259, 261, 263, 265, 267, 269, 271, 273, 275, 277, 279, 281, 283, 285, 287, 289, 291, 293, 295, 297, 299, 301, 303, 305, 307, 309, 311, 313, 315, 317, 319, 321, 323, 325, 327, 329, 331, 333, 335, 337, 339, 341, 343, 345, 347, 349, 351, 353, 355, 357, 359, 361, 363, 365, 367, 369, 371, 373, 375, 377, 379, 381, 383, 385, 387, 389, 391, 393, 395, 397, 399, 401, 403, 405, 407, 409, 411, 413, 415, 417, 419, 421, 423, 425, 427, 429, 431, 433, 435, 437, 439, 441, 443, 445, 447, 449, 451, 453, 455, 457, 459, 461, 463, 465, 467, 469, 471, 473, 475, 477, 479, 481, 483, 485, 487, 489, 491, 493, 495, and 497) and purifying the resultant polypeptide using, for example, any of the known expression systems including, but not limited to, glutathione S-transferase (GST), pGEX (Pharmacia Biotech Inc), pMAL (New England Biolabs, Beverly, MA) or pRIT5 (Pharmacia, Piscataway, NJ). In addition, a purified isomerase or epimerase polypeptide can be obtained by chemical synthesis using, for example, solid-phase synthesis techniques (see, e.g., Roberge, 1995, Science, 269:202; Merrifield, 1997, Methods Enzymol., 289:3-13).
[0056] A purified isomerase or epimerase polypeptide or a fragment thereof can be used as an immunogen to generate polyclonal or monoclonal antibodies that have specific binding affinity for one or more isomerase or epimerase polypeptides. Such antibodies can be generated using standard techniques that are used routinely in the art. Full-length isomerase or epimerase polypeptides or, alternatively, antigenic fragments of isomerase or epimerase polypeptides can be used as immunogens. An antigenic fragment of an isomerase or epimerase polypeptide usually includes at least 8 (e.g., 10, 15, 20, or 30) amino acid residues of an isomerase or epimerase polypeptide (e.g., having the sequence shown in SEQ ID NOs:2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24, 26, 28, 30, 32, 34, 36, 38, 40, 42, 44, 46, 48, 50, 52, 54, 56, 58, 60, 62, 64, 66, 68, 70, 72, 74, 76, 78, 80, 82, 84, 86, 88, 90, 92, 94, 96, 98, 100, 102, 104, 106, 108, 110, 112, 114, 116, 118, 120, 122, 124, 126, 128, 130, 132, 134, 136, 138, 140, 142, 144, 146, 148, 150, 152, 154, 156, 158, 160, 162, 164, 166, 168, 170, 172, 174, 176, 178, 180, 182, 184, 186, 188, 190, 192, 194, 196, 198, 200, 202, 204, 206, 208, 210, 212, 214, 216, 218, 220, 222, 224, 226, 228, 230, 232, 234, 236, 238, 240, 242, 244, 246, 248, 250, 252, 254, 256, 258, 260, 262, 264, 266, 268, 270, 272, 274, 276, 278, 280, 282, 284, 286, 288, 290, 292, 294, 296, 298, 300, 302, 304, 306, 308, 310, 312, 314, 316, 318, 320, 322, 324, 326, 328, 330, 332, 334, 336, 338, 340, 342, 344, 346, 348, 350, 352, 354, 356, 358, 360, 362, 364, 366, 368, 370, 372, 374, 376, 378, 380, 382, 384, 386, 388, 390, 392, 394, 396, 398, 400, 402, 404, 406, 408, 410, 412, 414, 416, 418, 420, 422, 424, 426, 428, 430, 432, 434, 436, 438, 440, 442, 444, 446, 448, 450, 452, 454, 456, 458, 460, 462, 464, 466, 468, 470, 472, 474, 476, 478, 480, 482, 484, 486, 488, 490, 492, 494, 496, and 498), and encompasses an epitope of an isomerase or epimerase polypeptide such that an antibody (e.g., polyclonal or monoclonal; chimeric or humanized) raised against the antigenic fragment has specific binding affinity for one or more isomerase or epimerase polypeptides.
[0057] Polypeptides can be detected and quantified by any method known in the art including, but not limited to, nuclear magnetic resonance (NMR), spectrophotometry, radiography (protein radiolabeling), electrophoresis, capillary electrophoresis, high performance liquid chromatography (HPLC), thin layer chromatography (TLC), hyperdiffusion chromatography, various immunological methods, e.g., immunoprecipitation, immunodiffusion, immunoelectrophoresis, radioimmunoassays (RIAs), enzyme-linked immunosorbent assays (ELISAs), immuno-fluorescent assays, gel electrophoresis (e.g., SDS-PAGE), staining with antibodies, fluorescent activated cell sorter (FACS), pyrolysis mass spectrometry, Fourier-Transform Infrared Spectrometry, Raman spectrometry, GC-MS, and LC-Electrospray and cap-LC-tandem-electrospray mass spectrometries, and the like. Novel bioactivities can also be screened using methods, or variations thereof, described in U.S. Patent No. 6,057,103. Furthermore, one or more, or all, of the polypeptides of a cell can be measured using a protein array.
Methods of Using Isomerase or Epimerase Nucleic Acids and Polypeptides
[0058] The isomerase or epimerase polypeptides or the isomerase or epimerase nucleic acids encoding such isomerase and epimerase polypeptides, respectively, can be used in the conversion of one or more L-amino acids to the corresponding D-amino acid(s). It is noted that the reactions described herein are not limited to any particular method, unless otherwise stated. In one embodiment, one or more of the racemase nucleic acids or polypeptides disclosed herein can be used to produce D-tryptophan (or another D-amino acid) from L-tryptophan (or another L-amino acid), or vice versa. The reactions disclosed herein can take place in vivo, in vitro, or a combination thereof.
[0059] Constructs containing isomerase or epimerase nucleic acid molecules are provided. Constructs, including expression vectors, suitable for expressing an isomerase or epimerase polypeptide are commercially available and/or readily produced by recombinant DNA technology methods routine in the art. Representative constructs or vectors include, without limitation, replicons (e.g., RNA replicons, bacteriophages), autonomous self-replicating circular or linear DNA or RNA, a viral vector (e.g., an adenovirus vector, a retroviral vector or an adeno-associated viral vector), a plasmid, a phage, a phagemid, a cosmid, a fosmid, a bacteriophage or an artificial chromosome. The cloning vehicle can comprise an artificial chromosome comprising a bacterial artificial chromosome (BAC), a bacteriophage P1-derived vector (PAC), a yeast artificial chromosome (YAC), or a mammalian artificial chromosome (MAC). Exemplary vectors include, without limitation, pBR322 (ATCC 37017), pKK223-3, pSVK3, pBPV, pMSG, and pSVL (Pharmacia Fine Chemicals, Uppsala, Sweden), GEM1 (Promega Biotech, Madison, WI, USA), pQE70, pQE60, pQE-9 (Qiagen), pD10, psiX174, pBluescript II KS, pNH8A, pNH16a, pNH18A, pNH46A, pSV2CAT, pOG44, pXT1, pSG (Stratagene), ptrc99a, pKK223-3, pKK233-3, DR540, pRIT5 (Pharmacia), pKK232-8 and pCM7. See, also, U.S. Patent No. 5,217,879 for a description of representative plasmids, viruses, and the like.
[0060] A vector or construct containing an isomerase or epimerase nucleic acid molecule can have elements necessary for expression operably linked to the isomerase or epimerase nucleic acid. Elements necessary for expression include nucleic acid sequences that direct and regulate expression of nucleic acid coding sequences. One example of an element necessary for expression is a promoter sequence. Promoter sequences are sequences that are capable of driving transcription of a coding sequence. A promoter sequence can be, for example, an isomerase or epimerase promoter sequence, or a non-isomerase or non-epimerase promoter sequence. Non-isomerase and non-epimerase promoters include, for example, bacterial promoters such as lacI, lacZ, T3, T7, gpt, lambda PR, lambda PL and trp as well as eukaryotic promoters such as CMV immediate early, HSV thymidine kinase, early and late SV40, LTRs from retrovirus, and mouse metallothionein I. Promoters also can be, for example, constitutive, inducible, and/or tissue-specific. A representative constitutive promoter is the CaMV 35S; representative inducible promoters include, for example, arabinose, tetracycline-inducible and salicylic acid-responsive promoters.
[0061] Additional elements necessary for expression can include introns, enhancer sequences (e.g., an SV40 enhancer), response elements, or inducible elements that modulate expression of a nucleic acid. Elements necessary for expression also can include, for example, a ribosome binding site for translation initiation, splice donor and acceptor sites, and a transcription terminator. Elements necessary for expression can be of bacterial, yeast, insect, mammalian, or viral origin, and vectors or constructs can contain a combination of elements from different origins. Elements necessary for expression are described, for example, in Goeddel, 1990, Gene Expression Technology: Methods in Enzymology, 185, Academic Press, San Diego, CA.
[0062] A vector or construct as described herein further can include sequences such as those encoding a selectable marker (e.g., genes encoding dihydrofolate reductase or genes conferring neomycin resistance for eukaryotic cells; genes conferring tetracycline or ampicillin resistance for E. coli; and the gene encoding TRP1 for S. cerevisiae), sequences that can be used in purification of an isomerase or epimerase polypeptide (e.g., 6xHis tag), and one or more sequences involved in replication of the vector or construct (e.g., origins of replication). In addition, a vector or construct can contain, for example, one or two regions that have sequence homology for integrating the vector or construct. Vectors and constructs for genomic integration are well known in the art.
[0063] As used herein, operably linked means that a promoter and/or other regulatory element(s) are positioned in a vector or construct relative to an isomerase or epimerase nucleic acid in such a way as to direct or regulate expression of the isomerase or epimerase nucleic acid. Generally, promoter and other elements necessary for expression that are operably linked to a transcribed sequence are physically contiguous to the transcribed sequence, i.e., they are cis-acting. Some transcriptional regulatory sequences such as enhancers, however, need not be physically contiguous or located in close proximity to the coding sequences whose expression they enhance.
[0064] Also provided are host cells. Host cells generally contain a nucleic acid sequence of the invention, e.g., a sequence encoding an isomerase or an epimerase, or a vector or construct as described herein. The host cell may be any of the host cells familiar to those skilled in the art such as prokaryotic cells or eukaryotic cells including bacterial cells, fungal cells, yeast cells, mammalian cells, insect cells, or plant cells. Exemplary bacterial cells include any species within the genera Escherichia, Bacillus, Streptomyces, Salmonella, Pseudomonas and Staphylococcus, including, e.g., E. coli, L. lactis, B. subtilis, B. cereus, S. typhimurium, P. fluorescens. Exemplary fungal cells include any species of Aspergillus, and exemplary yeast cells include any species of Pichia, Saccharomyces, Schizosaccharomyces, or Schwanniomyces, including P. pastoris, S. cerevisiae, or S. pombe. Exemplary insect cells include any species of Spodoptera or Drosophila, including Drosophila S2 and Spodoptera Sf9. Exemplary animal cells include CHO, COS, Bowes melanoma, C127, 3T3, HeLa and BHK cell lines. See, for example, Gluzman, 1981, Cell, 23:175. The selection of an appropriate host is within the abilities of those skilled in the art.
[0065] Techniques for introducing nucleic acid into a wide variety of cells are well known and described in the technical and scientific literature. A vector or construct can be introduced into host cells using any of a variety of techniques, including transformation, transfection, transduction, viral infection, gene guns, or Ti-mediated gene transfer. Particular methods include calcium phosphate transfection, DEAE-Dextran mediated transfection, lipofection, or electroporation (Davis et al., 1986, Basic Methods in Molecular Biology). Exemplary methods include CaPO4 precipitation, liposome fusion, lipofection (e.g., LIPOFECTIN™), electroporation, viral infection, etc. The isomerase or epimerase nucleic acids may stably integrate into the genome of the host cell (for example, with retroviral introduction) or may exist either transiently or stably in the cytoplasm (i.e., through the use of traditional plasmids, utilizing standard regulatory sequences, selection markers, etc.).
[0066] The content of host cells usually is harvested by centrifugation, disrupted by physical or chemical means, and the resulting crude extract retained for further purification. Microbial cells employed for expression of proteins can be disrupted by any convenient method, including freeze-thaw cycling, sonication, mechanical disruption, or the use of cell lysing agents. Such methods are well known to those skilled in the art. The expressed polypeptide or fragment thereof can be recovered and purified from cell cultures by methods including, but not limited to, precipitation (e.g., ammonium sulfate or ethanol), acid extraction, chromatography (e.g., anion or cation exchange, phosphocellulose, hydrophobic interaction, affinity, hydroxylapatite and lectin). If desired, high performance liquid chromatography (HPLC) can be employed for final purification steps.
[0067] Cell-free translation systems can also be employed to produce a polypeptide of the invention. Cell-free translation systems can use mRNAs transcribed from a DNA construct comprising a promoter operably linked to a nucleic acid encoding the polypeptide or fragment thereof. In some aspects, the DNA construct may be linearized prior to conducting an in vitro transcription reaction. The transcribed mRNA is then incubated with an appropriate cell-free translation extract, such as a rabbit reticulocyte extract, to produce the desired polypeptide or fragment thereof.
[0068] An isomerase or epimerase polypeptide, a fragment, or a variant thereof can be assayed for activity by any number of methods. Methods of detecting or measuring the activity of an enzymatic polypeptide generally include combining a polypeptide, fragment or variant thereof with an appropriate substrate and determining whether the amount of substrate decreases and/or the amount of product increases. The substrates used to evaluate the activity of a number of racemases disclosed herein typically were one or more L-amino acids (e.g., L-tryptophan) and the products were the corresponding isomerized D-amino acid (e.g., D-tryptophan). In addition, racemases may show very little preference for or between substrate amino acids (e.g., broad activity racemases), while other racemases may exhibit a preference for one or more amino acids. In addition to L- or D-amino acid substrates, it is expected that polypeptides disclosed herein also will utilize substituted L- or D-amino acid substrates such as, without limitation, chlorinated tryptophan or 5-hydroxytryptophan. It is noted that D-isomers can be distinguished and/or separated from L-isomers using methods known in the art including, but not limited to, chiral chromatography, simulated moving bed (SMB) continuous chromatography, chiral auxiliaries, and/or enzymatic resolution.
[0069] One method for evaluating racemase activity is described in Schonfeld & Bornscheuer (2004, Anal. Chem., 76(4):1184-8), which describes a polarimetric assay that identifies alpha-amino acid racemase activity using a glutamate racemase from Lactobacillus fermentii expressed in E. coli, and measuring the time-dependent change of the optical rotation using L-glutamate as the substrate. In addition, methods of evaluating candidate polypeptides for racemase activity are described in Part A and Part B of the Example section herein. For the purposes of determining whether or not a polypeptide falls within the scope of the invention, the methods described in Part B of the Example section are employed.
[0070] Typically, an isomerase or epimerase polypeptide exhibits activity in the range of about 0.05 to 20 units (e.g., about 0.10, 0.20, 0.30, 0.40, 0.50, 0.60, 0.70, 0.80, 0.90, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0, 4.5, 5.0, 5.5, 6.0, 6.5, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19 or 19.5 or more units). As used herein, a unit equals one μmol of product released per minute per mg of enzyme. In one embodiment, one unit of activity for a racemase polypeptide is one μmol of an isomer with inverted configuration (from the starting isomer) produced per minute per mg of enzyme (formed from the respective alpha-amino acid or amine). In an alternative embodiment, one unit of activity for an amino acid racemase is one μmol of R-amino acid produced per minute per mg of enzyme formed from the corresponding S-amino acid. In an alternative embodiment, one unit of activity for an amino acid racemase is one μmol of S-amino acid produced per minute per mg of enzyme formed from the corresponding R-amino acid.
[0071] The conversion of L-tryptophan to D-tryptophan using one or more of the isomerase or epimerase nucleic acids or polypeptides disclosed herein can be performed in vitro or in vivo, in solution or in a host cell, in series or in parallel. When one or more reactions are performed in vitro, the desired ingredients for the reaction(s) can be combined by admixture in an aqueous reaction medium or solution and maintained for a period of time sufficient for the desired product(s) to be produced. Alternatively, one or more isomerase or epimerase polypeptides used in the one or more of the reactions described herein can be immobilized onto a solid support. Examples of solid supports include those that contain epoxy, aldehyde, chelators, or primary amine groups. Specific examples of suitable solid supports include, but are not limited to, Eupergit® C (Rohm and Haas Company, Philadelphia, PA) resin beads and SEPABEADS® EC-EP (Resindion).
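The unit definition given in paragraph [0070] (one μmol of product per minute per mg of enzyme) can be illustrated with a short Python sketch. The function names and the example quantities are hypothetical and are not data from this disclosure; the sketch also assumes product formation is linear over the measured interval.

```python
def specific_activity(product_umol, minutes, enzyme_mg):
    """Units of activity: micromoles of product per minute per mg of enzyme."""
    if minutes <= 0 or enzyme_mg <= 0:
        raise ValueError("reaction time and enzyme mass must be positive")
    return product_umol / (minutes * enzyme_mg)


def within_typical_range(units, low=0.05, high=20.0):
    """Check a value against the approximate 0.05 to 20 unit range stated above."""
    return low <= units <= high


# Hypothetical example: 3.0 umol of D-tryptophan formed in 10 min by 0.5 mg of enzyme.
example_units = specific_activity(3.0, 10.0, 0.5)
```

In this hypothetical case the enzyme would have an activity of about 0.6 units, which falls inside the stated range.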
[0072] To generate D-tryptophan from L-tryptophan in vivo, an isomerase or epimerase nucleic acid (e.g., an expression vector) can be introduced into any of the host cells described herein. Depending upon the host cell, many or all of the co-factors (e.g., a metal ion, a coenzyme, a pyridoxal-phosphate, or a phosphopantetheine) and/or substrates necessary for the conversion reactions to take place can be provided in the culture medium. After allowing the in vitro or in vivo reaction to proceed, the efficiency of the conversion can be evaluated by determining whether the amount of substrate has decreased or the amount of product has increased. [0073] In some embodiments, the activity of one or more of the isomerase or epimerase polypeptides disclosed herein can be improved or optimized using any number of strategies known to those of skill in the art. For example, the in vivo or in vitro conditions under which one or more reactions are performed such as pH or temperature can be adjusted to improve or optimize the activity of a polypeptide. In addition, the activity of a polypeptide can be improved or optimized by re-cloning the isomerase or epimerase nucleic acid into a different vector or construct and/or by using a different host cell. For example, a host cell can be used that has been genetically engineered or selected to exhibit increased uptake or production of tryptophan (see, for example, U.S. Patent No. 5,728,555). 
Further, the activity of an isomerase or epimerase polypeptide can be improved or optimized by ensuring or assisting in the proper folding of the polypeptide (e.g., by using chaperone polypeptides) or in the proper post-translational modifications such as, but not limited to, acetylation, acylation, ADP-ribosylation, amidation, glycosylation, hydroxylation, iodination, methylation, myristoylation, oxidation, pegylation, phosphorylation, prenylation, selenoylation, sulfation, disulfide bond formation, and demethylation as well as covalent attachment of molecules such as flavin, a heme moiety, a nucleotide or nucleotide derivative, a lipid or lipid derivative, and/or a phosphatidylinositol. In addition, the solubility of a polypeptide can be increased using any number of methods known in the art such as, but not limited to, low temperature expression or periplasmic expression.
[0074] A number of polypeptides were identified herein that exhibit racemase activity. Specifically, SEQ ID NOs:412, 400, 406, 410, 408, 416, 418, 424, 440, 442, 444, 446, 454, 442, 474, 476, 322, 336, 338 and 442 exhibit isomerase activity. As disclosed herein, the sequence shown in SEQ ID NO:412 represents a very unique racemase, as the most homologous sequence in the public databases has only 30% sequence identity to SEQ ID NO:412. Additionally, despite the fact that SEQ ID NO:412 exhibited low solubility in its native form, SEQ ID NO:412 still exhibits very effective racemase activity. SEQ ID NO:322 also is unique as the encoded polypeptide is only 232 amino acids, making SEQ ID NO:322 the smallest active polypeptide identified.
[0075] Use of Isomerase or Epimerase Nucleic Acids or Polypeptides in the Production of Monatin and Monatin Derivatives
[0076] Monatin is a high-intensity sweetener having the chemical formula:
Figure imgf000029_0001
[0077] Monatin includes two chiral centers leading to four potential stereoisomeric configurations: the R,R configuration (the "R,R stereoisomer" or "R,R monatin"); the S,S configuration (the "S,S stereoisomer" or "S,S monatin"); the R,S configuration (the "R,S stereoisomer" or "R,S monatin"); and the S,R configuration (the "S,R stereoisomer" or "S,R monatin"). As used herein, unless stated otherwise, the term "monatin" is used to refer to compositions including all four stereoisomers of monatin, compositions including any combination of monatin stereoisomers (e.g., a composition including only the R,R and S,S stereoisomers of monatin), as well as a single isomeric form (or any of the salts thereof). Due to various numbering systems for monatin, monatin is known by a number of alternative chemical names, including: 2-hydroxy-2-(indol-3-ylmethyl)-4-aminoglutaric acid; 4-amino-2-hydroxy-2-(1H-indol-3-ylmethyl)-pentanedioic acid; 4-hydroxy-4-(3-indolylmethyl)glutamic acid; and 3-(1-amino-1,3-dicarboxy-3-hydroxy-but-4-yl)indole.
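The count of four stereoisomers follows directly from two chiral centers, each of which can be R or S. As a small illustrative sketch (the function name is hypothetical), the labels can be enumerated as follows:

```python
from itertools import product


def stereoisomer_labels(n_centers: int) -> list[str]:
    """Enumerate R/S labels for a molecule with n_centers independent chiral centers."""
    return [",".join(combo) for combo in product("RS", repeat=n_centers)]


# Monatin has two chiral centers, so 2**2 = 4 configurations are expected.
monatin_isomers = stereoisomer_labels(2)
```

The enumeration only counts label combinations; it says nothing about which isomers a given enzyme pathway actually produces.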
[0078] Methods of producing various stereoisomers of monatin (e.g., R,R monatin) are disclosed in, for example, WO 07/133183 and WO 07/103389. One or more of the racemase polypeptides disclosed herein, in the presence of L-tryptophan, can be used in methods known to those of skill in the art to make a monatin composition. As disclosed in both WO 07/133183 and WO 07/103389, the conversion of indole-3-pyruvate (or derivatives thereof; see, for example, WO 07/103389) to 2-hydroxy-2-(indol-3-ylmethyl)-4-keto glutaric acid ("monatin precursor" or "MP") dictates the first chiral center of monatin, while the conversion of MP to monatin dictates the second chiral center. Therefore, the racemases disclosed herein, with or without another polypeptide (e.g., an R-specific aldolase as disclosed in WO 07/103389 and/or a D-aminotransferase as disclosed in co-pending U.S. Application No. 61/018,814), can be used to generate a desired percentage or a minimum or maximum desired percentage of one or more particular monatin stereoisomers (e.g., R,R monatin) relative to other monatin stereoisomers (e.g., S,R monatin). In some embodiments, amino acid racemases that do not isomerize significant amounts of monatin are used rather than racemases that isomerize monatin as a method of maintaining the desired percentage of one or more particular monatin stereoisomers.
[0079] Monatin that is produced utilizing one or more of the racemase polypeptides disclosed herein can be at least about 0.5-30% R,R-monatin by weight of the total monatin produced. In other embodiments, the monatin produced using one or more of the polypeptides disclosed herein is greater than 30% R,R-monatin, by weight of the total monatin produced; for example, the R,R-monatin is 40%, 50%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95% or 99% of the total monatin produced. Alternatively, various amounts of two or more preparations of monatin can be combined so as to result in a preparation that is a desired percentage of R,R-monatin.
[0080] Monatin produced using one or more of the racemase polypeptides disclosed herein can be, for example, a derivative. "Monatin derivatives" have the following structure:
Figure imgf000030_0001
wherein Ra, Rb, Rc, Rd, and Re each independently represent any substituent selected from a hydrogen atom, a hydroxyl group, a C1-C3 alkyl group, a C1-C3 alkoxy group, an amino group, or a halogen atom, such as an iodine atom, bromine atom, chlorine atom, or fluorine atom. However, Ra, Rb, Rc, Rd, and Re cannot simultaneously all be hydrogen. Alternatively, Rb and Rc, and/or Rd and Re may together form a C1-C4 alkylene group, respectively. "Substituted monatin" refers to, for example, halogenated or chlorinated monatin or monatin derivatives. See, for example, U.S. Publication No. 2005/0118317.
[0081] Monatin derivatives also can be used as sweeteners. For example, chlorinated D-tryptophan, particularly 6-chloro-D-tryptophan, which has structural similarities to R,R monatin, has been identified as a non-nutritive sweetener. Similarly, halogenated and hydroxy-substituted forms of monatin have been found to be sweet. See, for example, U.S. Publication No. 2005/0118317. Substituted indoles have been shown in the literature to be suitable substrates and have yielded substituted tryptophans. See, for example, Fukuda et al., 1971, Appl. Environ. Microbiol., 21:841-43. The halogen does not appear to sterically hinder the catalytic mechanism or the enantiospecificity of the enzyme. Therefore, halogens and hydroxyl groups should be substitutable for hydrogen, particularly at positions 1-4 of the benzene ring in the indole of tryptophan, without interfering in subsequent conversions to D- or L-tryptophan, indole-3-pyruvate, MP, or monatin.
[0082] In accordance with the present invention, there may be employed conventional molecular biology, microbiology, biochemical, and chemical techniques within the skill of the art. Such techniques are explained fully in the literature. The invention will be further described in the following examples, which do not limit the scope of the invention described in the claims.
EXAMPLES
[0083] The Examples in Part A describe the methodologies used for characterization of the candidate isomerase and epimerase nucleic acids and the encoded polypeptides. Additional characterization of a subset of the isomerase and epimerase nucleic acids and polypeptides, particularly the nucleic acids encoding amino acid racemases, is described in Part B.
Part A
Example 1 — Effect of Leader Sequence on Racemase
[0084] Many of the racemases that were discovered had native signal/leader sequences. The signal sequences and corresponding cleavage sites were identified by SignalP 3.0 (at cbs.dtu.dk/services/SignalP/ on the World Wide Web). It was observed that clones containing racemases with leader sequences tended to be more difficult to grow. The clones grew well with fresh transformations; however, they did not grow well when they were subcultured or inoculated from glycerol stocks. Samples were grown (or, at least, attempted) a minimum of two times.
[0085] The table below indicates several clones that contained their native signal sequences. These samples were in the pCR4-TOPO vector / TOP10 host (Invitrogen, Carlsbad, CA). Growth conditions were overnight in LB/kanamycin 50 μg/mL, 37°C. All of these samples were difficult to grow. Nineteen of the clones have the following leader sequence: MHKKTLLATLIFGLLAGQAVA (SEQ ID NO:499). Seventeen of the clones have a leader sequence that differs by one amino acid: MHKKTLLATLILGLLAGQAVA (SEQ ID NO:500). Therefore, the consensus sequence for racemase leader sequences is MHKKTLLATLIXGLLAGQAVA (SEQ ID NO:501) where X is F or L. Table 1 shows the leader sequences that were identified.
Table 1: Leadered racemase clones
Figure imgf000031_0001
Figure imgf000032_0001
Figure imgf000033_0001
Figure imgf000034_0001
Figure imgf000035_0001
Figure imgf000036_0001
Figure imgf000037_0001
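The consensus leader sequence stated in paragraph [0085] can be reproduced by comparing the two leader sequences position by position and writing X wherever they differ. The sketch below is illustrative only; the function name is hypothetical, and only the two sequences (SEQ ID NO:499 and SEQ ID NO:500) come from the text.

```python
def consensus(sequences: list[str], wildcard: str = "X") -> tuple[str, dict[int, list[str]]]:
    """Build a consensus of equal-length sequences.

    Returns the consensus string and a map of 1-based variable positions
    to the sorted residues observed at each of them.
    """
    if len({len(s) for s in sequences}) != 1:
        raise ValueError("all sequences must have the same length")
    letters = []
    variable = {}
    for position, column in enumerate(zip(*sequences), start=1):
        residues = sorted(set(column))
        if len(residues) == 1:
            letters.append(residues[0])
        else:
            letters.append(wildcard)
            variable[position] = residues
    return "".join(letters), variable


leader_499 = "MHKKTLLATLIFGLLAGQAVA"  # SEQ ID NO:499
leader_500 = "MHKKTLLATLILGLLAGQAVA"  # SEQ ID NO:500
leader_consensus, variable_positions = consensus([leader_499, leader_500])
```

Under this comparison the two leaders differ only at position 12 (F or L), which matches the consensus given as SEQ ID NO:501.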
[0086] The wild type (leadered) broad amino acid racemase (BAR) from Pseudomonas putida KT2440 (cloned as described in Example 28 of WO 2007/103389) was not difficult to grow in the pET30 vector (Novagen, Madison, WI) and E. coli expression host BL21(DE3) (Novagen, Madison, WI). The leader sequence for the P. putida KT2440 BAR was: MPFRRTLLAASLALLITGQAPLYA (SEQ ID NO:502).
[0087] To further investigate the effect of the leader sequence on growth, some racemases were subcloned into expression vectors with and without the native signal sequences. The samples are listed below (Table 2). The left-hand column indicates the leadered subclone version while the middle column indicates the same gene subcloned without a leader sequence (for example, SEQ ID NO:412 is the leaderless version of SEQ ID NO:490).
[0088] These samples were in the pSE420-cHis vector / MB2946 host (Strych & Benedik, 2002, J. Bacteriology, 184:4321-5). The vector pSE420-cHis is a derivative of pSE420 (Invitrogen, Carlsbad, CA). For pSE420-cHis, the vector was cut with NcoI and HindIII, and ligated with C-His: 5'-CCA TGG GAG GAT CCA GAT CTC ATC ACC ATC ACC ATC ACT AAG CTT-3' (SEQ ID NO:569). The expression of the His-tag in this vector depends on the choice of host and stop codon. That is, if a TAG stop codon and a supE host are used, the His-tag is expressed; if a TAG stop codon and a non-supE host are used, the His-tag is not expressed. Unless indicated otherwise, the His-tag was not expressed in these experiments.
[0089] Growth conditions were overnight in LB with 100 μg/mL carbenicillin, 37°C.
Figure imgf000038_0001
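The C-His linker given in paragraph [0088] (SEQ ID NO:569) can be inspected in silico. The sketch below translates the linker starting at the ATG that lies inside the NcoI site; treating that ATG as the start of the reading frame is an assumption made here for illustration, and the helper name is hypothetical. The codon table is the standard genetic code.

```python
# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

# C-His linker from paragraph [0088], spaces removed.
C_HIS = "CCA TGG GAG GAT CCA GAT CTC ATC ACC ATC ACC ATC ACT AAG CTT".replace(" ", "")


def translate_from_first_atg(dna: str) -> str:
    """Translate from the first ATG up to (not including) the first stop codon."""
    start = dna.find("ATG")
    if start < 0:
        return ""
    residues = []
    for i in range(start, len(dna) - 2, 3):
        amino_acid = CODON_TABLE[dna[i:i + 3]]
        if amino_acid == "*":
            break
        residues.append(amino_acid)
    return "".join(residues)


linker_peptide = translate_from_first_atg(C_HIS)
has_ncoi_site = "CCATGG" in C_HIS     # NcoI recognition sequence
has_hindiii_site = "AAGCTT" in C_HIS  # HindIII recognition sequence
```

In this frame the linker encodes six consecutive histidines followed by a stop codon, consistent with a C-terminal His-tag that is only produced when translation reads through the stop codon of the upstream insert.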
[0090] In general, the leadered racemase subclones were more difficult to grow than the non-leadered counterparts under the conditions described in Part A. SEQ ID NOs:490, 494, 496 and 498 were difficult to grow. SEQ ID NOs:428 and 430 would grow; however, they grew extremely slowly and did not reach an inducible OD600 = 0.5 within 8 hours. SEQ ID NO:492 was the only leadered racemase subclone tested that was not difficult to grow.
[0091] In summary, leadered racemase candidates generally were harder to grow than the non-leadered counterparts under the conditions described above. The reason for the decrease in viability or robustness has not been identified. The cells could potentially be expelling the plasmids, thereby losing the antibiotic resistance over time. In order to maximize robustness, the number of rounds of growth for racemases with leader sequences was minimized. This was done by storing the DNA and performing fresh transformations each time the constructs were used.
[0092] The host organisms, expression conditions, and post expression cell handling can all affect whether there is detectable tryptophan racemase activity under the conditions of the assay in the presence of the respective leader sequences. Additionally, under optimized conditions, it is expected that racemases could have tryptophan racemase activity with or without leader sequences (native or artificial such as PelB).
Example 2 — Improvement of SEQ ID NO:412 solubility using ArcticExpress™ hosts
[0093] The expression of SEQ ID NO:411 nucleic acid encoding a racemase polypeptide having the sequence of SEQ ID NO:412 was analyzed by SDS-PAGE. SEQ ID NO:411 nucleic acid expressed well and the resulting SEQ ID NO:412 polypeptide had high activity even though only a portion (<20%) of the corresponding band in the total protein lane was soluble. In order to improve soluble expression, the racemase was moved into two ArcticExpress™ hosts (Stratagene, La Jolla, CA). The racemase was subcloned into the pET28b vector and the DNA was transformed into ArcticExpress™(DE3) and ArcticExpress™(DE3)RIL and plated on LB kanamycin 50 μg/mL, gentamicin 20 μg/mL, and LB kanamycin 50 μg/mL, gentamicin 20 μg/mL, streptomycin 75 μg/mL, respectively. The pET28b vector was also transformed into each host as a negative control. Samples were grown overnight at 30°C. Four colonies were picked for each construct from each ArcticExpress™ host.
Table 3. Names of constructs in ArcticExpress™(DE3) & ArcticExpress™(DE3)RIL
Figure imgf000039_0001
[0094] Cultures were streaked onto fresh plates with the appropriate antibiotics two days prior to performing a large scale growth. Samples were grown on LB plates with the appropriate antibiotics and incubated overnight at 30°C. The next day, a single colony was picked from each plate and used to inoculate 50 mL of LB with appropriate antibiotics. Samples were incubated overnight at 30°C and 210 rpm. The next day, the culture was used to inoculate 500 mL of LB with the appropriate antibiotics in a 2.8 L baffled flask to an OD600nm of 0.05. The cultures were grown at 30°C at 210 rpm. When the OD600nm was between 0.4-0.8, the flasks were transferred to an 11°C incubator and allowed to incubate for 10 minutes prior to inducing with a final concentration of 1 mM IPTG. Samples were induced overnight at 11°C at 210 rpm (with the exception of DE3-2 and DE3-4, which were induced at 16°C).
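Inoculating a culture "to an OD600nm of 0.05", as described in paragraph [0094], is a simple dilution calculation. The following sketch assumes that optical density scales linearly with dilution and that the final volume includes the inoculum; the overnight OD value is hypothetical, since the text does not report it.

```python
def inoculum_volume_ml(starter_od: float, target_od: float, final_volume_ml: float) -> float:
    """Volume of starter culture needed so that the final culture starts at target_od.

    Uses C1 * V1 = C2 * V2, treating OD600 as proportional to cell density.
    """
    if starter_od <= target_od:
        raise ValueError("starter culture must be denser than the target")
    return target_od * final_volume_ml / starter_od


# Hypothetical overnight culture at OD600 = 4.0, used to seed 500 mL at OD600 = 0.05.
seed_ml = inoculum_volume_ml(starter_od=4.0, target_od=0.05, final_volume_ml=500.0)
```

In practice dense cultures are usually diluted before reading the OD, because the linear relationship assumed here breaks down at high optical densities.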
[0095] The next morning the cultures were collected and centrifuged at 6,000 rpm for 20 minutes, and the supernatant was discarded. The pellet was resuspended in 20 mL of 50 mM sodium phosphate buffer (pH 7.5), 400 μg/mL lysozyme, 26 U/mL DNaseI. Cells were lysed using a microfluidizer (Microfluidics Corporation, Newton, MA) per the manufacturer's instructions; each sample was passed through the microfluidizer three times. One mL of lysate was set aside for gel analysis of the total protein fraction. The remainder of the lysate was centrifuged at 12,000 rpm at 4°C for 30 minutes. The supernatant was saved. Protein concentration was determined using the Bio-Rad Protein Assay (Bio-Rad, Hercules, CA). The soluble and whole cell fractions were then analyzed by SDS-PAGE using 4-20% Tris-glycine gels (Invitrogen, Carlsbad, CA).
Table 4. Approximate soluble expression levels of racemase constructs in pET28b/ArcticExpress™(DE3) and pET28b/ArcticExpress™(DE3)RIL
Figure imgf000040_0001
[0096] As shown above, the soluble expression of the racemase, expressed as a percentage of the corresponding total racemase protein band, was improved in the ArcticExpress™(DE3) and ArcticExpress™(DE3)RIL hosts.
[0097] Samples were tested for activity using a racemase assay (as described in Example 4). Racemases were loaded at 7.5, 0.75, 0.075 μg/mL total protein and incubated with 10 mM L-tryptophan and 10 μM PLP at pH 8 and 37°C. At indicated timepoints, 50 μL of the reaction product was added to 150 μL of ice cold acetonitrile. Samples were vortexed for 30 seconds and the supernatant was then diluted fifty-fold in methanol. Samples were then analyzed by LC/MS/MS (as described in Example 4) to monitor the D-tryptophan formed and the residual L-tryptophan.
Table 5: Racemase activity of SEQ ID NO:412 expressed from constructs in ArcticExpress™(DE3) or ArcticExpress™(DE3)RIL
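The sample work-up in paragraph [0097] involves two successive dilutions (50 μL into 150 μL of acetonitrile, then fifty-fold in methanol), which determine the concentration that reaches the LC/MS/MS. The sketch below is a rough estimate that assumes additive volumes and no loss of analyte on quenching; the function name is hypothetical.

```python
def dilution_factor(sample_ul: float, diluent_ul: float) -> float:
    """Fold dilution when sample_ul of sample is mixed with diluent_ul of diluent."""
    return (sample_ul + diluent_ul) / sample_ul


# Step 1: 50 uL of reaction quenched in 150 uL of acetonitrile.
quench_fold = dilution_factor(50.0, 150.0)

# Step 2: supernatant diluted fifty-fold in methanol.
overall_fold = quench_fold * 50.0

# Total tryptophan (D plus L) starts at 10 mM in the reaction.
injected_total_trp_mM = 10.0 / overall_fold
```

Measured concentrations would be multiplied by the same overall factor to express results in terms of the original reaction mixture.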
Figure imgf000040_0002
7.5 μg/mL total protein
[0098] As shown above, all of the constructs were active in ArcticExpress(DE3) and ArcticExpress(DE3)RIL at a 7.5 μg/mL total protein loading. All the constructs were also active when the protein was loaded at 0.75 and 0.075 μg/mL total protein. The vector/host controls had little or no activity compared to the racemase constructs.
[0099] In summary, the racemase polypeptide having the sequence of SEQ ID NO:412 was active and soluble expression was improved in ArcticExpress™(DE3) and ArcticExpress™(DE3)RIL.
Example 3 — Activity of Racemase PFAM domain subclones
[00100] Several sets of proprietary degenerate PCR primers were designed as part of a sequence-based discovery effort for the amplification of racemases from mixed population environmental DNA libraries as described in U.S. Patent No. 6,455,254. One set of proprietary degenerate PCR primers amplified the PFAM domain of the racemase exclusively. The racemases were amplified using a sequence-based discovery method as described in U.S. Patent No. 6,455,254. The PFAM domain is slightly smaller than the full-length racemase protein. As compared to the full length racemase, the PFAM domain is missing about 30-40 amino acids from the N-terminus (mostly signal peptide) and about 10-20 amino acids from the C-terminus.
[00101] Several racemase PFAM domains were amplified using this method. Three PFAM domains were selected for subcloning in order to determine if the PFAM domain was sufficient to detect racemase activity. The samples were subcloned into the pSE420-cHis vector (His-tag not expressed) in MB2946 host cells (Strych & Benedik, 2002, J. Bacteriology, 184:4321-5). The subclones were SEQ ID NOs:425, 439 and 461 encoding SEQ ID NOs:426, 440 and 462, respectively.
[00102] The polypeptide having the sequence shown in SEQ ID NO:426 was selected for activity testing. Flasks containing 50 mL LB, 100 μg/mL carbenicillin and 50 mM D-alanine were inoculated from glycerol stocks and grown overnight at 37°C with shaking. The following morning, flasks containing 400 mL LB, 100 μg/mL carbenicillin and 50 mM D-alanine were inoculated to an OD600nm of 0.05. Cultures were grown at 37°C with shaking and induced with 1 mM IPTG when the OD600nm was between 0.5-0.8. Cultures were induced overnight at 30°C.
[00103] Cell pellets were collected by centrifugation at 6000 rpm for 20 minutes. Cell pellets were resuspended in 20 mL of 50 mM sodium phosphate buffer pH 7.5 with 26 U/mL DNaseI. Cell pellets were lysed in a microfluidizer (Microfluidics Corporation, Newton, MA) per the manufacturer's instructions. Samples were centrifuged at 12,000 rpm for 30 minutes and the soluble fraction was collected. Protein concentration was determined by comparing the absorbance of cell extract containing the SEQ ID NO:426 polypeptide to known standards in the Bio-Rad Protein Assay reagent (Bio-Rad, Hercules, CA).
[00104] Samples were tested for activity using the following racemase assay conditions (also as described in Example 4). Racemases were loaded at 10 mg/mL total protein and incubated with 10 mM L-tryptophan and 10 μM PLP at pH 8 and a temperature of 37°C. At indicated timepoints (0, 2, 4, and 24 hours), 50 μL of the reaction product was added to 150 μL of ice-cold acetonitrile. Samples were vortexed for 30 seconds and passed through a 0.2 μm filter and the filtrate was then diluted fifty-fold in methanol. Samples were then analyzed by LC/MS/MS (as described in Example 4) to monitor the D-tryptophan formed (Table 6).
Table 6. Racemase activity of PFAM
Figure imgf000042_0001
[00105] E. coli MB2946 host cells (Strych & Benedik, supra) were used as the negative control, while wild type Pseudomonas putida KT2440 BAR was used as a positive control. The racemase having SEQ ID NO:426 was active under the conditions described in Example 4. The results above thereby demonstrate that a racemase PFAM domain could have sufficient structural elements to maintain racemase catalytic activity.
Example 4 — Growth and Racemase Assay Procedures
Enzyme Preparation
[00106] Glycerol stocks were used to inoculate flasks containing 50 mL of LB with the appropriate antibiotic. The starter culture was grown overnight at 37°C and 230 rpm. The OD600nm of the starter culture was checked, and the starter culture was used to inoculate a 400 mL culture to an OD600nm of 0.05. The culture was incubated at 37°C and 230 rpm, and the OD600nm was checked periodically. The cultures were induced, typically with 1 mM IPTG, when the OD600nm reached between 0.5-0.8. Induced cultures were incubated overnight at 30°C and 230 rpm. The culture was harvested by pelleting cells at 4000 rpm for 15 minutes. The supernatant was poured off, and cell pellets were either lysed immediately or frozen for later use.
[00107] The pellets were resuspended in 20 mL of 50 mM sodium phosphate buffer (pH 7.5) supplemented with 26 U/mL of DNase. Once the pellet was completely resuspended in the buffer, cells were lysed using a microfluidizer (Microfluidics Corporation, Newton, MA) per the manufacturer's instructions. The cell extract was collected and centrifuged at 11,000 rpm for 30 minutes. The supernatant was collected in a clean tube and filtered through a 0.2 μm filter. Five mL aliquots of clarified lysate were placed in each vial and freeze-dried using the lyophilizer (Virtis Company, Gardiner, NY) per the manufacturer's instructions. A 1 mL sample was retained for protein estimation using the Bio-Rad Protein Assay Reagent (Bio-Rad, Hercules, CA) and SDS-PAGE analysis. Once the lysate was lyophilized, the amount of protein per vial was calculated.
[00108] Enzymes were prepared for the activity assay by resuspending in 50 mM sodium phosphate (pH 7.5). The racemase assays were usually run with about 10-20 mg/mL total protein.
Racemase Assay
[00109] Ten mM L-tryptophan, 10 μM PLP, 50 mM sodium phosphate pH 8, and 10 mg/mL racemase (total protein) prepared as described above (see Example 4 — Enzyme Preparation) were combined and incubated at 37°C and 300 rpm. Fifty μL of the reaction product were transferred to 150 μL of ice cold acetonitrile at timepoints (generally 0, 2, 4, and 24 hours) and the samples were vortexed for 30 seconds. The samples were centrifuged at 13,200 rpm for 10 minutes at 4°C and the supernatant was passed through a 0.45 μm filter. The filtrate was diluted 10-fold in methanol. Samples were analyzed by LC/MS/MS to monitor the D-tryptophan formed (see description below).
LC/MS/MS method for detecting D- and L-tryptophan
[00110] LC/MS/MS screening was achieved by injecting samples from 96-well plates using a CTC PAL auto-sampler (LEAP Technologies, Carrboro, NC) into a 70/30 MeOH/H2O (0.25% AcOH) mixture provided by LC-10ADvp pumps (Shimadzu, Kyoto, Japan) at 0.8 mL/min through a Chirobiotic T column (4.6 x 250 mm) and into the API4000 TurboIon-Spray triple-quad mass spectrometer (Applied Biosystems, Foster City, CA). Ion spray and Multiple Reaction Monitoring (MRM) were performed for the analytes of interest in the positive ion mode and each analysis lasted 15.0 minutes. D- and L-tryptophan parent/daughter ions: 205.16/188.20.
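The parent/daughter ion pair in paragraph [00110] can be compared with values calculated from the elemental composition of tryptophan (C11H12N2O2). The sketch below assumes that the parent ion is the protonated molecule and that the daughter ion arises from loss of ammonia; that assignment is an interpretation offered here, not a statement from the disclosure. Monoisotopic atomic masses are standard values.

```python
# Monoisotopic atomic masses (Da) and the mass of a proton.
MONOISOTOPIC = {"C": 12.0, "H": 1.00782503, "N": 14.0030740, "O": 15.9949146}
PROTON = 1.00727646


def monoisotopic_mass(formula: dict[str, int]) -> float:
    """Sum monoisotopic masses for an elemental formula given as element counts."""
    return sum(MONOISOTOPIC[element] * count for element, count in formula.items())


TRYPTOPHAN = {"C": 11, "H": 12, "N": 2, "O": 2}
AMMONIA = {"N": 1, "H": 3}

# Assumed parent ion: [M+H]+ of tryptophan.
parent_mz = monoisotopic_mass(TRYPTOPHAN) + PROTON

# Assumed daughter ion: parent ion after neutral loss of NH3.
daughter_mz = parent_mz - monoisotopic_mass(AMMONIA)
```

The calculated values (about 205.10 and 188.07) are close to the reported 205.16/188.20 transition; small offsets of this size are plausible for instrument-tuned settings on a unit-resolution triple quadrupole. Because D- and L-tryptophan have identical masses, the two enantiomers are distinguished by the chiral column rather than by the transition.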
Example 5 — Racemase activity dependent upon conditions
[00111] SEQ ID NOs:401, 385, 395, 413, 419, 421, 425, 437, 427, 433, and 435 are racemase subclones that, when expressed (into polypeptides having the sequence of SEQ ID NOs:402, 386, 396, 414, 420, 422, 426, 438, 428, 434 and 436, respectively) were active under the conditions described in Part A. These subclones were not active under the conditions described in Part B (see Example 6 for details on SEQ ID NOs:385, 395, and 401; see Example 7 for details on SEQ ID NO:413; see Example 12 for details on SEQ ID NOs:419, 421, 425, 427, 433, 435, and 437).
[00112] The racemase subclones were in the pSE420-cHis vector / MB2946 host (Strych & Benedik, 2002, J. Bacteriology, 184:4321-5) with the exception of SEQ ID NO:413. SEQ ID NO:413 was in pSE420-cHis / TOP10 host (Invitrogen, Carlsbad, CA). The His-tag was not expressed in any of these subclones.
[00113] The subclones were grown, lysed and lyophilized according to the procedures described in Example 4. Samples were tested for activity using a racemase assay (as described in Example 4). Racemases were incubated with 10 mM L-tryptophan and 10 μM PLP at pH 8 and 37°C. All racemases were utilized at 10 mg/mL total protein concentration with the exception of the polypeptide having the sequence of SEQ ID NO:402. The polypeptide having the sequence of SEQ ID NO:402 was used at 5 mg/mL total protein concentration because there was not enough biomass to allow for a higher loading.
[00114] At indicated timepoints, 50 μL of the reaction product was added to 150 μL of ice cold acetonitrile. Samples were vortexed for 30 seconds and the supernatant was then diluted fifty-fold in methanol. Samples were then analyzed by LC/MS/MS (as described in Example 4) to monitor the D-tryptophan formed and the residual L-tryptophan remaining.
[00115] Tables 7, 8, 9 and 10 show the racemase activity over time. Samples that were assayed together are grouped together in a single table.
Figure imgf000044_0001
Table 8. Racemase activity assay
Figure imgf000044_0002
Table 9. Racemase activity assay
Figure imgf000045_0001
Table 10. Racemase activity assay
Figure imgf000045_0002
[00116] In summary, racemases having the sequence shown in SEQ ID NOs:402, 386, 396, 414, 420, 422, 426, 438, 434, and 436 were active on tryptophan under the conditions described in Part A. These samples were not active under the conditions described in Part B (see Examples 6, 7, and 12). The differences in observed racemase activity may be attributed to differences in host strains, expression conditions, post-expression cell handling and assay protein-loading. Refer to Example 3 for activity data for a racemase having SEQ ID NO:426. It is noted that the racemase having SEQ ID NO:428 is not included here because it did not reach an inducible OD600 and, therefore, was not induced.
[00117] It is expected that the presence of activity in a polypeptide encoded from a subcloned nucleic acid is predictive of the presence of activity in the corresponding polypeptide encoded from the full-length or wild type nucleic acid. See, for example, Table 11.
Table 11:
Figure imgf000045_0003
Part B
Example 6 — Analysis of racemases provided as pSE420 clones
[00118] SEQ ID NOs:385, 387, 389, 391, 393, 395, 397, 399, 401, 403, 405, 407, and 409 encoding racemases having SEQ ID NOs:386, 388, 390, 392, 394, 396, 398, 400, 402, 404, 406, 408 and 410, respectively, were provided as pSE420-cHis clones. One skilled in the art can synthesize the genes encoding these racemases using various published techniques, for example, as described in Stemmer et al., 1995, Gene, 164(1):49-53. The plasmids were transformed into E. coli XL-1 Blue (Novagen/EMD Biosciences, San Diego, CA) cells as per manufacturer instructions.
[00119] Transformants were grown overnight at 37°C and 250 rpm in 5 mL LB containing ampicillin (100 μg/mL). Overnight cultures were used to inoculate 25 mL of the same media in 250 mL baffled shake flasks. Cultures were grown at 30°C and 250 rpm until they reached an OD600 of 0.6, after which protein expression was induced with 1 mM IPTG for 4.25 h at 30°C. Samples for total protein were taken prior to induction and right before harvesting. Cells were harvested by centrifugation and frozen at -80°C.
[00120] Cell extracts were typically prepared from the above frozen pellets by adding 5 mL per g of cell pellet of Bugbuster Amine Free (Novagen/EMD Biosciences, San Diego, CA) with 5 μL/mL of Protease Inhibitor Cocktail II (Calbiochem, San Diego, CA) and 1 μL/mL of benzonase nuclease (Novagen/EMD Biosciences, San Diego, CA). Cell solutions were incubated at room temperature with gentle mixing for 15 min; cells were spun out at 14,000 rpm for 20 min (at 4°C) and the supernatant was carefully removed. Detergents and low molecular weight molecules were removed by passage through PD-10 columns (GE Healthcare, Piscataway, NJ) previously equilibrated with 100 mM potassium phosphate (pH 7.8) with 0.05 mM PLP. Proteins were eluted with 3.5 mL of the same buffer. Total protein concentration was determined using the Pierce BCA total protein assay with bovine serum albumin (BSA) as the standard, per the manufacturer's instructions (Pierce Biotechnology, Inc., Rockford, IL). The resulting cell-free extract was used for subsequent assays.
[00121] Racemase assays were performed on crude protein extracts as described in Example 17. For the tryptophan racemase assay, desalted protein was added to target 50 μg racemase protein for each enzyme assay. Calculations were based on Pierce BCA total protein analysis with BSA as the standard (Pierce Biotechnology, Inc., Rockford, IL) and SDS-PAGE visual inspection. Formation of D-tryptophan was measured at 2 hours and 21 hours. A cell-free extract of empty vector pSE420-cHis served as a negative control.
Table 12. D-tryptophan production
Figure imgf000047_0001
nd = not detected under the conditions of the assay as described above
[00122] It is noted that, when cell-free extracts were analyzed by SDS-PAGE, very low expression was observed. It was concluded, therefore, that the cell-free extracts likely contained significantly less protein than the purified positive control enzyme (wild-type A. caviae), prepared as described in Example 30 of WO 07/103389 and as described in Example 19.
[00123] Tryptophan racemase activity was detected for racemases having the amino acid sequence shown in SEQ ID NOs:400, 404, 406, 408 and 410 using the conditions described in Part B. Similar results were obtained for racemases having the polypeptide sequence shown in SEQ ID NOs:400, 404, 406, 408 and 410 using the reaction conditions described in Part A. In addition, detectable activity was observed for candidates having the polypeptide sequence shown in SEQ ID NOs:386, 396 and 402 using conditions described in Part A, but was not observed using conditions described in Part B (see, for example, Example 5). Detectable activity was not observed for the polypeptide shown in SEQ ID NO:394 under the conditions described in Part A, while very low activity (barely detectable at 21 hours) was observed for the racemase polypeptide shown in SEQ ID NO:394 under the conditions described in Part B.
[00124] Some constructs were observed, under the conditions described in Part A, to be unstable in expression systems, particularly those with a leader sequence. The host organisms, expression conditions, and post expression cell handling can all affect whether there is detectable tryptophan racemase activity under the conditions of the assay. Additionally, under optimized conditions, it is expected that all racemase candidates could have tryptophan racemase activity.
[00125] It is expected that the presence of activity in a polypeptide encoded from a subcloned nucleic acid is predictive of the presence of activity in the corresponding polypeptide encoded from the full-length or wild type nucleic acid, as indicated below in Table 13.
Table 13.
Figure imgf000048_0001
Example 7 — Characterization of Racemases Having the Sequences Shown in SEQ ID NO:414 and 412
[00126] Racemases having the polypeptide sequence of SEQ ID NO:412 and 414 were both found to be active when assayed for tryptophan racemase activity under the conditions described in Part A. One skilled in the art can synthesize the genes encoding these racemases using various published techniques, for example, as described in Stemmer et al., supra. It should be noted that 10 mg of total protein in the form of lyophilized cell extracts was used in Part A when evaluating racemase activity (see Example 4). In some cases, this was ten times as much total soluble protein as was used in the assays described in Part B. This difference in the amount of protein used in the assays (i.e., of Part A vs. Part B) may explain, at least in part, some of the differences in activity observed with the same polypeptide.
[00127] The nucleic acid having the sequence of SEQ ID NO:413, which encodes the racemase polypeptide having the sequence shown in SEQ ID NO:414, was expressed in 3 different hosts in Part A (MB2946, XL-1 Blue, and TOP10). High activity was observed in cell-free extract from the TOP10 host, with only a small amount of activity observed in XL-1 Blue and no detectable product formed from the MB2946 host under the conditions of the assay. The nucleic acid having the sequence of SEQ ID NO:411, which encodes the polypeptide having the sequence of SEQ ID NO:412, was expressed in the MB2946 host and found to be highly active.
[00128] The nucleic acids of SEQ ID NOs:411 and 413 were received as pSE420-cHis constructs, and were initially evaluated in E. coli TOP10. Strains were grown and induced, and cell extracts were prepared as described in Part B.
[00129] Tryptophan racemase assays were carried out using desalted cell-free extracts under the conditions described in Example 17.
[00130] 100 μg purified A. caviae D76N, prepared as described in Example 19, served as a positive control for the assay, and cell-free extract of E. coli host cells containing the empty vector pSE420-cHis served as a negative control. 1.4 mg of total protein was used for polypeptides having SEQ ID NO:412 and 414.
Table 14. Trp Assay results
Figure imgf000049_0001
nd = not detected under the conditions of the assay as described above; control was purified BAR from A. caviae D76N mutant (100 μg/ml); not much activity detected in crude extract of SEQ ID NO:414 polypeptides (1.4 mg/ml); some limited (very low) activity in extracts containing pSE420-cHis vector control; no band was observed in crude extracts from SEQ ID NO:412 polypeptides; however, good racemase activity was still detected.
[00131] Very little activity was detected in crude extract containing the polypeptide having the sequence shown in SEQ ID NO:414, or in the negative control. The polypeptide having the sequence shown in SEQ ID NO:412 gave high specific activity, given that only a barely detectable protein band was observed in the soluble fraction (comparing 100 μg of purified A. caviae BAR to an estimated less than 30 μg of SEQ ID NO:412 polypeptide, assuming it was 2% or less of the total protein).
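The loading comparison in paragraph [00131] rests on simple arithmetic, which can be sketched in Python as follows. The inputs (1.4 mg total protein, a 2% upper bound, 100 μg of purified control) come from the text; the function name and the treatment of 2% as an upper bound on the racemase fraction are illustrative assumptions.

```python
def estimated_enzyme_ug(total_protein_mg: float, fraction_of_total: float) -> float:
    """Estimate the mass (in micrograms) of a target protein in a crude extract.

    fraction_of_total is the assumed share of the target protein in the
    total protein (0.02 means 2%).
    """
    if not 0.0 <= fraction_of_total <= 1.0:
        raise ValueError("fraction_of_total must be between 0 and 1")
    return total_protein_mg * 1000.0 * fraction_of_total


# Values from the text: 1.4 mg total protein, racemase assumed to be 2% or less.
seq412_ug = estimated_enzyme_ug(1.4, 0.02)
control_ug = 100.0  # purified A. caviae D76N BAR used as the positive control

print(f"Estimated SEQ ID NO:412 loading: at most {seq412_ug:.0f} ug")
print(f"Control loading relative to that bound: {control_ug / seq412_ug:.1f}-fold")
```

Under these assumptions the estimate stays below the 30 μg figure quoted in the paragraph.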
[00132] The host organisms, expression conditions, and post expression cell handling can all affect whether there is detectable tryptophan racemase activity under the conditions of the assay. Additionally, under optimized conditions, it is expected that all racemase candidates could have tryptophan racemase activity.
[00133] It is expected that the presence of activity in a polypeptide encoded from a subcloned nucleic acid is predictive of the presence of activity in the corresponding polypeptide encoded from the full-length or wild type nucleic acid, as indicated in Table 15 below.
Table 15
Figure imgf000050_0001
Example 8 — Polypeptide having the sequence of SEQ ID NO:412 is more active on tryptophan than alanine
[00134] In order to get a more quantitative comparison of SEQ ID NO:412 to the benchmark BAR (A. caviae D76N from Example 19), SEQ ID NO:411 (encoding SEQ ID NO:412) was PCR-amplified with NcoI and XhoI restriction sites for subcloning into pET28 (Novagen / EMD Chemicals, San Diego, CA).
Table 16
Figure imgf000050_0002
[00135] The pET28 constructs were created with and without a C-terminal His tag (tagged constructs were created by using a reverse primer without a stop codon in the PCR). In addition, pET26b constructs were created with a C-terminal His tag. Constructs were sequenced for accuracy (Agencourt Bioscience Inc., Beverly MA) and used to transform BL21(DE3) (Novagen/EMD Biosciences, San Diego, CA).
[00136] Transformants were grown and induced in OvernightExpress™ media and cell-free extracts were prepared as described herein. Proteins were purified from tagged constructs on Novagen/EMD Biosciences His-bind columns (Novagen/EMD Biosciences, San Diego, CA) and desalted on PD-10 columns; for untagged constructs, cell-free extracts were desalted on PD-10 columns.
[00137] Protein concentrations were determined by Pierce BCA protein assay and racemase purity was determined by Experion Automated Gel System (Experion, version A.01.10, Biorad, Hercules, CA). Racemase assays were performed on purified and crude protein extracts as described in Example 17. Racemase expression from the pET26b construct was lower than from the pET28 construct; however, active protein having the sequence shown in SEQ ID NO:412 was obtained. Results for SEQ ID NO:411/pET28 (encoding the polypeptide having the sequence of SEQ ID NO:412) are shown in this example.
Table 17. D-trp production (μg/mL)
Figure imgf000051_0001
*Note: 30 μg of the purified protein having the sequence of SEQ ID NO:412 (encoded by SEQ ID NO:411/pET28) was used as compared to 100 μg of other enzyme preps; nd = not detected under the conditions of the assay as described above; not normalized for racemase loading
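Because the Table 17 values are not normalized for racemase loading, a reader comparing preparations may want to divide each D-tryptophan reading by the mass of enzyme added. The Python sketch below shows one way to do that; the function name is hypothetical, and the readings are invented placeholders, not values from Table 17. Only the 30 μg and 100 μg loadings come from the note above.

```python
def production_per_ug(d_trp_ug_per_ml: float, enzyme_ug: float) -> float:
    """Return D-tryptophan produced (ug/mL) per microgram of enzyme loaded."""
    if enzyme_ug <= 0:
        raise ValueError("enzyme_ug must be positive")
    return d_trp_ug_per_ml / enzyme_ug


# Placeholder readings (ug/mL) paired with the loadings stated in the note.
readings = {
    "SEQ ID NO:412, purified (30 ug)": (600.0, 30.0),
    "other enzyme prep (100 ug)": (1000.0, 100.0),
}

for name, (d_trp, loading) in readings.items():
    print(f"{name}: {production_per_ug(d_trp, loading):.1f} ug/mL per ug enzyme")
```

A simple per-microgram division assumes that product formation scales linearly with enzyme amount over the range compared, which the source does not establish.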
[00138] Purified protein having the sequence shown in SEQ ID NO:412 from construct in pET28 was further characterized for racemase activity on tryptophan, alanine, and monatin. Tryptophan, monatin, and alanine assays were performed as described in Example 17, with A. caviae D76N serving as positive control for racemization assays. The analytical methodology is described in Example 18. It should be noted that, at the time these analyses were performed, the analysis of D-alanine was less quantitative than the analysis for D-tryptophan.
Racemase activity of SEQ ID NO:412 for tryptophan and alanine
Table 18. D-Trp nmoles/μl/μg protein
Figure imgf000051_0002
Table 19. D-Ala nmoles/μl/μg protein
Time (min) A. caviae D76N SEQ ID NO:412 purified
0 44 nd
5 1320 nd
10 2239 23
20 2602 314
60 4654 1044
nd = not detected under the conditions of the assay as described above
[00139] SEQ ID NO:412 consistently gave higher D-trp activity than the benchmark racemase, BAR A. caviae D76N. SEQ ID NO:412 appears to have a higher preference for tryptophan than for alanine as a substrate for racemization. In contrast, A. caviae D76N BAR, while active on tryptophan, has a preference for alanine as a substrate. The ability of purified SEQ ID NO:412 to racemize 7 additional L-amino acids was evaluated and the details are reported in Example 10.
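The alanine preference of the benchmark enzyme can be read directly from Table 19 by taking the ratio of the two columns at each time point where both enzymes gave a detectable signal. The Python sketch below does this with the values transcribed from Table 19; the dictionary layout and function name are illustrative choices. Since the D-alanine analysis was described as less quantitative, the ratios should be treated as indicative only.

```python
# D-Ala values transcribed from Table 19 (nmoles/ul/ug protein).
# Each entry is (A. caviae D76N, SEQ ID NO:412 purified); None marks "nd".
TABLE_19 = {
    0: (44, None),
    5: (1320, None),
    10: (2239, 23),
    20: (2602, 314),
    60: (4654, 1044),
}


def fold_difference(table):
    """Return {time_min: BAR / SEQ412 ratio} where both values were detected."""
    return {
        minutes: bar / seq412
        for minutes, (bar, seq412) in table.items()
        if bar is not None and seq412 is not None
    }


for minutes, ratio in fold_difference(TABLE_19).items():
    print(f"{minutes:>3} min: A. caviae D76N / SEQ ID NO:412 = {ratio:.1f}")
```

At every time point with data for both enzymes, the benchmark shows the larger D-alanine value, in line with the preference described in the text.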
[00140] In addition, the impact of alanine on tryptophan racemase activity was investigated. An experiment was designed to determine the impact of L-alanine on the racemization of L-tryptophan by either BAR A. caviae D76N or racemase polypeptide having the sequence of SEQ ID NO:412. Racemase enzymes were assayed in the presence of tryptophan and alanine together to further characterize substrate preference/competition. The assay was carried out as described in Example 17, with 30 mM of each substrate (L-Trp and L-Ala) in the reaction. For both racemase enzymes, control racemase assays were conducted in the presence of L-tryptophan alone. The data from these control assays at various time points were considered to be 100% when compared with the respective data from assays with both amino acids.
Competition of L-Ala and L-Trp in racemase assay
Table 20. % D-trp formed (100% without L-Ala assumed)
Figure imgf000052_0001
[00141] Despite some initial inhibition of tryptophan racemization between zero and five minutes, L-alanine had little to no impact on the polypeptide having the sequence of SEQ ID NO:412. The SEQ ID NO:412 polypeptide retained 96%-100% of its tryptophan racemase activity from 20 minutes to the end of the assay at three hours. In contrast, BAR A. caviae D76N retained only 37%-55% of its tryptophan racemase activity in the presence of L-alanine during the same time period. Thus, the preference of the racemase having SEQ ID NO:412 for tryptophan as a substrate is advantageous in the presence of competing substrates like alanine.
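The percentages in Table 20 follow from the normalization described in paragraph [00140]: at each time point the D-tryptophan formed with both substrates present is expressed relative to the tryptophan-only control, which is set to 100%. A minimal Python sketch of that calculation is given below; the function name is hypothetical and the readings are invented placeholders rather than measured data.

```python
def percent_of_control(with_ala: float, trp_only: float) -> float:
    """Express a D-trp reading with L-Ala present as a percentage of the
    L-Trp-only control taken at the same time point."""
    if trp_only <= 0:
        raise ValueError("control reading must be positive")
    return 100.0 * with_ala / trp_only


# Placeholder readings (arbitrary units) at a single time point.
examples = {
    "enzyme largely unaffected by L-Ala": (48.0, 50.0),
    "enzyme inhibited by L-Ala": (37.0, 100.0),
}

for label, (with_ala, trp_only) in examples.items():
    print(f"{label}: {percent_of_control(with_ala, trp_only):.0f}% of control")
```

Normalizing each enzyme to its own control removes differences in absolute activity, so the comparison reflects only the effect of the competing substrate.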
Example 9 — Racemases lacking monatin racemization activity.
Figure imgf000053_0001
[00142] Neither SEQ ID NO:412 nor the benchmark A. caviae BAR showed detectable racemization of R,R monatin under the conditions of the assay as described in Example 17.
Table 22. Racemase Substrate Specificity
Figure imgf000053_0002
Figure imgf000054_0001
- indicates no detectable racemization under the conditions of the assays after a minimum of 24 hours; * indicates enzymes that were re-cloned in pET30a with a C-terminal His tag for purification and more quantitative assays
Example 10 — The polypeptide having the sequence of SEQ ID NO:412 is a broad specificity amino acid racemase
[00143] The ability of purified SEQ ID NO:412 polypeptide to racemize 7 additional L-amino acids was evaluated. The amino acid racemase assay was carried out as described in Example 17, with 30 mM of each L-amino acid substrate and approximately 1 μg of purified racemase polypeptide SEQ ID NO:412 (expressed from SEQ ID NO:411/pET28/BL21(DE3) induction) added for each amino acid substrate assayed.
Table 23. Additional substrates for SEQ ID NO:412
Figure imgf000054_0002
[00144] The polypeptide having the sequence of SEQ ID NO:412 appears to be an amino acid racemase with broad substrate specificity and seems to prefer bulky, hydrophobic amino acids.
[00145] Racemase activity for various amino acids as substrates was observed as follows, under the conditions of the assay as described: [Leucine / Phenylalanine / Tryptophan / Methionine] > [Tyrosine / Alanine] > [Lysine / Aspartic Acid] > Glutamate.
[00146] It should be noted that the analytical methods for detection of all of the above D-amino acids, with the exception of tryptophan, are semi-quantitative, so these results indicate only a trend in racemase activity.
Example 11 — Methods to improve solubility of an insoluble protein and its activity on tryptophan
[00147] The polypeptide having the sequence of SEQ ID NO:412 showed lower solubility than other racemase candidates described in this application, under the expression conditions tested. The insoluble fraction of the SEQ ID NO:412 polypeptide was tested for racemization activity on tryptophan.
[00148] Cell-free extracts of SEQ ID NO:411/pET28 were prepared from frozen cell pellets by adding 5 ml of Bugbuster Amine Free (Novagen/EMD Biosciences, San Diego, CA) with 5 μL/mL of Protease Inhibitor Cocktail II (Calbiochem, San Diego, CA) and 1 μl/ml of benzonase nuclease (Novagen/EMD Biosciences, San Diego, CA), per g of cell pellet. Cell pellet suspensions were incubated at room temperature with gentle mixing for 15 min; cell pellets were spun out at 14000 rpm for 20 min (at 4°C) and retained for assays.
[00149] Cell pellets containing insoluble SEQ ID NO:412 polypeptide were washed multiple times in phosphate buffered saline to remove traces of supernatant containing the soluble SEQ ID NO:412 protein fraction. Washed pellets were used in qualitative tryptophan assays (the amount of protein in the assay was not quantitated; rather, a set volume of pellet resuspended in phosphate buffer was added to the assay). The experiment was performed twice, once with pellets that were washed four times, and the second time with frozen pellets that were thawed and washed an additional six times. Tryptophan racemization assays were performed on the insoluble protein suspension as described in Example 17.
Table 24. Insoluble protein assays for polypeptides having the sequence of SEQ ID NO:412 - active pellets
A. D-trp production (μg/mL) in pellets washed 10X and after freeze thaw
Figure imgf000055_0001
nd = not detected under the conditions of the assay as described above
B. D-trp production (μg/mL) in pellets washed 4X with phosphate buffer
Figure imgf000056_0001
nd = not detected under the conditions of the assay as described above
[00150] SDS-PAGE analysis of cell pellets/insoluble protein fraction from the Bugbuster protocol above showed a predominant protein band at the expected size (56.3 kD) for the racemase having the sequence of SEQ ID NO:412. The insoluble SEQ ID NO:412 protein fraction was observed to have tryptophan racemase activity. D-tryptophan production in the case of 20 μl samples was comparable between the two trials. The variation observed in the case of the 2 μl samples could be attributed to the small volume and sample nature (insoluble protein suspension).
[00151] Preliminary investigations indicated that the polypeptide having the sequence of SEQ ID NO:412 is not a membrane-associated protein, which might have been a possibility given the lack of solubility but the presence of activity.
Experiments to improve solubility of the SEQ ID NO:412 polypeptide
[00152] Various host systems reported to improve soluble expression of heterologous proteins were investigated in an effort to improve soluble expression of the SEQ ID NO:412 polypeptide: E. coli KRX (Promega, Madison, WI), CopyCutter™ EPI400™ (Epicentre Biotechnologies, Madison, WI), ArcticExpress™ (Stratagene, La Jolla, CA), E. coli HMS174 (Novagen/EMD Biosciences, San Diego, CA), and E. coli EE2D.
A. Induction in ArcticExpress™
[00153] Competent cells of ArcticExpress™ (DE3) were transformed with SEQ ID NO:411/pET28 and SEQ ID NO:411/pET26b as per manufacturer's protocol (Stratagene, La Jolla, CA).
[00154] Transformants were grown in LB containing kanamycin (50 mg/L) and gentamycin (20 mg/L) overnight at 37°C and 250 rpm. A 2% inoculum was transferred to 50 mL of Novagen OvernightExpress™ AutoinductionSystem 2 (EMD Biosciences/Novagen catalog #71366) containing solutions 1-6 and supplemented with kanamycin and gentamycin. Flasks were grown for 1.5 days at 15°C and 250 rpm. Cells were harvested and cell extracts prepared as described herein. SDS-PAGE analysis of total and soluble protein was conducted.
[00155] No improvement was seen in solubility in the ArcticExpress™ strain. However, the chaperonin proteins that should be overexpressed in this strain were not observed (expected sizes of 10 kDa and 60 kDa) on the SDS-PAGE gel. The experiment was repeated with fresh competent cells and induction over 3 days, but SDS-PAGE results were identical.
[00156] When the ArcticExpress™ experiments were repeated with the SEQ ID NO:411/pET28 construct using the methods of Part A, the data showed an improvement in soluble protein expression (see Example 1).
B. Induction in E. coli CopyCutter™
[00157] CopyCutter™ EPI400™ cells were transformed with SEQ ID NO:411/pET28 as per manufacturer instructions (Epicentre Biotechnologies, Madison, WI). Liquid cultures of transformants were grown overnight (LB kanamycin 50, 37°C, 250 rpm) and used to inoculate shake flasks containing 25 mL LB media, kanamycin (50 mg/L) and 1X CopyCutter™ induction solution. Cultures were grown at 30°C and 250 rpm for 5 hours. Cultures were harvested and cell extracts were prepared as described herein. SDS-PAGE analysis of total and soluble protein was conducted.
C. Induction in HMS174 and EE2D DE3
[00158] E. coli HMS174 (Novagen/EMD Biosciences, San Diego, CA) and E. coli BW30384(DE3) -ompT-metE ("E. coli EE2D") competent cells were transformed with SEQ ID NO:411/pET28 nucleic acid. (Construction of the E. coli BW30384(DE3) -ompT-metE expression host and the transformation protocol are described in WO 2006/066072.) Liquid cultures of transformants were grown overnight (LB kanamycin 50, 37°C, 250 rpm) and used to inoculate 50 mL flasks of Novagen OvernightExpress AutoinductionSystem 2 (EMD Biosciences/Novagen catalog #71366) containing solutions 1-6 and 50 mg/L kanamycin (25 mL in each flask). Cultures were grown at 30°C and 250 rpm to an OD600nm > 10. Cultures were harvested and cell extracts were prepared as described herein. SDS-PAGE analysis of total and soluble protein was conducted.
[00159] In all cases described above, no significant increase in soluble expression of the SEQ ID NO:412 polypeptide was observed based on SDS-PAGE analyses. In addition, the nucleic acid encoding the SEQ ID NO:412 polypeptide (i.e., SEQ ID NO:411) was subcloned into a derivative of the pET23d vector (Novagen, Madison, WI) containing the E. coli metE gene and promoter inserted at the NgoMIV restriction site and a second PsiI restriction site that was added for facile removal of the beta-lactamase gene (bla). The construction of this vector is described in WO 2006/066072. This construct was transformed into the E. coli B834 DE3 host system (Novagen/EMD Biosciences, San Diego, CA), without significant increase in soluble expression.
[00160] Since the nucleic acid encoding SEQ ID NO:412, with its native leader sequence, could not be successfully cloned and propagated under the conditions described in Part A, an N-terminal alanine residue was added in place of the native leader sequence of SEQ ID NO:412. It was determined that deletion of this additional alanine residue had no impact on soluble expression, based on SDS-PAGE analysis.
[00161] The presence of DTT was shown to minimize protein precipitation during purification of selected histidine-tagged D-aminotransferase candidates. The addition of 5 mM DTT during the Bugbuster solubilization and subsequent purification of histidine-tagged SEQ ID NO:412 from induction of SEQ ID NO:411/pET28 in BL21 DE3 did not impact soluble expression as observed on SDS-PAGE.
[00162] One skilled in the art could employ various methods reported in the literature to improve soluble expression of the protein.
Example 12 — Analysis of racemases provided as pSE420-cHis clones
[00163] The nucleic acids encoding SEQ ID NO:412, 416, 418, 420, 422, 424, 426, 428, 430, 432, 434, 436, 438 and 440 racemases were provided as pSE420-cHis clones. One skilled in the art could synthesize the genes encoding these racemases using various published techniques, for example, as described in Stemmer et al., supra. The plasmids were transformed into TOP10 chemically competent cells (Invitrogen, Carlsbad, CA). Overnight cultures grown in LB carbenicillin (100 μg/ml) were diluted a hundred-fold in 50 ml LB carbenicillin (100 μg/ml) in a 250 ml baffled flask. Cultures were grown at 30°C with agitation at 250 rpm until they reached an OD600 of 0.5 to 0.8, after which protein expression was induced with 1 mM IPTG for 4 h at 30°C. Samples for total protein were taken prior to induction and right before harvesting. Cells were harvested by centrifugation. Cells were frozen at -80°C.
[00164] Cell extracts were typically prepared from the above frozen pellets by adding 5 ml per g of cell pellet of Bugbuster Amine Free (Novagen/EMD Biosciences, San Diego, CA) with 5 μL/mL of Protease Inhibitor Cocktail II (Calbiochem, San Diego, CA) and 1 μl/ml of benzonase nuclease (Novagen/EMD Biosciences, San Diego, CA). Cell solutions were incubated at room temperature with gentle mixing for 15 min; cells were spun out at 14000 rpm for 20 min (at 4°C) and the supernatant was carefully removed. Detergents and low molecular weight molecules were removed by passage through PD-10 columns (GE Healthcare, Piscataway, NJ) previously equilibrated with 100 mM potassium phosphate (pH 7.8) with 0.05 mM PLP. Proteins were eluted with 3.5 mL of the same buffer. Total protein concentration was determined using the Pierce BCA total protein assay with bovine serum albumin (BSA) as the standard, per the manufacturer's instructions (Pierce Biotechnology, Inc., Rockford, IL). The resulting cell-free extract was used for subsequent assays.
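The extraction recipe in paragraph [00164] scales with pellet mass, so the reagent volumes for a given pellet can be worked out mechanically. The Python sketch below is one reading of the recipe: it assumes that the per-millilitre additive amounts (5 μL/mL protease inhibitor, 1 μL/mL benzonase) are relative to the volume of Bugbuster reagent. The class and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class LysisMix:
    bugbuster_ml: float
    protease_inhibitor_ul: float
    benzonase_ul: float


def lysis_mix(pellet_g: float) -> LysisMix:
    """Scale the extraction reagents to the wet mass of a cell pellet."""
    if pellet_g <= 0:
        raise ValueError("pellet_g must be positive")
    bugbuster_ml = 5.0 * pellet_g  # 5 ml reagent per g of pellet
    return LysisMix(
        bugbuster_ml=bugbuster_ml,
        protease_inhibitor_ul=5.0 * bugbuster_ml,  # assumed: 5 uL per mL reagent
        benzonase_ul=1.0 * bugbuster_ml,           # assumed: 1 uL per mL reagent
    )


mix = lysis_mix(2.0)  # example: a 2 g pellet
print(mix)
```

The same scaling applies wherever the recipe is repeated in this document, since the stated ratios are identical.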
[00165] For the tryptophan racemase assay, a total of 650 μg of desalted protein was added for each enzyme based on Pierce BCA total protein analysis with BSA as the standard (Pierce Biotechnology, Inc., Rockford, IL). Formation of D-tryptophan was measured at 30 minutes, 2 hours, 4 hours and 24 hours. pSE420-cHis cell-free extract of the SEQ ID NO:412 polypeptide served as a positive control for the assay, and cell-free extract of empty vector pSE420-cHis served as a negative control.
Table 25. D-trp production, (pSE420-cHis constructs) D-trp production, μg/mL
Figure imgf000059_0001
nd = not detected under the conditions of the assay as described above; * = sample was not tested
[00166] Racemase polypeptides having the sequence shown in SEQ ID NO:420, 422, 426, 428, 430, 432, 434, 436, and 438 showed no detectable tryptophan racemase activity after 24 hours under the conditions tested. (Under the conditions described in Part A, good activity was observed for polypeptides having the sequence of SEQ ID NO:420, 422, 426, and 438; very slight activity was detected for polypeptides having the sequence of SEQ ID NO:428, 434, and 436; and no activity was detected for the polypeptide having the sequence of SEQ ID NO:440).
[00167] Racemase polypeptides having the sequence of SEQ ID NO:416, 418, 424 and 440 showed appreciable tryptophan activity in this assay. These were PCR amplified with and without C-terminal His tags for subcloning into pET30a. The oligonucleotides used for amplification are shown in Table 26.
Table 26. Oligonucleotide primers
Figure imgf000060_0001
[00168] Tagged and untagged constructs were sequenced for accuracy (Agencourt Bioscience Inc., Beverly MA) and transformed into BL21 DE3; transformants were grown and induced in Novagen OvernightExpress AutoinductionSystem 2 (EMD Biosciences/Novagen catalog #71366) containing solutions 1-6 with the appropriate antibiotic selection, and cell-free extracts were prepared as described herein. Racemase candidate proteins were purified from tagged constructs and desalted on PD-10 columns. Untagged racemase candidate cell-free extracts were desalted on PD-10 columns. Protein concentrations were determined by Pierce BCA protein assay (Pierce Biotechnology, Inc., Rockford, IL) and racemase purity was estimated by Experion Automated Gel System (Experion, version A.01.10, Biorad, Hercules, CA).
[00169] Racemase assays were performed on purified and crude protein extracts as described in Example 17. Purified protein having the sequence shown in SEQ ID NO:412 served as a positive control. For the assay, 5 μg of equivalent BAR protein was added for the positive control, and an estimated 50 μg equivalent BAR protein was added for each of the other enzymes based on Pierce BCA total protein analysis and racemase purity estimation by Experion Automated Gel System (Experion, version A.01.10, Biorad, Hercules, CA).
Table 27. D-trp production, μg/mL
Figure imgf000061_0001
nd = not detected under the conditions of the assay as described above; 4 candidates that appeared to have higher activity than the SEQ ID NO:412 polypeptide - the SEQ ID NO:416, 418, 424, and 440 polypeptides not replicated in pET30 with BL21 DE3 host; all experiments conducted with purified protein (approx 50 μg BAR); previously shown in Part A that there are differences in activity when the same construct is in different host backgrounds.
[00170] Extracts of polypeptides having the sequence shown in SEQ ID NO:416, 418, 424 and 440 expressed in pSE420-cHis/TOP10 exhibited tryptophan racemase activity, while extracts from the same clones in pET30/BL21 DE3 exhibited little or no tryptophan racemase activity. Polypeptides having the sequence shown in SEQ ID NO:424 and 440 showed no detectable tryptophan racemase activity in purified or crude cell extracts when cloned into pET30 and expressed in BL21 DE3, under the conditions tested. Polypeptides having the sequence shown in SEQ ID NO:416 and 418 showed tryptophan racemase activity for both purified and crude extracts.
[00171] Since variations in racemase activity were observed with polypeptides having the sequence shown in SEQ ID NO:416, 418, 424 and 440 in different vector and host backgrounds, the reproducibility in the original pSE420-cHis vector was investigated. [It is noted that the SEQ ID NO:424 racemase candidate could not be revived from glycerol stocks.] The racemase assay under the conditions described earlier in this Example was repeated using 1 mg total protein (from pSE420-cHis/TOP10 cell-free extracts) of polypeptides having the amino acid sequence shown in SEQ ID NO:416, 418 and 440. The 3 clones showed severely diminished racemase activity. Comparison of the racemase activity for polypeptides having SEQ ID NO:416, 418 and 440 shows that inconsistent results were obtained despite using the same vector/host background. Conditions described under Part A resulted in similar observations of clone/construct instability for a few of the racemase candidates.
Table 28. D-trp production, μg/mL
Figure imgf000062_0001
nd = not detected under the conditions of the assay as described above
[00172] The host organisms, expression conditions, and post expression cell handling can all affect whether there is detectable tryptophan racemase activity under the conditions of the assay. Additionally, under optimized conditions, it is expected that all racemase candidates could have tryptophan racemase activity.
[00173] Racemase candidates were grouped by amino acid sequence homology, with clusters having 95% or greater homology at amino acid level to a reference sequence. One or more representatives was/were chosen from each group for characterization of tryptophan racemase activity under the conditions described in Part B.
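The grouping step described in paragraph [00173] can be illustrated with a short Python sketch. It assumes that sequences have already been aligned to equal length (gaps written as "-") and that identity is the fraction of alignment columns with matching residues; the excerpt does not state which alignment program or identity definition was used, so both are assumptions. The sequences and names below are invented toy data, not sequences from this application.

```python
def percent_identity(a: str, b: str) -> float:
    """Percent of alignment columns with identical residues.

    Both inputs must be pre-aligned to the same length; gap characters
    ("-") never count as matches.
    """
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and of equal aligned length")
    matches = sum(1 for x, y in zip(a, b) if x == y and x != "-")
    return 100.0 * matches / len(a)


def cluster_members(reference: str, candidates: dict, threshold: float = 95.0) -> list:
    """Return names of candidates at or above the identity threshold."""
    return [
        name
        for name, seq in candidates.items()
        if percent_identity(reference, seq) >= threshold
    ]


# Toy 20-residue sequences (hypothetical).
reference = "MKTAYIAKQRQISFVKSHFS"
candidates = {
    "cand_a": "MKTAYIAKQRQISFVKSHFT",  # 1 mismatch in 20 columns
    "cand_b": "MKTGYIAKQRQLSFVKAHFS",  # 3 mismatches in 20 columns
}

print(cluster_members(reference, candidates))
```

A representative of each cluster would then be assayed, with the result extended to the other members on the stated expectation that closely related sequences behave similarly.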
[00174] Using SEQ ID NO:110 as the reference sequence, the following racemase candidates had 97% or greater identity at amino acid level to the above reference sequence: SEQ ID NO:136, 174, 138, and 296. SEQ ID NO:416 is a non-leadered version of the reference SEQ ID NO:110 sequence. Under the conditions described in Part B (see, for example, Example 17), tryptophan racemase activity was detected for the non-leadered version (SEQ ID NO:416) of the reference candidate, SEQ ID NO:110. Thus, it would be expected that other racemase candidates with 97% or greater sequence identity at the amino acid level would also have tryptophan racemase activity.
[00175] Using SEQ ID NO:116 as the reference sequence, the following racemase candidates had 97% or greater identity at amino acid level to the above reference sequence: SEQ ID NOs:150, 192, 152, 118, 194, 154, 196, 158, and 160. SEQ ID NO:420 is a non-leadered version of the reference SEQ ID NO:116 sequence. SEQ ID NO:422 is a non-leadered version of the reference SEQ ID NO:118 sequence. Under the conditions described in Part B (e.g., Example 17), tryptophan racemase activity was not detected for polypeptides having the amino acid sequence shown in SEQ ID NO:420 and 422, which are the non-leadered versions of SEQ ID NO:116 and 118, respectively. However, activity was observed for these polypeptides under the assay conditions described in Part A. The host organisms, expression conditions, and post-expression cell handling can all affect whether there is detectable tryptophan racemase activity under the conditions of the assay. Additionally, under optimized conditions or as shown in the assay conditions described in Part A, it is expected that all of the above racemase candidates could have tryptophan racemase activity.
[00176] It is expected that the presence of activity in a polypeptide encoded from a subcloned nucleic acid is predictive of the presence of activity in the corresponding polypeptide encoded from the full-length or wild type nucleic acid as indicated in Table 29.
Table 29
Figure imgf000063_0001
Example 13 — Analysis of racemases provided as pSE420-cHis clones
[00177] Nucleic acids having the sequence shown in SEQ ID NO:441, 443, 445, 447, 449, 451, and 453 (encoding racemase polypeptides having the sequence shown in SEQ ID NO:442, 444, 446, 448, 450, 452, and 454) were provided as pSE420-cHis clones. One skilled in the art can synthesize the genes encoding these racemases using various published techniques, for example, as described in Stemmer et al., supra. The plasmids were transformed into TOP10 chemically competent cells (Invitrogen, Carlsbad, CA). Overnight cultures growing in LB carbenicillin (100 μg/ml) were diluted 100x in 50 ml LB carbenicillin in a 250 ml baffled flask. Cultures were grown at 30°C and 250 rpm until they reached an OD600 of 0.5 to 0.8, after which protein expression was induced with 1 mM IPTG for 4 h at 30°C. Samples for total protein were taken prior to induction and right before harvesting. Cells were harvested by centrifugation. Cells were frozen at -80°C.
[00178] Cell extracts were typically prepared from the above frozen pellets by adding 5 ml per g of cell pellet of Bugbuster Amine Free (Novagen/EMD Biosciences, San Diego, CA) with 5 μL/mL of Protease Inhibitor Cocktail II (Calbiochem, San Diego, CA) and 1 μl/ml of benzonase nuclease (Novagen/EMD Biosciences, San Diego, CA). Cell solutions were incubated at room temperature with gentle mixing for 15 min; cells were spun out at 14000 rpm for 20 min (at 4°C) and the supernatant was carefully removed. Detergents and low molecular weight molecules were removed by passage through PD-10 columns (GE Healthcare, Piscataway, NJ) previously equilibrated with 100 mM potassium phosphate (pH 7.8) with 0.05 mM PLP. Proteins were eluted with 3.5 mL of the same buffer. Total protein concentration was determined using the Pierce BCA protein assay (Pierce Biotechnology, Inc., Rockford, IL) with bovine serum albumin (BSA) as the standard, per the manufacturer's instructions. The resulting cell-free extract was used for subsequent assays.
[00179] Tryptophan racemase assays were carried out under the conditions described in Example 17. For the tryptophan racemization assay, a total of 1 mg of soluble protein (based on Pierce BCA total protein analysis with BSA as the standard) was added for each racemase candidate and positive control. Cell-free extract of the polypeptide having the sequence shown in SEQ ID NO:412 (pSE420/TOP10 construct) served as the positive control for the assay, and cell-free extract of TOP10 (Invitrogen, Carlsbad, CA) containing vector pSE420-cHis served as a negative control. Total protein concentration was determined using the Pierce BCA protein assay (Pierce Biotechnology, Inc., Rockford, IL) with bovine serum albumin (BSA) as the standard, per the manufacturer's instructions. Formation of D-tryptophan was measured at 30 minutes, 2 hours and 4 hours as described in Example 18.
Table 30. D-trp production D-trp production, μg/mL
Figure imgf000064_0001
nd = not detected under the conditions of the assay as described above
[00180] All of the racemase candidate extracts tested above, polypeptides having the sequence shown in SEQ ID NO:442, 444, 446, 448, 450, 452 and 454, had detectable tryptophan racemase activity under the conditions described above. In addition, tryptophan racemase activity was detected for the positive control, the SEQ ID NO:412 polypeptide extract, and there was no detectable activity in the case of the pSE420-cHis vector control extracts. It is expected that the homologs of the representative racemase candidates having 95% or greater homology at amino acid level (see Table 31) will also have tryptophan racemase activity.
Table 31.
Figure imgf000065_0001
nd, not detected under the conditions of the assay as described above
[00181] Racemase candidates described in this example were grouped by amino acid sequence homology with clusters having 95% or greater homology at amino acid level to a reference sequence. One or more representatives were chosen from each group for characterization of tryptophan racemase activity using the conditions described in Part B. Using SEQ ID NO:244 as the reference sequence, the following racemase candidates had 97% or greater identity at amino acid level to the above reference sequence: SEQ ID NO:248, 236, 246, 252, 250, and 254. SEQ ID NO:448 is a non-leadered version of the reference SEQ ID NO:244 sequence. Under the conditions described in Part B (e.g., Example 17), tryptophan racemase activity was detected for the non-leadered version (SEQ ID NO:448) of the reference candidate, SEQ ID NO:244; as well as the non-leadered version (SEQ ID NO:450) of the candidate, SEQ ID NO:248. Thus, it would be expected that other racemase candidates with 97% or greater sequence identity at the amino acid level would also have tryptophan racemase activity.
[00182] Using SEQ ID NO:288 as a reference sequence, the following racemase candidates had 97% or greater identity at amino acid level to the above reference sequence: SEQ ID NO:274, 234, 220, 222, 226, 232, 240, 242, 258, 260, 264, 266, 286, 290, 170, and 216. SEQ ID NO:454 is a non-leadered version of the reference SEQ ID NO:288 sequence; SEQ ID NO:452 is a non-leadered version of SEQ ID NO:274 sequence; and SEQ ID NO:446 is a non-leadered version of SEQ ID NO:234 sequence. Under the conditions of the assay as described in Example 17, tryptophan racemase activity was detected for the non-leadered version (SEQ ID NO:454) of the reference candidate, SEQ ID NO:288; as well as the non-leadered versions (SEQ ID NO:452 and 446) of racemase candidates SEQ ID NO:274 and 234, respectively. Thus, it would be expected that other racemase candidates listed above with 97% or greater sequence identity at the amino acid level would also have tryptophan racemase activity.
[00183] Using SEQ ID NO:218 as a reference sequence, the following racemase candidates had 97% or greater identity at amino acid level to the above reference sequence: SEQ ID NO:208, 210, 228, 230, 270, 272, 278, 280, 282, 284, 292, 198, 212, 214, and 114. SEQ ID NO:204 had 96% identity with the SEQ ID NO:218 reference sequence. SEQ ID NO:444 is a non-leadered version of the reference SEQ ID NO:218 sequence. Under the conditions described in Part B (e.g., Example 17), tryptophan racemase activity was detected for the non-leadered version (SEQ ID NO:444) of the reference candidate, SEQ ID NO:218. Thus it would be expected that other racemase candidates with 97% or greater sequence identity at the amino acid level would also have tryptophan racemase activity.
[00184] SEQ ID NO:436 is a non-leadered version of SEQ ID NO:114 sequence. Under the conditions of the assays described in Part B, tryptophan racemase activity was not detected for the non-leadered version (SEQ ID NO:436) of the racemase candidate SEQ ID NO:114, as shown in Example 12.
The SEQ ID NO:441 nucleic acid (encoding the polypeptide having the sequence of SEQ ID NO:442) was subcloned into pET30a with a C-terminal His tag
[00185] A D56N mutant (corresponding to D76N mutation in A. caviae) was created in SEQ ID NO:442. Mutagenesis was done using the QuickChange-Multi site-directed mutagenesis kit (Stratagene, La Jolla, CA), using the C-tagged SEQ ID NO:442 gene in pET30a as template. The following mutagenic primer was used to make the D56N change as described in Example 19: 5'-CGCCATCATGAAGGCGAACGCCTACGGTCACG-3' (SEQ ID NO:516).
[00186] The site-directed mutagenesis was done as described in the manufacturer's protocol. The resulting mutation was detrimental to tryptophan racemase activity in this candidate, whereas, in A. caviae, the corresponding D76N mutation has a positive effect.
[00187] It is expected that the presence of activity in a polypeptide encoded from a subcloned nucleic acid is predictive of the presence of activity in the corresponding polypeptide encoded from the full-length or wild type nucleic acid as indicated in Table 32 below.
Table 32
Figure imgf000066_0001
Example 14 — Analysis of racemases provided as pSE420-cHis clones
[00188] Nucleic acids having the sequence shown in SEQ ID NO:455, 457, 459, 461, 463, 465, 467, 469, 471, 473, 475, and 477 (encoding polypeptides having the sequence shown in SEQ ID NO:456, 458, 460, 462, 464, 466, 468, 470, 472, 474, 476, and 478) were provided as pSE420-cHis clones. One skilled in the art can synthesize the genes encoding these racemases using various published techniques, for example, as described in Stemmer et al., supra. The plasmids were transformed into TOP10 chemically competent cells (Invitrogen, Carlsbad, CA). Overnight cultures growing in LB carbenicillin (100 μg/ml) were diluted 100x in 50 ml LB carbenicillin (100 μg/ml) in a 250 ml baffled flask. Cultures were grown at 30°C at 250 rpm until they reached an OD600 of 0.5 to 0.8, after which protein expression was induced with 1 mM IPTG for 4 h at 30°C. Samples for total protein were taken prior to induction and right before harvesting. Cells were harvested by centrifugation and frozen at -80°C.
[00189] Cell extracts were typically prepared from the above frozen pellets by adding 5 ml per g of cell pellet of Bugbuster Amine Free (Novagen/EMD Biosciences, San Diego, CA) with 5 μL/mL of Protease Inhibitor Cocktail II (Calbiochem, San Diego, CA) and 1 μl/ml of benzonase nuclease (Novagen/EMD Biosciences, San Diego, CA). Cell solutions were incubated at room temperature with gentle mixing for 15 min; cells were spun out at 14,000 rpm for 20 min (at 4°C) and the supernatant was carefully removed. Detergents and low molecular weight molecules were removed by passage through PD-10 columns (GE Healthcare, Piscataway, NJ) previously equilibrated with 100 mM potassium phosphate (pH 7.8) with 0.05 mM PLP. Proteins were eluted with 3.5 mL of the same buffer. Total protein concentration was determined using the Pierce BCA (Pierce Biotechnology, Inc., Rockford, IL) protein assay with bovine serum albumin (BSA) as the standard, per the manufacturer's instructions. The resulting cell-free extract was used for subsequent assays.
[00190] Desalted cell-free extracts of racemase polypeptides having the sequence of SEQ
ID NO:456, 458, 460, 462, 464, 466, 468, 470, 472, 474, 476, 478 were prepared as described in other examples.
[00191] Tryptophan racemase assays were carried out under the conditions described in Example 17. For the tryptophan racemization assay a total of 800 μg of soluble protein was added for each racemase candidate and positive controls. pSE420-cHis/TOP10 cell-free extracts of racemase polypeptides having the sequence shown in SEQ ID NO:412 and 442 served as positive controls for the assay, and cell-free extract of TOP10 (Invitrogen, Carlsbad, CA) containing vector pSE420-cHis served as a negative control. Total protein concentration was determined using the Pierce BCA (Pierce Biotechnology, Inc., Rockford, IL) protein assay with bovine serum albumin (BSA) as the standard, per the manufacturer's instructions. Formation of D-tryptophan was measured at 30 minutes, 2 hours and 4 hours as described in Example 18.
Table 33. D-trp production (μg/mL)
Figure imgf000068_0001
nd = not detected under the conditions of the assay as described.
[00192] Racemase polypeptides having the sequence shown in SEQ ID NO:460, 474 and
476 showed tryptophan racemase activity. Racemase polypeptides having the sequence shown in SEQ ID NO:456, 458, 462, 464, 466, 468, 470, 472 and 478 showed no detectable tryptophan racemase activity after 4 hours under the conditions tested. In a follow up experiment, a 24-hour sample was evaluated for D-tryptophan production. None of the racemases listed above showed detectable tryptophan racemase activity at 24 hours under the conditions described above. Of the candidates for which no activity was observed, racemase polypeptides having the sequence shown in SEQ ID NO:456, 458, 462, 464, 466, 468, 470, 472 and 478 exhibited poor or questionable soluble protein expression. The host organisms, expression conditions, and post expression cell handling can all affect whether there is detectable tryptophan racemase activity under the conditions of the assay. Additionally, under optimized conditions, it is expected that all racemase candidates could have tryptophan racemase activity.
[00193] Racemase candidates were grouped by amino acid sequence homology, with clusters having 95% or greater homology at amino acid level to a reference sequence. One or more representatives were chosen from each group for characterization of tryptophan racemase activity using the conditions described in Part B.
[00194] Using SEQ ID NO:108 as the reference sequence, the following racemase candidates had 96% or greater identity at amino acid level to the above reference sequence: SEQ ID NO:172, 178, 180, 182, 184, 140, 144, 188, 190, 112, 148, 156, 120 and 162. SEQ ID NO:474 is a non-leadered version of the reference SEQ ID NO:108 sequence. Under the conditions described in Part B (e.g., Example 17), tryptophan racemase activity was detected for the non-leadered version (SEQ ID NO:474) of the reference candidate, SEQ ID NO:108, as well as the non-leadered version (SEQ ID NO:460) of SEQ ID NO:120, which is 97% identical with the reference candidate, SEQ ID NO:108. Additionally, the non-leadered version (SEQ ID NO:418) of SEQ ID NO:112 was shown to have detectable tryptophan racemase activity as seen in Example 12. Thus it would be expected that the other racemase candidates listed above, with 96% or greater sequence identity at the amino acid level, would also have tryptophan racemase activity.
[00195] It is expected that the presence of activity in a polypeptide encoded from a subcloned nucleic acid is predictive of the presence of activity in the corresponding polypeptide encoded from the full-length or wild type nucleic acid.
Table 34
Figure imgf000069_0001
Example 15 — Analysis of racemase candidates provided as PCR products
First group
[00196] Racemase nucleic acid sequences SEQ ID NO:313, 325, 341, 343, 317, 329, 327, 345, 333, and 351 were provided as PCR products with NdeI and NotI restriction sites at the 5' and 3' ends, respectively. The PCR fragments were cloned into pCR-Blunt II-Topo (Invitrogen, Carlsbad, CA) as recommended by the manufacturer. The sequence was verified by sequencing (Agencourt, Beverly, MA) and an insert with the correct sequence was then released from the vector using NdeI and NotI restriction enzymes and ligated into the NdeI and NotI restriction sites of pET30a. One skilled in the art can synthesize the genes encoding these racemases using various published techniques, for example, as described in Stemmer et al., supra.
[00197] The pET30a constructs of all racemase candidates listed above were transformed into the expression host BL21 DE3. Liquid cultures were grown overnight in LB medium (BD Diagnostics, Franklin Lakes, NJ) containing 50 μg/ml kanamycin at 37°C with agitation at 250 rpm. These overnight cultures were used to inoculate shake flasks containing 50 mL Overnight Express™ media (Solutions 1-6, Novagen/EMD Biosciences, San Diego, CA) containing 50 μg/ml kanamycin. Overnight Express™ cultures were grown at 30°C with agitation at 250 rpm for approximately 20 hours, and cells were harvested by centrifugation when OD600 reached ~6-10.
[00198] Cell extracts were typically prepared from the above frozen pellets by adding 5 ml per g of cell pellet of Bugbuster Amine Free (Novagen/EMD Biosciences, San Diego, CA) with 5 μL/mL of Protease Inhibitor Cocktail II (Calbiochem, San Diego, CA) and 1 μl/ml of benzonase nuclease (Novagen/EMD Biosciences, San Diego, CA). Cell solutions were incubated at room temperature with gentle mixing for 15 min; cells were spun out at 14,000 rpm for 20 min (at 4°C) and the supernatant was carefully removed. Detergents and low molecular weight molecules were removed by passage through PD-10 columns (GE Healthcare, Piscataway, NJ) previously equilibrated with 100 mM potassium phosphate (pH 7.8) with 0.05 mM PLP. Proteins were eluted with 3.5 mL of the same buffer. Total protein concentration was determined using the Pierce BCA protein assay with bovine serum albumin (BSA) as the standard, per the manufacturer's instructions (Pierce Biotechnology, Inc., Rockford, IL). The resulting cell-free extract was used for subsequent assays.
[00199] Desalted cell-free extracts were evaluated using tryptophan racemase assays under the conditions described in Example 17, with purified SEQ ID NO:442 polypeptides serving as a positive control. For the tryptophan racemase assay, a total of 10 μg and 100 μg BAR-equivalent SEQ ID NO:442 racemase (based on Pierce BCA total protein analysis with BSA as the standard and estimation of percentage of BAR protein expressed from Experion (Experion, version A.01.10, Biorad, Hercules, CA)) were used as positive controls. 1 mg of total protein was added for each racemase candidate being tested (based on Pierce BCA total protein analysis with BSA as the standard). Formation of D-tryptophan was measured at 30 minutes, 1 hour, 2 hours, and 4 hours as described in Example 18. In a follow up experiment, a 24-hour sample was evaluated for D-tryptophan production.
[00200] None of the racemases listed above showed detectable tryptophan racemase activity at 24 hours under the conditions described herein. Tryptophan racemase activity was seen for the positive control, SEQ ID NO:442. The host organisms, expression conditions, and post expression cell handling can all affect whether there is detectable tryptophan racemase activity under the conditions of the assay. Additionally, under optimized conditions, it is expected that all racemase candidates could have tryptophan racemase activity.
Second group
[00201] Racemase nucleic acids SEQ ID NO:321, 323 and 347 were provided as PCR products with NdeI and NotI restriction sites at the 5' and 3' ends, respectively. However, all of these sequences had additional NdeI and/or NotI sites internal to the gene sequence, so direct subcloning was not possible. SEQ ID NO:349 was re-amplified by PCR with RTth polymerase (Applied Biosystems, Foster City, CA) and primers adding an NdeI and XhoI restriction site at the 5' and 3' ends, respectively.
Table 35
Figure imgf000071_0001
[00202] The PCR fragment was digested with NdeI and XhoI restriction enzymes and ligated into the NdeI and XhoI restriction sites of pET30a. Correct plasmids were verified by digestion with NdeI and XhoI and sequencing (Agencourt, Beverly, MA). One skilled in the art can synthesize the genes encoding these racemases using various published techniques, for example, as described in Stemmer et al., supra.
[00203] The pET30a clones of all of the above racemases were transformed into expression host BL21 DE3. Liquid cultures were grown overnight (LB kanamycin 50, 37°C, 250 rpm) and used to inoculate shake flasks containing 50 mL Overnight Express™ media (Solutions 1-6, Novagen/EMD Biosciences, San Diego, CA) containing kanamycin. Overnight Express™ cultures were grown at 30°C and 250 rpm for approximately 20 hours, and collected when the OD600 reached ~6-10. Cells were harvested by centrifugation.
[00204] Desalted cell-free extracts of racemase polypeptides SEQ ID NO:322, 324, and 348 were prepared as described above.
[00205] Tryptophan racemase assays were carried out under the conditions described in Example 17, with purified A. caviae D76N BAR (see Example 19) serving as a positive control. For the tryptophan racemase assay, a total of 50 μg BAR equivalent of positive control (based on Pierce BCA total protein analysis with BSA as the standard and estimation of percentage of BAR protein expressed from Experion (Experion, version A.01.10, Biorad, Hercules, CA)) was added. 1 mg of total protein was added for each racemase candidate being tested (based on Pierce BCA total protein analysis with BSA as the standard). Formation of D-tryptophan was measured at 1 hour, 2 hours, 4 hours, and 21.5 hours as described in Example 18.
Table 36. D-trp production (μg/mL)
nd = not detected under the conditions of the assay as described above.
[00206] Tryptophan racemase activity was observed for polypeptides having the sequence shown in SEQ ID NO:322. This enzyme is interesting because it is the smallest racemase protein that was active on tryptophan, with the protein being only 232 amino acids (as compared to 409 amino acids for the A. caviae benchmark, and >300 amino acids for most of the other racemase candidates).
[00207] There was no detectable tryptophan racemase activity observed for polypeptides having the sequence shown in SEQ ID NO:324 and 348 under the conditions tested. SDS-PAGE analysis showed good soluble protein expression for the SEQ ID NO:348 polypeptide, but minimal soluble protein expression for the SEQ ID NO:324 polypeptide. The host organisms, expression conditions, and post expression cell handling can all affect whether there is detectable tryptophan racemase activity under the conditions of the assay. Additionally, under optimized conditions, it is expected that all racemase candidates could have tryptophan racemase activity.
Third group
[00208] Racemase nucleic acids having the sequence of SEQ ID NO:339 and 349 (encoding polypeptides having the sequence of SEQ ID NO:340 and 350, respectively) were provided as PCR products with NdeI and NotI restriction sites at the 5' and 3' ends, respectively. However, all of these sequences had additional NdeI and/or NotI sites internal to the gene sequence, so direct subcloning was not carried out. The nucleic acid having the sequence of SEQ ID NO:350 was re-amplified by PCR with RTth polymerase (Applied Biosystems, Foster City, CA) and primers adding an NdeI and XhoI restriction site at the 5' and 3' ends, respectively.
Table 37
Figure imgf000073_0001
[00209] The PCR fragment was digested with NdeI and XhoI restriction enzymes and ligated into the NdeI and XhoI restriction sites of pET30a. Correct plasmids were verified by digestion with NdeI and XhoI and sequencing (Agencourt, Beverly, MA). One skilled in the art can synthesize the genes encoding these racemases using various published techniques, for example, as described in Stemmer et al., supra.
[00210] The pET30a constructs of all racemase candidates listed above were transformed into the expression host BL21 DE3. Liquid cultures were grown overnight in LB medium (BD Diagnostics, Franklin Lakes, NJ) containing 50 μg/ml kanamycin at 37°C with agitation at 250 rpm. These overnight cultures were used to inoculate shake flasks containing 50 mL Overnight Express™ media (Solutions 1-6, Novagen/EMD Biosciences, San Diego, CA) containing 50 μg/ml kanamycin. Overnight Express™ cultures were grown at 30°C, with agitation at 250 rpm for approximately 20 hours, and cells were harvested by centrifugation when the OD600nm reached between 6 and 10.
[00211] Desalted cell-free extracts of racemase polypeptides having the sequence of SEQ
ID NO:340 and 350 were prepared as described above.
[00212] Tryptophan racemase assays were carried out under the conditions described in Example 17, with the polypeptide having the sequence of SEQ ID NO:412 serving as a positive control.
[00213] For the tryptophan racemase assay, a total of approximately 5 μg BAR equivalent of control (based on Pierce BCA total protein analysis with BSA as the standard and estimation of percentage of BAR protein expression level from Experion (Experion, version A.01.10, Biorad, Hercules, CA)) was added, and 1 mg of total cell-free protein extract was added for each racemase candidate being tested (based on Pierce BCA total protein analysis with BSA as the standard). Formation of D-tryptophan was measured at 15 minutes, 2 hours, and 21 hours as described in Example 18.
[00214] No tryptophan racemization was detected for polypeptides having the sequence of SEQ ID NO:340 or 350 under the conditions tested. Positive control polypeptides having the sequence of SEQ ID NO:412 showed tryptophan racemase activity. SDS-PAGE analysis showed low soluble protein expression for SEQ ID NO:340 and 350 polypeptides. The host organisms, expression conditions, and post expression cell handling can all affect whether there is detectable tryptophan racemase activity under the conditions of the assay. Additionally, under optimized conditions, it is expected that all racemase candidates could have tryptophan racemase activity.
Example 16 — Analysis of racemases provided as PCR-4-Blunt TOPO clones
[00215] Racemase nucleic acids having the sequence shown in SEQ ID NO:335, 337, 357, 359, 361, and 365 were provided as PCR-4-Blunt TOPO clones. Racemases in these plasmids were amplified with RTth polymerase (Applied Biosystems, Foster City, CA) and primers adding an NdeI and XhoI restriction site at the 5' and 3' ends, respectively.
Table 38
Figure imgf000074_0001
Figure imgf000075_0001
*Same forward and reverse primer pair was used for SEQ ID NO:357, 359, 361, and 365 due to 100% DNA homology in primer regions.
[00216] The PCR fragments were cloned into pCR-Blunt II-Topo (Invitrogen, Carlsbad, CA) as recommended by the manufacturer. The sequence was verified by sequencing (Agencourt, Beverly, MA) and an insert with the correct sequence was then released from the vector using NdeI and XhoI restriction enzymes (New England Biolabs, Ipswich, MA) and ligated into the NdeI and XhoI restriction sites of pET30a. See Table above for specific primers and plasmid names. (It is noted that the TOPO cloning efforts for the SEQ ID NO:355 nucleic acid were unsuccessful after multiple attempts, so this racemase was not further processed.) One skilled in the art can synthesize the genes encoding these racemases using various published techniques, for example, as described in Stemmer et al., supra.
[00217] The pET30a constructs of all racemase candidates listed above were transformed into the expression host BL21 DE3. Liquid cultures were grown overnight in LB medium (BD Diagnostics, Franklin Lakes, NJ) containing 50 μg/ml kanamycin at 37°C with agitation at 250 rpm. These overnight cultures were used to inoculate shake flasks containing 50 mL Overnight Express™ media (Solutions 1-6, Novagen/EMD Biosciences, San Diego, CA) containing 50 μg/ml kanamycin. Overnight Express™ cultures were grown at 30°C with agitation at 250 rpm for approximately 20 hours, and cells were harvested by centrifugation when OD600 reached ~6-10.
[00218] Desalted cell-free extracts of racemase polypeptides having SEQ ID NO:336,
338, 358, 360, 362, and 366 were prepared as described below (polypeptides having SEQ ID NO:356 and 364 from this experiment were not further characterized).
[00219] Cell extracts were typically prepared from the above frozen pellets by adding 5 ml per g of cell pellet of Bugbuster Amine Free (Novagen/EMD Biosciences, San Diego, CA) with 5 μL/mL of Protease Inhibitor Cocktail II (Calbiochem, San Diego, CA) and 1 μl/ml of benzonase nuclease (Novagen/EMD Biosciences, San Diego, CA). Cell solutions were incubated at room temperature with gentle mixing for 15 min; cells were spun out at 14,000 rpm for 20 min (at 4°C) and the supernatant was carefully removed. Detergents and low molecular weight molecules were removed by passage through PD-10 columns (GE Healthcare, Piscataway, NJ) previously equilibrated with 100 mM potassium phosphate (pH 7.8) with 0.05 mM PLP. Proteins were eluted with 3.5 mL of the same buffer. Total protein concentration was determined using the Pierce BCA (Pierce Biotechnology, Inc., Rockford, IL) protein assay with bovine serum albumin (BSA) as the standard, per the manufacturer's instructions. The resulting cell-free extract was used for subsequent assays.
[00220] Tryptophan racemase assays were carried out under the conditions described in Example 17, with purified A. caviae D76N BAR (Example 19) serving as a positive control. A total of 100 μg BAR equivalent of control was added (based on Pierce BCA total protein analysis with BSA as the standard and estimation of percentage of BAR protein expressed from Experion, version A.01.10, Biorad, Hercules, CA), and 1 mg of total protein was added for each racemase candidate being tested. Formation of D-tryptophan was measured at 30 minutes, 2 hours, 4 hours and 52 hours as described in Example 18.
Table 39. D-tryptophan production (μg/mL)
Figure imgf000076_0001
nd= not detected under the conditions of the assay as described above.
[00221] Racemase polypeptides having SEQ ID NO:336, 338 and 358 were active. Racemase polypeptides having SEQ ID NO:366, 360, and 362 showed no detectable tryptophan racemase activity under the conditions tested. Polypeptides having SEQ ID NO:366, 360, and 362 all had satisfactory soluble protein expression. The host organisms, expression conditions, and post expression cell handling can all affect whether there is detectable tryptophan racemase activity under the conditions of the assay. Additionally, under optimized conditions, it is expected that all racemase candidates could have tryptophan racemase activity.
Example 17 — Description of racemase assay conditions
Leucine, Phenylalanine, Tryptophan, Methionine, Tyrosine, Alanine, Lysine, Aspartic Acid, Glutamate Racemase Assay
[00222] Racemase assays were performed starting with the L-amino acid isomer, and the formation of the corresponding D-amino acid was followed.
[00223] Assay conditions:
[00224] 30 mM L-amino acid (L-Leucine, L-Phenylalanine, L-Tryptophan, L-Methionine, L-Tyrosine, L-Alanine, L-Lysine, L-Aspartic Acid, or L-Glutamate), 50 mM potassium phosphate buffer (pH 8.0), 0.05 mM PLP, and water were added to make the volume up to 1 mL.
[00225] The assays were conducted at 30°C with shaking at 225 rpm. Desalted racemase candidate proteins (cell-free extracts or purified preparations) were evaluated for amino acid racemase activity. Wherever possible, appropriate negative and positive controls were included for the assays. Sample aliquots were taken for analysis at various timepoints and formic acid was added to a final concentration of 2% to stop the reaction. Samples were frozen at -80°C, then thawed, centrifuged and filtered through a 0.2 μm filter (Pall Life Sciences, Ann Arbor, MI). Samples were analyzed for D-amino acid using the chiral LC/MS/MS method described in Example 18.
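As a worked illustration of the assay composition above, the following Python sketch computes the mass of L-amino acid substrate needed to reach 30 mM in a 1 mL reaction. The function name and the molecular weight table are illustrative assumptions and are not part of the described protocol; the molecular weights are standard values for the free amino acids, and only three of the nine substrates are shown.

```python
# Molecular weights (g/mol) of the free L-amino acids. These are standard
# reference values supplied for illustration, not values from the text.
MOLECULAR_WEIGHT = {
    "L-tryptophan": 204.23,
    "L-leucine": 131.17,
    "L-alanine": 89.09,
}


def substrate_mass_mg(amino_acid, conc_mM=30.0, volume_mL=1.0):
    """Return the mass (mg) of substrate giving conc_mM in volume_mL."""
    micromoles = conc_mM * volume_mL  # mM x mL = umol
    micrograms = micromoles * MOLECULAR_WEIGHT[amino_acid]  # umol x g/mol = ug
    return micrograms / 1000.0  # ug -> mg


for name in MOLECULAR_WEIGHT:
    print(f"{name}: {substrate_mass_mg(name):.2f} mg per 1 mL assay")
```

The defaults mirror the stated 30 mM substrate concentration and 1 mL final volume; other volumes can be passed explicitly when an assay is scaled.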
Monatin Racemase Assay
[00226] A subset of racemase candidates that gave promising tryptophan racemase results was tested for monatin racemization.
[00227] Assay conditions:
[00228] 10 mM R,R monatin, 50 mM potassium phosphate buffer (pH 8.0), 0.05 mM PLP, and water were added to make the volume up to 1 mL.
[00229] The assays were performed at 30°C with shaking at 225 rpm. At various time points, sample aliquots were taken, diluted five-fold with distilled water, then filtered through a 0.2 μm filter (Pall Life Sciences, Ann Arbor, MI) and stored at -80°C for subsequent analysis. Samples were analyzed for the distribution of monatin stereoisomers as described in Example 18.
Example 18 — Detection of monatin stereoisomers and chiral detection of lysine, alanine, methionine, tyrosine, leucine, phenylalanine, tryptophan, glutamate, and aspartate
[00230] This example describes methods used to detect the presence of stereoisomers of monatin, lysine, alanine, methionine, tyrosine, leucine, phenylalanine, tryptophan, glutamate, and aspartate. It also describes a method for the separation and detection of the four stereoisomers of monatin.
Chiral LC/MS/MS ("MRM") Measurement of Monatin
[00231] Determination of the stereoisomer distribution of monatin in in vitro reactions was accomplished by derivatization with 1-fluoro-2,4-dinitrophenyl-5-L-alanine amide ("FDAA"), followed by reversed-phase LC/MS/MS MRM measurement.
Derivatization of Monatin with FDAA
[00232] To 50 μL of sample or standard and 10 μL of internal standard was added 100 μL of a 1% solution of FDAA in acetone. Twenty μL of 1.0 M sodium bicarbonate was added, and the mixture was incubated for 1 h at 40°C with occasional mixing. The sample was removed and cooled, and neutralized with 20 μL of 2.0 M HCl (more HCl may be required to effect neutralization of a buffered biological mixture). After degassing was complete, samples were ready for analysis by LC/MS/MS.
LC/MS/MS Multiple Reaction Monitoring for the Determination of the Stereoisomer Distribution of Monatin
[00233] Analyses were performed using the LC/MS/MS instrumentation described above. LC separations capable of separating all four stereoisomers of monatin (specifically FDAA-monatin) were performed on a Phenomenex Luna 2.0 x 250 mm (3 μm) C18(2) reversed phase chromatography column at 40°C. The LC mobile phase consisted of A) water containing 0.05% (mass/volume) ammonium acetate and B) acetonitrile. The elution was isocratic at 13% B, 0-2 min, linear from 13% B to 30% B, 2-15 min, linear from 30% B to 80% B, 15-16 min, isocratic at 80% B, 16-21 min, and linear from 80% B to 13% B, 21-22 min, with an 8 min re-equilibration period between runs. The flow rate was 0.23 mL/min, and PDA absorbance was monitored from 200 nm to 400 nm. All parameters of the ESI-MS were optimized and selected based on generation of deprotonated molecular ions ([M - H]-) of FDAA-monatin, and production of characteristic fragment ions.
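The gradient program described above can be written as a set of time and composition breakpoints. The following Python sketch is an illustration only; the names GRADIENT and percent_b are hypothetical, and it assumes that the composition is held at 13% B during the 8 min re-equilibration period.

```python
# Breakpoints (time in min, % mobile phase B) as stated in the text.
GRADIENT = [(0, 13), (2, 13), (15, 30), (16, 80), (21, 80), (22, 13)]
RE_EQUILIBRATION_MIN = 8  # assumed to be held at the starting composition


def percent_b(t_min):
    """Return % B at time t_min by linear interpolation between breakpoints."""
    if t_min < 0:
        raise ValueError("time must be non-negative")
    end_time = GRADIENT[-1][0]
    if t_min >= end_time:
        if t_min > end_time + RE_EQUILIBRATION_MIN:
            raise ValueError("time is beyond the end of the run")
        return float(GRADIENT[-1][1])
    # Walk consecutive breakpoint pairs and interpolate within the match.
    for (t0, b0), (t1, b1) in zip(GRADIENT, GRADIENT[1:]):
        if t0 <= t_min < t1:
            return b0 + (b1 - b0) * (t_min - t0) / (t1 - t0)


for t in (1, 8.5, 15.5, 18, 21.5, 25):
    print(f"{t:>5} min: {percent_b(t):.1f}% B")
```

Isocratic segments are represented as two breakpoints with the same composition, so one interpolation rule covers both the holds and the linear ramps.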
[00234] The following instrumental parameters were used for LC/MS analysis of monatin in the negative ion ESI/MS mode: Capillary: 3.0 kV; Cone: 40 V; Hex 1: 15 V; Aperture: 0.1 V; Hex 2: 0.1 V; Source temperature: 120°C; Desolvation temperature: 350°C; Desolvation gas: 662 L/h; Cone gas: 42 L/h; Low mass resolution (Q1): 14.0; High mass resolution (Q1): 15.0; Ion energy: 0.5; Entrance: 0 V; Collision Energy: 20; Exit: 0 V; Low mass resolution (Q2): 15; High mass resolution (Q2): 14; Ion energy (Q2): 2.0; Multiplier: 650. Three FDAA-monatin-specific parent-to-daughter transitions are used to specifically detect FDAA-monatin in in vitro and in vivo reactions. The transitions monitored for monatin are 542.97 to 267.94, 542.97 to 499.07, and 542.97 to 525.04. The monatin internal standard derivative mass transition monitored was 548.2 to 530.2. Identification of FDAA-monatin stereoisomers is based on chromatographic retention time as compared to purified synthetic monatin stereoisomers, and mass spectral data. An internal standard was used to monitor the progress of the reaction and for confirmation of retention time of the S,S stereoisomer.
Detection of L- and D-Amino Acids by LC/MS/MS
[00235] Samples containing a mixture of L- and D-amino acids such as lysine, alanine, methionine, tyrosine, leucine, phenylalanine, tryptophan, glutamate, and aspartate from biochemical reaction experiments were first treated with formic acid to denature protein. The sample was then centrifuged and filtered through a 0.2 μm nylon syringe filter prior to LC/MS/MS analysis. Identification of L- and D-amino acids was based on retention time and mass selective detection. LC separation was accomplished by using Waters 2690 liquid chromatography system and an ASTEC 2.1 mm x 250 mm Chirobiotic TAG chromatography column with column temperature set at 45°C. LC mobile phase A and B were 0.25% acetic acid and 0.25% acetic acid in methanol, respectively. Isocratic elution was used for all methods to separate the L and D isomers. Lysine was eluted using 80% mobile phase A, and 20% B and a flow rate of 0.25 mL/min. Glutamate, alanine, and methionine were separated with elution of 60% mobile phase A and 40% B and a flow rate of 0.25 mL/min. Aspartate, tryptophan, tyrosine, leucine, and phenylalanine were separated isomerically with 30% mobile phase A and 70% B with a flow rate of 0.3 mL/min for aspartate and tryptophan, and 0.25 mL/min for tyrosine, leucine, and phenylalanine.
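The isocratic conditions listed above for each amino acid can be collected into a simple lookup table. The sketch below is illustrative; the names IsocraticMethod, METHODS and method_for are hypothetical and do not correspond to any instrument software.

```python
from typing import NamedTuple


class IsocraticMethod(NamedTuple):
    percent_a: int      # mobile phase A: 0.25% acetic acid
    percent_b: int      # mobile phase B: 0.25% acetic acid in methanol
    flow_ml_min: float  # flow rate in mL/min


# Conditions transcribed from the paragraph above.
METHODS = {
    "lysine": IsocraticMethod(80, 20, 0.25),
    "glutamate": IsocraticMethod(60, 40, 0.25),
    "alanine": IsocraticMethod(60, 40, 0.25),
    "methionine": IsocraticMethod(60, 40, 0.25),
    "aspartate": IsocraticMethod(30, 70, 0.3),
    "tryptophan": IsocraticMethod(30, 70, 0.3),
    "tyrosine": IsocraticMethod(30, 70, 0.25),
    "leucine": IsocraticMethod(30, 70, 0.25),
    "phenylalanine": IsocraticMethod(30, 70, 0.25),
}


def method_for(amino_acid):
    """Look up the isocratic conditions for one amino acid (case-insensitive)."""
    return METHODS[amino_acid.lower()]


print(method_for("Tryptophan"))
```

Storing the conditions in a single table makes it straightforward to check that the two mobile phase percentages sum to 100 for every entry.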
[00236] The detection system for analysis of L- and D-amino acids included a Waters 996 Photo-Diode Array (PDA) detector and a Micromass Quattro Ultima triple quadrupole mass spectrometer. The PDA, scanning from 195 to 350 nm, was placed in series between the chromatography system and the mass spectrometer. Parameters for the Micromass Quattro Ultima triple quadrupole mass spectrometer operating in positive electrospray ionization mode (+ESI) were set as the following: Capillary: 3.2 kV; Cone: 20 V; Hex 1: 12 V; Aperture: 0.1 V; Hex 2: 0.2 V; Source temperature: 120°C; Desolvation temperature: 350°C; Desolvation gas: 641 L/h; Cone gas: 39 L/h; Low mass Q1 resolution: 16.0; High mass Q1 resolution: 16.0; Ion energy 1: 0.1; Entrance: -5; Collision: 20; Exit 1: 10; Low mass Q2 resolution: 16.0; High mass Q2 resolution: 16.0; Ion energy 2: 1.0; Multiplier: 650 V. MS/MS experiments with Multiple Reaction Monitoring (MRM) mode were set up to selectively monitor reaction transitions of 147.8 to 84.03, 147.8 to 56.3, and 147.8 to 102.2 for glutamate, 133.85 to 74.03, 133.85 to 69.94 and 133.85 to 87.99 for aspartate, 146.89 to 84.09, 146.89 to 55.97 and 146.89 to 67.23 for lysine, 149.80 to 56.1, 149.8 to 61.01, and 149.80 to 104.15 for methionine, 181.95 to 135.97, 181.95 to 90.88 and 181.95 to 118.87 for tyrosine, 131.81 to 86.04 and 131.81 to 69.31 for leucine, 90.0 to 44.3 for alanine, and 165.83 to 102.96, 165.83 to 93.27 and 165.83 to 120.06 for phenylalanine. In the case where numerous transitions are listed, the first transition listed was used for quantification. For tryptophan, MS/MS experiments with Multiple Reaction Monitoring (MRM) mode were set up to selectively monitor reaction transitions of 205.0 to 145.91, 205.0 to 117.92, and 205.0 to 188.05, and the transition from 212.0 to 150.98 for d8-DL tryptophan. Tryptophan quantification was achieved by determining the ratio of analyte response of transition 205.0 to 145.91 to that of the internal standard, d8-D,L tryptophan.
Production of Monatin for Standards and for Assays
[00237] A racemic mixture of R,R and S,S monatin was synthetically produced as described in U.S. Patent No. 5,128,482.
[00238] The R,R and S,S monatin were separated by a derivatization and hydrolysis step. Briefly, the monatin racemic mixture was esterified, the free amino group was blocked with Cbz, a lactone was formed, and the S,S lactone was selectively hydrolyzed using an immobilized protease enzyme. The monatin can also be separated as described in Bassoli et al., 2005, Eur. J. Org. Chem., 8: 1652-1658.
Example 19 — Cloning and Analysis of Broad Activity Racemase (BAR) from Aeromonas caviae ATCC 14486
[00239] This example describes the cloning of the A. caviae BAR and a D76N mutant that were used as positive controls in some of the Examples.
[00240] Since tryptophan racemase activity was detected in crude extracts from
Aeromonas caviae ATCC 14486, degenerate primers were designed (based on conserved regions of known BAR homologs) to obtain the BAR gene from Aeromonas caviae ATCC 14486.
Degenerate primer sequences are shown below:
[00241] Aer deg F2: 5'-GCCAGCAACGARGARGCMCGCGT-3' (SEQ ID NO:541); and
[00242] Aer deg R1: 5'-TGGCCSTKGATCAGCACA-3' (SEQ ID NO:542)
[00243] wherein K indicates G or T, R indicates A or G, S indicates C or G, and M indicates A or C.
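The degeneracy of the two primers above can be illustrated with a short sketch. It assumes only the ambiguity codes defined in the preceding paragraph (K, R, S, M); the function name `expand_degenerate` is hypothetical and is not part of the described method.

```python
from itertools import product

# Ambiguity codes as defined in the text: K = G/T, R = A/G, S = C/G, M = A/C.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "K": "GT", "R": "AG", "S": "CG", "M": "AC",
}

def expand_degenerate(primer: str) -> list:
    """Return every concrete sequence encoded by a degenerate primer."""
    choices = [IUPAC[base] for base in primer.upper()]
    return ["".join(combo) for combo in product(*choices)]

AER_DEG_F2 = "GCCAGCAACGARGARGCMCGCGT"  # forward degenerate primer
AER_DEG_R1 = "TGGCCSTKGATCAGCACA"       # reverse degenerate primer

forward_variants = expand_degenerate(AER_DEG_F2)
reverse_variants = expand_degenerate(AER_DEG_R1)
print(len(forward_variants), len(reverse_variants))
```

The forward primer carries three two-fold degenerate positions and the reverse primer two, so each primer stock is a small pool of closely related oligonucleotides rather than a single sequence.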
[00244] The above primers were used to PCR amplify a 715 bp DNA fragment from A. caviae (ATCC Accession No. 14486) genomic DNA. The following PCR protocol was used: A 50 μL reaction contained 0.5 μL template (~100 ng of A. caviae genomic DNA), 1.6 μM of each primer, 0.3 mM each dNTP, 10 U rTth Polymerase XL (Applied Biosystems, Foster City, CA), 1X XL buffer, 1 mM Mg(OAc)2 and 2.5 μL dimethyl sulfoxide. The thermocycler program used included a hot start at 94°C for 3 minutes and 30 repetitions of the following steps: 94°C for 30 seconds, 53°C for 30 seconds, and 68°C for 2 minutes. After the 30 repetitions, the sample was maintained at 68°C for 7 minutes and then stored at 4°C. This PCR protocol produced a product of 715 bp.
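As a rough check on the protocol above, the following sketch adds up the programmed hold times and converts the primer concentration into an amount per reaction. It is illustrative arithmetic only; ramp times between temperatures are ignored, and the helper names are hypothetical.

```python
# Cycling program from the protocol above, in seconds.
HOT_START_S = 3 * 60               # 94 C hot start
CYCLE_STEPS_S = [30, 30, 2 * 60]   # 94 C, 53 C, 68 C
CYCLES = 30
FINAL_EXTENSION_S = 7 * 60         # 68 C hold after cycling

def program_minutes() -> float:
    """Total programmed hold time in minutes (ramping not included)."""
    total_s = HOT_START_S + CYCLES * sum(CYCLE_STEPS_S) + FINAL_EXTENSION_S
    return total_s / 60

def picomoles(conc_uM: float, volume_uL: float) -> float:
    """Amount of reagent: micromolar x microlitres = picomoles."""
    return conc_uM * volume_uL

print(program_minutes())      # programmed time, excluding the 4 C storage step
print(picomoles(1.6, 50.0))   # each primer in a 50 uL reaction
```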
[00245] The PCR product was gel purified from 0.8% TAE-agarose gel using the Qiagen gel extraction kit (Qiagen, Valencia, CA). The product was TOPO cloned and transformed into TOP10 cells according to manufacturer's protocol (Invitrogen, Carlsbad, CA). The plasmid DNA was purified from the resulting transformants using the Qiagen spin miniprep kit (Qiagen, Valencia, CA) and screened for the correct inserts by restriction digest with EcoRI. The sequences of plasmids appearing to have the correct insert were verified by dideoxy chain termination DNA sequencing with universal M13 forward primers.
[00246] Four libraries were constructed for each strain as per manufacturer's protocols (BD GenomeWalker™ Universal Kit, Clontech). Gene-specific primers were designed as per GenomeWalker™ manufacturer's protocols based on sequences obtained using degenerate primer sequences (see above), allowing for a few hundred homologous base pair overlap with the original product. These gene-specific primers were subsequently used with GenomeWalker™ adaptor primers for PCR of upstream and downstream sequences to complete the A. caviae BAR ORF.
[00247] Full-length gene sequence of the A. caviae BAR gene: atgcacaaga aaacactgct cgcgaccctg atctttggcc tgctggccgg ccaggcagtc gccgccccct atctgccgct cgccgacgac caccgcaacg gtcaggaaca gaccgccgcc aacgcctggc tggaagtgga tctcggcgcc ttcgagcaca acatccagac cctgaagaat cgcctcggtg acaagggccc gcagatctgc gccatcatga aggcggacgc ctacggtcac ggcatcgacc tgctggtccc ttccgtggtc aaggcaggca tcccctgcat cggcatcgcc agcaacgaag aagcacgtgt tgcccgcgag aagggcttcg aaggtcgcct gatgcgggta cgtgccgcca ccccggatga agtggagcag gccctgccct acaagctgga ggagctcatc ggcagcctgg agagcgccaa ggggatcgcc gacatcgccc agcgccatca caccaacatc ccggtgcaca tcggcctgaa ctccgccggc atgagccgca acggcatcga tctgcgccag gacgatgcca aggccgatgc cctggccatg ctcaagctca aggggatcac cccggtcggc atcatgaccc acttcccggt ggaggagaaa gaggacgtca agctggggct ggcccagttc aagctggact accagtggct catcgacgcc ggcaagctgg atcgcagcaa gctcaccatc cacgccgcca actccttcgc caccctggaa gtaccggaag cctactttga catggtgcgc ccgggcggca tcatctatgg cgacaccatt ccctcctaca ccgagtacaa gaaggtgatg gcgttcaaga cccaggtcgc ctccgtcaac cactacccgg cgggcaacac cgtcggctat gaccgcacct tcaccctcaa gcgcgactcc ctgctggcca acctgccgat gggctactcc gacggctacc gccgcgccat gagcaacaag gcctatgtgc tgatccatgg ccagaaggcc cccgtcgtgg gcaagacttc catgaacacc accatggtgg acgtcaccga catcaagggg atcaaacccg gtgacgaggt ggtcctgttc ggacgccagg gtgatgccga ggtgaaacaa tctgatctgg aggagtacaa cggtgccctc ttggcggaca tgtacaccgt ctggggctat accaacccca agaagatcaa gcgctaa (SEQ ID NO:543).
[00248] The corresponding amino acid sequence for the A. caviae native BAR:
MHKKTLLATL IFGLLAGQAV AAPYLPLADD HRNGQEQTAA NAWLEVDLGA
FEHNIQTLKN RLGDKGPQIC AIMKADAYGH GIDLLVPSVV KAGIPCIGIA
SNEEARVARE KGFEGRLMRV RAATPDEVEQ ALPYKLEELI GSLESAKGIA
DIAQRHHTNI PVHIGLNSAG MSRNGIDLRQ DDAKADALAM LKLKGITPVG
IMTHFPVEEK EDVKLGLAQF KLDYQWLIDA GKLDRSKLTI HAANSFATLE
VPEAYFDMVR PGGIIYGDTI PSYTEYKKVM AFKTQVASVN HYPAGNTVGY
DRTFTLKRDS LLANLPMGYS DGYRRAMSNK AYVLIHGQKA PVVGKTSMNT
TMVDVTDIKG IKPGDEVVLF GRQGDAEVKQ SDLEEYNGAL LADMYTVWGY TNPKKIKR (SEQ ID NO:544).
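The correspondence between the nucleotide sequence and the amino acid sequence above can be spot-checked with a minimal translation sketch. It assumes the standard genetic code; the function name `translate` and the choice of test fragments are illustrative and are not taken from the described method.

```python
from itertools import product

# Standard genetic code, enumerated in t, c, a, g order at each codon position.
BASES = "tcag"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

def translate(dna: str) -> str:
    """Translate an in-frame DNA string, stopping at the first stop codon."""
    dna = "".join(dna.lower().split())  # drop the spaces used in the listing
    protein = []
    for i in range(0, len(dna) - 2, 3):
        aa = CODON_TABLE[dna[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)

# First thirty nucleotides and last twenty-seven nucleotides of the gene listing.
print(translate("atgcacaaga aaacactgct cgcgaccctg"))
print(translate("accaacccca agaagatcaa gcgctaa"))
```

The same function applied to the in-frame fragment `ccttccgtggtcaag` gives a Val-Val pair, which matches the corresponding position in the second line of the amino acid listing.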
[00249] The following PCR primers were utilized to clone the native full-length A. caviae BAR in both untagged and C-terminally His-tagged versions:
[00250] caviae F NdeI 5' - GGA ACC TTC ATA TGC ACA AGA AAA CAC TGC TCG CGA CC - 3' (SEQ ID NO:545);
[00251] caviae R BamHI (untagged) 5' - GGT TCC AAG GAT CCT TAG CGC TTG ATC TTC TTG GGG TTG - 3' (SEQ ID NO:546); and
[00252] caviae R XhoI (C-term tag) 5' - TTC CAA GGC TCG AGG CGC TTG ATC TTC TTG GGG TTG GTA - 3' (SEQ ID NO:547).
[00253] The C-terminally tagged enzyme had comparable activity to the untagged native A. caviae BAR. When 200 μg of purified (tagged) racemase enzymes were used in a tryptophan racemase assay as described in Example 17, at 30 minutes, A. caviae BAR produced 1034 μg/mL of D-tryptophan.
The effect of leader sequences on racemase activity
[00254] The first 21 N-terminal amino acid residues of the A. caviae native BAR amino acid sequence above (SEQ ID NO:544) were predicted to be a signal peptide using the program SignalP 3.0 (cbs.dtu.dk/services/SignalP/ on the World Wide Web). The following N-terminal primer was used to clone the A. caviae gene without amino acids 2-21 of the leader sequence:
A. cav Minus leader F NdeI 5'-CCT TGG AAC ATA TGG CCC CCT ATC TGC CGC T-3' (SEQ ID NO:548).
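The design of the leaderless construct can be sketched as follows. The sketch assumes only what is stated above (residues 2-21 are removed and the initiator methionine is supplied by the NdeI site in the primer); the helper names are hypothetical.

```python
# First thirty residues of the native BAR sequence (SEQ ID NO:544).
NATIVE_N_TERMINUS = "MHKKTLLATLIFGLLAGQAVAAPYLPLADD"
SIGNAL_PEPTIDE_LENGTH = 21          # residues 1-21 predicted as signal peptide
NDEI_SITE = "CATATG"                # NdeI recognition site, contains an ATG
MINUS_LEADER_PRIMER = "CCTTGGAACATATGGCCCCCTATCTGCCGCT"

def remove_leader(protein: str, leader_length: int = SIGNAL_PEPTIDE_LENGTH) -> str:
    """Keep the initiator Met and drop residues 2..leader_length."""
    return protein[0] + protein[leader_length:]

def coding_part(primer: str, site: str = NDEI_SITE) -> str:
    """Return the primer from the ATG embedded in the NdeI site onward."""
    start = primer.index(site) + 3  # skip 'CAT'; 'ATG' is the start codon
    return primer[start:]

print(remove_leader(NATIVE_N_TERMINUS))
print(coding_part(MINUS_LEADER_PRIMER))
```

The coding part of the primer begins ATG GCC CCC TAT CTG CCG, that is, Met followed by Ala-Pro-Tyr-Leu-Pro, which agrees with the expected N-terminus of the leaderless protein.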
[00255] The leaderless racemase, when expressed, was found to retain approximately 65% of the activity, as compared with the expression product of the full-length gene. The periplasmic and cytoplasmic protein fractions were isolated for the wild type expression products, as well as the leaderless constructs, as described in the pET System Manual (Novagen, Madison, WI). The majority of expressed wildtype BAR was found in the periplasm, while the leaderless BAR appeared to remain in the cytoplasm. The reduction in activity of the leaderless A. caviae BAR may be due to a change in processing and/or folding when expressed in the cytoplasm.
Effect of D76N mutation on A. caviae BAR activity
[00256] A D76N mutant of A. caviae BAR was made to determine if this position was critical for broad activity. Mutagenesis was done using the QuikChange Multi site-directed mutagenesis kit (Stratagene, La Jolla, CA), using the C-tagged A. caviae BAR gene in pET30 as template. The following mutagenic primer was used to make a D76N change (nucleotide position 226): 5'-CGC CAT CAT GAA GGC GAA CGC CTA CGG TCA CG-3' (SEQ ID NO:549). The site-directed mutagenesis was done as described in the manufacturer's protocol. The mutant and the wildtype enzyme were produced as described above and assayed as described in Example 17 using 200 micrograms of purified protein (prepared as described herein; purified A. caviae D76N was C-term His tagged in pET30) and approximately 7 mg/mL of L-tryptophan as substrate. At the 30 minute time point, the mutant produced 1929 micrograms per mL of D-tryptophan as compared to 1149 micrograms per mL for the wildtype enzyme. The D76N mutant also reached equilibrium at an earlier time point. The improvement in activity was unexpected.
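The single-base change introduced by the mutagenic primer can be located with a short comparison against the wild-type gene. In this sketch the wild-type segment and its start coordinate (nucleotide 210) were read off the full-length gene listing above and should be treated as an assumption of the example; the helper names are hypothetical.

```python
# Wild-type gene segment aligned to the primer (assumed to start at nucleotide 210
# of the full-length gene listing), and the mutagenic primer in lower case.
WILD_TYPE_SEGMENT = "cgccatcatgaaggcggacgcctacggtcacg"
SEGMENT_START = 210
MUTAGENIC_PRIMER = "cgccatcatgaaggcgaacgcctacggtcacg"

def point_differences(reference: str, primer: str, start: int) -> list:
    """List (gene position, wild-type base, primer base) for each mismatch."""
    return [
        (start + i, ref, alt)
        for i, (ref, alt) in enumerate(zip(reference, primer))
        if ref != alt
    ]

def codon_number(nt_position: int) -> int:
    """1-based codon index containing a 1-based nucleotide position."""
    return (nt_position - 1) // 3 + 1

differences = point_differences(WILD_TYPE_SEGMENT, MUTAGENIC_PRIMER, SEGMENT_START)
print(differences)
print([codon_number(pos) for pos, _, _ in differences])
```

Under these assumptions the only mismatch is a G to A change at nucleotide 226, the first base of codon 76, which converts the aspartate codon GAC to the asparagine codon AAC.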
[00257] Based on the high homology in this region for Aeromonas and Pseudomonas BAR enzymes, it might be expected that similar mutations in other broad activity racemases would also be beneficial. A beneficial effect, however, was not observed when a similar mutation in SEQ ID NO:442 was generated. See Example 13.
[00258] The following racemase polypeptides had 99% identity to the BAR from A. caviae described in this example: SEQ ID NO:200, 202, 206, 142, 186 and 176. SEQ ID NO:176 had 97% identity at the amino acid level to the BAR from A. caviae described in this example. It is expected that these candidates would also have tryptophan racemase activity given the high sequence homology to an enzyme with demonstrated tryptophan racemase activity.
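For orientation, percent identity over an existing alignment can be computed as in the sketch below. This is a simplified illustration, not the alignment procedure used to obtain the figures above: it assumes two sequences that are already aligned to equal length, and the example compares only residues 71-80 of the native BAR with the same window carrying the D76N substitution.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of alignment columns with identical residues (gaps never match)."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to the same length")
    matches = sum(
        1 for a, b in zip(aligned_a, aligned_b) if a == b and a != "-"
    )
    return 100.0 * matches / len(aligned_a)

NATIVE_71_80 = "AIMKADAYGH"   # residues 71-80 of SEQ ID NO:544
D76N_71_80 = "AIMKANAYGH"     # same window with the D76N substitution

print(percent_identity(NATIVE_71_80, D76N_71_80))
```

Over a full-length protein a single substitution changes the identity far less than it does in this ten-residue window.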
Example 20 — Sequence Identity Matrix
[00259] Appendix I shows a table that describes selected characteristics of exemplary nucleic acids and polypeptides of the invention, including sequence identity comparison of the exemplary sequences to public databases. By way of example and to further aid in understanding Appendix I, in the first row, labeled "SEQ ID NO:", the numbers "1, 2" represent the exemplary polypeptide of the invention having a sequence as set forth in SEQ ID NO:2, encoded by, e.g., SEQ ID NO:1. The sequences described in Appendix I (the exemplary sequences of the invention) have been subject to a BLAST search (as described herein) against two sets of databases. The first database set is available through NCBI (National Center for Biotechnology Information). The results from searches against these databases are found in the columns entitled "NR Description", "NR Accession Code", "NR E-value" or "NR Organism". "NR" refers to the Non-Redundant nucleotide database maintained by NCBI. This database is a composite of GenBank, GenBank updates, and EMBL updates. The entries in the column "NR Description" refer to the definition line in any given NCBI record, which includes a description of the sequence, such as the source organism, gene name/protein name, or some description of the function of the sequence. The entries in the column "NR Accession Code" refer to the unique identifier given to a sequence record. The entries in the column "NR E-value" refer to the Expected value (E-value), which represents the probability that an alignment score as good as the one found between the query sequence (the sequences of the invention) and that particular database sequence would be found in the same number of comparisons between random sequences as was done in the present BLAST search. The entries in the column "NR Organism" refer to the source organism of the sequence identified as the closest BLAST hit.
[00260] The second database set is collectively known as the GENESEQ™ database, which is available through Thomson Derwent (Philadelphia, PA). The results from searches against this database are found in the columns entitled "GENESEQ Protein Description", "GENESEQ Protein Accession Code", "E-value", "GENESEQ DNA Description", "GENESEQ DNA Accession Code" or "E-value". The information found in these columns is comparable to the information found in the NR columns described above, except that it was derived from BLAST searches against the GENESEQ™ database instead of the NCBI databases.
[00261] In addition, this table includes the column "Predicted EC No.". An EC number is the number assigned to a type of enzyme according to a scheme of standardized enzyme nomenclature developed by the Enzyme Commission of the Nomenclature Committee of the International Union of Biochemistry and Molecular Biology (IUBMB). The results in the "Predicted EC No." column are determined by a BLAST search against the KEGG (Kyoto Encyclopedia of Genes and Genomes) database. If the top BLAST match has an E-value equal to or less than e-6, the EC number assigned to the top match is entered into the table.
[00262] The columns "Query DNA Length" and "Query Protein Length" refer to the number of nucleotides or the number of amino acids, respectively, in the sequence of the invention that was searched or queried against either the NCBI or GENESEQ™ databases. The columns "Subject DNA Length" and "Subject Protein Length" refer to the number of nucleotides or the number of amino acids, respectively, in the sequence of the top match from the BLAST searches. The results provided in these columns are from the search that returned the lower E-value, either from the NCBI databases or the GENESEQ™ database. The columns "%ID Protein" and "%ID DNA" refer to the percent sequence identity between the sequence of the invention and the sequence of the top BLAST match. The results provided in these columns are from the search that returned the lower E-value, either from the NCBI databases or the GENESEQ™ database.
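The two selection rules described for Appendix I (the E-value cutoff for the "Predicted EC No." column, and reporting subject lengths and percent identities from whichever search gave the lower E-value) can be sketched as below. The record type, field names, accession strings and numeric values are all hypothetical, and "e-6" is read here as 1e-6, which is an assumption of this sketch.

```python
from dataclasses import dataclass
from typing import Optional

EC_EVALUE_CUTOFF = 1e-6  # assumption: "e-6" in the text is read as 1e-6

@dataclass
class Hit:
    database: str                    # e.g. "NR", "GENESEQ" or "KEGG"
    accession: str                   # hypothetical identifier of the top match
    evalue: float
    ec_number: Optional[str] = None  # only meaningful for KEGG hits

def predicted_ec(top_kegg_hit: Optional[Hit]) -> Optional[str]:
    """Return the EC number of the top KEGG match if it passes the cutoff."""
    if top_kegg_hit is not None and top_kegg_hit.evalue <= EC_EVALUE_CUTOFF:
        return top_kegg_hit.ec_number
    return None

def reported_hit(nr_hit: Hit, geneseq_hit: Hit) -> Hit:
    """Pick the search result with the lower E-value (NR wins an exact tie)."""
    return min((nr_hit, geneseq_hit), key=lambda hit: hit.evalue)

# Hypothetical example values, not taken from Appendix I.
kegg = Hit("KEGG", "example_kegg_entry", 1e-120, ec_number="5.1.1.1")
nr = Hit("NR", "example_nr_entry", 1e-150)
geneseq = Hit("GENESEQ", "example_geneseq_entry", 1e-133)

print(predicted_ec(kegg))
print(reported_hit(nr, geneseq).database)
```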
OTHER EMBODIMENTS
[00263] It is to be understood that while the invention has been described in conjunction with the detailed description thereof, the foregoing description is intended to illustrate and not limit the scope of the invention, which is defined by the scope of the appended claims. Other aspects, advantages, and modifications are within the scope of the following claims.
[Figures imgf000086_0001 through imgf000184_0001: page images from the original publication, including the Appendix I table; not reproduced in this text. Column headings recoverable from Appendix I: SEQ ID NO; NR Description; NR Accession Code; NR E-value; NR Organism; Geneseq Protein Description; Geneseq Protein Accession Code; E-value; Geneseq DNA Description; Geneseq DNA Accession Code; E-value; Predicted EC Number; Query DNA Length; Query Protein Length; Subject DNA Length; Subject Protein Length; % ID Protein; % ID DNA.]

Claims

WHAT IS CLAIMED IS:
1. A method of converting one or more L-amino acids to the corresponding one or more D-amino acids (or one or more D-amino acids to the corresponding one or more L-amino acids), comprising combining one or more L-amino acids (or one or more D-amino acids) with a) one or more nucleic acid molecules chosen from the group consisting of SEQ ID NOs:1, 3, 5, 7, 9, 11, 13, 15, 17, 19, 21, 23, 25, 27, 29, 31, 33, 35, 37, 39, 41, 43, 45, 47, 49, 51, 53, 55, 57, 59, 61, 63, 65, 67, 69, 71, 73, 75, 77, 79, 81, 83, 85, 87, 89, 91, 93, 95, 97, 99, 101, 103, 105, 107, 109, 111, 113, 115, 117, 119, 121, 123, 125, 127, 129, 131, 133, 135, 137, 139, 141, 143, 145, 147, 149, 151, 153, 155, 157, 159, 161, 163, 165, 167, 169, 171, 173, 175, 177, 179, 181, 183, 185, 187, 189, 191, 193, 195, 197, 199, 201, 203, 205, 207, 209, 211, 213, 215, 217, 219, 221, 223, 225, 227, 229, 231, 233, 235, 237, 239, 241, 243, 245, 247, 249, 251, 253, 255, 257, 259, 261, 263, 265, 267, 269, 271, 273, 275, 277, 279, 281, 283, 285, 287, 289, 291, 293, 295, 297, 299, 301, 303, 305, 307, 309, 311, 313, 315, 317, 319, 321, 323, 325, 327, 329, 331, 333, 335, 337, 339, 341, 343, 345, 347, 349, 351, 353, 355, 357, 359, 361, 363, 365, 367, 369, 371, 373, 375, 377, 379, 381, 383, 385, 387, 389, 391, 393, 395, 397, 399, 401, 403, 405, 407, 409, 411, 413, 415, 417, 419, 421, 423, 425, 427, 429, 431, 433, 435, 437, 439, 441, 443, 445, 447, 449, 451, 453, 455, 457, 459, 461, 463, 465, 467, 469, 471, 473, 475, 477, 479, 481, 483, 485, 487, 489, 491, 493, 495, and 497, wherein said one or more nucleic acid molecules encode polypeptides having isomerase or epimerase activity; b) a sequence variant of a), wherein said variant encodes a polypeptide having isomerase or epimerase activity; c) a fragment of a) or b), wherein said fragment encodes a polypeptide having isomerase or epimerase activity; d) one or more polypeptides chosen from the group consisting of SEQ ID NOs:2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24, 26, 28, 30, 32, 34, 36, 38, 40, 42, 44, 46, 48, 50, 52, 54, 56, 58, 60, 62, 64, 66, 68, 70, 72, 74, 76, 78, 80, 82, 84, 86, 88, 90, 92, 94, 96, 98, 100, 102, 104, 106, 108, 110, 112, 114, 116, 118, 120, 122, 124, 126, 128, 130, 132, 134, 136, 138, 140, 142, 144, 146, 148, 150, 152, 154, 156, 158, 160, 162, 164, 166, 168, 170, 172, 174, 176, 178, 180, 182, 184, 186, 188, 190, 192, 194, 196, 198, 200, 202, 204, 206, 208, 210, 212, 214, 216, 218, 220, 222, 224, 226, 228, 230, 232, 234, 236, 238, 240, 242, 244, 246, 248, 250, 252, 254, 256, 258, 260, 262, 264, 266, 268, 270, 272, 274, 276, 278, 280, 282, 284, 286, 288, 290, 292, 294, 296, 298, 300, 302, 304, 306, 308, 310, 312, 314, 316, 318, 320, 322, 324, 326, 328, 330, 332, 334, 336, 338, 340, 342, 344, 346, 348, 350, 352, 354, 356, 358, 360, 362, 364, 366, 368, 370, 372, 374, 376, 378, 380, 382, 384, 386, 388, 390, 392, 394, 396, 398, 400, 402, 404, 406, 408, 410, 412, 414, 416, 418, 420, 422, 424, 426, 428, 430, 432, 434, 436, 438, 440, 442, 442 D56N, 444, 446, 448, 450, 452, 454, 456, 458, 460, 462, 464, 466, 468, 470, 472, 474, 476, 478, 480, 482, 484, 486, 488, 490, 492, 494, 496, and 498, wherein said one or more polypeptides has isomerase or epimerase activity; e) a variant of d), wherein said variant has isomerase or epimerase activity; or f) a fragment of d) or e), wherein said fragment has isomerase or epimerase activity.
2. The method of claim 1, wherein said one or more nucleic acid molecules are chosen from the group consisting of SEQ ID NOs:1, 3, 5, 7, 9, 11, 13, 15, 17, 19, 21, 23, 25, 27, 29, 31, 33, 35, 37, 39, 41, 43, 45, 47, 49, 51, 53, 55, 57, 59, 61, 63, 65, 67, 69, 71, 73, 75, 77, 79, 81, 83, 85, 87, 89, 91, 93, 95, 97, 99, 101, 103, 105, 107, 109, 111, 113, 115, 117, 119, 121, 123, 125, 127, 129, 131, 133, 135, 137, 139, 141, 143, 145, 147, 149, 151, 153, 155, 157, 159, 161, 163, 165, 167, 169, 171, 173, 175, 177, 179, 181, 183, 185, 187, 189, 191, 193, 195, 197, 199, 201, 203, 205, 207, 209, 211, 213, 215, 217, 219, 221, 223, 225, 227, 229, 231, 233, 235, 237, 239, 241, 243, 245, 247, 249, 251, 253, 255, 257, 259, 261, 263, 265, 267, 269, 271, 273, 275, 277, 279, 281, 283, 285, 287, 289, 291, 293, 295, 297, 299, 301, 303, 305, 307, 309, 311, 313, 315, 317, 319, 321, 323, 325, 327, 329, 331, 333, 335, 337, 339, 341, 343, 345, 347, 349, 351, 353, 355, 357, 359, 361, 363, 365, 367, 385, 387, 389, 391, 393, 395, 397, 399, 401, 403, 405, 407, 409, 411, 413, 415, 417, 419, 421, 423, 425, 427, 429, 431, 433, 435, 437, 439, 441, 443, 445, 447, 449, 451, 453, 455, 457, 459, 461, 463, 465, 467, 469, 471, 473, 475, 477, 479, 481, 483, 485, 487, 489, 491, 493, 495, and 497.
3. The method of claim 1, wherein said one or more polypeptides are chosen from the group consisting of SEQ ID NOs:2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24, 26, 28, 30, 32, 34, 36, 38, 40, 42, 44, 46, 48, 50, 52, 54, 56, 58, 60, 62, 64, 66, 68, 70, 72, 74, 76, 78, 80, 82, 84, 86, 88, 90, 92, 94, 96, 98, 100, 102, 104, 106, 108, 110, 112, 114, 116, 118, 120, 122, 124, 126, 128, 130, 132, 134, 136, 138, 140, 142, 144, 146, 148, 150, 152, 154, 156, 158, 160, 162, 164, 166, 168, 170, 172, 174, 176, 178, 180, 182, 184, 186, 188, 190, 192, 194, 196, 198, 200, 202, 204, 206, 208, 210, 212, 214, 216, 218, 220, 222, 224, 226, 228, 230, 232, 234, 236, 238, 240, 242, 244, 246, 248, 250, 252, 254, 256, 258, 260, 262, 264, 266, 268, 270, 272, 274, 276, 278, 280, 282, 284, 286, 288, 290, 292, 294, 296, 298, 300, 302, 304, 306, 308, 310, 312, 314, 316, 318, 320, 322, 324, 326, 328, 330, 332, 334, 336, 338, 340, 342, 344, 346, 348, 350, 352, 354, 356, 358, 360, 362, 364, 366, 368, 386, 388, 390, 392, 394, 396, 398, 400, 402, 404, 406, 408, 410, 412, 414, 416, 418, 420, 422, 424, 426, 428, 430, 432, 434, 436, 438, 440, 442, 442 D56N, 444, 446, 448, 450, 452, 454, 456, 458, 460, 462, 464, 466, 468, 470, 472, 474, 476, 478, 480, 482, 484, 486, 488, 490, 492, 494, 496, and 498.
4. The method of claim 1, wherein the nucleic acid molecule has the sequence shown in SEQ ID NO:411.
5. The method of claim 1, wherein the polypeptide has the sequence shown in SEQ ID NO:412.
6. The method of claim 1, wherein said variant is a nucleic acid molecule that has at least 98% or at least 99% sequence identity to SEQ ID NOs:1, 3, 5, 7, 9, 11, 13, 15, 17, 19, 21, 23, 25, 27, 29, 31, 33, 35, 37, 39, 41, 43, 45, 47, 49, 51, 53, 55, 57, 59, 61, 63, 65, 67, 69, 71, 73, 75, 77, 79, 81, 83, 85, 87, 89, 91, 93, 95, 97, 99, 101, 103, 105, 107, 109, 111, 113, 115, 117, 119, 121, 123, 125, 127, 129, 131, 133, 135, 137, 139, 141, 143, 145, 147, 149, 151, 153, 155, 157, 159, 161, 163, 165, 167, 169, 171, 173, 175, 177, 179, 181, 183, 185, 187, 189, 191, 193, 195, 197, 199, 201, 203, 205, 207, 209, 211, 213, 215, 217, 219, 221, 223, 225, 227, 229, 231, 233, 235, 237, 239, 241, 243, 245, 247, 249, 251, 253, 255, 257, 259, 261, 263, 265, 267, 269, 271, 273, 275, 277, 279, 281, 283, 285, 287, 289, 291, 293, 295, 297, 299, 301, 303, 305, 307, 309, 311, 313, 315, 317, 319, 321, 323, 325, 327, 329, 331, 333, 335, 337, 339, 341, 343, 345, 347, 349, 351, 353, 355, 357, 359, 361, 363, 365, 367, 369, 371, 373, 375, 377, 379, 381, 383, 385, 387, 389, 391, 393, 395, 397, 399, 401, 403, 405, 407, 409, 411, 413, 415, 417, 419, 421, 423, 425, 427, 429, 431, 433, 435, 437, 439, 441, 443, 445, 447, 449, 451, 453, 455, 457, 459, 461, 463, 465, 467, 469, 471, 473, 475, 477, 479, 481, 483, 485, 487, 489, 491, 493, 495, and 497.
7. The method of claim 1, wherein said variant is a polypeptide that has at least 98% or at least 99% sequence identity to SEQ ID NOs:2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24, 26, 28, 30, 32, 34, 36, 38, 40, 42, 44, 46, 48, 50, 52, 54, 56, 58, 60, 62, 64, 66, 68, 70, 72, 74, 76, 78, 80, 82, 84, 86, 88, 90, 92, 94, 96, 98, 100, 102, 104, 106, 108, 110, 112, 114, 116, 118, 120, 122, 124, 126, 128, 130, 132, 134, 136, 138, 140, 142, 144, 146, 148, 150, 152, 154, 156, 158, 160, 162, 164, 166, 168, 170, 172, 174, 176, 178, 180, 182, 184, 186, 188, 190, 192, 194, 196, 198, 200, 202, 204, 206, 208, 210, 212, 214, 216, 218, 220, 222, 224, 226, 228, 230, 232, 234, 236, 238, 240, 242, 244, 246, 248, 250, 252, 254, 256, 258, 260, 262, 264, 266, 268, 270, 272, 274, 276, 278, 280, 282, 284, 286, 288, 290, 292, 294, 296, 298, 300, 302, 304, 306, 308, 310, 312, 314, 316, 318, 320, 322, 324, 326, 328, 330, 332, 334, 336, 338, 340, 342, 344, 346, 348, 350, 352, 354, 356, 358, 360, 362, 364, 366, 368, 370, 372, 374, 376, 378, 380, 382, 384, 386, 388, 390, 392, 394, 396, 398, 400, 402, 404, 406, 408, 410, 412, 414, 416, 418, 420, 422, 424, 426, 428, 430, 432, 434, 436, 438, 440, 442, 442 D56N, 444, 446, 448, 450, 452, 454, 456, 458, 460, 462, 464, 466, 468, 470, 472, 474, 476, 478, 480, 482, 484, 486, 488, 490, 492, 494, 496, and 498.
8. The method of claim 1, wherein said variant is a nucleic acid that has at least 45%, at least 50%, at least 55%, at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or at least 99% sequence identity to SEQ ID NO:411.
9. The method of claim 1, wherein said variant is a polypeptide that has at least 25%, at least 30%, at least 35%, at least 40%, at least 45%, at least 50%, at least 55%, at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or at least 99% sequence identity to SEQ ID NO:412.
10. The method of claim 1, wherein said variant is a mutant.
11. The method of claim 10, wherein the mutant has a mutation at the residue that aligns with residue 76 of A. caviae BAR.
12. The method of claim 1, wherein the variant is a nucleic acid molecule that has been codon optimized.
13. The method of claim 1, wherein the nucleic acid molecule is contained within an expression vector.
14. The method of claim 13, wherein the nucleic acid molecule is overexpressed.
15. The method of claim 1, wherein the polypeptide lacks a signal sequence or a prepro domain.
16. The method of claim 1, wherein the isomerase or epimerase polypeptide is immobilized on a solid support.
17. The method of claim 1, wherein the variant polypeptide is a chimeric polypeptide.
18. The method of claim 1, wherein the polypeptide fragment is a PFAM domain.
19. The method of claim 18, wherein the polypeptide fragment has the sequence shown in SEQ ID NO: 426, 440, or 462.
20. The method of claim 1, wherein the amino acid is selected from the group consisting of tryptophan, alanine and a substituted amino acid.
21. A method of converting L-tryptophan to D-tryptophan (or D-tryptophan to L-tryptophan), comprising combining L-tryptophan (or D-tryptophan) with a) one or more nucleic acid molecules chosen from the group consisting of SEQ ID NOs:1, 3, 5, 7, 9, 11, 13, 15, 17, 19, 21, 23, 25, 27, 29, 31, 33, 35, 37, 39, 41, 43, 45, 47, 49, 51, 53, 55, 57, 59, 61, 63, 65, 67, 69, 71, 73, 75, 77, 79, 81, 83, 85, 87, 89, 91, 93, 95, 97, 99, 101, 103, 105, 107, 109, 111, 113, 115, 117, 119, 121, 123, 125, 127, 129, 131, 133, 135, 137, 139, 141, 143, 145, 147, 149, 151, 153, 155, 157, 159, 161, 163, 165, 167, 169, 171, 173, 175, 177, 179, 181, 183, 185, 187, 189, 191, 193, 195, 197, 199, 201, 203, 205, 207, 209, 211, 213, 215, 217, 219, 221, 223, 225, 227, 229, 231, 233, 235, 237, 239, 241, 243, 245, 247, 249, 251, 253, 255, 257, 259, 261, 263, 265, 267, 269, 271, 273, 275, 277, 279, 281, 283, 285, 287, 289, 291, 293, 295, 297, 299, 301, 303, 305, 307, 309, 311, 313, 315, 317, 319, 321, 323, 325, 327, 329, 331, 333, 335, 337, 339, 341, 343, 345, 347, 349, 351, 353, 355, 357, 359, 361, 363, 365, 367, 385, 387, 389, 391, 393, 395, 397, 399, 401, 403, 405, 407, 409, 411, 413, 415, 417, 419, 421, 423, 425, 427, 429, 431, 433, 435, 437, 439, 441, 443, 445, 447, 449, 451, 453, 455, 457, 459, 461, 463, 465, 467, 469, 471, 473, 475, 477, 479, 481, 483, 485, 487, 489, 491, 493, 495, and 497, wherein said one or more nucleic acid molecules encode polypeptides having racemase activity; b) a variant of a), wherein said variant encodes a polypeptide having racemase activity; c) a fragment of a) or b), wherein said fragment encodes a polypeptide having racemase activity; d) one or more polypeptides chosen from the group consisting of SEQ ID NOs:2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24, 26, 28, 30, 32, 34, 36, 38, 40, 42, 44, 46, 48, 50, 52, 54, 56, 58, 60, 62, 64, 66, 68, 70, 72, 74, 76, 78, 80, 82, 84, 86, 88, 90, 92, 94, 96, 98, 100, 102, 104, 106, 108, 110, 112, 114, 116, 118, 120, 122, 124, 126, 128, 130, 132, 134, 136, 138, 140, 142, 144, 146, 148, 150, 152, 154, 156, 158, 160, 162, 164, 166, 168, 170, 172, 174, 176, 178, 180, 182, 184, 186, 188, 190, 192, 194, 196, 198, 200, 202, 204, 206, 208, 210, 212, 214, 216, 218, 220, 222, 224, 226, 228, 230, 232, 234, 236, 238, 240, 242, 244, 246, 248, 250, 252, 254, 256, 258, 260, 262, 264, 266, 268, 270, 272, 274, 276, 278, 280, 282, 284, 286, 288, 290, 292, 294, 296, 298, 300, 302, 304, 306, 308, 310, 312, 314, 316, 318, 320, 322, 324, 326, 328, 330, 332, 334, 336, 338, 340, 342, 344, 346, 348, 350, 352, 354, 356, 358, 360, 362, 364, 366, 368, 386, 388, 390, 392, 394, 396, 398, 400, 402, 404, 406, 408, 410, 412, 414, 416, 418, 420, 422, 424, 426, 428, 430, 432, 434, 436, 438, 440, 442, 442 D56N, 444, 446, 448, 450, 452, 454, 456, 458, 460, 462, 464, 466, 468, 470, 472, 474, 476, 478, 480, 482, 484, 486, 488, 490, 492, 494, 496, and 498, wherein said one or more polypeptides has racemase activity; e) a variant of d), wherein said variant has racemase activity; or f) a fragment of d) or e), wherein said fragment has racemase activity.
22. The method of claim 21, wherein said one or more polypeptides are chosen from the group consisting of SEQ ID NOs:172, 178, 180, 182, 184, 140, 144, 188, 190, 112, 148, 156, 120, 162, 108, 136, 174, 138, 296, 110, 150, 192, 152, 118, 194, 154, 196, 158, 160, 116, 248, 236, 246, 252, 250, 254, 244, 274, 234, 220, 222, 226, 232, 240, 242, 258, 260, 264, 266, 286, 290, 170, 216, 288, 208, 210, 228, 230, 270, 272, 278, 280, 282, 284, 292, 198, 212, 214, 114, 218, and 204.
23. The method of claim 21, wherein said tryptophan is a substituted tryptophan.
24. A method of converting L-tryptophan to D-tryptophan, comprising combining L-tryptophan with a) a nucleic acid molecule having the sequence shown in SEQ ID NO:411, wherein said nucleic acid molecule encodes a polypeptide having racemase activity; b) a variant of a), wherein said variant encodes a polypeptide having racemase activity; c) a fragment of a) or b), wherein said fragment encodes a polypeptide having racemase activity; d) one or more polypeptides chosen from the group consisting of SEQ ID NO:411, wherein said one or more polypeptides has racemase activity; e) a variant of d), wherein said variant has racemase activity; or f) a fragment of d) or e), wherein said fragment has racemase activity.
25. The method of claim 24, wherein said tryptophan is a substituted tryptophan.
26. The method of claim 25, wherein said substituted tryptophan is selected from the group consisting of a chlorinated tryptophan, a halogenated tryptophan and 6-chloro-D-tryptophan.
27. A method of making monatin, comprising: combining L-tryptophan with a) one or more nucleic acid molecules chosen from the group consisting of SEQ ID NOs:1, 3, 5, 7, 9, 11, 13, 15, 17, 19, 21, 23, 25, 27, 29, 31, 33, 35, 37, 39, 41, 43, 45, 47, 49, 51, 53, 55, 57, 59, 61, 63, 65, 67, 69, 71, 73, 75, 77, 79, 81, 83, 85, 87, 89, 91, 93, 95, 97, 99, 101, 103, 105, 107, 109, 111, 113, 115, 117, 119, 121, 123, 125, 127, 129, 131, 133, 135, 137, 139, 141, 143, 145, 147, 149, 151, 153, 155, 157, 159, 161, 163, 165, 167, 169, 171, 173, 175, 177, 179, 181, 183, 185, 187, 189, 191, 193, 195, 197, 199, 201, 203, 205, 207, 209, 211, 213, 215, 217, 219, 221, 223, 225, 227, 229, 231, 233, 235, 237, 239, 241, 243, 245, 247, 249, 251, 253, 255, 257, 259, 261, 263, 265, 267, 269, 271, 273, 275, 277, 279, 281, 283, 285, 287, 289, 291, 293, 295, 297, 299, 301, 303, 305, 307, 309, 311, 313, 315, 317, 319, 321, 323, 325, 327, 329, 331, 333, 335, 337, 339, 341, 343, 345, 347, 349, 351, 353, 355, 357, 359, 361, 363, 365, 367, 385, 387, 389, 391, 393, 395, 397, 399, 401, 403, 405, 407, 409, 411, 413, 415, 417, 419, 421, 423, 425, 427, 429, 431, 433, 435, 437, 439, 441, 443, 445, 447, 449, 451, 453, 455, 457, 459, 461, 463, 465, 467, 469, 471, 473, 475, 477, 479, 481, 483, 485, 487, 489, 491, 493, 495, and 497, wherein said one or more nucleic acid molecules encode polypeptides having racemase activity; b) a variant of a), wherein said variant encodes a polypeptide having racemase activity; c) a fragment of a) or b), wherein said fragment encodes a polypeptide having racemase activity; d) one or more polypeptides chosen from the group consisting of SEQ ID NOs:2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24, 26, 28, 30, 32, 34, 36, 38, 40, 42, 44, 46, 48, 50, 52, 54, 56, 58, 60, 62, 64, 66, 68, 70, 72, 74, 76, 78, 80, 82, 84, 86, 88, 90, 92, 94, 96, 98, 100, 102, 104, 106, 108, 110, 112, 114, 116, 118, 120, 122, 124, 126, 128, 130, 132, 134, 136, 138, 140, 142, 144, 146, 148, 150, 152, 154, 156, 158, 160, 162, 164, 166, 168, 170, 172, 174, 176, 178, 180, 182, 184, 186, 188, 190, 192, 194, 196, 198, 200, 202, 204, 206, 208, 210, 212, 214, 216, 218, 220, 222, 224, 226, 228, 230, 232, 234, 236, 238, 240, 242, 244, 246, 248, 250, 252, 254, 256, 258, 260, 262, 264, 266, 268, 270, 272, 274, 276, 278, 280, 282, 284, 286, 288, 290, 292, 294, 296, 298, 300, 302, 304, 306, 308, 310, 312, 314, 316, 318, 320, 322, 324, 326, 328, 330, 332, 334, 336, 338, 340, 342, 344, 346, 348, 350, 352, 354, 356, 358, 360, 362, 364, 366, 368, 386, 388, 390, 392, 394, 396, 398, 400, 402, 404, 406, 408, 410, 412, 414, 416, 418, 420, 422, 424, 426, 428, 430, 432, 434, 436, 438, 440, 442, 442 D56N, 444, 446, 448, 450, 452, 454, 456, 458, 460, 462, 464, 466, 468, 470, 472, 474, 476, 478, 480, 482, 484, 486, 488, 490, 492, 494, 496, and 498, wherein said one or more polypeptides has racemase activity; e) a variant of d), wherein said variant has racemase activity; or f) a fragment of d) or e), wherein said fragment has racemase activity.
28. The method of claim 27, further comprising adding one or more polypeptides having synthase/lyase (EC 4.1.3.- or EC 4.1.2.-) activity or a nucleic acid encoding such a polypeptide and/or one or more polypeptides having D-aminotransferase activity or a nucleic acid encoding such a polypeptide.
29. The method of claim 27, wherein said monatin is predominantly R,R monatin.
30. The method of claim 27, wherein said nucleic acid has the sequence shown in SEQ ID NO:411 and said polypeptide has the sequence shown in SEQ ID NO:412.
PCT/US2008/013968 2008-01-03 2008-12-22 Isomerases and epimerases and methods of using WO2009088442A2 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
US12/828,714 US20110045547A1 (en) 2008-01-03 2010-07-01 Isomerases and epimerases and methods of using

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US1884508P 2008-01-03 2008-01-03
US61/018,845 2008-01-03

Related Child Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2009/049599 Continuation WO2011002469A1 (en) 2008-01-03 2009-07-02 Isomerases and epimerases and methods of using

Publications (2)

Publication Number Publication Date
WO2009088442A2 (en) 2009-07-16
WO2009088442A3 (en) 2010-03-18

Family

ID=40853655

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2008/013968 WO2009088442A2 (en) 2008-01-03 2008-12-22 Isomerases and epimerases and methods of using

Country Status (1)

Country Link
WO (1) WO2009088442A2 (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20050095670A1 (en) * 2002-03-01 2005-05-05 Hajime Ikeda Amino acid racemase having low substrate specificity and process for producing racemic amino acid
US20060252135A1 (en) * 2005-04-26 2006-11-09 Cargill, Incorporated Polypeptides and biosynthetic pathways for the production of stereoisomers of monatin and their precursors
US20080020434A1 (en) * 2005-04-26 2008-01-24 Cargill, Incorporated Polypeptides and biosynthetic pathways for the production of stereoisomers of monatin and their precursors

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
DATABASE GENBANK [Online] 02 October 2001 'UPF0001 protein aq_274' Database accession no. O66631 *
DATABASE GENBANK [Online] 22 December 2005 'Aquifex aeolicus VF5, complete genome' Database accession no. AE000657 *
DATABASE INTERPRO [Online] 1991 'Alanine racemase, N-terminal' Database accession no. IPR001608 *

Also Published As

Publication number Publication date
WO2009088442A3 (en) 2010-03-18

Similar Documents

Publication Publication Date Title
US20110045547A1 (en) Isomerases and epimerases and methods of using
JP5683962B2 (en) Nucleic acids and polypeptides of aminotransferases and oxidoreductases and methods of use
US7582455B2 (en) Polypeptides and biosynthetic pathways for the production of stereoisomers of monatin and their precursors
AU2007342275B2 (en) Polypeptides and biosynthetic pathways for the production of stereoisomers of monatin and their precursors
EP2361976B1 (en) Polypeptides and biosynthetic pathways for the production of stereoisomers of monatin and their precursors
JP4532116B2 (en) Alanine 2,3-aminomutase
BRPI0716212A2 (en) BETA-ALANINE / ALPHA KETOGLUTARATE AMINOTRANSFERASE FOR PRODUCTION OF 3-HYDROXIPROPIONIC ACID
BRPI0711972A2 (en) methods and systems for increasing the production of equilibrium reactions
WO2009088442A2 (en) Isomerases and epimerases and methods of using
WO2020187739A1 (en) Engineered cells for production of indole-derivatives
AU2012201740B2 (en) Polypeptides and biosynthetic pathways for the production of stereoisomers of monatin and their precursors
WO2022131323A1 (en) Microbe that produces useful substance, and production method

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 08869974

Country of ref document: EP

Kind code of ref document: A2

NENP Non-entry into the national phase in:

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 08869974

Country of ref document: EP

Kind code of ref document: A2