WO2009006492A2 - Stereoselective resolution of racemic amines - Google Patents

Stereoselective resolution of racemic amines

Info

Publication number
WO2009006492A2
Authority
WO
WIPO (PCT)
Prior art keywords
enzyme
compound
formula
transaminase
sequence
Prior art date
Application number
PCT/US2008/068951
Other languages
French (fr)
Other versions
WO2009006492A3 (en)
Inventor
Ronald L. Hanson
Animesh Goswami
Brian L. Davis
William Lawrence Parker
Ramesh N. Patel
Original Assignee
Bristol-Myers Squibb Company
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Bristol-Myers Squibb Company
Publication of WO2009006492A2
Publication of WO2009006492A3

Classifications

    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12P: FERMENTATION OR ENZYME-USING PROCESSES TO SYNTHESISE A DESIRED CHEMICAL COMPOUND OR COMPOSITION OR TO SEPARATE OPTICAL ISOMERS FROM A RACEMIC MIXTURE
    • C12P13/00: Preparation of nitrogen-containing organic compounds
    • C12P13/001: Amines; Imines
    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12P: FERMENTATION OR ENZYME-USING PROCESSES TO SYNTHESISE A DESIRED CHEMICAL COMPOUND OR COMPOSITION OR TO SEPARATE OPTICAL ISOMERS FROM A RACEMIC MIXTURE
    • C12P41/00: Processes using enzymes or microorganisms to separate optical isomers from a racemic mixture
    • C12P41/003: Processes using enzymes or microorganisms to separate optical isomers from a racemic mixture by ester formation, lactone formation or the inverse reactions
    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12P: FERMENTATION OR ENZYME-USING PROCESSES TO SYNTHESISE A DESIRED CHEMICAL COMPOUND OR COMPOSITION OR TO SEPARATE OPTICAL ISOMERS FROM A RACEMIC MIXTURE
    • C12P41/00: Processes using enzymes or microorganisms to separate optical isomers from a racemic mixture
    • C12P41/006: Processes using enzymes or microorganisms to separate optical isomers from a racemic mixture by reactions involving C-N bonds, e.g. nitriles, amides, hydantoins, carbamates, lactames, transamination reactions, or keto group formation from racemic mixtures

Definitions

  • the invention relates to a novel process for the enzymatic resolution of racemic amines.
  • the process provides for the catalytic enzymatic resolution of an amine into its (R)- and/or (S)-isomer.
  • the (R)- or (S)-isomers produced in accordance with the process of the invention are precursors in molecules therapeutically useful as inhibitors of Corticotropin Releasing Factor for treatment of anxiety.
  • Chirality is a factor to be considered with respect to the efficacy of many drugs and agrochemicals.
  • the production of single enantiomers of chiral intermediates has become increasingly important.
  • Single enantiomers can be produced by chemical or chemoenzymatic synthesis, and biocatalysis, the latter being the emphasis herein.
  • Biocatalysis has many advantages over chemical synthesis which include the enantioselective and regioselective nature of enzyme-catalyzed reactions, and the ability of biocatalysts to carry out biocatalytic reactions at ambient temperature and atmospheric pressure.
  • Biocatalysts avoid problems in isomerization, racemization, epimerization, and rearrangement often associated with the use of extreme conditions in chemical syntheses. Furthermore, microbial cells expressing an enzyme of interest, and the enzymes themselves, can be immobilized and reused for multiple biocatalytic reactions. The enzymes may be over-expressed to make biocatalytic processes economically efficient, and enzymes with modified activity/properties can be readily made by recombinant techniques.
  • the present invention provides novel processes for the enzymatic resolution of racemic amines using either a transaminase or a lipase.
  • the enzyme utilized in the stereoselective process is a transaminase. In another embodiment, the enzyme utilized in the stereoselective process is a lipase.
  • the present invention provides a transaminase polypeptide, encoded by the polynucleotide of SEQ ID NO: 1 and having the encoded amino acid sequences of SEQ ID NO:2, or a functional or biologically active portion of these sequences.
  • the present invention also provides an isolated transaminase polynucleotide as depicted in SEQ ID NO: 1.
  • the present invention provides a polynucleotide sequence comprising the complement of SEQ ID NO: 1, or variants thereof.
  • an object of the invention encompasses variations or modifications of the transaminase sequence which is a result of degeneracy of the genetic code, where the polynucleotide sequences can hybridize under moderate or high stringency conditions to the polynucleotide sequence of SEQ ID NO: 1.
  • the present invention provides an isolated nucleic acid molecule comprising a polynucleotide sequence encoding a polypeptide comprising SEQ ID NO:2.
  • the present invention provides compositions comprising the transaminase polynucleotide sequence, or fragments thereof, or the encoded transaminase polypeptide, or fragments or portions thereof.
  • the present invention also provides expression vectors and host cells comprising polynucleotides that encode the transaminase polypeptide of the invention.
  • the present invention provides a process for the preparation of a compound of Formula Ia or Ib
  • R1 is alkyl, aryl, or heterocyclic; and R2 is cycloalkyl or alkyl; comprising resolving a racemic compound of Formula I
  • the strain of Bacillus is Bacillus megaterium.
  • the strain of Pseudomonas is Pseudomonas sp.
  • the strain of Candida is Candida antarctica.
  • the enzyme is a transaminase. In another embodiment, the enzyme is a lipase.
  • the enzyme is the transaminase according to SEQ
  • processes for preparing compounds of formula Ia or Ib are provided wherein the reaction with an enzyme is carried out either by: introducing a racemic compound of formula I into a medium in which the microorganism is being fermented to form a reaction mixture in which the enzyme is concurrently being formed and catalytically reacts with the racemic compound; or fermenting the microorganism until sufficient growth is realized, and introducing the racemic compound to the microorganism in which the racemic compound of formula I is catalytically reacted with the enzyme.
  • processes for preparing compounds of formula Ia or Ib are provided wherein the amount of the racemic compound of formula I added to the reaction mixture is up to about 50 g/L of the reaction mixture.
  • processes for preparing compounds of formula Ia or Ib are provided wherein an enzyme is isolated and optionally purified.
  • processes for preparing compounds of formula Ia or Ib are provided wherein the reaction catalyzed by an enzyme is carried out by reacting the racemic compound of formula I with the enzyme that was previously isolated and optionally purified before contacting with the racemic compound.
  • processes for preparing compounds of formula Ia or Ib are provided wherein the enzyme is derived from cell extracts.
  • processes for preparing compounds of formula Ia or Ib are provided wherein the enzyme is expressed by a plasmid transformed into E. coli host cells.
  • processes for preparing compounds of formula Ia or Ib are provided wherein the enzyme is obtained from Bacillus megaterium (source of transaminase enzyme), Candida antarctica (source of lipase enzyme) or Pseudomonas sp. (source of transaminase enzyme).
  • the enzyme is obtained from Bacillus megaterium strain SC6394.
  • the enzyme is a transaminase that is expressed by a gene having characteristics selected from SEQ ID NO: 1.
  • the enzyme comprises an amino acid sequence comprising SEQ ID NO:2.
  • processes for preparing compounds of formula Ia or Ib are provided wherein the enzyme provides a reaction yield of greater than 42% by weight of the compound of formula Ia or Ib, based on the weight of the racemic amine input.
  • processes for preparing compounds of formula Ia or Ib are provided wherein the process provides the compounds of formula Ia or Ib in an enantiomeric excess greater than 95%.
  • processes for preparing compounds of formula Ia or Ib are provided wherein the reaction catalyzed by an enzyme is carried out at a pH of between about 5.0 and about 9.0.
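The yield and enantiomeric-excess thresholds quoted above can be made concrete with a minimal Python sketch. The function names and the sample numbers are hypothetical and are not taken from the patent; the enantiomeric-excess formula is the standard definition, and the comparison against 50% reflects the assumption that a simple resolution can recover at most the half of the racemate that is the desired enantiomer.

```python
def enantiomeric_excess(major: float, minor: float) -> float:
    """Return ee in percent from the amounts of the two enantiomers.

    Both amounts must be in the same unit (e.g. mol or HPLC peak area).
    """
    total = major + minor
    if total <= 0:
        raise ValueError("total amount must be positive")
    return 100.0 * (major - minor) / total


def resolution_yield(product_mass: float, racemate_mass: float) -> float:
    """Return yield in percent by weight, based on the racemic amine input."""
    if racemate_mass <= 0:
        raise ValueError("racemate mass must be positive")
    return 100.0 * product_mass / racemate_mass


# Hypothetical numbers, for illustration only.
ee = enantiomeric_excess(major=98.0, minor=2.0)
yld = resolution_yield(product_mass=4.4, racemate_mass=10.0)
print(f"ee = {ee:.1f}%, yield = {yld:.1f}% (resolution ceiling assumed: 50%)")
```

With these illustrative inputs the product would satisfy both the "greater than 95%" enantiomeric excess and the "greater than 42%" yield figures stated in the text.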
  • the present invention provides processes for the preparation of an enzyme for the preparation of compounds of formula Ia* or Ib*
  • processes for the preparation of an enzyme for the preparation of compounds of formula Ia* or Ib* are provided wherein the process of extracting the enzyme comprises lysing the cells of the microorganism and isolating the enzyme.
  • processes for the preparation of an enzyme for the preparation of compounds of formula Ia* or Ib* are provided wherein the process of purifying the enzyme comprises ion-exchange, hydrophobic, and hydroxyapatite chromatography.
  • processes for preparing compounds of formula Ia* or Ib* are provided wherein
  • the present invention provides processes for preparing a compound of formula Ia* which comprises reacting a compound of the formula I* with Novozym 435 and ethyl caprate in MTBE (methyl t-butyl ether) to afford Compound Ia*.
  • the present invention provides processes for preparing a compound of formula Ib* (R amine) which comprises reacting a compound of the formula I* with enzyme in presence of potassium phosphate and sodium pyruvate to convert Ia* to a ketone.
  • the present invention provides processes for preparing (R)-sec-butylamine by the enzymatic resolution of racemic sec-butylamine.
  • the present invention provides processes for preparing (R)-sec-butylamine by the enzymatic resolution of racemic sec-butylamine, wherein the enzyme is a transaminase from B. megaterium expressed in E. coli.
  • the present invention provides processes for preparing (R)-sec-butylamine which comprises reacting racemic sec-butylamine with a transaminase from B. megaterium expressed in E. coli in presence of potassium phosphate and sodium pyruvate.
  • alkyl refers to straight or branched chain hydrocarbon groups or radicals having 1 to 6 carbon atoms, such as methyl, ethyl, n-propyl, i-propyl, n-butyl, i-butyl, t-butyl, pentyl, hexyl, cycloalkyl having 3 to 6 carbon atoms, or any subset of the foregoing, any of which may be optionally substituted.
  • cycloalkyl as employed herein alone or as part of another group includes saturated or partially unsaturated (containing 1 or 2 double bonds) cyclic hydrocarbon groups containing 1 to 10 rings, preferably 1 to 3 rings, including monocyclic alkyl, bicyclic alkyl (or bicycloalkyl) and tricyclic alkyl, containing a total of 3 to 20 carbons forming the ring, preferably 3 to 15 carbons, more preferably 3 to 10 carbons, forming the ring and which may be fused to 1 or 2 aromatic rings as described for aryl, which includes cyclopropyl, cyclobutyl, cyclopentyl, cyclohexyl, cycloheptyl, cyclooctyl, cyclodecyl, cyclododecyl, cyclohexenyl,
  • any of which groups may be optionally substituted with 1 to 4 substituents such as halogen, alkyl, alkoxy, hydroxy, aryl, aryloxy, arylalkyl, cycloalkyl, alkylamido, alkanoylamino, oxo, acyl, arylcarbonylamino, amino, nitro, cyano, thiol, and/or alkylthio, and/or any of the substituents for alkyl.
  • amino acid sequence as used herein can refer to an oligopeptide, peptide, polypeptide, or protein sequence, and fragments or portions thereof, as well as to naturally occurring or synthetic molecules, preferably isolated polypeptides of the transaminase. Amino acid sequence fragments are typically from about 4 to about 30, preferably from about 5 to about 15 amino acids in length.
  • the transaminase amino acid sequence of this invention is set forth in SEQ ID NO:2.
  • the terms "transaminase polypeptide” and “transaminase protein” are used interchangeably herein to refer to the encoded products of the transaminase nucleic acid sequence according to the present invention.
  • Isolated transaminase polypeptide refers to the amino acid sequence of substantially purified transaminase, which may be obtained from any bacterial species, preferably Bacillus or Pseudomonas sp., and from a variety of sources, including natural, synthetic, semi-synthetic, or recombinant. More particularly, the transaminase polypeptide of this invention is identified in SEQ ID NO:2. Functional fragments of the transaminase polypeptide are also embraced by the present invention.
  • Similar amino acids are those which have the same or similar physical properties and in many cases, the function is conserved with similar residues. For example, amino acids lysine and arginine are similar; while residues such as proline and cysteine are not considered to be similar.
  • the term "consensus” refers to a sequence that reflects the most common choice of base or amino acid at each position among a series of related DNA, RNA or protein sequences. Areas of particularly good agreement often represent conserved functional domains.
  • a "variant" of a transaminase polypeptide refers to an amino acid sequence that is altered by one or more amino acids.
  • the variant may have "conservative” changes, in which a substituted amino acid has similar structural or chemical properties, e.g., replacement of leucine with isoleucine. More rarely, a variant may have "non-conservative" changes, for example, replacement of a glycine with a tryptophan.
  • the encoded protein may also contain deletions, insertions, or substitutions of amino acid residues, which produce a silent change and result in a functionally equivalent transaminase protein.
  • Deliberate amino acid substitutions may be made on the basis of similarity in polarity, charge, solubility, hydrophobicity, hydrophilicity, and/or the amphipathic nature of the residues, as long as the biological activity of transaminase protein is retained.
  • negatively charged amino acids may include aspartic acid and glutamic acid
  • positively charged amino acids may include lysine and arginine
  • amino acids with uncharged polar head groups having similar hydrophilicity values may include leucine, isoleucine, and valine; glycine and alanine; asparagine and glutamine; serine and threonine; and phenylalanine and tyrosine.
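The similarity groupings listed above lend themselves to a simple lookup. The following is a minimal sketch, assuming three-letter residue codes; the group labels and the function name are hypothetical and chosen only for this example, while the group memberships are transcribed from the text.

```python
# Similarity groups transcribed from the text above.
# The dictionary keys are hypothetical labels, not terms from the patent.
SIMILARITY_GROUPS = {
    "negatively_charged": {"Asp", "Glu"},
    "positively_charged": {"Lys", "Arg"},
    "leu_ile_val": {"Leu", "Ile", "Val"},
    "gly_ala": {"Gly", "Ala"},
    "asn_gln": {"Asn", "Gln"},
    "ser_thr": {"Ser", "Thr"},
    "phe_tyr": {"Phe", "Tyr"},
}


def is_similar_substitution(original: str, replacement: str) -> bool:
    """Return True if both residues fall in the same similarity group."""
    return any(
        original in group and replacement in group
        for group in SIMILARITY_GROUPS.values()
    )


print(is_similar_substitution("Leu", "Ile"))  # same group
print(is_similar_substitution("Gly", "Trp"))  # Trp is in none of the listed groups
```

A check like this only screens candidate substitutions; whether transaminase activity is actually retained still has to be established experimentally.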
  • Guidance in determining which amino acid residues may be substituted, inserted, or deleted without abolishing functional biological or immunological activity may be found using computer programs well known in the art, for example, DNASTAR, Inc. software (Madison, WI).
  • Nucleic acid or polynucleotide sequence refers to an isolated oligonucleotide ("oligo"), nucleotide, or polynucleotide, and fragments thereof, and to DNA or RNA of genomic or synthetic origin which may be single- or double-stranded, and represent the sense or anti-sense strand, preferably of the transaminase.
  • fragments include nucleic acid sequences that are greater than 20-60 nucleotides in length, and preferably include fragments that are at least 70-100 nucleotides, or which are at least 1000 nucleotides or greater in length.
  • the transaminase nucleic acid sequence of this invention is specifically identified in SEQ ID NO: 1.
  • an "allele” or "allelic sequence” is an alternative form of the transaminase nucleic acid sequence. Alleles may result from at least one mutation in the transaminase nucleic acid sequence and may yield altered mRNAs or polypeptides whose structure or function may or may not be altered. Any given gene, whether natural or recombinant, may have none, one, or many allelic forms. Common mutational changes, which give rise to alleles, are generally ascribed to natural deletions, additions, or substitutions of nucleotides. Each of these types of changes may occur alone, or in combination with the others, one or more times in a given sequence.
  • Oligonucleotides refer to a transaminase nucleic acid sequence comprising contiguous nucleotides, of at least about 5 nucleotides to about 60 nucleotides, preferably at least about 8 to 10 nucleotides in length, more preferably at least about 12 nucleotides in length, for example, about 15 to 35 nucleotides, or about 15 to 25 nucleotides, or about 20 to 35 nucleotides, which can be typically used in PCR amplification assays, hybridization assays, or in microarrays. It will be understood that the term oligonucleotide is substantially equivalent to the terms primer, probe, or amplimer, as commonly defined in the art.
  • antisense refers to nucleotide sequences, and compositions containing nucleic acid sequences, which are complementary to a specific DNA or RNA sequence.
  • antisense strand is used in reference to a nucleic acid strand that is complementary to the "sense” strand.
  • Antisense (i.e., complementary) nucleic acid molecules include PNAs and may be produced by any method, including synthesis or transcription. Once introduced into a cell, the complementary nucleotides combine with natural sequences produced by the cell to form duplexes, which block either transcription or translation.
  • the designation “negative” is sometimes used in reference to the antisense strand, and “positive” is sometimes used in reference to the sense strand.
  • Altered nucleic acid sequences encoding the transaminase polypeptide include nucleic acid sequences containing deletions, insertions and/or substitutions of different nucleotides resulting in a polynucleotide that encodes the same or a functionally equivalent transaminase polypeptide (i.e., having transaminase activity). Altered nucleic acid sequences may further include polymorphisms of the polynucleotide encoding a transaminase polypeptide; such polymorphisms may or may not be readily detectable using a particular oligonucleotide probe.
  • the term "biologically active" (i.e., functional) refers to a protein or polypeptide, or fragment thereof, having structural, regulatory, or biochemical functions of a naturally occurring transaminase molecule.
  • complementary or “complementarity” refer to the natural binding of polynucleotides under permissive salt and temperature conditions by base pairing. For example, the sequence “A-G-T” binds to the complementary sequence "T-C-A". Complementarity between two single-stranded molecules may be “partial”, in which only some of the nucleic acids bind, or it may be “complete” when total complementarity exists between single stranded molecules.
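The base-pairing example above ("A-G-T" pairing with "T-C-A") can be sketched in a few lines of Python. The function names are hypothetical; the sketch handles only the four unambiguous DNA bases and compares strands position by position, which is a simplification of real hybridization behaviour.

```python
# Watson-Crick pairing for the four DNA bases.
_PAIRS = str.maketrans("ACGT", "TGCA")


def complement(seq: str) -> str:
    """Return the base-by-base complement, in the same left-to-right order."""
    return seq.upper().translate(_PAIRS)


def fraction_complementary(strand_a: str, strand_b: str) -> float:
    """Return the fraction of positions at which two equal-length strands pair.

    1.0 corresponds to "complete" complementarity, anything lower to "partial".
    """
    if len(strand_a) != len(strand_b) or not strand_a:
        raise ValueError("strands must be non-empty and of equal length")
    expected = complement(strand_a)
    matches = sum(x == y for x, y in zip(expected, strand_b.upper()))
    return matches / len(strand_a)


print(complement("AGT"))                     # TCA
print(fraction_complementary("AGT", "TCA"))  # 1.0
```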
  • the degree of complementarity between nucleic acid strands has significant effects on the efficiency and strength of hybridization between nucleic acid strands. This is of particular importance in amplification reactions, which depend upon binding between nucleic acids strands.
  • the term "homology" refers to a degree of complementarity. There may be partial homology or complete homology, wherein complete homology is equivalent to identity.
  • a partially complementary sequence that at least partially inhibits an identical sequence from hybridizing to a target nucleic acid is referred to as the functional term "substantially homologous".
  • the inhibition of hybridization of the completely complementary sequence to the target sequence may be examined using a hybridization assay (for example, Southern or Northern blot, solution hybridization, and the like) under conditions of low stringency.
  • a substantially homologous sequence or probe will compete for and inhibit the binding (i.e., the hybridization) of a completely homologous sequence or probe to the target sequence under conditions of low stringency. Nonetheless, conditions of low stringency do not permit non-specific binding; low stringency conditions require that the binding of two sequences to one another be a specific (i.e., selective) interaction.
  • the absence of non-specific binding may be tested by the use of a second target sequence which lacks even a partial degree of complementarity (for example, less than about 30% identity). In the absence of non-specific binding, the probe will not hybridize to the second non-complementary target sequence.
  • the present invention encompasses any nucleic acid or polypeptide that is at least 98.4% homologous to the nucleic acid and polypeptide sequences of SEQ ID NO: 1 or SEQ ID NO:2, respectively.
  • whether any particular nucleic acid molecule or polypeptide is at least about 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, 99.1%, 99.2%, 99.3%, 99.4%, 99.5%, 99.6%, 99.7%, 99.8%, or 99.9% identical to a nucleotide sequence of the present invention can be determined conventionally using known computer programs.
  • a preferred method for determining the best overall match between a query sequence (a sequence of the present invention) and a subject sequence also referred to as a global sequence alignment, can be determined using the CLUSTALW computer program (Thompson, J.D.
  • RNA sequence can be compared by converting U's to T's.
  • CLUSTALW algorithm automatically converts U's to T's when comparing RNA sequences to DNA sequences. The result of said global sequence alignment is in percent identity.
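The U-to-T conversion mentioned above can be illustrated with a short Python sketch. This is not CLUSTALW itself: the function names are hypothetical, and the identity calculation is a naive position-by-position comparison of two equal-length, already aligned sequences, shown only to make clear why the conversion is needed before comparing RNA with DNA.

```python
def rna_to_dna_alphabet(seq: str) -> str:
    """Rewrite an RNA sequence in the DNA alphabet by converting U to T."""
    return seq.upper().replace("U", "T")


def percent_identity(a: str, b: str) -> float:
    """Percent of identical positions between two equal-length aligned sequences."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)


dna = "ATGGCT"  # placeholder sequence, not from SEQ ID NO: 1
rna = "AUGGCU"  # the same placeholder written as RNA

print(percent_identity(dna, rna_to_dna_alphabet(rna)))  # 100.0
```

Without the conversion, every U/T position would be scored as a mismatch even though the two sequences carry the same information.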
  • the pairwise and multiple alignment parameters provided for CLUSTALW above represent the default parameters as provided with the AlignX software program (Vector NTI suite of programs, version 6.0).
  • the present invention encompasses the application of a manual correction to the percent identity results, in the instance where the subject sequence is shorter than the query sequence because of 5' or 3' deletions, not because of internal deletions. If only the local pairwise percent identity is required, no manual correction is needed. However, a manual correction may be applied to determine the global percent identity from a global polynucleotide alignment. Percent identity calculations based upon global polynucleotide alignments are often preferred since they reflect the percent identity between the polynucleotide molecules as a whole (i.e., including any polynucleotide overhangs, not just overlapping regions), as opposed to, only local matching polynucleotides.
  • This corrected score may be used for the purposes of the present invention. Only bases outside the 5' and 3' bases of the subject sequence, as displayed by the CLUSTALW alignment, which are not matched/aligned with the query sequence, are calculated for the purposes of manually adjusting the percent identity score. For example, a 90 base subject sequence is aligned to a 100 base query sequence to determine percent identity. The deletions occur at the 5' end of the subject sequence and therefore, the CLUSTALW alignment does not show a matched/alignment of the first 10 bases at 5' end.
  • the 10 unpaired bases represent 10% of the sequence (number of bases at the 5' and 3' ends not matched/total number of bases in the query sequence) so 10% is subtracted from the percent identity score calculated by the CLUSTALW program. If the remaining 90 bases were perfectly matched the final percent identity would be 90%.
  • a 90 base subject sequence is compared with a 100 base query sequence. This time the deletions are internal deletions so that there are no bases on the 5' or 3' of the subject sequence which are not matched/aligned with the query. In this case the percent identity calculated by CLUSTALW is not manually corrected. Once again, only bases 5' and 3' of the subject sequence which are not matched/aligned with the query sequence are manually corrected for. No other manual corrections are required for the purposes of the present invention.
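The manual correction described above reduces to one subtraction, sketched here in Python. The function and parameter names are hypothetical; the arithmetic follows the worked example in the text (unmatched terminal bases divided by the total number of bases in the query, subtracted from the alignment program's percent identity).

```python
def corrected_percent_identity(
    aligned_identity_pct: float,
    unmatched_terminal: int,
    query_length: int,
) -> float:
    """Apply the terminal-overhang correction to a percent identity score.

    aligned_identity_pct: score reported by the alignment program.
    unmatched_terminal:   query positions beyond the 5'/3' ends of the subject
                          that are not matched/aligned. Internal deletions are
                          NOT counted here.
    query_length:         total number of positions in the query sequence.
    """
    if query_length <= 0:
        raise ValueError("query_length must be positive")
    if not 0 <= unmatched_terminal <= query_length:
        raise ValueError("unmatched_terminal must lie between 0 and query_length")
    penalty = 100.0 * unmatched_terminal / query_length
    return max(0.0, aligned_identity_pct - penalty)


# 90-base subject vs. 100-base query, 10 bases missing at the 5' end,
# remaining 90 bases perfectly matched.
print(corrected_percent_identity(100.0, unmatched_terminal=10, query_length=100))  # 90.0

# Internal deletions only: nothing unmatched at the termini, so no correction.
print(corrected_percent_identity(90.0, unmatched_terminal=0, query_length=100))    # 90.0
```

The same arithmetic applies to polypeptide alignments, with N- and C-terminal residues in place of 5' and 3' bases.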
  • by a polypeptide having an amino acid sequence at least, for example, 95% "identical" to a query amino acid sequence of the present invention, it is intended that the amino acid sequence of the subject polypeptide is identical to the query sequence except that the subject polypeptide sequence may include up to five amino acid alterations per each 100 amino acids of the query amino acid sequence.
  • up to 5% of the amino acid residues in the subject sequence may be inserted, deleted, or substituted with another amino acid.
  • alterations of the reference sequence may occur at the amino- or carboxy- terminal positions of the reference amino acid sequence or anywhere between those terminal positions, interspersed either individually among residues in the reference sequence or in one or more contiguous groups within the reference sequence.
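The "up to five alterations per each 100 amino acids" rule can be expressed as a small calculation. The following Python sketch uses a hypothetical function name and assumes that fractional allowances are rounded down; the patent text does not state a rounding rule for query lengths that are not multiples of 100.

```python
import math
from fractions import Fraction


def max_alterations(query_length: int, identity_pct: float) -> int:
    """Return the largest number of altered residues compatible with identity_pct.

    Exact rational arithmetic avoids floating-point surprises at thresholds
    such as 99.9%. Rounding down is an assumption made for this sketch.
    """
    if query_length <= 0:
        raise ValueError("query_length must be positive")
    pct = Fraction(str(identity_pct))
    if not 0 <= pct <= 100:
        raise ValueError("identity_pct must lie between 0 and 100")
    return math.floor(query_length * (100 - pct) / 100)


print(max_alterations(100, 95))  # 5, matching the example in the text
print(max_alterations(250, 95))  # 12 (12.5 rounded down)
```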
  • a preferred method for determining the best overall match between a query sequence (a sequence of the present invention) and a subject sequence uses the CLUSTALW computer program (Thompson, J.D. et al., Nucleic Acids Research, 22(22):4673-4680 (1994)), which is based on the algorithm of Higgins, D. G. et al., Computer Applications in the Biosciences (CABIOS), 8(2):189-191 (1992).
  • the query and subject sequences are both amino acid sequences.
  • the result of said global sequence alignment is in percent identity.
  • the pairwise and multiple alignment parameters provided for CLUSTALW above represent the default parameters as provided with the AlignX software program (Vector NTI suite of programs, version 6.0).
  • the present invention encompasses the application of a manual correction to the percent identity results, in the instance where the subject sequence is shorter than the query sequence because of N- or C-terminal deletions, not because of internal deletions. If only the local pairwise percent identity is required, no manual correction is needed. However, a manual correction may be applied to determine the global percent identity from a global polypeptide alignment. Percent identity calculations based upon global polypeptide alignments are often preferred since they reflect the percent identity between the polypeptide molecules as a whole (i.e., including any polypeptide overhangs, not just overlapping regions), as opposed to, only local matching polypeptides.
  • This final percent identity score is what may be used for the purposes of the present invention. Only residues to the N- and C-termini of the subject sequence, which are not matched/aligned with the query sequence, are considered for the purposes of manually adjusting the percent identity score. That is, only query residue positions outside the farthest N- and C-terminal residues of the subject sequence.
  • a 90 amino acid residue subject sequence is aligned with a 100 residue query sequence to determine percent identity.
  • the deletion occurs at the N-terminus of the subject sequence and therefore, the CLUSTALW alignment does not show a matching/alignment of the first 10 residues at the N-terminus.
  • the 10 unpaired residues represent 10% of the sequence (number of residues at the N- and C- termini not matched/total number of residues in the query sequence) so 10% is subtracted from the percent identity score calculated by the CLUSTALW program. If the remaining 90 residues were perfectly matched the final percent identity would be 90%.
  • a 90 residue subject sequence is compared with a 100 residue query sequence.
  • deletions are internal deletions so there are no residues at the N- or C-termini of the subject sequence, which are not matched/aligned with the query.
  • percent identity calculated by CLUSTALW is not manually corrected.
  • residue positions outside the N- and C-terminal ends of the subject sequence, as displayed in the CLUSTALW alignment, which are not matched/aligned with the query sequence are manually corrected for. No other manual corrections are required for the purposes of the present invention.
  • BLAST and BLAST 2.0 algorithms are also available to those having skill in this art.
  • the BLASTP program uses as defaults a wordlength (W) of 3, and an expectation (E) of 10.
  • the invention encompasses polypeptides having a lower degree of identity but having sufficient similarity so as to perform one or more of the same functions performed by the polypeptide of the present invention (i.e., transaminase activity). Similarity is determined by conserved amino acid substitution. Such substitutions are those that substitute a given amino acid in a polypeptide by another amino acid of like characteristics (e.g., chemical properties). According to Cunningham et al., above, such conservative substitutions are likely to be phenotypically silent. Additional guidance concerning which amino acid changes are likely to be phenotypically silent is found in Bowie et al., Science, 247:1306-1310 (1990).
  • Tolerated conservative amino acid substitutions of the present invention involve replacement of the aliphatic or hydrophobic amino acids Ala, Val, Leu and Ile; replacement of the hydroxyl residues Ser and Thr; replacement of the acidic residues Asp and Glu; replacement of the amide residues Asn and Gln; replacement of the basic residues Lys, Arg, and His; replacement of the aromatic residues Phe,
  • amino acid substitutions may also increase protein or peptide stability.
  • the invention encompasses amino acid substitutions that contain, for example, one or more non-peptide bonds (which replace the peptide bonds) in the protein or peptide sequence. Also included are substitutions that include amino acid residues other than naturally occurring L-amino acids, e.g., D-amino acids or non-naturally occurring or synthetic amino acids, e.g., β or γ amino acids.
  • the present invention also encompasses substitution of amino acids based upon the probability of an amino acid substitution resulting in conservation of function.
  • Such probabilities are determined by aligning multiple genes with related function and assessing the relative penalty of each substitution to proper gene function.
  • Such probabilities are often described in a matrix and are used by some algorithms (e.g., BLAST, CLUSTALW, GAP, etc.) in calculating percent similarity, wherein similarity refers to the degree by which one amino acid may substitute for another amino acid without loss of function.
  • An example of such a matrix is the PAM250 or BLOSUM62 matrix.
  • the invention also encompasses substitutions which are typically not classified as conservative, but that may be chemically conservative under certain circumstances.
  • Analysis of enzymatic catalysis for proteases has shown that certain amino acids within the active site of some enzymes may have highly perturbed pKa's due to the unique microenvironment of the active site. Such perturbed pKa's could enable some amino acids to substitute for other amino acids while conserving enzymatic structure and function.
  • examples of residues that are known to have perturbed pKa's are the Glu-35 residue of lysozyme, the Ile-16 residue of chymotrypsin, the His-159 residue of papain, etc.
  • the conservation of function relates to either anomalous protonation or anomalous deprotonation of such amino acids, relative to their canonical, non-perturbed pKa.
  • the pKa perturbation may enable these amino acids to actively participate in general acid-base catalysis due to the unique ionization environment within the enzyme active site.
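The effect of a perturbed pKa on protonation state can be illustrated with the Henderson-Hasselbalch relation. The sketch below is only an illustration; the two pKa values are assumed round numbers for a typical and a perturbed glutamate side chain, not values taken from this document.

```python
def fraction_protonated(ph: float, pka: float) -> float:
    """Henderson-Hasselbalch: fraction of an acidic group that is protonated."""
    return 1.0 / (1.0 + 10.0 ** (ph - pka))


# Assumed, illustrative pKa values (not from the source document).
TYPICAL_GLU_PKA = 4.2
PERTURBED_GLU_PKA = 6.0

if __name__ == "__main__":
    for label, pka in (("typical", TYPICAL_GLU_PKA), ("perturbed", PERTURBED_GLU_PKA)):
        frac = fraction_protonated(5.0, pka)
        print(f"{label} pKa {pka}: {frac:.2f} protonated at pH 5.0")
```

At a given pH, the residue with the higher pKa remains protonated to a larger extent, which is the "anomalous protonation" referred to above.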
  • the present invention is directed to polynucleotide fragments of the transaminase polynucleotide of the invention, in addition to polypeptides encoded therein by said polynucleotide and/or fragments.
  • a "polynucleotide fragment" refers to a short polynucleotide having a nucleic acid sequence which is a portion of that shown in SEQ ID NO: 1 or the complementary strand thereto, or is a portion of a polynucleotide sequence encoding the polypeptide of SEQ ID NO:2.
  • the nucleotide fragments of the invention are preferably at least about 15 nt, and more preferably at least about 20 nt, still more preferably at least about 30 nt, and even more preferably, at least about 40 nt, at least about 50 nt, at least about 75 nt, or at least about 150 nt in length.
  • a fragment "at least 20 nt in length," for example, is intended to include 20 or more contiguous bases from the nucleotide sequence shown in SEQ ID NO: 1.
  • “about” includes the particularly recited value, a value larger or smaller by several (5, 4, 3, 2, or 1) nucleotides, at either terminus, or at both termini.
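The fragment definition (a contiguous portion of the reference sequence or of its complementary strand, with a minimum length) can be sketched as follows. The reference string is a hypothetical placeholder, since SEQ ID NO: 1 is not reproduced here, and the function names are assumptions made for illustration.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the complementary strand, written 5' to 3'."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def is_fragment(candidate: str, reference: str, min_len: int = 20) -> bool:
    """True if candidate has at least min_len bases and is a contiguous
    portion of the reference or of its complementary strand."""
    candidate, reference = candidate.upper(), reference.upper()
    if len(candidate) < min_len:
        return False
    return candidate in reference or candidate in reverse_complement(reference)


# Hypothetical 33-nt placeholder, NOT the sequence of SEQ ID NO: 1.
REFERENCE = "ATGGCTAAAGGTCTGACCGATCTGAAACGTTAA"

if __name__ == "__main__":
    print(is_fragment(REFERENCE[5:27], REFERENCE))   # 22 contiguous bases
    print(is_fragment(REFERENCE[:10], REFERENCE))    # shorter than 20 nt
```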
  • nucleotide fragments have uses that include, but are not limited to, as diagnostic probes and primers as discussed herein. Of course, larger fragments (e.g., 50, 150, 500, 600, 2000 nucleotides) are preferred.
  • polynucleotide fragments of the invention include, for example, fragments comprising, or alternatively consisting of, a sequence from about nucleotide number 1-50, 51-100, 101-150, 151-200, 201-250, 251-300, 301-350, 351-400, 401-450, 451-500, 501-550, 551-600, 651-700, 701-750, 751-800, 800-850, 851-900, 901-950, 951-1000, 1001-1050, 1051-1100, 1101-1150, 1151-1200, 1201-1250, 1251-1300, 1301-1350, 1351-1400, 1401-1450, 1451-1500, 1501-1550, 1551-1600, 1601-1650, 1651-1700, 1701-1750, 1751-1800, 1801-1850, 1851-1900, 1901-1950, 1951-2000, or 2001 to the end of SEQ ID NO: 1 or the complementary strand thereto.
  • polypeptide fragment refers to an amino acid sequence which is a portion of that contained in SEQ ID NO:2.
  • Protein (polypeptide) fragments may be "free-standing," or comprised within a larger polypeptide of which the fragment forms a part or region, most preferably as a single continuous region.
  • Representative examples of polypeptide fragments of the invention include, for example, fragments comprising, or alternatively consisting of, from about amino acid number 1-20, 21-40, 41-60, 61-80, 81-100, 102-120, 121-140, 141-160, or 161 to the end of the coding region.
  • polypeptide fragments can be about 20, 30, 40, 50, 60, 70, 80, 90, 100, 110, 120, 130, 140, or 150 amino acids in length.
  • “about” includes the particularly recited ranges or values, and ranges or values larger or smaller by several (5, 4, 3, 2, or 1) amino acids, at either extreme or at both extremes.
  • Polynucleotides encoding these polypeptides are also encompassed by the invention.
  • Preferred polypeptide fragments include the full-length protein. Further preferred polypeptide fragments include the full-length protein having a continuous series of deleted residues from the amino or the carboxy terminus, or both. For example, any number of amino acids, ranging from 1 -60, can be deleted from the amino terminus of the full-length polypeptide. Similarly, any number of amino acids, ranging from 1-30, can be deleted from the carboxy terminus of the full-length protein. Furthermore, any combination of the above amino and carboxy terminus deletions are preferred. Similarly, polynucleotides encoding these polypeptide fragments are also preferred.
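The terminal deletions described above (up to 60 residues from the amino terminus, up to 30 from the carboxy terminus, or any combination) can be sketched as a bounded slice. The function name and the example sequence are hypothetical; a deletion count of 0 is allowed here so that one end can be left intact.

```python
MAX_N_DELETION = 60  # residues removable from the amino terminus (from the text)
MAX_C_DELETION = 30  # residues removable from the carboxy terminus (from the text)


def delete_termini(protein: str, n_del: int = 0, c_del: int = 0) -> str:
    """Return the protein with n_del N-terminal and c_del C-terminal residues removed."""
    if not 0 <= n_del <= MAX_N_DELETION:
        raise ValueError(f"N-terminal deletion must be 0-{MAX_N_DELETION}")
    if not 0 <= c_del <= MAX_C_DELETION:
        raise ValueError(f"C-terminal deletion must be 0-{MAX_C_DELETION}")
    if n_del + c_del >= len(protein):
        raise ValueError("deletions would remove the entire sequence")
    return protein[n_del:len(protein) - c_del]


if __name__ == "__main__":
    example = "ACDEFGHIKL" * 10  # hypothetical 100-residue placeholder
    print(len(delete_termini(example, n_del=5, c_del=3)))
```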
  • polypeptide fragments are biologically active fragments.
  • Biologically active fragments are those exhibiting activity similar, but not necessarily identical, to an activity of the polypeptide of the present invention (i.e., transaminase activity).
  • the biological activity of the fragments may include an improved desired activity, or a decreased undesirable activity.
  • Polynucleotides encoding these polypeptide fragments are also encompassed by the invention.
  • the functional activity displayed by a polypeptide encoded by a polynucleotide fragment of the invention may be one or more biological activities typically associated with the full-length polypeptide of the invention (i.e., transaminase activity).
  • fragments may have biological activities which are desirable and directly inapposite to the biological activity of the full-length protein.
  • the functional activity of polypeptides of the invention, including fragments, variants, derivatives, and analogs thereof can be determined by numerous methods available to the skilled artisan, some of which are described elsewhere herein.
  • hybridization refers to any process by which a strand of nucleic acids binds with a complementary strand through base pairing.
  • hybridization complex refers to a complex formed between two nucleic acid sequences by virtue of the formation of hydrogen bonds between complementary G and C bases and between complementary A and T bases. The hydrogen bonds may be further stabilized by base stacking interactions. The two complementary nucleic acid sequences hydrogen bond in an anti-parallel configuration.
  • a hybridization complex may be formed in solution (for example, C₀t or R₀t analysis), or between one nucleic acid sequence present in solution and another nucleic acid sequence immobilized on a solid phase or support (for example, membranes, filters, chips, pins, or glass slides, or any other appropriate substrate to which cells or their nucleic acids have been affixed).
  • stringency or “stringent conditions” refer to the conditions for hybridization as defined by nucleic acid composition, salt, and temperature. These conditions are well known in the art and may be altered to identify and/or detect identical or related polynucleotide sequences in a sample.
  • a variety of equivalent conditions comprising either low, moderate, or high stringency depend on factors such as the length and nature of the sequence (DNA, RNA, base composition), reaction milieu (in solution or immobilized on a solid substrate), nature of the target nucleic acid (DNA, RNA, base composition), concentration of salts and the presence or absence of other reaction components (for example, formamide, dextran sulfate and/or polyethylene glycol) and reaction temperature (within a range of from about 5°C below the melting temperature of the probe to about 20°C to 25°C below the melting temperature).
  • One or more factors may be varied to generate conditions, either low or high stringency that is different from but equivalent to the aforementioned conditions.
  • the stringency of hybridization may be altered in order to identify or detect identical or related polynucleotide sequences.
  • the melting temperature, Tm, can be approximated by formulas well known in the art, depending on a number of parameters, such as the length of the hybrid or probe in number of nucleotides, or hybridization buffer ingredients and conditions (see, for example, Maniatis, T. et al., Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Laboratory, Cold Spring Harbor, NY (1982) and Sambrook, J.
  • Tm decreases approximately 1°C to 1.5°C with every 1% decrease in sequence homology.
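The rule of thumb above can be turned into a small estimate of the melting temperature range for a mismatched hybrid. This is a sketch under the stated 1°C to 1.5°C per 1% figure; the starting Tm in the example is an assumed value, and the function name is hypothetical.

```python
def adjusted_tm_range(tm_perfect_c: float, percent_mismatch: float) -> tuple[float, float]:
    """Estimate (low, high) Tm in °C for a hybrid with the given percent mismatch,
    using a 1.0-1.5 °C drop per 1% decrease in homology."""
    if percent_mismatch < 0:
        raise ValueError("percent_mismatch must be non-negative")
    low = tm_perfect_c - 1.5 * percent_mismatch   # steeper end of the rule
    high = tm_perfect_c - 1.0 * percent_mismatch  # shallower end of the rule
    return (low, high)


if __name__ == "__main__":
    # Assumed Tm of 80 °C for a perfectly matched hybrid, 5% mismatch.
    print(adjusted_tm_range(80.0, 5.0))
```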
  • the stability of a hybrid is a function of sodium ion concentration and temperature.
  • the hybridization reaction is initially performed under conditions of low stringency, followed by washes of varying, but higher stringency.
  • Reference to hybridization stringency typically relates to such washing conditions. It is to be understood that the low, moderate and high stringency hybridization or washing conditions can be varied using a variety of ingredients, buffers and temperatures well known to and practiced by the skilled artisan.
  • composition refers broadly to any composition containing a transaminase polynucleotide or polypeptide of the present invention.
  • the composition may comprise a dry formulation or an aqueous solution.
  • Compositions comprising a transaminase polynucleotide sequence (SEQ ID NO: 1) encoding a transaminase polypeptide (SEQ ID NO:2), or fragments thereof, may be employed as hybridization probes.
  • the probes may be stored in a freeze-dried form and may be in association with a stabilizing agent such as a carbohydrate.
  • the probe may be employed in an aqueous solution containing salts (for example, NaCl), detergents or surfactants (for example, SDS) and other components (for example, Denhardt's solution, dry milk, salmon sperm DNA, and the like).
  • the term “substantially purified” refers to nucleic acid sequences or amino acid sequences that are removed from their natural environment, isolated or separated, and are at least 60% free, preferably 75% to 85% free, and most preferably 90% to 95%, or greater, free from other components with which they are naturally associated.
  • Transformation or transfection refers to a process by which exogenous DNA, preferably transaminase DNA, enters and changes a recipient cell.
  • Transformation may rely on any known method for the insertion of foreign nucleic acid sequences into a prokaryotic or eukaryotic host cell.
  • the method is selected based on the type of host cell being transformed and may include, but is not limited to, viral infection, electroporation, heat shock, lipofection, and particle bombardment.
  • Such "transformed" cells include stably transformed cells in which the inserted DNA is capable of replication either as an autonomously replicating plasmid or as part of the host chromosome. Transformed cells also include those cells, which transiently express the inserted DNA or RNA for limited periods of time.
  • variants of the transaminase polypeptide are also encompassed by the present invention.
  • a transaminase variant has at least 98.4% amino acid sequence identity to a transaminase amino acid sequence disclosed herein, and more preferably, retains at least one biological, immunological, or other functional characteristic or activity of the non-variant transaminase polypeptide.
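The 98.4% identity threshold can be checked on a pre-aligned pair of sequences. The sketch below assumes one common definition of percent identity (identical positions divided by alignment length, with gaps written as "-"); alignment programs may define the denominator differently, and the function names are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over a pairwise alignment of equal-length strings."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("alignment is empty")
    # Count positions where both sequences carry the same residue (gaps never match).
    matches = sum(1 for x, y in zip(aligned_a, aligned_b) if x == y and x != "-")
    return 100.0 * matches / len(aligned_a)


def meets_identity_threshold(aligned_a: str, aligned_b: str, threshold: float = 98.4) -> bool:
    """True if the pair reaches the identity threshold named in the text."""
    return percent_identity(aligned_a, aligned_b) >= threshold


if __name__ == "__main__":
    reference = "ACDEFGHIKLMNPQRSTVWY" * 25      # hypothetical 500-residue placeholder
    variant = "X" * 5 + reference[5:]            # five substituted positions
    print(percent_identity(reference, variant))
```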
  • the present invention encompasses the polynucleotides which encode the transaminase polypeptide. Accordingly, any nucleic acid sequence that encodes the amino acid sequence of the transaminase polypeptide of the invention can be used to produce recombinant molecules that express the transaminase protein. More particularly, the invention encompasses the transaminase polynucleotide comprising the nucleic acid sequence of SEQ ID NO: 1. Additionally, any nucleic acid sequence that encodes a lipase polypeptide may be used to produce recombinant molecules that express the lipase protein.
  • the degeneracy of the genetic code results in many nucleotide sequences that can encode the described polypeptides. Some of the sequences bear minimal or no homology to the nucleotide sequences of any known and naturally occurring gene. Accordingly, the present invention contemplates each and every possible variation of nucleotide sequence that could be made by selecting combinations based on possible codon choices. These combinations are made in accordance with the standard triplet genetic code as applied to the nucleotide sequence of naturally occurring transaminase or lipase, and all such variations are to be considered as being specifically disclosed and able to be understood by the skilled practitioner.
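To give a sense of how many nucleotide sequences the degeneracy of the standard genetic code allows, the sketch below multiplies the number of synonymous codons for each residue of a peptide. The peptide in the example is a hypothetical placeholder, and stop codons are not counted.

```python
# Number of codons per amino acid in the standard genetic code (one-letter codes).
CODON_DEGENERACY = {
    "M": 1, "W": 1,
    "F": 2, "Y": 2, "H": 2, "Q": 2, "N": 2, "K": 2, "D": 2, "E": 2, "C": 2,
    "I": 3,
    "V": 4, "P": 4, "T": 4, "A": 4, "G": 4,
    "L": 6, "S": 6, "R": 6,
}


def count_encoding_sequences(peptide: str) -> int:
    """Number of distinct coding sequences (without stop codon) for a peptide."""
    total = 1
    for residue in peptide.upper():
        total *= CODON_DEGENERACY[residue]
    return total


if __name__ == "__main__":
    # Hypothetical 10-residue peptide, not taken from SEQ ID NO:2.
    print(count_encoding_sequences("MKLSRAGTVE"))
```

The count grows multiplicatively with length, which is why a full-length enzyme has an astronomically large number of possible coding sequences.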
  • Although nucleic acid sequences which encode the transaminase polypeptide or lipase polypeptide and variants thereof are preferably capable of hybridizing to the nucleotide sequence of the naturally occurring transaminase polypeptide or lipase polypeptide under appropriately selected conditions of stringency, it may be advantageous to produce nucleotide sequences encoding transaminase polypeptide or lipase polypeptides, or derivatives thereof, which possess a substantially different codon usage.
  • codons may be selected to increase the rate at which expression of the peptide/polypeptide occurs in a particular prokaryotic host in accordance with the frequency with which particular codons are utilized by the host.
  • RNA transcripts having more desirable properties such as a greater half-life, than transcripts produced from the naturally occurring sequence.
  • the present invention also encompasses production of DNA sequences, or portions thereof, which encode the transaminase polypeptide or a lipase polypeptide, or derivatives thereof, entirely by synthetic chemistry.
  • the synthetic sequence may be inserted into any of the many available expression vectors and cell systems using reagents that are well known and practiced by those in the art.
  • synthetic chemistry may be used to introduce mutations into a sequence encoding the transaminase polypeptide or lipase polypeptide, or any fragment thereof.
  • a gene delivery vector containing the transaminase polynucleotide, lipase polypeptide or functional fragment thereof is provided.
  • the gene delivery vector contains the polynucleotide, or functional fragment thereof comprising an isolated and purified polynucleotide encoding the bacterial transaminase having the sequence as set forth in any one of SEQ ID NO: 1.
  • the gene delivery vector contains a polynucleotide encoding a bacterial lipase.
  • a longer oligonucleotide probe, or mixtures of probes, for example, degenerate probes can be used to detect longer, or more complex, nucleic acid sequences, such as, for example, genomic or full length DNA.
  • the probe may comprise at least 20-300 nucleotides, preferably, at least 30-100 nucleotides, and more preferably, 50-100 nucleotides.
  • polynucleotide sequences or portions thereof which encode the transaminase polypeptide or peptides, or a lipase polypeptide or peptides can comprise recombinant DNA molecules to direct the expression of the polypeptide products, peptide fragments, or functional equivalents thereof, in appropriate host cells. Because of the inherent degeneracy of the genetic code, other DNA sequences, which encode substantially the same or a functionally equivalent amino acid sequence, may be produced and these sequences may be used to clone and express a transaminase polypeptide or lipase polypeptide as described.
  • nucleotide sequences of the present invention can be engineered using methods generally known in the art in order to alter the transaminase polypeptide or lipase polypeptide-encoding sequences for a variety of reasons, including, but not limited to, alterations which modify the cloning, processing, and/or expression of the gene product.
  • DNA shuffling by random fragmentation, PCR reassembly of gene fragments and synthetic oligonucleotides may be used to engineer the nucleotide sequences.
  • site-directed mutagenesis may be used to insert new restriction sites, alter glycosylation patterns, change codon preference, produce splice variants, or introduce mutations, and the like.
  • sequences encoding the transaminase polypeptide or lipase polypeptide may be synthesized in whole, or in part, using chemical methods well known in the art (see, for example, Caruthers, M.H. et al, Nucl. Acids Res. Symp. Ser., 215-223 (1980) and Horn, T. et al., Nucl. Acids Res. Symp. Ser., 225-232 (1980)).
  • the transaminase protein itself, or a fragment or portion thereof may be produced using chemical methods to synthesize the amino acid sequence of the transaminase polypeptide, or a fragment or portion thereof.
  • peptide synthesis can be performed using various solid-phase techniques (Roberge, J.Y. et al., Science, 269:202-204 (1995)) and automated synthesis can be achieved, for example, using the ABI 431A Peptide Synthesizer (PE Biosystems).
  • the newly synthesized transaminase polypeptide or lipase polypeptide or peptide can be substantially purified by preparative high performance liquid chromatography (e.g., Creighton, T., Proteins, Structures and Molecular Principles , W.H. Freeman and Co., New York, NY (1983)), by reverse-phase high performance liquid chromatography (HPLC), or other purification methods as known and practiced in the art.
  • the composition of the synthetic peptides may be confirmed by amino acid analysis or sequencing (e.g., the Edman degradation procedure; Creighton, supra).
  • amino acid sequence of a transaminase polypeptide, lipase polypeptide or any portion thereof can be altered during direct synthesis and/or combined using chemical methods with sequences from other proteins, or any part thereof, to produce a variant polypeptide.
  • the nucleotide sequences encoding the transaminase polypeptide, or functional equivalents may be inserted into an appropriate expression vector, i.e., a vector, which contains the necessary elements for the transcription and translation of the inserted coding sequence.
  • an expression vector contains an isolated and purified polynucleotide sequence as set forth in SEQ ID NO: 1, encoding a bacterial transaminase, or a functional fragment thereof, in which the transaminase comprises the amino acid sequence as set forth in SEQ ID NO:2.
  • an expression vector can contain the complement of the aforementioned transaminase nucleic acid sequence.
  • the expression vector comprises a polynucleotide sequence encoding a lipase.
  • Expression vectors derived from retroviruses, adenovirus, herpes or vaccinia viruses, or from various bacterial plasmids can be used for the delivery of nucleotide sequences. Methods which are well known to those skilled in the art may be used to construct expression vectors containing sequences encoding a transaminase polypeptide along with appropriate transcriptional and translational control elements. These methods include in vitro recombinant DNA techniques, synthetic techniques, and in vivo genetic recombination. Such techniques are described in Sambrook, J. et al., Molecular Cloning, A Laboratory Manual, Cold Spring Harbor Press, Plainview, N.Y. (1989) and in Ausubel, F. M. et al., Current Protocols in Molecular Biology, John Wiley & Sons, New York, NY (1989).
  • a variety of expression vector/host systems may be utilized to contain and express sequences encoding the transaminase polypeptide or lipase polypeptide, or peptides.
  • Such expression vector/host systems include, but are not limited to, microorganisms such as bacteria transformed with recombinant bacteriophage, plasmid, or cosmid DNA expression vectors; yeast transformed with yeast expression vectors; insect cell systems infected with virus expression vectors (e.g., baculovirus); plant cell systems transformed with virus expression vectors (e.g., cauliflower mosaic virus (CaMV) and tobacco mosaic virus (TMV)), or with bacterial expression vectors (e.g., Ti or pBR322 plasmids); or animal cell systems.
  • the host cell employed is not limiting to the present invention.
  • the host cell of the invention contains an expression vector comprising an isolated and purified polynucleotide having the nucleic acid sequence of SEQ ID NO: 1 and encoding the bacterial transaminase of this invention, or a functional fragment thereof, comprising an amino acid sequence as set forth in SEQ ID NO:2.
  • the host cell of the invention contains an expression vector comprising an isolated and purified polynucleotide comprising a lipase amino acid sequence.
  • Control elements are those non-translated regions of the vector, e.g., enhancers, promoters, 5' and 3' untranslated regions, which interact with host cellular proteins to carry out transcription and translation. Such elements may vary in their strength and specificity. Depending on the vector system and host utilized, any number of suitable transcription and translation elements, including constitutive and inducible promoters, may be used. Specific initiation signals may also be used to achieve more efficient translation of sequences encoding the transaminase polypeptide or lipase polypeptide. Such signals include the ATG initiation codon and adjacent sequences.
  • In cases where sequences encoding the transaminase polypeptide or lipase polypeptide, its initiation codon, and upstream sequences are inserted into the appropriate expression vector, no additional transcriptional or translational control signals may be needed. However, in cases where only a transaminase or lipase coding sequence, or a fragment thereof, is inserted, exogenous translational control signals, including the ATG initiation codon, are optimally provided. Furthermore, the initiation codon should be in the correct reading frame to ensure translation of the entire insert. Exogenous translational elements and initiation codons can be of various origins, both natural and synthetic.
  • Enhancers which are appropriate for the particular cell system that is used, such as those described in the literature (see, e.g., Scharf, D. et al, Results Probl. Cell Differ., 20: 125-162 (1994)).
  • a number of expression vectors may be selected, depending upon the use intended for the expressed product. Such vectors include, but are not limited to, the multifunctional E.
  • the expression vector is pZerO2 (Invitrogen, Carlsbad, CA).
  • pGEX vectors can also be used to express foreign polypeptides as fusion proteins with glutathione S-transferase (GST).
  • fusion proteins are soluble and can be easily purified from lysed cells by adsorption to glutathione-agarose beads followed by elution in the presence of free glutathione.
  • Proteins made in such systems can be designed to include heparin, thrombin, or factor XA protease cleavage sites so that the cloned polypeptide of interest can be released from the GST moiety at will.
  • a host cell strain may be chosen for its ability to modulate the expression of the inserted sequences or to process the expressed protein in the desired fashion.
  • modifications of the polypeptide include, but are not limited to, acetylation, carboxylation, glycosylation, phosphorylation, lipidation, and acylation.
  • Post-translational processing which cleaves a "prepro" form of the protein may also be used to facilitate correct insertion, folding and/or function.
  • Different host cells having specific cellular machinery and characteristic mechanisms for such post- translational activities are available and may be chosen to ensure the correct modification and processing of the foreign protein.
  • Host cells transformed with a nucleotide sequence encoding the transaminase protein, lipase protein, or fragments thereof may be cultured under conditions suitable for the expression and recovery of the protein from cell culture.
  • the protein produced by a recombinant cell may be secreted or contained intracellularly depending on the sequence and/or the vector used.
  • expression vectors containing polynucleotides which encode a transaminase protein or lipase protein can be designed to contain signal sequences which direct secretion of the transaminase protein or lipase protein through a prokaryotic cell membrane.
  • nucleic acid sequences encoding a transaminase protein or a lipase protein can be joined to a nucleotide sequence encoding a polypeptide domain, which will facilitate purification of soluble proteins.
  • purification facilitating domains include, but are not limited to, metal chelating peptides such as histidine-tryptophan modules that allow purification on immobilized metals; protein A domains that allow purification on immobilized immunoglobulin; and the domain utilized in the FLAGS extension/ affinity purification system (Immunex Corp., Seattle, WA).
  • cleavable linker sequences such as those specific for Factor XA or enterokinase (Invitrogen, San Diego, CA) between the purification domain and transaminase protein or lipase protein may be used to facilitate purification.
  • One such expression vector provides for expression of a fusion protein containing transaminase or lipase and a nucleic acid encoding 6 histidine residues preceding a thioredoxin or an enterokinase cleavage site. The histidine residues facilitate purification on IMAC (immobilized metal ion affinity chromatography) as described by Porath, J. et al, Prot. Exp.
  • enterokinase cleavage site provides a means for purifying from the fusion protein.
  • suitable vectors for fusion protein production see Kroll, DJ. et al., DNA Cell Biol, 12:441-453 (1993).
  • Any number of selection systems may be used to recover transformed cell lines. These include, but are not limited to, the Herpes Simplex Virus thymidine kinase (HSV TK) (Wigler, M. et al., Cell, 11:223-32 (1977)) and adenine phosphoribosyltransferase (Lowy, I.
  • Although the presence or absence of marker gene expression suggests that the gene of interest is also present, the presence and expression of the desired gene of interest may need to be confirmed.
  • If the nucleic acid sequence encoding a transaminase polypeptide or lipase polypeptide is inserted within a marker gene sequence, recombinant cells containing the polynucleotide sequence encoding the transaminase polypeptide or lipase polypeptide can be identified by the absence of marker gene function.
  • a marker gene can be placed in tandem with a sequence encoding a transaminase polypeptide or lipase polypeptide under the control of a single promoter.
  • host cells which contain the nucleic acid sequence coding for a transaminase polypeptide of the invention or a lipase polypeptide and which express the transaminase polypeptide or lipase polypeptide product may be identified by a variety of procedures known to those having skill in the art. These procedures include, but are not limited to, DNA-DNA or DNA-RNA hybridizations and protein bioassay or immunoassay techniques, including membrane, solution, or chip based technologies, for the detection and/or quantification of nucleic acid or protein.
  • transaminase polypeptides or lipase polypeptides can be detected by DNA-DNA or DNA-RNA hybridization, or by amplification using probes, portions, or fragments of polynucleotides encoding a transaminase polypeptide.
  • Nucleic acid amplification based assays involve the use of oligonucleotides or oligomers based on the nucleic acid sequences encoding a transaminase polypeptide to detect transformants containing DNA or RNA encoding transaminase polypeptide.
  • fragments of transaminase polypeptides or lipase polypeptides may be produced by direct peptide synthesis using solid phase techniques (Merrifield, J., J. Am. Chem. Soc., 85:2149-2154 (1963)). Protein synthesis may be performed using manual techniques or by automation. Automated synthesis may be achieved, for example, using the ABI 431A Peptide Synthesizer (PE Biosystems). Various fragments of the transaminase polypeptides or lipase polypeptides can be chemically synthesized separately and then combined using chemical methods to produce the full length molecule.
  • the present invention contemplates a method of detecting transaminase activity in a sample.
  • the method comprises measuring the consumption of (S)-sec-butylamine, as exemplified in Example 6, by HPLC after Marfey's derivatization.
  • the present invention contemplates a method of detecting lipase activity in a sample.
  • the method comprises measuring the lipase-catalyzed resolution of RS-1-cyclopropylethylamine.
  • sterile minimal medium contained 10 g/L glycerol, 1 g/L filter-sterilized racemic sec-butylamine or racemic cyclopropylethylamine, 0.2 g/L MgSO4·7H2O, 0.01 g/L NaCl, 0.01 g/L FeSO4·7H2O, and 0.01 g/L MnSO4·4H2O in 0.1 M pH 7 phosphate buffer.
  • Tubes were inoculated by loop (from slants) or with one drop of liquid (from vials) or with soil sample extracts. Tubes were shaken at 250 rpm, 28°C for three or more days in a bench-top shaker. Broths were then analyzed for the presence of (R)- and (S)-isomers of the amines.
  • the mixture was vortexed for 30 sec with 1 ml ethyl acetate, centrifuged briefly, then 1 ml of the upper layer was dried at 40°C under N2 and the residue was dissolved in 1 ml of the mobile phase for the chiral column.
  • the mixture was diluted to 1 ml with 50% ACN/50% water.
  • Chiral separation: column, Chiralpak OD 25 x 0.46 cm (Daicel Chemical Industries, Ltd.); mobile phase, 95% hexane/5% ethanol; flow rate, 1 ml/min; column temperature, 18°C; detection, 220 nm; injection volume, 10 µl; retention times, R-amine 11.5 min, S-amine 12.6 min.
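Once the R- and S-amine peaks are resolved, an enantiomeric excess can be computed from their integrated areas. The sketch below uses the standard definition of ee and assumes equal detector response for the two enantiomers (or their derivatives); the peak areas in the example are hypothetical.

```python
def enantiomeric_excess(area_r: float, area_s: float) -> float:
    """Enantiomeric excess of the R-isomer in percent, from two peak areas.
    A negative value means the S-isomer is in excess."""
    total = area_r + area_s
    if total <= 0:
        raise ValueError("total peak area must be positive")
    return 100.0 * (area_r - area_s) / total


if __name__ == "__main__":
    # Hypothetical integrated peak areas (arbitrary units).
    print(enantiomeric_excess(area_r=75.0, area_s=25.0))
```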
  • Marfey's reagent was used to give diastereomeric derivatives that could be separated with a C18 column.
  • a sample of 10 µl containing about 0.1 mg amine, 8 µl 1 M NaHCO3, and 40 µl 1% w/v Marfey's reagent (FDAA, 1-fluoro-2,4-dinitrophenyl-5-L-alanine amide) in acetone were combined in a 1.5 ml microfuge tube and heated for 1 h at 40°C.
  • the samples were cooled to room temperature, then 8 µl 1 N HCl and 934 µl 50% acetonitrile/water were added, and the solutions were vortexed and filtered into HPLC vials.
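The volumes in the derivatization procedure can be cross-checked with simple bookkeeping. The sketch below only restates the quantities given above; the variable names are illustrative, and volumes are treated as additive, which is an approximation.

```python
# Volumes in microliters, as stated in the derivatization procedure.
VOLUMES_UL = {
    "amine sample": 10,
    "1 M NaHCO3": 8,
    "1% w/v Marfey's reagent in acetone": 40,
    "1 N HCl": 8,
    "50% acetonitrile/water": 934,
}
AMINE_MG = 0.1  # approximate amine content of the 10 µl sample

total_ul = sum(VOLUMES_UL.values())
final_amine_mg_per_ml = AMINE_MG / (total_ul / 1000.0)

# µl x (mol/L) gives µmol; the acid added after heating matches the base.
bicarbonate_umol = VOLUMES_UL["1 M NaHCO3"] * 1.0
hcl_umol = VOLUMES_UL["1 N HCl"] * 1.0

if __name__ == "__main__":
    print(f"total volume: {total_ul} µl")
    print(f"final amine concentration: about {final_amine_mg_per_ml:.2f} mg/ml")
    print(f"NaHCO3 {bicarbonate_umol} µmol vs HCl {hcl_umol} µmol")
```

The dilution volume of 934 µl brings the sample to a nominal 1 ml, so the injected solution contains roughly 0.1 mg/ml of derivatized amine.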
  • the flask medium was adjusted to pH 7.0 with NaOH or H2SO4 as necessary, then dispensed into 4-liter flasks in 1-liter aliquots prior to autoclaving at 121°C for 30 minutes.
  • the tank medium was adjusted to pH 7.0 with NaOH or H2SO4 as necessary and steam sterilized at 121°C for 30 minutes.
  • Racemic sec-butylamine (30 g, 410 mmoles), sodium pyruvate (90.23 g, 820 mmoles) and 300 ml 1 M KH2PO4 were dissolved in deionized water and brought to a final volume of 3 L.
  • the pH was adjusted to 7.5 with 85% H3PO4.
  • Bacillus megaterium SC6394 wet cell paste (300 g cells stored at -70°C) was thawed and dispersed in 2 L of the substrate solution with an Ultraturrax T25 homogenizer. The cell suspension and the remaining 1 L substrate solution were added to a 5-L vessel for a Braun Biostat® B.
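The molar quantities in the substrate solution can be reproduced from the stated masses. The molecular weights below are standard values supplied here as assumptions (they are not given in the text), and the variable names are illustrative.

```python
# Molecular weights in g/mol (assumed standard values, not stated in the text).
MW_SEC_BUTYLAMINE = 73.14    # C4H11N
MW_SODIUM_PYRUVATE = 110.04  # C3H3NaO3


def mmol(grams: float, mw: float) -> float:
    """Convert a mass in grams to millimoles."""
    return grams / mw * 1000.0


amine_mmol = mmol(30.0, MW_SEC_BUTYLAMINE)
pyruvate_mmol = mmol(90.23, MW_SODIUM_PYRUVATE)

FINAL_VOLUME_L = 3.0
amine_mm = amine_mmol / FINAL_VOLUME_L        # mM in the 3 L solution
pyruvate_mm = pyruvate_mmol / FINAL_VOLUME_L  # mM in the 3 L solution
phosphate_mm = 300.0 * 1.0 / FINAL_VOLUME_L   # 300 ml x 1 M = 300 mmol in 3 L

if __name__ == "__main__":
    print(f"amine: {amine_mmol:.0f} mmol ({amine_mm:.0f} mM)")
    print(f"pyruvate: {pyruvate_mmol:.0f} mmol ({pyruvate_mm:.0f} mM)")
    print(f"pyruvate/amine molar ratio: {pyruvate_mmol / amine_mmol:.2f}")
    print(f"phosphate: {phosphate_mm:.0f} mM")
```

Pyruvate, the amino acceptor, is supplied at about two moles per mole of racemic amine, i.e. about four moles per mole of the (S)-enantiomer that is consumed.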
  • the vessel was connected to a Braun Biostat® B equipped with pH and oxygen electrodes and the suspension was stirred at 300 rpm.
  • the transamination reaction was carried out at 28°C and 300 rpm stirring speed; pH 8 was maintained with 10% NaOH and 8.5% H3PO4 feeds, and air was sparged from near the bottom of the vessel at 6 L/min (2 vvm).
  • Samples were assayed by derivatization with Marfey's reagent and reverse phase HPLC, and the reaction was stopped when the enantiomeric excess ("ee") of R-amine reached 100%. After 18 h the ee was
  • the reaction mixture, 3.2 kg, was adjusted to pH 13.1 with 10 M NaOH (205 g) and distilled at 1 atm, collecting two 200-mL fractions. The fractions were assayed by dansylation and HPLC of the dansyl derivative (see Note 1 below). The second fraction contained negligible amine and was discarded. The first fraction was redistilled, collecting a 39-mL fraction (bp 60-100°C) and an 18-mL fraction (bp 100°C). Dansyl/HPLC assay indicated that the first fraction contained 99.8% of the product and the second contained 0.2%. The combined distillate was cooled on ice and adjusted to pH 6.6 with 11.4 mL of 10 M H2SO4.
  • Reagents: 1.5 mg Dns-Cl per mL in MeCN (5.56 mM), stored in the dark (stable for at least one month); 40 mM Li2CO3 adjusted to pH 9.5 with HCl; and 2.5% (~0.4 M) ethanolamine in water.
  • Racemic 1-cyclopropylethylamine hydrochloride (23 g, 189 mmoles) was dissolved in 184 mL water, and 46 mL 10 N NaOH was added. The solution was cooled on ice, then 40 mL conc. HCl (11.6 M) was added to adjust the solution to pH 8. Potassium phosphate buffer (230 mL of 1 M, pH 8, diluted with water to 2012 mL) was prepared. Bacillus megaterium SC6394 wet cell paste (230 g cells stored at -70°C) was thawed and dispersed in 1 L of the buffer with an Ultraturrax T25 homogenizer.
  • the cell suspension was added to a 5-L vessel for a Braun Biostat® B, then sodium pyruvate (41.6 g, 378 mmoles) dissolved in 200 mL of the phosphate buffer was added.
  • the vessel was connected to a Braun Biostat® B equipped with pH and oxygen electrodes and the suspension was stirred at 300 rpm.
  • the amine solution was added to the stirred suspension and the remainder of the buffer was used to rinse the cells, pyruvate and amine into the vessel.
  • the transamination reaction was carried out at 28°C, 300 rpm stirring speed; pH 8 was maintained with 10% NaOH and 8.5% H3PO4 feeds, and air was sparged from near the bottom of the vessel at 4 L/min (2 vvm). The air exited from the vessel through a condenser maintained at 4°C.
  • reaction mixture (2.4 L, pH 7.9) was adjusted to pH 13.1 with 10 M NaOH (120 g) and mixed with 182 mL of n-butanol (Note 1).
  • the mixture was distilled at atmospheric pressure, collecting 200-mL fractions and the fractions adjusted to pH 4.0-4.5 with sulfuric acid.
  • the rich fractions were combined and concentrated in vacuo, giving 28.3 g of white solid.
  • the transaminase (aminotransferase) activity is determined by HPLC measuring the consumption of (S)-sec-butylamine after Marfey's derivatization.
  • a typical reaction mixture (0.5 ml) contains 50 mM Tris-HCl buffer pH 7.5, 1 mg/ml sec-butylamine, 100 mM sodium pyruvate and 0.1 mM pyridoxal 5'-phosphate (PLP).
  • the reaction is initiated by addition of 0.1-10 µg enzyme, and incubated at 28°C, 200 rpm for 2 hours.
  • the reaction is terminated by adding 0.5 ml ethanol to the mixture.
  • the transaminase was purified by three chromatographic steps. First, 100 ml of crude extract (850 mg protein) with 1 M ammonium sulfate was loaded onto a butyl-sepharose column (1.5 x 25 cm) equilibrated with 100 ml of 50 mM potassium phosphate buffer pH 7.0 containing 1 mM DTT and 1 M ammonium sulfate. After washing with 100 ml of the same buffer, the enzyme was eluted with a 100 ml linear gradient of ammonium sulfate from 1 M to 0 and an additional 20 ml of water while collecting 3 ml fractions.
  • the active fractions were pooled (30 ml), and then concentrated and desalted by centrifugation with a CentriconPlus (10 kDa cut off, Millipore Co.) to 2 ml.
  • the concentrate was injected onto a hydroxyapatite column (CHT5 bioscale, Bio-Rad) pre-equilibrated with 25 ml of 50 mM potassium phosphate buffer pH 7.0 at flow rate of 1 ml/min. After washing with 15 ml of the buffer, the column was eluted with a 25 ml linear gradient of 50 to 350 mM potassium phosphate at pH 7.0, and 1 ml fractions were collected.
  • the active fractions (3 ml) were combined and concentrated to approximately 2 ml by centrifugation with a CentriconPlus. Finally, the concentrated and desalted enzyme preparation (2 ml) was injected onto a UnoQ column equilibrated with 50 mM potassium phosphate buffer pH 7.0 containing 1 mM DTT and 50 mM NaCl at a flow rate of 1 ml/min. After washing with 12 ml of the buffer, the enzyme was eluted with a 16 ml linear gradient of 50 to 250 mM NaCl, and 0.75 ml fractions were collected and assayed for enzyme activity.
  • Bacillus megaterium chromosomal DNA was prepared using the procedure described in Ausubel et al., eds., Current Protocols in Molecular Biology, Vol. 2, Section 13.11.2, John Wiley and Sons, New York, NY (1981) with the following modification: The cell pellet was resuspended in 9.5 mL GTE buffer (50 mM glucose, 25 mM Tris-HCl pH 8.0, 10 mM NaEDTA) containing 2 mg/mL lysozyme and incubated at 37°C for 30 min before adding SDS and Proteinase K. [00131] A series of mixed oligonucleotide primers were prepared based on the partial peptide sequences obtained for the enzyme:
  • Primer sets 755 + 757 and 765 + 758 were used to amplify the gene using genomic DNA as target. Combinations of sense and antisense primers were tried with the FailSafe series of PCR buffers (Epicentre Technologies, Madison, WI) and B. megaterium chromosomal DNA as template in 10 µL reactions. Amplification was carried out in a Hybaid PCR Express thermocycler (ThermoSavant, Holbrook, NY).
  • the amplification conditions included incubation at 94°C for 1 min, followed by 30 cycles at 94°C for 0.5 min; 50°C for 0.5 min; and 72°C for 0.5 min.
  • Samples were electrophoresed on a 1.0% agarose gel for 2 hr at 100 V in TAE buffer (0.04 M Trizma base, 0.02 M acetic acid, and 0.001 M EDTA, pH 8.3) containing 0.5 µg/ml ethidium bromide.
  • SOC medium 250 µL; per liter, 5 g yeast extract, 20 g Bacto-tryptone, 580 mg NaCl, 186 mg KCl, 940 mg MgCl2, 1.2 g MgSO4, and 3.6 g glucose
  • B. megaterium genomic DNA was cleaved with a series of restriction endonucleases (ApaI, BamHI, BglII, EcoRI, HindIII, KpnI, PstI, and SmaI). Reactions contained 3 µg DNA, appropriate buffer, and 20 units enzyme in 25 µL final volume. Digests were carried out for 3 hr at 37°C, then electrophoresed in a 0.8% TAE-agarose gel at 16 V for 18 hr. The DNA was transferred to Hybond N+ nylon filters under alkaline conditions using the VacuGene vacuum blotting unit (Amersham, Piscataway, NJ).
  • Twenty µg of chromosomal DNA was cleaved with 100 U HindIII in a total volume of 200 µL for 2 hr at 37°C and electrophoresed as described above. The region from 4000-5000 base pairs was cut from the gel and the DNA purified using the QIAquick Gel Isolation kit. The isolated DNA was able to support amplification of a 580-base pair fragment by PCR using oligonucleotides 756 + 758.
  • a sample of the isolated chromosomal DNA was ligated to pZerO2 vector DNA (Invitrogen) digested with HindIII at a 5:1 (insert:vector) molar ratio in a total volume of 10 µl at 22°C for 15 min using the Fast Link kit (Epicentre).
  • DNA was precipitated by addition of 100 µL 1-butanol and pelleted at 13,500 x g in a microcentrifuge for 5 min. Liquid was removed by aspiration, and the DNA was dried in a SpeedVac (Savant Instruments, Farmingdale, NY) for 5 min under low heat. The pellet was resuspended in 4 µl dH2O.
  • the resuspended DNA was transformed by electroporation into 0.04 ml E. coli DH10B competent cells (Invitrogen) at 2.5 kV, 25 µF, and 250 Ω. SOC medium was immediately added (0.96 ml) and the tube containing the transformed cells was incubated in a shaker for 1 hr at 37°C and 225 rpm. Colonies containing recombinant plasmids were selected on LB agar plates containing 50 µg/ml kanamycin sulfate (Sigma Chemicals, St. Louis, MO). Sufficient cells to give ca.
  • Membranes were placed on top of 3MM paper saturated with 1.0 M Tris-HCl, pH 7.0/1.5 M NaCl for 10 min. DNA was crosslinked to the filters by exposure to ultraviolet light in a Stratagene UV Stratalinker 2400 set to "auto crosslink" mode (Stratagene, La Jolla, CA). Cell debris was removed from the membranes by immersing in 3X SSC/0.1% SDS and wiping the surface with a wetted Kimwipe® (Kimberly-Clark Co., Roswell, GA), then incubating in the same solution heated to 65°C for 3 hr with agitation. Filters were rinsed with dH2O and used immediately or wrapped in SaranWrap® and stored at 4°C.
  • Hybridization, washing, and detection of the colony blots were performed as described above using the labeled PCR probe. Thirty-six positively hybridizing colonies were inoculated into 1 mL TB-kanamycin liquid medium in a 2-mL multiwell growth block and shaken at 37°C for 60 hr, 250 rpm. Plasmid DNA was prepared using the Pure Link kit from Invitrogen and resuspended in 25 µL Tris-HCl pH 8.5. A 1 µL sample of each plasmid was tested for presence of the BMTA gene by PCR as described previously. Fourteen out of the 36 plasmid isolates successfully amplified the 580-bp fragment.
  • Oligonucleotide primers were prepared containing 1) an NdeI site followed by the first 24 nucleotides of the (S)-transaminase gene (Oligo 763: 5'-GACATATTTAAATCATATGAGTTTAACAGTGCAAAAAATAAAC-3' (SEQ ID NO: 17)) and 2) the last 24 nucleotides of the (S)-transaminase gene (including stop codon) followed by a BamHI restriction site (antisense of the complementary strand; Oligo 764: 5'-GACATATTTAAATCCATGGGTTTAACAGTGCAAAAAATAAAC-3' (SEQ ID NO: 18); restriction sites are underlined).
  • High-fidelity PCR amplification of the B. megaterium BMTA gene was carried out in a 400 µL final volume with Z-Taq DNA polymerase (Takara) in vendor-supplied reaction buffer, 0.2 mM each deoxynucleotide triphosphate (dATP, dCTP, dGTP, and dTTP), 0.4 nM each oligonucleotide, 2.5 U polymerase, and 100 ng pZerO2-BMTA plasmid DNA.
  • the amplification conditions were as previously described.
  • the sample was applied to a 1.0% agarose gel and electrophoresed for 1.5 hr, 100 V.
  • the expected 1300-bp fragment was excised from the gel and purified using the QIAquick Gel Isolation kit. DNA concentration was adjusted to 100 ng/µL. [00137] Detailed DNA sequence analysis revealed that the BMTA gene contained an internal NdeI restriction site, so simultaneous digestion with this enzyme and BamHI was not possible. Instead, 2 µg of the amplified BMTA fragment was cleaved with 10 U BamHI for 1 hr, 37°C. Then 10 U NdeI was added and the sample incubated an additional 15 min at 37°C. After agarose gel electrophoresis, the 1300-bp fragment was visible as were two additional fragments of ca.
  • pBMS2004-BMTA was transformed into competent E. coli expression strain BL21 by electroporation as described above.
  • MT5-M2 medium contains Hy-Pea (Quest International) 2.0%; Tastone 154 (Quest), 1.85%; Na2HPO4, 0.6%; (NH4)2SO4, 0.125%; glycerol, 4.0%; pH adjusted to 7.2 with 10 N NaOH before autoclaving.
  • Strain SC16578 E. coli BL21(pBMS2004-BMTA) was used for the production of B. megaterium SC6394 S-transaminase. Enzyme production was the result of IPTG-induced activation of the appropriate promoter.
  • the four F2 flasks were pooled and the optical density (OD600) was measured. This was done by diluting the broth 20x into un-inoculated MT5 medium, and using the same medium as a blank. For the current example, the inoculum OD600 was 7.5 U/cm (0.375 U/cm at the 20x dilution). The 4 liters of pooled inoculum were then transferred to a 380-liter tank employing a working volume of 250 liters MT5-
  • IPTG added at a level of 50 µM (ca. log 3.5 hours)
  • the medium was batched with de-ionized water and adjusted to pH 7.2 with NaOH.
  • One-liter and 100-ml aliquots were dispensed to 4-liter and 500-ml flasks, respectively, and autoclaved at 121°C for 30 minutes.
  • kanamycin and magnesium sulfate were added to the medium after autoclaving as follows:
  • For magnesium sulfate, a 24.6% solution was prepared and filter-sterilized through a 0.2 µm Nalgene cellulose nitrate filter. The appropriate quantity of this 1000x solution was then added to each flask (100 µl to 100 ml, 1 ml to 1 liter). For kanamycin, a 5% solution was similarly prepared, filter-sterilized and dispensed to yield the desired final concentration (100 µl to 100 ml, 1 ml to 1 liter).
  • Kanamycin was added to the medium after autoclaving as follows: 12.5 g were dissolved in 500 ml de-ionized water, filter-sterilized, and added to a transfer bottle. The kanamycin solution was added to the tank medium just prior to inoculation (log M).
  • For IPTG, 2.98 g were dissolved in 500 ml de-ionized water, filter-sterilized, and added to a transfer bottle. The IPTG solution was added to the tank medium when growth reached an OD600 of ca. 0.8-1.2 U/cm and a CO2 off-gas value of ca. 0.08-0.16%.
  • TLC of the MTBE solution after extraction of the amine with sulfuric acid (silica gel with DCM-MeOH, 19:1) showed a mixture of decanoic acid (Rf 0.47), N-1-cyclopropylethyl decanamide (Rf 0.74) and ethyl decanoate (Rf 0.85) (Rydon-Smith detection: the ester and acid give light zones on a dark background after a few minutes whereas the amide gives a black zone).
  • the MTBE solution was extracted with water adjusting the mixture to pH 12.3 with NaOH to remove the acid. Concentration of the resulting organic phase gave 45.2 g of the ester-amide mixture as a waxy solid.
  • a reference sample of the (RS)-N-1-cyclopropylethyl decanamide was prepared from (RS)-1-cyclopropylethylamine and decanoyl chloride. It melted at 52-53°C, indicating that the racemate is a conglomerate.
  • Chiral chromatography of (RS)-N-1-cyclopropylethyl decanamide was done on a 50 x 4.6 mm Chiralpak AD-H (5 µm) column, eluting at 1 mL/min with hexanes-MeOH, 99:1, and monitoring at 200 nm. The (R)-amide eluted at 7.5 min and the (S)-amide at 8.0 min.
  • Cal B lipase (lyophile from Candida antarctica, 50 mg, from Biocatalytics, Inc.) was added to the vial.
  • the vial was placed in the well of a multiwell plate and shaken at 500 rpm at 25°C. [00161] After 24 and 120 hrs, 50 µl samples were withdrawn for dansylation and analysis. Acetonitrile (0.5 ml) and 50 mM sodium carbonate solution (0.5 ml, pH 9.5) were added to the sample followed by 0.5 ml of dansyl chloride solution. The reaction mixtures were mixed on a microplate shaker at 300 rpm for 30 min. A solution of 200 mM NH4OH (0.1 ml) was added and again mixed for 60 min.
  • Reversed Phase HPLC to determine the extent of conversion was done as follows. Column: YMC pack Pro C18, 150 x 4.6 mm, 3 µm, Waters. Solvent: A (0.05% TFA in Water:Methanol 80:20), B (0.05% TFA in Acetonitrile:Methanol 80:20). Gradient from 0% B to 100% B in 20 minutes. Flow rate: 1 ml/min. Temperature: 40°C. Detection: UV, 220 nm.
  • the enantiomeric composition of the dansyl derivatives was determined by HPLC on a chiral reversed phase column Chiralpak AS-RH, 150 x 4.6 mm, 5 µm, Chiral Technologies Inc., using isocratic mixture of 76% solvent A (0.05% TFA in Water:Methanol 80:20) and 24% solvent B (0.05% TFA in Acetonitrile:Methanol
  • Sodium pyruvate (750 g, 6.816 moles) was added to the reactor, and rinsed with 100 ml water.
  • the E. coli suspension was added to the reactor and rinsed in with 200 ml water.
  • the pH was adjusted to 8.0 with conc. H3PO4 and/or 25% NaOH, and the final volume was brought to 15 L with water.
  • the reaction was run at 30°C, 100 rpm, pH 8. No further adjustment or control of pH was necessary. The reaction was continued for 23 h until the ee was >99%.
  • the mixture was adjusted to pH 12-13 with 50% (19 N) NaOH (~1500 g).
  • Dow Corning antifoam (100 mL) was added and the jacket temperature was increased to 130°C to distill at 1 atm.
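
Several of the quantities quoted in the excerpts above can be cross-checked with simple arithmetic. The following is a minimal Python sketch, not part of the patent: the molecular weights are standard values assumed here, the helper names are hypothetical, and it is assumed that the 2.98 g IPTG charge goes into the 250-liter working volume mentioned above.

```python
# Molecular weights in g/mol: standard values assumed here, not stated in the text.
MW_SEC_BUTYLAMINE = 73.14
MW_SODIUM_PYRUVATE = 110.04
MW_IPTG = 238.30


def mmol(grams, mw):
    """Convert a mass in grams to millimoles."""
    return grams / mw * 1000.0


def vvm(air_l_per_min, liquid_volume_l):
    """Aeration rate: volumes of air per volume of liquid per minute."""
    return air_l_per_min / liquid_volume_l


def final_conc_um(grams, mw, tank_volume_l):
    """Final micromolar concentration of a solid charged to a tank."""
    return grams / mw / tank_volume_l * 1e6


# sec-Butylamine resolution: 30 g amine and 90.23 g sodium pyruvate.
amine_mmol = mmol(30.0, MW_SEC_BUTYLAMINE)
pyruvate_mmol = mmol(90.23, MW_SODIUM_PYRUVATE)
print(f"amine: {amine_mmol:.0f} mmol, pyruvate: {pyruvate_mmol:.0f} mmol")
print(f"pyruvate equivalents: {pyruvate_mmol / amine_mmol:.2f}")

# Air sparged at 6 L/min into a 3 L reaction volume.
print(f"aeration: {vvm(6.0, 3.0):.1f} vvm")

# 2.98 g IPTG into an assumed 250 L working volume.
print(f"IPTG: {final_conc_um(2.98, MW_IPTG, 250.0):.1f} uM")
```

Under these assumptions the pyruvate charge is about two molar equivalents of the racemic amine, which matches the 410 and 820 mmole figures quoted in the text.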

Abstract

The invention relates to processes for the enzymatic, stereoselective resolution of racemic amines to provide chiral amines.

Description

STEREOSELECTIVE RESOLUTION OF RACEMIC AMINES
FIELD OF THE INVENTION
[0001] The invention relates to a novel process for the enzymatic resolution of racemic amines. In particular, the process provides for the catalytic enzymatic resolution of an amine into its (R)- and/or (S)-isomer. The (R)- or (S)-isomers produced in accordance with the process of the invention are precursors of molecules therapeutically useful as inhibitors of Corticotropin Releasing Factor for treatment of anxiety.
BACKGROUND OF THE INVENTION
[0002] Chirality is a factor to be considered with respect to the efficacy of many drugs and agrochemicals. The production of single enantiomers of chiral intermediates has become increasingly important. Single enantiomers can be produced by chemical or chemoenzymatic synthesis, and biocatalysis, the latter being the emphasis herein. Biocatalysis has many advantages over chemical synthesis which include the enantioselective and regioselective nature of enzyme-catalyzed reactions, and the ability of biocatalysts to carry out biocatalytic reactions at ambient temperature and atmospheric pressure. Biocatalysts avoid problems in isomerization, racemization, epimerization, and rearrangement often associated with the use of extreme conditions in chemical syntheses. Furthermore, microbial cells expressing an enzyme of interest, and the enzymes themselves, can be immobilized and reused for multiple biocatalytic reactions. The enzymes may be over-expressed to make biocatalytic processes economically efficient, and enzymes with modified activity/properties can be readily made by recombinant techniques. The present invention provides novel processes for the enzymatic resolution of racemic amines using either a transaminase or a lipase.
SUMMARY OF THE INVENTION
[0003] In accordance with the present invention, stereoselective processes for preparing compounds of Formula Ia or Ib are provided
Figure imgf000003_0001
Ia Ib wherein R1 and R2 are as defined below.
[0004] In one embodiment, the enzyme utilized in the stereoselective process is a transaminase. In another embodiment, the enzyme utilized in the stereoselective process is a lipase.
[0005] In addition, the present invention provides a transaminase polypeptide, encoded by the polynucleotide of SEQ ID NO: 1 and having the encoded amino acid sequences of SEQ ID NO:2, or a functional or biologically active portion of these sequences.
[0006] In another aspect, the present invention also provides an isolated transaminase polynucleotide as depicted in SEQ ID NO: 1. In another aspect, the present invention provides a polynucleotide sequence comprising the complement of SEQ ID NO: 1, or variants thereof. In addition, an object of the invention encompasses variations or modifications of the transaminase sequence which is a result of degeneracy of the genetic code, where the polynucleotide sequences can hybridize under moderate or high stringency conditions to the polynucleotide sequence of SEQ ID NO: 1. In another aspect, the present invention provides an isolated nucleic acid molecule comprising a polynucleotide sequence encoding a polypeptide comprising SEQ ID NO:2.
[0007] In another aspect, the present invention provides compositions comprising the transaminase polynucleotide sequence, or fragments thereof, or the encoded transaminase polypeptide, or fragments or portions thereof.
[0008] In another aspect, the present invention also provides expression vectors and host cells comprising polynucleotides that encode the transaminase polypeptide of the invention.
DETAILED DESCRIPTION OF THE INVENTION
[0009] In one embodiment, the present invention provides a process for the preparation of a compound of Formula Ia or Ib
Figure imgf000004_0001
Ia Ib wherein R1 is alkyl, aryl, or heterocyclic; and R2 is cycloalkyl or alkyl; comprising resolving a racemic compound of Formula I
Figure imgf000004_0002
I by a reaction catalyzed by an enzyme produced by a microorganism from the group consisting of Bacillus, Candida or Pseudomonas.
[0010] In one embodiment, the strain of Bacillus (source of transaminase enzyme) is Bacillus megaterium. In another embodiment, the strain of Pseudomonas (source of transaminase enzyme) is Pseudomonas sp. In a further embodiment, the strain of Candida (source of lipase enzyme) is Candida antarctica.
[0011] In one embodiment, the enzyme is a transaminase. In another embodiment, the enzyme is a lipase.
[0012] In a further embodiment, the enzyme is the transaminase according to SEQ ID NO:2.
[0013] In another embodiment, processes for preparing compounds of formula Ia or Ib are provided wherein the reaction with an enzyme is carried out either by: introducing a racemic compound of formula I into a medium in which the microorganism is being fermented to form a reaction mixture in which the enzyme is concurrently being formed and catalytically reacts with the racemic compound; or fermenting the microorganism until sufficient growth is realized, and introducing the racemic compound to the microorganism in which the racemic compound of formula I is catalytically reacted with the enzyme.
[0014] In still another embodiment, processes for preparing compounds of formula Ia or Ib are provided wherein the amount of the racemic compound of formula I added to the reaction mixture is up to about 50 g/L of the reaction mixture.
[0015] In still yet another embodiment, processes for preparing compounds of formula Ia or Ib are provided wherein an enzyme is isolated and optionally purified.
[0016] In one embodiment, processes for preparing compounds of formula Ia or Ib are provided wherein the reaction catalyzed by an enzyme is carried out by reacting the racemic compound of formula I with the enzyme that was previously isolated and optionally purified before contacting with the racemic compound.
[0017] In another embodiment, processes for preparing compounds of formula Ia or Ib are provided wherein the enzyme is derived from cell extracts.
[0018] In still another embodiment, processes for preparing compounds of formula Ia or Ib are provided wherein the enzyme is expressed by a plasmid transformed into E. coli host cells.
[0019] In still yet another embodiment, processes for preparing compounds of formula Ia or Ib are provided wherein the enzyme is obtained from Bacillus megaterium (source of transaminase enzyme), Candida antarctica (source of lipase enzyme) or Pseudomonas sp. (source of transaminase enzyme). In one embodiment, the enzyme is obtained from Bacillus megaterium strain SC6394.
[0020] In another embodiment, the enzyme is a transaminase that is expressed by a gene having characteristics selected from SEQ ID NO: 1. In another embodiment, the enzyme comprises an amino acid sequence comprising SEQ ID NO:2.
[0021] In another embodiment, processes for preparing compounds of formula Ia or Ib are provided wherein the enzyme provides a reaction yield of greater than 42% by weight of the compound of formula Ia or Ib, based on the weight of the racemic amine input.
[0022] In still another embodiment, processes for preparing compounds of formula Ia or Ib are provided wherein the process provides the compounds of formula Ia or Ib in an enantiomeric excess greater than 95%.
[0023] In yet another embodiment, processes for preparing compounds of formula Ia or Ib are provided wherein the reaction catalyzed by an enzyme is carried out at a pH of between about 5.0 and about 9.0.
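
The yield and enantiomeric excess thresholds in the embodiments above can be illustrated with a short calculation. The sketch below is an illustration only: the function names and the example peak areas and masses are hypothetical, and the enantiomeric excess is computed with the conventional definition (major minus minor, divided by the total). Because a resolution removes one enantiomer of the racemate, the yield of the remaining enantiomer cannot exceed 50% of the racemic input.

```python
def enantiomeric_excess(major, minor):
    """Enantiomeric excess in percent from the amounts (or HPLC peak
    areas) of the major and minor enantiomers."""
    total = major + minor
    if total <= 0:
        raise ValueError("total amount must be positive")
    return (major - minor) / total * 100.0


def resolution_yield(product_mass, racemate_mass):
    """Yield in percent by weight of resolved amine, based on the
    weight of racemic amine input (theoretical maximum: 50%)."""
    if racemate_mass <= 0:
        raise ValueError("racemate mass must be positive")
    return product_mass / racemate_mass * 100.0


# Hypothetical peak areas for the (R)- and (S)-amine derivatives.
ee = enantiomeric_excess(major=98.0, minor=2.0)
print(f"ee = {ee:.1f}%  (threshold in the text: > 95%)")

# Hypothetical isolation of 13.5 g resolved amine from 30 g racemate.
y = resolution_yield(product_mass=13.5, racemate_mass=30.0)
print(f"yield = {y:.1f}%  (threshold in the text: > 42%)")
```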
[0024] In one embodiment, processes for preparing compounds of formula Ia or Ib are provided wherein the compound of formula Ia or Ib is
Figure imgf000006_0001
Ia* (S-amine) Ib* (R-amine) and the compound of formula I is
Figure imgf000006_0002
I*.
[0025] In one embodiment, the present invention provides processes for the preparation of an enzyme for the preparation of compounds of formula Ia* or Ib*
Figure imgf000006_0003
Ia* (S-amine) Ib* (R-amine) from a compound of formula I*
Figure imgf000006_0004
I* comprising (a) either
(i) providing a microorganism selected from the group consisting of Bacillus, Candida, E. coli and Pseudomonas in a growth medium under conditions which allow for expression of an enzyme, or (ii) introducing a gene which encodes for the enzyme into a host microorganism for recombinant expression, introducing the host microorganism in a growth medium under conditions which allow for expression of the enzyme and allowing it to grow and express the enzyme;
(b) optionally, extracting the enzyme from the growth medium; and (c) optionally, purifying the enzyme.
[0026] In another embodiment, processes for the preparation of an enzyme for the preparation of compounds of formula Ia* or Ib* are provided wherein the process of extracting the enzyme comprises lysing the cells of the microorganism and isolating the enzyme.
[0027] In still another embodiment, processes for the preparation of an enzyme for the preparation of compounds of formula Ia* or Ib* are provided wherein the process of purifying the enzyme comprises ion-exchange, hydrophobic, and hydroxyapatite chromatography.
[0028] In one embodiment, processes for preparing compounds of formula Ia* or Ib* are provided wherein
(a) either
(i) providing a microorganism in a growth medium under conditions which allow for expression of an enzyme, or (ii) introducing a gene which encodes for the enzyme into a host microorganism for recombinant expression, introducing the host microorganism in a growth medium under conditions which allow for expression of the enzyme and allowing it to grow and express the enzyme; and
(b) catalytically reacting the enzyme with a compound of formula I* to produce the desired compounds.
[0029] In another embodiment, the present invention provides processes for preparing a compound of formula Ia* which comprises reacting a compound of the formula I* with Novozym 435 and ethyl caprate in MTBE (methyl t-butyl ether) to afford Compound Ia*.
[0030] In another embodiment, the present invention provides processes for preparing a compound of formula Ib* (R amine) which comprises reacting a compound of the formula I* with enzyme in the presence of potassium phosphate and sodium pyruvate to convert Ia* to a ketone.
[0031] In another embodiment, the present invention provides processes for preparing (R)-sec-butylamine by the enzymatic resolution of racemic sec-butylamine.
[0032] In another embodiment, the present invention provides processes for preparing (R)-sec-butylamine by the enzymatic resolution of racemic sec-butylamine, wherein the enzyme is a transaminase from B. megaterium expressed in E. coli.
[0033] In another embodiment, the present invention provides processes for preparing (R)-sec-butylamine which comprises reacting racemic sec-butylamine with a transaminase from B. megaterium expressed in E. coli in the presence of potassium phosphate and sodium pyruvate.
[0034] The invention may be embodied in other specific forms without departing from the spirit or essential attributes thereof. This invention also encompasses all combinations of alternative aspects of the invention noted herein. It is understood that any and all embodiments of the present invention may be taken in conjunction with any other embodiment to describe additional embodiments of the present invention. Furthermore, any elements of an embodiment may be combined with any and all other elements from any of the embodiments to describe additional embodiments.
DEFINITIONS
[0035] The term "alkyl" refers to straight or branched chain hydrocarbon groups or radicals having 1 to 6 carbon atoms, such as methyl, ethyl, n-propyl, i-propyl, n-butyl, i-butyl, t-butyl, pentyl, hexyl, cycloalkyl having 3 to 6 carbon atoms, or any subset of the foregoing, any of which may be optionally substituted.
[0036] The term "cycloalkyl" as employed herein alone or as part of another group includes saturated or partially unsaturated (containing 1 or 2 double bonds) cyclic hydrocarbon groups containing 1 to 10 rings, preferably 1 to 3 rings, including monocyclic alkyl, bicyclic alkyl (or bicycloalkyl) and tricyclic alkyl, containing a total of 3 to 20 carbons forming the ring, preferably 3 to 15 carbons, more preferably 3 to 10 carbons, forming the ring and which may be fused to 1 or 2 aromatic rings as described for aryl, which includes cyclopropyl, cyclobutyl, cyclopentyl, cyclohexyl, cycloheptyl, cyclooctyl, cyclodecyl, cyclododecyl, cyclohexenyl,
Figure imgf000009_0001
any of which groups may be optionally substituted with 1 to 4 substituents such as halogen, alkyl, alkoxy, hydroxy, aryl, aryloxy, arylalkyl, cycloalkyl, alkylamido, alkanoylamino, oxo, acyl, arylcarbonylamino, amino, nitro, cyano, thiol, and/or alkylthio, and/or any of the substituents for alkyl.
[0037] "Amino acid sequence" as used herein can refer to an oligopeptide, peptide, polypeptide, or protein sequence, and fragments or portions thereof, as well as to naturally occurring or synthetic molecules, preferably isolated polypeptides of the transaminase. Amino acid sequence fragments are typically from about 4 to about 30, preferably from about 5 to about 15 amino acids in length. The transaminase amino acid sequence of this invention is set forth in SEQ ID NO:2. The terms "transaminase polypeptide" and "transaminase protein" are used interchangeably herein to refer to the encoded products of the transaminase nucleic acid sequence according to the present invention.
[0038] Isolated transaminase polypeptide refers to the amino acid sequence of substantially purified transaminase, which may be obtained from any bacterial species, preferably Bacillus or Pseudomonas sp., and from a variety of sources, including natural, synthetic, semi-synthetic, or recombinant. More particularly, the transaminase polypeptide of this invention is identified in SEQ ID NO:2. Functional fragments of the transaminase polypeptide are also embraced by the present invention.
[0039] "Similar" amino acids are those which have the same or similar physical properties and in many cases, the function is conserved with similar residues. For example, amino acids lysine and arginine are similar; while residues such as proline and cysteine are not considered to be similar.
[0040] The term "consensus" refers to a sequence that reflects the most common choice of base or amino acid at each position among a series of related DNA, RNA or protein sequences. Areas of particularly good agreement often represent conserved functional domains.
[0041] A "variant" of a transaminase polypeptide refers to an amino acid sequence that is altered by one or more amino acids. The variant may have "conservative" changes, in which a substituted amino acid has similar structural or chemical properties, e.g., replacement of leucine with isoleucine. More rarely, a variant may have "non-conservative" changes, for example, replacement of a glycine with a tryptophan. The encoded protein may also contain deletions, insertions, or substitutions of amino acid residues, which produce a silent change and result in a functionally equivalent transaminase protein. Deliberate amino acid substitutions may be made on the basis of similarity in polarity, charge, solubility, hydrophobicity, hydrophilicity, and/or the amphipathic nature of the residues, as long as the biological activity of transaminase protein is retained. For example, negatively charged amino acids may include aspartic acid and glutamic acid; positively charged amino acids may include lysine and arginine; and amino acids with uncharged polar head groups having similar hydrophilicity values may include leucine, isoleucine, and valine; glycine and alanine; asparagine and glutamine; serine and threonine; and phenylalanine and tyrosine. Guidance in determining which amino acid residues may be substituted, inserted, or deleted without abolishing functional biological or immunological activity may be found using computer programs well known in the art, for example, DNASTAR, Inc. software (Madison, WI).
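
As an illustration of the substitution groups listed in this paragraph, the following minimal Python sketch encodes them as sets of one-letter amino acid codes and tests whether a substitution stays within a group. The function name and data layout are hypothetical, and the grouping simply mirrors the residue pairs named in the text; it is not a substitution matrix or any tool referenced in the patent.

```python
# Groups of residues treated as mutually similar, taken from the
# pairs and triplets listed in the paragraph above (one-letter codes).
SIMILAR_GROUPS = [
    {"D", "E"},       # aspartic acid, glutamic acid
    {"K", "R"},       # lysine, arginine
    {"L", "I", "V"},  # leucine, isoleucine, valine
    {"G", "A"},       # glycine, alanine
    {"N", "Q"},       # asparagine, glutamine
    {"S", "T"},       # serine, threonine
    {"F", "Y"},       # phenylalanine, tyrosine
]


def is_conservative(original, replacement):
    """Return True if both residues fall in the same group above."""
    original, replacement = original.upper(), replacement.upper()
    if original == replacement:
        return True
    return any(original in g and replacement in g for g in SIMILAR_GROUPS)


# The two examples given in the text.
print(is_conservative("L", "I"))  # leucine -> isoleucine
print(is_conservative("G", "W"))  # glycine -> tryptophan
```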
[0042] "Nucleic acid or polynucleotide sequence", as used herein, refers to an isolated oligonucleotide ("oligo"), nucleotide, or polynucleotide, and fragments thereof, and to DNA or RNA of genomic or synthetic origin which may be single- or double-stranded, and represent the sense or anti-sense strand, preferably of the transaminase. By way of non-limiting examples, fragments include nucleic acid sequences that are greater than 20-60 nucleotides in length, and preferably include fragments that are at least 70-100 nucleotides, or which are at least 1000 nucleotides or greater in length. The transaminase nucleic acid sequence of this invention is specifically identified in SEQ ID NO: 1.
[0043] An "allele" or "allelic sequence" is an alternative form of the transaminase nucleic acid sequence. Alleles may result from at least one mutation in the transaminase nucleic acid sequence and may yield altered mRNAs or polypeptides whose structure or function may or may not be altered. Any given gene, whether natural or recombinant, may have none, one, or many allelic forms. Common mutational changes, which give rise to alleles, are generally ascribed to natural deletions, additions, or substitutions of nucleotides. Each of these types of changes may occur alone, or in combination with the others, one or more times in a given sequence.
[0044] "Oligonucleotides" or "oligomers", as defined herein, refer to a transaminase nucleic acid sequence comprising contiguous nucleotides, of at least about 5 nucleotides to about 60 nucleotides, preferably at least about 8 to 10 nucleotides in length, more preferably at least about 12 nucleotides in length, for example, about 15 to 35 nucleotides, or about 15 to 25 nucleotides, or about 20 to 35 nucleotides, which can be typically used in PCR amplification assays, hybridization assays, or in microarrays. It will be understood that the term oligonucleotide is substantially equivalent to the terms primer, probe, or amplimer, as commonly defined in the art.
[0045] The term "antisense" refers to nucleotide sequences, and compositions containing nucleic acid sequences, which are complementary to a specific DNA or RNA sequence. The term "antisense strand" is used in reference to a nucleic acid strand that is complementary to the "sense" strand. Antisense (i.e., complementary) nucleic acid molecules include PNAs and may be produced by any method, including synthesis or transcription. Once introduced into a cell, the complementary nucleotides combine with natural sequences produced by the cell to form duplexes, which block either transcription or translation. The designation "negative" is sometimes used in reference to the antisense strand, and "positive" is sometimes used in reference to the sense strand.
[0046] "Altered" nucleic acid sequences encoding the transaminase polypeptide include nucleic acid sequences containing deletions, insertions and/or substitutions of different nucleotides resulting in a polynucleotide that encodes the same or a functionally equivalent transaminase polypeptide (i.e., having transaminase activity). Altered nucleic acid sequences may further include polymorphisms of the polynucleotide encoding a transaminase polypeptide; such polymorphisms may or may not be readily detectable using a particular oligonucleotide probe.
[0047] The term "biologically active", i.e., functional, refers to a protein or polypeptide or fragment thereof, having structural, regulatory, or biochemical functions of a naturally occurring transaminase molecule.
[0048] The terms "complementary" or "complementarity" refer to the natural binding of polynucleotides under permissive salt and temperature conditions by base pairing. For example, the sequence "A-G-T" binds to the complementary sequence "T-C-A". Complementarity between two single-stranded molecules may be "partial", in which only some of the nucleic acid bases pair, or it may be "complete" when total complementarity exists between the single-stranded molecules. The degree of complementarity between nucleic acid strands has significant effects on the efficiency and strength of hybridization between nucleic acid strands. This is of particular importance in amplification reactions, which depend upon binding between nucleic acid strands.
[0049] The term "homology" refers to a degree of complementarity. There may be partial homology or complete homology, wherein complete homology is equivalent to identity. A partially complementary sequence that at least partially inhibits an identical sequence from hybridizing to a target nucleic acid is referred to by the functional term "substantially homologous". The inhibition of hybridization of the completely complementary sequence to the target sequence may be examined using a hybridization assay (for example, Southern or Northern blot, solution hybridization, and the like) under conditions of low stringency. A substantially homologous sequence or probe will compete for and inhibit the binding (i.e., the hybridization) of a completely homologous sequence or probe to the target sequence under conditions of low stringency. Nonetheless, conditions of low stringency do not permit non-specific binding; low stringency conditions require that the binding of two sequences to one another be a specific (i.e., selective) interaction.
The absence of non-specific binding may be tested by the use of a second target sequence which lacks even a partial degree of complementarity (for example, less than about 30% identity). In the absence of non-specific binding, the probe will not hybridize to the second non-complementary target sequence. In one embodiment, the present invention encompasses any nucleic acid or polypeptide that is at least 98.4% homologous to the nucleic acid and polypeptide sequences of SEQ ID NO: 1 or SEQ ID NO:2, respectively.
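The distinction between "complete" and "partial" complementarity drawn above can be illustrated with a minimal Python sketch. It follows the position-by-position notation of the "A-G-T"/"T-C-A" example, ignores strand orientation and gapped alignments, and uses hypothetical function names (`complement`, `fraction_paired`) that do not appear in the specification.

```python
# Watson-Crick pairing rules for DNA bases.
PAIR = {"A": "T", "T": "A", "G": "C", "C": "G"}


def complement(seq: str) -> str:
    """Return the base-by-base complement of a DNA sequence (hyphens are ignored)."""
    return "".join(PAIR[base] for base in seq.upper() if base != "-")


def fraction_paired(strand_a: str, strand_b: str) -> float:
    """Fraction of positions at which two equal-length strands can base pair.

    1.0 corresponds to "complete" complementarity; any lower value is "partial".
    """
    a = strand_a.replace("-", "").upper()
    b = strand_b.replace("-", "").upper()
    if not a or len(a) != len(b):
        raise ValueError("strands must be non-empty and of equal length")
    return sum(PAIR[x] == y for x, y in zip(a, b)) / len(a)


if __name__ == "__main__":
    print(complement("A-G-T"))              # complement of the example sequence
    print(fraction_paired("AGT", "TCA"))    # complete complementarity
    print(fraction_paired("AGT", "TCG"))    # partial complementarity
```

A real assessment of substantial homology would rely on a hybridization assay or a sequence alignment, as the text describes; the sketch only makes the terminology concrete.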
[0050] As a practical matter, whether any particular nucleic acid molecule or polypeptide is at least about 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, 99.1%, 99.2%, 99.3%, 99.4%, 99.5%, 99.6%, 99.7%, 99.8%, or 99.9% identical to a nucleotide sequence of the present invention can be determined conventionally using known computer programs. A preferred method for determining the best overall match between a query sequence (a sequence of the present invention) and a subject sequence, also referred to as a global sequence alignment, can be determined using the CLUSTALW computer program (Thompson, J.D. et al., Nucleic Acids Research, 22(22):4673-4680 (1994)), which is based on the algorithm of Higgins, D.G. et al., Computer Applications in the Biosciences (CABIOS), 8(2):189-191 (1992). In a sequence alignment the query and subject sequences are both DNA sequences. An RNA sequence can be compared by converting U's to T's. However, the CLUSTALW algorithm automatically converts U's to T's when comparing RNA sequences to DNA sequences. The result of said global sequence alignment is in percent identity. Preferred parameters used in a CLUSTALW alignment of DNA sequences to calculate percent identity via pairwise alignments are: Matrix=IUB, k-tuple=1, Number of Top Diagonals=5, Gap Penalty=3, Gap Open Penalty=10, Gap Extension Penalty=0.1, Scoring Method=Percent, Window Size=5 or the length of the subject nucleotide sequence, whichever is shorter. For multiple alignments, the following CLUSTALW parameters are preferred: Gap Opening Penalty=10; Gap Extension Parameter=0.05; Gap Separation Penalty Range=8; End Gap Separation Penalty=Off; % Identity for Alignment Delay=40%; Residue Specific Gaps=Off; Hydrophilic Residue Gap=Off; and Transition Weighting=0. The pairwise and multiple alignment parameters provided for CLUSTALW above represent the default parameters as provided with the AlignX software program (Vector NTI suite of programs, version 6.0).
[0051] The present invention encompasses the application of a manual correction to the percent identity results, in the instance where the subject sequence is shorter than the query sequence because of 5' or 3' deletions, not because of internal deletions. If only the local pairwise percent identity is required, no manual correction is needed. However, a manual correction may be applied to determine the global percent identity from a global polynucleotide alignment. Percent identity calculations based upon global polynucleotide alignments are often preferred since they reflect the percent identity between the polynucleotide molecules as a whole (i.e., including any polynucleotide overhangs, not just overlapping regions), as opposed to, only local matching polynucleotides. Manual corrections for global percent identity determinations are required since the CLUSTALW program does not account for 5' and 3' truncations of the subject sequence when calculating percent identity. For subject sequences truncated at the 5' or 3' ends, relative to the query sequence, the percent identity is corrected by calculating the number of bases of the query sequence that are 5' and 3' of the subject sequence, which are not matched/aligned, as a percent of the total bases of the query sequence. Whether a nucleotide is matched/aligned is determined by results of the CLUSTALW sequence alignment. This percentage is then subtracted from the percent identity, calculated by the above CLUSTALW program using the specified parameters, to arrive at a final percent identity score. This corrected score may be used for the purposes of the present invention. Only bases outside the 5' and 3' bases of the subject sequence, as displayed by the CLUSTALW alignment, which are not matched/aligned with the query sequence, are calculated for the purposes of manually adjusting the percent identity score. 
[0052] For example, a 90 base subject sequence is aligned to a 100 base query sequence to determine percent identity. The deletions occur at the 5' end of the subject sequence and therefore, the CLUSTALW alignment does not show a matched/alignment of the first 10 bases at 5' end. The 10 unpaired bases represent 10% of the sequence (number of bases at the 5' and 3' ends not matched/total number of bases in the query sequence) so 10% is subtracted from the percent identity score calculated by the CLUSTALW program. If the remaining 90 bases were perfectly matched the final percent identity would be 90%. In another example, a 90 base subject sequence is compared with a 100 base query sequence. This time the deletions are internal deletions so that there are no bases on the 5' or 3' of the subject sequence which are not matched/aligned with the query. In this case the percent identity calculated by CLUSTALW is not manually corrected. Once again, only bases 5' and 3' of the subject sequence which are not matched/aligned with the query sequence are manually corrected for. No other manual corrections are required for the purposes of the present invention.
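The manual correction described in the two preceding paragraphs reduces to one subtraction, sketched below in Python. The function name and argument names are hypothetical, and the sketch assumes that the identity reported by the alignment program covers only the aligned region, as the worked example implies.

```python
def corrected_percent_identity(reported_identity: float,
                               query_length: int,
                               unmatched_5prime: int = 0,
                               unmatched_3prime: int = 0) -> float:
    """Apply the terminal-truncation correction to a reported percent identity.

    reported_identity -- percent identity reported by the alignment program
    query_length      -- total number of bases in the query sequence
    unmatched_5prime  -- query bases lying 5' of the subject that are not matched/aligned
    unmatched_3prime  -- query bases lying 3' of the subject that are not matched/aligned

    Internal deletions are not counted here; only terminal overhangs of the
    query are penalized.
    """
    if query_length <= 0:
        raise ValueError("query_length must be positive")
    unmatched = unmatched_5prime + unmatched_3prime
    if unmatched < 0 or unmatched > query_length:
        raise ValueError("unmatched terminal bases must lie between 0 and query_length")
    penalty = 100.0 * unmatched / query_length
    return reported_identity - penalty


if __name__ == "__main__":
    # 90-base subject, 100-base query, 10 bases missing at the 5' end,
    # remaining 90 bases perfectly matched.
    print(corrected_percent_identity(100.0, 100, unmatched_5prime=10))
    # Internal deletions only: no terminal overhang, so no correction is applied.
    print(corrected_percent_identity(90.0, 100))
```

The same arithmetic applies to polypeptide alignments if residues and N- or C-terminal overhangs are substituted for bases and 5' or 3' overhangs.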
[0053] By a polypeptide having an amino acid sequence at least, for example, 95% "identical" to a query amino acid sequence of the present invention, it is intended that the amino acid sequence of the subject polypeptide is identical to the query sequence except that the subject polypeptide sequence may include up to five amino acid alterations for each 100 amino acids of the query amino acid sequence. In other words, to obtain a polypeptide having an amino acid sequence at least 95% identical to a query amino acid sequence, up to 5% of the amino acid residues in the subject sequence may be inserted, deleted, or substituted with another amino acid. These alterations of the reference sequence may occur at the amino- or carboxy-terminal positions of the reference amino acid sequence or anywhere between those terminal positions, interspersed either individually among residues in the reference sequence or in one or more contiguous groups within the reference sequence.
[0054] As a practical matter, whether any particular polypeptide is at least about 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, 99.1%, 99.2%, 99.3%, 99.4%, 99.5%, 99.6%, 99.7%, 99.8%, or 99.9% identical to SEQ ID NO:2 can be determined conventionally using known computer programs. A preferred method for determining the best overall match between a query sequence (a sequence of the present invention) and a subject sequence, also referred to as a global sequence alignment, can be determined using the CLUSTALW computer program (Thompson, J.D. et al., Nucleic Acids Research, 22(22):4673-4680 (1994)), which is based on the algorithm of Higgins, D.G. et al., Computer Applications in the Biosciences (CABIOS), 8(2):189-191 (1992). In a sequence alignment the query and subject sequences are both amino acid sequences. The result of said global sequence alignment is in percent identity.
Preferred parameters used in a CLUSTALW alignment of amino acid sequences to calculate percent identity via pairwise alignments are: Matrix=BLOSUM, k-tuple=1, Number of Top Diagonals=5, Gap Penalty=3, Gap Open Penalty=10, Gap Extension Penalty=0.1, Scoring Method=Percent, Window Size=5 or the length of the subject amino acid sequence, whichever is shorter. For multiple alignments, the following CLUSTALW parameters are preferred: Gap Opening Penalty=10; Gap Extension Parameter=0.05; Gap Separation Penalty Range=8; End Gap Separation Penalty=Off; % Identity for Alignment Delay=40%; Residue Specific Gaps=Off; Hydrophilic Residue Gap=Off; and Transition Weighting=0. The pairwise and multiple alignment parameters provided for CLUSTALW above represent the default parameters as provided with the AlignX software program (Vector NTI suite of programs, version 6.0).
[0055] The present invention encompasses the application of a manual correction to the percent identity results, in the instance where the subject sequence is shorter than the query sequence because of N- or C-terminal deletions, not because of internal deletions. If only the local pairwise percent identity is required, no manual correction is needed. However, a manual correction may be applied to determine the global percent identity from a global polypeptide alignment. Percent identity calculations based upon global polypeptide alignments are often preferred since they reflect the percent identity between the polypeptide molecules as a whole (i.e., including any polypeptide overhangs, not just overlapping regions), as opposed to only local matching polypeptides. Manual corrections for global percent identity determinations are required since the CLUSTALW program does not account for N- and C-terminal truncations of the subject sequence when calculating percent identity.
For subject sequences truncated at the N- and C-termini, relative to the query sequence, the percent identity is corrected by calculating the number of residues of the query sequence that are N- and C-terminal of the subject sequence, which are not matched/aligned with a corresponding subject residue, as a percent of the total residues of the query sequence. Whether a residue is matched/aligned is determined by results of the CLUSTALW sequence alignment. This percentage is then subtracted from the percent identity, calculated by the above CLUSTALW program using the specified parameters, to arrive at a final percent identity score. This final percent identity score is what may be used for the purposes of the present invention. Only residues to the N- and C-termini of the subject sequence, which are not matched/aligned with the query sequence, are considered for the purposes of manually adjusting the percent identity score. That is, only query residue positions outside the farthest N- and C-terminal residues of the subject sequence are considered.
[0056] For example, a 90 amino acid residue subject sequence is aligned with a 100 residue query sequence to determine percent identity. The deletion occurs at the N-terminus of the subject sequence and therefore, the CLUSTALW alignment does not show a matching/alignment of the first 10 residues at the N-terminus. The 10 unpaired residues represent 10% of the sequence (number of residues at the N- and C- termini not matched/total number of residues in the query sequence) so 10% is subtracted from the percent identity score calculated by the CLUSTALW program. If the remaining 90 residues were perfectly matched the final percent identity would be 90%. In another example, a 90 residue subject sequence is compared with a 100 residue query sequence. This time the deletions are internal deletions so there are no residues at the N- or C-termini of the subject sequence, which are not matched/aligned with the query. In this case the percent identity calculated by CLUSTALW is not manually corrected. Once again, only residue positions outside the N- and C-terminal ends of the subject sequence, as displayed in the CLUSTALW alignment, which are not matched/aligned with the query sequence, are manually corrected for. No other manual corrections are required for the purposes of the present invention. [0057] In addition to the above method of aligning two or more polynucleotide or polypeptide sequences to arrive at a percent identity value for the aligned sequences, it may be desirable in some circumstances to use a modified version of the CLUSTALW algorithm which takes into account known structural features of the sequences to be aligned, such as for example, the SWISS-PROT designations for each sequence. The result of such a modified CLUSTALW algorithm may provide a more accurate value of the percent identity for two polynucleotide or polypeptide sequences. 
Support for such a modified version of CLUSTALW is provided within the CLUSTALW algorithm and would be readily appreciated by one of skill in the art of bioinformatics.
[0058] Also available to those having skill in this art are the BLAST and BLAST 2.0 algorithms (Altschul et al., Nucleic Acids Research, 25:3389-3402 (1997) and Altschul et al., J. Mol. Biol., 215:403-410 (1990)). The BLASTN program for nucleic acid sequences uses as defaults a wordlength (W) of 11, an expectation (E) of 10, M=5, N=4, and a comparison of both strands. For amino acid sequences, the BLASTP program uses as defaults a wordlength (W) of 3, and an expectation (E) of 10. The BLOSUM62 scoring matrix (Henikoff et al., Proc. Natl. Acad. Sci. USA, 89:10915 (1992)) uses alignments (B) of 50, expectation (E) of 10, M=5, N=4, and a comparison of both strands.
[0059] The invention encompasses polypeptides having a lower degree of identity but having sufficient similarity so as to perform one or more of the same functions performed by the polypeptide of the present invention (i.e., transaminase activity). Similarity is determined by conserved amino acid substitution. Such substitutions are those that substitute a given amino acid in a polypeptide by another amino acid of like characteristics (e.g., chemical properties). According to Cunningham et al. above, such conservative substitutions are likely to be phenotypically silent. Additional guidance concerning which amino acid changes are likely to be phenotypically silent is found in Bowie et al., Science, 247:1306-1310 (1990).
[0060] Tolerated conservative amino acid substitutions of the present invention involve replacement of the aliphatic or hydrophobic amino acids Ala, Val, Leu and Ile; replacement of the hydroxyl residues Ser and Thr; replacement of the acidic residues Asp and Glu; replacement of the amide residues Asn and Gln; replacement of the basic residues Lys, Arg, and His; replacement of the aromatic residues Phe, Tyr, and Trp; and replacement of the small-sized amino acids Ala, Ser, Thr, Met, and Gly.
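The substitution groups listed in this paragraph can be expressed as a simple lookup, sketched below in Python using standard three-letter residue codes. The group labels and function names are illustrative assumptions, and the sketch covers only the groups named in this paragraph, not the additional substitutions of Table IA.

```python
# Substitution groups as listed in the paragraph above (three-letter codes).
GROUPS = {
    "aliphatic/hydrophobic": {"Ala", "Val", "Leu", "Ile"},
    "hydroxyl": {"Ser", "Thr"},
    "acidic": {"Asp", "Glu"},
    "amide": {"Asn", "Gln"},
    "basic": {"Lys", "Arg", "His"},
    "aromatic": {"Phe", "Tyr", "Trp"},
    "small": {"Ala", "Ser", "Thr", "Met", "Gly"},
}


def shared_groups(original: str, replacement: str) -> list:
    """Return the names of all groups that contain both residues."""
    return [name for name, members in GROUPS.items()
            if original in members and replacement in members]


def is_conservative(original: str, replacement: str) -> bool:
    """True if the two residues differ and share at least one listed group."""
    return original != replacement and bool(shared_groups(original, replacement))


if __name__ == "__main__":
    print(is_conservative("Leu", "Ile"))   # both aliphatic/hydrophobic
    print(is_conservative("Gly", "Trp"))   # no shared group
    print(shared_groups("Ala", "Ser"))     # both in the small-sized group
```

Because a residue such as Ala belongs to more than one group, membership is tested per group rather than through a single residue-to-class mapping.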
[0061] In addition, the present invention also encompasses the conservative substitutions provided in Table IA below.
TABLE IA
For Amino Acid | Code | Replace with any of:
[Table IA is provided as an image (imgf000019_0001) in the original publication.]
[0062] Aside from the uses described above, such amino acid substitutions may also increase protein or peptide stability. The invention encompasses amino acid substitutions that contain, for example, one or more non-peptide bonds (which replace the peptide bonds) in the protein or peptide sequence. Also included are substitutions that include amino acid residues other than naturally occurring L-amino acids, e.g., D-amino acids or non-naturally occurring or synthetic amino acids, e.g., β or γ amino acids.
[0063] Both identity and similarity can be readily calculated by reference to the following publications: Lesk, A.M., ed., Computational Molecular Biology, Oxford University Press, New York (1988); Smith, D.W., ed., Biocomputing: Informatics and Genome Projects, Academic Press, New York (1993); Griffin, A.M., et al., eds., Informatics Computer Analysis of Sequence Data, Part 1, Humana Press, New Jersey (1994); von Heijne, G., Sequence Analysis in Molecular Biology, Academic Press (1987); and Gribskov, M. et al., eds., Sequence Analysis Primer, M Stockton Press, New York (1991).
[0064] In addition, the present invention also encompasses substitution of amino acids based upon the probability of an amino acid substitution resulting in conservation of function. Such probabilities are determined by aligning multiple genes with related function and assessing the relative penalty of each substitution to proper gene function. Such probabilities are often described in a matrix and are used by some algorithms (e.g., BLAST, CLUSTALW, GAP, etc.) in calculating percent similarity, wherein similarity refers to the degree to which one amino acid may substitute for another amino acid without loss of function. An example of such a matrix is the PAM250 or BLOSUM62 matrix.
[0065] Aside from the canonical chemically conservative substitutions referenced above, the invention also encompasses substitutions which are typically not classified as conservative, but that may be chemically conservative under certain circumstances. Analysis of enzymatic catalysis for proteases, for example, has shown that certain amino acids within the active site of some enzymes may have highly perturbed pKa's due to the unique microenvironment of the active site. Such perturbed pKa's could enable some amino acids to substitute for other amino acids while conserving enzymatic structure and function. Examples of amino acids that are known to have perturbed pKa's are the Glu-35 residue of Lysozyme, the Ile-16 residue of Chymotrypsin, the His-159 residue of Papain, etc. The conservation of function relates to either anomalous protonation or anomalous deprotonation of such amino acids, relative to their canonical, non-perturbed pKa. The pKa perturbation may enable these amino acids to actively participate in general acid-base catalysis due to the unique ionization environment within the enzyme active site. Thus, substituting an amino acid capable of serving as either a general acid or general base within the microenvironment of an enzyme active site or cavity, as may be the case, in the same or similar capacity as the wild-type amino acid, would effectively serve as a conservative amino acid substitution.
[0066] The present invention is directed to polynucleotide fragments of the transaminase polynucleotide of the invention, in addition to polypeptides encoded therein by said polynucleotide and/or fragments.
[0067] In the present invention, a "polynucleotide fragment" refers to a short polynucleotide having a nucleic acid sequence which is a portion of that shown in SEQ ID NO: 1 or the complementary strand thereto, or is a portion of a polynucleotide sequence encoding the polypeptide of SEQ ID NO:2.
The nucleotide fragments of the invention are preferably at least about 15 nt, and more preferably at least about 20 nt, still more preferably at least about 30 nt, and even more preferably, at least about 40 nt, at least about 50 nt, at least about 75 nt, or at least about 150 nt in length. A fragment "at least 20 nt in length," for example, is intended to include 20 or more contiguous bases from the nucleotide sequence shown in SEQ ID NO: 1. In this context "about" includes the particularly recited value, a value larger or smaller by several (5, 4, 3, 2, or 1) nucleotides, at either terminus, or at both termini. These nucleotide fragments have uses that include, but are not limited to, as diagnostic probes and primers as discussed herein. Of course, larger fragments (e.g., 50, 150, 500, 600, 2000 nucleotides) are preferred.
[0068] Moreover, representative examples of polynucleotide fragments of the invention include, for example, fragments comprising, or alternatively consisting of, a sequence from about nucleotide number 1-50, 51-100, 101-150, 151-200, 201-250, 251-300, 301-350, 351-400, 401-450, 451-500, 501-550, 551-600, 601-650, 651-700, 701-750, 751-800, 801-850, 851-900, 901-950, 951-1000, 1001-1050, 1051-1100, 1101-1150, 1151-1200, 1201-1250, 1251-1300, 1301-1350, 1351-1400, 1401-1450, 1451-1500, 1501-1550, 1551-1600, 1601-1650, 1651-1700, 1701-1750, 1751-1800, 1801-1850, 1851-1900, 1901-1950, 1951-2000, or 2001 to the end of SEQ ID NO: 1 or the complementary strand thereto. In this context "about" includes the particularly recited ranges, and ranges larger or smaller by several (5, 4, 3, 2, or 1) nucleotides, at either terminus or at both termini. Preferably, these fragments encode a polypeptide which has transaminase activity. More preferably, these polynucleotides can be used as probes or primers as discussed herein.
Also encompassed by the present invention are polynucleotides which hybridize to these nucleic acid molecules under stringent hybridization conditions or lower stringency conditions, as are the polypeptides encoded by these polynucleotides.
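The representative fragment ranges recited above follow a fixed 50-nucleotide tiling, which can be generated rather than listed by hand. The Python sketch below is illustrative only: the function name is hypothetical, and the sequence length used in the example is an arbitrary placeholder, not the length of SEQ ID NO: 1.

```python
def fragment_windows(sequence_length: int, window: int = 50) -> list:
    """Return 1-based, inclusive (start, end) ranges that tile a sequence.

    The final range is shortened so that it stops at the end of the sequence.
    """
    if sequence_length < 1 or window < 1:
        raise ValueError("sequence_length and window must be positive")
    return [(start, min(start + window - 1, sequence_length))
            for start in range(1, sequence_length + 1, window)]


if __name__ == "__main__":
    # Placeholder length chosen only to show the pattern of ranges.
    for start, end in fragment_windows(120):
        print(f"{start}-{end}")
```

Setting `window=20` gives the corresponding 20-residue tiling used for representative polypeptide fragments.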
[0069] In the present invention, a "polypeptide fragment" refers to an amino acid sequence which is a portion of that contained in SEQ ID NO:2. Protein (polypeptide) fragments may be "free-standing," or comprised within a larger polypeptide of which the fragment forms a part or region, most preferably as a single continuous region. Representative examples of polypeptide fragments of the invention include, for example, fragments comprising, or alternatively consisting of, from about amino acid number 1-20, 21-40, 41-60, 61-80, 81-100, 101-120, 121-140, 141-160, or 161 to the end of the coding region. Moreover, polypeptide fragments can be about 20, 30, 40, 50, 60, 70, 80, 90, 100, 110, 120, 130, 140, or 150 amino acids in length. In this context "about" includes the particularly recited ranges or values, and ranges or values larger or smaller by several (5, 4, 3, 2, or 1) amino acids, at either extreme or at both extremes. Polynucleotides encoding these polypeptides are also encompassed by the invention.
[0070] Preferred polypeptide fragments include the full-length protein. Further preferred polypeptide fragments include the full-length protein having a continuous series of deleted residues from the amino or the carboxy terminus, or both. For example, any number of amino acids, ranging from 1-60, can be deleted from the amino terminus of the full-length polypeptide. Similarly, any number of amino acids, ranging from 1-30, can be deleted from the carboxy terminus of the full-length protein. Furthermore, any combination of the above amino and carboxy terminus deletions is preferred. Similarly, polynucleotides encoding these polypeptide fragments are also preferred.
[0071] Other preferred polypeptide fragments are biologically active fragments. Biologically active fragments are those exhibiting activity similar, but not necessarily identical, to an activity of the polypeptide of the present invention (i.e., transaminase activity). The biological activity of the fragments may include an improved desired activity, or a decreased undesirable activity. Polynucleotides encoding these polypeptide fragments are also encompassed by the invention. [0072] In a preferred embodiment, the functional activity displayed by a polypeptide encoded by a polynucleotide fragment of the invention may be one or more biological activities typically associated with the full-length polypeptide of the invention (i.e., transaminase activity). However, the skilled artisan would appreciate that some fragments may have biological activities which are desirable and directly inapposite to the biological activity of the full-length protein. The functional activity of polypeptides of the invention, including fragments, variants, derivatives, and analogs thereof can be determined by numerous methods available to the skilled artisan, some of which are described elsewhere herein.
[0073] The term "hybridization" refers to any process by which a strand of nucleic acids binds with a complementary strand through base pairing. The term "hybridization complex" refers to a complex formed between two nucleic acid sequences by virtue of the formation of hydrogen bonds between complementary G and C bases and between complementary A and T bases. The hydrogen bonds may be further stabilized by base stacking interactions. The two complementary nucleic acid sequences hydrogen bond in an anti-parallel configuration. A hybridization complex may be formed in solution (for example, Cot or Rot analysis), or between one nucleic acid sequence present in solution and another nucleic acid sequence immobilized on a solid phase or support (for example, membranes, filters, chips, pins, or glass slides, or any other appropriate substrate to which cells or their nucleic acids have been affixed).
[0074] The terms "stringency" or "stringent conditions" refer to the conditions for hybridization as defined by nucleic acid composition, salt, and temperature. These conditions are well known in the art and may be altered to identify and/or detect identical or related polynucleotide sequences in a sample. A variety of equivalent conditions comprising either low, moderate, or high stringency depend on factors such as the length and nature of the sequence (DNA, RNA, base composition), reaction milieu (in solution or immobilized on a solid substrate), nature of the target nucleic acid (DNA, RNA, base composition), concentration of salts and the presence or absence of other reaction components (for example, formamide, dextran sulfate and/or polyethylene glycol) and reaction temperature (within a range of from about 5°C below the melting temperature of the probe to about 20°C to 25°C below the melting temperature). One or more factors may be varied to generate conditions of either low or high stringency that are different from, but equivalent to, the aforementioned conditions.
[0075] As will be understood by those of skill in the art, the stringency of hybridization may be altered in order to identify or detect identical or related polynucleotide sequences. As will be further appreciated by the skilled practitioner, the melting temperature, Tm, can be approximated by formulas well known in the art, depending on a number of parameters, such as the length of the hybrid or probe in number of nucleotides, or hybridization buffer ingredients and conditions (see, for example, Maniatis, T. et al., Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Laboratory, Cold Spring Harbor, NY (1982); Sambrook, J. et al., Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Laboratory, Cold Spring Harbor, NY (1989); Ausubel, F.M. et al., eds., Current Protocols in Molecular Biology, Vol. 1, "Preparation and Analysis of DNA", John Wiley and Sons, Inc., Suppls. 26, 29, 35 and 42, pp. 2.10.7-2.10.16 (1994-1995); Wahl, G.M. et al., Meth. Enzymol., 152:399-407 (1987); and Kimmel, A.R., Meth. Enzymol., 152:507-511 (1987)).
[0076] As a general guide, Tm decreases approximately 1°C to 1.5°C with every 1% decrease in sequence homology. Also, in general, the stability of a hybrid is a function of sodium ion concentration and temperature. Typically, the hybridization reaction is initially performed under conditions of low stringency, followed by washes of varying, but higher stringency. Reference to hybridization stringency, for example, high, moderate, or low stringency, typically relates to such washing conditions. It is to be understood that the low, moderate and high stringency hybridization or washing conditions can be varied using a variety of ingredients, buffers and temperatures well known to and practiced by the skilled artisan.
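The rule of thumb in this paragraph (a Tm decrease of roughly 1°C to 1.5°C for each 1% decrease in sequence homology) can be turned into a quick estimate, as in the Python sketch below. The function names and the 70°C perfect-match Tm in the example are assumptions made for illustration; the sketch does not replace the Tm formulas in the references cited above.

```python
def estimated_tm_drop(percent_mismatch: float,
                      low_per_percent: float = 1.0,
                      high_per_percent: float = 1.5) -> tuple:
    """Return the (low, high) estimate, in degrees C, of the Tm decrease
    expected for a given percent decrease in sequence homology."""
    if percent_mismatch < 0 or percent_mismatch > 100:
        raise ValueError("percent_mismatch must be between 0 and 100")
    return (percent_mismatch * low_per_percent,
            percent_mismatch * high_per_percent)


def estimated_tm_range(perfect_match_tm: float, percent_mismatch: float) -> tuple:
    """Return the (lowest, highest) estimated Tm for a partially homologous hybrid."""
    low_drop, high_drop = estimated_tm_drop(percent_mismatch)
    return (perfect_match_tm - high_drop, perfect_match_tm - low_drop)


if __name__ == "__main__":
    # Hypothetical probe with a perfect-match Tm of 70 C and 10% mismatch.
    print(estimated_tm_drop(10))
    print(estimated_tm_range(70.0, 10))
```

An estimate of this kind helps choose a wash temperature that retains hybrids above a desired homology threshold while releasing less related sequences.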
[0077] A "composition", as defined herein, refers broadly to any composition containing a transaminase polynucleotide or polypeptide of the present invention. The composition may comprise a dry formulation or an aqueous solution. Compositions comprising a transaminase polynucleotide sequence (SEQ ID NO: 1) encoding a transaminase polypeptide (SEQ ID NO:2), or fragments thereof, may be employed as hybridization probes. The probes may be stored in a freeze-dried form and may be in association with a stabilizing agent such as a carbohydrate. In hybridizations, the probe may be employed in an aqueous solution containing salts (for example, NaCl), detergents or surfactants (for example, SDS) and other components (for example, Denhardt's solution, dry milk, salmon sperm DNA, and the like).
[0078] The term "substantially purified" refers to nucleic acid sequences or amino acid sequences that are removed from their natural environment, isolated or separated, and are at least 60% free, preferably 75% to 85% free, and most preferably 90% to 95%, or greater, free from other components with which they are naturally associated.
[0079] "Transformation" or "transfection" refers to a process by which exogenous DNA, preferably transaminase DNA, enters and changes a recipient cell. It may occur under natural or artificial conditions using various methods well known in the art. Transformation may rely on any known method for the insertion of foreign nucleic acid sequences into a prokaryotic or eukaryotic host cell. The method is selected based on the type of host cell being transformed and may include, but is not limited to, viral infection, electroporation, heat shock, lipofection, and particle bombardment. Such "transformed" cells include stably transformed cells in which the inserted DNA is capable of replication either as an autonomously replicating plasmid or as part of the host chromosome. Transformed cells also include those cells which transiently express the inserted DNA or RNA for limited periods of time.
[0080] Variants of the transaminase polypeptide are also encompassed by the present invention. Preferably, a transaminase variant has at least 98.4% amino acid sequence identity to a transaminase amino acid sequence disclosed herein, and more preferably, retains at least one biological, immunological, or other functional characteristic or activity of the non-variant transaminase polypeptide.
Recombinant Expression
[0081] In another embodiment, the present invention encompasses the polynucleotides which encode the transaminase polypeptide. Accordingly, any nucleic acid sequence that encodes the amino acid sequence of the transaminase polypeptide of the invention can be used to produce recombinant molecules that express the transaminase protein. More particularly, the invention encompasses the transaminase polynucleotide comprising the nucleic acid sequence of SEQ ID NO: 1. Additionally, any nucleic acid sequence that encodes a lipase polypeptide may be used to produce recombinant molecules that express the lipase protein.
[0082] As will be appreciated by the skilled practitioner in the art, the degeneracy of the genetic code results in many nucleotide sequences that can encode the described polypeptides. Some of the sequences bear minimal or no homology to the nucleotide sequences of any known and naturally occurring gene. Accordingly, the present invention contemplates each and every possible variation of nucleotide sequence that could be made by selecting combinations based on possible codon choices. These combinations are made in accordance with the standard triplet genetic code as applied to the nucleotide sequence of naturally occurring transaminase or lipase, and all such variations are to be considered as being specifically disclosed and able to be understood by the skilled practitioner. [0083] Although nucleic acid sequences which encode the transaminase polypeptide or lipase polypeptide and variants thereof are preferably capable of hybridizing to the nucleotide sequence of the naturally occurring transaminase polypeptide or lipase polypeptide under appropriately selected conditions of stringency, it may be advantageous to produce nucleotide sequences encoding transaminase polypeptide or lipase polypeptides, or derivatives thereof, which possess a substantially different codon usage. For example, codons may be selected to increase the rate at which expression of the peptide/polypeptide occurs in a particular prokaryotic host in accordance with the frequency with which particular codons are utilized by the host. Another reason for substantially altering the nucleotide sequence encoding the transaminase polypeptide or lipase polypeptide, or its derivatives, without altering the encoded amino acid sequences, includes the production of RNA transcripts having more desirable properties, such as a greater half-life, than transcripts produced from the naturally occurring sequence.
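To make the scale of the degeneracy described in paragraphs [0082] and [0083] concrete, the following Python sketch counts how many distinct nucleotide sequences can encode a given peptide under the standard genetic code. The peptide "MKL" is a hypothetical example and is not taken from SEQ ID NO:2; the function name is likewise an assumption for illustration.

```python
from math import prod

# Number of synonymous codons per amino acid in the standard genetic code
# (61 sense codons in total; stop codons are not counted here).
CODON_COUNTS = {
    "A": 4, "R": 6, "N": 2, "D": 2, "C": 2, "Q": 2, "E": 2, "G": 4,
    "H": 2, "I": 3, "L": 6, "K": 2, "M": 1, "F": 2, "P": 4, "S": 6,
    "T": 4, "W": 1, "Y": 2, "V": 4,
}


def count_encodings(peptide):
    """Count distinct coding sequences (excluding a stop codon) for a peptide."""
    return prod(CODON_COUNTS[aa] for aa in peptide.upper())


# Hypothetical tripeptide Met-Lys-Leu: 1 * 2 * 6 possible coding sequences.
print(count_encodings("MKL"))
```

Because the count is a product over residues, it grows very rapidly with length, which is why a full-length enzyme admits an enormous number of synonymous coding sequences from which host-preferred codons can be chosen.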
[0084] The present invention also encompasses production of DNA sequences, or portions thereof, which encode the transaminase polypeptide or a lipase polypeptide, or derivatives thereof, entirely by synthetic chemistry. After production, the synthetic sequence may be inserted into any of the many available expression vectors and cell systems using reagents that are well known and practiced by those in the art. Moreover, synthetic chemistry may be used to introduce mutations into a sequence encoding the transaminase polypeptide or lipase polypeptide, or any fragment thereof. [0085] In an embodiment of the present invention, a gene delivery vector containing the transaminase polynucleotide, lipase polypeptide or functional fragment thereof is provided. In one embodiment, the gene delivery vector contains the polynucleotide, or functional fragment thereof comprising an isolated and purified polynucleotide encoding the bacterial transaminase having the sequence as set forth in any one of SEQ ID NO: 1. In another embodiment, the gene delivery vector contains a polynucleotide encoding a bacterial lipase. [0086] It will also be appreciated by those skilled in the pertinent art that in addition to the primers disclosed herein, a longer oligonucleotide probe, or mixtures of probes, for example, degenerate probes, can be used to detect longer, or more complex, nucleic acid sequences, such as, for example, genomic or full length DNA. In such cases, the probe may comprise at least 20-300 nucleotides, preferably, at least 30-100 nucleotides, and more preferably, 50-100 nucleotides.
[0087] In another embodiment of the present invention, polynucleotide sequences or portions thereof which encode the transaminase polypeptide or peptides, or a lipase polypeptide or peptides can comprise recombinant DNA molecules to direct the expression of the polypeptide products, peptide fragments, or functional equivalents thereof, in appropriate host cells. Because of the inherent degeneracy of the genetic code, other DNA sequences, which encode substantially the same or a functionally equivalent amino acid sequence, may be produced and these sequences may be used to clone and express a transaminase polypeptide or lipase polypeptide as described. [0088] The nucleotide sequences of the present invention can be engineered using methods generally known in the art in order to alter the transaminase polypeptide or lipase polypeptide-encoding sequences for a variety of reasons, including, but not limited to, alterations which modify the cloning, processing, and/or expression of the gene product. DNA shuffling by random fragmentation, PCR reassembly of gene fragments and synthetic oligonucleotides may be used to engineer the nucleotide sequences. For example, site-directed mutagenesis may be used to insert new restriction sites, alter glycosylation patterns, change codon preference, produce splice variants, or introduce mutations, and the like.
[0089] In a further embodiment, sequences encoding the transaminase polypeptide or lipase polypeptide may be synthesized in whole, or in part, using chemical methods well known in the art (see, for example, Caruthers, M.H. et al., Nucl. Acids Res. Symp. Ser., 215-223 (1980) and Horn, T. et al., Nucl. Acids Res. Symp. Ser., 225-232 (1980)). Alternatively, the transaminase protein itself, or a fragment or portion thereof, may be produced using chemical methods to synthesize the amino acid sequence of the transaminase polypeptide, or a fragment or portion thereof. For example, peptide synthesis can be performed using various solid-phase techniques (Roberge, J.Y. et al., Science, 269:202-204 (1995)) and automated synthesis can be achieved, for example, using the ABI 431A Peptide Synthesizer (PE Biosystems).
[0090] The newly synthesized transaminase polypeptide or lipase polypeptide or peptide can be substantially purified by preparative high performance liquid chromatography (e.g., Creighton, T., Proteins, Structures and Molecular Principles , W.H. Freeman and Co., New York, NY (1983)), by reverse-phase high performance liquid chromatography (HPLC), or other purification methods as known and practiced in the art. The composition of the synthetic peptides may be confirmed by amino acid analysis or sequencing (e.g., the Edman degradation procedure; Creighton, supra). In addition, the amino acid sequence of a transaminase polypeptide, lipase polypeptide or any portion thereof, can be altered during direct synthesis and/or combined using chemical methods with sequences from other proteins, or any part thereof, to produce a variant polypeptide.
[0091] To express a biologically active transaminase polypeptide, lipase polypeptide or peptide, the nucleotide sequences encoding the transaminase polypeptide, or functional equivalents, may be inserted into an appropriate expression vector, i.e., a vector, which contains the necessary elements for the transcription and translation of the inserted coding sequence.
[0092] In one embodiment of the present invention, an expression vector contains an isolated and purified polynucleotide sequence as set forth in SEQ ID NO: 1, encoding a bacterial transaminase, or a functional fragment thereof, in which the bacterial transaminase comprises the amino acid sequence as set forth in SEQ ID NO:2. Alternatively, an expression vector can contain the complement of the aforementioned transaminase nucleic acid sequence. In another embodiment, the expression vector comprises a polynucleotide sequence encoding a lipase.
[0093] Expression vectors derived from retroviruses, adenovirus, herpes or vaccinia viruses, or from various bacterial plasmids can be used for the delivery of nucleotide sequences. Methods which are well known to those skilled in the art may be used to construct expression vectors containing sequences encoding a transaminase polypeptide along with appropriate transcriptional and translational control elements. These methods include in vitro recombinant DNA techniques, synthetic techniques, and in vivo genetic recombination. Such techniques are described in Sambrook, J. et al., Molecular Cloning, A Laboratory Manual, Cold Spring Harbor Press, Plainview, N.Y. (1989) and in Ausubel, F. M. et al., Current Protocols in Molecular Biology, John Wiley & Sons, New York, NY (1989).
[0094] A variety of expression vector/host systems may be utilized to contain and express sequences encoding the transaminase polypeptide or lipase polypeptide, or peptides. Such expression vector/host systems include, but are not limited to, microorganisms such as bacteria transformed with recombinant bacteriophage, plasmid, or cosmid DNA expression vectors; yeast transformed with yeast expression vectors; insect cell systems infected with virus expression vectors (e.g., baculovirus); plant cell systems transformed with virus expression vectors (e.g., cauliflower mosaic virus (CaMV) and tobacco mosaic virus (TMV)), or with bacterial expression vectors (e.g., Ti or pBR322 plasmids); or animal cell systems. The host cell employed is not limiting to the present invention. In one embodiment, the host cell of the invention contains an expression vector comprising an isolated and purified polynucleotide having the nucleic acid sequence of SEQ ID NO: 1 and encoding the bacterial transaminase of this invention, or a functional fragment thereof, comprising an amino acid sequence as set forth in SEQ ID NO:2. In another embodiment, the host cell of the invention contains an expression vector comprising an isolated and purified polynucleotide comprising a lipase amino acid sequence. [0095] "Control elements" or "regulatory sequences" are those non-translated regions of the vector, e.g., enhancers, promoters, 5' and 3' untranslated regions, which interact with host cellular proteins to carry out transcription and translation. Such elements may vary in their strength and specificity. Depending on the vector system and host utilized, any number of suitable transcription and translation elements, including constitutive and inducible promoters, may be used. Specific initiation signals may also be used to achieve more efficient translation of sequences encoding the transaminase polypeptide or lipase polypeptide. Such signals include the ATG initiation codon and adjacent sequences. 
In cases where sequences encoding the transaminase polypeptide or lipase polypeptide, its initiation codon, and upstream sequences are inserted into the appropriate expression vector, no additional transcriptional or translational control signals may be needed. However, in cases where only a transaminase or lipase coding sequence, or a fragment thereof, is inserted, exogenous translational control signals, including the ATG initiation codon, are optimally provided. Furthermore, the initiation codon should be in the correct reading frame to ensure translation of the entire insert. Exogenous translational elements and initiation codons can be of various origins, both natural and synthetic. The efficiency of expression can be enhanced by the inclusion of enhancers which are appropriate for the particular cell system that is used, such as those described in the literature (see, e.g., Scharf, D. et al., Results Probl. Cell Differ., 20:125-162 (1994)).
[0096] In bacterial systems, a number of expression vectors may be selected, depending upon the use intended for the expressed product. Such vectors include, but are not limited to, the multifunctional E. coli cloning and expression vectors such as BLUESCRIPT (Stratagene), in which the sequence encoding the transaminase polypeptide can be ligated into the vector in-frame with sequences for the amino-terminal Met and the subsequent 7 residues of β-galactosidase, so that a hybrid protein is produced; pIN vectors (see, Van Heeke, G. et al., J. Biol. Chem., 264:5503-5509 (1989)); and the like. In one embodiment, the expression vector is pZerO2 (Invitrogen, Carlsbad, CA). pGEX vectors (Promega, Madison, WI) can also be used to express foreign polypeptides as fusion proteins with glutathione S-transferase (GST). In general, such fusion proteins are soluble and can be easily purified from lysed cells by adsorption to glutathione-agarose beads followed by elution in the presence of free glutathione. Proteins made in such systems can be designed to include heparin, thrombin, or factor XA protease cleavage sites so that the cloned polypeptide of interest can be released from the GST moiety at will.
[0097] Moreover, a host cell strain may be chosen for its ability to modulate the expression of the inserted sequences or to process the expressed protein in the desired fashion. Such modifications of the polypeptide include, but are not limited to, acetylation, carboxylation, glycosylation, phosphorylation, lipidation, and acylation. Post-translational processing which cleaves a "prepro" form of the protein may also be used to facilitate correct insertion, folding and/or function. Different host cells having specific cellular machinery and characteristic mechanisms for such post-translational activities are available and may be chosen to ensure the correct modification and processing of the foreign protein.
[0098] Host cells transformed with a nucleotide sequence encoding the transaminase protein, lipase protein, or fragments thereof, may be cultured under conditions suitable for the expression and recovery of the protein from cell culture. The protein produced by a recombinant cell may be secreted or contained intracellularly depending on the sequence and/or the vector used. As will be understood by those having skill in the art, expression vectors containing polynucleotides which encode a transaminase protein or lipase protein can be designed to contain signal sequences which direct secretion of the transaminase protein or lipase protein through a prokaryotic cell membrane. Other constructions can be used to join nucleic acid sequences encoding a transaminase protein or a lipase protein to a nucleotide sequence encoding a polypeptide domain, which will facilitate purification of soluble proteins. Such purification facilitating domains include, but are not limited to, metal chelating peptides such as histidine-tryptophan modules that allow purification on immobilized metals; protein A domains that allow purification on immobilized immunoglobulin; and the domain utilized in the FLAGS extension/ affinity purification system (Immunex Corp., Seattle, WA). The inclusion of cleavable linker sequences such as those specific for Factor XA or enterokinase (Invitrogen, San Diego, CA) between the purification domain and transaminase protein or lipase protein may be used to facilitate purification. One such expression vector provides for expression of a fusion protein containing transaminase or lipase and a nucleic acid encoding 6 histidine residues preceding a thioredoxin or an enterokinase cleavage site. The histidine residues facilitate purification on IMAC (immobilized metal ion affinity chromatography) as described by Porath, J. et al, Prot. Exp. Purif., 3:263-281 (1992), while the enterokinase cleavage site provides a means for purifying from the fusion protein. 
For a discussion of suitable vectors for fusion protein production, see Kroll, D.J. et al., DNA Cell Biol., 12:441-453 (1993).
[0099] Any number of selection systems may be used to recover transformed cell lines. These include, but are not limited to, the Herpes Simplex Virus thymidine kinase (HSV TK) (Wigler, M. et al., Cell, 11:223-32 (1977)) and adenine phosphoribosyltransferase (Lowy, I. et al., Cell, 22:817-23 (1980)) genes which can be employed in tk⁻ or aprt⁻ cells, respectively. Also, anti-metabolite, antibiotic or herbicide resistance can be used as the basis for selection; for example, dhfr, which confers resistance to methotrexate (Wigler, M. et al., Proc. Natl. Acad. Sci., 77:3567-3570 (1980)); npt, which confers resistance to the aminoglycosides neomycin and G-418 (Colbere-Garapin, F. et al., J. Mol. Biol., 150:1-14 (1981)); and als or pat, which confer resistance to chlorsulfuron and phosphinotricin acetyltransferase, respectively (Murry, supra). Additional selectable genes have been described, for example, trpB, which allows cells to utilize indole in place of tryptophan, or hisD, which allows cells to utilize histidinol in place of histidine (Hartman, S. C. et al., Proc. Natl. Acad. Sci., 85:8047-8051 (1988)). Recently, the use of visible markers has gained popularity with such markers as the anthocyanins, β-glucuronidase and its substrate GUS, and luciferase and its substrate luciferin, which are widely used not only to identify transformants, but also to quantify the amount of transient or stable protein expression that is attributable to a specific vector system (Rhodes, C.A. et al., Methods Mol. Biol., 55:121-131 (1995)).
[00100] Although the presence or absence of marker gene expression suggests that the gene of interest is also present, the presence and expression of the desired gene of interest may need to be confirmed. For example, if the nucleic acid sequence encoding a transaminase polypeptide or lipase polypeptide is inserted within a marker gene sequence, recombinant cells containing polynucleotide sequence encoding the transaminase polypeptide or lipase polypeptide can be identified by the absence of marker gene function. Alternatively, a marker gene can be placed in tandem with a sequence encoding a transaminase polypeptide or lipase polypeptide under the control of a single promoter. Expression of the marker gene in response to induction or selection typically indicates co-expression of the tandem gene. [00101] Alternatively, host cells which contain the nucleic acid sequence coding for a transaminase polypeptide of the invention or a lipase polypeptide and which express the transaminase polypeptide or lipase polypeptide product may be identified by a variety of procedures known to those having skill in the art. These procedures include, but are not limited to, DNA-DNA or DNA-RNA hybridizations and protein bioassay or immunoassay techniques, including membrane, solution, or chip based technologies, for the detection and/or quantification of nucleic acid or protein. [00102] The presence of polynucleotide sequences encoding transaminase polypeptides or lipase polypeptides can be detected by DNA-DNA or DNA-RNA hybridization, or by amplification using probes, portions, or fragments of polynucleotides encoding a transaminase polypeptide. Nucleic acid amplification based assays involve the use of oligonucleotides or oligomers based on the nucleic acid sequences encoding a transaminase polypeptide to detect transformants containing DNA or RNA encoding transaminase polypeptide. 
[00103] In addition to recombinant production, fragments of transaminase polypeptides or lipase polypeptides may be produced by direct peptide synthesis using solid phase techniques (Merrifield, J., J. Am. Chem. Soc., 85:2149-2154 (1963)). Protein synthesis may be performed using manual techniques or by automation. Automated synthesis may be achieved, for example, using the ABI 431A Peptide Synthesizer (PE Biosystems). Various fragments of the transaminase polypeptides or lipase polypeptides can be chemically synthesized separately and then combined using chemical methods to produce the full length molecule.
Transaminase Assay
[00104] In another aspect, the present invention contemplates a method of detecting transaminase activity in a sample. In one embodiment, the method comprises measuring the consumption of (S)-sec-butylamine, as exemplified in Example 6, by HPLC after Marfey's derivatization.
Lipase Assay
[00105] In another aspect, the present invention contemplates a method of detecting lipase activity in a sample. In one embodiment, the method comprises measuring the lipase-catalyzed resolution of RS-1-cyclopropylethylamine.
EXAMPLES
EXAMPLE 1
Enzymatic Resolution of Racemic Cyclopropylethylamine
[Reaction scheme: racemic 1-cyclopropylethylamine is resolved to (R)-1-cyclopropylethylamine + 1-cyclopropylethanone]
Isolation and Screening of Cultures for Transaminase Activity
[00106] 17 soil isolates, 14 Bacillus strains from seven species, and a variety of other Culture Collection organisms were screened. Cultures were screened on minimal medium containing racemic sec-butylamine or racemic cyclopropylethylamine as nitrogen sources. One ml of sterile minimal medium was dispensed to a sterile 15-ml capped plastic tube (minimal medium contained 10 g/L glycerol, 1 g/L filter-sterilized racemic sec-butylamine or racemic cyclopropylethylamine, MgSO4·7H2O 0.2 g/L, NaCl 0.01 g/L, FeSO4·7H2O 0.01 g/L, and MnSO4·4H2O 0.01 g/L in 0.1 M pH 7 phosphate buffer).
[00107] Tubes were inoculated by loop (from slants), with one drop of liquid (from vials), or with soil sample extracts. Tubes were shaken at 250 rpm, 28°C for three or more days in a bench-top shaker. Broths were then analyzed for the presence of (R)- and (S)-isomers of the amines.
[00108] Several active cultures were found, although ultimately a Culture Collection strain (Bacillus megaterium SC 6394) was used. Several cultures gave positive results, resolving racemic sec-butylamine and racemic cyclopropylethylamine to (R)-sec-butylamine and (R)-cyclopropylethylamine. Prominent among them were several strains of Bacillus megaterium and Pseudomonas sp., for example Pseudomonas veronii, a strain isolated from soil and identified by 16S ribosomal RNA gene sequencing.
Analysis of 1-Cyclopropylethylamine
[00109] Samples of 4 to 40 μl containing about 40 μg amine were mixed with 0.2 ml 50 mM Na2CO3 (adjusted to pH 9.5 with HCl) and 0.2 ml dansyl chloride (2 mg/ml in acetonitrile) in a microfuge tube and incubated for 30 min at room temperature. The reaction was quenched by addition of 0.02 ml 2.8% NH4OH and incubated for an additional 15 min at room temperature. For chiral analysis, the mixture was vortexed for 30 sec with 1 ml ethyl acetate, centrifuged briefly, then 1 ml of the upper layer was dried at 40 °C under N2 and the residue was dissolved in 1 ml of the mobile phase for the chiral column. For quantitation with a C18 column, the mixture was diluted to 1 ml with 50% ACN/50% water.
HPLC Methods
Quantitation with C18 Column
column: YMC Pak ODS A 15x0.6 cm 3μ
mobile phase: gradient of 20 to 90% acetonitrile/water (0.05% trifluoroacetic acid in both) from 0 to 12 min, 20% acetonitrile/80% water (0.05% trifluoroacetic acid) from 12.01 to 15 min
flow rate: 1 ml/min
column temperature: 40 °C
detection: set at 220 nm
injection volume: 5 μl
retention time: racemic and R-amine 9.9 min
Chiral Separation
column: Chiralpak OD 25x0.46 cm (Daicel Chemical Industries, Ltd.)
mobile phase: 95% hexane/5% ethanol
flow rate: 1 ml/min
column temperature: 18 °C
detection: set at 220 nm
injection volume: 10 μl
retention times: R-amine 11.5 min, S-amine 12.6 min
Quantitation and ee of Sec-butylamine with Marfey's Reagent
[00110] Marfey's reagent was used to give diastereomeric derivatives that could be separated with a C18 column. A sample of 10 μl containing about 0.1 mg amine, 8 μl 1 M NaHCO3, and 40 μl 1% w/v Marfey's reagent (FDAA, 1-fluoro-2,4-dinitrophenyl-5-L-alanine amide) in acetone were combined in a 1.5 ml microfuge tube and heated for 1 h at 40 °C. The samples were cooled to room temperature, then 8 μl 1 N HCl and 934 μl 50% acetonitrile/water were added, and the solutions were vortexed and filtered into HPLC vials.
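The volumes in paragraph [00110] can be checked with a short calculation. The Python sketch below sums the stated additions and derives the nominal amine concentration in the HPLC vial; the variable names are illustrative assumptions, and volume changes on mixing or heating are ignored.

```python
# Volumes added per sample, in microliters, as stated in paragraph [00110].
volumes_ul = {
    "sample": 10,
    "1 M NaHCO3": 8,
    "1% Marfey's reagent in acetone": 40,
    "1 N HCl": 8,
    "50% acetonitrile/water": 934,
}

total_ul = sum(volumes_ul.values())

# About 0.1 mg amine is present in the 10 ul sample.
amine_mg = 0.1
final_conc_mg_per_ml = amine_mg / (total_ul / 1000.0)

print(f"final volume: {total_ul} ul")
print(f"nominal amine concentration: {final_conc_mg_per_ml:.2f} mg/ml")
```

The additions total 1 ml, and the 8 μl of 1 N HCl matches the 8 μl of 1 M NaHCO3 mole for mole, consistent with the acid being added to neutralize the bicarbonate before injection.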
HPLC Method for Sec-butylamine Derivative with Marfey's Reagent
column: YMC Pak ODS A 15x0.6 cm 3μ
mobile phase: 55% methanol/45% water (containing 0.05% trifluoroacetic acid)
flow rate: 1 ml/min
column temperature: 40 °C
detection: set at 340 nm
injection volume: 5 μl
R enantiomer retention time: 14.5 min
S enantiomer retention time: 15.6 min
Growth of Bacillus megaterium SC 6394 in 750-L Fermentor for Production of (S)-transaminase
Inoculum Development
[00111] Four frozen vials of Bacillus megaterium SC6394 were thawed and their entire contents (1.5 ml) were transferred to each of four 4-liter flasks containing 1 liter of SG-MO medium (see Table 1). These F1 stage flasks were incubated at 28°C and 230 rpm for 29 hours. The four F1 flasks were then pooled and transferred to a 1000-liter tank employing a working volume of 750 liters SG-MO tank medium (see Table 2).
750-L Fermentation Process
[00112] The recommended fermentation process conditions for a 1000-liter tank containing 750 liters of SG-MO medium are as follows:
Temperature: 28°C
pH control: none
Agitation: 300 rpm
Aeration: 750 lpm (1 vvm)
Dissolved oxygen: no control
Back pressure: 10 psi
Antifoam: SAG 5693 on demand
Harvest: ca. 72 hours
Cell recovery: 26.636 kg cell paste.
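The aeration figure above (750 lpm into a 750-liter working volume) corresponds to one volume of air per volume of medium per minute, commonly abbreviated vvm. As a minimal sketch, assuming that standard definition, the conversion can be written as follows; the function name is hypothetical.

```python
def vvm(air_flow_l_per_min, working_volume_l):
    """Volumes of air per volume of liquid per minute."""
    if working_volume_l <= 0:
        raise ValueError("working volume must be positive")
    return air_flow_l_per_min / working_volume_l


# 750-L fermentation: 750 lpm into 750 L of medium.
print(vvm(750, 750))

# Bench-scale transamination in Example 1: 6 L/min into 3 L of reaction mixture.
print(vvm(6, 3))
```

Expressing aeration per unit volume makes the tank and bench-scale conditions directly comparable.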
Medium Composition
TABLE 1 SG-MO Flask Medium
Figure imgf000037_0001
[00113] The flask medium was adjusted to pH 7.0 with NaOH or H2SO4 as necessary, then dispensed into 4-liter flasks in 1-liter aliquots prior to autoclaving at 121°C for 30 minutes.
TABLE 2 SG-MO Tank Medium
Figure imgf000037_0002
[00114] The tank medium was adjusted to pH 7.0 with NaOH or H2SO4 as necessary and steam sterilized at 121°C for 30 minutes.
Harvest
[00115] Based on a clear, steady decline in the CO2 off-gas profile, the whole broth was harvested by centrifugation at 72 hours. The centrifuged cells were washed with 50 mM pH 7.0 potassium phosphate buffer. The cell paste was then stored at -70°C until use.
Preparation of (R)-sec-butylamine from Racemic sec-butylamine using Bacillus megaterium SC6394
[00116] Racemic sec-butylamine (30 g, 410 mmoles), sodium pyruvate (90.23 g, 820 mmoles) and 300 ml 1 M KH2PO4 were dissolved in deionized water and brought to a final volume of 3 L. The pH was adjusted to 7.5 with 85% H3PO4. Bacillus megaterium SC6394 wet cell paste (300 g cells stored at -70 °C) was thawed and dispersed in 2 L of the substrate solution with an Ultraturrax T25 homogenizer. The cell suspension and the remaining 1 L substrate solution were added to a 5-L vessel for a Braun Biostat® B. The vessel was connected to a Braun Biostat® B equipped with pH and oxygen electrodes and the suspension was stirred at 300 rpm. The transamination reaction was carried out at 28 °C, 300 rpm stirring speed, pH 8 was maintained with 10% NaOH and 8.5% H3PO4 feeds, and air was sparged from near the bottom of the vessel at 6 L/min (2 vvm). Samples were assayed by derivatization with Marfey's reagent and reverse phase HPLC, and the reaction was stopped when the enantiomeric excess ("ee") of R-amine reached 100%. After 18 h the ee was 20.1%, but the next sample after 42 h was >99.9% ee. Initially O2 was 56% saturation but dropped to 0% saturation by 18 h and remained at 0% until the reaction was terminated after 44 h. Initially the NaOH feed was required to maintain pH 8, but by 18 h the H3PO4 feed was required. The cells were removed by centrifugation at 16,900xg for 20 min, and then the pellets were washed with 30 mL water and centrifuged again. The combined supernatants (3140 mL, 14.7 g R-amine, 100% ee) were stored at 4 °C until product isolation.
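The figures in paragraph [00116] rest on two simple quantities: the enantiomeric excess followed during the reaction, and the yield of the surviving enantiomer, which for a resolution of a racemate cannot exceed half of the starting material. The Python sketch below shows both under stated assumptions: the peak areas are hypothetical values chosen to reproduce the 20.1% ee reported at 18 h, both masses are taken to refer to the free amine, and the function names are illustrative.

```python
def enantiomeric_excess(r_amount, s_amount):
    """ee of the R enantiomer in percent, from amounts or HPLC peak areas."""
    total = r_amount + s_amount
    if total <= 0:
        raise ValueError("total amount must be positive")
    return 100.0 * (r_amount - s_amount) / total


def resolution_yield(product_g, racemate_g):
    """Yield in percent relative to the 50% theoretical maximum of a resolution."""
    return 100.0 * product_g / (racemate_g / 2.0)


# Hypothetical peak areas giving the ee reported after 18 h.
print(round(enantiomeric_excess(60.05, 39.95), 1))

# 14.7 g R-amine recovered in the supernatant from 30 g racemic sec-butylamine.
print(round(resolution_yield(14.7, 30.0), 1))
```

On these assumptions, the 14.7 g of R-amine in the combined supernatants is close to the 15 g theoretical maximum, which suggests little loss of the (R)-enantiomer during the transamination.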
[00117] The reaction mixture, 3.2 kg, was adjusted to pH 13.1 with 10 M NaOH (205 g) and distilled at 1 atm, collecting two 200-mL fractions. The fractions were assayed by dansylation and HPLC of the dansyl derivative (see Note 1 below). The second fraction contained negligible amine and was discarded. The first fraction was redistilled, collecting a 39-mL fraction (bp 60-100 °C) and an 18-mL fraction (bp 100 °C). Dansyl/HPLC assay indicated that the first fraction contained 99.8% of the product and the second contained 0.2%. The combined distillate was cooled on ice and adjusted to pH 6.6 with 11.4 mL of 10 M H2SO4. The solution was concentrated in vacuo, giving 27 g of semisolid residue. This was dissolved as much as possible in 136 mL of methanol (in which (R)-sec-butylamine sulfate is very soluble) with sonication and heating. The mixture was cooled to room temperature and filtered, washing with 136 mL of methanol. The methanol-insoluble solid, 4.06 g, was identified as ammonium sulfate by IR spectroscopy. Concentration of the filtrate gave 19.4 g of crude (R)-sec-butylamine·1/2H2SO4, ee 99.2% (see Note 2 below), purity 97.8 weight %.
[00118] The solid was dissolved in 100 mL of methanol at reflux, and 200 mL of absolute ethanol was added in 50-mL portions, returning to reflux after each addition. Product crystallized after the first addition. A total of 200 mL of solvent (bp ~68 to 75.5 °C) was then distilled out. The mixture was cooled (ice bath) and filtered, washing with 50 mL of cold ethanol. The solid was dried in vacuo at room temperature, giving 17.6 g of (R)-sec-butylamine·1/2H2SO4 as a nacreous colorless solid, mp 280-295 °C (dec), ee 99.4% (Note 2).
[00119] The mother liquor/wash was concentrated and the residue recrystallized from 10 mL of MeOH-EtOAc, 1:1, giving an additional 0.68 g of (R)-sec-butylamine·1/2H2SO4 as a colorless solid, ee 99.4%, purity 97.2 weight % relative to the standard. Concentration of the mother liquor/wash gave 0.79 g of brown solid, ee 99.1% (Note 2), purity 64.0% relative to the standard.
Note 1) The dansylation procedure was adapted from Tapuhi, Y. et al., "Dansylation of amino acids for high-performance liquid chromatographic analysis", Analytical Biochemistry, 115: 123-129 (1981).
[00120] Reagents: 1.5 mg Dns-Cl per mL in MeCN (5.56 mM), stored in the dark (stable for at least one month), 40 mM Li2CO3 adjusted to pH 9.5 with HCl and 2.5% (~0.4 M) ethanolamine in water.
[00121] Mix 5 μL of a <40 mM aqueous solution of the amine with 100 μL of lithium carbonate buffer, add 50 μL of the DnsCl-MeCN solution and shake briefly. After 0.5 hours in the dark at room temperature, quench with 5 μL of 2.5% ethanolamine. This rapidly discharges the residual yellow Dns-Cl color. Dilute with 1340 μL of mobile phase A for assay.
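A quick stoichiometric check of the procedure in paragraphs [00120] and [00121] shows why the amine solution is limited to below 40 mM: at that upper bound the dansyl chloride is still in molar excess. The Python sketch below is illustrative only; the helper name is an assumption, and the concentrations and volumes are those stated in the text.

```python
def micromoles(volume_ul, conc_mm):
    """Amount in micromoles: 1 ul of a 1 mM solution contains 1 nmol."""
    return volume_ul * conc_mm / 1000.0


amine_umol = micromoles(5, 40)       # upper bound for a <40 mM amine sample
dns_cl_umol = micromoles(50, 5.56)   # 1.5 mg/mL Dns-Cl in MeCN is 5.56 mM

# Sample + buffer + Dns-Cl solution + quench + mobile phase A, in microliters.
final_volume_ul = 5 + 100 + 50 + 5 + 1340

print(f"amine: at most {amine_umol:.3f} umol")
print(f"Dns-Cl: {dns_cl_umol:.3f} umol")
print(f"final assay volume: {final_volume_ul} ul")
```

With at most 0.2 μmol of amine against about 0.28 μmol of Dns-Cl, the reagent remains in excess, and the excess is what the ethanolamine quench then consumes.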
[00122] Reversed phase HPLC of the dansyl derivative was done on a YMC AQ (4.6 x 50 mm) column, eluting at 2 mL/min with 0.15 mM H3PO4 in water-MeCN, 35:65, with detection at 248 nm. Dns-sec-butylamine eluted at 2.43 min.
Note 2) Chiral assay was done by HPLC of the derivative formed with 1-fluoro-2,4-dinitro-5-L-valinamide (Marfey reagent) using a procedure derived from Harada, K.-I. et al., J. Chromatogr. A, 921:187-195 (2001). A mixture of 50 μL of 50 mM analyte, 20 μL of 1 M NaHCO3 and 100 μL of a 1% (w/v) solution of L-FDVA in acetone (0.033 M) was vortexed briefly and incubated at 40 °C for 1 hour. Then 20 μL of 1 M HCl was added and the mixture vortexed and diluted with 810 μL of MeCN. This solution was further diluted 10-fold with MeCN-water or MeOH-water, 1:1, for HPLC assay. Chromatography was done on a YMC-Pack ProC18, 4.6x150 mm, 3 μm, column eluting at 0.75 mL/min with 15 mM H3PO4 in MeOH-water, 72:28, with detection at 340 nm. The Marfey derivatives of (R)- and (S)-sec-butylamine eluted at 14.53 and 15.80 min, respectively.
EXAMPLE 2
Preparation of (R)-1-cyclopropylethylamine from Racemic 1-cyclopropylethylamine using Bacillus megaterium SC6394
[Reaction scheme (Figures imgf000040_0001 and imgf000040_0002): racemic 1-cyclopropylethylamine + pyruvate → (R)-1-cyclopropylethylamine + 1-cyclopropyl methyl ketone + alanine, catalyzed by B. megaterium SC6394]
[00123] Racemic 1-cyclopropylethylamine hydrochloride (23 g, 189 mmoles) was dissolved in 184 mL of water, and 46 mL of 10 N NaOH was added. The solution was cooled on ice, then 40 mL of conc. HCl (11.6 M) was added to adjust the solution to pH 8. Potassium phosphate buffer (230 mL of 1 M, pH 8, diluted with water to 2012 mL) was prepared. Bacillus megaterium SC6394 wet cell paste (230 g of cells stored at -70°C) was thawed and dispersed in 1 L of the buffer with an Ultraturrax T25 homogenizer. The cell suspension was added to a 5-L vessel for a Braun Biostat® B, then sodium pyruvate (41.6 g, 378 mmoles) dissolved in 200 mL of the phosphate buffer was added. The vessel was connected to a Braun Biostat® B equipped with pH and oxygen electrodes and the suspension was stirred at 300 rpm. The amine solution was added to the stirred suspension and the remainder of the buffer was used to rinse the cells, pyruvate and amine into the vessel. The transamination reaction was carried out at 28°C with stirring at 300 rpm; pH 8 was maintained with 10% NaOH and 8.5% H3PO4 feeds, and air was sparged from near the bottom of the vessel at 4 L/min (2 vvm). The air exited from the vessel through a condenser maintained at 4°C.
Samples were assayed by dansylation and chiral HPLC, and the reaction was stopped when the ee of the (R)-amine reached 100%. After 22 h the ee was 16.2%, but the next sample, after 46 h, was >99.9% ee. Initially O2 was at 65% saturation, but it dropped to 0% saturation by 22 h and remained at 0% until the reaction was terminated after 46 h. Initially the NaOH feed was required to maintain pH 8, but by 22 h the H3PO4 feed was required. The final concentration of sodium pyruvate determined by HPLC was 5.2 mg/mL. The cells were removed by centrifugation at 16,900 x g for 20 min, and then the pellets were washed with 20 mL of water and centrifuged again. The combined supernatants (2380 mL) contained 9.94 g of (R)-1-cyclopropylethylamine with an ee of >99.9%.
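The ee readings taken during the reaction can be related to how much of the (S)-enantiomer has been transaminated. The sketch below assumes an idealized resolution that the text does not claim: a racemic start, consumption of the (S)-enantiomer only, and no loss of the (R)-enantiomer (for example through the sparged air). Under those assumptions, ee = (R - S)/(R + S) rearranges to a consumed (S) fraction of 2·ee/(1 + ee). The function names are illustrative.

```python
def s_fraction_consumed(ee_percent: float) -> float:
    """Fraction of the (S)-enantiomer consumed, given the ee of the remaining amine.

    Assumes a racemic start, no loss of the (R)-enantiomer, and ee defined as
    (R - S) / (R + S).
    """
    ee = ee_percent / 100.0
    return 2.0 * ee / (1.0 + ee)


def ee_from_amounts(r_amount: float, s_amount: float) -> float:
    """Enantiomeric excess (%) of the (R)-enantiomer from the two amounts."""
    return 100.0 * (r_amount - s_amount) / (r_amount + s_amount)


# ee of 16.2% reported for the 22 h sample.
consumed_22h = s_fraction_consumed(16.2)
```

Under these assumptions, the 16.2% ee at 22 h corresponds to roughly 28% of the (S)-enantiomer having been consumed, which is consistent with most of the conversion taking place between the 22 h and 46 h samples.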
[00124] The reaction mixture (2.4 L, pH 7.9) was adjusted to pH 13.1 with 10 M NaOH (120 g) and mixed with 182 mL of n-butanol (Note 1). The mixture was distilled at atmospheric pressure, collecting 200-mL fractions, and the fractions were adjusted to pH 4.0-4.5 with sulfuric acid. Assay by dansylation and reversed-phase HPLC (Notes 2 and 3) indicated only a trace of product in the third fraction. The rich fractions were combined and concentrated in vacuo, giving 28.3 g of white solid. This was suspended with sonication in 142 mL of methanol (in which 1-cyclopropylethylamine·1/2H2SO4 is very soluble) and filtered, washing the insoluble material (ammonium sulfate, 3 g) with 142 mL of methanol. The filtrate was concentrated in vacuo, chasing with ethanol and drying to constant weight at room temperature, to give 10.43 g of colorless solid, 1-cyclopropylethylamine·1/2H2SO4, mp 258-260°C (dec), yield 82.2% (uncorrected for purity). The ee, determined by dansylation and chiral HPLC (Notes 2 and 4), was 99.4%.
Notes: 1) The butanol is added to control foaming.
2) Dansylation procedure (adapted from Tapuhi, Y. et al., "Dansylation of amino acids for high-performance liquid chromatographic analysis", Analytical Biochemistry, 115:123-129 (1981)). Reagents: 1.5 mg Dns-Cl per mL in MeCN (5.56 mM), stored in the dark (stable for at least one month), 40 mM Li2CO3 adjusted to pH 9.5 with HCl and 2.5% (~0.4 M) ethanolamine in water. Mix 5 μL of a <40 mM aqueous solution of the amine with 100 μL of lithium carbonate buffer, add 50 μL of the DnsCl-MeCN solution and shake briefly. After 0.5 hours in the dark at room temperature, quench with 5 μL of 2.5% ethanolamine. This rapidly discharges the residual yellow Dns-Cl color.
3) For reversed phase HPLC, dilute the dansylation assay mixture with 1340 μL of mobile phase A and chromatograph on YMC AQ (4.6 x 50 mm) with 20 or 70% B, where A = 0.15 mM H3PO4 in water and B = 0.15 mM H3PO4 in water-MeCN, 1:1, detecting at 248 nm. With 20% B, Dns-NH2 and Dns-NHCH2CH2OH elute in 1.7 and 3.1 min. With 70% B, Dns-1-cyclopropylethylamine elutes in 2.7 min.
4) For chiral HPLC, dilute the dansylation assay mixture with 200 μL of water, extract with EtOAc, wash the extract with water, concentrate the extract to dryness and dissolve the residue in 250 μL of heptane-EtOH, 19:1. Inject as is (to detect trace enantiomer) or after suitable dilution with mobile phase. Chromatography on Chiralpak AD-H, 4.6 x 150 mm, at 1 mL/min with hexanes-methanol, 97:3, elutes the R enantiomer in 9.4 min and the S enantiomer in 10.7 min (detection at 248 nm).
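The 82.2% yield reported in paragraph [00124] can be reproduced from the isolated mass of hemisulfate salt. The sketch below assumes that the yield is calculated against half of the racemic starting material (the most that a resolution can deliver) and uses assumed molar masses that are not given in the text.

```python
AMINE_MW = 85.15   # 1-cyclopropylethylamine, C5H11N (assumed value, g/mol)
H2SO4_MW = 98.08   # sulfuric acid (assumed value, g/mol)
HEMISULFATE_MW = AMINE_MW + H2SO4_MW / 2   # salt mass per mole of amine


def resolution_yield(salt_grams: float, racemate_mmol: float) -> float:
    """Percent yield of one enantiomer isolated as the hemisulfate salt.

    The theoretical maximum is taken as half of the racemic starting material.
    """
    product_mmol = salt_grams / HEMISULFATE_MW * 1000
    return 100.0 * product_mmol / (racemate_mmol / 2)


# 10.43 g of salt isolated from 189 mmoles of racemic amine hydrochloride.
example_2_yield = resolution_yield(10.43, 189)
```

The result is within rounding of the stated 82.2%, which suggests the reported figure is based on the (R)-enantiomer content of the racemate and not on the total amine charged.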
EXAMPLE 3
Purification, Cloning, and Expression of (S)-Transaminase
Purification of (S)-Transaminase (Aminotransferase)
Enzyme Assay
[00125] The transaminase (aminotransferase) activity is determined by HPLC, measuring the consumption of (S)-sec-butylamine after Marfey's derivatization. A typical reaction mixture (0.5 ml) contains 50 mM Tris-HCl buffer pH 7.5, 1 mg/ml sec-butylamine, 100 mM sodium pyruvate and 0.1 mM pyridoxal 5'-phosphate (PLP). The reaction is initiated by addition of 0.1-10 μg of enzyme, and incubated at 28°C, 200 rpm for 2 hours. The reaction is terminated by adding 0.5 ml of ethanol to the mixture. After mixing and centrifugation for 5 min, 10 μl of the supernatant is used for derivatization. Aminotransferase activity is defined as micromoles of (S)-sec-butylamine consumed per minute. The consumption of (S)-sec-butylamine, based on derivatization and HPLC quantitation, was linear with different amounts of enzyme over a 6-hour period.
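The activity definition above (micromoles of (S)-sec-butylamine consumed per minute) can be turned into a small calculation. This is a sketch only: the molar mass is an assumed value, the substrate is assumed to be racemic, and the 1.0 mM drop in (S)-amine is a hypothetical reading chosen for illustration, not a result from the source.

```python
SEC_BUTYLAMINE_MW = 73.14  # g/mol, assumed value


def initial_s_amine_mM(racemate_mg_per_ml: float) -> float:
    """mM of the (S)-enantiomer in a racemic solution of the given mg/ml."""
    return racemate_mg_per_ml / SEC_BUTYLAMINE_MW * 1000 / 2


def activity_units(start_mM: float, end_mM: float,
                   volume_ml: float, minutes: float) -> float:
    """Activity in micromoles of substrate consumed per minute."""
    consumed_umol = (start_mM - end_mM) * volume_ml  # mM x ml = umol
    return consumed_umol / minutes


s0 = initial_s_amine_mM(1.0)
# Hypothetical reading: 1.0 mM of (S)-amine consumed after the 2-hour incubation
# in the 0.5 ml reaction mixture.
units = activity_units(s0, s0 - 1.0, volume_ml=0.5, minutes=120)
```

Dividing the result by the milligrams of protein added would give a specific activity, which is the quantity a fold-purification figure compares between steps.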
Preparation of Crude Extract
[00126] For preparation of crude extract, 30 g (wet weight) of frozen (-70°C) cells was suspended in 300 ml of 50 mM potassium phosphate buffer pH 7.0 containing 1 mM dithiothreitol (DTT), then disintegrated by three passages through a Microfluidizer at 12,000 psi. The disintegrated cells were centrifuged at 50,000 x g for 90 min at 4°C to remove cell debris. The supernatant was collected and stored at -20°C for enzyme purification.
Enzyme Purification
[00127] The transaminase was purified by three chromatographic steps. First, 100 ml of crude extract (850 mg protein) with 1 M ammonium sulfate was loaded onto a butyl-sepharose column (1.5 x 25 cm) equilibrated with 100 ml of 50 mM potassium phosphate buffer pH 7.0 containing 1 mM DTT and 1 M ammonium sulfate. After washing with 100 ml of the same buffer, the enzyme was eluted with a 100 ml linear gradient of ammonium sulfate from 1 M to 0 and an additional 20 ml of water while collecting 3 ml fractions. The active fractions were pooled (30 ml), and then concentrated and desalted by centrifugation with a CentriconPlus (10 kDa cut off, Millipore Co.) to 2 ml. Next, the concentrate was injected onto a hydroxyapatite column (CHT5 bioscale, Bio-Rad) pre-equilibrated with 25 ml of 50 mM potassium phosphate buffer pH 7.0 at flow rate of 1 ml/min. After washing with 15 ml of the buffer, the column was eluted with a 25 ml linear gradient of 50 to 350 mM potassium phosphate at pH 7.0, and 1 ml fractions were collected. The active fractions (3 ml) were combined and concentrated to approximately 2 ml by centrifugation with a CentriconPlus. Finally, the concentrated and desalted enzyme preparation (2 ml) was injected onto a UnoQ column equilibrated with 50 mM potassium phosphate buffer pH 7.0 containing 1 mM DTT and 50 mM NaCl at a flow rate of 1 ml/min. After washing with 12 ml of the buffer, the enzyme was eluted with a 16 ml linear gradient of 50 to 250 mM NaCl, and 0.75 ml fractions were collected and assayed for enzyme activity.
[00128] The transaminase was purified over 446-fold by the three columns, and the purified enzyme displayed a predominant protein band at 50 kDa on SDS-PAGE after staining with SimplyBlue™ SafeStain (Invitrogen, San Diego) and destaining with water. The purification steps and results are summarized in the following table.
Figure imgf000044_0001
Protein Sequencing
[00129] Purified transaminase (aminotransferase) protein was blotted on a PVDF membrane after SDS-PAGE with a 10% NuPAGE gel and MOPS running buffer at 200 V according to the instructions of the manufacturer. The enzyme band at 50 kDa was excised from the membrane. N-terminal sequencing, tryptic digestion and internal peptide sequencing were performed by the Keck Protein Sequencing Facility at Yale University. The N-terminal sequence and five internal peptide sequences were as follows:
N-Terminal: SLTVQKINXEQVKE (SEQ ID NO:3)
Internal #1: LYTNRPLVVTR (SEQ ID NO:4)
Internal #2: SVLIGGVMPNCMR (SEQ ID NO:5)
Internal #3: IGASLNVSR (SEQ ID NO:6)
Internal #4: FVSTGSEAVETALNIAR (SEQ ID NO:7)
Internal #5: GLSSSSLPAGAVLVS (SEQ ID NO:8)
Cloning of the (S)-transaminase (BMTA) from Bacillus megaterium SC6394
[00130] Bacillus megaterium chromosomal DNA was prepared using the procedure described in Ausubel et al., eds., Current Protocols in Molecular Biology, Vol. 2, Section 13.11.2, John Wiley and Sons, New York, NY (1981) with the following modification: the cell pellet was resuspended in 9.5 mL of GTE buffer (50 mM glucose, 25 mM Tris-HCl pH 8.0, 10 mM NaEDTA) containing 2 mg/mL lysozyme and incubated at 37°C for 30 min before adding SDS and Proteinase K.
[00131] A series of mixed oligonucleotide primers were prepared based on the partial peptide sequences obtained for the enzyme:
Figure imgf000046_0001
where "A" = adenosine, "C" = cytosine, "G" = guanosine, "T" = thymidine, "W" = A + T, "R" = A + G, "Y" = C + T, and "I" = deoxyinosine. Primer sets 755 + 757 and 765 + 758 were used to amplify the gene using genomic DNA as target. Combinations of sense and antisense primers were tried with the FailSafe series of PCR buffers (Epicentre Technologies, Madison, WI) and B. megaterium chromosomal DNA as template in 10 μL reactions. Amplification was carried out in a Hybaid PCR Express thermocycler (ThermoSavant, Holbrook, NY). The amplification conditions included incubation at 94°C for 1 min, followed by 30 cycles of 94°C for 0.5 min, 50°C for 0.5 min, and 72°C for 0.5 min. Samples were electrophoresed on a 1.0% agarose gel for 2 hr at 100 V in TAE buffer (0.04 M Trizma base, 0.02 M acetic acid, and 0.001 M EDTA, pH 8.3) containing 0.5 μg/ml ethidium bromide.
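The mixed-base codes defined above determine how many distinct sequences each mixed oligonucleotide contains. The sketch below expands a primer written with those codes. The example primer is hypothetical (the actual primers appear only in the figure), and treating deoxyinosine as a single non-degenerate position is an assumption of this sketch.

```python
from itertools import product

# Degeneracy codes as defined in the text. "I" (deoxyinosine) is kept as one
# position because it is a single nucleotide, not a mixture.
CODES = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "W": "AT", "R": "AG", "Y": "CT", "I": "I",
}


def expand_primer(primer: str) -> list[str]:
    """List every distinct sequence present in a mixed oligonucleotide."""
    choices = [CODES[base] for base in primer.upper()]
    return ["".join(combo) for combo in product(*choices)]


def degeneracy(primer: str) -> int:
    """Number of distinct sequences in the mix."""
    count = 1
    for base in primer.upper():
        count *= len(CODES[base])
    return count


# Hypothetical primer, not one of the oligonucleotides from the source.
example = "AARATIGAYTGG"
variants = expand_primer(example)
```

For this hypothetical primer the two mixed positions (R and Y) give a four-fold degenerate pool; a lower degeneracy generally means a higher effective concentration of each exact-match sequence in the PCR.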
[00132] Strong amplification of a single fragment of the expected molecular weight (based on homology to other transaminases) was obtained in 11 out of the 12 buffers tested using oligonucleotide pair 756 + 758 (ca. 580 base pairs). The fragment was excised from all lanes where it appeared in the gel, purified using the Qiagen Gel Purification Kit (Qiagen) and ligated to vector pTOPO-TA (Invitrogen) according to the manufacturer's recommendations. Chemically-competent Escherichia coli TOP10 cells (Invitrogen) were transformed with 2 μL of the ligation reaction as directed, by electroporation using a BioRad GenePulser unit (BioRad, Hercules, CA) at 2.5 kV, 25 μF, and 200 Ω. SOC medium (250 μL; per liter, 5 g yeast extract, 20 g Bacto-tryptone, 580 mg NaCl, 186 mg KCl, 940 mg MgCl2, 1.2 g MgSO4, and 3.6 g glucose) was added to the transformed cells, which were shaken at 37°C for 1 hr at 225 rpm and then spread onto an LB agar plate containing 50 μg/mL kanamycin sulfate. The plate was incubated for 20 hr at 37°C. The presence of the PCR fragment in pTOPO-TA was confirmed by swirling seven single colonies from the LB kanamycin plate into 10 μL of the PCR reaction mix described above. Thermocycling and gel electrophoresis conditions were as previously described. All colonies tested supported strong amplification of a 580-bp fragment. Plasmid DNA was extracted from 1-mL cultures of four positive colonies using the Eppendorf Fast Plasmid kit, and ca. 250 ng of each was digested with 5 U of restriction endonuclease EcoRI at 37°C for 1 hr. After agarose gel electrophoresis, three of the four plasmids released the expected 580-bp fragment and were submitted for DNA sequencing. BLAST search of the translated sequence revealed strong homology to many 4-aminobutyrate transaminase genes.
[00133] For Southern hybridization, B. megaterium genomic DNA was cleaved with a series of restriction endonucleases (ApaI, BamHI, BglII, EcoRI, HindIII, KpnI, PstI, and SmaI). Reactions contained 3 μg DNA, appropriate buffer, and 20 units of enzyme in 25 μL final volume. Digests were carried out for 3 hr at 37°C, then electrophoresed in a 0.8% TAE-agarose gel at 16 V for 18 hr. The DNA was transferred to Hybond N+ nylon filters under alkaline conditions using the VacuGene vacuum blotting unit (Amersham, Piscataway, NJ). A digoxigenin-labeled PCR fragment was prepared with oligos 756 + 758 using the PCR DIG Probe Synthesis Kit as directed by the vendor (Roche Biomedicals, Indianapolis, IN). Hybridization of the labeled BMTA-specific PCR fragment to the blotted DNA digests was performed in EasyHyb solution (Roche Biomedicals, Indianapolis, IN) for 18 hr at 42°C. Stringency washes were carried out in 0.5 x SSC (20 x SSC = 173.5 g NaCl and 88.2 g sodium citrate, pH 7.0), 0.1% sodium dodecyl sulfate at 68°C for 2 x 15 minutes. Detection using a fluorescein-labeled, anti-digoxigenin antibody was performed as recommended by the manufacturer (Roche). A 4.5 kb HindIII fragment that strongly hybridized to the probe was chosen for further work.
[00134] Twenty μg of chromosomal DNA was cleaved with 100 U of HindIII in a total volume of 200 μL for 2 hr at 37°C and electrophoresed as described above. The region from 4000-5000 base pairs was cut from the gel and the DNA purified using the QIAquick Gel Isolation kit. The isolated DNA was able to support amplification of a 580-base pair fragment by PCR using oligonucleotides 756 + 758. A sample of the isolated chromosomal DNA was ligated to pZerO2 vector DNA (Invitrogen) digested with HindIII at a 5:1 (insert:vector) molar ratio in a total volume of 10 μl at 22°C for 15 min using the Fast Link kit (Epicentre). DNA was precipitated by addition of 100 μL of 1-butanol and pelleted at 13,500 x g in a microcentrifuge for 5 min. Liquid was removed by aspiration, and the DNA was dried in a SpeedVac (Savant Instruments, Farmingdale, NY) for 5 min under low heat. The pellet was resuspended in 4 μl of dH2O. The resuspended DNA was transformed by electroporation into 0.04 ml of E. coli DH10B competent cells (Invitrogen) at 2.5 kV, 25 μF, and 250 Ω. SOC medium was immediately added (0.96 ml) and the tube containing the transformed cells was incubated in a shaker for 1 hr at 37°C and 225 rpm. Colonies containing recombinant plasmids were selected on LB agar plates containing 50 μg/ml kanamycin sulfate (Sigma Chemicals, St. Louis, MO). Sufficient cells to give ca. 10,000 colonies were spread onto a 132 mm Hybond N+ membrane (Amersham Pharmacia) placed on top of LB agar medium containing 50 μg/ml kanamycin and incubated at 37°C for 20 hr. Colonies were replicated onto two fresh filters that were placed on top of LB kanamycin agar medium. The filters were incubated at 37°C for 4 hr. Colonies were lysed in situ by placing the filters on a piece of Whatman 3MM paper (Whatman International, Maidstone, UK) saturated with 0.5 M NaOH for 5 min. The filters were dried for 5 min on Whatman paper, then neutralized on 3MM paper soaked in 1.0 M Tris-HCl, pH 7.5 for 2 min, and dried for 2 min. Membranes were placed on top of 3MM paper saturated with 1.0 M Tris-HCl, pH 7.0/1.5 M NaCl for 10 min. DNA was crosslinked to the filters by exposure to ultraviolet light in a Stratagene UV Stratalinker 2400 set to "auto crosslink" mode (Stratagene, La Jolla, CA). Cell debris was removed from the membranes by immersing in 3X SSC/0.1% SDS and wiping the surface with a wetted Kimwipe® (Kimberly-Clark Co., Roswell, GA), then incubating in the same solution heated to 65°C for 3 hr with agitation. Filters were rinsed with dH2O and used immediately or wrapped in SaranWrap® and stored at 4°C. Hybridization, washing, and detection of the colony blots were performed as described above using the labeled PCR probe. Thirty-six positively hybridizing colonies were inoculated into 1 mL of TB-kanamycin liquid medium in a 2-mL multiwell growth block and shaken at 37°C for 60 hr, 250 rpm. Plasmid DNA was prepared using the Pure Link kit from Invitrogen and resuspended in 25 μL of Tris-HCl pH 8.5. A 1 μL sample of each plasmid was tested for presence of the BMTA gene by PCR as described previously. Fourteen out of the 36 plasmid isolates successfully amplified the 580-bp fragment. Four of these samples were digested with HindIII and electrophoresed on an agarose gel; all revealed the presence of a single 5.0 kb fragment (plus vector sequence), named pZerO2-BMTA.
[00135] Using the previously obtained sequence for the BMTA PCR fragment, additional primers extending towards the 5' and 3' ends of the gene were prepared. These primers were used with the pZerO2 clones containing the 5.0 kb HindIII insert and permitted the identification of an open reading frame with strong homology to other 4-aminobutyrate transaminase genes. The complete nucleotide and corresponding amino acid sequence are as shown in the Figure 1.
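The 5:1 (insert:vector) molar ratio used for the ligation in paragraph [00134] translates into a mass of insert DNA once the fragment lengths are known. The following is a minimal sketch of that conversion. The vector amount and both lengths are hypothetical illustration values; the source gives neither the vector size nor the DNA masses used.

```python
def insert_mass_ng(vector_ng: float, vector_bp: int, insert_bp: int,
                   insert_to_vector_ratio: float) -> float:
    """Mass of insert DNA (ng) giving the requested insert:vector molar ratio.

    For double-stranded DNA, moles are proportional to mass / length, so the
    mass ratio equals the molar ratio scaled by the length ratio.
    """
    return vector_ng * (insert_bp / vector_bp) * insert_to_vector_ratio


# Hypothetical amounts: 10 ng of a 3,300-bp vector and a 4,500-bp insert.
needed_ng = insert_mass_ng(10, 3300, 4500, 5)
```

Because the gel-purified material was a size fraction (4000-5000 bp) and not a single fragment, any such figure would be an estimate based on an average insert length.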
Subcloning of the BMTA Gene into E. coli Expression Vector pBMS2004
[00136] Oligonucleotide primers were prepared containing 1) an NdeI site followed by the first 24 nucleotides of the (S)-transaminase gene (Oligo 763: 5'-GACATATTTAAATCATATGAGTTTAACAGTGCAAAAAATAAAC-3' (SEQ ID NO: 17)) and 2) the last 24 nucleotides of the (S)-transaminase gene (including stop codon) followed by a BamHI restriction site (antisense of the complementary strand; Oligo 764: 5'-GACATATTTAAATCCATGGGTTTAACAGTGCAAAAAATAAAC-3' (SEQ ID NO: 18); restriction sites are underlined). High-fidelity PCR amplification of the B. megaterium BMTA gene was carried out in a 400 μL final volume with Z-Taq DNA polymerase (Takara) in vendor-supplied reaction buffer, 0.2 mM each deoxynucleotide triphosphate (dATP, dCTP, dGTP, and dTTP), 0.4 nM each oligonucleotide, 2.5 U polymerase, and 100 ng pZerO2-BMTA plasmid DNA. The amplification conditions were as previously described. The sample was applied to a 1.0% agarose gel and electrophoresed for 1.5 hr at 100 V. The expected 1300-bp fragment was excised from the gel and purified using the QIAquick Gel Isolation kit. DNA concentration was adjusted to 100 ng/μL.
[00137] Detailed DNA sequence analysis revealed that the BMTA gene contained an internal NdeI restriction site, so simultaneous digestion with this enzyme and BamHI was not possible. Instead, 2 μg of the amplified BMTA fragment was cleaved with 10 U of BamHI for 1 hr at 37°C. Then 10 U of NdeI was added and the sample incubated for an additional 15 min at 37°C. After agarose gel electrophoresis, the 1300-bp fragment was visible, as were two additional fragments of ca. 700 and 600 bp (the latter two fragments representing cleavage at the internal NdeI site). The largest fragment was purified and ligated to plasmid pBMS2004 that was previously digested with NdeI + BamHI. Ligation conditions and transformation of DH10B were as previously described. Kanamycin-resistant colonies were screened for the presence of the BMTA gene by colony PCR using oligos 763 + 766. Plasmid DNA was prepared from cultures of two colonies that contained the insert and were verified to possess the expected 850 bp NdeI fragment.
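The design of Oligo 763 can be checked against the protein sequencing data: the ATG inside the NdeI site should start a reading frame whose translation matches the N-terminal peptide. The sketch below does this with the standard genetic code. It assumes that the initiator methionine is absent from the sequenced protein, which the text does not state, and the variable names are illustrative.

```python
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Standard genetic code, built from the conventional TCAG ordering.
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

NDEI_SITE = "CATATG"
OLIGO_763 = "GACATATTTAAATCATATGAGTTTAACAGTGCAAAAAATAAAC"  # SEQ ID NO: 17
N_TERMINAL = "SLTVQKINXEQVKE"                              # SEQ ID NO: 3


def translate(dna: str) -> str:
    """Translate the complete codons of a DNA string with the standard code."""
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


site_start = OLIGO_763.find(NDEI_SITE)
# The ATG start codon is the last three bases of the NdeI site.
coding = OLIGO_763[site_start + 3:]
peptide = translate(coding)

# Drop the initiator Met before comparing with the sequenced N-terminus
# (assumption: the Met is not present in the mature protein).
matches = N_TERMINAL.startswith(peptide[1:])
```

The translated primer region reads MSLTVQKIN, so after removal of the methionine it agrees with the first eight residues of the N-terminal sequence.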
Expression of the BMTA Protein in E. coli
[00138] pBMS2004-BMTA was transformed into competent E. coli expression strain BL21 by electroporation as described above. For shake flask expression work, a single kanamycin-resistant colony was initially grown in MT5-M2 + kanamycin for 20-24 hr at 30°C, 250 rpm. MT5-M2 medium contains Hy-Pea (Quest International), 2.0%; Tastone 154 (Quest), 1.85%; Na2HPO4, 0.6%; (NH4)2SO4, 0.125%; and glycerol, 4.0%; the pH is adjusted to 7.2 with 10 N NaOH before autoclaving.
[00139] The optical density at 600 nm (OD600) was recorded and fresh medium inoculated with the culture to a starting OD600 of 0.30. The flask was incubated as described above until the OD600 reached ~0.8-1.0. Isopropyl-thio-β-D-galactoside (IPTG) was added from a 1 M filter-sterilized stock in dH2O to a final concentration of 50 μM and the culture allowed to grow for an additional 22 hr. Cells were harvested by centrifugation at 5,000 x g at 4°C in a Beckman JA 5.3 rotor. Medium was discarded and the pellet resuspended in an equal volume of 0.1 M potassium phosphate buffer, pH 7.0. Cells were recentrifuged under identical conditions and the buffer removed. The wet cell weight was recorded and samples were stored frozen at -20°C or used immediately for assays. Gel electrophoresis of a sample of BL21(pBMS2004-BMTA) cells on a 10% Tris-Glycine polyacrylamide gel using SDS-MOPS running buffer (Invitrogen) revealed a novel overexpressed protein band of Mr = 52,000 daltons.
Fermentation of Escherichia coli Strain SC16578 Expressing B. megaterium (S)-transaminase
[00140] Strain SC16578 [E. coli BL21(pBMS2004-BMTA)] was used for the production of B. megaterium SC6394 (S)-transaminase. Enzyme production was the result of IPTG-induced activation of the appropriate promoter.
Inoculum Development
[00141] Two frozen vials of SC16578 were thawed and the entire contents (1.5 ml) were transferred to each of two 500-ml flasks containing 100 ml of MT5 medium (see Table 1). The F1 stage flasks were incubated at 30°C and 250 rpm for 24 hours. From each F1 flask, 5 ml were transferred to two 4-liter flasks containing 1 liter of the same MT5 medium (total: 4 flasks). The four F2 flasks were incubated at 30°C and 230 rpm for an additional 20 hours.
[00142] The four F2 flasks were pooled and the optical density (OD600) was measured. This was done by diluting the broth 20x into un-inoculated MT5 medium, and using the same medium as a blank. For the current example, the inoculum OD600 was 7.5 U/cm (0.375 U/cm at the 20x dilution). The 4 liters of pooled inoculum were then transferred to a 380-liter tank employing a working volume of 250 liters of MT5-M2 medium (see Table 2).
250-L Tank Fermentation
[00143] The recommended fermentation process conditions for a 380-liter tank containing 250 liters of MT5-M2 medium are as follows:
Temperature: 30°C
pH control: none
Agitation: 300 rpm
Aeration: 250 lpm (1 vvm)
Dissolved oxygen: no control
Back pressure: 10 psi
Antifoam: UCON on demand
Induction: IPTG added at a level of 50 μM (ca. log 3.5 hours)
Harvest: ca. 21 hours
[00144] Following inoculation, optical density was measured hourly to determine induction time. Determinations were made by centrifuging 1 ml broth samples at 5000 x g for 10 minutes and re-suspending the cell pellet in 10 ml of water. The 10x dilutions were then measured versus a log M (uninoculated) tank sample prepared in the same manner and used as a blank. For later log samples, pellets were diluted an additional 10x with water for a final dilution of 100x. At an OD600 of ca. 0.8 to 1.2 U/cm, the IPTG solution was added to yield a final concentration of 50 μM. For the current example, the OD was 0.91 U/cm. This addition time should correspond to a CO2 off-gas value of ca. 0.08-0.16%, with a value of 0.12% obtained for the current example.
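The optical density figures in the inoculum and tank procedures are readings on diluted samples scaled back to the undiluted broth. The sketch below shows that scaling together with the induction window given above. The function names are illustrative, and a simple linear correction is assumed.

```python
def undiluted_od(measured_od: float, dilution_factor: float) -> float:
    """Scale a reading taken on a diluted sample back to the original broth."""
    return measured_od * dilution_factor


def ready_to_induce(od600: float, low: float = 0.8, high: float = 1.2) -> bool:
    """True when the culture is inside the stated induction window."""
    return low <= od600 <= high


inoculum_od = undiluted_od(0.375, 20)   # pooled F2 flasks, 20x dilution
induce_now = ready_to_induce(0.91)      # tank OD reported for this example
```

The 0.375 U/cm reading at 20x dilution scales to the 7.5 U/cm reported for the pooled inoculum, and the tank value of 0.91 U/cm falls inside the 0.8 to 1.2 U/cm window.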
Fermentation Final Cell Density and Recovery
Cell density range: OD600 ca. 41 U/cm
Cell recovery: 12.3 kg cell paste
Medium Composition
MT5 Flask Medium
Figure imgf000052_0001
The medium was batched with de-ionized water and adjusted to pH 7.2 with NaOH. One-liter and 100-ml aliquots were dispensed to 4-liter and 500-ml flasks, respectively, and autoclaved at 121°C for 30 minutes. Kanamycin and magnesium sulfate were added to the medium after autoclaving as follows:
For magnesium sulfate, a 24.6% solution was prepared and filter-sterilized through a 0.2 μm Nalgene cellulose nitrate filter. The appropriate quantity of this 1000x solution was then added to each flask (100 μl to 100 ml, 1 ml to 1 liter). For kanamycin, a 5% solution was similarly prepared, filter-sterilized and dispensed to yield the desired final concentration (100 μl to 100 ml, 1 ml to 1 liter).
MT5-M2 Tank Medium
Figure imgf000053_0001
The pH was adjusted to 7.2 with NaOH and the medium was sterilized according to the established tank protocol (30 minutes at 121°C). Kanamycin was added to the medium after autoclaving as follows: 12.5 g were dissolved in 500 ml of de-ionized water, filter-sterilized, and added to a transfer bottle. The kanamycin solution was added to the tank medium just prior to inoculation (log M).
Figure imgf000053_0002
[00145] For IPTG, 2.98 g were dissolved in 500 ml of de-ionized water, filter-sterilized, and added to a transfer bottle. The IPTG solution was added to the tank medium when growth reached an OD600 of ca. 0.8-1.2 U/cm and a CO2 off-gas value of ca. 0.08-0.16%.
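The quantities of IPTG and kanamycin weighed out for the tank can be checked against the target concentrations in the 250-liter batch. This is a sketch under two assumptions: an IPTG molar mass of 238.30 g/mol (not given in the text) and a negligible contribution of the 500-ml stock volumes to the batch volume.

```python
IPTG_MW = 238.30  # g/mol, assumed value


def final_molar_uM(grams: float, molar_mass: float, tank_litres: float) -> float:
    """Final concentration in micromolar after adding a solute to the tank."""
    return grams / molar_mass / tank_litres * 1e6


def final_mg_per_litre(grams: float, tank_litres: float) -> float:
    """Final concentration in mg/L (numerically equal to ug/mL)."""
    return grams * 1000 / tank_litres


# The 500-ml volume of each stock is neglected against the 250-litre batch.
iptg_uM = final_molar_uM(2.98, IPTG_MW, 250)
kanamycin_mg_l = final_mg_per_litre(12.5, 250)
```

With these assumptions, 2.98 g of IPTG gives the stated 50 μM, and 12.5 g of kanamycin gives 50 mg/L, the same level as the 50 μg/ml used on the selection plates.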
Harvest
[00146] Based on a clear, steady decline in the CO2 off-gas profile, the whole broth was harvested by centrifugation at 21 hours (17.5 hours post-induction). During centrifugation, the cells were washed with 50 mM pH 7.0 potassium phosphate buffer. The cell paste was then stored at -70°C.
EXAMPLE 4
Lipase-catalyzed Resolution of (RS)-1-cyclopropylethylamine
Figure imgf000054_0001
[00147] In a 2 L flask, 100 g of immobilized lipase from Candida antarctica (Novozym 435 from Novozymes Inc.) and MTBE (500 ml) were mixed. (RS)-1-cyclopropylethylamine (10 ml = 8.10 g, Eastman Chemical) and ethyl caprate (100 ml, Aldrich) were added, followed by an additional 500 mL of MTBE. The contents were stirred at room temperature (21°C). After 5 minutes, 4 hrs and 22 hrs, the stirring was stopped briefly to take out small samples (100 μl) for analysis. The samples were extracted with 200 μl of 1 N HCl to obtain the residual amine. The ee and amount of residual amine were evaluated by dansylation as described above. The results are shown in the table below.
[00148] HPLC after 22 hrs showed only the (S)-enantiomer of 1-cyclopropylethylamine remaining. The stirring was stopped after 27 hrs. The contents were filtered through filter paper to remove the immobilized enzyme. The solid was washed with MTBE (2 x 100 ml). The combined MTBE extract contained the product of resolution. A sample from this MTBE solution (100 μl) was extracted with 200 μl of 1 N HCl and analyzed by converting to the dansyl derivative.
Figure imgf000054_0002
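The ee values reported for this resolution are derived from the two enantiomer peak areas in the chiral chromatogram. The sketch below shows the calculation and its inverse. The peak areas are hypothetical values chosen only for illustration, and equal detector response for the two dansyl derivatives is assumed.

```python
def ee_percent(major_area: float, minor_area: float) -> float:
    """Enantiomeric excess (%) from the two enantiomer peak areas."""
    return 100.0 * (major_area - minor_area) / (major_area + minor_area)


def minor_fraction_percent(ee: float) -> float:
    """Share of the minor enantiomer (% of total) implied by an ee value."""
    return (100.0 - ee) / 2.0


# Hypothetical peak areas in arbitrary units, chosen only for illustration.
example_ee = ee_percent(major_area=996.5, minor_area=3.5)
minor_at_99_3 = minor_fraction_percent(99.3)
```

An ee of 99.3% corresponds to a minor enantiomer that makes up only 0.35% of the total, so small errors in integrating a peak near the baseline have a large effect on the reported value.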
[00149] The (R)-amine peak in the 22 hr sample and in the final product was not observed in the usual HPLC chromatograms and was calculated by extensive enhancement of the baseline. The actual ee of the (S)-amine in the 22 hr sample and in the final product was most likely higher than 99.3%. About 49% of the starting 1-cyclopropylethylamine remained in the final MTBE extract.
Isolation of (S)-1-cyclopropylethylamine
[00150] The filtrate (854 mL) from the above experiment was stirred with 214 mL of water and the pH adjusted to 2.9 with 10 M H2SO4 (1.44 mL). The aqueous phase was separated and the upper phase washed with 100 mL of water. The combined aqueous extract was washed with 100 mL of MTBE, adjusted to pH 4.10 with 1 M NaOH (0.99 mL) and concentrated in vacuo to a wet residue. On mixing with MeOH and EtOAc the residue crystallized. Concentration and drying in vacuo gave 3.84 g of crude product.
[00151] The crude product was dissolved as much as possible in 11 mL of MeOH and the turbid solution filtered to remove inorganic sulfates (0.12 g), washing with 14 mL of MeOH. EtOAc, 31.5 mL, was added to the filtrate and the mixture heated to reflux to dissolve most of the solid. The product crystallized on cooling to room temperature. The solid was filtered out, washed with 15 mL of EtOAc and dried in vacuo, giving 2.32 g of (S)-1-cyclopropylethylamine, hemi-sulfate salt, mp 248-256°C (dec), ee 99.9%, 17.3 mmoles, 36.3% yield.
[00152] The combined mother liquor and rinse was concentrated in vacuo, giving 1.46 g of residue. This was dissolved in 10 mL of MeOH and 12 mL of EtOAc at reflux, and 13 mL of EtOAc was added slowly to the boiling mixture to crystallize the product. The mixture was cooled to room temperature, filtered and the solid washed with 10 mL of EtOAc and dried in vacuo to give 1.18 g of additional product, mp 258-260°C (dec), ee 99.8%, 8.8 mmoles, 18.5% yield. Concentration of the mother liquor/rinse gave 0.27 g of residue, ee 99.6%, 4.2% yield. (Total isolated yield 59%).
Isolation of (R)-N-1-cyclopropylethyl decanamide
[00153] TLC of the MTBE solution after extraction of the amine with sulfuric acid (silica gel with DCM-MeOH, 19:1) showed a mixture of decanoic acid (Rf 0.47), N-1-cyclopropylethyl decanamide (Rf 0.74) and ethyl decanoate (Rf 0.85) (Rydon-Smith detection: the ester and acid give light zones on a dark background after a few minutes whereas the amide gives a black zone). The MTBE solution was extracted with water, adjusting the mixture to pH 12.3 with NaOH, to remove the acid. Concentration of the resulting organic phase gave 45.2 g of the ester-amide mixture as a waxy solid. This was dissolved in dichloromethane (DCM) and applied to a 5 x 26-cm column of silica gel (230 g of silica gel slurry packed in DCM). Elution with 1 L of DCM swept the ester (Rf 0.62 on silica gel with DCM vs. Rf 0.19 for the amide) from the column. Continued elution with 600 mL of MTBE eluted the amide (Rf 0.69 on silica gel with MTBE) contaminated with a little decanoic acid. The amide-rich effluent (200 mL) was extracted with 50 mL of 1 M NaOH to remove the acid. The upper phase was washed with several small portions of water and concentrated in vacuo, giving 8.94 g (37.3 mmoles, 78.5% yield) of (R)-N-1-cyclopropylethyl decanamide as a colorless crystalline solid, mp 67.5-70.0°C, ee 85%. Recrystallization of a 110-mg sample from 1.1 mL of MeCN gave 74 mg of purified (R)-N-1-cyclopropylethyl decanamide, mp 69.5-70.5°C, ee 95.5%.
[00154] A reference sample of (RS)-N-1-cyclopropylethyl decanamide was prepared from (RS)-1-cyclopropylethylamine and decanoyl chloride. It melted at 52-53°C, indicating that the racemate is a conglomerate. Chiral chromatography of (RS)-N-1-cyclopropylethyl decanamide was done on a 50 x 4.6 mm Chiralpak AD-H (5 μm) column, eluting at 1 mL/min with hexanes-MeOH, 99:1, and monitoring at 200 nm. The (R)-amide eluted at 7.5 min and the (S)-amide at 8.0 min.
HPLC Analysis of (RS)-1-Cyclopropylethylamine After Dansylation
Figure imgf000056_0001
[00155] A solution of dansyl chloride in acetonitrile at a concentration of 2 mg/ml was made. A solution of 50 mM sodium carbonate in water was prepared and the pH was adjusted to 9.5 by addition of HCl. (RS)-1-cyclopropylethylamine hydrochloride salt (20 mg) was suspended in 10 ml of tert-amyl alcohol. Triethylamine (20 μl) was added to the suspension and mixed in the vortex mixer, whereupon everything dissolved. The solution was filtered through a 0.2 μm filter. To 100 μl of the above (RS)-1-cyclopropylethylamine solution, 0.9 ml of the sodium carbonate solution was added and mixed. For the control reaction, 100 μl of tert-amyl alcohol only was used. The dansyl chloride solution (1 ml) was added to the solution and mixed by vortexing. The mixture was kept at room temperature for 30 minutes with occasional mixing. The reaction was quenched by addition of 0.1 ml of a 0.2 M NH4OH solution. After 15 minutes, the color had changed from yellow to colorless. The solution was filtered through a 0.2 μm filter for HPLC.
[00156] Reversed phase HPLC to determine the extent of conversion was done as follows:
Column: YMC Pack Pro C18, 150 x 4.6 mm, 3 μm, Waters
Solvent: A (0.05% TFA in Water:Methanol 80:20), B (0.05% TFA in Acetonitrile:Methanol 80:20)
Gradient: from 0% B to 100% B in 20 minutes
Flow rate: 1 ml/min
Temperature: 40°C
Detection: UV, 220 nm
[00157] HPLC of the control (no 1-cyclopropylethylamine) showed peaks at 2.36 and 6.06 min. The dansyl derivative of (RS)-1-cyclopropylethylamine showed a strong peak at 13.27 min in addition to the above peaks for the control. The peak at 13.27 min was due to the dansyl derivative of 1-cyclopropylethylamine. LC-MS corresponding to this peak showed a positive ion at 319.3 (M+H), confirming the dansyl derivative of 1-cyclopropylethylamine.
[00158] The enantiomeric compositions of the dansyl derivatives were analyzed on the chiral reversed phase column Chiralpak AS-RH, 150 x 4.6 mm, 5 μm, Chiral Technologies Inc., using the isocratic solvent combination of 76% A (0.05% TFA in Water:Methanol 80:20) and 24% B (0.05% TFA in Acetonitrile:Methanol 80:20) at 30 °C with detection by UV at 220 nm. The retention times are: dansyl derivative of (R)-1-cyclopropylethylamine, 33.66 min; dansyl derivative of (S)-1-cyclopropylethylamine, 36.67 min.
EXAMPLE 5 Enzymatic Resolution of (RS)-1-Cyclopropyl-2-methoxyethylamine
Figure imgf000058_0001
[00159] (RS)-1-Cyclopropyl-2-methoxyethylamine hydrochloride (100 mg) was suspended in a mixture of MTBE (49 ml) and triethylamine (1 ml). The suspension was stirred gently in a closed container for 16 hours. The mixture was filtered. The solution was used for enzymatic resolution as described below. The insoluble solid was triethylamine hydrochloride and was discarded.
[00160] To one vial, 1 ml of the above solution [containing (RS)-1-cyclopropyl-2-methoxyethylamine from 2 mg amine hydrochloride] was added. Ethyl caprate (20 μl) was added. Cal B lipase, lyophile from Candida antarctica (50 mg, from Biocatalytics, Inc.) was added to the vial. The vial was placed in the well of a multiwell plate and shaken at 500 rpm at 25 °C.
[00161] After 24 and 120 hrs, 50 μl samples were withdrawn for dansylation and analysis. Acetonitrile (0.5 ml) and 50 mM sodium carbonate solution (0.5 ml, pH 9.5) were added to the sample, followed by 0.5 ml of dansyl chloride solution. The reaction mixtures were mixed on a microplate shaker at 300 rpm for 30 min. A solution of 200 mM NH4OH (0.1 ml) was added and the mixtures were again mixed for 60 min. Each solution was filtered and analyzed by HPLC as described above, and the results are shown in the table below. Cal B lipase, lyophilized enzyme from Candida antarctica, showed enantiospecificity. The remaining amine showed an enantiomeric excess (ee) of 91.3% for one amine enantiomer after 120 hours.
Figure imgf000059_0002
HPLC Analysis of (RS)-1-Cyclopropyl-2-methoxyethylamine After Dansylation
Figure imgf000059_0001
[00162] Dansyl chloride in CH3CN (2 mg/ml) and 50 mM Na2CO3 (pH 9.5) were prepared as described before.
[00163] (RS)-1-Cyclopropyl-2-methoxyethylamine hydrochloride salt (18 mg) was suspended in 18 ml tert-amyl alcohol. Triethylamine (27 μl) was added to the suspension. The mixture was stirred on a magnetic stirrer for 30 minutes, whereby everything dissolved.
[00164] To 100 μl of the above (RS)-1-cyclopropyl-2-methoxyethylamine solution, 1 ml of acetonitrile and 1 ml of Na2CO3 solution were added and mixed. For the blank reaction, 100 μl of tert-amyl alcohol only was used. Dansyl chloride solution (1 ml) was added to the solution. The mixture was placed on an end-over-end mixer at room temperature for 30 minutes. The reaction was quenched by addition of 0.1 ml of a 0.2 M NH4OH solution. After 15 minutes of mixing on the same mixer, the color had changed from yellow to colorless. The solution was filtered through a 0.2 μm filter for HPLC.
[00165] Reversed phase HPLC to determine the extent of conversion was done as follows:
Column: YMC pack Pro C18, 150 x 4.6 mm, 3 μm, Waters
Solvent: A (0.05% TFA in Water:Methanol 80:20), B (0.05% TFA in Acetonitrile:Methanol 80:20)
Gradient: from 0% B to 100% B in 20 minutes
Flow Rate: 1 ml/min
Temperature: 40 °C
Detection: UV, 220 nm
[00166] The control [no (RS)-1-cyclopropyl-2-methoxyethylamine] showed peaks at 2.38 and 6.37 min. The dansyl derivative of (RS)-1-cyclopropyl-2-methoxyethylamine showed a peak at 12.46 min in addition to the above peaks for the control. The peak at 12.46 min was for the dansyl derivative of (RS)-1-cyclopropyl-2-methoxyethylamine. The enantiomeric composition of the dansyl derivatives was determined by HPLC on a chiral reversed phase column Chiralpak AS-RH, 150 x 4.6 mm, 5 μm, Chiral Technologies Inc., using an isocratic mixture of 76% solvent A (0.05% TFA in Water:Methanol 80:20) and 24% solvent B (0.05% TFA in Acetonitrile:Methanol 80:20) for a total run time of 45 minutes at a flow rate of 0.5 ml/min, a temperature of 30 °C, and detection by UV at 220 nm. The peaks for the dansyl derivatives of the two enantiomers of (RS)-1-cyclopropyl-2-methoxyethylamine were at 20.18 min and 23.62 min with area ratio 49.59:50.41. The peaks showed baseline separation.
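The ee values quoted in this example follow directly from the chiral HPLC peak areas. The sketch below uses the standard definition of enantiomeric excess. The function names are hypothetical, and equal detector response for the two dansylated enantiomers is assumed.

```python
def enantiomeric_excess(area_major, area_minor):
    """ee (%) from the two enantiomer peak areas."""
    return abs(area_major - area_minor) / (area_major + area_minor) * 100


def area_ratio_from_ee(ee_percent):
    """Return (major %, minor %) of total peak area for a given ee."""
    major = (100 + ee_percent) / 2
    return major, 100 - major


# Racemic reference: area ratio 49.59:50.41.
racemate_ee = enantiomeric_excess(50.41, 49.59)
# Remaining amine after 120 h: ee 91.3%.
resolved = area_ratio_from_ee(91.3)

print(f"racemate ee = {racemate_ee:.2f} %")
print(f"91.3 % ee corresponds to {resolved[0]:.2f}:{resolved[1]:.2f}")
```

The racemic reference therefore shows an apparent ee below 1%, which gives a sense of the integration error of the method, while an ee of 91.3% corresponds to an enantiomer ratio of roughly 96:4.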
EXAMPLE 6 Enzymatic Resolution of Sec-butylamine using Transaminase from B. megaterium Expressed in E. coli
750 g (racemic amine, free base), 5.13 moles of R-amine, H3PO4 to pH 8; 750 g Na pyruvate (6.82 moles)
E. coli SC16578
Figure imgf000061_0002
1) Distill the amine at pH 12-13 and neutralize the distillate with H2SO4
2) Crystallize from water - n-BuOH
Figure imgf000061_0003
(R)-sec-butylamine
[00167] Two L of 10 mM potassium phosphate buffer, pH 7, was added to 375 g frozen E. coli SC16578. After thawing, the cells were dispersed with an Ultraturrax homogenizer. Ten L water was added to a jacketed 20-L reactor, equipped with a stirrer and pH electrode. Stirring was started at 100 rpm, 25 °C. Conc. H3PO4 (345 ml, 5.07 moles) was added to the reactor followed by 750 g racemic sec-butylamine (5.127 moles) and stirring was continued. The pH was adjusted to 8.2 with conc. H3PO4 and/or 25% NaOH and the solution was allowed to cool to 30 °C. Sodium pyruvate (750 g, 6.816 moles) was added to the reactor, and rinsed in with 100 ml water. The E. coli suspension was added to the reactor and rinsed in with 200 ml water. The pH was adjusted to 8.0 with conc. H3PO4 and/or 25% NaOH, and the final volume was brought to 15 L with water. The reaction was run at 30 °C, 100 rpm, pH 8. No further adjustment or control of pH was necessary. The reaction was continued for 23 h until the ee was >99%. The mixture was adjusted to pH 12-13 with 50% (19 N) NaOH (~1500 g). Dow Corning antifoam, 100 mL, was added and the jacket temperature was increased to 130 °C to distill at 1 atm. SAG 5693 antifoam (100 ml) was added as needed to control foaming. Distillate was collected until the temperature was stable at ~100 °C. Additional 500-mL fractions were collected, titrating each with 1 N sulfuric acid to pH 4-6 until the quantity of sec-butylamine in the distillate was insignificant. All distillate fractions containing significant sec-butylamine were combined, cooled in an ice bath and titrated to pH 5-7 with 30% (7.45 N) sulfuric acid.
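The charge quantities in paragraph [00167] can be related to each other with a brief calculation. In this sketch the molar masses are computed from standard atomic masses and are assumptions added here. The result suggests that the 5.127 moles quoted for the amine corresponds to one enantiomer, that is, half of the 750 g racemate.

```python
MW_SEC_BUTYLAMINE = 73.14   # C4H11N, g/mol (computed from standard atomic masses)
MW_NA_PYRUVATE = 110.04     # C3H3NaO3, g/mol (computed likewise)

amine_g, pyruvate_g, volume_l = 750.0, 750.0, 15.0

racemate_mol = amine_g / MW_SEC_BUTYLAMINE
enantiomer_mol = racemate_mol / 2                 # each enantiomer is half the racemate
pyruvate_mol = pyruvate_g / MW_NA_PYRUVATE
pyruvate_equiv = pyruvate_mol / enantiomer_mol    # relative to the enantiomer consumed
loading_g_per_l = amine_g / volume_l              # substrate loading at final volume

print(f"racemate {racemate_mol:.2f} mol, each enantiomer {enantiomer_mol:.3f} mol")
print(f"pyruvate {pyruvate_mol:.3f} mol = {pyruvate_equiv:.2f} equiv")
print(f"loading {loading_g_per_l:.0f} g/L")
```

On these assumptions the pyruvate is charged at about 1.3 equivalents relative to the enantiomer that the transaminase consumes, and the substrate loading is 50 g/L at the 15 L final volume.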
[00168] The neutralized distillate was concentrated in vacuo, adding n-BuOH to remove water azeotropically. Water in the supernate was monitored by KF. When the water level was 0.5% or lower, the mixture was filtered, and the cake was washed with 500 mL of n-BuOH. The cake was partially dried by suction on the filter and then dried in vacuo at room temperature to constant weight, giving 585 g of (R)-sec-butylamine·1/2 H2SO4 (93.3% yield, 99.18% ee).
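The isolated yield of the hemisulfate salt can be reproduced from the masses given. This sketch assumes molar masses computed from standard atomic masses and takes 5.127 moles of (R)-amine as the theoretical maximum, since only one enantiomer of the racemate can be recovered in a resolution.

```python
MW_AMINE = 73.14                     # sec-butylamine, C4H11N, g/mol
MW_H2SO4 = 98.07                     # sulfuric acid, g/mol
MW_SALT = MW_AMINE + MW_H2SO4 / 2    # amine . 1/2 H2SO4, per mole of amine

salt_g = 585.0
theoretical_mol = 5.127              # (R)-amine available in the racemic charge

salt_mol = salt_g / MW_SALT
yield_pct = salt_mol / theoretical_mol * 100
minor_pct = (100 - 99.18) / 2        # share of the (S)-enantiomer at 99.18% ee

print(f"salt: {salt_mol:.3f} mol, yield {yield_pct:.1f} %")
print(f"(S)-enantiomer content: {minor_pct:.2f} %")
```

The calculation gives about 93.4%, in line with the reported 93.3% once rounding of the molar masses is taken into account.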
[00169] The contents of all patents, patent applications, published articles, books, reference manuals, texts and abstracts cited herein are hereby incorporated by reference in their entirety to more fully describe the state of the art to which the invention pertains. The foregoing description and examples are not intended in any respect to limit the scope of the potential embodiments of the claimed invention.

Claims

WHAT IS CLAIMED IS:
1. A process for the preparation of a compound of Formula Ia or Ib
Figure imgf000063_0001
Ia Ib wherein R1 is alkyl, aryl, or heterocyclic; and R2 is cycloalkyl or alkyl; comprising resolving a racemic compound of Formula I
Figure imgf000063_0002
I by reaction catalyzed by an enzyme produced by a microorganism from the group consisting of Bacillus, Candida or Pseudomonas.
2. The process of claim 1, wherein the enzyme is a transaminase or lipase.
3. The process of claim 1, wherein the reaction catalyzed by an enzyme is carried out either by: introducing a racemic compound of formula I into a medium in which the microorganism is being fermented to form a reaction mixture in which the enzyme is concurrently being expressed and catalytically reacted with the racemic compound; or fermenting the microorganism until sufficient growth is realized; and introducing the racemic compound to form a reaction mixture in which the racemic compound of formula I is catalytically reacted with the enzyme.
4. The process of claim 1, wherein the amount of the racemic compound of formula I added to the reaction mixture is up to about 50 g/L of the reaction mixture.
5. The process of claim 1, further comprising isolating, and optionally purifying, the compound of formula Ia or Ib.
6. The process of claim 1, wherein the reaction with the enzyme is carried out by reacting the racemic compound of formula I with enzyme that was previously isolated and optionally purified before contacting with the racemic compound.
7. The process of claim 6, wherein the enzyme is derived from cell extracts.
8. The process of claim 1, wherein the enzyme is expressed by a plasmid transformed into E. coli host cells.
9. The process of claim 1, wherein the enzyme is obtained from Bacillus megaterium.
10. The process of claim 1, wherein the enzyme is obtained from Bacillus megaterium strain SC6394.
11. The process of claim 1, wherein the enzyme is encoded by a polynucleotide sequence comprising SEQ ID NO: 1.
12. The process of claim 1, wherein the enzyme comprises an amino acid sequence comprising SEQ ID NO:2.
13. The process of claim 2, wherein the enzyme of step (a) provides a reaction yield of greater than 42% by weight of the compound of formula Ia or Ib, based on the weight of the input racemic amine of formula I.
14. The process of claim 13, wherein the process provides the compound of formula Ia or Ib in an enantiomeric excess greater than 95%.
15. The process of claim 1, wherein the reaction with the enzyme is carried out at a pH of between about 5.0 and about 9.0.
16. The process of claim 1, wherein the compound of formula Ia or Ib is
Figure imgf000065_0001
Ia* Ib* and the compound of formula I is
Figure imgf000065_0002
I*.
17. A process for the preparation of an enzyme for the preparation of a compound of formula Ia* or Ib*
Figure imgf000065_0003
from a compound of formula I*
Figure imgf000066_0001
I* comprising
(a) either (i) providing a microorganism selected from the group consisting of Bacillus, Candida, E. coli and Pseudomonas in a growth medium under conditions which allow for expression of an enzyme, or
(ii) introducing a gene which encodes for the enzyme into a host microorganism for recombinant expression, introducing the host microorganism in a growth medium under conditions which allow for expression of the enzyme and allowing it to grow and express the enzyme;
(b) optionally, extracting the enzyme from the growth medium; and
(c) optionally, purifying the enzyme.
18. The process of claim 17, wherein the process of extracting the enzyme comprises lysing the cells of the microorganism and isolating the enzyme.
19. The process of claim 17, wherein the process of purifying the enzyme comprises ion-exchange, hydrophobic and hydroxyapatite chromatography.
20. The process of claim 17, wherein the enzyme is a transaminase or lipase.
21. The process of claim 20, wherein the transaminase comprises an amino acid sequence comprising SEQ ID NO:2.
22. The process of claim 21, wherein the amino acid sequence is encoded by a polynucleotide sequence comprising SEQ ID NO: 1.
23. A process for preparing a compound of formula Ia* or Ib* from a compound of formula I* comprising
(a) either
(i) providing a microorganism in a growth medium under conditions which allow for expression of an enzyme, or
(ii) introducing a gene which encodes for the enzyme into a host microorganism for recombinant expression, introducing the host microorganism in a growth medium under conditions which allow for expression of the enzyme and allowing it to grow and express the enzyme; and (b) catalytically reacting the enzyme with a compound of formula I* to produce the desired compound.
24. The process of claim 23, wherein the enzyme is a transaminase.
25. The process of claim 24, wherein the transaminase comprises an amino acid sequence comprising SEQ ID NO:2.
26. The process of claim 25, wherein the amino acid sequence is encoded by a polynucleotide sequence comprising SEQ ID NO: 1.
27. A process for the preparation of a compound of formula Ia* which comprises reacting a compound of the formula I* with Novozym 435 and ethyl caprate in methyl t-butyl ether to afford Compound Ia*.
28. A process for the preparation of a compound of formula Ib* which comprises reacting a compound of the formula I* with enzyme in the presence of potassium phosphate and sodium pyruvate to afford Compound Ib*.
29. An isolated nucleic acid molecule comprising a polynucleotide sequence encoding a polypeptide comprising SEQ ID NO:2.
30. The process of claim 23, wherein the compound of the formula Ib* is (R)-sec-butylamine and the compound of formula I* is sec-butylamine.
31. The process of claim 30, wherein the enzyme is a transaminase from B. megaterium expressed in E. coli.
32. The process of claim 28, wherein the compound of the formula Ib* is (R)-sec-butylamine and the compound of formula I* is sec-butylamine.
33. A process for the preparation of (R)-sec-butylamine which comprises reacting racemic sec-butylamine with a transaminase from B. megaterium expressed in E. coli in the presence of potassium phosphate and sodium pyruvate to afford (R)-sec-butylamine.
PCT/US2008/068951 2007-07-02 2008-07-02 Stereoselective resolution of racemic amines WO2009006492A2 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US94749107P 2007-07-02 2007-07-02
US60/947,491 2007-07-02

Publications (2)

Publication Number Publication Date
WO2009006492A2 true WO2009006492A2 (en) 2009-01-08
WO2009006492A3 WO2009006492A3 (en) 2009-05-28

Family

ID=40226813

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2008/068951 WO2009006492A2 (en) 2007-07-02 2008-07-02 Stereoselective resolution of racemic amines

Country Status (1)

Country Link
WO (1) WO2009006492A2 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107287255A (en) * 2016-03-31 2017-10-24 南京诺云生物科技有限公司 The new application of Pseudomonas veronii CIP104663 albumen

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2006063336A2 (en) * 2004-12-10 2006-06-15 Cambrex North Brunswick, Inc. Thermostable omega-transaminases

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
GASTALDI ET AL: "Dynamic kinetic resolution of amines involving biocatalysis and in situ free radical mediated racemization" ORGANIC LETTERS, vol. 9, March 2007 (2007-03), pages 837-839, XP002521885 *
GOSWAMI ET AL: "Enzymatic resolution of sec-butylamine" TETRAHEDRON: ASYMMETRY, vol. 16, 2005, pages 1715-1719, XP004861958 *
HANSON ET AL: "Preparation of (R)-amines from racemic amines with an (S)-amine transferase from Bacillus megaterium" ADVANCED SYNTHESIS & CATALYSIS, vol. 350, 9 May 2008 (2008-05-09), pages 1367-1375, XP002521356 *

Also Published As

Publication number Publication date
WO2009006492A3 (en) 2009-05-28

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 08781253

Country of ref document: EP

Kind code of ref document: A2

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 08781253

Country of ref document: EP

Kind code of ref document: A2