WO2006128590A1

WO2006128590A1 - Hydrolases, nucleic acids encoding them and methods for making and using them

Info

Publication number: WO2006128590A1
Application number: PCT/EP2006/004736
Authority: WO
Inventors: Daniel Mink; Joannes Gerardus Theodorus Kierkels; Grace Desantis; Robert Farwell; Lawrence Whipple; Mark Burk
Original assignee: Dsm Ip Assets B.V.
Priority date: 2005-05-31
Filing date: 2006-05-18
Publication date: 2006-12-07

Abstract

The invention provides hydrolases, polynucleotides encoding them, and methods of making and using these polynucleotides and polypeptides. In one aspect, the invention is directed to polypeptides, e.g., enzymes, having a hydrolase activity, e.g., an esterase, acylase, lipase, phospholipase (e.g., phospholipase A, B, C and/or D activity, patatin activity, lipid acyl hydrolase (LAH) activity) or protease activity, including thermostable and thermotolerant hydrolase activity, and polynucleotides encoding these enzymes, and making and using these polynucleotides and polypeptides. The hydrolase activities of the polypeptides and peptides of the invention include esterase activity, lipase activity (hydrolysis of lipids), acidolysis reactions (to replace an esterified fatty acid with a free fatty acid), transesterification reactions (exchange of fatty acids between triglycerides), ester synthesis, ester interchange reactions, phospholipase activity and protease activity (hydrolysis of peptide bonds). The polypeptides of the invention can be used in a variety of pharmaceutical, agricultural and industrial contexts, including the manufacture of cosmetics and nutraceuticals. In another aspect, the polypeptides of the invention are used to synthesize enantiomerically pure chiral products.

Description

HYDROLASES, NUCLEIC ACIDS ENCODING THEM AND METHODS FOR MAKING AND USING THEM

This invention relates to molecular and cellular biology and biochemistry. In one aspect, the invention provides hydrolases, polynucleotides encoding them, and methods of making and using these polynucleotides and polypeptides. In one aspect, the invention is directed to polypeptides, e.g., enzymes, having a hydrolase activity, e.g., an esterase, acylase, lipase, phospholipase or protease activity, including thermostable and thermotolerant hydrolase activity, and polynucleotides encoding these enzymes, and making and using these polynucleotides and polypeptides. The hydrolase activities of the polypeptides and peptides of the invention include esterase activity, lipase activity (hydrolysis of lipids), acidolysis reactions (to replace an esterified fatty acid with a free fatty acid), transesterification reactions (exchange of fatty acids between triglycerides), ester synthesis, ester interchange reactions, phospholipase activity (e.g., phospholipase A, B, C and/or D activity, patatin activity, lipid acyl hydrolase (LAH) activity) and protease activity (hydrolysis of peptide bonds). The polypeptides of the invention can be used in a variety of pharmaceutical, agricultural and industrial contexts, including the manufacture of cosmetics and nutraceuticals. In another aspect, the polypeptides of the invention are used to synthesize enantiomerically pure chiral products. In one embodiment in particular, the invention relates to a process for the preparation of enantiomerically enriched compounds useful as building blocks in the synthesis of a number of pharmaceutically active compounds.

BACKGROUND

SUMMARY

The invention provides polypeptides, for example, enzymes and catalytic antibodies, having a hydrolase activity, e.g., an esterase, acylase, lipase, phospholipase or protease activity, including thermostable and thermotolerant hydrolase activities, and enantiospecific activities, and polynucleotides encoding these polypeptides, including vectors, host cells, transgenic plants and non-human animals, and methods for making and using these polynucleotides and polypeptides.

The invention provides isolated or recombinant nucleic acids comprising a nucleic acid sequence having at least about 50%, 51%, 52%, 53%, 54%, 55%, 56%, 57%, 58%, 59%, 60%, 61%, 62%, 63%, 64%, 65%, 66%, 67%, 68%, 69%, 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or more, or complete (100%) sequence identity to an exemplary nucleic acid of the invention over a region of at least about 10, 15, 20, 25, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 100, 125, 150, 175, 200, 250, 300, 350, 400, 450, 500, 550, 600, 650, 700, 750, 800, 850, 900, 950, 1000, 1050, 1100, 1150, 1200, 1250, 1300, 1350, 1400, 1450, 1500, 1550 or more residues, wherein the nucleic acid encodes at least one polypeptide having a hydrolase activity, e.g., an esterase, acylase, lipase, phospholipase or protease activity. The sequence identities can be determined by analysis with a sequence comparison algorithm or by visual inspection. Exemplary nucleic acids of the invention include isolated or recombinant nucleic acids comprising a nucleic acid sequence as set forth in SEQ ID NO:1, SEQ ID NO:3, SEQ ID NO:5, SEQ ID NO:7, SEQ ID NO:9, SEQ ID NO:11 or SEQ ID NO:13, SEQ ID NO:15, SEQ ID NO:17, SEQ ID NO:19, SEQ ID NO:21, SEQ ID NO:23, SEQ ID NO:25 and subsequences thereof, e.g., at least about 10, 15, 20, 25, 30, 35, 40, 45, 50, 75, 100, 150, 200, 250, 300, 350, 400, 450, 500, 550, 600, 650, 700, 750, 800, 850, 900, 950, 1000, 1050, 1100, 1150, 1200, 1250, 1300, 1350, 1400, 1450, 1500 or more residues in length, or over the full length of a gene or transcript.

Exemplary nucleic acids of the invention also include isolated or recombinant nucleic acids encoding a polypeptide having a sequence as set forth in SEQ ID NO:2, SEQ ID NO:4, SEQ ID NO:6, SEQ ID NO:8, SEQ ID NO:10, SEQ ID NO:12 or SEQ ID NO:14, SEQ ID NO:16, SEQ ID NO:18, SEQ ID NO:20, SEQ ID NO:22, SEQ ID NO:24 or SEQ ID NO:26 and subsequences thereof and variants thereof. In one aspect, the polypeptide has a hydrolase activity, e.g., an esterase, acylase, lipase, phospholipase or protease activity. In one aspect, the hydrolase activity is a regioselective and/or chemoselective activity.

In one aspect, the sequence comparison algorithm is a BLAST version 2.2.2 algorithm where a filtering setting is set to blastall -p blastp -d "nr pataa" -F F, and all other options are set to default.
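The identity thresholds above can be illustrated with a minimal sketch that checks a percent-identity cutoff over a minimum region for two sequences that have already been aligned. This is not the BLAST 2.2.2 procedure; the function names `count_matches`, `percent_identity` and `meets_identity_claim`, the gap character `-`, and the decision to count a gap opposite a residue as a mismatch are assumptions made purely for illustration.

```python
from typing import Tuple


def count_matches(aligned_a: str, aligned_b: str) -> Tuple[int, int]:
    """Return (matching columns, compared columns) for two pre-aligned sequences."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    matches = 0
    compared = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" and b == "-":
            continue  # a column with gaps in both sequences carries no information
        compared += 1
        if a == b:
            matches += 1  # a gap opposite a residue never matches (assumption)
    return matches, compared


def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over the compared columns of the alignment."""
    matches, compared = count_matches(aligned_a, aligned_b)
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * matches / compared


def meets_identity_claim(aligned_a: str, aligned_b: str,
                         min_identity: float = 50.0,
                         min_region: int = 10) -> bool:
    """True if identity >= min_identity over a region of at least min_region residues."""
    matches, compared = count_matches(aligned_a, aligned_b)
    if compared < min_region:
        return False
    return 100.0 * matches / compared >= min_identity
```

For example, `meets_identity_claim("ACGTACGTAC", "ACGTACGTAA", min_identity=90.0)` compares ten columns with nine matches. In practice the alignment itself would come from a sequence comparison algorithm, and the result depends on that algorithm's scoring and filtering settings.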

Another aspect of the invention is an isolated or recombinant nucleic acid including at least 10 consecutive bases of a nucleic acid sequence of the invention, sequences substantially identical thereto, and the sequences complementary thereto. The invention provides isolated or recombinant nucleic acids comprising a sequence that hybridizes under stringent conditions to a nucleic acid of the invention, e.g., an exemplary nucleic acid of the invention comprising a sequence as set forth in SEQ ID NO:1, SEQ ID NO:3, SEQ ID NO:5, SEQ ID NO:7, SEQ ID NO:9, SEQ ID NO:11, SEQ ID NO:13, SEQ ID NO:15, SEQ ID NO:17, SEQ ID NO:19, SEQ ID NO:21, SEQ ID NO:23 or SEQ ID NO:25, or fragments or subsequences thereof. In one aspect, the nucleic acid encodes a polypeptide having a hydrolase activity, e.g., an esterase, acylase, lipase, phospholipase or protease activity. The nucleic acid can be at least about 10, 15, 20, 25, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 100, 125, 150, 175, 200, 250, 300, 350, 400, 450, 500, 550, 600, 650, 700, 750, 800, 850, 900, 950, 1000, 1050, 1100, 1150, 1200, 1250, 1300, 1350, 1400, 1450, 1500 or more residues in length or the full length of the gene or transcript. In one aspect, the stringent conditions include a wash step comprising a wash in 0.2X SSC at a temperature of about 65°C for about 15 minutes.

The invention provides methods of amplifying a nucleic acid encoding a polypeptide having a hydrolase activity, e.g., an esterase, acylase, lipase, phospholipase or protease activity, comprising amplification of a template nucleic acid with an amplification primer sequence pair capable of amplifying a nucleic acid sequence of the invention, or fragments or subsequences thereof.

The invention provides expression cassettes comprising a nucleic acid of the invention or a subsequence thereof. In one aspect, the expression cassette can comprise the nucleic acid that is operably linked to a promoter.

The invention provides cloning vehicles comprising an expression cassette (e.g., a vector) of the invention or a nucleic acid of the invention. The cloning vehicle can be a viral vector, a plasmid, a phage, a phagemid, a cosmid, a fosmid, a bacteriophage or an artificial chromosome. The invention provides transformed cells comprising a nucleic acid of the invention or an expression cassette (e.g., a vector) of the invention, or a cloning vehicle of the invention. In one aspect, the transformed cell can be a bacterial cell. The invention provides an antisense oligonucleotide comprising a nucleic acid sequence complementary to or capable of hybridizing under stringent conditions to a nucleic acid of the invention.

The invention provides an isolated or recombinant polypeptide comprising an amino acid sequence having at least about 50%, 51%, 52%, 53%, 54%, 55%, 56%, 57%, 58%, 59%, 60%, 61%, 62%, 63%, 64%, 65%, 66%, 67%, 68%, 69%, 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or more or complete (100%) sequence identity to an exemplary polypeptide or peptide of the invention over a region of at least about 10, 15, 20, 25, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 100, 125, 150, 175, 200, 250, 300, 350, 400, 450, 500, 550, 600, 650, 700, 750, 800, 850, 900, 950, 1000, 1050, 1100, 1150, 1200, 1250, 1300, 1350, 1400, 1450, 1500, 1550 or more residues, or over the full length of the polypeptide, and the sequence identities are determined by analysis with a sequence comparison algorithm or by visual inspection. Exemplary polypeptide or peptide sequences of the invention include SEQ ID NO:2, SEQ ID NO:4 or SEQ ID NO:6, SEQ ID NO:8, SEQ ID NO:10, SEQ ID NO:12, SEQ ID NO:14, SEQ ID NO:16, SEQ ID NO:18, SEQ ID NO:20, SEQ ID NO:22, SEQ ID NO:24 or SEQ ID NO:26, and subsequences thereof and variants thereof, e.g., at least about 10, 15, 20, 25, 30, 35, 40, 45, 50, 75, 100, 150, 200, 250, 300, 350, 400, 450, 500, 550, 600, 650, 700, 750, 800, 850, 900, 950, 1000, 1050, 1100, 1150, 1200, 1250, 1300, 1350, 1400, 1450, 1500 or more residues in length, or over the full length of an enzyme. Exemplary polypeptide or peptide sequences of the invention include sequences encoded by a nucleic acid of the invention. Exemplary polypeptide or peptide sequences of the invention include polypeptides or peptides specifically bound by an antibody of the invention. In one aspect, a polypeptide of the invention has at least one hydrolase activity, e.g., an esterase, acylase, lipase, phospholipase or protease activity. In one aspect, the activity is a regioselective and/or chemoselective activity.

Another aspect of the invention is an isolated or recombinant polypeptide or peptide including at least 10 consecutive amino acids of a polypeptide or peptide sequence of the invention, sequences substantially identical thereto, and the sequences complementary thereto. In one aspect, the invention provides chimeric proteins comprising a first domain comprising a signal sequence of the invention and at least a second domain. The protein can be a fusion protein. The second domain can comprise an enzyme. The enzyme can be a hydrolase (e.g., a hydrolase of the invention, or, another hydrolase). The invention provides protein preparations comprising a polypeptide of the invention, wherein the protein preparation comprises a liquid, a solid or a gel.

The invention provides heterodimers comprising a polypeptide of the invention and a second domain. In one aspect, the second domain can be a polypeptide and the heterodimer can be a fusion protein. In one aspect, the second domain can be an epitope or a tag. In one aspect, the invention provides homodimers comprising a polypeptide of the invention.

The invention provides immobilized polypeptides having a hydrolase activity, wherein the polypeptide comprises a polypeptide of the invention, a polypeptide encoded by a nucleic acid of the invention, or a polypeptide comprising a polypeptide of the invention and a second domain. In one aspect, the polypeptide can be immobilized in or on a cell, a vesicle, a liposome, a film, a membrane, a metal, a resin, a polymer, a ceramic, a glass, a microelectrode, a graphitic particle, a bead, a gel, a plate, crystals, a tablet, a pill, a capsule, a powder, an agglomerate, a surface, a porous structure, an array or a capillary tube, or materials such as grains, husks, bark, skin, hair, enamel, bone, shell and materials deriving from them. Polynucleotides, polypeptides and enzymes of the invention can be formulated in a solid form such as a powder, lyophilized preparations, granules, tablets, bars, crystals, capsules, pills, pellets, or in a liquid form such as an aqueous solution, an aerosol, a gel, a paste, a slurry, an aqueous/oil emulsion, a cream, a capsule, vesicular, or micellar suspension. The invention provides methods of producing a recombinant polypeptide comprising the steps of: (a) providing a nucleic acid of the invention operably linked to a promoter; and (b) expressing the nucleic acid of step (a) under conditions that allow expression of the polypeptide, thereby producing a recombinant polypeptide. In one aspect, the method can further comprise transforming a host cell with the nucleic acid of step (a) followed by expressing the nucleic acid of step (a), thereby producing a recombinant polypeptide in a transformed cell.

The invention provides methods for identifying a polypeptide having a hydrolase activity comprising the following steps: (a) providing a polypeptide of the invention; or a polypeptide encoded by a nucleic acid of the invention; (b) providing a hydrolase substrate; and (c) contacting the polypeptide or a fragment or variant thereof of step (a) with the substrate of step (b) and detecting a decrease in the amount of substrate or an increase in the amount of a reaction product, wherein a decrease in the amount of the substrate or an increase in the amount of the reaction product detects a polypeptide having a hydrolase activity. The invention provides methods of generating a variant of a nucleic acid encoding a polypeptide having a hydrolase activity comprising the steps of: (a) providing a template nucleic acid comprising a nucleic acid of the invention; and (b) modifying, deleting or adding one or more nucleotides in the template sequence, or a combination thereof, to generate a variant of the template nucleic acid. In one aspect, the method can further comprise expressing the variant nucleic acid to generate a variant hydrolase polypeptide. The modifications, additions or deletions can be introduced by a method comprising error-prone PCR, shuffling, oligonucleotide-directed mutagenesis, assembly PCR, sexual PCR mutagenesis, in vivo mutagenesis, cassette mutagenesis, recursive ensemble mutagenesis, exponential ensemble mutagenesis, site-specific mutagenesis, gene reassembly, Gene Site Saturated Mutagenesis™ (GSSM™), synthetic ligation reassembly (SLR) or a combination thereof. 
In another aspect, the modifications, additions or deletions are introduced by a method comprising recombination, recursive sequence recombination, phosphothioate-modified DNA mutagenesis, uracil-containing template mutagenesis, gapped duplex mutagenesis, point mismatch repair mutagenesis, repair-deficient host strain mutagenesis, chemical mutagenesis, radiogenic mutagenesis, deletion mutagenesis, restriction-selection mutagenesis, restriction-purification mutagenesis, artificial gene synthesis, ensemble mutagenesis, chimeric nucleic acid multimer creation and a combination thereof. In one aspect, the method can be iteratively repeated until a hydrolase having an altered or different activity or an altered or different stability from that of a polypeptide encoded by the template nucleic acid is produced. In one aspect, the variant hydrolase polypeptide is thermotolerant, and retains some activity after being exposed to an elevated temperature. Alternatively, the variant hydrolase polypeptide has a hydrolase activity under a high temperature, wherein the hydrolase encoded by the template nucleic acid is not active under the high temperature. In one aspect, the method can be iteratively repeated until a hydrolase coding sequence having an altered codon usage from that of the template nucleic acid is produced. In another aspect, the method can be iteratively repeated until a hydrolase gene having higher or lower level of message expression or stability from that of the template nucleic acid is produced.

The invention provides methods for modifying codons in a nucleic acid encoding a polypeptide having a hydrolase activity; the method comprising the following steps: (a) providing a nucleic acid of the invention; and, (b) identifying a codon in the nucleic acid of step (a) and replacing it with a different codon encoding the same amino acid as the replaced codon, thereby modifying codons in a nucleic acid encoding a hydrolase.

The invention provides methods for modifying codons in a nucleic acid encoding a polypeptide having a hydrolase activity to increase its expression in a host cell, the method comprising the following steps: (a) providing a nucleic acid of the invention encoding a hydrolase polypeptide; and, (b) identifying a non-preferred or a less preferred codon in the nucleic acid of step (a) and replacing it with a preferred or neutrally used codon encoding the same amino acid as the replaced codon, wherein a preferred codon is a codon over-represented in coding sequences in genes in the host cell and a non-preferred or less preferred codon is a codon under-represented in coding sequences in genes in the host cell, thereby modifying the nucleic acid to increase its expression in a host cell.
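The codon replacement described above can be sketched as follows. This is a minimal illustration, assuming the standard genetic code and a hypothetical table of preferred codons; the names `replace_codons`, `translate` and `example_preferred`, and the particular preferred codons chosen, are not taken from the description. In a real application the preferred codons would be derived from codon usage in the intended host cell.

```python
from itertools import product
from typing import Dict

# Standard genetic code, built in TCAG order ("*" marks a stop codon).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE: Dict[str, str] = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate a coding sequence whose length is a multiple of three."""
    dna = dna.upper()
    if len(dna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of 3")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


def replace_codons(dna: str, preferred: Dict[str, str]) -> str:
    """Replace each codon with the preferred codon for the same amino acid.

    `preferred` maps a one-letter amino acid to its preferred codon;
    amino acids without an entry keep their original codon.
    """
    dna = dna.upper()
    if len(dna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of 3")
    for aa, codon in preferred.items():
        if CODON_TABLE.get(codon) != aa:
            raise ValueError(f"{codon} does not encode {aa}")
    recoded = []
    for i in range(0, len(dna), 3):
        codon = dna[i:i + 3]
        recoded.append(preferred.get(CODON_TABLE[codon], codon))
    return "".join(recoded)


# Hypothetical preferred codons, for illustration only.
example_preferred = {"L": "CTG", "R": "CGT", "K": "AAA"}
original = "ATGTTAAGAAAGTAA"
recoded = replace_codons(original, example_preferred)
```

Because every replacement codon is checked against the genetic code before use, `translate(recoded)` equals `translate(original)`: only the nucleotide sequence changes, not the encoded polypeptide.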

The invention provides methods for producing a library of nucleic acids encoding a plurality of modified hydrolase active sites or substrate binding sites, wherein the modified active sites or substrate binding sites are derived from a first nucleic acid comprising a sequence encoding a first active site or a first substrate binding site, the method comprising the following steps: (a) providing a first nucleic acid encoding a first active site or first substrate binding site, wherein the first nucleic acid sequence comprises a sequence that hybridizes under stringent conditions to a nucleic acid of the invention, and the nucleic acid encodes a hydrolase active site or a hydrolase substrate binding site; (b) providing a set of mutagenic oligonucleotides that encode naturally-occurring amino acid variants at a plurality of targeted codons in the first nucleic acid; and, (c) using the set of mutagenic oligonucleotides to generate a set of active site-encoding or substrate binding site-encoding variant nucleic acids encoding a range of amino acid variations at each amino acid codon that was mutagenized, thereby producing a library of nucleic acids encoding a plurality of modified hydrolase active sites or substrate binding sites. In one aspect, the method comprises mutagenizing the first nucleic acid of step (a) by a method comprising an optimized directed evolution system, Gene Site Saturated Mutagenesis™ (GSSM™), synthetic ligation reassembly (SLR), error-prone PCR, shuffling, oligonucleotide-directed mutagenesis, assembly PCR, sexual PCR mutagenesis, in vivo mutagenesis, cassette mutagenesis, recursive ensemble mutagenesis, exponential ensemble mutagenesis, site-specific mutagenesis, gene reassembly and a combination thereof. 
In another aspect, the method comprises mutagenizing the first nucleic acid of step (a) or variants by a method comprising recombination, recursive sequence recombination, phosphothioate-modified DNA mutagenesis, uracil-containing template mutagenesis, gapped duplex mutagenesis, point mismatch repair mutagenesis, repair-deficient host strain mutagenesis, chemical mutagenesis, radiogenic mutagenesis, deletion mutagenesis, restriction-selection mutagenesis, restriction-purification mutagenesis, artificial gene synthesis, ensemble mutagenesis, chimeric nucleic acid multimer creation and a combination thereof.
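The idea of encoding a range of amino acid variations at each targeted codon can be sketched in silico. The following is a simplified illustration, not the GSSM™ procedure or any other method named above: it enumerates single-site variants only, and the name `saturation_library`, the variant labels (such as `A2V`) and the choice of one representative codon per amino acid are assumptions made for the example.

```python
from itertools import product
from typing import Dict, Iterable

# Standard genetic code, built in TCAG order ("*" marks a stop codon).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE: Dict[str, str] = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# One representative codon per amino acid: the first one in table order.
# This is an arbitrary, hypothetical choice; a real design would weigh codon usage.
REPRESENTATIVE: Dict[str, str] = {}
for _codon, _aa in CODON_TABLE.items():
    if _aa != "*":
        REPRESENTATIVE.setdefault(_aa, _codon)


def saturation_library(dna: str, target_codons: Iterable[int]) -> Dict[str, str]:
    """Return single-site variants covering every other amino acid at each target.

    `target_codons` holds zero-based codon indices. Keys are labels such as
    "A2V" (original residue, one-based position, new residue); values are the
    variant coding sequences.
    """
    dna = dna.upper()
    if len(dna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of 3")
    codons = [dna[i:i + 3] for i in range(0, len(dna), 3)]
    library: Dict[str, str] = {}
    for pos in target_codons:
        original_aa = CODON_TABLE[codons[pos]]
        for aa, codon in REPRESENTATIVE.items():
            if aa == original_aa:
                continue  # skip the unchanged residue
            variant = codons[:pos] + [codon] + codons[pos + 1:]
            library[f"{original_aa}{pos + 1}{aa}"] = "".join(variant)
    return library
```

For a sequence with n targeted codons that each encode an amino acid, this yields 19 variants per position, 19n in total. Combinatorial libraries that vary several positions at once grow much faster and are not covered by this sketch.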

The invention provides methods for determining a functional fragment of a hydrolase enzyme comprising the steps of: (a) providing a hydrolase enzyme, wherein the enzyme comprises a polypeptide of the invention, or a polypeptide encoded by a nucleic acid of the invention, or a subsequence thereof; and (b) deleting a plurality of amino acid residues from the sequence of step (a) and testing the remaining subsequence for a hydrolase activity, thereby determining a functional fragment of a hydrolase enzyme. In one aspect, the hydrolase activity is measured by providing a hydrolase substrate and detecting a decrease in the amount of the substrate or an increase in the amount of a reaction product.
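The deletion-and-test loop described above can be sketched as a truncation scan. This is only an outline under stated assumptions: the callable `is_active` stands in for the wet-lab activity assay (a decrease in substrate or an increase in product), the toy motif-based assay in the usage note is hypothetical, and the names `truncations` and `shortest_active_fragment` are not from the description.

```python
from typing import Callable, Iterator, Optional, Tuple

Fragment = Tuple[int, int, str]  # (start, end, subsequence), end exclusive


def truncations(seq: str, step: int = 1, min_len: int = 1) -> Iterator[Fragment]:
    """Yield contiguous subsequences obtained by deleting residues from either end."""
    n = len(seq)
    for start in range(0, n, step):
        for end in range(n, start, -step):
            if end - start >= min_len:
                yield start, end, seq[start:end]


def shortest_active_fragment(seq: str,
                             is_active: Callable[[str], bool],
                             step: int = 1,
                             min_len: int = 1) -> Optional[Fragment]:
    """Return the shortest truncation that still passes the activity test."""
    best: Optional[Fragment] = None
    for start, end, fragment in truncations(seq, step, min_len):
        if is_active(fragment) and (best is None or len(fragment) < len(best[2])):
            best = (start, end, fragment)
    return best


def toy_assay(fragment: str) -> bool:
    """Hypothetical stand-in for an activity measurement: requires a fixed motif."""
    return "GDSL" in fragment
```

With the toy assay, `shortest_active_fragment("MKAGDSLTR", toy_assay)` narrows the sequence down to the motif itself. A real functional fragment generally needs far more than a short motif to fold and retain activity, so the predicate would be replaced by experimental measurements on expressed fragments.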

The invention provides methods for overexpressing a recombinant hydrolase polypeptide in a cell comprising expressing a vector comprising a nucleic acid comprising a nucleic acid of the invention or a nucleic acid sequence of the invention, wherein the sequence identities are determined by analysis with a sequence comparison algorithm or by visual inspection, wherein overexpression is effected by use of a high activity promoter, a dicistronic vector or by gene amplification of the vector.

The invention provides signal sequences comprising or consisting of a peptide having a subsequence of a polypeptide of the invention. The invention provides a chimeric protein comprising a first domain comprising a signal sequence of the invention and at least a second domain. The protein can be a fusion protein. The second domain can comprise an enzyme. The enzyme can be a hydrolase.

The details of one or more embodiments of the invention are set forth in the accompanying drawings and the description below. Other features, objects, and advantages of the invention will be apparent from the description and drawings, and from the claims.

All publications, patents, patent applications, GenBank sequences and ATCC deposits, cited herein are hereby expressly incorporated by reference for all purposes.

DETAILED DESCRIPTION

In one aspect, the invention provides hydrolases, polynucleotides encoding them, and methods of making and using these polynucleotides and polypeptides. In one aspect, the invention is directed to polypeptides, e.g., enzymes, having a hydrolase activity, e.g., an esterase, acylase, lipase, phospholipase or protease activity, including thermostable and thermotolerant hydrolase activity, and polynucleotides encoding these enzymes, and making and using these polynucleotides and polypeptides. The hydrolase activities of the polypeptides and peptides of the invention include esterase activity, lipase activity (hydrolysis of lipids), acidolysis reactions (to replace an esterified fatty acid with a free fatty acid), transesterification reactions (exchange of fatty acids between triglycerides), ester synthesis, ester interchange reactions, phospholipase activity (e.g., phospholipase A, B, C and/or D activity, patatin activity, lipid acyl hydrolase (LAH) activity) and protease activity (hydrolysis of peptide bonds). In another aspect, the polypeptides of the invention are used to synthesize enantiomerically pure chiral products. The polypeptides of the invention can be used in a variety of pharmaceutical, agricultural and industrial contexts, including the manufacture of cosmetics and nutraceuticals.

Enzymes of the invention can be highly selective catalysts. They can have the ability to catalyze reactions with stereo-, regio-, and chemo-selectivities not possible in conventional synthetic chemistry. Enzymes of the invention can be versatile. In various aspects, they can function in organic solvents, operate at extreme pHs (for example, high pHs and low pHs), extreme temperatures (for example, high temperatures and low temperatures), and extreme salinity levels (for example, high salinity and low salinity), and catalyze reactions with compounds that are structurally unrelated to their natural, physiological substrates.

In one aspect, the methods and compositions (hydrolases) of the invention can be used for stereoselectively hydrolyzing mixtures of a compound of formula 2

wherein R³ respectively R⁴ stands for a d-Cβ-alkyl-, a

or a C₃-C₈-cycloalkyl-C₁-C₄-alkyl- rest group and wherein R⁴ respectively R³ stands for a C₁-C₈-alkyl- rest group and wherein R³ and R⁴ are not the same.

The term "esterase" includes all polypeptides having an esterase activity, i.e., the polypeptides of the invention can have any esterase activity. For example, the invention provides polypeptides capable of hydrolyzing ester groups to organic acids and alcohols. The term "esterase" also encompasses polypeptides having lipase activity (in the hydrolysis of lipids), acidolysis reactions (to replace an esterified fatty acid with a free fatty acid), trans-esterification reactions (exchange of fatty acids between triglycerides), ester synthesis and ester interchange reactions. The polypeptides of the invention can be enantiospecific, e.g., as when used in chemoenzymatic reactions in the synthesis of medicaments and insecticides. The polynucleotides of the invention encode polypeptides having esterase activity.

A hydrolase variant (e.g., "lipase variant", "esterase variant", "protease variant", "phospholipase variant") can have an amino acid sequence which is derived from the amino acid sequence of a "precursor". The precursor can include naturally-occurring hydrolase and/or a recombinant hydrolase. The amino acid sequence of the hydrolase variant is "derived" from the precursor hydrolase amino acid sequence by the substitution, deletion or insertion of one or more amino acids of the precursor amino acid sequence. Such modification is of the "precursor DNA sequence" which encodes the amino acid sequence of the precursor lipase rather than manipulation of the precursor hydrolase enzyme per se. Suitable methods for such manipulation of the precursor DNA sequence include methods disclosed herein, as well as methods known to those skilled in the art.

A "coding sequence of" or a "sequence encodes" a particular polypeptide or protein, is a nucleic acid sequence which is transcribed and translated into a polypeptide or protein when placed under the control of appropriate regulatory sequences.

The term "expression cassette" as used herein refers to a nucleotide sequence which is capable of affecting expression of a structural gene (i.e., a protein coding sequence, such as a hydrolase of the invention) in a host compatible with such sequences. Expression cassettes include at least a promoter operably linked with the polypeptide coding sequence; and, optionally, with other sequences, e.g., transcription termination signals. Additional factors necessary or helpful in effecting expression may also be used, e.g., enhancers. "Operably linked" as used herein refers to linkage of a promoter upstream from a DNA sequence such that the promoter mediates transcription of the DNA sequence. Thus, expression cassettes also include plasmids, expression vectors, recombinant viruses, any form of recombinant "naked DNA" vector, and the like. A "vector" comprises a nucleic acid which can infect, transfect, transiently or permanently transduce a cell. It will be recognized that a vector can be a naked nucleic acid, or a nucleic acid complexed with protein or lipid. The vector optionally comprises viral or bacterial nucleic acids and/or proteins, and/or membranes (e.g., a cell membrane, a viral lipid envelope, etc.). Vectors include, but are not limited to replicons (e.g., RNA replicons, bacteriophages) to which fragments of DNA may be attached and become replicated. Vectors thus include, but are not limited to RNA, autonomous self-replicating circular or linear DNA or RNA (e.g., plasmids, viruses, and the like, see, e.g., U.S. Patent No. 5,217,879), and includes both the expression and non-expression plasmids. Where a recombinant microorganism or cell culture is described as hosting an "expression vector" this includes both extra-chromosomal circular and linear DNA and DNA that has been incorporated into the host chromosome(s). 
Where a vector is being maintained by a host cell, the vector may either be stably replicated by the cells during mitosis as an autonomous structure, or be incorporated within the host's genome.

The term "gene" can include a nucleic acid sequence comprising a segment of DNA involved in producing a transcription product (e.g., a message), which in turn is translated to produce a polypeptide chain, or regulates gene transcription, reproduction or stability. Genes can include, inter alia, regions preceding and following the coding region, such as leader and trailer, promoters and enhancers, as well as, where applicable, intervening sequences (introns) between individual coding segments (exons).

The phrases "nucleic acid" or "nucleic acid sequence" can refer to an oligonucleotide, nucleotide, polynucleotide, or a fragment of any of these, to DNA or RNA (e.g., mRNA, rRNA, tRNA, RNAi) of genomic or synthetic origin which may be single-stranded or double-stranded and may represent a sense or antisense strand, to peptide nucleic acid (PNA), or to any DNA-like or RNA-like material, natural or synthetic in origin, including, e.g., RNAi (double-stranded "interfering" RNA) and ribonucleoproteins (e.g., iRNPs). The term encompasses nucleic acids, i.e., oligonucleotides, containing known analogues of natural nucleotides. The term also encompasses nucleic-acid-like structures with synthetic backbones, see e.g., Mata (1997) Toxicol. Appl. Pharmacol. 144:189-197; Strauss-Soukup (1997) Biochemistry 36:8692-8698; Samstag (1996) Antisense Nucleic Acid Drug Dev 6:153-156.

As used herein, the term "promoter" includes all sequences capable of driving transcription of a coding sequence in a cell, e.g., a plant cell. Thus, promoters used in the constructs of the invention include cis-acting transcriptional control elements and regulatory sequences that are involved in regulating or modulating the timing and/or rate of transcription of a gene. For example, a promoter can be a cis-acting transcriptional control element, including an enhancer, a promoter, a transcription terminator, an origin of replication, a chromosomal integration sequence, 5' and 3' untranslated regions, or an intronic sequence, which are involved in transcriptional regulation. These cis-acting sequences typically interact with proteins or other biomolecules to carry out (turn on/off, regulate, modulate, etc.) transcription. "Constitutive" promoters are those that drive expression continuously under most environmental conditions and states of development or cell differentiation. "Inducible" or "regulatable" promoters direct expression of the nucleic acid of the invention under the influence of environmental conditions or developmental conditions. Examples of environmental conditions that may affect transcription by inducible promoters include anaerobic conditions, elevated temperature, drought, or the presence of light.

The term "isolated" can mean that the material is removed from its original environment (e.g., the natural environment if it is naturally occurring). For example, a naturally occurring polynucleotide or polypeptide present in a living animal is not isolated, but the same polynucleotide or polypeptide, separated from some or all of the coexisting materials in the natural system, is isolated. Such polynucleotides could be part of a vector and/or such polynucleotides or polypeptides could be part of a composition, and still be isolated in that such vector or composition is not part of its natural environment. As used herein, an isolated material or composition can also be a "purified" composition, i.e., it does not require absolute purity; rather, it is intended as a relative definition. Individual nucleic acids obtained from a library can be conventionally purified to electrophoretic homogeneity. In alternative aspects, the invention provides nucleic acids which have been purified from genomic DNA or from other sequences in a library or other environment by at least one, two, three, four, five or more orders of magnitude.

The term "recombinant" can mean that the nucleic acid is adjacent to a "backbone" nucleic acid to which it is not adjacent in its natural environment. In one aspect, nucleic acids represent 5% or more of the number of nucleic acid inserts in a population of nucleic acid "backbone molecules." "Backbone molecules" according to the invention include nucleic acids such as expression vectors, self-replicating nucleic acids, viruses, integrating nucleic acids, and other vectors or nucleic acids used to maintain or manipulate a nucleic acid insert of interest. In one aspect, the enriched nucleic acids represent 15%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90% or more of the number of nucleic acid inserts in the population of recombinant backbone molecules. "Recombinant" polypeptides or proteins refer to polypeptides or proteins produced by recombinant DNA techniques; e.g., produced from cells transformed by an exogenous DNA construct encoding the desired polypeptide or protein. "Synthetic" polypeptides or protein are those prepared by chemical synthesis, as described in further detail, below.

A promoter sequence can be "operably linked to" a coding sequence when RNA polymerase which initiates transcription at the promoter will transcribe the coding sequence into mRNA, as discussed further, below.

"Oligonucleotide" can include either a single stranded polydeoxynucleotide or two complementary polydeoxynucleotide strands which may be chemically synthesized. Such synthetic oligonucleotides have no 5' phosphate and thus will not ligate to another oligonucleotide without adding a phosphate with an ATP in the presence of a kinase. A synthetic oligonucleotide will ligate to a fragment that has not been dephosphorylated.

The phrase "substantially identical" in the context of two nucleic acids or polypeptides, can refer to two or more sequences that have, e.g., at least about 50%, 51%, 52%, 53%, 54%, 55%, 56%, 57%, 58%, 59%, 60%, 61%, 62%, 63%, 64%, 65%, 66%, 67%, 68%, 69%, 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or more, nucleotide or amino acid residue (sequence) identity, when compared and aligned for maximum correspondence, as measured using any known sequence comparison algorithm, as discussed in detail below, or by visual inspection. In alternative aspects, the invention provides nucleic acid and polypeptide sequences having substantial identity to a nucleic acid of the invention, e.g., an exemplary sequence of the invention, over a region of at least about 10, 20, 30, 40, 50, 100, 150, 200, 250, 300, 350, 400, 450, 500, 550, 600, 650, 700, 750, 800, 850, 900, 950, 1000 residues, or a region ranging from between about 50 residues to the full length of the nucleic acid or polypeptide. Nucleic acid sequences of the invention can be substantially identical over the entire length of a polypeptide coding region.

Additionally a "substantially identical" amino acid sequence is a sequence that differs from a reference sequence by one or more conservative or non-conservative amino acid substitutions, deletions, or insertions, particularly when such a substitution occurs at a site that is not the active site of the molecule, and provided that the polypeptide essentially retains its functional properties. A conservative amino acid substitution, for example, substitutes one amino acid for another of the same class (e.g., substitution of one hydrophobic amino acid, such as isoleucine, valine, leucine, or methionine, for another, or substitution of one polar amino acid for another, such as substitution of arginine for lysine, glutamic acid for aspartic acid or glutamine for asparagine). One or more amino acids can be deleted, for example, from a hydrolase, resulting in modification of the structure of the polypeptide, without significantly altering its biological activity. For example, amino- or carboxyl-terminal amino acids that are not required for hydrolase activity can be removed.
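The notion of a conservative substitution described above can be sketched in Python as a lookup over residue classes. This is only an illustration: the groups below contain just the example residues named in the preceding paragraph (one-letter codes), not a complete classification of all twenty amino acids, and the names `CONSERVATIVE_GROUPS` and `is_conservative` are hypothetical.

```python
# Residue groups taken only from the examples given in the text above
# (assumption: this is NOT a complete amino acid classification).
CONSERVATIVE_GROUPS = [
    frozenset("IVLM"),  # hydrophobic: isoleucine, valine, leucine, methionine
    frozenset("RK"),    # arginine <-> lysine
    frozenset("DE"),    # aspartic acid <-> glutamic acid
    frozenset("NQ"),    # asparagine <-> glutamine
]


def is_conservative(original: str, replacement: str) -> bool:
    """Return True if swapping `original` for `replacement` stays within one group."""
    original, replacement = original.upper(), replacement.upper()
    if original == replacement:
        return False  # identical residues are not a substitution at all
    return any(original in group and replacement in group
               for group in CONSERVATIVE_GROUPS)


if __name__ == "__main__":
    print(is_conservative("I", "V"))  # both hydrophobic examples
    print(is_conservative("I", "D"))  # different groups
```

A fuller treatment would typically rely on a substitution matrix rather than hard-coded groups; the grouping here is kept deliberately small so that it states nothing beyond the text.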

"Hybridization" refers to the process by which a nucleic acid strand joins with a complementary strand through base pairing. Hybridization reactions can be sensitive and selective so that a particular sequence of interest can be identified even in samples in which it is present at low concentrations. Stringent conditions can be defined by, for example, the concentrations of salt or formamide in the prehybridization and hybridization solutions and/or wash solutions, or by the hybridization temperature, and are well known in the art. For example, stringency can be increased by reducing the concentration of salt, increasing the concentration of formamide, or raising the hybridization and/or wash temperature, altering the time of hybridization, as described in detail, below. In alternative aspects, nucleic acids of the invention are defined by their ability to hybridize under various stringency conditions (e.g., high, medium, and low), as set forth herein.

The term "variant" can include polynucleotides or polypeptides of the invention modified at one or more base pairs, codons, introns, exons, or amino acid residues (respectively) yet still retain the biological activity of a hydrolase of the invention. Variants can be produced by any number of means, including methods such as, for example, error-prone PCR, shuffling, oligonucleotide-directed mutagenesis, assembly PCR, sexual PCR mutagenesis, in vivo mutagenesis, cassette mutagenesis, recursive ensemble mutagenesis, exponential ensemble mutagenesis, site-specific mutagenesis, gene reassembly, GSSM and any combination thereof. Techniques for producing variant hydrolases having activity at a pH or temperature, for example, that is different from a wild-type hydrolase, are included herein.

The term "saturation mutagenesis" or "GSSM™" includes a method that uses degenerate oligonucleotide primers to introduce point mutations into a polynucleotide, as described in detail, below. The term "optimized directed evolution system" or "optimized directed evolution" includes a method for reassembling fragments of related nucleic acid sequences, e.g., related genes, as explained in detail, below. The term "synthetic ligation reassembly" or "SLR" includes a method of ligating oligonucleotide fragments in a non-stochastic fashion, as explained in detail, below.

Generating and Manipulating Nucleic Acids

The invention provides nucleic acids, including expression cassettes such as expression vectors, encoding the polypeptides (e.g., hydrolases, antibodies) of the invention. The invention also includes methods for discovering new hydrolase sequences using the nucleic acids of the invention. Also provided are methods for modifying the nucleic acids of the invention by, e.g., synthetic ligation reassembly, optimized directed evolution system and/or saturation mutagenesis.

The nucleic acids of the invention can be made, isolated and/or manipulated by, e.g., cloning and expression of cDNA libraries, amplification of message or genomic DNA by PCR, and the like. In practicing the methods of the invention, homologous genes can be modified by manipulating a template nucleic acid, as described herein. The invention can be practiced in conjunction with any method or protocol or device known in the art, which are well described in the scientific and patent literature.

General Techniques

The nucleic acids used to practice this invention, whether RNA, iRNA, antisense nucleic acid, cDNA, genomic DNA, vectors, viruses or hybrids thereof, may be isolated from a variety of sources, genetically engineered, amplified, and/or expressed/generated recombinantly. Recombinant polypeptides generated from these nucleic acids can be individually isolated or cloned and tested for a desired activity. Any recombinant expression system can be used, including bacterial, mammalian, yeast, insect or plant cell expression systems.

Alternatively, these nucleic acids can be synthesized in vitro by well-known chemical synthesis techniques, as described in, e.g., Adams (1983) J. Am. Chem. Soc. 105:661; Belousov (1997) Nucleic Acids Res. 25:3440-3444; Frenkel (1995) Free Radic. Biol. Med. 19:373-380; Blommers (1994) Biochemistry 33:7886-7896; Narang (1979) Meth. Enzymol. 68:90; Brown (1979) Meth. Enzymol. 68:109; Beaucage (1981) Tetra. Lett. 22:1859; U.S. Patent No. 4,458,066. Techniques for the manipulation of nucleic acids, such as, e.g., subcloning, labeling probes (e.g., random-primer labeling using Klenow polymerase, nick translation, amplification), sequencing, hybridization and the like are well described in the scientific and patent literature, see, e.g., Sambrook, ed., MOLECULAR CLONING: A LABORATORY MANUAL (2ND ED.), Vols. 1-3, Cold Spring Harbor Laboratory, (1989); CURRENT PROTOCOLS IN MOLECULAR BIOLOGY, Ausubel, ed. John Wiley & Sons, Inc., New York (1997); LABORATORY TECHNIQUES IN BIOCHEMISTRY AND MOLECULAR BIOLOGY: HYBRIDIZATION WITH NUCLEIC ACID PROBES, Part I. Theory and Nucleic Acid Preparation, Tijssen, ed. Elsevier, N.Y. (1993).

Another useful means of obtaining and manipulating nucleic acids used to practice the methods of the invention is to clone from genomic samples, and, if desired, screen and re-clone inserts isolated or amplified from, e.g., genomic clones or cDNA clones. Sources of nucleic acid used in the methods of the invention include genomic or cDNA libraries contained in, e.g., mammalian artificial chromosomes (MACs), see, e.g., U.S. Patent Nos. 5,721,118; 6,025,155; human artificial chromosomes, see, e.g., Rosenfeld (1997) Nat. Genet. 15:333-335; yeast artificial chromosomes (YAC); bacterial artificial chromosomes (BAC); P1 artificial chromosomes, see, e.g., Woon (1998) Genomics 50:306-316; P1-derived vectors (PACs), see, e.g., Kern (1997) Biotechniques 23:120-124; cosmids, recombinant viruses, phages or plasmids.

In one aspect, a nucleic acid encoding a polypeptide of the invention is assembled in appropriate phase with a leader sequence capable of directing secretion of the translated polypeptide or fragment thereof. The invention provides fusion proteins and nucleic acids encoding them. A polypeptide of the invention can be fused to a heterologous peptide or polypeptide, such as N-terminal identification peptides which impart desired characteristics, such as increased stability or simplified purification. Peptides and polypeptides of the invention can also be synthesized and expressed as fusion proteins with one or more additional domains linked thereto for, e.g., producing a more immunogenic peptide, to more readily isolate a recombinantly synthesized peptide, to identify and isolate antibodies and antibody-expressing B cells, and the like. Detection and purification facilitating domains include, e.g., metal chelating peptides such as polyhistidine tracts and histidine-tryptophan modules that allow purification on immobilized metals, protein A domains that allow purification on immobilized immunoglobulin, and the domain utilized in the FLAGS extension/affinity purification system (Immunex Corp, Seattle WA). A cleavable linker sequence such as Factor Xa or enterokinase (Invitrogen, San Diego CA) can be included between a purification domain and the motif-comprising peptide or polypeptide to facilitate purification. For example, an expression vector can include an epitope-encoding nucleic acid sequence linked to six histidine residues followed by a thioredoxin and an enterokinase cleavage site (see e.g., Williams (1995) Biochemistry 34:1787-1797; Dobeli (1998) Protein Expr. Purif. 12:404-414). The histidine residues facilitate detection and purification while the enterokinase cleavage site provides a means for purifying the epitope from the remainder of the fusion protein. Technology pertaining to vectors encoding fusion proteins and application of fusion proteins are well described in the scientific and patent literature, see e.g., Kroll (1993) DNA Cell. Biol., 12:441-53.

Transcriptional and translational control sequences

The invention provides nucleic acid (e.g., DNA, iRNA) sequences of the invention operatively linked to expression (e.g., transcriptional or translational) control sequence(s), e.g., promoters or enhancers, to direct or modulate RNA synthesis/expression. The expression control sequence can be in an expression vector. Exemplary bacterial promoters for expressing a polypeptide in bacteria include the E. coli lac or trp promoters, the lacI promoter, the lacZ promoter, the T3 promoter, the T7 promoter, the gpt promoter, the lambda PR promoter, the lambda PL promoter, promoters from operons encoding glycolytic enzymes such as 3-phosphoglycerate kinase (PGK), and the acid phosphatase promoter.

Expression vectors and cloning vehicles

The invention provides expression vectors and cloning vehicles comprising nucleic acids of the invention, e.g., sequences encoding the hydrolases and antibodies of the invention. Expression vectors and cloning vehicles of the invention can comprise viral particles, baculovirus, phage, plasmids, phagemids, cosmids, fosmids, bacterial artificial chromosomes, viral DNA (e.g., vaccinia, adenovirus, fowl pox virus, pseudorabies and derivatives of SV40), P1-based artificial chromosomes, yeast plasmids, yeast artificial chromosomes, and any other vectors specific for specific hosts of interest (such as Bacillus, Aspergillus and yeast). Vectors of the invention can include chromosomal, non-chromosomal and synthetic DNA sequences. Large numbers of suitable vectors are known to those of skill in the art, and are commercially available. Exemplary vectors include: bacterial: pQE vectors (Qiagen), pBluescript plasmids, pNH vectors, lambda-ZAP vectors (Stratagene); ptrc99a, pKK223-3, pDR540, pRIT2T (Pharmacia); eukaryotic: pXT1, pSG5 (Stratagene), pSVK3, pBPV, pMSG, pSVLSV40 (Pharmacia). However, any other plasmid or other vector may be used so long as they are replicable and viable in the host. Low copy number or high copy number vectors may be employed with the present invention.

The expression vector may comprise a promoter, a ribosome binding site for translation initiation and a transcription terminator. The vector may also include appropriate sequences for amplifying expression. Mammalian expression vectors can comprise an origin of replication, any necessary ribosome binding sites, a polyadenylation site, splice donor and acceptor sites, transcriptional termination sequences, and 5' flanking non-transcribed sequences. In one aspect, the expression vectors contain one or more selectable marker genes to permit selection of host cells containing the vector. Such selectable markers include genes encoding dihydrofolate reductase or genes conferring neomycin resistance for eukaryotic cell culture, genes conferring tetracycline or ampicillin resistance in E. coli, and the S. cerevisiae TRP1 gene. Promoter regions can be selected from any desired gene using chloramphenicol transferase (CAT) vectors or other vectors with selectable markers. A DNA sequence may be inserted into a vector by a variety of procedures. In general, the DNA sequence is ligated to the desired position in the vector following digestion of the insert and the vector with appropriate restriction endonucleases. Alternatively, blunt ends in both the insert and the vector may be ligated. A variety of cloning techniques are known in the art, e.g., as described in Ausubel and Sambrook. Such procedures and others are deemed to be within the scope of those skilled in the art.

The vector may be in the form of a plasmid, a viral particle, or a phage. Other vectors include chromosomal, non-chromosomal and synthetic DNA sequences, derivatives of SV40; bacterial plasmids, phage DNA, baculovirus, yeast plasmids, vectors derived from combinations of plasmids and phage DNA, viral DNA such as vaccinia, adenovirus, fowl pox virus, and pseudorabies. A variety of cloning and expression vectors for use with prokaryotic and eukaryotic hosts are described by, e.g., Sambrook.

Particular bacterial vectors which may be used include the commercially available plasmids comprising genetic elements of the well known cloning vector pBR322 (ATCC 37017), pKK223-3 (Pharmacia Fine Chemicals, Uppsala, Sweden), GEM1 (Promega Biotec, Madison, WI, USA) pQE70, pQE60, pQE-9 (Qiagen), pD10, psiX174, pBluescript II KS, pNH8A, pNH16a, pNH18A, pNH46A (Stratagene), ptrc99a, pKK223-3, pKK233-3, DR540, pRIT5 (Pharmacia), pKK232-8 and pCM7. Particular eukaryotic vectors include pSV2CAT, pOG44, pXT1, pSG (Stratagene) pSVK3, pBPV, pMSG, and pSVL (Pharmacia). However, any other vector may be used as long as it is replicable and viable in the host cell.

Expression vectors of the invention may also include a selectable marker gene to allow for the selection of bacterial strains that have been transformed, e.g., genes which render the bacteria resistant to drugs such as ampicillin, chloramphenicol, erythromycin, kanamycin, neomycin and tetracycline. Selectable markers can also include biosynthetic genes, such as those in the histidine, tryptophan and leucine biosynthetic pathways.

Host cells and transformed cells

The invention also provides a transformed cell comprising a nucleic acid sequence of the invention, e.g., a sequence encoding a hydrolase of the invention, or a vector of the invention. The host cell may be any of the host cells familiar to those skilled in the art, including prokaryotic cells, such as bacterial cells. Enzymes of the invention can be expressed in any host cell, e.g., any bacterial cell. Exemplary bacterial cells include E. coli, Lactococcus lactis, Streptomyces, Bacillus subtilis, Bacillus cereus, Salmonella typhimurium or any species within the genera Bacillus, Streptomyces and Staphylococcus. The selection of an appropriate host is within the abilities of those skilled in the art. The vector may be introduced into the host cells using any of a variety of techniques, including transformation, transfection, transduction, viral infection, gene guns, or Ti-mediated gene transfer. Particular methods include calcium phosphate transfection, DEAE-Dextran mediated transfection, lipofection, or electroporation (Davis, L., Dibner, M., Battey, I., Basic Methods in Molecular Biology, (1986)). Where appropriate, the engineered host cells can be cultured in conventional nutrient media modified as appropriate for activating promoters, selecting transformants or amplifying the genes of the invention. Following transformation of a suitable host strain and growth of the host strain to an appropriate cell density, the selected promoter may be induced by appropriate means (e.g., temperature shift or chemical induction) and the cells may be cultured for an additional period to allow them to produce the desired polypeptide or fragment thereof.

Cells can be harvested by centrifugation and disrupted by physical or chemical means, and the resulting crude extract retained for further purification. Microbial cells employed for expression of proteins can be disrupted by any convenient method, including freeze-thaw cycling, sonication, mechanical disruption, or use of cell lysing agents. Such methods are well known to those skilled in the art. The expressed polypeptide or fragment thereof can be recovered and purified from recombinant cell cultures by methods including ammonium sulfate or ethanol precipitation, acid extraction, anion or cation exchange chromatography, phosphocellulose chromatography, hydrophobic interaction chromatography, affinity chromatography, hydroxylapatite chromatography, lectin chromatography and gel filtration (e.g., size exclusion chromatography). Protein refolding steps can be used, as necessary, in completing configuration of the polypeptide. If desired, high performance liquid chromatography (HPLC) can be employed for final purification steps.

The constructs in host cells can be used in a conventional manner to produce the gene product encoded by the recombinant sequence. Polypeptides of the invention may or may not also include an initial methionine amino acid residue.

Cell-free translation systems can also be employed to produce a polypeptide of the invention. Cell-free translation systems can use mRNAs transcribed from a DNA construct comprising a promoter operably linked to a nucleic acid encoding the polypeptide or fragment thereof. In some aspects, the DNA construct may be linearized prior to conducting an in vitro transcription reaction. The transcribed mRNA is then incubated with an appropriate cell-free translation extract, such as a rabbit reticulocyte extract, to produce the desired polypeptide or fragment thereof.

Amplification of Nucleic Acids

In practicing the invention, nucleic acids encoding the polypeptides of the invention, or modified nucleic acids, can be reproduced by, e.g., amplification. The invention provides amplification primer sequence pairs for amplifying nucleic acids encoding a hydrolase, e.g., an esterase, acylase, lipase, phospholipase or protease, where the primer pairs are capable of amplifying nucleic acid sequences including the exemplary SEQ ID NO:1, SEQ ID NO:3, SEQ ID NO:5, SEQ ID NO:7, SEQ ID NO:9, SEQ ID NO:11, SEQ ID NO:13, SEQ ID NO:15, SEQ ID NO:17, SEQ ID NO:19, SEQ ID NO:21, SEQ ID NO:23 or SEQ ID NO:25. One of skill in the art can design amplification primer sequence pairs for any part of or the full length of these sequences.

Determining the degree of sequence identity

The invention provides nucleic acids having at least about 50%, or more, or complete (100%) sequence identity to a nucleic acid of the invention, e.g., an exemplary nucleic acid of the invention (e.g., having a sequence as set forth in SEQ ID NO:1, SEQ ID NO:3, SEQ ID NO:5, SEQ ID NO:7, SEQ ID NO:9, SEQ ID NO:11, SEQ ID NO:13, SEQ ID NO:15, SEQ ID NO:17, SEQ ID NO:19, SEQ ID NO:21, SEQ ID NO:23 or SEQ ID NO:25); and polypeptides having at least 50%, 51%, 52%, 53%, 54%, 55%, 56%, 57%, 58%, 59%, 60%, 61%, 62%, 63%, 64%, 65%, 66%, 67%, 68%, 69%, 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or more, or complete (100%) sequence identity to a polypeptide of the invention, e.g., an exemplary polypeptide having a sequence as set forth in SEQ ID NO:2, SEQ ID NO:4, SEQ ID NO:6, SEQ ID NO:8, SEQ ID NO:10, SEQ ID NO:12, SEQ ID NO:14, SEQ ID NO:16, SEQ ID NO:18, SEQ ID NO:20, SEQ ID NO:22, SEQ ID NO:24, or SEQ ID NO:26. In alternative aspects, the sequence identity can be over a region of at least about 5, 10, 20, 30, 40, 50, 100, 150, 200, 250, 300, 350, 400, 450, 500, 550, 600, 650, 700, 750, 800, 850, 900, 950, 1000, or more consecutive residues, or the full length of the nucleic acid or polypeptide. The extent of sequence identity (homology) may be determined using any computer program and associated parameters, including those described herein, such as BLAST 2.2.2 or FASTA version 3.0t78, with the default parameters.
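As a rough illustration of how a percent identity figure over an aligned region might be computed, the following Python sketch counts matching columns in two sequences that have already been aligned. It is not the BLAST or FASTA algorithm named above; the function names, the gap character, and the choice of denominator (all aligned columns, including gapped ones) are assumptions made for illustration, and the 50% default threshold simply mirrors the lowest value recited above.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent of aligned columns that hold the same residue in both sequences.

    Assumes the inputs are already aligned (equal length, gaps as `gap`).
    Columns where both sequences carry a gap are ignored.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [(a, b) for a, b in zip(aligned_a.upper(), aligned_b.upper())
               if not (a == gap and b == gap)]
    if not columns:
        raise ValueError("no aligned columns to compare")
    matches = sum(1 for a, b in columns if a == b and a != gap)
    return 100.0 * matches / len(columns)


def is_substantially_identical(aligned_a: str, aligned_b: str,
                               threshold: float = 50.0) -> bool:
    """Compare percent identity against a threshold (assumed default: 50%)."""
    return percent_identity(aligned_a, aligned_b) >= threshold


if __name__ == "__main__":
    print(percent_identity("ACGT", "ACGA"))
    print(is_substantially_identical("ACGT", "TGCA"))
```

Other conventions exist for the denominator (for example, the length of the shorter sequence), and they can give different percentages for the same alignment, which is one reason the program and parameters used are specified.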

In one aspect of the invention, to determine if a nucleic acid has the requisite sequence identity to be within the scope of the invention, the NCBI BLAST 2.2.2 program is used, with default options for blastp. There are about 38 setting options in the BLAST 2.2.2 program. In this exemplary aspect of the invention, all default values are used except for the default filtering setting (i.e., all parameters set to default except filtering which is set to OFF); in its place a "-F F" setting is used, which disables filtering. Use of default filtering often results in Karlin-Altschul violations due to short length of sequence. The default values used in this exemplary aspect of the invention include:

"Filter for low complexity: ON

Word Size: 3

Matrix: Blosum62

Gap Costs: Existence: 11, Extension: 1"

Other default settings are: filter for low complexity OFF, word size of 3 for protein, BLOSUM62 matrix, gap existence penalty of -11 and a gap extension penalty of -1. Note that the "-W" option defaults to 0. This means that, if not set, the word size defaults to 3 for proteins and 11 for nucleotides. Preferably, Expect is set at 10.
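The settings described above could be captured as a command line roughly as follows. This Python sketch only builds an argument list and does not run anything. The "-F F" setting and the Expect value of 10 come from the text; the executable name `blastall` and the `-p`, `-i`, `-d` and `-e` flags are assumed from the legacy NCBI toolkit command-line interface, and the file and database names are hypothetical. The "-W" option is deliberately left unset so that the program's own default word size applies, as the text describes.

```python
from typing import List


def build_blast_command(query_path: str, database: str,
                        program: str = "blastp", expect: int = 10) -> List[str]:
    """Assemble an argument list with all defaults except filtering.

    Only filtering ("-F F") and Expect are passed explicitly; matrix,
    gap costs and word size are left to the program defaults.
    """
    return [
        "blastall",           # assumed legacy NCBI executable name
        "-p", program,        # search program, e.g. blastp
        "-i", query_path,     # query sequence file (hypothetical path)
        "-d", database,       # database name (hypothetical)
        "-F", "F",            # disable low-complexity filtering, per the text
        "-e", str(expect),    # Expect value, preferably 10 per the text
    ]


if __name__ == "__main__":
    print(" ".join(build_blast_command("query.fasta", "hydrolase_db")))
```

Keeping the command as a list makes it straightforward to inspect which settings deviate from the defaults before handing it to a process runner.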

Hybridization of nucleic acids

The invention provides isolated or recombinant nucleic acids that hybridize under stringent conditions to a nucleic acid of the invention, e.g., an exemplary sequence of the invention, e.g., a sequence as set forth in SEQ ID NO:1, SEQ ID NO:3, SEQ ID NO:5, SEQ ID NO:7, SEQ ID NO:9, SEQ ID NO:11, SEQ ID NO:13, SEQ ID NO:15, SEQ ID NO:17, SEQ ID NO:19, SEQ ID NO:21, SEQ ID NO:23 or SEQ ID NO:25 and subsequences thereof, or a nucleic acid that encodes a polypeptide of the invention. The stringent conditions can be highly stringent conditions, medium stringent conditions, low stringent conditions, including the high and reduced stringency conditions described herein.

In alternative embodiments, nucleic acids of the invention as defined by their ability to hybridize under stringent conditions can be between about five residues and the full length of nucleic acid of the invention; e.g., they can be at least 5, 10, 15, 20, 25, 30, 35, 40, 50, 55, 60, 65, 70, 75, 80, 90, 100, 150, 200, 250, 300, 350, 400, 450, 500, 550, 600, 650, 700, 750, 800, 850, 900, 950, 1000, or more, residues in length. Nucleic acids shorter than full length are also included. These nucleic acids can be useful as, e.g., hybridization probes, labeling probes, PCR oligonucleotide probes, iRNA, antisense or sequences encoding antibody binding peptides (epitopes), motifs, active sites and the like.

In one aspect, nucleic acids of the invention are defined by their ability to hybridize under high stringency conditions comprising about 50% formamide at about 37°C to 42°C. In one aspect, nucleic acids of the invention are defined by their ability to hybridize under reduced stringency conditions comprising about 35% to 25% formamide at about 30°C to 35°C.

Alternatively, nucleic acids of the invention are defined by their ability to hybridize under high stringency comprising conditions at 42°C in 50% formamide, 5X SSPE, 0.3% SDS, and a repetitive sequence blocking nucleic acid, such as cot-1 or salmon sperm DNA (e.g., 200 n/ml sheared and denatured salmon sperm DNA). In one aspect, nucleic acids of the invention are defined by their ability to hybridize under reduced stringency conditions comprising 35% formamide at a reduced temperature of 35°C.

Following hybridization, the filter may be washed with 6X SSC, 0.5% SDS at 50°C. These conditions are considered to be "moderate" conditions above 25% formamide and "low" conditions below 25% formamide. A specific example of "moderate" hybridization conditions is when the above hybridization is conducted at 30% formamide. A specific example of "low stringency" hybridization conditions is when the above hybridization is conducted at 10% formamide.

The temperature range corresponding to a particular level of stringency can be further narrowed by calculating the purine to pyrimidine ratio of the nucleic acid of interest and adjusting the temperature accordingly. Nucleic acids of the invention are also defined by their ability to hybridize under high, medium, and low stringency conditions as set forth in Ausubel and Sambrook. Variations on the above ranges and conditions are well known in the art. Hybridization conditions are discussed further, below.

The above procedure may be modified to identify nucleic acids having decreasing levels of homology to the probe sequence. For example, to obtain nucleic acids of decreasing homology to the detectable probe, less stringent conditions may be used. For example, the hybridization temperature may be decreased in increments of 5°C from 68°C to 42°C in a hybridization buffer having a Na⁺ concentration of approximately 1 M. Following hybridization, the filter may be washed with 2X SSC, 0.5% SDS at the temperature of hybridization. These conditions are considered to be "moderate" conditions above 50°C and "low" conditions below 50°C. A specific example of "moderate" hybridization conditions is when the above hybridization is conducted at 55°C. A specific example of "low stringency" hybridization conditions is when the above hybridization is conducted at 45°C. Alternatively, the hybridization may be carried out in buffers, such as 6X SSC, containing formamide at a temperature of 42°C. In this case, the concentration of formamide in the hybridization buffer may be reduced in 5% increments from 50% to 0% to identify clones having decreasing levels of homology to the probe. Following hybridization, the filter may be washed with 6X SSC, 0.5% SDS at 50°C. These conditions are considered to be "moderate" conditions above 25% formamide and "low" conditions below 25% formamide. A specific example of "moderate" hybridization conditions is when the above hybridization is conducted at 30% formamide. A specific example of "low stringency" hybridization conditions is when the above hybridization is conducted at 10% formamide.
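The stepwise reduction in stringency and the "moderate"/"low" labels described above can be sketched in Python as follows. The step sizes, ranges and cut-offs (50°C for temperature, 25% for formamide) are taken from the text; the function names are hypothetical, and because the text does not say how a value exactly at the cut-off is classified, the sketch returns "unspecified" in that case as an assumption.

```python
from typing import List


def descending_steps(start: int, stop: int, step: int) -> List[int]:
    """List values from `start` down toward `stop` in fixed decrements.

    Values below `stop` are never produced, so the last entry may sit
    above `stop` when the range is not an exact multiple of `step`.
    """
    if step <= 0:
        raise ValueError("step must be positive")
    values = []
    current = start
    while current >= stop:
        values.append(current)
        current -= step
    return values


def classify_stringency(value: float, cutoff: float) -> str:
    """Label a condition 'moderate' above the cut-off and 'low' below it."""
    if value > cutoff:
        return "moderate"
    if value < cutoff:
        return "low"
    return "unspecified"  # assumption: the text does not classify the cut-off itself


if __name__ == "__main__":
    # Temperature series in degrees C, then formamide series in percent
    for temp in descending_steps(68, 42, 5):
        print(temp, classify_stringency(temp, 50))
    for pct in descending_steps(50, 0, 5):
        print(pct, classify_stringency(pct, 25))
```

With 5°C decrements starting at 68°C, the series reaches 43°C rather than landing exactly on 42°C, so a final step at 42°C would have to be added by hand if the lower bound itself is to be tested.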

However, the selection of a hybridization format is not critical - it is the stringency of the wash conditions that sets forth the conditions which determine whether a nucleic acid is within the scope of the invention. Wash conditions used to identify nucleic acids within the scope of the invention include, e.g.: a salt concentration of about 0.02 molar at pH 7 and a temperature of at least about 50°C or about 55°C to about 60°C; or, a salt concentration of about 0.15 M NaCl at 72°C for about 15 minutes; or, a salt concentration of about 0.2X SSC at a temperature of at least about 50°C or about 55°C to about 60°C for about 15 to about 20 minutes; or, the hybridization complex is washed twice with a solution with a salt concentration of about 2X SSC containing 0.1% SDS at room temperature for 15 minutes and then washed twice by 0.1X SSC containing 0.1% SDS at 68°C for 15 minutes; or, equivalent conditions. See Sambrook, Tijssen and Ausubel for a description of SSC buffer and equivalent conditions.

These methods may be used to isolate nucleic acids of the invention.

Modification of Nucleic Acids

The invention provides methods of generating variants of the nucleic acids of the invention, e.g., those encoding a hydrolase of the invention. These methods can be repeated or used in various combinations to generate hydrolases or antibodies having an altered or different activity or an altered or different stability from that of a hydrolase or antibody encoded by the template nucleic acid. These methods also can be repeated or used in various combinations, e.g., to generate variations in gene / message expression, message translation or message stability. In another aspect, the genetic composition of a cell is altered by, e.g., modification of a homologous gene ex vivo, followed by its reinsertion into the cell.

A nucleic acid of the invention can be altered by any means. For example, random or stochastic methods, or non-stochastic or "directed evolution" methods, can be used; see, e.g., U.S. Patent No. 6,361,974. Methods for random mutation of genes are well known in the art, see, e.g., U.S. Patent No. 5,830,696. For example, mutagens can be used to randomly mutate a gene. Mutagens include, e.g., ultraviolet light or gamma irradiation, or a chemical mutagen, e.g., mitomycin, nitrous acid, photoactivated psoralens, alone or in combination, to induce DNA breaks amenable to repair by recombination. Other chemical mutagens include, for example, sodium bisulfite, nitrous acid, hydroxylamine, hydrazine or formic acid. Other mutagens are analogues of nucleotide precursors, e.g., nitrosoguanidine, 5-bromouracil, 2-aminopurine, or acridine. These agents can be added to a PCR reaction in place of the nucleotide precursor thereby mutating the sequence. Intercalating agents such as proflavine, acriflavine, quinacrine and the like can also be used.

Any technique in molecular biology can be used, e.g., random PCR mutagenesis, see, e.g., Rice (1992) Proc. Natl. Acad. Sci. USA 89:5467-5471; or, combinatorial multiple cassette mutagenesis, see, e.g., Crameri (1995) Biotechniques 18:194-196. Alternatively, nucleic acids, e.g., genes, can be reassembled after random, or "stochastic," fragmentation, see, e.g., U.S. Patent Nos. 6,291,242; 6,287,862; 6,287,861; 5,955,358; 5,830,721; 5,824,514; 5,811,238; 5,605,793. In alternative aspects, modifications, additions or deletions are introduced by error-prone PCR, shuffling, oligonucleotide-directed mutagenesis, assembly PCR, sexual PCR mutagenesis, in vivo mutagenesis, cassette mutagenesis, recursive ensemble mutagenesis, exponential ensemble mutagenesis, site-specific mutagenesis, gene reassembly, Gene Site Saturated Mutagenesis™ (GSSM™), synthetic ligation reassembly (SLR), recombination, recursive sequence recombination, phosphothioate-modified DNA mutagenesis, uracil-containing template mutagenesis, gapped duplex mutagenesis, point mismatch repair mutagenesis, repair-deficient host strain mutagenesis, chemical mutagenesis, radiogenic mutagenesis, deletion mutagenesis, restriction-selection mutagenesis, restriction-purification mutagenesis, artificial gene synthesis, ensemble mutagenesis, chimeric nucleic acid multimer creation, and/or a combination of these and other methods.

Polypeptides and peptides

The invention provides isolated or recombinant polypeptides having a sequence identity (e.g., at least 50% sequence identity) to an exemplary sequence of the invention, e.g., SEQ ID NO:2, SEQ ID NO:4, SEQ ID NO:6, SEQ ID NO:8, SEQ ID NO:10, SEQ ID NO:12, SEQ ID NO:14, SEQ ID NO:16, SEQ ID NO:18, SEQ ID NO:20, SEQ ID NO:22, SEQ ID NO:24, or SEQ ID NO:26. As discussed above, the identity can be over the full length of the polypeptide, or, the identity can be over a region of at least about 50, 60, 70, 80, 90, 100, 150, 200, 250, 300, 350, 400, 450, 500, 550, 600, 650, 700 or more residues. Polypeptides of the invention can also be shorter than the full length of exemplary polypeptides. In one aspect, the invention provides a polypeptide comprising only a subsequence of a sequence of the invention, exemplary subsequences can be about 5, 10, 15, 20, 25, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80, 85, 90, 100, 125, 150, 175, 200, 250, 300, 350, 400, 450, 500, 550, 600, 650, 700, or more residues. In alternative aspects, the invention provides polypeptides (peptides, fragments) ranging in size between about 5 and the full length of a polypeptide, e.g., an enzyme, such as a hydrolase, including an esterase, an acylase, a lipase, a phospholipase or a protease; exemplary sizes being of about 5, 10, 15, 20, 25, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80, 85, 90, 100, 125, 150, 175, 200, 250, 300, 350, 400, 450, 500, 550, 600, 650, 700, or more residues, e.g., contiguous residues of an exemplary hydrolase of the invention. Peptides of the invention can be useful as, e.g., labeling probes, antigens, toleragens, motifs, hydrolase active sites. Polypeptides of the invention also include antibodies capable of binding to a hydrolase of the invention.

The polypeptides of the invention include hydrolases in an active or inactive form. For example, the polypeptides of the invention include proproteins before "maturation" or processing of prepro sequences, e.g., by a proprotein-processing enzyme, such as a proprotein convertase to generate an "active" mature protein. The polypeptides of the invention include hydrolases inactive for other reasons, e.g., before "activation" by a post-translational processing event, e.g., an endo- or exo-peptidase or proteinase action, a phosphorylation event, an amidation, a glycosylation or a sulfation, a dimerization event, and the like. Methods for identifying "prepro" domain sequences and signal sequences are well known in the art, see, e.g., Van de Ven (1993) Crit. Rev. Oncog. 4(2):115-136. For example, to identify a prepro sequence, the protein is purified from the extracellular space and the N-terminal protein sequence is determined and compared to the unprocessed form. The polypeptides of the invention include all active forms, including active subsequences, e.g., catalytic domains or active sites, of an enzyme of the invention. In one aspect, the invention provides catalytic domains or active sites as set forth below. In one aspect, the invention provides a peptide or polypeptide comprising or consisting of an active site domain as predicted through use of a database such as Pfam (which is a large collection of multiple sequence alignments and hidden Markov models covering many common protein families, The Pfam protein families database, A. Bateman, E. Birney, L. Cerruti, R. Durbin, L. Etwiller, S. R. Eddy, S. Griffiths-Jones, K.L. Howe, M. Marshall, and E. L. L. Sonnhammer, Nucleic Acids Research, 30(1):276- 280, 2002) or equivalent. The invention includes polypeptides with or without a signal sequence and/or a prepro sequence. The invention includes polypeptides with heterologous signal sequences and/or prepro sequences. 
The prepro sequence (including a sequence of the invention used as a heterologous prepro domain) can be located on the amino terminal or the carboxy terminal end of the protein. The invention also includes isolated or recombinant signal sequences, prepro sequences and catalytic domains (e.g., "active sites") comprising sequences of the invention.

Polypeptides and peptides of the invention can be isolated from natural sources, be synthetic, or be recombinantly generated polypeptides. Peptides and proteins can be recombinantly expressed in vitro or in vivo. The peptides and polypeptides of the invention can be made and isolated using any method known in the art. Polypeptide and peptides of the invention can also be synthesized, whole or in part, using chemical methods well known in the art. See e.g., Caruthers (1980) Nucleic Acids Res. Symp. Ser. 215-223; Horn (1980) Nucleic Acids Res. Symp. Ser. 225-232; Banga, A. K., Therapeutic Peptides and Proteins, Formulation, Processing and Delivery Systems (1995) Technomic Publishing Co., Lancaster, PA. For example, peptide synthesis can be performed using various solid-phase techniques (see e.g., Roberge (1995) Science 269:202; Merrifield (1997) Methods Enzymol. 289:3-13) and automated synthesis may be achieved, e.g., using the ABI 431A Peptide Synthesizer (Perkin Elmer) in accordance with the instructions provided by the manufacturer. The invention also provides methods for modifying the polypeptides of the invention by either natural processes, such as post-translational processing (e.g., phosphorylation, acylation, etc), or by chemical modification techniques, and the resulting modified polypeptides. Modifications can occur anywhere in the polypeptide, including the peptide backbone, the amino acid side-chains and the amino or carboxyl termini. It will be appreciated that the same type of modification may be present in the same or varying degrees at several sites in a given polypeptide. Also a given polypeptide may have many types of modifications. 
Modifications include acetylation, acylation, ADP-ribosylation, amidation, covalent attachment of flavin, covalent attachment of a heme moiety, covalent attachment of a nucleotide or nucleotide derivative, covalent attachment of a lipid or lipid derivative, covalent attachment of a phosphatidylinositol, cross-linking, cyclization, disulfide bond formation, demethylation, formation of covalent cross-links, formation of cysteine, formation of pyroglutamate, formylation, gamma-carboxylation, glycosylation, GPI anchor formation, hydroxylation, iodination, methylation, myristoylation, oxidation, pegylation, proteolytic processing, phosphorylation, prenylation, racemization, selenoylation, sulfation, and transfer-RNA mediated addition of amino acids to protein such as arginylation. See, e.g., Creighton, T.E., Proteins - Structure and Molecular Properties 2nd Ed., W.H. Freeman and Company, New York (1993); Posttranslational Covalent Modification of Proteins, B.C. Johnson, Ed., Academic Press, New York, pp. 1-12 (1983). Solid-phase chemical peptide synthesis methods can also be used to synthesize the polypeptides, or fragments thereof, of the invention. Such methods have been known in the art since the early 1960's (Merrifield, R. B., J. Am. Chem. Soc., 85:2149-2154, 1963; see also Stewart, J. M. and Young, J. D., Solid Phase Peptide Synthesis, 2nd Ed., Pierce Chemical Co., Rockford, Ill., pp. 11-12) and have recently been employed in commercially available laboratory peptide design and synthesis kits (Cambridge Research Biochemicals). Such commercially available laboratory kits have generally utilized the teachings of H. M. Geysen et al., Proc. Natl. Acad. Sci. USA, 81:3998 (1984) and provide for synthesizing peptides upon the tips of a multitude of "rods" or "pins" all of which are connected to a single plate.
When such a system is utilized, a plate of rods or pins is inverted and inserted into a second plate of corresponding wells or reservoirs, which contain solutions for attaching or anchoring an appropriate amino acid to the tips of the pins or rods. By repeating such a process step, i.e., inverting and inserting the tips of the rods or pins into appropriate solutions, amino acids are built into desired peptides. In addition, a number of FMOC peptide synthesis systems are available. For example, assembly of a polypeptide or fragment can be carried out on a solid support using an Applied Biosystems, Inc. Model 431A™ automated peptide synthesizer. Such equipment provides ready access to the peptides of the invention, either by direct synthesis or by synthesis of a series of fragments that can be coupled using other known techniques.

Enzymes of the invention

The invention provides novel hydrolases, including esterases, acylases, lipases, phospholipases or proteases, e.g., proteins comprising at least about 50% sequence identity to an exemplary polypeptide of the invention, e.g., a protein having a sequence as set forth in SEQ ID NO:2, SEQ ID NO:4, SEQ ID NO:6, SEQ ID NO:8, SEQ ID NO:10, SEQ ID NO:12, SEQ ID NO:14, SEQ ID NO:16, SEQ ID NO:18, SEQ ID NO:20, SEQ ID NO:22, SEQ ID NO:24, or SEQ ID NO:26, and methods for making and using them. The polypeptides of the invention can have any hydrolase activity, e.g., an esterase, acylase, lipase, phospholipase or protease activity. In alternative aspects, the hydrolases of the invention can have modified or new activities as compared to the exemplary hydrolases or the activities described herein. For example, the invention includes hydrolases with and without signal sequences and the signal sequences themselves. The invention includes immobilized hydrolases, anti-hydrolase antibodies and fragments thereof. The invention includes homodimers and heterocomplexes, e.g., fusion proteins, heterodimers, etc., comprising the hydrolases of the invention. The invention includes hydrolases having activity over a broad range of high and low temperatures and pH values (e.g., acidic and basic aqueous conditions).

The invention includes hydrolase enzymes which are non-naturally occurring hydrolases having a different hydrolase activity, stability, substrate specificity, pH profile and/or performance characteristic as compared to the precursor hydrolase from which they are derived. These hydrolases have an amino acid sequence not found in nature. They can be derived by substitution of a plurality of amino acid residues of a precursor hydrolase with different amino acids. The precursor hydrolase may be a naturally-occurring hydrolase or a recombinant hydrolase. In one aspect, the hydrolase variants encompass the substitution of any of the naturally occurring L-amino acids at the designated amino acid residue positions.

The invention provides fusion of N-terminal or C-terminal subsequences of enzymes of the invention (e.g., signal sequences, prepro sequences) with other polypeptides, active proteins or protein fragments. The production of an enzyme of the invention (e.g., a hydrolase, e.g., a lipase such as a phospholipase) may also be accomplished by expressing the enzyme as an inactive fusion protein that is later activated by a proteolytic cleavage event (using either an endogenous or exogenous protease activity, e.g. trypsin) that results in the separation of the fusion protein partner and the mature enzyme, e.g., hydrolase of the invention. In one aspect, the fusion protein of the invention is expressed from a hybrid nucleotide construct that encodes a single open reading frame containing the following elements: the nucleotide sequence for the fusion protein, a linker sequence (defined as a nucleotide sequence that encodes a flexible amino acid sequence that joins two less flexible protein domains), protease cleavage recognition site, and the mature enzyme (e.g., any enzyme of the invention, e.g., a hydrolase) sequence. In alternative aspects, the fusion protein can comprise a pectate lyase sequence, a xylanase sequence, a phosphatidic acid phosphatase sequence, or another sequence, e.g., a sequence that has previously been shown to be over-expressed in a host system of interest. Any host system can be used (see discussion, above), for example, E. coli or Pichia pastoris. The arrangement of the nucleotide sequences in the chimeric nucleotide construction can be determined based on the protein expression levels achieved with each fusion construct. 
Proceeding from the 5' end of the nucleotide construct to the 3' end of the construct, in one aspect, the nucleotide sequences are assembled as follows: Signal sequence/fusion protein/linker sequence/protease cleavage recognition site/mature enzyme (e.g., any enzyme of the invention, e.g., a hydrolase) or Signal sequence/pro sequence/mature enzyme/linker sequence/fusion protein. The expression of the enzyme (e.g., any enzyme of the invention, e.g., a hydrolase) as an inactive fusion protein may improve the overall expression of the enzyme's sequence, may reduce any potential toxicity associated with the overproduction of active enzyme and/or may increase the shelf life of the enzyme prior to use, because the enzyme would be inactive until the fusion protein, e.g., pectate lyase, is separated from the enzyme, e.g., hydrolase of the invention.

Immobilized hydrolases

In one aspect, the hydrolases of the invention, e.g., esterases, acylases, lipases, phospholipases or proteases, are used in immobilized forms, e.g., to process lipids, in the structured synthesis of lipids, to digest proteins and the like. Any immobilization method or form of support can be used, e.g., arrays, beads, capillary supports and the like, as described above. In one aspect, hydrolase immobilization can occur upon an inert support such as diethylaminoethyl-cellulose, porous glass, chitin or cells. Cells that express hydrolases of the invention can be immobilized by cross-linking, e.g., with glutaraldehyde, to a substrate surface. Immobilized hydrolases of the invention can be prepared containing hydrolase bound to a dry, porous particulate hydrophobic support, with a surfactant, such as a polyoxyethylene sorbitan fatty acid ester or a polyglycerol fatty acid ester. The support can be an aliphatic olefinic polymer, such as a polyethylene or a polypropylene, a homo- or copolymer of styrene or a blend thereof or a pre-treated inorganic support. These supports can be selected from aliphatic olefinic polymers, oxidation polymers, blends of these polymers or pre-treated inorganic supports in order to make these supports hydrophobic. This pre-treatment can comprise silanization with an organic silicon compound. The inorganic material can be a silica, an alumina, a glass or a ceramic. Supports can be made from polystyrene, copolymers of styrene, polyethylene, polypropylene or from co-polymers derived from (meth)acrylates. See, e.g., U.S. Patent No. 5,773,266. The hydrolase enzymes, fragments thereof and nucleic acids that encode the enzymes and fragments can be affixed to a solid support. This is often economical and efficient in the use of the hydrolases in industrial processes.
For example, a consortium or cocktail of hydrolase enzymes (or active fragments thereof), which are used in a specific chemical reaction, can be attached to a solid support and dunked into a process vat. The enzymatic reaction can occur. Then, the solid support can be taken out of the vat, along with the enzymes affixed thereto, for repeated use. In one embodiment of the invention, an isolated nucleic acid of the invention is affixed to a solid support. In another embodiment of the invention, the solid support is selected from the group of a gel, a resin, a polymer, a ceramic, a glass, a microelectrode and any combination thereof.

For example, solid supports useful in this invention include gels. Some examples of gels include Sepharose, gelatin, glutaraldehyde, chitosan-treated glutaraldehyde, albumin-glutaraldehyde, chitosan-Xanthan, toyopearl gel (polymer gel), alginate, alginate-polylysine, carrageenan, agarose, glyoxyl agarose, magnetic agarose, dextran-agarose, poly(Carbamoyl Sulfonate) hydrogel, BSA-PEG hydrogel, phosphorylated polyvinyl alcohol (PVA), monoaminoethyl-N-aminoethyl (MANA), amino, or any combination thereof.

Other solid supports useful in the present invention are resins or polymers. Some examples of resins or polymers include cellulose, acrylamide, nylon, rayon, polyester, anion-exchange resin, AMBERLITE™ XAD-7, AMBERLITE™ XAD-8, AMBERLITE™ IRA-94, AMBERLITE™ IRC-50, polyvinyl, polyacrylic, polymethacrylate, or any combination thereof.

Another type of solid support useful in the present invention is ceramic. Some examples include non-porous ceramic, porous ceramic, SiO₂, Al₂O₃. Another type of solid support useful in the present invention is glass. Some examples include non-porous glass, porous glass, aminopropyl glass or any combination thereof. Another type of solid support that can be used is a microelectrode. An example is a polyethyleneimine-coated magnetite. Graphitic particles can be used as a solid support. Other types of solid support useful in the present invention are diatomaceous earth products and silicates. Some examples include CELITE®, KENITE®, DIACTIV®, PRIMISIL®, DIAFIL® diatomites and MICRO-CEL®, CALFLO®, SILASORB™, and CELKATE® synthetic calcium and magnesium silicates.

Another example of a solid support is a cell, such as a red blood cell.

Screening for Lipase/Esterase Activity

Colonies were picked with sterile toothpicks and used to singly inoculate each of the wells of 96-well microtiter plates. The wells contained 250 μL of LB media with 100 μg/mL ampicillin, 80 μg/mL methicillin, and 10% v/v glycerol (LB Amp/Meth, glycerol). The cells were grown overnight at 37°C without shaking. This constituted generation of the "Source GenBank." Each well of the Source GenBank thus contained a stock culture of E. coli cells, each of which contained a pBluescript with a unique DNA insert.

Plates of the Source GenBank were used to multiply inoculate a single plate (the "condensed plate") containing in each well 200 μL of LB Amp/Meth, glycerol. This step was performed using the High Density Replicating Tool (HDRT) of the Beckman Biomek with a 1% bleach, water, isopropanol, air-dry sterilization cycle in between each inoculation. Each well of the condensed plate thus contained 10 to 12 different pBluescript clones from each of the source library plates. The condensed plate was grown for 16 hours at 37°C and then used to inoculate two white 96-well Polyfiltronics microtiter daughter plates containing in each well 250 μL of LB Amp/Meth (no glycerol). The original condensed plate was put in storage at -80°C. The two condensed daughter plates were incubated at 37°C for 18 hours.

The short chain esterase '600 μM substrate stock solution' was prepared as follows: 25 mg of each of the following compounds was dissolved in the appropriate volume of DMSO to yield a 25.2 mM solution. The compounds used were (R,S)-5-hydroxy-3-oxo-5-phenethyl-octanoic acid ethyl ester and (R,S)-5-hydroxy-3-oxo-5-phenethyl-octanoic acid methyl ester. Two hundred fifty microliters of each DMSO solution was added to ca. 9 mL of 50 mM, pH 7.5 HEPES buffer which contained 0.6% of Triton X-100 and 0.6 mg per mL of dodecyl maltoside (Anatrace, Maumee, OH). The volume was taken to 10.5 mL with the above HEPES buffer to yield a slightly cloudy suspension.

The long chain '600 μM substrate stock solution' was prepared as follows: 25 mg of each of the following compounds was dissolved in DMSO to 25.2 mM as above. The compounds used were (R,S)-5-hydroxy-3-oxo-5-phenethyl-octanoic acid ethyl ester and (R,S)-5-hydroxy-3-oxo-5-phenethyl-octanoic acid methyl ester. All required brief warming in a 70°C bath to achieve dissolution. Two hundred fifty microliters of each DMSO solution was added to the HEPES buffer and diluted to 10.5 mL as above. All seven umbelliferones were obtained from Sigma Chemical Co. (St. Louis, MO).

Fifty μL of the long chain esterase or short chain esterase '600 μM substrate stock solution' was added to each of the wells of a white condensed plate using the Biomek to yield a final concentration of substrate of about 100 μM. The fluorescence values were recorded (excitation=326 nm, emission=450 nm) on a plate-reading fluorometer immediately after addition of the substrate. The plate was incubated at 70°C for 60 minutes in the case of the long chain substrates, and 30 minutes at RT in the case of the short chain substrates. The fluorescence values were recorded again. The initial and final fluorescence values were compared to determine if an active clone was present.
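The concentrations quoted in the substrate preparation can be checked with a simple C1·V1 = C2·V2 dilution calculation. The following is a minimal sketch only; the helper name `dilute` is hypothetical, and the well calculation assumes that the 50 μL of stock is added to the 250 μL of culture already present in each daughter-plate well.

```python
def dilute(c_stock, v_aliquot, v_total):
    """Concentration after bringing v_aliquot of a stock at c_stock
    to a final volume v_total (C1*V1 = C2*V2). Volumes share one unit."""
    return c_stock * v_aliquot / v_total

# Values taken from the protocol text.
dmso_stock_uM = 25.2e3   # 25.2 mM DMSO solution, expressed in micromolar
aliquot_mL = 0.250       # 250 microliters of DMSO solution
final_mL = 10.5          # final volume in HEPES buffer

stock_uM = dilute(dmso_stock_uM, aliquot_mL, final_mL)

# Assumption: the well already holds 250 uL of culture when 50 uL is added.
well_uM = dilute(stock_uM, 50.0, 250.0 + 50.0)

print(f"substrate stock: {stock_uM:.0f} uM, in-well: {well_uM:.0f} uM")
```

Under these assumptions the stock works out to 600 μM and the in-well concentration to 100 μM, consistent with the figures stated in the text.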

To isolate the individual clone which carried the activity, the Source GenBank plates were thawed and the individual wells used to singly inoculate a new plate containing LB Amp/Meth. As above, the plate was incubated at 37°C to grow the cells, 50 μL of 600 μM substrate stock solution was added using the Biomek and the fluorescence was determined. Once the active well from the source plate was identified, cells from this active well were streaked on agar with LB/Amp/Meth and grown overnight at 37°C to obtain single colonies. Eight single colonies were picked with a sterile toothpick and used to singly inoculate the wells of a 96-well microtiter plate. The wells contained 250 μL of LB Amp/Meth. The cells were grown overnight at 37°C without shaking. A 200 μL aliquot was removed from each well and assayed with the appropriate long or short chain substrates as above. The most active clone was identified and the remaining 50 μL of culture was used to streak an agar plate with LB/Amp/Meth. Eight single colonies were picked, grown and assayed as above. The most active clone was used to inoculate 3 mL cultures of LB/Amp/Meth, which were grown overnight. The plasmid DNA was isolated from the cultures and utilized for sequencing, the sequence information of which is incorporated herein.
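The comparison of initial and final fluorescence readings used to pick the most active well can be sketched as follows. This is an illustration only: the text does not state how "most active" was quantified, so the sketch assumes it means the largest increase in fluorescence, and the function name and the readings are hypothetical.

```python
def most_active_well(initial, final):
    """Return (well, increase) for the well whose fluorescence rose the most.

    `initial` and `final` map well labels to fluorescence readings taken
    immediately after substrate addition and after incubation.
    """
    increases = {well: final[well] - initial[well] for well in initial}
    best = max(increases, key=increases.get)
    return best, increases[best]

# Hypothetical readings for three wells (arbitrary fluorescence units).
initial = {"A1": 120.0, "A2": 118.0, "A3": 125.0}
final = {"A1": 130.0, "A2": 960.0, "A3": 131.0}

best_well, delta = most_active_well(initial, final)
print(best_well, delta)
```

Ranking by the difference rather than by the final reading alone keeps wells with a high starting background from being mistaken for active clones.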

In one embodiment, the hydrolases of the invention and/or the polynucleotides encoding them can suitably be used in a process for the preparation of an enantiomerically enriched compound of formula 1

wherein R¹ stands for OR¹², wherein R¹² stands for an ester residue or wherein R¹ stands for OR¹⁴, wherein R¹⁴ stands for an ester residue, but is not the same as R¹², wherein R³ respectively R⁴ stands for a C₁-C₈-alkyl-, a C₆-C₁₀-aryl-C₁-C₄-alkyl- or a C₃-C₈-cycloalkyl-C₁-C₄-alkyl- rest group and preferably, wherein R⁴ respectively R³ stands for a C₁-C₈-alkyl- rest group, and wherein R³ and R⁴ are not the same.

Therefore, in one aspect, the invention relates to a process for the preparation of an enantiomerically enriched compound of formula 1

(1)

wherein R¹ stands for OR¹², wherein R¹² stands for an ester residue or wherein R¹ stands for OR¹⁴, wherein R¹⁴ stands for an ester residue, but is not the same as R¹², wherein R³ respectively R⁴ stands for a C₁-C₈-alkyl-, a C₆-C₁₀-aryl-C₁-C₄-alkyl- or a C₃-C₈-cycloalkyl-C₁-C₄-alkyl- rest group, preferably wherein R⁴ respectively R³ stands for a C₁-C₈-alkyl- rest group, and wherein R³ and R⁴ are not the same, by reacting a stereoselective hydrolase according to the invention with a mixture of enantiomers of a compound of formula 2

(2)

wherein R³, R⁴ and R¹² are as defined above in the presence of a nucleophile, which nucleophile is H₂O or which nucleophile is an alcohol of formula HOR¹⁴, wherein R¹⁴ is as defined above and either,

- in case the nucleophile is H₂O or an alcohol of formula HOR¹⁴, wherein R¹⁴ is as defined above, collecting the remaining enantiomerically enriched compound of formula 1A

(1A)

wherein R³, R⁴ and R¹² are as defined above or

- in case the nucleophile is the alcohol of formula HOR¹⁴, wherein R¹⁴ is as defined above, collecting the resulting enantiomerically enriched ester of formula 3

wherein R³, R⁴ and R¹⁴ are as defined above.

Preferably, the invention relates to a process for the preparation of an enantiomerically enriched compound of formula 1

(1)

wherein R¹ stands for OR¹², wherein R¹² stands for an ester residue or wherein R¹ stands for OR¹⁴, wherein R¹⁴ stands for an ester residue, but is not the same as R¹², wherein R³ respectively R⁴ stands for a C₁-C₈-alkyl-, a C₆-C₁₀-aryl-C₁-C₄-alkyl- or a C₃-C₈-cycloalkyl-C₁-C₄-alkyl- rest group, preferably wherein R⁴ respectively R³ stands for a C₁-C₈-alkyl- rest group and wherein R³ and R⁴ are not the same, by reacting a stereoselective hydrolase according to the invention with a mixture of enantiomers of a compound of formula 2

wherein R³, R⁴ and R¹² are as defined above in the presence of a nucleophile, which nucleophile is H₂O or which nucleophile is an alcohol of formula HOR¹⁴, wherein R¹⁴ is as defined above and collecting the remaining enantiomerically enriched compound of formula 1A

(1A)

wherein R³, R⁴ and R¹² are as defined above.

Preferably, R¹² and/or R¹⁴ stand for a C₁-C₆ alkyl such as methyl, ethyl, propyl, i-propyl, isobutyl, sec-butyl, tert-butyl, isopentyl, neopentyl, tert-pentyl, isohexyl or for a substituted alkyl, for example benzyl; for instance, R¹² may stand for a C₁-C₄ alkyl or benzyl. More preferably, R¹² and/or R¹⁴ stand for methyl or ethyl.

Preferably R³ stands for propyl or iso-propyl, R⁴ preferably stands for phenylpropyl. Most preferably, the enantiomerically enriched compound of formula 1 is (R)-5-hydroxy-3-oxo-5-phenethyl-octanoic acid ethyl ester, (S)-5-hydroxy-3-oxo-5-phenethyl-octanoic acid ethyl ester, (R)-5-hydroxy-3-oxo-5-phenethyl-octanoic acid methyl ester or (S)-5-hydroxy-3-oxo-5-phenethyl-octanoic acid methyl ester.

In the context of this invention, the term ester residue is defined as any residue that can take the place of H in the corresponding carboxylic acid. The choice of ester residue is in principle not critical for the invention. For example R¹² or R¹⁴ may stand for an alkyl, preferably a C₁-C₆ alkyl, for example for methyl, ethyl, propyl, i-propyl, t-butyl, n-butyl or n-propyl, preferably for methyl or ethyl; or for an optionally substituted aryl, for example for phenyl, p-nitrophenyl, or pentafluorophenyl; or for an optionally substituted alkylaryl, for example for benzyl. In the framework of the invention with the term 'enantiomerically enriched' is meant 'having an enantiomeric excess (e.e.) of either the (R)- or (S)-enantiomer of a compound'. Preferably, the enantiomeric excess is > 80%, more preferably > 85%, even more preferably > 90%, in particular > 95%, more in particular > 97%, even more in particular > 98%, most in particular > 99%. With 'mixture of enantiomers' is meant a random mixture of (R)- and (S)-enantiomers. Typically, a racemic mixture of the compound of formula 2 is used (i.e. when (R):(S) is 1:1), but of course the process of the invention may also be performed, for further enantiomeric enrichment, on an already enantiomerically enriched mixture of enantiomers. With 'stereoselectivity' of the hydrolase is meant that the hydrolase preferably catalyzes the conversion of one of the enantiomers of the compound of formula 2. In the compound of formula 2 at least the C-atom substituted with R³ and R⁴ is chiral (since R³ and R⁴ are not the same) and the stereoselectivity of the hydrolase according to the invention should at least discriminate for the chiral center formed by this C-atom.
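Enantiomeric excess, as the term is conventionally used, is the absolute difference between the amounts of the two enantiomers divided by their sum. The sketch below illustrates that conventional definition; the function name and the example amounts are hypothetical and are not taken from the specification.

```python
def enantiomeric_excess(r_amount, s_amount):
    """Enantiomeric excess in percent: |R - S| / (R + S) * 100.

    Amounts may be moles, masses or peak areas, as long as both use
    the same unit.
    """
    total = r_amount + s_amount
    if total <= 0:
        raise ValueError("total amount must be positive")
    return abs(r_amount - s_amount) / total * 100.0

# A racemic (1:1) mixture has no excess of either enantiomer.
racemic_ee = enantiomeric_excess(1.0, 1.0)

# Hypothetical enriched sample: 99.5 parts (R) to 0.5 parts (S).
enriched_ee = enantiomeric_excess(99.5, 0.5)

print(racemic_ee, enriched_ee)
```

On this definition, a preferred e.e. of > 98% corresponds to a mixture in which the major enantiomer makes up more than 99% of the total.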

If a hydrolase with a specific stereoselectivity is used, one enantiomer is preferably converted. This means that if a hydrolase with an opposite stereoselectivity is used, the other enantiomer is preferably converted.

The enantiomer that is preferably converted strongly depends on the structure of the substrate and the stereoselectivity of the hydrolase. Empirical methods that can predict the enantiomer that is preferably converted by the hydrolase exist (Hydrolases in organic synthesis. Eds. Kazlauskas and Bornscheuer. Wiley-VCH, 1999). In addition, the person skilled in the art may also rely on experimental data. Thus, an enzymatic reaction using a stereoselective hydrolase will either yield the (S)-product or the (R)-product, leaving behind the remaining substrate with the opposite stereochemical configuration.

The stereoselectivity of a (poly)peptide may be expressed in terms of E-ratio, the ratio of the specificity constants V_max/K_m of the conversion of the two enantiomers as described in C.-S. Chen, Y. Fujimoto, G. Girdaukas, C. J. Sih, J. Am. Chem. Soc. 1982, 104, 7294-7299. Preferably, the hydrolase according to the invention has an E-ratio > 5, more preferably an E-ratio > 10, even more preferably an E-ratio > 50, most preferably an E-ratio > 100.
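In practice the E-ratio of an irreversible kinetic resolution is often estimated from the measured conversion c and the enantiomeric excess of the remaining substrate ee_s, using the integrated relation E = ln[(1 - c)(1 - ee_s)] / ln[(1 - c)(1 + ee_s)] given in the Chen et al. reference cited above. The sketch below applies that relation; the specification does not say which method was used to determine E, so this is an illustration under that assumption, with a hypothetical function name and example values.

```python
import math

def e_ratio_from_substrate(conversion, ee_substrate):
    """Estimate the E-ratio from conversion and substrate e.e. (fractions).

    Uses E = ln[(1-c)(1-ee_s)] / ln[(1-c)(1+ee_s)] and assumes an
    irreversible reaction without product inhibition.
    """
    c, ee = conversion, ee_substrate
    if not 0.0 < c < 1.0:
        raise ValueError("conversion must lie strictly between 0 and 1")
    if not 0.0 <= ee < 1.0:
        raise ValueError("ee_substrate must satisfy 0 <= ee < 1")
    if (1.0 - c) * (1.0 + ee) >= 1.0:
        # ee_s cannot exceed c / (1 - c), the value for perfect selectivity.
        raise ValueError("ee_substrate is too high for this conversion")
    return math.log((1.0 - c) * (1.0 - ee)) / math.log((1.0 - c) * (1.0 + ee))

# Hypothetical measurement: 50% conversion, remaining substrate at 90% e.e.
e_value = e_ratio_from_substrate(0.50, 0.90)
print(f"E = {e_value:.1f}")  # about 58, i.e. within the preferred range E > 50
```

A non-selective enzyme leaves the substrate racemic at every conversion, and the relation then returns E = 1, which is a quick sanity check on the implementation.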

The choice of the reaction conditions of the process of the invention depends on the choice of hydrolase. Preferably, the temperature of the process is chosen between 0 and 90°C, in particular between 5 and 65°C, more in particular between 10 and 50°C. Preferably, the pH of the process is chosen between 4 and 10, more preferably between 4 and 9. The pH is preferably kept constant, for instance by using a pH-stat titrator or by addition of sodium bicarbonate, preferably in amounts of 1 molar equivalent with respect to the substrate (i.e. the mixture of enantiomers of a compound of formula 2).

The choice of solvent depends on which nucleophile is used and on the choice of enzyme. For instance, if the nucleophile is water, the solvent may for example be water; an aqueous solvent, for example water with a water-miscible organic solvent, for instance t-butanol, dioxane, methanol, ethanol, tetrahydrofuran, acetone or dimethylsulfoxide; or a two-phase system of water and a water-immiscible solvent, for example toluene, hexane, heptane, methyl t-butyl ether, methyl iso-butyl ketone. If the nucleophile is the alcohol of formula HOR¹⁴, the solvent is preferably an organic solvent comprising at least 1 equivalent of HOR¹⁴, wherein R¹⁴ stands for an ester residue, but is not the same as R¹². Examples of organic solvents which may comprise the alcohol are THF, CH₃CN, heptane, toluene, hexane, methyl t-butyl ether, methyl iso-butyl ketone. Of course the organic solvent may also be the same as the alcohol used as the nucleophile.

Preferably, the solvent is water, optionally with ethanol or tert-butanol as a co-solvent. Preferably the concentration of the mixture of enantiomers of the compound of formula 2 is chosen between 20 and 40 w/w%, more preferably between 25 and 35 w/w%.

Collecting includes for example isolation by means of conventional methods, for example ultrafiltration, concentration, column chromatography, extraction or crystallization and further reaction of the obtained product (resulting enantiomerically enriched ester of formula 3, or remaining enantiomerically enriched ester of formula 1A).

The resulting enantiomerically enriched compound of formula 3 or the remaining enantiomerically enriched compound of formula 1A can suitably be used as building blocks in the preparation of pharmaceuticals.

For example, the enantiomerically enriched compound of formula 3 or the enantiomerically enriched compound of formula 1A may be cyclized in the presence of a base to form the corresponding enantiomerically enriched compound of formula 6

wherein R³ and R⁴ are as defined above. Cyclization of the compound of formula 3 or the compound of formula 1A can be performed in a manner known per se, for instance by using a base as described in WO 02/068403 or in US 6,500,963 B2. Examples of bases suitable for the cyclization of the compound of formula 3 or the compound of formula 1A include: carbonate, hydroxide, metal hydrides, for instance sodium hydride; organometals, metal amides, for instance butyllithium; metal dialkylamides, for instance lithium diethylamide, lithium diisopropylamide and metal hexamethyldisilazanes, for instance lithium hexamethyldisilazane. Metal cations that may be used include, for example, lithium, sodium, potassium, rubidium, cesium, magnesium, calcium, titanium, silicon, tin and lanthanoids, preferably lithium or sodium, more preferably lithium.

Therefore, in another aspect, the invention relates to a process for the preparation of an enantiomerically enriched lactone of formula 6

wherein R³ and R⁴ are as defined above comprising the steps of reacting a stereoselective hydrolase according to the invention with a mixture of enantiomers of a compound of formula 2

wherein R³, R⁴ and R¹² are as defined above in the presence of a nucleophile, which nucleophile is H₂O or which nucleophile is an alcohol of formula HOR¹⁴, wherein R¹⁴ is as defined above and either, - in case the nucleophile is H₂O or an alcohol of formula HOR¹⁴, wherein R¹⁴ is as defined above, collecting the remaining enantiomerically enriched compound of formula 1A

(1A)

wherein R³, R⁴ and R¹² are as defined above or - in case the nucleophile is the alcohol of formula HOR¹⁴, wherein R¹⁴ is as defined above - collecting the resulting enantiomerically enriched ester of formula 3

wherein R³, R⁴ and R¹⁴ are as defined above and cyclization of the resulting enantiomerically enriched ester of formula 3 or of the remaining enantiomerically enriched ester of formula 1A in the presence of a base.

A process for the preparation of enantiomerically enriched compounds of formula 6 is disclosed in US 6,500,963. A disadvantage of said non-enzymatic process is that it requires the use of an additional compound (a chiral amino alcohol), which either has to be recycled, with non-quantitative recovery, or is lost.

The process of the invention is a commercially feasible enzymatic process. For instance, it only requires the use of catalytic amounts of stereoselective hydrolytic enzyme and the recovery of (expensive) chiral amino alcohol is not necessary.

Furthermore, the process of the invention is the first process in which a compound having a remote and quaternary chiral center is stereoselectively converted by an enzyme.

Chiral center has its conventional meaning in the art. For example, a C-atom having four different substituents is a chiral center.

Quaternary center has its conventional meaning in the art. In the compound of formula 1, 1A, 2 or 3, the chiral center is the C-atom substituted with R³ and R⁴. This chiral center is also a quaternary center.

Reactive center also has its conventional meaning in the art. For example, in the compound of formula 1, the reactive center is -C(O)R¹.

By a remote chiral center is meant a chiral center that is at least three bonds away from the reactive center. Enzyme-catalyzed stereoselective conversions of compounds with remote chiral centers are known. For example, in Hydrolases in Organic Synthesis, Eds. Kazlauskas and Bornscheuer, Wiley-VCH, 1999, it is disclosed that some carboxylic acids with a remote stereocenter can be enzymatically converted with some degree of stereoselectivity. Hughes et al. (1990) J. Org. Chem. 55, 6252-6259 show the asymmetric hydrolysis of esters having remote prochiral centers. Bhalerao et al. report on the regio- and stereoselective hydrolysis of (ω-2)-acetoxy-ω-bromoalkenoates by a lipase from Candida cylindracea. Hedenström et al. (2002) Tetrahedron: Asymmetry 13, 835-844 report the highly stereoselective Candida rugosa lipase-catalyzed esterification of the 2- to 8-methyldecanoic acids, showing enantiomeric ratios ranging from E = 2.8 to E = 68. Fadnavis et al. (1997) Tetrahedron: Asymmetry 8, 337-339 show a lipase-catalyzed stereoselective esterification of racemic α-lipoic acid, which has a chiral center four carbon atoms away from the reactive center; this resulted in a product with a maximum enantiomeric excess of 23.8%.

Enzyme-catalyzed stereoselective conversions of compounds having quaternary chiral centers are also known. For example, Yee et al. (1992) J. Org. Chem. 57, 3525-3527 describe the enzyme-catalyzed stereoselective hydrolysis of tertiary α-substituted carboxylic acid esters. Sugai et al. disclose the enzymatic preparation of enantiomerically enriched tertiary α-benzyloxy acid esters using a lipase derived from Candida rugosa, for which enzyme they report E-ratios between E = 10 and E = 52. Spero et al. (1996) J. Org. Chem. 61, 7398-7401 show the stereoselective synthesis of (S)-α,α-disubstituted phenethylamine from α,α-disubstituted amino acid esters using a lipase.

However, until the present invention, no example has been reported of a process wherein a compound having a remote and quaternary chiral center is stereoselectively converted by an enzyme.

Enantiomerically enriched 5,6-dihydro-4-hydroxy-2-pyrones, such as the compounds of formula 6, are important building blocks for the synthesis of a number of pharmaceutically active compounds, for instance for non-peptidic HIV protease inhibitors, more specifically for the potent and orally bioavailable HIV protease inhibitor tipranavir (US 6,500,963 B2). Also described in US 6,500,963 B2 is a process for the preparation of tipranavir from enantiomerically enriched 5,6-dihydro-4-hydroxy-2-pyrones, in particular from enantiomerically enriched 5,6-dihydro-4-hydroxy-6-phenethyl-6-propyl-2H-pyran-2-one. US 6,500,963 B2 describes the synthesis of 5,6-dihydro-4-hydroxy-2-pyrone-sulfonamides and discloses that a key step in the synthesis of said compounds is the reaction of 5,6-dihydro-4-hydroxy-2-pyrones with a suitably substituted carbonyl compound to form a condensation product (Knoevenagel reaction). This condensation product in turn can be reduced in the presence of H₂ and a suitable chiral metal catalyst to yield the final structure of the pharmaceutically active compound. For instance, for tipranavir, enantiomerically enriched 5,6-dihydro-4-hydroxy-6-phenethyl-6-propyl-2H-pyran-2-one can be reacted with a compound of formula 4

(4)

after which the formed condensation product can be reduced in the presence of H₂ and a suitable chiral metal catalyst to form tipranavir.

The preparation of the racemic mixture of the compound of formula 2 is described, for instance, in WO 02/068403, hereby incorporated by reference.

The invention will now be illustrated by the following examples without, however, being limited thereto.

General procedures

Production of the enzyme with sequence SEQ ID NO:24

To be able to efficiently produce the enzyme with sequence SEQ ID NO:24, the gene was transferred to an expression system suitable for large scale fermentations.

The gene encoding the enzyme with sequence SEQ ID NO:24 (given in SEQ ID NO:23) was amplified with PCR according to standard procedures and equipped with NdeI and XmaI restriction sites at the N-terminal and C-terminal ends of the coding sequence, respectively. The PCR fragment obtained was subsequently digested with NdeI and XmaI, and inserted in the similarly digested plasmid pKAFssECaro (described in the published patent application WO 00/66751), thereby replacing the PenG acylase gene of the parent plasmid.

The expression vector thus obtained was introduced in E. coli K12 strains HB101 (ATCC 33694) and RV308 (ATCC 31608), and transformants were selected for expression of active enzyme with sequence SEQ ID NO:24.

10 L scale fermentations were carried out essentially as described in WO 00/66751, and the enzyme was recovered by killing the cells by adding 1-octanol to a final concentration of 4 g/L. Next, the broth was homogenized and cell debris was removed by filtration, followed by active carbon treatment and germ filtration. The resulting product was washed and concentrated by means of ultrafiltration and diafiltration.

Achiral HPLC: Determination of the conversion

Conversions were monitored by analyzing the concentration of the remaining ester by means of achiral HPLC, using a Varian Inertsil ODS-3 column (50 mm x 4.6 mm I.D., 3 μm). The HPLC was operated at 45°C and a flow of 1.5 mL/min. The components were eluted using a gradient of 50 mM H₃PO₄ buffer pH 2.3/acetonitrile (85/15) and pure acetonitrile.

Sample preparation: Approximately 200 mg of the reaction mixture was diluted with ethanol up to a volume of 25 mL, after which the concentration of the remaining ester was determined.

Chiral HPLC: Determination of the enantiomeric excess

Chiral HPLC analysis for determination of the enantiomeric excess of the product and the remaining substrate was performed using a Daicel Chiralpak AD column (250 mm x 4.6 mm I.D.). The HPLC was operated at 22°C at a flow rate of 1.2 mL/min using a mobile phase consisting of n-heptane/ethanol/trifluoroacetic acid (93/7/0.1). Detection of components was performed using a UV-detector at a wavelength of 254 nm.

Sample preparation: About 0.5 mL of reaction mixture was acidified with 50 μL 37% HCl. To this, 0.5 mL ethyl acetate was added. The resulting mixture was stirred for 3 minutes, after which the organic and aqueous layers were separated with an Eppendorf centrifuge. After separating the organic layer and evaporating the ethyl acetate in a speed vac centrifuge, the samples were analyzed.
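The enantiomeric excess follows from the integrated areas of the two enantiomer peaks in the chiral chromatogram. A minimal sketch is given below; it assumes baseline-separated peaks and equal UV response for both enantiomers, and the function name and the peak areas are illustrative, not taken from the examples.

```python
def enantiomeric_excess(area_major: float, area_minor: float) -> float:
    """Return the enantiomeric excess as a fraction (0..1) from two peak areas."""
    total = area_major + area_minor
    if total <= 0:
        raise ValueError("peak areas must sum to a positive value")
    # ee = |A1 - A2| / (A1 + A2); valid when both enantiomers respond equally
    return abs(area_major - area_minor) / total


# Hypothetical peak areas, for illustration only
ee = enantiomeric_excess(950.0, 50.0)
print(f"ee = {ee:.1%}")
```

A racemic sample gives equal peak areas and therefore an ee of zero.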

Example 1: Determination of activity and enantioselectivity of three hydrolases

In order to determine the activity and enantioselectivity of the enzymes with sequences SEQ ID NO:24 (encoded by SEQ ID NO:23), SEQ ID NO:6 (encoded by SEQ ID NO:5) and SEQ ID NO:22 (encoded by SEQ ID NO:21), three 50 mL Erlenmeyer flasks were filled with 8.9 mL 25 mM potassium phosphate buffer pH 7.0, 100 mg (R,S)-5-hydroxy-3-oxo-5-phenethyl-octanoic acid ethyl ester, 1 mL methanol and 15 mg/mL of the enzyme. The flasks were incubated on an orbital shaker at 22°C. At approximately 0, 1, 2.5, 3.5, 19 and 30 hours, 500 μL of reaction mixture was taken, acidified with 50 μL of 4 M HCl, extracted with ethyl acetate (2x) and concentrated in a speed vac. The residues were diluted 10 times in HPLC mobile phase and analyzed on chiral HPLC. The enantiomeric excess of the remaining ester (ee), the conversion (C) and the enantiomeric ratio (E) are shown in Table 1. The results indicate that the hydrolase with sequence SEQ ID NO:24 shows the highest enantioselectivity.

Table 1: Conversion (C), enantiomeric excess of the remaining ester (ee) and enantiomeric ratio (E) of the hydrolases having a sequence as listed in SEQ ID NO:24, SEQ ID NO:6 and SEQ ID NO:22 for the hydrolysis of (R,S)-5-hydroxy-3-oxo-5-phenethyl-octanoic acid ethyl ester.
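The text does not state how the enantiomeric ratio was calculated. One common way, assuming an irreversible kinetic resolution, is the relation E = ln[(1 - C)(1 - ee)] / ln[(1 - C)(1 + ee)], with C the conversion and ee the enantiomeric excess of the remaining ester, both as fractions. The sketch below applies that relation; the function name and the data point are illustrative and are not values from Table 1.

```python
import math


def enantiomeric_ratio(conversion: float, ee_substrate: float) -> float:
    """E from conversion and ee of the remaining substrate (both as fractions)."""
    if not 0.0 < conversion < 1.0:
        raise ValueError("conversion must lie strictly between 0 and 1")
    if not 0.0 <= ee_substrate < 1.0:
        raise ValueError("ee_substrate must lie in [0, 1)")
    denominator = math.log((1.0 - conversion) * (1.0 + ee_substrate))
    if denominator >= 0.0:
        # ee of the remaining substrate cannot exceed C / (1 - C)
        raise ValueError("ee_substrate is too high for this conversion")
    numerator = math.log((1.0 - conversion) * (1.0 - ee_substrate))
    return numerator / denominator


# Hypothetical data point: 50% conversion, 90% ee of the remaining ester
e_value = enantiomeric_ratio(0.50, 0.90)
print(f"E = {e_value:.0f}")  # about 58
```

A non-selective enzyme (ee of zero at any conversion) gives E = 1 with this relation.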

Example 2: Comparison of a methyl-ester and an ethyl-ester

Reaction a: To 7.5 mL water, 0.3 g KHCO₃ and 1.49 g (5.1 mmol) (R,S)-5-hydroxy-3-oxo-5-phenethyl-octanoic acid methyl ester were added. The resulting emulsion was stirred vigorously for 10 minutes at 40°C. The enzymatic conversion was started by adding 5 mL of an enzyme solution containing the enzyme with sequence SEQ ID NO:24.

Reaction b: To 5 mL water, 0.3 g KHCO₃ and 1.56 g (5.1 mmol) (R,S)-5-hydroxy-3-oxo-5-phenethyl-octanoic acid ethyl ester were added. The resulting emulsion was stirred vigorously for 10 minutes at 40°C. The enzymatic conversion was started by adding 10 mL of an enzyme solution containing the enzyme with sequence SEQ ID NO:24.

The performance of the enzyme on the methyl ester was compared to that on the ethyl ester. To this end, samples were taken periodically and analyzed by means of chiral HPLC (for procedure see above). The results are presented in Table 2, showing the enantiomeric excess of the remaining ester (ee) as a function of time (t) for the methyl ester and the ethyl ester. From the HPLC data it was calculated that the enantioselectivity for the ethyl ester was higher than for the methyl ester.

Table 2: Conversion of (R,S)-5-hydroxy-3-oxo-5-phenethyl-octanoic acid methyl ester (reaction a) and (R,S)-5-hydroxy-3-oxo-5-phenethyl-octanoic acid ethyl ester (reaction b), using the enzyme with sequence SEQ ID NO:24.
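The molar amounts quoted for the two substrates can be cross-checked from their molecular formulas. The sketch below assumes the formulas C₁₇H₂₄O₄ (methyl ester) and C₁₈H₂₆O₄ (ethyl ester), which follow from the compound names but are not stated in the text, together with standard atomic masses; all names in the code are illustrative.

```python
# Standard atomic masses in g/mol
ATOMIC_MASS = {"C": 12.011, "H": 1.008, "O": 15.999}

# Assumed molecular formulas, derived from the compound names
METHYL_ESTER = {"C": 17, "H": 24, "O": 4}
ETHYL_ESTER = {"C": 18, "H": 26, "O": 4}


def molar_mass(formula: dict) -> float:
    """Molar mass in g/mol for a formula given as {element: count}."""
    return sum(ATOMIC_MASS[element] * count for element, count in formula.items())


def amount_mmol(mass_g: float, formula: dict) -> float:
    """Amount of substance in mmol for a given mass in grams."""
    return 1000.0 * mass_g / molar_mass(formula)


print(f"methyl ester: {amount_mmol(1.49, METHYL_ESTER):.1f} mmol")  # 5.1
print(f"ethyl ester:  {amount_mmol(1.56, ETHYL_ESTER):.1f} mmol")   # 5.1
```

Both reactions therefore start from the same molar amount of substrate, so the two time courses can be compared directly.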

Example 3: Determination of the pH-optimum

The optimal pH for the conversion of (R,S)-5-hydroxy-3-oxo-5-phenethyl-octanoic acid ethyl ester was determined using a H.E.L. Auto-Mate. Reaction mixtures containing 3 g (9.8 mmol) (R,S)-5-hydroxy-3-oxo-5-phenethyl-octanoic acid ethyl ester, 11 g water and 15.4 g of cell free extract of the enzyme with sequence SEQ ID NO:24 were prepared on ice in 50 mL H.E.L. reaction vessels. The reaction vessels were transferred to the temperature controlled oil baths of the H.E.L. Auto-Mate. The temperature of the oil bath was maintained at 0°C. The reactions were stirred at 900 RPM, and the pH of the reactions was set at 6, 7, 8 and 9 using 1 mol/L phosphoric acid (for pH 6) or 1 mol/L sodium hydroxide (for pH 7, 8 and 9). The temperature of the oil bath was then increased to 40°C. The pH of the reactions was maintained at pH 6, 7, 8 and 9, respectively, using 1 mol/L sodium hydroxide. Samples were taken at regular intervals and analyzed as described above. The results, presented in Table 3, show the conversion as a function of time. The results indicate that the highest conversion was obtained at pH 9.

Table 3: Determination of the pH-optimum of the enzyme with sequence SEQ ID NO:24 for the hydrolysis of (R,S)-5-hydroxy-3-oxo-5-phenethyl-octanoic acid ethyl ester.

Example 4: Determination of the temperature-optimum

The effect of temperature on the performance of the enzyme with sequence SEQ ID NO:24 was investigated by starting 4 parallel reactions ranging in temperature from 20°C to 50°C. To this end, 250 mg (0.82 mmol) (R,S)-5-hydroxy-3-oxo-5-phenethyl-octanoic acid ethyl ester and 25 mL 25 mM potassium phosphate buffer pH 7 were placed in a reaction vessel of a H.E.L. Auto-Mate. The reaction vessels were thermostated at 20°C, 30°C, 40°C and 50°C, respectively. After stirring for 5 minutes, the reactions were started by adding 200 mg of lyophilized enzyme with sequence SEQ ID NO:24. The pH was kept constant at pH 7 by adding 1 mol/L NaOH. Samples were taken and analyzed by means of HPLC. The results, presented in Table 4, show the enantiomeric excess of the remaining ester (ee) as a function of the reaction time (t). From Table 4 it can be concluded that the highest hydrolase activity is obtained at 50°C.

Table 4: Determination of the temperature-optimum of the enzyme with sequence SEQ ID NO:24 for the hydrolysis of (R,S)-5-hydroxy-3-oxo-5-phenethyl-octanoic acid ethyl ester.

Example 5: pH control using KHCO₃ or titration with NaOH

Reaction a: A thermostated reaction vessel was filled with 9 mL 50 mM potassium phosphate buffer, pH 7.5, and 1.5 g (4.9 mmol) (R,S)-5-hydroxy-3-oxo-5-phenethyl-octanoic acid ethyl ester. The resulting emulsion was stirred vigorously for 10 minutes at 40°C. The enzymatic conversion was started by adding 5 mL of an enzyme solution containing the enzyme with sequence SEQ ID NO:24. The pH of the reaction mixture was kept at pH 7.5 by adding 1 mol/L NaOH using a Metrohm pH stat, model 718 Stat Titrino.

Reaction b: In parallel, a separate vessel was filled with 9 mL water, 0.3 g KHCO₃ and 1.5 g (4.9 mmol) (R,S)-5-hydroxy-3-oxo-5-phenethyl-octanoic acid ethyl ester. The resulting emulsion was stirred vigorously for 10 minutes at 40°C. The enzymatic conversion was started by adding 5 mL of an enzyme solution containing the enzyme with sequence SEQ ID NO:24.

Periodically, samples were taken and analyzed by means of achiral and chiral HPLC, according to the procedures described above. The conversion (C) and the enantiomeric excess of the remaining substrate (ee) are presented in Table 5. As shown in Table 5, controlling the pH with KHCO₃ or by titration with NaOH gives similar results with respect to hydrolase activity and enantioselectivity.

Table 5: Conversion (C) and enantiomeric excess of the remaining substrate (ee) as a function of time for the hydrolysis of (R,S)-5-hydroxy-3-oxo-5-phenethyl-octanoic acid ethyl ester using the enzyme with sequence SEQ ID NO:24, keeping the pH constant using either titration with NaOH (reaction a) or KHCO₃ (reaction b).
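The base demand of the two pH-control schemes can be estimated with simple stoichiometry. The sketch below assumes that each mole of ester hydrolyzed releases one mole of acid that must be neutralized, and that the resolution is stopped at about 50% conversion; both are assumptions made for illustration, and the function names are hypothetical. The molar mass of KHCO₃ is taken as 100.11 g/mol.

```python
KHCO3_MOLAR_MASS = 100.11  # g/mol, from standard atomic masses


def acid_released_mmol(substrate_mmol: float, conversion: float) -> float:
    """Acid formed (mmol), assuming one acid equivalent per ester hydrolyzed."""
    return substrate_mmol * conversion


def naoh_volume_ml(acid_mmol: float, naoh_mol_per_l: float = 1.0) -> float:
    """Titrant volume (mL) needed to neutralize the acid; mmol / (mol/L) = mL."""
    return acid_mmol / naoh_mol_per_l


def khco3_mmol(mass_g: float) -> float:
    """Amount of bicarbonate (mmol) in a given mass of KHCO3."""
    return 1000.0 * mass_g / KHCO3_MOLAR_MASS


# 4.9 mmol substrate as in reactions a and b; 50% conversion is an assumed end point
acid = acid_released_mmol(4.9, 0.5)
print(f"acid formed:     {acid:.2f} mmol")
print(f"NaOH (1 mol/L):  {naoh_volume_ml(acid):.2f} mL")
print(f"KHCO3 available: {khco3_mmol(0.3):.2f} mmol")
```

Under these assumptions the 0.3 g of KHCO₃ in reaction b slightly exceeds the acid formed at 50% conversion, which is consistent with both schemes being able to hold the pH over the course of the resolution.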

Example 6: Conversion with crude cell free extract

A thermostated 1.5 L glass reactor was filled with 203 g water and 20.6 g KHCO₃. The temperature was brought to 40°C and 103.2 g (0.34 mol) (R,S)-5-hydroxy-3-oxo-5-phenethyl-octanoic acid ethyl ester was added. The resulting emulsion was stirred vigorously for 10 minutes at 40°C. The enzymatic reaction was started by addition of 746.5 g crude cell free extract containing the enzyme with sequence SEQ ID NO:24.

Periodically, samples were taken and analyzed by means of achiral and chiral HPLC (for procedures see above). The enzyme preferentially hydrolyzed the (S)-enantiomer, leaving behind the (R)-ester. The results are presented in Table 6, showing the conversion and the enantiomeric excess of the remaining (R)-ester.

Table 6: Conversion and resulting ee of the remaining (R)-ester for the hydrolysis of (R,S)-5-hydroxy-3-oxo-5-phenethyl-octanoic acid ethyl ester using the enzyme with sequence SEQ ID NO:24.
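From a measured conversion and ee, the amounts of (R)- and (S)-ester still present follow from a mass balance. The sketch below uses the 0.34 mol starting amount of this example; the conversion and ee values are hypothetical placeholders, not the values of Table 6, and the function name is illustrative.

```python
def remaining_enantiomers(n0_mol: float, conversion: float, ee_remaining: float):
    """Return (major, minor) amounts in mol of the two enantiomers of unreacted ester."""
    remaining = n0_mol * (1.0 - conversion)
    # ee = (major - minor) / (major + minor), with major + minor = remaining
    major = remaining * (1.0 + ee_remaining) / 2.0
    minor = remaining * (1.0 - ee_remaining) / 2.0
    return major, minor


# Hypothetical end point: 52% conversion, 98% ee of the remaining (R)-ester
r_ester, s_ester = remaining_enantiomers(0.34, 0.52, 0.98)
print(f"(R)-ester left: {r_ester:.4f} mol")
print(f"(S)-ester left: {s_ester:.4f} mol")
```

Because the starting material is racemic, at most half of the initial 0.34 mol can be recovered as (R)-ester, whatever the selectivity of the enzyme.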

Claims

1. A process for the preparation of an enantiomerically enriched compound of formula 1

(1)

wherein R¹ stands for OR¹², wherein R¹² stands for an ester residue or wherein R¹ stands for OR¹⁴, wherein R¹⁴ stands for an ester residue, but is not the same as R¹², wherein R³ respectively R⁴ stands for a C₁-C₈-alkyl-, a C₆-C₁₀-aryl-C₁-C₄-alkyl- or a C₃-C₈-cycloalkyl-C₁-C₄-alkyl- rest group and wherein R³ and R⁴ are not the same by reacting a stereoselective hydrolase encoded by an isolated or recombinant nucleic acid comprising a nucleic acid sequence having at least 50% sequence identity to SEQ ID NO:1, SEQ ID NO:3, SEQ ID NO:5, SEQ ID NO:7, SEQ ID NO:9, SEQ ID NO:11, SEQ ID NO:13, SEQ ID NO:15, SEQ ID NO:17, SEQ ID NO:19, SEQ ID NO:21, SEQ ID NO:23 or SEQ ID NO:25 over a region of at least about 100 residues with a mixture of enantiomers of a compound of formula 2

wherein R³, R⁴ and R¹² are as defined above in the presence of a nucleophile, which nucleophile is H₂O or which nucleophile is an alcohol of formula HOR¹⁴, wherein R¹⁴ is as defined above and either, in case the nucleophile is H₂O or an alcohol of formula HOR¹⁴, wherein R¹⁴ is as defined above, collecting the remaining enantiomerically enriched compound of formula 1A

(1A)

wherein R³, R⁴ and R¹⁴ are as defined above.

2. A process for the preparation of an enantiomerically enriched compound of formula 1

(1)

wherein R¹ stands for OR¹², wherein R¹² stands for an ester residue or wherein R¹ stands for OR¹⁴, wherein R¹⁴ stands for an ester residue, but is not the same as R¹², wherein R³ respectively R⁴ stands for a C₁-C₈-alkyl-, a C₆-C₁₀-aryl-C₁-C₄-alkyl- or a C₃-C₈-cycloalkyl-C₁-C₄-alkyl- rest group and wherein R³ and R⁴ are not the same by reacting a stereoselective hydrolase encoded by an isolated or recombinant nucleic acid, wherein the nucleic acid comprises a sequence that hybridizes under stringent conditions to a nucleic acid comprising SEQ ID NO:1, SEQ ID NO:3, SEQ ID NO:5, SEQ ID NO:7, SEQ ID NO:9, SEQ ID NO:11, SEQ ID NO:13, SEQ ID NO:15, SEQ ID NO:17, SEQ ID NO:19, SEQ ID NO:21, SEQ ID NO:23 or SEQ ID NO:25 with a mixture of enantiomers of a compound of formula 2

wherein R³, R⁴ and R¹² are as defined above in the presence of a nucleophile, which nucleophile is H₂O or which nucleophile is an alcohol of formula HOR¹⁴, wherein R¹⁴ is as defined above and either, in case the nucleophile is H₂O or an alcohol of formula HOR¹⁴, wherein R¹⁴ is as defined above, collecting the remaining enantiomerically enriched compound of formula 1A

(1A)

wherein R³, R⁴ and R¹⁴ are as defined above.

3. A process for the preparation of an enantiomerically enriched compound of formula 1

(1)

wherein R¹ stands for OR¹², wherein R¹² stands for an ester residue or wherein R¹ stands for OR¹⁴, wherein R¹⁴ stands for an ester residue, but is not the same as R¹², wherein R³ respectively R⁴ stands for a C₁-C₈-alkyl-, a C₆-C₁₀-aryl-C₁-C₄-alkyl- or a C₃-C₈-cycloalkyl-C₁-C₄-alkyl- rest group and wherein R³ and R⁴ are not the same by reacting a stereoselective hydrolase having at least 50% sequence identity to SEQ ID NO:2, SEQ ID NO:4, SEQ ID NO:6, SEQ ID NO:8, SEQ ID NO:10, SEQ ID NO:12, SEQ ID NO:14, SEQ ID NO:16, SEQ ID NO:18, SEQ ID NO:20, SEQ ID NO:22, SEQ ID NO:24 or SEQ ID NO:26, over a region of at least about 100 residues with a mixture of enantiomers of a compound of formula 2

(1A)

wherein R³, R⁴ and R¹² are as defined above or - in case the nucleophile is the alcohol of formula HOR¹⁴, wherein R¹⁴ is as defined above - collecting the resulting enantiomerically enriched ester of formula 3

wherein R³, R⁴ and R¹⁴ are as defined above.

4. Process according to any one of claims 1-3, wherein R³ respectively R⁴ stands for a C₁-C₈-alkyl group.

5. Process according to any one of claims 1-4, wherein the stereoselective hydrolase has the amino acid sequence of SEQ ID NO:6, SEQ ID NO:22 or SEQ ID NO:24.

6. Process according to any one of claims 1-5, comprising the steps of reacting the stereoselective hydrolase with a mixture of enantiomers of a compound of formula 2

wherein R³, R⁴ and R¹² are as defined above in the presence of a nucleophile, which nucleophile is H₂O or which nucleophile is an alcohol of formula HOR¹⁴, wherein R¹⁴ is as defined above and collecting the remaining enantiomerically enriched compound of formula 1A

(1A)

wherein R³, R⁴ and R¹² are as defined above.

7. Process according to any one of claims 1-6, wherein the enantiomerically enriched compound of formula 1 is (R)-5-hydroxy-3-oxo-5-phenethyl-octanoic acid ethyl ester, (S)-5-hydroxy-3-oxo-5-phenethyl-octanoic acid ethyl ester, (R)-5-hydroxy-3-oxo-5-phenethyl-octanoic acid methyl ester or (S)-5-hydroxy-3-oxo-5-phenethyl-octanoic acid methyl ester.

8. Process for the preparation of an enantiomerically enriched lactone of formula 6

(6)

wherein R³ and R⁴ are as defined above comprising the steps of the process of any one of claims 1-6 and subsequent cyclization of the resulting enantiomerically enriched ester of formula 3 or of the remaining enantiomerically enriched ester of formula 1A in the presence of a base.

9. Process for the preparation of tipranavir comprising the steps of the process of any of claims 1-8.