WO2004097028A2 - Use of NMR and deuterium for the discovery of biocatalysts and biocatalytic activities - Google Patents

Info

Publication number
WO2004097028A2
Authority
WO
WIPO (PCT)
Prior art keywords
nmr
sample
cells
library
medium
Prior art date
Application number
PCT/US2004/012447
Other languages
English (en)
Other versions
WO2004097028A3 (fr)
Inventor
Venkiteswaran Subramanian
Christopher P. Christenson
Claire B. Conboy
Kai Li
Oscar David Redwine
Lamy Jean Chopin III
Barbara A. Miller
Paul L. Morabito
Richard C. Winterton
Bettina M. Rosner
Original Assignee
Dow Global Technologies Inc.
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Dow Global Technologies Inc.
Publication of WO2004097028A2
Publication of WO2004097028A3

Classifications

    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q: MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00: Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/02: Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving viable microorganisms
    • G: PHYSICS
    • G01: MEASURING; TESTING
    • G01N: INVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
    • G01N24/00: Investigating or analyzing materials by the use of nuclear magnetic resonance, electron paramagnetic resonance or other spin effects
    • G01N24/08: Investigating or analyzing materials by the use of nuclear magnetic resonance, electron paramagnetic resonance or other spin effects by using nuclear magnetic resonance
    • G01N24/088: Assessment or manipulation of a chemical or biochemical reaction, e.g. verification whether a chemical reaction occurred or whether a ligand binds to a receptor in drug screening or assessing reaction kinetics
    • G: PHYSICS
    • G01: MEASURING; TESTING
    • G01R: MEASURING ELECTRIC VARIABLES; MEASURING MAGNETIC VARIABLES
    • G01R33/00: Arrangements or instruments for measuring magnetic variables
    • G01R33/20: Arrangements or instruments for measuring magnetic variables involving magnetic resonance
    • G01R33/44: Arrangements or instruments for measuring magnetic variables involving magnetic resonance using nuclear magnetic resonance [NMR]
    • G01R33/46: NMR spectroscopy
    • G01R33/465: NMR spectroscopy applied to biological material, e.g. in vitro testing
    • G: PHYSICS
    • G01: MEASURING; TESTING
    • G01N: INVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
    • G01N2500/00: Screening for compounds of potential therapeutic value

Definitions

  • the present invention relates to methods and apparatus for the screening of biocatalysts.
  • the present invention relates to high throughput nuclear magnetic resonance methods for the screening of biocatalysts and the discovery of novel biocatalysts and novel biocatalytic activities.
  • Biocatalysts are used for production of chiral molecules, pharmaceutical products and intermediates, agricultural products and intermediates, medium to high value chemicals, and polymers and monomers thereof, from bio- and petro-chemical feedstocks. Biocatalysts are also used as reporting reagents in laboratory and clinical tests, as analyte-specific catalysts in bioanalytical devices such as bioelectrodes and biosensors, and as indicator catalysts for evaluating potential bioactivities and toxicities of chemicals being developed for in vivo administration.
  • biocatalyst discovery consists of three independent but closely related aspects: gene discovery, microbe discovery, and enzyme discovery.
  • In gene discovery, the goal is to isolate and identify, from gene libraries, genes that encode certain functions.
  • Such libraries can be: organism-specific gene libraries, such as microbial gene libraries; organism-non-specific gene libraries, such as environmental gene libraries (e.g., gene libraries obtained from mixed cultures, soils, and sediments); synthetic gene libraries; and gene libraries derived from any such genes (e.g., as by nucleic acid mutation, recombination, and/or selection processes).
  • Traditional microbe discovery involves the screening of microbes from diversified environments to identify those that can catalyze target reactions.
  • Traditional enzyme discovery involves the screening of proprietary or commercially available enzymes and/or enzyme collections for those that can catalyze specific reactions.
  • One approach is to develop analyte-specific chemical assay methods to screen for metabolites produced by targeted biocatalytic reactions; examples of such chemical assay methods include colorimetric, fluorometric, and antibody-based methods (Sarubbi et al, FEBS Lett, 1991, 279(2), 265).
  • the other approach is to use oligonucleotide probe hybridization screening of gene libraries to rapidly identify genes homologous to those known to encode enzymes catalyzing a targeted biocatalytic reaction or a biochemically similar reaction; the enzymes encoded by the homologous genes identified thereby are then expressed and screened for specific reactions of interest (See e.g., U.S. Patent No. 6,030,779).
  • In RNA and DNA hybridization technology, partial sequence information of the target gene is needed for designing the starting probe.
  • This target gene sequence information can be obtained either by use of bioinformatics, wherein partial genetic information is deduced by comparing the consensus regions of similar genes encoding the desired enzymatic activity or a closely related activity, or by partial sequencing of the purified protein and reverse translating a corresponding coding sequence segment.
  • the probe is then designed to contain this target gene sequence information, or its complement. This probe can be used to isolate the gene from microbial gene libraries or from other sources of target nucleic acids.
  • the present invention relates to methods and apparatus for the screening of biocatalysts.
  • the present invention relates to high throughput nuclear magnetic resonance methods for the screening of biocatalysts, and the discovery of novel biocatalysts and novel biocatalytic activities.
  • the present invention provides a process for discovery of biocatalytic activities, comprising providing at least one biological entity suspected of having biocatalytic activity; at least one organic test substrate; a biocatalysis medium; and an NMR apparatus capable of being operated in high-throughput mode; and exposing the biological entity to a medium comprising at least partially deuterated protons; contacting in the biocatalysis medium the biological entity with the test substrate(s) to form a biocatalytic system in which the biological entity is capable of catalyzing the transformation of the test substrate(s) into at least one metabolite; obtaining a test sample containing at least a portion of the biocatalytic system, the test sample containing the metabolite(s) or the test substrate(s); performing high-throughput NMR using the NMR apparatus on the test sample, thereby generating and recording at least one test NMR spectrum; and comparing the test NMR spectrum to a control NMR spectrum in order to identify a difference between the test sample and the control sample.
  • the difference between the test sample and the control sample comprises the presence of the metabolite in the test sample, an increase in concentration of the metabolite in the test sample, or a reduction in concentration of the test substrate in the test sample. In some embodiments, the difference is indicative of the presence of catalytic activity in the biological entity toward the test substrate. In other embodiments, the difference between the test sample and the control sample comprises absence of the metabolite in the test sample, the absence of an increase in concentration of the metabolite in the test sample, or the absence of reduction in concentration of the test substrate in the test sample. In some embodiments, the difference is indicative of the absence of catalytic activity in the biological entity toward the test substrate.
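The comparison of a test NMR spectrum against a control spectrum can be sketched in code. The following is a minimal illustration only, not the patent's procedure: it assumes each spectrum is available as paired lists of chemical shifts (ppm) and intensities, and the function names, the integration regions, and the `min_change` threshold are all hypothetical choices made for this example.

```python
from typing import Dict, Sequence, Tuple

Spectrum = Tuple[Sequence[float], Sequence[float]]  # (ppm values, intensities)


def region_integral(ppm: Sequence[float], intensity: Sequence[float],
                    lo: float, hi: float) -> float:
    """Trapezoidal integral of the spectrum between lo and hi ppm."""
    points = sorted((p, i) for p, i in zip(ppm, intensity) if lo <= p <= hi)
    total = 0.0
    for (p0, i0), (p1, i1) in zip(points, points[1:]):
        total += 0.5 * (i0 + i1) * (p1 - p0)
    return total


def compare_to_control(test: Spectrum, control: Spectrum,
                       substrate_region: Tuple[float, float],
                       metabolite_region: Tuple[float, float],
                       min_change: float = 0.2) -> Dict[str, object]:
    """Flag a sample as active if substrate signal drops or metabolite signal rises.

    min_change is an assumed, arbitrary threshold expressed as a fraction
    of the control's substrate integral.
    """
    sub_test = region_integral(*test, *substrate_region)
    sub_ctrl = region_integral(*control, *substrate_region)
    met_test = region_integral(*test, *metabolite_region)
    met_ctrl = region_integral(*control, *metabolite_region)

    substrate_drop = (sub_ctrl - sub_test) / sub_ctrl if sub_ctrl > 0 else 0.0
    metabolite_gain = met_test - met_ctrl
    active = (substrate_drop >= min_change
              or metabolite_gain > min_change * sub_ctrl)
    return {"substrate_drop": substrate_drop,
            "metabolite_gain": metabolite_gain,
            "active": active}
```

A result with `active` set to `False` corresponds to the case described above in which no metabolite appears and the substrate concentration is not reduced relative to the control.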
  • the method further comprises repeating steps B) through E) until at least one identification is obtained that is indicative of the presence of catalytic activity toward the test substrate.
  • the method further comprises the step of determining the chemical identity of the metabolite.
  • the method further comprises the step of identifying the biocatalyst that performed the biocatalysis.
  • the method further comprises the step of isolating the biocatalyst that performed the biocatalysis.
  • the method further comprises the step of identifying at least a portion of a nucleic acid molecule encoding the biocatalyst that performed the biocatalysis.
  • the method further comprises the step of isolating at least a portion of a nucleic acid molecule encoding the biocatalyst(s) that performed the biocatalysis.
  • the method further comprises the step of comparing the chemical identity or the NMR spectrum of the metabolite to entries in a database selected from the group consisting of a chemical identity database and an NMR spectra database to identify at least one identical or closely related compound.
  • the method further comprises the step of accessing, from the database, data for at least one biological property of the identical or closely related compound; and determining the type or degree of the biological property or properties that would be associated with the metabolite.
  • the organic test substrate has at least one non-exchangeable proton.
  • the NMR apparatus is a high throughput NMR apparatus.
  • the biocatalysis medium is fully deuterated. In other embodiments, the biocatalysis medium is partially deuterated. In still other embodiments, the biocatalysis medium is non-deuterated.
  • exposing the biological entity to a medium comprising at least partially deuterated protons comprises exposing the biocatalyst to a fully deuterated medium. In other embodiments, exposing the biological entity to a medium comprising at least partially deuterated protons comprises exposing the biocatalyst to a partially deuterated medium.
  • the method further comprises providing a library of biological entities.
  • the library of biological entities is contained in a population of cells.
  • the library of biological entities comprises a library of exogenous nucleic acid sequences encoding proteins.
  • the exogenous nucleic acid library comprises a genomic DNA library, a cDNA library, an EST library, an organism library, a mutant library, an enzyme library, a directed evolution library, or a library of randomized sequence segments.
  • the cells are intact cells.
  • the cells are removed from the medium prior to the performing high throughput NMR.
  • the cells are lysed cells.
  • the cells are part of a tissue.
  • the cells comprise mammalian cells, plant cells, protist cells, fungal cells (e.g., Aspergillus species, Candida species, Saccharomyces species, and Yarrowia species), bacterial cells (e.g., E. coli, Bacillus species, Klebsiella species, and Pseudomonas species), or archaea cells.
  • the performing high-throughput NMR comprises performing high-throughput NMR on at least 600, preferably at least 1000, even more preferably at least 2000, still more preferably at least 5000, and yet more preferably at least 10,000 different samples per day.
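The daily throughput tiers translate into a time budget per sample. The short calculation below is only an illustration and assumes continuous, uninterrupted 24-hour operation, which the text does not state; the helper names are hypothetical.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def seconds_per_sample(samples_per_day: int) -> float:
    """Average time available per sample, assuming round-the-clock operation."""
    return SECONDS_PER_DAY / samples_per_day


def samples_per_day_from_hourly(samples_per_hour: int) -> int:
    """Daily throughput implied by an hourly rate sustained for 24 hours."""
    return samples_per_hour * 24


if __name__ == "__main__":
    for n in (600, 1000, 2000, 5000, 10_000):
        print(f"{n:>6} samples/day -> {seconds_per_sample(n):.1f} s per sample")
    # The plug-flow system described elsewhere in this document targets
    # at least 120 samples per hour.
    print(samples_per_day_from_hourly(120), "samples/day at 120 samples/hour")
```

Under this assumption, 600 samples per day allows 144 s per sample, while 10,000 samples per day leaves under 9 s for injection, acquisition, and flushing combined.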
  • the test substrate is a drug candidate.
  • the biocatalytic activity is a drug-metabolizing enzyme.
  • the test substrate is an environmental toxin.
  • the biocatalytic activity is a metabolizing enzyme.
  • the environmental toxin is a pesticide.
  • the environmental toxin is an industrial byproduct.
  • the screening comprises quantitative detection of the presence or absence of alterations to the substrate.
  • the screening comprises detecting the kinetics of formation of alterations to the substrate.
  • performing high throughput NMR comprises performing proton NMR.
  • the present invention further provides a system for discovery of biocatalytic activities, comprising at least one biological entity suspected of having biocatalytic activity; wherein the biological entity has been exposed to a medium comprising at least partially deuterated protons; at least one organic test substrate; and an NMR apparatus configured for high-throughput operation.
  • the system further comprises a biocatalysis medium.
  • the system is configured for determining differences between a test sample comprising the biological entity and the organic test substrate and a control sample comprising the organic test substrate.
  • the difference between the test sample and the control sample comprises the presence of the metabolite in the test sample, an increase in concentration of the metabolite in the test sample, or a reduction in concentration of the test substrate in the test sample.
  • the difference is indicative of the presence of catalytic activity in the biological entity toward the test substrate.
  • the difference between the test sample and the control sample comprises the absence of the metabolite in the test sample, the absence of an increase in concentration of the metabolite in the test sample, or the absence of reduction in concentration of the test substrate in the test sample.
  • the difference is indicative of the absence of catalytic activity in the biological entity toward the test substrate.
  • the system is configured for determining the chemical identity of a metabolite of the test substrate.
  • the system is configured for identifying a biocatalyst responsible for the biocatalytic activity.
  • the system is configured for isolating the biocatalyst. In some embodiments, the system is configured for identifying at least a portion of a nucleic acid molecule encoding the biocatalyst. In some embodiments, the system is configured for the step of comparing the chemical identity or the NMR spectrum of the metabolite to entries in a database selected from the group consisting of a chemical identity database and an NMR spectra database to identify at least one identical or closely related compound. In some embodiments, the system is configured for accessing, from the database, data for at least one biological property of the identical or closely related compound; and determining the type or degree of the biological property or properties associated with the metabolite.
  • the biocatalysis medium is fully deuterated. In other embodiments, the biocatalysis medium is partially deuterated. In still other embodiments, the biocatalysis medium is non-deuterated. In some embodiments, the biological entity has been exposed to a medium comprising fully deuterated protons.
  • the organic test substrate has at least one non-exchangeable proton.
  • the system further comprises a library of biological entities.
  • the library of biological entities is contained in a population of cells.
  • the library of biological entities comprises a library of exogenous nucleic acid sequences encoding proteins.
  • the exogenous nucleic acid library comprises a genomic DNA library, a cDNA library, an EST library, an organism library, a mutant library, an enzyme library, a directed evolution library, or a library of randomized sequence segments.
  • the cells are intact cells. In other embodiments, the cells are lysed cells. In some embodiments, the cells are part of a tissue.
  • the cells comprise mammalian cells, plant cells, protist cells, fungal cells (e.g., Aspergillus species, Candida species, Saccharomyces species, and Yarrowia species), bacterial cells (e.g., E. coli, Bacillus species, Klebsiella species, and Pseudomonas species), or archaea cells.
  • the NMR apparatus is configured for performing high-throughput NMR on at least 600, preferably at least 1000, even more preferably at least 2000, still more preferably at least 5000, and yet more preferably at least 10,000 different samples per day.
  • the test substrate is a drug candidate.
  • the biocatalytic activity is a drug-metabolizing enzyme.
  • the test substrate is an environmental toxin.
  • the biocatalytic activity is a metabolizing enzyme.
  • the environmental toxin is a pesticide.
  • the environmental toxin is an industrial byproduct.
  • the present invention additionally provides a composition comprising an isolated and purified nucleic acid encoding a polypeptide having the amino acid sequence of SEQ ID NO: 10.
  • the nucleic acid is operably linked to a heterologous promoter.
  • the nucleic acid is contained within a vector.
  • the vector is within a host cell.
  • the polypeptide has nitrilase and/or nitrile hydratase activity.
  • the present invention further provides a composition comprising an isolated and purified nucleic acid that hybridizes under conditions of low stringency to a nucleic acid having the sequence of SEQ ID NO: 9.
  • the present invention provides a vector comprising the nucleic acid or a host cell comprising the vector.
  • the host cell is located in an organism, wherein the organism is a non-human animal.
  • the present invention provides a composition comprising a polypeptide encoded by a nucleic acid having a sequence comprising SEQ ID NO:9 or variants thereof that are at least 80%, preferably at least 90% and even more preferably at least 95% identical to SEQ ID NO: 9.
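The identity tiers (at least 80%, 90%, or 95% identical to SEQ ID NO: 9) can be illustrated with a simple percent-identity calculation. This is a sketch under stated assumptions: the two sequences are taken to be already aligned to equal length with `-` marking gaps, identity is computed over all aligned columns that are not gap-gap (definitions of percent identity vary between tools), and the function names are hypothetical. SEQ ID NO: 9 itself is not reproduced in this section, so any input used with this code is a stand-in.

```python
from typing import Optional


def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity between two pre-aligned sequences of equal length."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == "-" and y == "-":
            continue  # column carries no information
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * matches / compared


def identity_tier(pct: float) -> Optional[int]:
    """Highest threshold (95, 90, or 80) met by pct, or None if below 80."""
    for threshold in (95, 90, 80):
        if pct >= threshold:
            return threshold
    return None
```

For real sequences, the alignment step would normally be done first with a dedicated alignment tool; this sketch covers only the final counting step.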
  • the polypeptide has nitrilase activity.
  • the polypeptide further has nitrile hydratase activity.
  • the present invention additionally provides a composition comprising an enzyme having both nitrilase and nitrile hydratase activity.
  • the enzyme has the amino acid sequence of SEQ ID NO: 10.
  • the present invention provides a system, comprising a nuclear magnetic resonance apparatus; and a sample handling apparatus in communication with the nuclear magnetic resonance apparatus, wherein the sample handling apparatus is configured for the sequential injection of a plurality of samples into the nuclear magnetic resonance apparatus, the samples forming discrete plugs separated by a plug of air or other gas.
  • the system is configured for the analysis of at least 120 samples per hour.
  • the sample handling apparatus comprises a 3-way valve and a sample loop.
  • the 3-way valve is located on a cannula control arm of the sample handling apparatus.
  • the 3-way valve comprises a slider valve or a diaphragm valve.
  • the solvent comprises D2O.
  • the sample handling apparatus is a robot.
  • the system is controlled by software and a computer processor.
  • the samples are in a microtitre plate.
  • the system further comprises a second 3-way valve located in between the 3-way valve and the nuclear magnetic resonance apparatus.
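The plug-flow idea, in which samples travel to the NMR apparatus as discrete plugs separated by gas, can be sketched as an injection schedule. This is an illustration only: the 96-well (8 x 12) plate layout, the row-by-row well order, and the function names are assumptions not stated in the text, which specifies only a microtitre plate and a separating plug of air or other gas.

```python
import string
from typing import List, Sequence, Tuple


def plate_wells(rows: int = 8, cols: int = 12) -> List[str]:
    """Well labels for a microtitre plate, row by row (A1, A2, ..., H12)."""
    return [f"{string.ascii_uppercase[r]}{c}"
            for r in range(rows)
            for c in range(1, cols + 1)]


def plug_sequence(samples: Sequence[str],
                  gas: str = "air") -> List[Tuple[str, str]]:
    """Interleave sample plugs with gas plugs so no two samples touch."""
    sequence: List[Tuple[str, str]] = []
    for index, sample in enumerate(samples):
        if index > 0:
            sequence.append(("gas", gas))
        sequence.append(("sample", sample))
    return sequence
```

For example, `plug_sequence(plate_wells()[:3])` yields sample A1, an air plug, sample A2, an air plug, and sample A3, in that order.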
  • the present invention further provides a method, comprising providing a plurality of samples comprising nuclear magnetic resonance detectable compounds; and a system comprising a nuclear magnetic resonance apparatus; and a sample handling apparatus in communication with the nuclear magnetic resonance apparatus, wherein the sample handling apparatus is configured for the sequential injection of samples into the nuclear magnetic resonance apparatus, the samples being discrete plugs separated by a plug of air or other gas; and detecting the compounds with the system.
  • the system is configured for the analysis of at least 120 samples per hour.
  • the sample handling apparatus comprises a 3-way valve and a sample loop.
  • the 3-way valve is located on a cannula control arm of the sample handling apparatus.
  • the 3-way valve comprises a slider valve or a diaphragm valve.
  • the solvent comprises D2O.
  • the sample handling apparatus is a robot.
  • the system is controlled by software and a computer processor.
  • the samples are in a microtitre plate.
  • the system further comprises a second 3-way valve located in between the 3-way valve and the nuclear magnetic resonance apparatus.
  • the plurality of samples comprise a population of cells transformed with an exogenous nucleic acid library, wherein the cells have been contacted with a substrate.
  • detecting the compounds comprises detecting the presence or absence of alterations to the substrate.
  • the population of cells are grown in a medium comprising deuterium substituted for hydrogen.
  • the exogenous nucleic acid library encodes a plurality of biocatalysts.
  • the biocatalysts are enzymes.
  • the present invention also provides a method, comprising providing a nuclear magnetic resonance apparatus and a plurality of samples; and sequentially injecting the plurality of samples into the nuclear magnetic resonance apparatus, wherein samples are discrete plugs separated by a plug of air or other gas.
  • the system is configured for the analysis of at least 120 samples per hour.
  • the sequentially injecting the plurality of samples is performed by a sample handling apparatus comprising a 3-way valve and a sample loop.
  • the 3-way valve is located on a cannula control arm of the sample handling apparatus.
  • the 3-way valve is selected from the group consisting of a slider valve and a diaphragm valve.
  • the sample handling apparatus is a robot.
  • the plurality of samples comprise a population of cells transformed with an exogenous nucleic acid library, wherein the cells have been contacted with a substrate.
  • detecting the compounds comprises detecting the presence or absence of alterations to the substrate.
  • the population of cells are grown in a medium comprising deuterium substituted for hydrogen.
  • the exogenous nucleic acid library encodes a plurality of biocatalysts.
  • the present invention provides a method comprising providing a population of cells transformed with an exogenous nucleic acid library; at least one substrate; and a nuclear magnetic resonance apparatus; and culturing the population of cells in a medium comprising deuterium substituted for hydrogen to generate a population of deuterated cells; exposing the population of deuterated cells to the substrate; and screening the population of deuterated cells for alterations to the substrate using the nuclear magnetic resonance apparatus, wherein alterations to the substrate are indicative of the presence of an activity encoded by a member of the exogenous nucleic acid library.
  • the present invention further provides a system, comprising a population of cells transformed with an exogenous nucleic acid library, wherein the population of cells comprise deuterium substituted for hydrogen; at least one substrate; and a nuclear magnetic resonance apparatus configured for the screening of the population of cells for the presence or absence of alterations to the substrate.
  • the present invention also provides a method comprising providing at least one type of cell; a library of compounds; and a nuclear magnetic resonance apparatus; and culturing the cell in a medium comprising deuterium substituted for hydrogen to generate deuterated cells; exposing the population of deuterated cells to the library of compounds; and screening the population of deuterated cells for alterations to the library of compounds using the nuclear magnetic resonance apparatus.
  • Figure 1 shows 1H NMR comparison of whole-cell biocatalysts. a) Cells cultivated in Luria broth (LB) medium, b) Cells cultivated in deuterated (deu) medium.
  • Figure 2 shows a whole-cell dehalogenation reaction with deu E. coli BL21(DE3)/pKL.DOW.1.110A as a biocatalyst.
  • Figure 3 shows a comparison of NMR spectra of library samples: a) no biocatalyst activity present; b) positive control BL21(DE3)/pKL.DOW.1.110A.
  • Figure 4 shows 1H NMR spectra comparing the per-deuterated culture conditions and the partially deuterated conditions used in some embodiments of the present invention.
  • Spectrum A was obtained using a totally deuterated growth medium containing 5 mM trichloropropane.
  • Spectrum B was obtained after 24 hour culture incubation using LB, and subsequent reaction incubation in minimal medium containing 10% deuterated growth medium.
  • Figure 5 shows an 1H NMR spectrum indicating biocatalysis of hydrocinnamonitrile.
  • the triplets at 3.0 ppm and 2.85 ppm are the aliphatic methylene protons of the substrate.
  • the additional two sets of triplets observed at 2.95 ppm, 2.89 ppm, and 2.60 ppm and 2.48 ppm are indicative of product formation.
  • the two triplets at 2.89 ppm and 2.48 ppm have been assigned to the corresponding acid.
  • the culture procedure used for these samples utilized Luria Broth (LB) for the growth phase and a minimal salt medium with 10% Deu medium for the biocatalysis phase.
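The triplet positions reported for Figure 5 can be used as a small lookup table for labeling observed peaks. The sketch below is illustrative only: the matching tolerance of 0.015 ppm and the function names are assumptions, and the triplets at 2.95 ppm and 2.60 ppm are labeled only as product signals because the text assigns just the 2.89 ppm and 2.48 ppm pair (to the acid).

```python
from typing import Dict, Sequence

# Chemical shifts (ppm) taken from the Figure 5 description.
REFERENCE_TRIPLETS: Dict[float, str] = {
    3.00: "substrate (hydrocinnamonitrile) methylene",
    2.85: "substrate (hydrocinnamonitrile) methylene",
    2.89: "acid product",
    2.48: "acid product",
    2.95: "product (not assigned in the text)",
    2.60: "product (not assigned in the text)",
}


def assign_peaks(observed_ppm: Sequence[float],
                 tolerance: float = 0.015) -> Dict[float, str]:
    """Label each observed peak with the nearest reference triplet, if close enough."""
    assignments: Dict[float, str] = {}
    for peak in observed_ppm:
        nearest = min(REFERENCE_TRIPLETS, key=lambda ref: abs(ref - peak))
        if abs(nearest - peak) <= tolerance:
            assignments[peak] = REFERENCE_TRIPLETS[nearest]
        else:
            assignments[peak] = "unassigned"
    return assignments


def indicates_product(observed_ppm: Sequence[float],
                      tolerance: float = 0.015) -> bool:
    """True if any observed peak matches a product triplet."""
    labels = assign_peaks(observed_ppm, tolerance).values()
    return any("product" in label for label in labels)
```

The tolerance has to stay well below 0.02 ppm here, since the substrate triplet at 2.85 ppm and the acid triplet at 2.89 ppm are only 0.04 ppm apart.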
  • Figure 6 shows the DNA sequence of the genomic insert (1766 bp); the red portion is the putative ORF of the enzyme DOW2447.
  • Figure 7 shows the amino acid sequence of the putative protein encoded by the ORF of the enzyme DOW2447.
  • Figure 8 shows a plasmid map of the plasmid pPMl, which contains the cloned gene DOW2447.
  • Figure 9 illustrates exemplary enzymatic activities of DOW2447 against hydrocinnamonitrile.
  • Figure 10 shows NMR spectra of samples for JM109/pPMl with mono- and di-nitrile substrates. A) 0 hour sample indicating substrates hydrocinnamonitrile and glutaronitrile. B) 48 hour sample indicating amide product from hydrocinnamonitrile and a mixture of amides and acids from the di-nitrile glutaronitrile.
  • Figure 11 shows NMR spectra of samples for JM109/pPMl with the four nitrile compounds (benzonitrile, hydrocinnamonitrile, acetonitrile and butyronitrile) added to the reaction. A) 0 hour sample. B) 24 hour sample.
  • Figure 12 shows the VAST accessory in normal operation. 1) workstation. 2) Liquids handler. 3) Injector. 4) Sample delivery line. 5) Flow-probe. 6) NMR Magnet.
  • Figure 13 shows a configuration for plug-flow NMR using a Gilson 215, a Varian flow-probe, and a 3-way sampling valve used in some embodiments of the present invention.
  • Figure 14 shows a schematic of the system loading process for plug-flow NMR used in some embodiments of the present invention.
  • Figure 15 shows an alternate injection scheme used in some embodiments of the present invention. 1) Workstation. 2) Liquids handler. 3) 3-way valve. 4) Sample delivery line. 5) Flow-probe. 6) NMR magnet. 7) Sample loop. 8) Flow to waste.
  • Figure 16 shows a display of NMR data collected from a test plate using the plug-flow sampling method of some embodiments of the present invention.
  • Figure 17 shows a schematic of a 3-way diaphragm valve used in some embodiments of the present invention.
  • Figure 18 shows a schematic of a 3-way slider valve used in some embodiments of the present invention.
  • biocatalyst refers to any entity capable of catalyzing the conversion of a substrate into a product within a biological entity or biological environment, and, in particular, a catalytic biomolecule or biomolecular assemblage.
  • biocatalyst includes, but is not limited to, (A) (1) organisms (e.g., live or dead single-cell or multi-cell organisms; viruses, including, e.g., bacteriophages and viroids); (2) organs (live or dead); (3) tissues (live or dead); (4) cells (live or dead cells, protoplasts, spheroplasts); (5) organism, organ, tissue, and cell homogenates and lysates; (6) organism, organ, tissue, and cell fractions and isolates (e.g., organelles, microsomes, cytoplasts, karyoplasts); (7) environmental samples (e.g., biological entity-containing soils, sediments); (8) catalytic entities therein; and (B) (1) enzymes (e.g., catalytic polypeptides and proteins), including catalytic peptides and mini-enzymes (e.g., catalytic oligopeptides); (2) abzymes (e.g., catalytic antibodies); (3) catalytic nucleic acids including ribozymes (e.g., catalytic poly- and oligo-nucleotides, "RNAzymes") and deoxyribozymes (e.g., catalytic poly- and oligo-deoxynucleotides, "DNAzymes"), which include standard nucleic acids and non-standard nucleic acids such as pyranosyl nucleic acids, uracil-containing DNA, thymine-containing RNA, nucleic acids containing non-standard bases or non-standard linkages (e.g., 5'-2'), and so forth; (4) catalytic biomolecule analogs (e.g., catalytic peptide analogs; catalytic nucleic acid analogs, such as catalytic peptide nucleic acids); and (5) catalytic assemblages thereof.
  • Biocatalyst also includes catalytic biological complexes and derivatives that contain chemically diverse atoms or groups (e.g., atoms and groups that are chemically different from the main biomolecule(s) with which they are complexed or to which they are attached).
  • Such catalytic biomolecular complexes and derivatives include, but are not limited to, catalytic biological metallo-, organo-, organometallo-, phospho-, sulfo-, boro-, nitro-, amino-, glyco-, peptido-, and lipo-biomolecular complexes and derivatives.
  • Examples of biocatalysts of these types include, but are not limited to, glycoprotein-based abzymes, lipoprotein-based enzymes, catalytic aminophospholipids, phosphoprotein-based enzymes, flavin-containing enzymes, selenoprotein-based enzymes, copper-containing enzymes, iron-sulfur-complex-containing enzymes, and heme-containing mini-enzymes.
  • the biocatalyst(s) can be native, synthetic, or modified.
  • native biocatalyst(s) may be obtained from any living or dead cell(s) or organism(s), e.g., a plant, animal, human, or microorganism (e.g., a bacterium, archaeon, fungus, yeast, or protist), or from a virus (e.g., bacteriophages and viroids).
  • Synthetic biocatalysts may be obtained, for example, by rational or stochastic synthesis of polynucleotides, followed by expression of catalytic nucleic acid(s) or polynucleotide(s) encoded thereby.
  • Modified biocatalyst(s) may be obtained from an already-provided biocatalyst(s) by, for example, treatment (e.g., as with an organism or cell to which a substance or condition has been applied), or enrichment (e.g., as with a cell culture, mixed culture, or environmental sample that has been enriched in a given population of cells or cell types, or as with a mixture of biomolecules that has been enriched in, or fractioned to obtain, molecules of a given range of sizes or other properties), and/or by mutating and/or recombining, and then expressing, the nucleic acids encoding the already-provided biocatalyst(s) (e.g., as by means of application of mutagenic agents, conditions, or processes, application of nucleic acid recombination-fostering agents, conditions, or processes).
  • the biocatalysts may be localized, during biocatalysis, in any configuration.
  • the biocatalysts may be in suspension, may be located in cyto (as for biocatalytic molecules) or in vivo, may be immobilized to a solid support or to a membrane, may be imbedded within a support material (such as a cross-linked matrix or gel), or may be expressed on the surface of a cell or viral particle.
  • Where the biocatalyst(s) is retained from entry into the NMR probe (e.g., by being immobilized, imbedded, or filtered), a single biocatalyst may be contacted with a series of substrates, and this contacting may be performed either in a batch-type mode or in a continuous flow-through mode.
  • the term “discovery”, as in “discovery of a biocatalytic activity” means the identification of at least one biocatalytic activity.
  • The biocatalytic activity so discovered may represent a single catalytic activity (e.g., a single-step reaction or a direct conversion of the substrate to the metabolite(s)) or multiple catalytic activities.
  • the multiple catalytic activities represent at least one metabolic pathway.
  • The discovery of biocatalytic activity further comprises the identification of a catalytic chemistry (e.g., a substrate-to-product conversion or a reaction mechanism) that has never been reported.
  • the discovery of biocatalytic activity also comprises the identification of a catalytic chemistry that is a cognate chemistry to a known type of chemistry, and that has never been reported.
  • Examples of a cognate chemistry include, but are not limited to: (1) catalysis involving a substrate aza (or azo) group was known, but the identified catalysis involving a substrate phospha (or phospho, respectively) group was not reported, and vice versa; (2) catalysis involving the amino (or imino) group of one of a primary, secondary, tertiary, or quaternary amine substrate or an imine substrate was known, but the identified catalysis involving the amino (or imino) group of one of the other listed substrates was not reported, and vice versa; (3) catalysis involving one of a substrate acid, ester, lactone, or anhydride group was known, but the identified catalysis involving a substrate having one of the other listed groups was not reported, and vice versa; (4) catalysis involving a substrate
  • The discovery of biocatalytic activities is not limited to the discovery of novel chemistries or chemical reactions.
  • the discovery of biocatalytic activity comprises the identification of a catalytic chemistry that is a known type of chemistry known to be effected upon a first substrate, and that has never been reported for the different substrate tested.
  • Non-limiting examples of this include: identification of the same catalytic chemistry effected upon a C20 test substrate, where such chemistry had been reported only for C2-C4 substrates; and identification of the same catalytic chemistry effected upon a cycloaliphatic substrate, where such chemistry had been reported only for linear aliphatic substrates.
  • Biocatalytic discovery also comprises the identification of a multi-step catalytic chemistry in which each step involves a known chemistry, but in which the combination and/or order of steps has never been reported (e.g., in any biological organism, system, environment or for any biomolecule or biomolecular assemblage; in the particular biological taxon, system (e.g., mixed culture, tissue, organelle, lysate), or environment under study; for the particular catalytic molecule or molecular assemblage under study; or for the studied biological taxon, system, or environment or catalytic molecule in the particular conditions under study).
  • Biocatalytic discovery further comprises the identification of a catalytic chemistry toward a given substrate, which chemistry is known and is known toward that substrate, but which has never been reported (e.g., in any biological organism, system, environment or for any biomolecule or biomolecular assemblage; in the particular biological taxon, system (mixed culture, tissue, organelle, lysate), or environment under study; for the particular catalytic molecule or molecular assemblage under study; or for the studied biological taxon, system, or environment or catalytic molecule in the particular conditions under study).
  • Biocatalytic discovery also comprises the identification of a catalytic chemistry toward a given substrate, which chemistry is known and is known toward that substrate, but which has never been reported in any biological organism, system, environment or for any biomolecule or biomolecular assemblage; in the particular biological taxon, system (e.g., mixed culture, tissue, organelle, lysate), or environment under study; or for the particular catalytic molecule or molecular assemblage under study.
  • biocatalytic discovery comprises the identification of a catalytic chemistry that is a known type of chemistry known to be effected upon a first substrate, and that has never been reported for the similar substrate tested. Examples of this include, but are not limited to identification of the same catalytic chemistry effected upon a C6 test substrate, where such chemistry had been reported only for C2-C4 substrates; and identification of the same catalytic chemistry effected upon a cyclo-heptane-based substrate, where such chemistry had been reported only for cyclo-hexane-based substrates.
  • The known catalytic chemistry has not been reported in the particular biological organism, system, environment or for any biomolecule or biomolecular assemblage; in the particular biological taxon, system (e.g., mixed culture, tissue, organelle, lysate), or environment; for the particular catalytic molecule or molecular assemblage under study; or for the studied biological taxon, system, or environment or catalytic molecule in the particular conditions under study.
  • the known catalytic chemistry has never been reported in the particular biological organism, system, environment or for any biomolecule or biomolecular assemblage; in the particular biological taxon, system (e.g., mixed culture, tissue, organelle, lysate), or environment under study; or for the particular catalytic molecule or molecular assemblage under study.
  • The terms “high throughput nuclear magnetic resonance (HTP-NMR) system” and “HTP-NMR spectroscopy” refer, respectively, to an NMR system that is able to rapidly analyze a plurality of samples (e.g., at least 400 per day, preferably at least 2000 per day, even more preferably at least 5000 per day, still more preferably at least 10,000 per day, and yet more preferably at least 50,000 per day) with minimal operator intervention, and to spectroscopy performed by use of such a system.
  • HTP-NMR systems are capable of analyzing each sample in a short time period (e.g., less than five minutes, preferably less than 3 minutes, and even more preferably less than 1 minute per sample).
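The daily throughput figures and the per-sample time figures above can be related by simple arithmetic. The following is a minimal sketch, assuming a single instrument that runs around the clock and analyzes samples one after another; the function names are illustrative and are not part of the disclosure.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # assumption: continuous, unattended operation


def seconds_per_sample(samples_per_day: float) -> float:
    """Average time budget per sample for a given daily throughput."""
    return SECONDS_PER_DAY / samples_per_day


def samples_per_day(seconds_per_sample_: float) -> float:
    """Daily throughput for a given average time per sample."""
    return SECONDS_PER_DAY / seconds_per_sample_


if __name__ == "__main__":
    for rate in (400, 2000, 5000, 10_000, 50_000):
        print(f"{rate:>6} samples/day -> {seconds_per_sample(rate):.2f} s per sample")
    for minutes in (5, 3, 1):
        print(f"{minutes} min per sample -> {samples_per_day(minutes * 60):.0f} samples/day")
```

Under these assumptions, 400 samples per day corresponds to an average of 216 seconds per sample, so the throughput tiers and the per-sample time tiers are best read as independent examples; the highest tiers would call for very short acquisitions or for parallel handling of samples.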
  • The term “biocatalysis medium” refers to a medium that permits catalytic activity to take place and is NMR silent (e.g., does not include components that generate a proton-NMR signal that interferes with that of the substrate/product).
  • The biocatalysis medium is optimized for biocatalysis (e.g., the buffer, pH, and additional components are optimized for biocatalysis).
  • the term "chemical identity" refers to the chemical nature of a given compound, substrate, or metabolite.
  • the chemical entity is described by at least one of an empirical formula; a molecular formula; a structural formula; or a chemical class (classified by, e.g., organic backbone element(s), unsaturation(s), configuration(s), heteroatom substituent(s), and/or chemical group(s), e.g.: a long chain (i.e. >C12) hydrocarbon, a medium chain (i.e. C5-C11) fatty acid, a short chain (i.e.
  • alkylamine, an ether, an aldehyde, an ester, an alkenyl ketone, an alkyl chloride, an aryl or alkylaryl phosphate, a cycloaliphatic sulfonamide, a thiaalkanol, or a branched-chain aliphatic isocyanate, and so forth).
  • The term “gene” refers to a nucleic acid (e.g., DNA) that comprises coding sequences necessary for the production of a polypeptide, RNA (e.g., including but not limited to, mRNA, tRNA and rRNA) or precursor.
  • the polypeptide, RNA, or precursor can be encoded by a full length coding sequence or by any portion of the coding sequence so long as the desired activity or functional properties (e.g., enzymatic activity, ligand binding, signal transduction, etc.) of the full-length or fragment are retained.
  • the term also encompasses the coding region of a structural gene and sequences located adjacent to the coding region on both the 5' and 3' ends for a distance of about 1 kb on either end such that the gene corresponds to the length of the full-length mRNA or pre-mRNA.
  • the sequences that are located 5' of the coding region and which are present on the mRNA are referred to as 5' untranslated sequences.
  • the sequences that are located 3' or downstream of the coding region and that are present on the mRNA are referred to as 3' untranslated sequences.
  • the term "gene" encompasses both cDNA and genomic forms of a gene.
  • A genomic form or clone of a gene has a coding region that may be interrupted with non-coding sequences termed “introns” or “intervening regions” or “intervening sequences.” Introns may contain regulatory elements such as enhancers. Introns are removed or “spliced out” from the nuclear or primary transcript; introns therefore are absent in the mature (e.g., messenger RNA (mRNA)) or other mRNA product. The mRNA functions during translation to specify the sequence or order of amino acids in a nascent polypeptide. In particular, the term “gene” refers to the full-length nucleic acid.
  • The term “nucleic acid” or “polynucleotide sequence” encompasses DNA, cDNA, and RNA (e.g., mRNA) forms of a gene or nucleic acid.
  • genomic forms of a gene may also include regions located on the 5' and 3' ends of the transcribed portion of the gene. These regions are referred to as "flanking" regions.
  • flanking regions are located, respectively, 5' and 3' to the portions of the gene from which are transcribed the 5' and 3' non-translated regions present in the mRNA transcript (i.e. the 5'-NTR and 3'-NTR of the mRNA, also known as the 5'-UTR and 3'-UTR thereof).
  • the 5' flanking region may contain regulatory elements such as promoters and enhancers that control or influence the transcription of the gene, and/or native or engineered tag elements that mark the gene for rapid identification or purification.
  • the 3' flanking region may contain sequences that direct termination of transcription, post-transcriptional cleavage, and polyadenylation.
  • Where “amino acid sequence” is recited herein to refer to an amino acid sequence of a naturally occurring protein molecule, “amino acid sequence” and like terms, such as “polypeptide” or “protein,” are not meant to limit the amino acid sequence to the complete, native amino acid sequence associated with the recited protein molecule.
  • The term “wild-type” refers to a gene or gene product that has the characteristics of that gene or gene product when isolated from a naturally occurring source.
  • A wild-type gene is that which is most frequently observed in a population and is thus arbitrarily designated the “normal” or “wild-type” form of the gene.
  • The term “modified” refers to a gene or gene product that displays modifications in sequence and/or functional properties (i.e., altered characteristics) when compared to the wild-type gene or gene product. It is noted that naturally occurring mutants can be isolated; these are identified by the fact that they have altered characteristics when compared to the wild-type gene or gene product.
  • The terms “nucleic acid molecule encoding,” “DNA sequence encoding,” “RNA sequence encoding,” and “DNA encoding” refer to the order or sequence of nucleotides along a strand of nucleic acid. The order of these nucleotides determines the order of amino acids along the polypeptide (protein) chain. The nucleic acid sequence thus codes for the amino acid sequence.
  • DNA molecules are said to have “5' ends” and “3' ends” because mononucleotides are reacted to make oligonucleotides or polynucleotides in a manner such that the 5' phosphate of one mononucleotide pentose ring is attached to the 3' oxygen of its neighbor in one direction via a phosphodiester linkage.
  • An end of an oligonucleotide or polynucleotide is referred to as the “5' end” if its 5' phosphate is not linked to the 3' oxygen of a mononucleotide pentose ring and as the “3' end” if its 3' oxygen is not linked to a 5' phosphate of a subsequent mononucleotide pentose ring.
  • An oligonucleotide or polynucleotide, even if internal to a larger nucleic acid, also may be said to have 5' and 3' ends.
  • In either a linear or circular DNA molecule, discrete elements are referred to as being “upstream” or 5' of the “downstream” or 3' elements. This terminology reflects the fact that transcription proceeds in a 5' to 3' fashion along the DNA strand.
  • the promoter and enhancer elements that direct transcription of a linked gene are generally located 5' or upstream of the coding region. However, enhancer elements can exert their effect even when located 3' of the promoter element and the coding region.
  • Transcription termination and polyadenylation signals are located 3' or downstream of the coding region. This terminology reflects the fact that transcription and replication proceed in a 5' to 3' fashion along the DNA strand(s). This terminology, as described for DNA molecules, also applies to RNA molecules.
  • The terms “an oligonucleotide having a nucleotide sequence encoding a gene” and “polynucleotide having a nucleotide sequence encoding a gene” mean an oligonucleotide or polynucleotide comprising the coding region of a gene or, in other words, the nucleic acid sequence that encodes a gene product.
  • the coding region may be present in a cDNA, genomic DNA, or RNA form.
  • the oligonucleotide or polynucleotide may be single-stranded (i.e., the sense strand) or double-stranded.
  • Suitable control elements such as enhancers/promoters, splice junctions, polyadenylation signals, etc. may be placed in close proximity to the coding region of the gene if needed to permit proper initiation of transcription and/or correct processing of the primary RNA transcript.
  • the coding region utilized in the expression vectors of the present invention may contain endogenous enhancers/promoters, splice junctions, intervening sequences, polyadenylation signals, etc. or a combination of both endogenous and exogenous control elements.
  • a gene may produce multiple RNA species that are generated by differential splicing of the primary RNA transcript or by differential transcription initiation sites.
  • cDNAs that are splice variants of the same gene will contain regions of sequence identity or complete homology (representing the presence of the same exon or portion of the same exon on both cDNAs) and regions of complete non-identity (for example, representing the presence of exon “A” on cDNA 1 wherein cDNA 2 contains exon “B” instead). Because the two cDNAs contain regions of sequence identity they will both hybridize to a probe derived from the entire gene or portions of the gene containing sequences found on both cDNAs; the two splice variants are therefore substantially homologous to such a probe and to each other.
  • The term “fragment” refers to a portion of a full length polymer (e.g., a polypeptide) that has a deletion of monomers from a portion of the polymer (e.g., an amino-terminal and/or carboxy-terminal deletion as compared to the native protein), but where the remaining portion of the polymer is identical to the corresponding positions in the native or full length polymer.
  • Fragments typically are at least 4 monomers (e.g., amino acids) long, preferably at least 20 monomers long, usually at least 50 monomers long or longer, and span the portion of the polymer required for activity.
  • The term “naturally-occurring” refers to the fact that an object can be found in nature.
  • a polypeptide or polynucleotide sequence that is present in an organism (including viruses) that can be isolated from a source in nature and which has not been intentionally modified by man in the laboratory is naturally-occurring.
  • Amplification is a special case of nucleic acid replication involving template specificity. It is to be contrasted with non-specific template replication (i.e., replication that is template-dependent but not dependent on a specific template). Template specificity is here distinguished from fidelity of replication (i.e., synthesis of the proper polynucleotide sequence) and nucleotide (ribo- or deoxyribo-) specificity. Template specificity is frequently described in terms of “target” specificity. Target sequences are “targets” in the sense that they are sought to be sorted out from other nucleic acid. Amplification techniques have been designed primarily for this sorting out.
  • Amplification enzymes are enzymes that, under the conditions in which they are used, will process only specific sequences of nucleic acid in a heterogeneous mixture of nucleic acid.
  • For example, in the case of Qβ replicase, MDV-1 RNA is the specific template for the replicase (D.L. Kacian et al., Proc. Natl. Acad. Sci. USA 69:3038 [1972]).
  • Other nucleic acid will not be replicated by this amplification enzyme.
  • Similarly, in the case of T7 RNA polymerase, this amplification enzyme has a stringent specificity for its own promoters (Chamberlin et al., Nature 228:227 [1970]).
  • In the case of T4 DNA ligase, the enzyme will not ligate the two oligonucleotides or polynucleotides where there is a mismatch between the oligonucleotide or polynucleotide substrate and the template at the ligation junction (D.Y. Wu and R.B. Wallace, Genomics 4:560 [1989]).
  • Finally, Taq and Pfu polymerases, by virtue of their ability to function at high temperature, are found to display high specificity for the sequences bounded and thus defined by the primers; the high temperature results in thermodynamic conditions that favor primer hybridization with the target sequences and not hybridization with non-target sequences (Erlich (ed.), PCR Technology, Stockton Press [1989]).
  • The term “amplifiable nucleic acid” is used in reference to nucleic acids that may be amplified by any amplification method. It is contemplated that “amplifiable nucleic acid” will usually comprise “sample template.”
  • The term “sample template” refers to nucleic acid originating from a sample that is analyzed for the presence of “target” (defined below).
  • The term “background template” is used in reference to nucleic acid other than sample template that may or may not be present in a sample. Background template is most often inadvertent. It may be the result of carryover, or it may be due to the presence of nucleic acid contaminants sought to be purified away from the sample. For example, nucleic acids from organisms other than those to be detected may be present as background in a test sample.
  • The term “primer” refers to an oligonucleotide, whether occurring naturally as in a purified restriction digest or produced synthetically, which is capable of acting as a point of initiation of synthesis when placed under conditions in which synthesis of a primer extension product which is complementary to a nucleic acid strand is induced (i.e., in the presence of nucleotides and an inducing agent such as DNA polymerase and at a suitable temperature and pH).
  • the primer is preferably single stranded for maximum efficiency in amplification, but may alternatively be double stranded. If double stranded, the primer is first treated to separate its strands before being used to prepare extension products.
  • the primer is an oligodeoxyribonucleotide.
  • the primer must be sufficiently long to prime the synthesis of extension products in the presence of the inducing agent. The exact lengths of the primers will depend on many factors, including temperature, source of primer and the use of the method.
  • Polymerase chain reaction (PCR):
  • the mixture is denatured and the primers then annealed to their complementary sequences within the target molecule.
  • the primers are extended with a polymerase so as to form a new pair of complementary strands.
  • The steps of denaturation, primer annealing, and polymerase extension can be repeated many times (i.e., denaturation, annealing and extension constitute one “cycle”; there can be numerous “cycles”) to obtain a high concentration of an amplified segment of the desired target sequence.
  • the length of the amplified segment of the desired target sequence is determined by the relative positions of the primers with respect to each other, and therefore, this length is a controllable parameter.
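The two points above (the amplified length is fixed by the primer positions, and repeated cycles raise the concentration of the amplified segment) can be illustrated numerically. The following is a minimal sketch under an idealized model that is not stated in the text: each cycle multiplies the copy number by (1 + efficiency), with efficiency = 1.0 meaning perfect doubling. The coordinate convention and function names are assumptions made for illustration only.

```python
def amplicon_length(fwd_start: int, rev_end: int) -> int:
    """Length of the segment bounded by a primer pair.

    fwd_start: 0-based index, on the top strand, of the first base covered
               by the forward primer.
    rev_end:   exclusive index, on the top strand, just past the last base
               covered by the reverse primer.
    """
    if rev_end <= fwd_start:
        raise ValueError("reverse primer must lie downstream of the forward primer")
    return rev_end - fwd_start


def copies_after(cycles: int, initial_copies: float = 1.0, efficiency: float = 1.0) -> float:
    """Idealized copy number after a number of cycles (upper bound in practice)."""
    if not 0.0 <= efficiency <= 1.0:
        raise ValueError("efficiency must be between 0 and 1")
    return initial_copies * (1.0 + efficiency) ** cycles


if __name__ == "__main__":
    print(amplicon_length(100, 350))          # segment length set by primer placement
    print(f"{copies_after(30):.3e}")          # single template, perfect doubling
    print(f"{copies_after(30, efficiency=0.9):.3e}")
```

Real reactions plateau as reagents are consumed, so the exponential figure is best treated as an upper bound rather than a prediction.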
  • With PCR, it is possible to amplify a single copy of a specific target sequence in genomic DNA to a level detectable by several different methodologies (e.g., hybridization with a labeled probe; incorporation of biotinylated primers followed by avidin-enzyme conjugate detection; incorporation of 32P-labeled deoxynucleotide triphosphates, such as dCTP or dATP, into the amplified segment).
  • any oligonucleotide or polynucleotide sequence can be amplified with the appropriate set of primer molecules.
  • the amplified segments created by the PCR process itself are, themselves, efficient templates for subsequent PCR amplifications.
  • The term “PCR product” and like terms refer to the resultant mixture of compounds after two or more cycles of the PCR steps of denaturation, annealing and extension are complete. These terms encompass the case where there has been amplification of one or more segments of one or more target sequences.
  • The term “amplification reagents” refers to those reagents (deoxyribonucleotide triphosphates, buffer, etc.) needed for amplification except for primers, nucleic acid template, and the amplification enzyme.
  • amplification reagents along with other reaction components are placed and contained in a reaction vessel (test tube, micro well, etc.).
  • The terms “restriction endonucleases” and “restriction enzymes” refer to bacterial enzymes, each of which cuts double-stranded DNA at or near a specific nucleotide sequence.
  • The term “recombinant DNA molecule” as used herein refers to a DNA molecule that is comprised of segments of DNA joined together by means of molecular biological techniques.
  • The term “isolated” when used in relation to a nucleic acid, as in “an isolated oligonucleotide” or “isolated polynucleotide,” refers to a nucleic acid sequence that is identified and separated from at least one contaminant nucleic acid with which it is ordinarily associated in its natural source. Isolated nucleic acid is present in a form or setting that is different from that in which it is found in nature. In contrast, non-isolated nucleic acids are nucleic acids such as DNA and RNA found in the state they exist in nature.
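  • For example, a given DNA sequence (e.g., a gene) is found on the host cell chromosome in proximity to neighboring genes; RNA sequences, such as a specific mRNA sequence encoding a specific protein, are found in the cell as a mixture with numerous other mRNAs that encode a multitude of proteins.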
  • isolated nucleic acid encoding a protein of interest includes, by way of example, such nucleic acid in cells ordinarily expressing the protein where the nucleic acid is in a chromosomal location different from that of natural cells, or is otherwise flanked by a different nucleic acid sequence than that found in nature.
  • the isolated nucleic acid, oligonucleotide, or polynucleotide may be present in single-stranded or double-stranded form.
  • The oligonucleotide or polynucleotide will contain at a minimum the sense or coding strand (i.e., the oligonucleotide or polynucleotide may be single-stranded), but may contain both the sense and anti-sense strands (i.e., the oligonucleotide or polynucleotide may be double-stranded).
  • The term “portion” when in reference to a nucleotide sequence (as in “a portion of a given nucleotide sequence”) refers to fragments of that sequence. The fragments may range in size from four nucleotides to the entire nucleotide sequence minus one nucleotide (10 nucleotides, 20, 30, 40, 50, 100, 200, etc.).
  • The term “coding region” when used in reference to a structural gene refers to the nucleotide sequences that encode the amino acids found in the nascent polypeptide as a result of translation of an mRNA molecule.
  • The coding region is bounded, in eukaryotes, on the 5' side by the nucleotide triplet “ATG” that encodes the initiator methionine and on the 3' side by one of the three triplets that specify stop codons (i.e., TAA, TAG, TGA).
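The boundary rule above (an initiator ATG on the 5' side, and the first in-frame TAA, TAG, or TGA on the 3' side) can be expressed as a short search. The following is a minimal sketch, assuming an intron-free sense-strand DNA sequence such as a cDNA and considering only the given strand; the function name and the choice to report the first ATG that has an in-frame stop are illustrative assumptions, not a method described in the text.

```python
from typing import Optional, Tuple

STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_coding_region(dna: str) -> Optional[Tuple[int, int]]:
    """Return (start, end) of the first ATG-initiated coding region.

    start is the 0-based index of the A in ATG; end is exclusive and
    includes the stop codon. Returns None if no ATG is followed by an
    in-frame stop codon.
    """
    seq = dna.upper()
    start = seq.find("ATG")
    while start != -1:
        # Walk codon by codon, in frame with this ATG.
        for i in range(start + 3, len(seq) - 2, 3):
            if seq[i:i + 3] in STOP_CODONS:
                return start, i + 3
        start = seq.find("ATG", start + 1)
    return None


if __name__ == "__main__":
    example = "CCATGAAATTTTAGGG"  # hypothetical sequence for illustration
    region = find_coding_region(example)
    if region is not None:
        begin, end = region
        print(begin, end, example[begin:end])
```

Genomic sequences interrupted by introns would need the intervening sequences removed first, since the in-frame walk assumes contiguous codons.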
  • The term “recombinant protein” or “recombinant polypeptide” as used herein refers to a protein molecule that is expressed from a recombinant DNA molecule.
  • The term “native protein” is used herein to indicate that a protein does not contain amino acid residues encoded by vector sequences; that is, the native protein contains only those amino acids found in the protein as it occurs in nature.
  • a native protein may be produced by recombinant means or may be isolated from a naturally occurring source.
  • The term “portion” when in reference to a protein (as in “a portion of a given protein”) refers to fragments of that protein. The fragments may range in size from four consecutive amino acid residues to the entire amino acid sequence minus one amino acid.
  • The term “vector” is used in reference to nucleic acid molecules that transfer DNA segment(s) from one cell to another.
  • The term “vehicle” is sometimes used interchangeably with “vector.”
  • The term “expression vector” refers to a recombinant DNA molecule containing a desired coding sequence and appropriate nucleic acid sequences necessary for the expression of the operably linked coding sequence in a particular host organism.
  • Nucleic acid sequences necessary for expression in prokaryotes usually include a promoter, an operator (optional), and a ribosome binding site, often along with other sequences.
  • Eukaryotic cells are known to utilize promoters, enhancers, and termination and polyadenylation signals.
  • The term “host cell” refers to any eukaryotic, prokaryotic, archaeal, or protist cell (e.g., bacterial cells such as E. coli, yeast cells, mammalian cells, avian cells, amphibian cells, plant cells, fish cells, and insect cells).
  • host cells may be located in a transgenic animal.
  • The terms “overexpression” and “overexpressing” and grammatical equivalents are used in reference to levels of mRNA to indicate a level of expression approximately 3-fold higher than that typically observed in a given tissue in a control or non-transgenic animal. Levels of mRNA are measured using any of a number of techniques known to those skilled in the art including, but not limited to, Northern blot analysis (see Example 10 for a protocol for performing Northern blot analysis).
  • Appropriate controls are included on the Northern blot to control for differences in the amount of RNA loaded from each tissue analyzed (e.g., the amount of 28S rRNA, an abundant RNA transcript present at essentially the same amount in all tissues, present in each sample can be used as a means of normalizing or standardizing the RAD50 mRNA-specific signal observed on Northern blots).
  • the amount of mRNA present in the band corresponding in size to the correctly spliced transgene RNA is quantified; other minor species of RNA which hybridize to the transgene probe are not considered in the quantification of the expression of the transgenic mRNA.
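The normalization described above can be written as a ratio of ratios: the band intensity of the transcript of interest is divided by the 28S rRNA loading-control intensity in the same lane, and the normalized value is then compared with the control tissue. The following is a minimal sketch, assuming band intensities are already available as positive numbers in arbitrary densitometry units; the function names and the use of 3.0 as a hard cutoff (the text says only "approximately 3-fold") are illustrative assumptions.

```python
def normalized_signal(target_band: float, loading_control_band: float) -> float:
    """Target mRNA signal divided by the loading control (e.g., 28S rRNA) signal."""
    if loading_control_band <= 0:
        raise ValueError("loading control intensity must be positive")
    return target_band / loading_control_band


def fold_change(sample_target: float, sample_control: float,
                reference_target: float, reference_control: float) -> float:
    """Normalized expression in the sample relative to the reference tissue."""
    return (normalized_signal(sample_target, sample_control)
            / normalized_signal(reference_target, reference_control))


def is_overexpressed(fold: float, threshold: float = 3.0) -> bool:
    """Assumed cutoff: treat a fold change at or above the threshold as overexpression."""
    return fold >= threshold


if __name__ == "__main__":
    # Hypothetical densitometry readings (arbitrary units).
    fold = fold_change(sample_target=900.0, sample_control=150.0,
                       reference_target=200.0, reference_control=100.0)
    print(fold, is_overexpressed(fold))
```

Only the band corresponding in size to the correctly spliced transcript would be supplied as the target intensity, consistent with the quantification rule stated above.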
  • transfection refers to the introduction of foreign DNA into cells. Transfection may be accomplished by a variety of means known to the art including calcium phosphate-DNA co-precipitation, DEAE-dextran-mediated transfection, polybrene-mediated transfection, electroporation, microinjection, liposome fusion, lipofection, protoplast fusion, retroviral infection, and biolistics.
  • The term “transformed nucleic acid” (for example, as in “transformed with an exogenous nucleic acid library”) refers to exogenous nucleic acid stably (e.g., able to be transmitted from one generation to the next) introduced into a cell by any method.
  • transformed nucleic acids are integrated into the genome of host cells.
  • Exemplary methods of generating transformed cells include, but are not limited to, transfection, Agrobacterium transformation, and infection with a viral vector.
  • The term “stable transfection” or “stably transfected” refers to the introduction and integration of foreign DNA into the genome of the transfected cell.
  • The term “stable transfectant” refers to a cell that has stably integrated foreign DNA into the genomic DNA.
  • The term “transient transfection” or “transiently transfected” refers to the introduction of foreign DNA into a cell where the foreign DNA fails to integrate into the genome of the transfected cell.
  • the foreign DNA persists in the nucleus of the transfected cell for several days. During this time the foreign DNA is subject to the regulatory controls that govern the expression of endogenous genes in the chromosomes.
  • The term “transient transfectant” refers to cells that have taken up foreign DNA but have failed to integrate this DNA.
  • The term “calcium phosphate co-precipitation” refers to a technique for the introduction of nucleic acids into a cell.
  • the uptake of nucleic acids by cells is enhanced when the nucleic acid is presented as a calcium phosphate-nucleic acid co-precipitate.
  • The term “composition comprising a given polynucleotide sequence” refers broadly to any composition containing the given polynucleotide sequence.
  • the composition may comprise an aqueous solution.
  • Compositions comprising polynucleotide sequences encoding a protein of interest or fragments thereof may be employed as hybridization probes.
  • the protein encoding polynucleotide sequences are typically employed in an aqueous solution containing salts (e.g., NaCl), detergents (e.g., SDS), and other components (e.g., Denhardt's solution, dry milk, salmon sperm DNA, etc.).
  • sample is used in its broadest sense. In one sense it can refer to a tissue sample. In another sense, it is meant to include a specimen or culture obtained from any source, including environmental samples, as well as biological. Biological samples may be obtained from animals (including humans) and encompass fluids, solids, tissues, and gases. Biological samples include, but are not limited to blood products, such as plasma, serum and the like. These examples are not to be construed as limiting the sample types applicable to the present invention.
  • computer memory and “computer memory device” refer to any storage media readable by a computer processor.
  • Examples of computer memory include, but are not limited to, RAM, ROM, computer chips, digital video discs (DVDs), compact discs (CDs), hard disk drives (HDD), and magnetic tape.
  • computer readable medium refers to any device or system for storing and providing information (e.g., data and instructions) to a computer processor.
  • Examples of computer readable media include, but are not limited to, DVDs, CDs, hard disk drives, magnetic tape and servers for streaming media over networks.
  • processor and "central processing unit” or “CPU” are used interchangeably and refer to a device that is able to read a program from a computer memory (e.g., ROM or other computer memory) and perform a set of steps according to the program.
  • metabolicomics refers to the quantitative measurement of the dynamic multiparametric metabolic response of living systems to pathophysiological stimuli or genetic modification.
  • 1H NMR spectroscopy provides a function-based assay without the development time required for chromatographic or wet chemical methods. When the spectroscopy is practiced at sufficient field strength, the resulting spectral dispersion is sufficient to allow for the simultaneous detection of multiple substrates and products.
  • Microbial strains are usually cultured in aqueous medium.
  • A limitation of 1H NMR for detection in H2O matrices is the issue of dynamic range and hence sensitivity.
  • a deuterated growth medium is used to eliminate the water signal.
  • a variety of NMR techniques are used to facilitate the suppression of the water signal.
  • the methods of the present invention overcome many of the problems of the prior art.
  • This technology is function-based (direct monitoring of a specific reaction) and does not require extensive method development, indicating its broad applicability (e.g., simplifying the screening of membrane-bound/localized enzymes/receptors and other proteins).
  • the present invention provides a novel high throughput nuclear magnetic resonance (HTP-NMR) method for the analysis of biological and chemical samples (e.g., samples comprising or suspected of comprising biocatalysts).
  • the present invention provides an HTP-NMR method for screening compounds in a complex biological mixture (e.g., whole cells or crude cell lysates). The methods of the present invention find use in a wide variety of applications, including, but not limited to, those disclosed herein.
  • plug-flow sample delivery is utilized.
  • In plug-flow sample delivery, the NMR probe is never empty and the system is continually locked (See Example 6).
  • Plug-flow sample delivery results in minimization of sampling errors, reduces required sample equilibration time, requires smaller sample volume, and reduces "over-head" time of the autosampler because the needle moves only from well to well and there is no sample retrieval.
  • a detailed description of the plug-flow methods of the present invention is provided below and in Example 6.
  • the novel HTP-NMR technology of the present invention can screen multiple micro-titer plates, which corresponds to between 600 and 5000 samples per day, depending on the sample handling and injection methods.
  • the present invention utilizes direct-injection NMR technology (See Experimental Section).
  • pooling is utilized to increase the number of samples that can be analyzed consecutively. In pooling, groups of samples (e.g., one row of a microtitre plate (e.g., 8 samples)) are pooled into 1 well for HTP-NMR analysis. Then, individual samples from the pools that give positive results are further screened. This increases the throughput by 8-fold, which corresponds to more than 50,000 samples per day.
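  • The pooling scheme described above can be illustrated as a two-stage screen. The following Python sketch is illustrative only: the function name `pooled_screen`, the `is_hit` callback (standing in for an HTP-NMR measurement), the clone names, and the idealization that a pool reads positive whenever any member is positive (no loss of sensitivity on dilution) are assumptions and are not taken from the present disclosure.

```python
from typing import Callable, List, Sequence, Tuple


def pooled_screen(samples: Sequence[str],
                  is_hit: Callable[[str], bool],
                  pool_size: int = 8) -> Tuple[List[str], int]:
    """Two-stage pooled screen; returns (hits, number of spectra acquired)."""
    hits: List[str] = []
    spectra = 0
    for start in range(0, len(samples), pool_size):
        pool = samples[start:start + pool_size]
        spectra += 1  # one spectrum for the pooled well
        # Idealization (assumption): the pool reads positive if any member is positive.
        if any(is_hit(s) for s in pool):
            # Second stage: re-screen each member of a positive pool individually.
            for s in pool:
                spectra += 1
                if is_hit(s):
                    hits.append(s)
    return hits, spectra


# Hypothetical 96-well plate containing two active clones.
plate = [f"clone{i}" for i in range(96)]
active = {"clone10", "clone50"}
hits, spectra = pooled_screen(plate, lambda s: s in active)
print(hits, spectra)
```

  • In this hypothetical plate, 12 pooled spectra plus 16 follow-up spectra (28 in total) recover both active clones, compared with 96 spectra for an unpooled screen. The benefit shrinks as the fraction of positive pools grows.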
  • the high throughput methods of the present invention are also able to analyze each sample in a shorter time than previously utilized methods.
  • a single NMR spectrum is obtained for each sample or control.
  • multiple spectra are obtained for each sample.
  • multiple spectra are averaged to obtain a final spectrum for use in further analysis.
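  • Averaging of multiple spectra can be sketched as a point-wise mean. The Python below is a minimal illustration, not the acquisition software of any spectrometer; the function names are hypothetical, and the square-root relationship assumes uncorrelated noise between scans.

```python
import math
from typing import List, Sequence


def average_spectra(spectra: Sequence[Sequence[float]]) -> List[float]:
    """Point-wise mean of equally sized spectra (lists of intensities)."""
    if not spectra:
        raise ValueError("at least one spectrum is required")
    n_points = len(spectra[0])
    if any(len(s) != n_points for s in spectra):
        raise ValueError("all spectra must have the same number of points")
    return [sum(s[i] for s in spectra) / len(spectra) for i in range(n_points)]


def expected_snr_gain(n_scans: int) -> float:
    """Ideal signal-to-noise improvement from averaging n_scans scans,
    assuming the noise is uncorrelated from scan to scan."""
    if n_scans < 1:
        raise ValueError("n_scans must be at least 1")
    return math.sqrt(n_scans)


final = average_spectra([[1.0, 2.0, 3.0], [3.0, 4.0, 5.0]])
print(final, expected_snr_gain(16))
```

  • Because the gain scales only with the square root of the number of scans, each doubling of signal-to-noise costs four times the acquisition time, which is the trade-off against throughput.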
  • the methods of the present invention are function-based, and can thus screen virtually any enzymatic reaction, without requiring a labeled substrate and with no knowledge of the enzyme of interest.
  • at least a 600 MHz NMR is used to provide extra sensitivity.
  • the methodology is advantageous compared to traditional screening methods since it requires neither fluorescent or radioactive 'tags' nor a method development process.
  • the technique is considered non-invasive and non-destructive.
  • the technology of the present invention benefits from the traditional analytical strengths of NMR spectroscopy, such as the ability to distinguish detailed changes in structure, concentration and stereochemistry.
  • enantiomers are characterized by direct derivatization of the enantiomers with specific chiral reagents, which can be monitored by NMR.
  • the methods do not require the preparation of cell extracts. Therefore, the methods of the present invention are amenable to the screening of membrane-bound proteins or other enzymes in or bound to cell organelles without any isolation procedure, which is generally difficult since many membrane bound proteins are unstable in cell free preparations in insoluble or soluble form.
  • the methods of the present invention further find use in the screening of ligands directly binding to specific receptors (e.g., drug discovery candidates).
  • the methods of the present invention further find use in the measurement of product increase and/or substrate decrease, which can be directly applied to screen 'functionally improved' catalysts/receptors, which is the target of many directed evolution technologies.
  • the methods of the present invention are suitable for the quantitation of reaction product formation.
  • time points can easily be taken, allowing for an analysis of reaction kinetics. Exemplary methods and applications are discussed in greater detail below.
  • the present invention provides methods of growing cells comprising biological entities suspected of having biocatalytic activity.
  • the cells may be grown under a variety of growth conditions.
  • the present invention provides HTP-NMR methods utilizing cells grown in deuterated medium.
  • Cells comprising a gene expressing a protein (e.g., enzyme) of interest or a library of genes are grown in medium in which all of the protons have been replaced with deuterium (See Example 1). More than ten different microorganisms, including E. coli and P. fluorescens, were shown to grow in completely deuterated medium. Growth rates were monitored and found to be about 1/3 of the rate in the normal medium.
  • cells are grown in partially (e.g., less than completely) deuterated medium.
  • Cells may be grown in medium comprising any level of partial deuteration (e.g., from less than 1% to greater than 99%).
  • cells are grown in non-deuterated medium (e.g., LB or similar solution).
  • This method is preferred for growth of microorganisms that are not able to grow in deuterated medium. Using this method, it is possible to screen any organism. Experiments conducted during the course of development of the present invention demonstrated the feasibility of this method.
  • non-deuterated medium is particularly amenable to situations in which microorganisms are not going to be lysed (e.g., the substrate and product are present in the culture medium) and situations in which the product or metabolite to be detected is not endogenous to the organism.
  • water suppression algorithms are utilized (See, e.g., the description below).
  • the non-deuterated growth medium contains just enough carbon source so that, when the cell-based sample has reached the growth or biomass phase desired for use in the biocatalysis phase, there is no longer any significant carbon source remaining. This lowers the background signal in the NMR phase of the analysis because no other substantial concentrations of carbon sources, apart from the protonated substrate(s) and its metabolites, are present in the NMR analysis.
  • the biological entities are transferred to a biocatalysis medium.
  • Any suitable biocatalysis medium that does not interfere with biocatalysis may be utilized. It is preferred that the biocatalysis medium not comprise any protonated species that could potentially interfere with proton NMR.
  • the biocatalysis medium is aqueous.
  • another solvent is utilized including, but not limited to, acetone, formaldehyde, formic acid, acetic acid, methanol, chloroform, methylene chloride, DMSO, ethanol, dioxane, pyrazine, and piperazine.
  • the biocatalysis medium is a fully deuterated solvent such as methylene chloride, DMSO, chloroform, dimethylether, diethylether, organic acids (e.g., formic or acetic acids), acetone, cyclohydrocarbons (e.g., cyclohexane), formaldehyde, dioxane, pyrazine, piperazine, and small alcohols (e.g., methanol, ethanol, or propanols).
  • the biocatalysis medium is fully deuterated. In other embodiments, the biocatalysis medium is partially deuterated. In yet other embodiments, the biocatalysis medium is non-deuterated.
  • NMR is performed directly on samples in biocatalysis medium. In other embodiments, samples are further processed as described below.
  • NMR spectra of whole-cells grown in deuterated medium are NMR silent, having very low background signal.
  • signals of added compounds (e.g., substrates for an enzyme) can therefore be observed by NMR (e.g., 1H NMR) against this low background.
  • non-deuterated substrates are added and the substrate and product signals monitored with 1H NMR.
  • samples that have not been exposed to deuterium, or that have been only partially deuterated, are utilized in NMR.
  • the water suppression methodology described below is utilized in such embodiments.
  • the present invention is not limited to the use of direct 1H NMR.
  • additional isotopes that allow for indirect detection of 1H spectra of protons in their vicinity are utilized (e.g., 13C and 15N). The use of such methods eliminates background signal.
  • additional isotopes (e.g., including, but not limited to, 31P, 23Na, 29Si, or 39K) are utilized.
  • any NMR detectable isotope may be utilized in the methods of the present invention.
  • proton NMR spectroscopy of biological samples in aqueous media includes the elimination of the water signal to facilitate the observation of dilute species.
  • Bulk water has a proton concentration of approximately 111 M (about 55.5 M H2O, with two protons per molecule). Most organic analytes are present in mM quantities.
  • the dynamic range of an NMR spectrometer is determined by the analog-to-digital converter (ADC) used. For a 16-bit ADC, the theoretical dynamic range is 1 part in 65,536 (2^16). ADC saturation affects both the signal-to-noise ratio and the limit of detection of dilute components in water.
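  • The dynamic range problem can be made concrete with a short calculation. The Python sketch below is illustrative only and rests on stated assumptions: water is taken as 55.5 M with two protons per molecule, the analyte as 1 mM with a single proton per resonance, and the ADC is idealized as offering 2^bits distinct levels. The function names are hypothetical.

```python
def signal_ratio(solvent_proton_molar: float, analyte_proton_molar: float) -> float:
    """Ratio of solvent proton concentration to analyte proton concentration."""
    return solvent_proton_molar / analyte_proton_molar


def adc_levels(bits: int) -> int:
    """Number of distinct levels of an idealized ADC with the given bit depth."""
    return 2 ** bits


WATER_PROTONS_M = 2 * 55.5   # assumption: 55.5 M H2O, two protons per molecule
ANALYTE_PROTONS_M = 1e-3     # assumption: 1 mM analyte, one proton per resonance

ratio = signal_ratio(WATER_PROTONS_M, ANALYTE_PROTONS_M)
levels = adc_levels(16)
# If the water-to-analyte ratio exceeds the number of ADC levels, the analyte
# signal falls below one digitization step once the receiver gain is set so
# that the water signal does not saturate the ADC.
needs_suppression = ratio > levels
print(round(ratio), levels, needs_suppression)
```

  • Under these assumptions the water-to-analyte ratio is on the order of 10^5, which exceeds the number of levels of a 16-bit converter, so the water resonance has to be removed or attenuated before dilute species can be digitized reliably.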
  • the solution to this limitation is to selectively eliminate the water signal from the acquired 1H NMR spectrum. This can be achieved using a variety of NMR pulse sequences. The type of sequences applied will depend on the experimental intent.
  • One preferred aspect of applying 1H NMR to high throughput screening is the collection of quantitative spectral data.
  • selective pre-saturation methods have been applied to minimize the water signal from aqueous samples.
  • water suppression methods fall into the general classes of selective pulse suppression and gradient techniques.
  • the "presat” experiment allows for the application of a frequency selected radio frequency pulse that saturates the water protons. This experiment is available on most spectrometers.
  • the term “presat” indicates that the function of this NMR pulse sequence is to eliminate the water resonance using a pre-saturation pulse that is applied prior to the data acquisition period.
  • the duration of the presaturation pulse is typically on the order of 1-2 seconds.
  • the "WET" experiment uses a combination of a frequency selected radio frequency pulse and field gradients to null the protons from water.
  • this method requires on the order of milliseconds to eliminate the water signal.
  • the application of this method does require the use of pulsed field gradients.
  • One advantage of the latter is the reduction in time required to achieve sufficient peak suppression; it is therefore preferred for use with HTP-NMR.
  • the present invention utilizes "plug-flow NMR.”
  • the development of plug-flow NMR is described in Example 6.
  • the plug-flow methods of the present invention are suitable for both high and low throughput NMR discovery of biocatalysts.
  • a robot is used in combination with an NMR system to automate sample injection.
  • a VARIAN (Palo Alto, CA)
  • GILSON (Middleton, WI)
  • Existing "VAST" (Versatile Automatic Sample Transport) subroutines, which use the Tcl (Tool Command Language) scripting language, were modified to properly inject samples sequentially, i.e., as plugs.
  • Modification of the programming code that is used to instruct the auto-sampler was utilized for the implementation of the "plug-flow" method.
  • the key attributes of the modified software include the removal of the sample retrieval step, addition of an "air-
  • the air-plug or gas-plug separating sequential liquid volumes preferably has, but need not have, a volume much smaller than the volume of the liquid samples.
  • the volume of the gas- or air-plug is selected to be just enough, in light of the internal diameter of the tubing, to space adjacent liquid samples. A few microliters is typically a sufficient volume.
  • where a gas-plug is used in place of an air-plug, it is preferably a non-reactive gas.
  • gases for a gas-plug include: nitrogen, carbon dioxide, noble gases (inert gases); oxygen; and non-air gas mixtures.
  • Gaseous halocarbons (e.g., CF4, C2Cl2F4) and gaseous hydrocarbons might also be used, but these are less preferred.
  • N2, CO2, and Ar are particularly preferred gases for a gas-plug.
  • the "sample plug," which is followed by a gas-plug or air-plug, can then be, but need not be, followed by a solvent plug that is also then followed by a gas- or air-plug, in order to facilitate washing of sample residue between samples. If needed, the sample volume is followed by a push solvent such as D2O or other solvent. Approximately 120 samples per hour can be analyzed using the system described herein. This includes the NMR data acquisition time.
  • the samples are automatically suctioned up by a robotic sampler from their various wells or test tubes, one at a time, first residing in the tubing (Tubing 1) that leads to the pump.
  • This tubing is selected to be long enough to contain one sample volume.
  • the pump then flows this "Sample 1" into the tubing that leads to the NMR probe (Tubing 2).
  • As Sample 1 is so moved, Sample 2 is suctioned up into Tubing 1 and the pump introduces a small air bubble between Sample 1 and Sample 2.
  • Tubing 2 is selected to be long enough to contain a whole number multiple of Samples, plus their dividing air bubbles.
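  • The sizing of Tubing 1 and Tubing 2 described above follows from simple cylinder geometry. The Python sketch below is a hedged illustration: the function names, the 1.0 mm internal diameter, the 5 µL air-plug and the choice of four samples are hypothetical values chosen for the example, and only the 250 µL sample volume is taken from the text.

```python
import math


def plug_length_mm(volume_ul: float, inner_diameter_mm: float) -> float:
    """Length of tubing occupied by a plug of the given volume.

    Uses 1 microliter == 1 cubic millimeter and a circular bore.
    """
    if inner_diameter_mm <= 0:
        raise ValueError("inner diameter must be positive")
    cross_section_mm2 = math.pi * (inner_diameter_mm / 2) ** 2
    return volume_ul / cross_section_mm2


def tubing2_length_mm(n_samples: int, sample_ul: float,
                      air_ul: float, inner_diameter_mm: float) -> float:
    """Length needed to hold a whole number of samples plus their dividing air plugs."""
    return n_samples * plug_length_mm(sample_ul + air_ul, inner_diameter_mm)


# Hypothetical configuration: 1.0 mm bore, 250 uL samples, 5 uL air plugs.
tubing1 = plug_length_mm(250.0, 1.0)          # holds one sample volume
tubing2 = tubing2_length_mm(4, 250.0, 5.0, 1.0)  # holds four samples and air plugs
print(f"Tubing 1: {tubing1:.0f} mm, Tubing 2: {tubing2:.0f} mm")
```

  • Because length scales inversely with the square of the bore, a narrower tube lengthens each plug considerably; this is also why only a few microliters of gas suffice to space adjacent samples.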
  • the plug-flow method allows for continuous NMR sampling using a direct-injection configuration, reduces the possibility of spectrometer errors during automated data acquisition, uses a minimal sample volume (e.g., 250-300 µL), allows for accurate mapping of sample location, and the sample carry-over is equivalent to what is observed utilizing the standard Varian VAST approach.
  • the plug-flow approach does not allow for the analyzed sample volume to be recovered.
  • the relatively small sample volume of 250 µL allows for the retention of a portion of the biocatalysis reaction for further analysis.
  • Sample throughput can be increased by as much as a factor of five using the plug-flow approach.
  • the entire analytical sequence for one sample can take several minutes in the factory configuration of an exemplary system.
  • sample delivery and recovery requires as much as 80% of the time required per sample.
  • the methods of the present invention reduce this time to approximately 30 seconds per sample.
  • the NMR probe is continuously loaded.
  • the spectrometer retains a deuterium lock during the sample transport process, thus eliminating a common cause of NMR failures.
  • the total analysis time is significantly reduced to optimize sample throughput in the case where hundreds or thousands of samples must be screened (e.g., in biocatalyst screening).
  • the total analysis time is the sum of the sample transport and data acquisition.
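  • The relationship between per-sample time and throughput can be sketched as follows. This Python example is illustrative only: the function names are hypothetical, and the split of the per-sample time into 10 seconds of transport and 20 seconds of acquisition is an assumption made for the example (the text gives only the overall figure of approximately 120 samples per hour).

```python
def samples_per_hour(transport_s: float, acquisition_s: float) -> float:
    """Throughput when total analysis time is transport plus data acquisition."""
    total_s = transport_s + acquisition_s
    if total_s <= 0:
        raise ValueError("total analysis time must be positive")
    return 3600.0 / total_s


def samples_per_day(transport_s: float, acquisition_s: float) -> float:
    """Throughput over 24 hours of uninterrupted operation (an idealization)."""
    return 24 * samples_per_hour(transport_s, acquisition_s)


# Hypothetical split: 10 s transport + 20 s acquisition = 30 s per sample.
print(samples_per_hour(10, 20), samples_per_day(10, 20))
```

  • With a 30 second cycle this gives 120 samples per hour, or 2,880 per day of continuous running, which is why shortening the transport step matters as much as shortening the acquisition itself.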
  • the methods of the present invention utilize a system where it is possible to inject one sample after another, while minimizing the small amounts of mixing that might occur.
  • the present invention utilizes a method where samples are kept separated using a small bubble of air. The bubble minimizes sample mixing as the sample is transferred along the transfer tubing and the small bubble does not interfere with the acquisition of the NMR spectrum.
  • the present invention utilizes a second sampling valve for plug-flow NMR.
  • the present invention is amenable to screening of microorganisms (e.g., bacteria (eubacteria, mycoplasma, or cyanobacteria), archaea, protists (e.g., algae), or fungi), cells in culture (e.g., mammalian or plant cells), as well as tissues (e.g., plant or animal tissues), organs, organelles, protoplasts, spheroplasts, mixed cultures, and environmental samples. Live cells, lysates, fractions, isolates, immobilized free-solution/suspensions, and combinations thereof may be utilized. In other embodiments, purified biocatalysts (e.g., enzymes) are utilized.
  • live, whole cell based samples are utilized. More preferably, live cells are utilized.
  • Preferred microorganisms include, but are not limited to, eubacteria (e.g., E. coli, Bacillus spp., Klebsiella spp., Pseudomonas spp., P. fluorescens) and yeasts and other fungi (e.g., Saccharomyces spp., Yarrowia spp., Pichia spp., Aspergillus spp., and Candida spp.).
  • low cell concentrations are utilized to prevent clogging of the NMR flow cell and to facilitate rapid filtration.
  • very low cell concentrations are present.
  • no significant concentration of cells is present. In some particularly preferred embodiments, no cells are present.
  • intact cells or cell clusters are screened.
  • the supernatant of reaction medium from cells that have been used to catalyze reactions with protonated substrates of interest is utilized.
  • the cells may be separated from the supernatant using any suitable method. For example, in some embodiments, samples are centrifuged and the cells are removed. In other embodiments, filtration is utilized to separate cells from supernatants.
  • a microtiter plate that comprises a filtration membrane is utilized. Such microtiter plates are commercially available (e.g., from Millipore, Bedford, MA).
  • NMR systems may comprise an on-line filter for use in separating supernatants from whole cells. Cellular supernatants may be screened from cells grown in deuterated, non-deuterated, or partially deuterated medium.
  • lysed cells are utilized. The entire reaction is lysed and the lysate is screened using HTP-NMR. In some embodiments, lysed cells are filtered prior to analysis and the supernatant is used for HTP-NMR. Lysed cells are particularly suited for applications in which cells are grown in fully or partially deuterated systems. Lysed cells may also be utilized in non-deuterated systems. In such embodiments, it is preferred that substrates be labeled or that the metabolite to be assayed not be endogenous to the cell type utilized for growth.
  • a solubilizing enhancer or agent is added (e.g., DMSO, alcohols, polyols, or surfactants).
  • the solubilizing enhancer is deuterated.
  • the catalytic sensitivity of whole cells/tissues is enhanced by use of cell membrane and/or cell wall permeabilizing agents (See, e.g., U.S. Patent No. 6,524,839, herein incorporated by reference).
  • surfactants are used to better solubilize less soluble substrates or other test compounds. Such methods are particularly suited to improve permeability/catalytic sensitivity of bacteria, plants, fungi (including yeast), and walled protists.
  • additional separation steps are added prior to HTP-NMR, including, but not limited to, liquid chromatography (LC).
  • additional characterization steps are performed following HTP-NMR, including, but not limited to, mass spectrometry (MS).
  • both additional separation and additional characterization steps are combined (e.g., LC-HTP-NMR-MS).
  • the present invention provides methods of discovering, identifying and screening novel biocatalysts (e.g., enzymes).
  • the biocatalytic test samples (e.g., microorganisms, tissue samples, environmental samples, a set of clones comprising a gene library of interest, or others) are grown or maintained, e.g., under deuterated, partially deuterated, or non-deuterated growth conditions.
  • a collection of, e.g., isolated enzymes may form the biocatalytic test samples, and these may be deuterated, partially deuterated, or non-deuterated.
  • the test samples are then used to catalyze reactions with protonated substrates of interest.
  • libraries of biocatalysts are screened against one or more test substrates.
  • one or more biocatalysts are screened against a library of test substrates.
  • the protonated substrate comprises at least one non-exchangeable proton.
  • the medium may be the same as or different from that used during the growth/maintenance phase.
  • biocatalytic test samples grown or maintained under deuterated conditions may be contacted with the test substrate(s) in partially deuterated or non-deuterated conditions.
  • Those test samples grown or maintained under partially deuterated conditions may be placed in deuterated or non-deuterated conditions during biocatalysis.
  • Test samples grown or maintained under conditions of high (or low) partial deuteration may be placed, respectively, in low (or high) partial deuterated conditions during biocatalysis.
  • Test samples grown or maintained under non-deuterated conditions may be placed in deuterated or partial deuterated conditions during biocatalysis.
  • where microorganisms are or contain the biocatalyst, H2O, D2O, or a mixture thereof, or a minimal salts medium in which the solvent is H2O, D2O, or a mixture thereof, is used as the medium during biocatalysis.
  • no substantial concentration of carbon source for cell growth/maintenance is present during the biocatalysis phase. More preferably, no significant concentration of carbon source for cell growth/maintenance is present during the biocatalysis phase.
  • the methods of the present invention find use in the discovery and characterization of industrially useful enzymes. Such enzymes can then be produced on a large scale and utilized in the industrial production of chemicals.
  • a library of deu-E. coli cells containing a Rhodococcus rhodochrous genome was created, with all of the genes under the control of their native promoters for transcription, resulting in low levels of expression.
  • the dehalogenation reaction that converts 1,2,3-trichloropropane (TCP) to 2,3-dichloro-1-propanol (DCH) was used as a model reaction for the study (Example 1).
  • TCP-dehalogenase has a very low turnover number, which is problematic for typical function-based screening methods.
  • This model enzyme was intentionally chosen as a challenging target reaction in order to demonstrate the high sensitivity and wide applicability of this NMR technology.
  • a series of low-throughput experiments were initially conducted to prove the concept.
  • the HTP NMR technology was developed, refined, and successfully applied to identify dehalogenase genes from a Rhodococcus rhodochrous genomic library. Screening for the target TCP-dehalogenase gene was performed with whole cells, using HTP NMR technology. Two positive hits were identified that converted TCP to 2,3-dichloro-1-propanol (DCH). Subsequent analysis of these two clones using molecular biology techniques showed that both hits contained the same dehalogenase gene.
  • the biocatalytic test sample(s) may be identical to one another and be used to screen a variety of protonated substrates or protonated substrate groups or pools. Alternatively, the biocatalytic test samples may differ from one another, thus forming a varied collection of test samples (e.g., a library). In some embodiments, the varied collection of biocatalytic test samples may be used to screen a single protonated substrate or a single protonated substrate group or pool.
  • the libraries are organism libraries (e.g., cDNA or EST libraries from a microorganism or recovered environmental DNA).
  • the libraries are libraries of a particular class of enzymes.
  • the libraries are derived from biocatalyst directed evolution processes in which mutation and/or recombination is performed to vary nucleic acids and thus the biocatalysts encoded thereby, followed by screening to identify those resulting variants that have obtained improved properties relative to the biocatalyst(s) encoded by the nucleic acid(s) from which the directed evolution process started. Directed evolution libraries may be screened at multiple points in a directed evolution experiment.
  • Such embodiments are particularly suited for pooling experiments where multiple samples are combined, thereby reducing the number of individual samples that are analyzed.
  • the present invention is not limited to a particular library or class of enzyme. Any enzyme that acts on a substrate to produce NMR distinct products may be screened using the methods of the present invention. Indeed, it is not necessary to know the identity of the enzyme or its nucleic acid or protein sequence. Libraries may be obtained from commercial sources (e.g., Invitrogen, Carlsbad, CA, USA) or other sources known to one of skill in the art. In other embodiments, libraries are generated from organisms using conventional molecular biology techniques (e.g., PCR).
  • libraries may be generated using directed evolution (See e.g., U.S. Patent Nos. 6,395,547; 6,376,246; 6,391,640; 6,365,408; each of which is herein incorporated by reference).
  • artificial evolution is performed by random mutagenesis (for example, by utilizing error-prone PCR to introduce random mutations into a given coding sequence). This method requires that the frequency of mutation be finely tuned.
  • beneficial mutations are rare, while deleterious mutations are common. This is because the combination of a deleterious mutation and a beneficial mutation often results in an inactive enzyme.
  • the ideal number of base substitutions for a targeted gene is usually between 1.5 and 5 [Moore and Arnold, Nat. Biotech.
  • libraries are generated using Gene Site Saturation Mutagenesis procedures (U.S. Patent No. 6,171,820, herein incorporated by reference).
  • the procedure provides, from a parental template gene, a set of mutagenized progeny genes whereby at each original codon position there is produced at least one substitute codon encoding each of the 20 naturally encoded amino acids.
  • the procedure also provides, from a parental template polypeptide, a set of mutagenized progeny polypeptides wherein each of the 20 naturally encoded amino acids is represented at each original amino acid position.
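  • The size of a single-position saturation library can be illustrated at the polypeptide level. The Python sketch below is not the Gene Site Saturation Mutagenesis procedure of the cited patent; it simply enumerates, for a hypothetical parental peptide, every variant in which one position is replaced by one of the other 19 naturally encoded amino acids. The function name and the example sequence are assumptions.

```python
from typing import List

# One-letter codes for the 20 naturally encoded amino acids.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def saturation_variants(parent: str) -> List[str]:
    """All single-substitution variants of a parental polypeptide sequence."""
    variants: List[str] = []
    for position, original in enumerate(parent):
        for amino_acid in AMINO_ACIDS:
            if amino_acid != original:  # the parental residue is already represented
                variants.append(parent[:position] + amino_acid + parent[position + 1:])
    return variants


# Hypothetical three-residue parent: 3 positions x 19 substitutions = 57 variants.
library = saturation_variants("MKT")
print(len(library))
```

  • The library therefore grows linearly with sequence length (19 variants per position, with the parent supplying the twentieth amino acid at each position), which keeps the screening burden proportional to the size of the gene.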
  • libraries are generated using gene shuffling or sexual PCR procedures (for example, Smith, Nature 370:324-25 (1994); U.S. Patent Nos. 5,837,458; 5,830,721; 5,811,238; and 5,733,731, each of which is herein incorporated by reference).
  • Gene shuffling involves random fragmentation of several mutant DNAs followed by their reassembly by PCR into full-length molecules. Examples of various gene shuffling procedures include, but are not limited to, assembly following DNase treatment, the staggered extension process (STEP), and random priming in vitro recombination.
  • DNA segments isolated from a pool of positive mutants are cleaved into random fragments with DNase I and subjected to multiple rounds of PCR with no added primer.
  • the lengths of random fragments approach that of the uncleaved segment as the PCR cycles proceed, resulting in mutations present in different clones becoming mixed and accumulating in some of the resulting sequences.
  • Multiple cycles of selection and shuffling have led to the functional enhancement of several enzymes [Stemmer, Nature 370:389-91 (1994); Stemmer, Proc. Nat'l Acad. Sci. USA 91:10747-51 (1994); Crameri et al., Nat. Biotech. 14:315-19 (1996); Zhang et al, Proc.
  • DNA shuffling can be applied using multiple related DNA sequences or combination of the different mutants. Such approaches mix multiple parent molecules in a shuffling process, generating a library of millions of different chimeric sequences (Crameri et al, Nature 391:288-291 (1998)).
  • cosmids are used for the screening because they can contain 25-35 kb genomic fragments.
  • regular plasmid-based libraries hold fragments of only 5 kb or less per plasmid. Larger genomic fragments (as in cosmid libraries) reduce the number of clones that need to be screened in order to identify the target.
  • the cosmid-transformed E. coli was observed to grow more slowly in deu-medium than plasmid-transformed E. coli. This results in a low cell density, long cultivation time and low biocatalytic activity for cosmid clones.
  • the actual extent of throughput depends on the sensitivity of detection, growth density in the deu-media, the way libraries are constructed, and the level of gene expression.
  • plasmids are used for screening. Genomic fragments of 1 to 3 kb in length are inserted into the plasmid vectors.
  • the advantage of using plasmids is that the expression of the gene in the genomic fragment can utilize the artificial promoter on the plasmid. Such an arrangement allows for the screening of the genomic fragment in a host where the native promoter cannot be recognized and allows for the tuning of the expression level.
  • the higher copy number of plasmids compared with cosmids can afford higher expression levels for targeted proteins.
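  • The effect of insert size on the number of clones to be screened can be estimated with the standard library-coverage relation N = ln(1 - P) / ln(1 - f), where f is the ratio of insert size to genome size and P is the desired probability that a given sequence is represented. This relation is not stated in the present disclosure and is introduced here as an assumption, as are the function name, the 6,000 kb genome size and the 99% probability used in the example. It also assumes random, equally represented fragments.

```python
import math


def clones_needed(insert_kb: float, genome_kb: float, probability: float = 0.99) -> int:
    """Clones required so that a given locus is present with the stated probability.

    Applies N = ln(1 - P) / ln(1 - f) with f = insert size / genome size.
    """
    if not 0 < probability < 1:
        raise ValueError("probability must be strictly between 0 and 1")
    fraction = insert_kb / genome_kb
    if not 0 < fraction < 1:
        raise ValueError("insert must be smaller than the genome")
    return math.ceil(math.log(1 - probability) / math.log(1 - fraction))


GENOME_KB = 6000  # hypothetical genome size, for illustration only
cosmid_clones = clones_needed(30, GENOME_KB)   # 30 kb cosmid inserts
plasmid_clones = clones_needed(3, GENOME_KB)   # 3 kb plasmid inserts
print(cosmid_clones, plasmid_clones)
```

  • Under these assumptions a tenfold larger insert reduces the number of clones to be screened by roughly tenfold, which has to be weighed against the slower growth of cosmid clones in deuterated medium and the expression advantages of plasmids noted above.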
  • the methods of the present invention are used to confirm the identity of a biocatalytic activity by confirming the disappearance of substrate and/or the appearance of product.
  • the product(s) of the biocatalytic reaction are further characterized.
  • By characterizing the product(s), it is possible to determine the type of biocatalytic activity as well as substrate specificity. For example, in some embodiments where the identity of the product(s) of an enzymatic reaction is known, the structures of a substrate and a product are compared to assign an activity to the biocatalyst.
  • the methods of the present invention are used to analyze multiple substrates simultaneously.
  • the use of NMR allows for the simultaneous detection of multiple products and substrates with different NMR signatures. This allows for an increase in throughput.
  • the screening of multiple substrates is also useful in the characterization of novel enzymes whose specificity is unknown.
  • a standard set of substrates (e.g., a carbohydrate, an alcohol, and a lipid) is utilized.
  • substrate analogs are used to screen newly identified enzymes.
  • enzymes of a particular class are screened to identify enzymes with a particular substrate specificity, reaction specificity (e.g., production of a particular enantiomer), or activity towards a specific set of substrate analogs.
  • the methods of the present invention find use both in the identification of enzymatic activities of unknown proteins and in the identification of new activities of known enzymes.
  • the methods of the present invention can be used to screen virtually any biocatalytic reaction without assay method development.
  • 1H NMR spectroscopy can be used to unequivocally identify and quantify organic compounds. Because the response factor of a single proton is ideally one, quantification using NMR only requires knowledge of the molecular structure and the presence of an internal concentration reference. This eliminates the need for development of chromatographic or colorimetric analytical methods. Additionally, NMR can be used to detect unanticipated reaction products that otherwise may be undetected by compound specific analytical methods. It is also common that some colorimetric methods, for example, rely upon a change in pH to indirectly detect product formation. This may lead to assignment of false positives. Detection of the actual product or change in substrate using NMR eliminates this issue.
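The quantification principle described above (equal response per proton, plus an internal concentration reference) can be sketched as follows. This is a minimal illustration, not the procedure used in the examples; the function name and the numeric values are hypothetical.

```python
def concentration_from_integrals(analyte_integral, analyte_protons,
                                 ref_integral, ref_protons, ref_conc_mM):
    """Estimate an analyte concentration (mM) from 1H NMR integrals.

    Assumes every proton has the same response factor, so the integral per
    proton is proportional to molar concentration.
    """
    if analyte_protons <= 0 or ref_protons <= 0:
        raise ValueError("proton counts must be positive")
    if ref_integral <= 0:
        raise ValueError("reference integral must be positive")
    per_proton_analyte = analyte_integral / analyte_protons
    per_proton_reference = ref_integral / ref_protons
    return ref_conc_mM * per_proton_analyte / per_proton_reference


# Hypothetical numbers: a 2-proton analyte signal measured against a
# 9-proton reference signal for a reference present at 8.0 mM.
example_mM = concentration_from_integrals(
    analyte_integral=1.0, analyte_protons=2,
    ref_integral=9.0, ref_protons=9, ref_conc_mM=8.0,
)
```

Only the molecular structure (to count the protons behind each signal) and the reference concentration are needed, which is why no compound-specific calibration is required.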
  • the present invention thus provides improved methods of discovering and characterizing biocatalysts, including enzymes.
  • NMR screening experiments are sequenced to identify the genes.
  • a multi-gene library, such as a genome library, is screened by an HTP-NMR method according to the present invention. "Hits" identified thereby are then matched to the cell clone "library member" that provided the biocatalytic "hit", and the transgene(s) that had been cloned therein are sequenced to obtain the nucleotide sequence of the gene(s) encoding the biocatalyst(s).
  • the methods of the present invention thus can provide a "reverse genomics" analysis where encoded gene function is followed by gene discovery.
  • the results of the NMR analysis are used to search databases of compounds or properties to identify metabolites or properties of screened enzymes.
  • genes encoding enzymes of interest are cloned and expressed for further characterization and/or large-scale production.
  • NMR results e.g., chemical structures of metabolite(s), NMR spectra, or other NMR-generated data
  • the systems and methods of the present invention are utilized in the screening of pharmaceuticals. Both high and low throughput NMR are suitable for the screening of pharmaceuticals. In clinical trials, typical failure rates range from 1 to 6.5% (UK), 1 to 5.6% (USA), and 1 to 13.5% (Switzerland). In addition, clinical trials account for 40% of R&D costs. Many failures occur in Phase 1 due to toxicity in animal models (pre-clinical) and inappropriate pharmacokinetics (clinical) such as poor absorption in the GI tract and extensive initial metabolism. Additional problems include drug-drug interactions/induced drug toxicity (DDI/IDT) and pre- and post-marketing adverse events.
  • DDI/IDT: drug-drug interactions/induced drug toxicity
  • the systems and methods of the present invention provide rapid technology to 'sort and select' the best drug.
  • the present invention provides methods of testing candidates entering the testing phase to identify the best candidates to move forward, rapid assessment of IDT and DDI, generation of information for FDA review, and early detection of toxicity in certain populations. This results in a reduction in whole animal testing.
  • the methods of the present invention further provide for the rapid determination of drug metabolism and of the effect of candidate compounds on different cell lines and under different environmental conditions, and allow for the direct analysis of one or more drug candidates, kinetics, transport into cells, the relation of metabolism to toxicity, and the relation of candidate compounds to gene induction. Results of HTP (or low-throughput) NMR analysis are tied to gene-chips (pharmacogenomics) when desired.
  • cells e.g., cells of particular types and clonal cell lines
  • cells comprising the gene for a drug metabolizing enzyme of interest are first grown in partially, completely or un-deuterated medium as described above.
  • the desired drug or mixture of drugs is added, followed by NMR analysis and data collection.
  • Data can be obtained in time point or kinetic mode.
  • direct substrate/product analysis, quantitative analysis, and multiple substrate/product analysis are performed.
  • the NMR methods of the present invention are used to analyze candidate pharmaceuticals at early stages in product development. Currently utilized methods involve whole animal testing of many candidates, the evaluation of drug metabolism in cell systems, GC/MS analysis, and LC/MS analysis.
  • culture media is modified to facilitate supernatant analysis by proton NMR.
  • NMR is used in toxicity screening of drug candidates.
  • clones of yeast, bacteria, or eukaryotic cells e.g., hepatocytes
  • Drug candidates/compounds are added, the mixture is allowed to incubate, and NMR analysis is performed.
  • drug/drug, drug/metabolite, or metabolite/metabolite interactions are screened by combining the drug and/or metabolite prior to analysis.
  • the NMR analysis allows for the determination of metabolic profile, kinetics, quantitation, and metabolism as a function of gene induction.
  • the present invention is not limited to a particular cell type.
  • the methods of the present invention result in a reduction in whole animal use for drug candidates entering clinical evaluation.
  • the methods of the present invention allow for the rapid evaluation of which drug is metabolized and to what level, the interpretation of whether the drug or a metabolite is toxic, and direct drug-drug interaction analysis.
  • candidate compounds can be screened to determine the best candidates for further testing (e.g., animal testing) or the least effective compounds.
  • the NMR methods of the present invention are used to evaluate the toxicity of candidate pharmaceuticals in a particular population (e.g., children or the elderly) or specific cell lines from animals or humans.
  • enzymes are cloned from the specific population to be analyzed and grown in a suitable host cell. NMR analysis is then performed as described above.
  • the HTP-NMR methods of the present invention are used to discover pharmaceutical and agricultural leads.
  • the leads are inhibitors of certain enzymes or enzyme clusters that catalyze specific reactions or pathways.
  • a library of potential inhibitors can be screened using HTP-NMR based on their ability to inhibit the enzyme-catalyzed reaction or pathway.
  • a drug target e.g., biocatalyst
  • substrate are contacted with a combinatorial library of drug candidates and HTP-NMR is used to screen for changes to the substrate indicative of active drug candidates.
  • NMR is used in pharmaco-genometabolomics applications. Cells containing clones of metabolizing enzymes from different populations (e.g.
  • NMR e.g., HTP-NMR
  • HTP-NMR is used to screen libraries of compounds for potential enzyme inhibitors.
  • a cell expressing an enzyme or biocatalyst of interest is contacted with a known substrate and the level of conversion of the substrate to product is measured in the presence and absence of the inhibitor library.
  • Potential inhibitors are compounds that decrease the level of product formation relative to the level in the absence of the inhibitor.
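The inhibitor criterion above (less product formed than in the uninhibited control) can be sketched as a simple comparison of product signals. This is an illustrative sketch only; the compound identifiers, signal values, and the 50% cut-off are hypothetical and are not taken from the specification.

```python
def find_inhibitors(product_with_compound, product_without, min_decrease=0.5):
    """Return IDs of compounds that lower the product signal.

    product_with_compound: dict mapping compound ID -> product signal
        measured in the presence of that compound.
    product_without: product signal of the control without any inhibitor.
    min_decrease: fractional drop in product required to call a hit
        (an assumed threshold; choose it from control variability).
    """
    if product_without <= 0:
        raise ValueError("control must show product formation")
    hits = []
    for compound_id, signal in product_with_compound.items():
        fractional_decrease = 1.0 - signal / product_without
        if fractional_decrease >= min_decrease:
            hits.append(compound_id)
    return sorted(hits)


# Hypothetical product integrals, normalized to the uninhibited control.
signals = {"cmpd_A": 0.9, "cmpd_B": 0.2, "cmpd_C": 0.5}
potential_inhibitors = find_inhibitors(signals, product_without=1.0)
```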
  • the methods of the present invention are used to screen environmental compounds.
  • agricultural compounds e.g., pesticides or herbicides
  • Cells containing metabolizing enzymes e.g., human metabolizing enzymes
  • the presence of toxic compounds or metabolites is then assayed using the methods of the present invention.
  • environmental toxins e.g., industrial pollutants
  • Cells containing metabolizing enzymes are contacted with the environmental toxin and the presence of toxins or toxic metabolites is determined using the HTP-NMR methods of the present invention.
  • biodegradation of pollutants or agricultural compounds is assayed by obtaining environmental samples for analysis.
  • Environmental samples can be obtained from different regions of the country or depth of soil to determine the breakdown and prevalence of such compounds.
  • the HTP-NMR methods of the present invention are used to screen drugs in a variety of animals to determine if a finding (e.g., of toxicity) is species specific or more broadly applicable.
  • the HTP-NMR methods of the present invention are used in the screening of cellular uptake and bioavailability of compounds and/or enzymes.
  • HTP-NMR is used to identify gene inducers, and to assay cellular uptake and transport kinetics of compounds.
  • dosage e.g., of pharmaceuticals or herbicides.
  • GI-cells are cultivated and uptake of different xenobiotics is compared using the HTP-NMR methods of the present invention.
  • Non-bioavailable xenobiotics can be "sorted out" quickly by following the concentrations of the xenobiotics in the supernatant as compared with the controls. For bioavailable xenobiotics, the uptake rates of different xenobiotics are also compared to determine the different transport kinetics.
VI. Screening for Anti-Microbial Activity
  • HTP-NMR methods of the present invention are used to screen for anti-microbial activity of compounds, including peptides.
  • libraries of anti-microbial compounds are screened using microbes cultured using media rich in glucose. Under normal growth conditions, the initial dose of glucose is consumed. In cases where the microbial activity has been limited or curtailed the amount of glucose observed in the media is comparable to the initial concentration. Accordingly, in some embodiments, anti-microbial activity is detected by the presence of glucose in the media compared to the consumption of the nutrient under normal growth conditions.
  • the methods of the present invention are used to identify alterations in biological activities of living systems resulting from contact with a test compound (whether directly by the test compound or via a metabolite thereof).
  • a living biological entity is utilized.
  • a combinatorial chemistry library is tested against an immobilized enzyme/array to find enzyme inhibitors.
  • a monoclonal antibody array or aptamer array is tested to find binding targets.
  • an array of ligands is screened to identify binding molecules.
  • the normal substrate for the enzyme is included in the medium. Then, test compounds are added to see if they are inhibitory.
  • enzymes are free in solution. In other embodiments, they are immobilized (e.g., in a microtitre plate).
  • methods that do not require biocatalysis of the test compound are utilized (e.g., the test compound or metabolite is active).
  • a series of inducers is screened with a known induction system or operons.
  • a series of inducers is screened with a known induction system or known operons.
  • a series of cells is screened with different operons to test the same compound for induction or repression.
  • HTP-NMR is used to identify compound uptake into cell/tissue.
  • differences in concentration e.g., removal of target compound from the test medium
  • LB medium was used in all experiments as the normal medium for cultivation of microbes. It was prepared using normal deionized water. Antibiotics and isopropyl-β-D-thiogalactopyranoside (IPTG) were added where appropriate at the following concentrations: kanamycin (Kan), 50 µg/mL; tetracycline (Tc), 50 µg/mL; ampicillin (Ap), 50 µg/mL; and IPTG, 0.25 mM. Both antibiotics and IPTG were dissolved in D2O and sterilized through a 0.22-µm membrane prior to addition to the medium.
  • TSP: Trimethylsilylpropionic-2,2,3,3-d4 acid
  • TCP: 1,2,3-Trichloropropane
  • DCH: 2,3-Dichloro-1-propanol
  • NMR samples were analyzed in a 600 MHz Varian UNITY INOVA NMR spectrometer.
  • Total DNA (genomic and plasmid) of Rhodococcus rhodochrous TDTM003 [ATCC Designation 55388] was isolated using the Invitrogen "EASY-DNA Kit".
  • a 1.4-kb DNA fragment containing the dehalogenase gene with its native promoter was amplified from the total DNA using the following primers: 5'-CGGGATCCTTGGCAGACGTAGGATGCT (SEQ ID NO:1) and 5'-CGGGATCCATTGGATGCTTCGTTCTCC (SEQ ID NO:2).
  • Inclusion of BamHI recognition sequences at the 5'- and 3'-ends of the dhl fragment facilitated its cloning.
  • Cosmid libraries of Rhodococcus rhodochrous were prepared in SuperCos1 purchased from Stratagene.
  • Total DNA was purified from R. rhodochrous TDTM003 using the procedure described in the Instruction Manual of the Stratagene "SuperCos1 Cosmid Vector Kit". The majority of the purified total DNA was 49 to 150 kb long.
  • Total DNA was partially digested with MboI restriction enzyme to afford fragments in the range of 25 to 35 kb. The resulting DNA fragments were ligated into the BamHI site of SuperCos1.
  • Ligated DNA was packaged in phage λ using GIGAPACK III XL Packaging Extract purchased from Stratagene (La Jolla, CA). The phage library was transfected into E. coli BL21(DE3), and colonies were selected on LB plates for resistance to Ap.
  • TCP in deu-DMSO stock solution (40%) was freshly prepared. An aliquot of this TCP stock solution (5 µL) was added to each well. After mixing vigorously, the plate was returned to a 30°C shaker with agitation at 250 rpm for 30 hours. The plate was stored at -80°C after TSP was added to each well.
  • Enzyme activity was determined based on a pH indicator, bromothymol blue (BTB).
  • the assay buffer consisted of 1 mM BES buffer and 50 µM of BTB, pH 7.8. Because the products of the dehalogenation include a strong acid (HCl), the rate of production of H+ (the reaction rate) can be monitored from the rate of change in absorbance at 620 nm.
  • the assay reaction solution contains 10 mM TCP as a substrate.
  • the dehalogenation reaction was initiated by adding biocatalyst (whole-cell or crude lysate). The absorbance at 620 nm was monitored at 30°C. For crude lysates, protein concentrations were determined using the Bradford dye-binding procedure (Bradford, Anal. Biochem. 72:248 (1976)).
  • VAST Varian Versatile Automatic Sample Transport
  • Gilson 215 sampling robot connected to a 600 MHz pulsed field gradient capable, triple resonance micro-flow probe with 60 µL cell volume.
  • the proton 90° pulse width for this probe is 2.3 µs, giving a signal to noise ratio of 67:1 measured on the anomeric proton signal of 2 mM sucrose in D2O with presaturation of the residual HOD signal.
  • the Gilson 215 was equipped with 3 x 205H sample trays holding 2 x 96 well plates each for a total of 576 samples. The system is capable of holding 5 x 205H sample trays for a total of 960 samples.
  • the Gilson 215 was connected to the micro-flow probe using 10 ft. (3.048 m) of 0.01" (254 µm) ID PEEK tubing, giving a total sample volume of 570 µL.
  • Various flow parameters were used during the data acquisition. Optimum values can be recognized by a small amount of cavitation present in the syringe pump at the end of the sample aspiration. This generally occurred with both the fast and slow rates set between 0.8 and 1.0 mL/min.
  • the sample extra volume was set to 125 µL to compensate for the slight cavitation.
  • Sample injection volume was set to 300 µL, with a push volume of 270 µL of D2O making up the remaining sample volume. Since the system operates under near plug-flow conditions, little dilution of the sample was observed. No probe rinses were performed between samples.
  • the VAST system was operated in direct injection mode (DI) with 50 lb/in² of nitrogen pushing the sample out of the probe during the sample aspiration step.
  • Holdover residual sample at the end of aspiration
  • Carryover (residual NMR signal from the previous sample) varied between 10 and 20%, with 10 to 15% being typical at 0 µL holdover. Holdover was found to vary at constant flow rates; slight blockage of the system by cellular materials is suspected to be responsible for the variation.
  • NMR data were collected using a 90° observe pulse with a 1.2 s presaturation pulse applied on resonance with the residual HOD signal between each observe pulse. A total of 8 acquisitions were acquired for each spectrum, giving a total time of 23 seconds for the acquisition of the NMR data. Flow of the sample into the spectrometer required approximately 45 seconds, as did the withdrawal of each sample from the probe. Some additional time is consumed rinsing the syringe between injections and in the positioning of the robot arm giving a total time of 138 seconds for each sample when optimized. This gives a total throughput of 625 samples per 24 hours of operation.
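The timing figures above can be checked with a short calculation. The sketch below uses only the durations quoted in the text; the variable names are illustrative, and the "overhead" term is simply the remainder attributed to syringe rinsing and robot-arm positioning.

```python
SECONDS_PER_DAY = 24 * 60 * 60

# Durations quoted in the text, in seconds.
acquisition_s = 23      # 8 acquisitions per spectrum
inflow_s = 45           # flow of the sample into the spectrometer
withdrawal_s = 45       # withdrawal of the sample from the probe
total_cycle_s = 138     # optimized total time per sample

# Remainder spent on syringe rinsing and robot-arm positioning.
overhead_s = total_cycle_s - (acquisition_s + inflow_s + withdrawal_s)


def samples_per_day(cycle_seconds):
    """Number of complete sample cycles that fit into 24 hours."""
    return SECONDS_PER_DAY // cycle_seconds


daily_throughput = samples_per_day(total_cycle_s)
```

The result, 626 complete cycles, agrees with the stated throughput of approximately 625 samples per 24 hours; only about a sixth of each cycle is spent acquiring NMR data, so sample transport dominates the cycle time.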
  • To completely replace protons with deuterons in the cell culture, microbes have to be cultivated in fully deuterated medium ("deu-medium"). Cultures that were grown initially in regular medium were diluted (1:500) and cultivated in deu-medium. At least two more serial dilution and cultivation cycles were performed in order to achieve a sufficiently silent 1H NMR spectral background. A variety of microorganisms (both prokaryotic and eukaryotic) were examined for their growth characteristics in two different deu-medium compositions.
  • Table 1 Cultivation of Different Microorganisms in deu-Medium.
  • The level of cellular growth is compared relative to Escherichia coli (E. coli). Microorganisms that displayed growth comparable to E. coli were given a score of ++, those that grew less than E. coli were given a +, and no growth is indicated by a -; "N/D" indicates "not determined."
  • E. coli BL21(DE3)/pHZ83F3 contains a mutant dehalogenase gene under the transcription control of a strong T7 promoter and a Kan resistance marker. It was examined in BIO-EXPRESS 1000 and LB medium for growth comparison.
  • the inoculum was started by introduction of 10 µL of glycerol freeze of deuterated culture into 5 mL of 1 x concentrated deu-medium containing kanamycin.
  • the culture was grown for 17.5 hours at 30°C with agitation at 290 rpm before 10 µL of the culture was transferred into 10 mL of fresh 1 x concentrated deu-medium or LB medium. Growth of the fresh cultures was continued at 30°C with agitation at 290 rpm.
  • Samples were taken at designated time points to determine cell densities by measurement of absorbance at 600 nm (OD600). A slower growth rate and lower cell density were obtained in deu-medium (1 x concentrated) compared to LB medium. Cells grown in deu-medium reached stationary phase earlier than in LB medium. The deu cell density (OD600) in the stationary phase was approximately 2.0. Both cell growth rate and final cell density in the "rich" deu-medium were about 1/3 of those in LB.
  • E. coli BL21(DE3) and E. coli BL21(DE3)/pHZ83F3 were examined as biocatalysts to catalyze the dehalogenation reaction of 1,2,3-trichloropropane (TCP) (Scheme 1).
  • Cell cultivation was started by inoculating 5 µL of glycerol freeze of deu cell culture (for deu-biocatalyst) and 5 µL of glycerol freeze of LB cell culture (for normal biocatalyst) in 3 mL of 1 x concentrated deu-medium and LB medium, respectively. Cells were grown at 30°C with agitation at 290 rpm.
  • E. coli BL21(DE3)/pHZ83F3 that was cultivated in deu medium exhibited substantial dehalogenase activity, whereas the same strain grown in LB medium exhibited higher activity: the reaction rate for deu cells was about 1/3 that for normal cells. As expected, no dehalogenase activities were detected in BL21(DE3) controls from either cultivation medium.
  • the crude lysates of E. coli BL21(DE3)/pHZ83F3 were also examined as biocatalysts to catalyze the dehalogenation reaction of 1,2,3-trichloropropane (TCP) to 2,3-dichloro-1-propanol (DCH).
  • the cells were further disrupted by sonication.
  • the cell debris was removed by centrifugation at 4°C.
  • Protein concentrations were determined by comparison to a standard curve prepared using bovine serum albumin.
  • the crude lysate of BL21(DE3)/pHZ83F3 cultivated in deu-medium showed dehalogenase activity.
  • the crude lysate of the same strain cultivated in LB medium had a higher specific activity.
  • the ratio of specific activity for deu-cell lysate versus normal-cell lysate was approximately 1/3.2.
  • the signals correspond to various metabolites, proteins, DNA, RNA and residual nutrients in the LB medium.
  • the residual water peak had been zeroed in order to observe the signals of other components.
  • Crude lysates obtained from corresponding cells were also analyzed by NMR and similar results were obtained. Deu-crude lysates were NMR-silent, while the spectra of normal lysates were too complicated to be interpreted.
  • the dehalogenation catalysis in deuterated medium was demonstrated to be both catalytically active and NMR-silent. The catalytic activity was lower than that of biocatalysis performed using normal media because of the deuterium isotope effect.
  • TCP 1,2,3-trichloropropane
  • DCH 2,3-dichloro-1-propanol
  • the native dehalogenase from Rhodococcus has a very low activity for TCP as a substrate, with a kcat of 0.08 s⁻¹ and a Km of 2.2 mM (Bosma et al., Appl. Environ. Microbiol. 65:4575 (1999)). Such a low activity is problematic for typical function-based screening methods.
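To put the cited kinetic constants in perspective, the sketch below evaluates the catalytic efficiency (kcat/Km) and the per-enzyme turnover rate predicted by the Michaelis-Menten equation. Applying simple Michaelis-Menten kinetics here is an assumption made for illustration; the 10 mM substrate concentration is the TCP concentration used in the assay reaction solution.

```python
K_CAT_PER_S = 0.08   # turnover number cited for TCP, s^-1
K_M_MM = 2.2         # Michaelis constant cited for TCP, mM


def catalytic_efficiency_per_M_per_s(k_cat, k_m_mM):
    """kcat/Km in M^-1 s^-1 (Km is converted from mM to M)."""
    return k_cat / (k_m_mM / 1000.0)


def turnover_rate_per_s(k_cat, k_m_mM, substrate_mM):
    """Michaelis-Menten rate per enzyme molecule: kcat*[S] / (Km + [S])."""
    return k_cat * substrate_mM / (k_m_mM + substrate_mM)


efficiency = catalytic_efficiency_per_M_per_s(K_CAT_PER_S, K_M_MM)
rate_at_10_mM = turnover_rate_per_s(K_CAT_PER_S, K_M_MM, substrate_mM=10.0)
```

Under these assumptions the efficiency is roughly 36 M⁻¹ s⁻¹, and even at 10 mM TCP each enzyme molecule converts fewer than 0.07 substrate molecules per second, which illustrates why such an activity is hard to detect by indirect assays.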
  • Plasmids pKL.DOW.1.110A and pKL.DOW.1.110B were created by cloning the native dehalogenase gene in pBR322, an Ap resistant vector with the copy number of about 15. To further mimic the clone library, the expression of the dehalogenase was designed to utilize its native promoter, which is much weaker than an artificial promoter such as T7 or tac.
  • Biocatalysts E. coli BL21(DE3)/pKL.DOW.1.110A and E. coli BL21(DE3)/pKL.DOW.1.110B were created by transforming the corresponding plasmids into E. coli host BL21(DE3).
  • the E. coli BL21(DE3) was used as a negative control.
  • Each of the three strains was deuterated by cultivation in BIO-EXPRESS 1000.
  • pKL.DOW.1.110A was observed to produce more DCH than pKL.DOW.1.110B. This was probably due to increased transcription of the TCP-dehalogenase gene since it can also utilize the Tc promoter that exists in pBR322.
  • P. fluorescens MB214 is a derivative of wild-type prototrophic P. fluorescens biotype A. P. fluorescens MB214 was derived by integrating the lacIZYA operon (deleted of the lacZ promoter region) into the wild-type chromosome.
  • P. fluorescens MB214 is Lac+, whereas the wild-type is Lac−.
  • Pseudomonas fluorescens biotype A, also called biovar 1 or biovar I, is available from the American Type Culture Collection under Designation ATCC 13525.
  • Pseudomonas fluorescens MB214/pLCl 1.1a contains a mutant dehalogenase under transcription control of a tac promoter and a Tc resistance marker.
  • Cells were cultivated at 30°C with agitation at 290 rpm. When OD600 reached 0.5, the appropriate amount of IPTG was added to induce the expression of the dehalogenase. After further cultivation overnight, TCP stock solution in DMSO-d6 was added to make the final TCP concentration 4 mM. The reaction proceeded under the same conditions as previously described.
  • the deu-NMR method successfully monitored the reaction in this Pseudomonas fluorescens system, indicating that Pseudomonads are also suitable host cells for use in constructing gene libraries and that such deu-grown Pseudomonads can be directly screened by the deu-NMR method for desired reactions. Since the expression of dehalogenase in Pseudomonas fluorescens MB214/pLCl 1.1a is under a strong tac promoter, the reaction rate is even higher than that of E. coli BL21(DE3)/pKL.DOW.1.110A. One hour of reaction time clearly demonstrated DCH formation.
  • Rhodococcus spp. are known to play a significant role in the biodegradation of organic compounds in the environment and are believed to be important for future bioremediation processes.
  • Rhodococcus rhodochrous TDTM3 is a strain that was demonstrated to have dehalogenase activity. It was reported that some Rhodococcus rhodochrous strains have the dehalogenase genes located on their plasmids instead of their genomes. For example, strain NCIMB 13064 has its dehalogenase gene on one of its two plasmids, pRTL1 (100 kb) (Kulakova, Microbiology 143:109 (1997)). In order to guarantee the inclusion of the dehalogenase gene, total DNA (genomic and plasmid) of TDTM3 was prepared as the starting material for the library construction.
  • cosmids that can accept 25-35 kb fragments were used as the vector for the library construction instead of plasmid.
  • the library was constructed as described in the Experimental Section. A total of 46 microtiter plates were prepared, which represented 4416 individual colonies. For each plate, E. coli BL21(DE3)/pKL.DOW.1.110A and E. coli BL21(DE3)/pKL.DOW.1.110B were inoculated in wells G12 and H12, respectively, as positive controls. After cultivation in 1 x deu medium for only one round of deuteration, deu-biocatalysts were used for screening of the dehalogenase reaction. The screening was performed using customized VAST DI NMR to identify the clone that carried the dehalogenase gene based on detection of DCH formation.
  • microtiter plates were frozen and stored for analysis. The plates were thawed and placed in the Gilson 205H sample trays without any sample preparation. Most of the cellular material settled to the bottom of the well quickly, allowing the withdrawal of a 300 µL sample, which did not clog any of the components of the VAST system. Once flow rates had been optimized, each sample took approximately 138 seconds to run. This gives a sample throughput of approximately 625 samples per day in this direct injection mode.
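As a rough planning aid, the plate count and per-sample time quoted above translate into total instrument time as follows. This is a back-of-the-envelope sketch that assumes uninterrupted operation and ignores plate changes; the variable names are illustrative.

```python
PLATES = 46
WELLS_PER_PLATE = 96
SECONDS_PER_SAMPLE = 138
SECONDS_PER_DAY = 24 * 60 * 60

# Every well is injected, including the two control wells on each plate.
total_wells = PLATES * WELLS_PER_PLATE
total_seconds = total_wells * SECONDS_PER_SAMPLE
instrument_days = total_seconds / SECONDS_PER_DAY
```

At 138 seconds per sample, the 4416 wells correspond to about seven days of continuous spectrometer time.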
  • the software was modified to designate the computer to search for the relevant signals automatically.
  • Varian supplied software that maps two defined integral regions onto two colors (green and red) and displays the results as an image of the 96-well plate, with the third color (blue) being used to map the sum of the first two integrals.
  • This software was written to accept a series of SCOUT data sets (which are 2D NMR data sets) and combine these into a 2D NMR data set.
  • the autopresat experiment used is a 1D data set; as a result, the Varian macros had to be modified to accept 1D data as input.
  • the software was modified to allow three integral regions to be mapped by changing the third mapping from the sum of the first two integrals to a third raw integral.
  • the Varian software normalized all of the integrals to the values obtained for the first integral of well H1 by division, after first setting negative values to 0.
  • this integral was the product (DCH), which was generally not present resulting in either small positive or negative integrals.
  • Negative integrals were set to 0 by the software, resulting in divide-by-zero errors. Rather than change the software at the point of failure (this would have required extensive testing of the new code), it was decided to switch the data files collected for wells H12 and H1, thus effectively moving one of the control samples to well H1.
  • the two positive hits 31H7 and 49B11 were identified in the glycerol freeze samples of the library. They were streaked out on LB/Ap plates separately to obtain single colonies for cultivation. Purification of the cosmid DNAs afforded the two cosmids 31H7 and 49B11. To confirm and characterize both clones, the purified DNA was digested with restriction enzymes. The results confirmed that both 31H7 and 49B11 were cosmids with large DNA inserts. BamHI digests of the two clones were prepared and analyzed by gel electrophoresis. The lengths of the genomic inserts were estimated to be 26.8 kb and 33.8 kb for 31H7 and 49B11, respectively.
  • the nucleotide sequences of the dhl genes in both 31H7 and 49B11 were determined. Their ORFs were aligned and compared. The two dhl clones showed the same sequence, an 885-bp stretch of DNA encoding a protein of approximately 30 kDa. The deduced amino acid sequence of this dhl gene was also compared with the published dhl product from Rhodococcus rhodochrous NCIMB 13064.
  • the crude lysate of deu E. coli BL21/pHZ83F3 was used as the biocatalyst for the reaction.
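The failure mode and the workaround described above can be reproduced in a few lines. The sketch below is not the Varian macro code; it is a hypothetical re-creation of the same logic (clamp negative integrals to zero, then divide by the reference well), with made-up integral values, showing why swapping the data for wells H1 and H12 avoids the division by zero.

```python
def normalize_integrals(integrals, reference_well="H1"):
    """Clamp negative integrals to 0, then divide all by the reference well."""
    clamped = {well: max(0.0, value) for well, value in integrals.items()}
    reference = clamped[reference_well]
    if reference == 0.0:
        # The reference well contains no product signal after clamping.
        raise ZeroDivisionError(f"reference well {reference_well} is zero")
    return {well: value / reference for well, value in clamped.items()}


def swap_wells(integrals, well_a, well_b):
    """Return a copy with the data of two wells exchanged."""
    swapped = dict(integrals)
    swapped[well_a], swapped[well_b] = integrals[well_b], integrals[well_a]
    return swapped


# Hypothetical product (DCH) integrals: H1 is an ordinary library well with
# no product, H12 is a positive control.
raw = {"H1": -0.02, "A1": 0.01, "H12": 2.0}

try:
    normalize_integrals(raw)
    failed_without_swap = False
except ZeroDivisionError:
    failed_without_swap = True

normalized = normalize_integrals(swap_wells(raw, "H1", "H12"))
```

Placing a positive control in the reference position guarantees a non-zero denominator without touching the normalization code itself.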
  • the cultivation and disruption of cells were performed as described previously.
  • the dehalogenation reaction was initiated through the addition of 2 µL of TCP directly into the reaction buffer (5 mL) containing 4.4 mL of NaD2PO4 (pH 8.0) in D2O and 0.6 mL of lysate. After mixing vigorously, the reaction proceeded at 30°C with agitation at 290 rpm in an orbital shaker.
  • Deu E. coli BL21 (DE3)/pHZ83F3 whole-cells were used as the biocatalyst for the reaction.
  • Deu-culture (5 mL) was prepared as described previously. After the IPTG induction, cells were cultivated for another 15 hours.
  • a 10% (v/v) TCP in deu-DMSO stock solution was prepared using the sterile deu-DMSO.
  • the deu-DMSO was sterilized through a 0.22-µm membrane.
  • the dehalogenation reaction was initiated through the addition of 20 µL of this TCP stock solution into the culture. After mixing vigorously, the reaction proceeded at 30°C with agitation at 290 rpm in an orbital shaker.
  • This example describes the use of HTP-NMR following growth of microorganisms in non-deuterated medium.
  • M9 salts solution (1 L) contained Na2HPO4 (6 g), KH2PO4 (3 g), NH4Cl (1 g), and NaCl (0.5 g).
  • the Partial-Deu-Medium (1 L) consisted of 990 mL of M9 salts, MgSO4 (0.12 g), and 10 mL of BIO-EXPRESS 1000 (U-D, 98%, 10x concentrated) purchased from Cambridge Isotope Laboratory, Inc. SOC (1 L) consisted of tryptone (20 g), yeast extract (5 g), NaCl (0.5 g), KCl (2.5 mM), MgCl2 (10 mM) and glucose (20 mM).
  • a library of random 1.5-2.5 kb fragments of genomic DNA from Pseudomonas fluorescens strain MB214 was constructed in the pET21(+) vector (Novagen, Madison, WI; Catalogue number 69770-3).
  • the vector DNA was linearized with BamHI, and then treated with calf intestinal phosphatase (New England Biolabs, Beverly, MA). This preparation was then treated with DNA ligase and subsequently separated on an agarose gel.
  • the DNA molecules that still remained in a linear form were purified from the gel using the QiaexII gel purification kit (Qiagen, Valencia, CA).
  • MB214 genomic DNA was treated with the enzyme Sau3AI at 1 unit of enzyme per 1 µg of DNA for 8 minutes at room temperature. This condition was found to be ideal to enrich for DNA that is 1.5-2.5 kb in size. After 5 minutes, the reaction at 65°C was terminated. The preparation was separated on agarose gels and DNA molecules of 1.5-2.5 kb were purified from the gel. This DNA was ligated to the BamHI-cut pET21(+) vector and subsequently transformed by electroporation into E. coli strain Electro-Ten-Blue (Stratagene, Cedar Creek, TX; Catalogue number 200159). The cells were plated out on LB medium containing ampicillin and grown overnight at 37°C.
  • the cells that grew were collected and used to prepare plasmid DNA.
  • This DNA preparation representing the MB214 DNA library was subsequently used to repeatedly transform E. coli to analyze individual clones.
  • PCR amplification of the library DNA was performed with primers based on vector sequences that flank the multiple cloning site (Forward: 5'-CTT GTC GAC GGA GCT CGA A-3'(SEQ ID NO:3), Reverse: 5'-GGG GAA TTG TGA GCG GAT AAC-3' (SEQ ID NO:4)).
  • the following program was used to perform the PCR reaction: one cycle of 94°C for 2 min, followed by 20 cycles of denaturation at 94°C for 1 min, annealing at 58°C for 1 min, and amplification at 72°C for 4 min. An additional amplification step at 72°C for 15 min was added at the end of the reaction.
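The PCR program above can be written down as a small data structure, which also makes the total programmed hold time easy to compute. The sketch below is illustrative only; it sums the hold times stated in the text and ignores temperature ramping, which depends on the thermocycler.

```python
# Hold times in minutes, taken from the program described above.
INITIAL_DENATURATION_MIN = 2          # one cycle at 94 °C
CYCLES = 20
CYCLE_STEPS_MIN = {
    "denaturation_94C": 1,
    "annealing_58C": 1,
    "extension_72C": 4,
}
FINAL_EXTENSION_MIN = 15              # additional step at 72 °C


def total_hold_time_min():
    """Sum of all programmed hold times, excluding ramp times."""
    per_cycle = sum(CYCLE_STEPS_MIN.values())
    return INITIAL_DENATURATION_MIN + CYCLES * per_cycle + FINAL_EXTENSION_MIN


pcr_minutes = total_hold_time_min()
```

The programmed holds add up to 137 minutes, so a run takes a little over two hours once ramping is included.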
  • The plate was incubated overnight at 37°C, and the resultant colonies were picked with a Qpix colony-picking robotic system and inoculated into 2 mL square 96-well plates (Qiagen, Valencia, CA) containing 900 μL of LB/ampicillin.
  • A typical transformation yields ~3600 colonies, of which ~1800 are selected based upon preset parameters.
  • The Qpix measurements are in terms of pixels, and the criteria for selected colonies were as follows: diameter 5-50, axis ratio 0.75, roundness 0.75. The closer the roundness and axis ratio values are to 1, the more perfectly circular the colony. The typical range for both is 0.6-0.95.
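The selection criteria above amount to a simple filter over measured colonies. The sketch below is illustrative only: it assumes that the 0.75 values are lower bounds on axis ratio and roundness and that the 5-50 pixel diameter range is inclusive, neither of which the text states. The `Colony` record and `select_colonies` function are hypothetical names and are not part of the Qpix software.

```python
from dataclasses import dataclass


@dataclass
class Colony:
    diameter_px: float  # colony diameter in pixels
    axis_ratio: float   # 1.0 for a perfect circle
    roundness: float    # 1.0 for a perfect circle


def select_colonies(colonies, min_d=5, max_d=50, min_axis=0.75, min_round=0.75):
    """Keep colonies that meet the diameter, axis-ratio, and roundness criteria."""
    return [
        c for c in colonies
        if min_d <= c.diameter_px <= max_d
        and c.axis_ratio >= min_axis      # assumed to be a lower bound
        and c.roundness >= min_round      # assumed to be a lower bound
    ]
```

Treating the thresholds as keyword arguments makes it easy to see how the yield (roughly half of the colonies in the text) would change if the criteria were loosened.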
  • The cell cultures are grown in a HIGRO chamber (GeneMachines, San Carlos, California, USA) for 6 hours at 36°C and ~500 rpm. They are then induced with 10 mM IPTG, for a final concentration of 0.3 mM, and grown overnight with O2 supplementation to the chambers.
  • Glycerol stocks are made from a 1:1 dilution of culture and 40% glycerol in water for a final volume of 150 μL.
  • The daughter plates of glycerol stocks were immediately stored at -80°C.
  • The parent plates were spun down in a tabletop centrifuge adapted for microtiter plates, at 4000 rpm for 5 minutes at 4°C.
  • The supernatant is decanted manually and the pellet is resuspended in 900 μL of the Partial-Deu-Reaction Medium containing 5 mM hydrocinnamonitrile.
  • The plates are placed back in the HIGRO chamber for at least 20 hours at 30°C, without O2, at ~500 rpm. After this time, the plates were once again centrifuged for 5 minutes at 4000 rpm and 4°C.
  • The supernatant (900 μL) was then aspirated and transferred to new 2 mL square 96-well plates containing 300 μL of TSP solution in D2O (33.3 mM), and the solutions were mixed thoroughly. These plates were then sealed and stored at 4°C until analysis by NMR. Medium manipulations of the titer plates were done using a Tecan Genesis liquid handling system.
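The concentrations in the plate preparation follow from ordinary dilution arithmetic. The sketch below assumes that volumes are additive and, for the IPTG case, that the culture volume is the 900 μL stated for each well; the function names are hypothetical.

```python
def final_concentration(stock_mM, stock_uL, sample_uL):
    """Concentration after adding stock_uL of stock to sample_uL (additive volumes assumed)."""
    return stock_mM * stock_uL / (stock_uL + sample_uL)


def stock_volume_needed(stock_mM, target_mM, culture_uL):
    """Volume of stock to add to culture_uL so that the mixture reaches target_mM."""
    return target_mM * culture_uL / (stock_mM - target_mM)


# TSP reference in the NMR sample: 300 uL of 33.3 mM TSP plus 900 uL of supernatant
tsp_final_mM = final_concentration(33.3, 300, 900)

# 10 mM IPTG stock added to a 900 uL culture for a 0.3 mM final concentration
iptg_stock_uL = stock_volume_needed(10, 0.3, 900)
```

Under these assumptions the TSP reference ends up at about 8.3 mM in the NMR sample, and roughly 28 μL of IPTG stock is needed per well.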
  • A Varian UNITY INOVA 600AS MHz NMR spectrometer equipped with the Versatile Automated Sample Transport (VAST) system was used for the high throughput screening (entire NMR System obtained from Varian, Inc., Palo Alto, California, USA).
  • The Gilson 215 liquid handling system that is part of the VAST accessory had been modified to allow for increased sample throughput under screening conditions (the Gilson 215 is manufactured by Gilson, Inc., Middleton, Wisconsin, USA).
  • A 120 μL triple resonance flow-probe was used to collect the 1H NMR spectra. Configuration of the Gilson 215 for "plug-flow" sample delivery was as follows:
  • Probe Volume: 250 μL
  • Probe Slow Volume: 100 μL
  • Probe Slow Rate: 1.25 mL/min
  • Probe Fast Rate: 1.25 mL/min
  • Water-suppressed 1H NMR spectra were collected using the "WET" pulse sequence. NMR spectral acquisition parameters were as follows:
  • Spectral width: 8000.00 Hz
  • Transmitter power: 59 dB
  • 90° pulse width: 4.4 μsec
  • Data acquisition time per sample was 20 seconds.
  • Sample transport time was approximately 15 seconds per sample.
  • An additional 5 seconds is required for the autolock routine performed in automated spectrometer control and to save the data to disk.
  • A repetition time of 2.5 seconds was chosen as a compromise between relaxation effects and analysis speed.
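The per-sample timings above can be combined into a rough throughput estimate. This sketch treats acquisition, transport, and autolock/save as strictly sequential, which is an assumption (the plug-flow method described later overlaps sample loading with acquisition), and it assumes that the 20-second acquisition consists of whole repetitions of 2.5 seconds. All names are hypothetical.

```python
ACQUISITION_S = 20.0      # data acquisition time per sample (from the text)
TRANSPORT_S = 15.0        # approximate sample transport time (from the text)
LOCK_AND_SAVE_S = 5.0     # autolock routine and saving data to disk (from the text)
REPETITION_S = 2.5        # repetition time (from the text)


def seconds_per_sample():
    """Total time per sample if the three steps run one after another (assumption)."""
    return ACQUISITION_S + TRANSPORT_S + LOCK_AND_SAVE_S


def minutes_per_plate(wells=96):
    """Time to run one titer plate at the sequential per-sample rate."""
    return wells * seconds_per_sample() / 60.0


def transients_per_sample():
    """Number of repetitions that fit into the acquisition time (assumption)."""
    return int(ACQUISITION_S // REPETITION_S)
```

On these assumptions a 96-well plate takes about an hour, which is the order of magnitude relevant to the batches of four or six plates mentioned in the text.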
  • Spectral data processing for each 96-well titer plate was done using a modified "vastglue" macro.
  • This macro creates a pseudo-2D data set from a collection of contiguously indexed 1D spectra. The macro performs a weighted Fourier transform on each free induction decay. Once this process was completed, the "fbc" macro was executed to baseline correct the entire suite of NMR spectra. The resulting set of 1H 1D spectra was then viewed as a stack plot of only the region from 2.7 ppm to 3.1 ppm.
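Restricting a stack of processed 1D spectra to the 2.7-3.1 ppm window can be illustrated with plain Python lists. This is not the "vastglue" or "fbc" macro and does not reproduce the Varian software; the function names and the intensity threshold are hypothetical, and the text describes manual inspection of the stack plot rather than automated flagging.

```python
def ppm_window(ppm_axis, spectrum, lo=2.7, hi=3.1):
    """Return the intensities whose chemical shift lies within [lo, hi] ppm."""
    return [y for x, y in zip(ppm_axis, spectrum) if lo <= x <= hi]


def flag_candidates(ppm_axis, spectra, threshold, lo=2.7, hi=3.1):
    """Return indices of spectra whose maximum intensity in the window exceeds threshold.

    The threshold is an illustrative parameter, not a value from the source.
    """
    flagged = []
    for index, spectrum in enumerate(spectra):
        window = ppm_window(ppm_axis, spectrum, lo, hi)
        if window and max(window) > threshold:
            flagged.append(index)
    return flagged
```

A simple threshold of this kind would only narrow down which spectra to look at; any flagged spectrum would still need to be examined in full, as the text does for positive hits.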
  • This data processing can be extended to include the use of the "combishow" macro. This macro will only execute successfully when all 96 spectra are included in the data set. Issues relating to the reliability of the Tecan to produce consistent culture plates and issues with culture contamination did not allow for the reliable use of "combishow" for data display.
  • Deuterated medium inhibits culture growth and limits the variety of strains that can be analyzed with this technique.
  • E. coli cell growth is reduced to one-third of the growth in normal medium. This reduction in biomass translates to a reduction in biocatalytic activity and hence a reduction in the sensitivity with which biocatalysis can be detected.
  • A novel culture preparation was devised. Details are provided in the methods section above. The key aspect of this technique is resuspension of the cells in minimal medium during the biocatalysis phase.
  • The library DNA was digested with HindIII and then separated by agarose gel electrophoresis. Since HindIII cuts once in the vector DNA, all the molecules are expected to be linearized. If a DNA molecule does not contain any insert, it will appear as a 5.4 kb empty vector fragment. However, molecules containing inserts will be larger than 5.4 kb, depending on the size of the insert they contain. Most of the library DNA migrated as a diffuse band of 7 to 8 kb. This result indicates that the insert size of the library DNA varies between 1.5 and 2.5 kb, as the size of the vector itself is 5.4 kb. A thin 5.4 kb band representing empty vector molecules was also observed. The empty vector band is approximately 10% of the intensity of the insert-containing band. Therefore, approximately 10% of the library DNA consists of empty self-ligated vector DNAs.
  • The library DNA was amplified using primers based on vector sequences flanking the multiple cloning site. A product that varied between 1.5 and 2.5 kb in size was observed by agarose gel electrophoresis. This result confirms the previous observation that the library contains random pieces of P. fluorescens genomic DNA ranging from 1.5 to 2.5 kb in size.
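The insert size inferred from the restriction digest is simply the band size minus the vector size. A minimal sketch follows, with a hypothetical function name; band sizes read from a gel are approximate, so the result is only expected to agree roughly with the 1.5-2.5 kb range stated in the text.

```python
VECTOR_KB = 5.4  # size of the empty linearized vector (from the text)


def insert_size_range(band_lo_kb, band_hi_kb, vector_kb=VECTOR_KB):
    """Estimate the insert size range (kb) from the observed band size range."""
    return (round(band_lo_kb - vector_kb, 1), round(band_hi_kb - vector_kb, 1))


estimated = insert_size_range(7, 8)
```

For the 7 to 8 kb band this gives about 1.6 to 2.6 kb, which is consistent with the nominal insert range within the resolution of an agarose gel.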
  • Pseudomonas fluorescens strain MB214 and subsequently screened for nitrilase activity.
  • The modified sampling system described above provided errorless data acquisition as long as the titer plates were prepared with sufficient consistency.
  • The modified plug-flow method is robust to variations in liquid level per sample well. The most severe error conditions occurred when wells were empty, the liquid level was below the cannula position, or there was inadequate mixing of the supernatant with the TSP/D2O solution.
  • Plate analysis was done in batches of four or six plates. Data analysis was performed manually subsequent to the completion of an automation run. Using an analysis procedure that interrogates only the region of presumed product formation for nitrilase activity, product formation was detected in well B10 of plate 115. It was defined as positive hit 115-B10. The resulting spectrum is shown in Figure 5. Multiple product formation from a single substrate was observed.
  • Plasmid Recovery and Downstream Molecular Biology
  • The positive hit 115-B10 was recovered from the glycerol freeze samples of the library. Cells were streaked out on LB/Ap plates to obtain single colonies for cultivation. Purification of the plasmid DNA afforded the plasmid "115-B10." To confirm and characterize the clone, the purified DNA was digested with restriction enzymes. The results confirmed that 115-B10 was a plasmid with an insert in vector pET21b(+). The length of the genomic insert was estimated to be 1.8 kb by digesting the clone with several different restriction endonucleases and comparing the digested DNA with standard size markers by agarose gel electrophoresis.
  • The construct that was able to catalyze the conversion of hydrocinnamonitrile to its corresponding acid upon expression in E. coli was completely sequenced to identify the gene that encodes the enzyme capable of this catalysis.
  • The insert was found to be a 1766 base pair DNA (see Figure 6), and the gene was determined to comprise a 924 bp-long open reading frame (ORF) (SEQ ID NO: 9) that encodes a putative protein consisting of 308 amino acids (SEQ ID NO: 10) with a predicted molecular weight of 35.4 kD (see Figure 7). This gene is defined as DOW2447.
  • M9 salts solution (1 L) contained Na2HPO4 (6 g), KH2PO4 (3 g), NH4Cl (1 g), and NaCl (0.5 g).
  • The Partial-Deu-Medium (1 L) consisted of 990 mL of M9 salts, MgSO4 (0.12 g), and 10 mL of BIO-EXPRESS 1000 (U-D, 98%, 10x concentrated) purchased from Cambridge Isotope Laboratory, Inc. The D2O (99.9%), which was used as the NMR solvent, was purchased from Aldrich.
  • LB medium was used in all experiments as normal medium for cultivation of microbes. It was prepared using normal deionized water.
  • IPTG: isopropyl-β-D-thiogalactopyranoside
  • Primers were designed to amplify the open reading frame from the Pseudomonas fluorescens MB214 genome. Based on the sequence of clone 115-B10, identified from the expression library using HTP-NMR technology, the primer sequences were designed with Spe I and Xho I sites so that the product could be directly cloned into the pMYC1803 vector.
  • The primer sequences are as follows:
  • Forward - GACTAGTCAGGAGGAATAATATGCCCGTATCGACTGTCGC (SEQ ID NO:7); and Reverse - CCCrCG GGGTCAGTCAGTGATATAGCGAA (SEQ ID NO:8).
  • The product was amplified via PCR with an annealing temperature of 53°C for 35 cycles. The product was then run on an agarose gel to separate any other nonspecific PCR products and purified. Both the vector and the DOW2447 product were digested with Spe I and Xho I and then ligated together. The construct was then transformed into P. fluorescens MB214 and E. coli JM109 to test for expression.
  • The plasmid pPM1 was created by cloning the amplified ORF of the DOW2447 gene into shuttle vector pMYC1803 (Figure 8).
  • The 5' primer was designed such that a ribosome binding site (RBS) was included to facilitate protein synthesis.
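The design features attributed to the forward primer (a Spe I site, a ribosome binding site, and the start of the ORF) can be located in the sequence given as SEQ ID NO:7. The sketch below is illustrative: the choice of AGGAGG as the RBS motif is an assumption, since the text does not spell out which bases form the RBS, and the function name is hypothetical.

```python
FORWARD_PRIMER = "GACTAGTCAGGAGGAATAATATGCCCGTATCGACTGTCGC"  # SEQ ID NO:7 as given in the text
SPEI_SITE = "ACTAGT"    # Spe I recognition sequence
RBS_MOTIF = "AGGAGG"    # Shine-Dalgarno-like motif (assumed to be the RBS meant in the text)
START_CODON = "ATG"


def primer_features(primer):
    """Locate the Spe I site, the assumed RBS motif, and the first ATG after the RBS.

    Positions are 0-based; -1 means the feature was not found.
    """
    spei = primer.find(SPEI_SITE)
    rbs = primer.find(RBS_MOTIF)
    rbs_end = rbs + len(RBS_MOTIF)
    start = primer.find(START_CODON, rbs_end) if rbs >= 0 else -1
    spacer = start - rbs_end if start >= 0 else None
    return {"spei": spei, "rbs": rbs, "start": start, "spacer": spacer}


features = primer_features(FORWARD_PRIMER)
```

The spacer value reports how many bases separate the assumed RBS motif from the start codon, which is the quantity that typically matters for translation initiation.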
  • The vector contains a strong tac promoter and a lac operon so that the gene expression of DOW2447 can be regulated by the addition of IPTG when the plasmid is transformed into strains containing lacI in the genome.
  • The vector also contains replicons for propagation in both P. fluorescens and E. coli, which makes the plasmid pPM1 suitable to be analyzed in both the P. fluorescens host and the E. coli host.
  • Although the gene DOW2447 was derived from the P. fluorescens genome, biocatalysis was analyzed primarily in E. coli host JM109.
  • The plasmid pPM1 also contains an antibiotic resistance marker, making it resistant to tetracycline.
  • Expression of the DOW2447 protein in E. coli JM109/pPM1 and P. fluorescens MB214/pPM1 was analyzed by SDS-PAGE. As controls, E. coli JM109 and P. fluorescens MB214 were cultivated under the same conditions. Samples were taken immediately before the addition of IPTG and 3 hours after induction. SDS-PAGE was run according to the procedures described in the Experimental Section. Before induction, there was low expression of the 33 kDa DOW2447 in both JM109/pPM1 and MB214/pPM1 as compared with the JM109 and MB214 controls. Significant DOW2447 was produced in both JM109/pPM1 and MB214/pPM1 3 hours after the addition of IPTG. The expressed DOW2447 was estimated to constitute 40% of total protein in both cases.
  • E. coli JM109/pPM1 was subjected to the study of biocatalysis. Cultivation of E. coli JM109/pPM1 was carried out as described in the experimental section. E. coli JM109 was used as a control. The reaction was initiated by addition of the substrate hydrocinnamonitrile (3.6 mM) and was carried out at 30°C with agitation at 250 rpm. Samples were taken at 0 hours, 5 hours, 22 hours, 28 hours, and 40 hours and subsequently analyzed by proton NMR for product formation.
  • The cells from each time point were also subjected to SDS-PAGE analysis to view their protein profiles.
  • The overexpressed 33 kDa nitrilase DOW2447 was present for the JM109/pPM1 culture but was absent for the JM109 culture, indicating that the nitrilase enzyme is fairly stable.
  • The focus was initially on the disappearance of the acid in the reaction catalyzed by JM109/pPM1 after 22 h. Both JM109/pPM1 and JM109 were cultivated as described in the experimental section. The acid (4 mM) was used as the only substrate for possible reactions for both cells.
  • Hydrolysis is the most common reaction in the microbial metabolism of nitriles, and it proceeds via the formation of the corresponding carboxylic acid and ammonia.
  • Two types of hydrolysis reaction are reported in the literature. In the first type, the end products are formed directly without any intermediate, catalyzed by nitrilases (EC 3.5.5.1, nitrile aminohydrolase).
  • The second type of hydrolysis is by the action of a two-enzyme system consisting of a nitrile hydratase (EC 4.2.1.84) that converts nitrile to amide and an amidase (EC 3.5.1.4) that converts amide to the corresponding carboxylic acid and ammonia (Ramakrishna et al., J. Sci. & Indust. Res. 58:925-47 (1999); Kobayashi et al., Current Opinion Chem. Biol. 4:95-102 (2000)).
  • The DOW2447 nitrilase is a bifunctional enzyme. It has either nitrilase and nitrile hydratase activities, or nitrile hydratase and amidase activities.
  • Hydrocinnamamide was used as the substrate for the biocatalysts in order to differentiate between these two possibilities.
  • The hydrocinnamamide was synthesized since it is not commercially available. Synthesis of hydrocinnamamide was accomplished through hydrogenation of cinnamamide using H2 (1 atm [101325 Pa]) over 10% Pd/C. The reaction was carried out at room temperature in methanol solvent. A 100% yield was achieved after 4 hours of reaction.
  • The hydrocinnamamide was purified by filtration through CELITE (Celite Corp., Lompoc, California, USA) and concentration.
  • Nitrilase activity appeared dominant since approximately twice as much acid as amide was formed at 5 hours and 22 hours for JM109/pPM1 when hydrocinnamonitrile was used as the substrate.
  • This example of the DOW2447 enzyme clearly demonstrates the advantage of using 1H NMR spectroscopy to elucidate novel gene function.
  • The nitrilase activity was assayed by detecting the release of ammonia, which forms a blue color in the presence of sodium phenate (i.e., sodium phenoxide), hypochlorite, and sodium nitroprusside (Fawcett et al., J. Clin. Path. 13:156 (1960)).
  • The colorimetric assay would not detect the nitrile hydratase activity of the DOW2447 enzyme.
  • The detection of the "byproduct" hydrocinnamamide by NMR provided a solid basis for the mechanism study.
  • A single colony from the JM109/pPM1 strain was inoculated into 50 mL of LB culture with tetracycline (Tet) added to maintain the plasmid.
  • The culture was grown at 37°C with 280 rpm agitation. After 7 hours, the OD600 reached 0.6 to 0.7 and IPTG was added to the culture to a final concentration of 0.25 mM. The cultivation was continued overnight.
  • The culture was centrifuged for 10 min at 4000 rpm, and the pellet was rinsed with 50 mL of M9 medium. The pellet was then resuspended in an equal volume of Partial-Deu-Medium with Tet added.
  • A 5 mL aliquot of the resuspended culture was transferred into a new 15 mL FALCON tube and the two nitrile compounds from the same group were added to initiate the biotransformation.
  • The biotransformation was carried out for 48 hours at 37°C with agitation of 280 rpm.
  • The hydrocinnamic acid detected at 48 hours differs from the previous observation that the acid was catabolized by E. coli when hydrocinnamonitrile was used as the sole substrate. This indicated that the 4-CN-phenol and/or its corresponding acid (amide) might inhibit the induced enzyme activities that were responsible for the degradation of the hydrocinnamic acid.
  • In group 2, by 48 hours, of the 5 mM hydrocinnamonitrile that was added to the reaction, the only product detected by NMR was hydrocinnamamide (2 mM), indicating that 60% of the hydrocinnamonitrile was converted to the acid, which was further catabolized by the E. coli.
  • For glutaronitrile, since it has a CN group at each end of the carbon chain, a variety of products were formed, representing the different hydrolysis outcomes for both CN groups (Figure 10). All glutaronitrile was converted by 48 hours.
  • The enzyme DOW2447 was able to use a variety of nitriles as its substrates. These include both aromatic and aliphatic nitriles. NMR technology can not only detect reactions with multiple substrates, it can also reveal the formation of multiple products. In addition, by adding several substrates simultaneously into one reaction, the technology can also be used to differentiate a "good" substrate from a "not-so-good" substrate.
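The 60% figure quoted for the group 2 reaction is a mass balance: whatever fraction of the added nitrile is not accounted for by detected products is attributed to acid that was subsequently catabolized. The sketch below assumes that no unreacted hydrocinnamonitrile remained at 48 hours, which follows from the text reporting the amide as the only detected species; the function name is hypothetical.

```python
def fraction_unaccounted(substrate_mM, detected_products_mM):
    """Fraction of the added substrate not recovered as detected products.

    Assumes no unreacted substrate remains (see the lead-in).
    """
    return (substrate_mM - sum(detected_products_mM)) / substrate_mM


# 5 mM hydrocinnamonitrile added, 2 mM hydrocinnamamide detected at 48 hours
to_acid = fraction_unaccounted(5, [2])
```

The same function could be applied to any substrate group, provided every detected product is expressed in the same concentration units as the substrate.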
  • Nitrilase and nitrile hydratase hydrolyze a number of structurally diverse nitriles.
  • Several commercially important organic compounds such as p-aminobenzoic acid, benzamide, 2,6-difluorobenzamide, nicotinamide, isonicotinamide, etc. have been prepared from the corresponding nitriles using the nitrilase or nitrile hydratase catalyzed reactions.
  • Certain nitrilases can only work on certain nitriles.
  • The enzyme DOW2447 was used to demonstrate the application of the NMR technology in the quick screening of substrate specificity.
  • The E. coli JM109/pPM1 cells were cultivated as described in the experimental section.
  • Sample handling of a Varian UNITY INOVA NMR system run under Unix can be automated in "tubeless" format to allow rapid screening of samples.
  • The purpose of this example was to optimize screening for samples that are essentially identical in composition. The occurrence of a positive sample is expected to be on the order of 1 in 20,000 samples. Sample carry-over or contamination of one sample with another was not considered a serious problem because "hits" are only occasionally expected. If a hit is encountered, the samples will be re-analyzed more completely.
  • The procedure described in this example is utilized in high throughput screening. The method described herein has been termed "plug-flow NMR." It allows for the continuous loading of the NMR flow-probe.
  • The Varian NMR system includes, as optional equipment, a Gilson 215 laboratory robot. That robot was used to automate sample injection.
  • A schematic of the VAST accessory in normal operation is shown in Figure 12. To facilitate the plug-flow approach, a 3-way slider valve and a sample loop were added to the apparatus. A simplified schematic of the modified sample transport system is shown in Figure 13. Positioning the slider valve on the cannula control arm eliminates the additional time required to position the cannula over the Gilson injection port. Alternatively, a flexible diaphragm valve (e.g., a Burkert 3-way miniature diaphragm valve, Butler and Land, Dallas, TX) is used in place of the 3-way slider valve.
  • Figure 14 shows a schematic of a system loading process used in some embodiments.
  • An automated liquid handling system is used to sequentially direct the sample loading tube input to different sample containers (e.g., in an array such as that shown on the left panel of Figure 14).
  • For sample delivery to the NMR flow probe, the syringe pump aspirates a sample volume from a titer plate well, approximately 300 μL.
  • The outlet tubing from the syringe pump is attached to the common port of a pneumatically actuated 3-way valve (see, e.g., the center panel of Figure 14).
  • The internal diameter of the slider valve passage is approximately 0.8 mm.
  • The sample is temporarily stored in the sample loop, which is a piece of tubing that connects the cannula with the 3-way valve.
  • The volume of the sample loop is equal to the sample volume + 25 μL.
  • The 3-way valve is positioned so that the open channel is from the cannula/sample loop to the syringe pump.
  • The 3-way valve is a slider valve (see, e.g., Figure 18).
  • 3-way rotary valves, 3-way ball valves, and 3-way diaphragm valves are used.
  • A 3-way diaphragm valve is used.
  • These types of valves are commercially available.
  • 3-way rotary and 3-way slider valves can be obtained from Chrom Tech, Inc. (Apple Valley, Minnesota), and 3-way diaphragm valves can be obtained from Fluid Process Control Corp (Burr Ridge, Illinois).
  • The modified system operates as follows.
  • The slider valve is switched to connect the syringe pump to the NMR sample line.
  • The sample volume is injected into the PEEK tubing that is attached to the inlet of the NMR flow-probe. Internally, the actual injection volume is calculated as SampleVolume + PushVolume + 25 μL. SampleVolume and PushVolume can be set in the software. After the sample is injected into the NMR sample line, the NMR software is notified that the sample has been injected. At this point, the syringe pump has been emptied, and the data acquisition process is initiated. Concurrently, the slider valve is released so that the syringe pump is connected to the cannula/sample loop. The cannula moves to the next sample position and aspirates the next sample into the sample loop.
  • SampleVolume is set such that the solution will be drawn up into the tubing exactly to the slider valve, but not into the valve.
  • The length of sample tubing between the slider valve and the NMR cavity was determined such that it would hold slightly less than twice the volume that was injected. This allows some of the injection plug to be pushed through the NMR cavity, and the "center" of the sample is then in the center of the NMR sample cell.
  • The injection sequence continues and after the next injection, a second sample is drawn through the cannula up to the slider valve and the first sample is drawn through the slider valve and into the tubing that connects the syringe pump to the slider valve.
  • When the third injection is initiated, the first sample is pushed into the first section of the NMR sample line, and when the fourth injection is effected, the first sample is finally pushed into the NMR sample cell.
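Because a sample only reaches the NMR cell on the fourth injection, each spectrum has to be mapped back to a sample that was aspirated three injections earlier. The following sketch encodes that offset as read from the description above; the row-major well ordering (A1, A2, ... H12) and all function names are assumptions made for illustration.

```python
PIPELINE_OFFSET = 3  # sample k reaches the NMR cell on injection k + 3 (as read from the text)


def sample_for_injection(injection_number, offset=PIPELINE_OFFSET):
    """Return the 1-based sample number in the NMR cell after a 1-based injection.

    Returns None while the line is still being primed.
    """
    sample = injection_number - offset
    return sample if sample >= 1 else None


def well_name(index, columns=12):
    """Map a 0-based well index to a name such as 'A1', assuming row-major order."""
    row, col = divmod(index, columns)
    return "ABCDEFGH"[row] + str(col + 1)
```

A mapping of this kind is what allows a spectrum of interest to be traced back to a plate position such as well B10.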
  • This Example describes modifications of the programming code that were used to instruct the Gilson 215 auto-sampler for the implementation of the "plug-flow" method.
  • The key attributes of these modifications include the removal of the sample retrieval step, addition of an "air-plug", and alteration of the sample aspiration procedure. Additionally, data acquisition and sample manipulation are allowed to occur simultaneously, hence reducing the overhead sampling time and increasing sample throughput capacity.
  • The operation of the VAST accessory is discussed in the Varian Manual. Files that control the operation of the VAST accessory are found in VNMR/ASM and its subdirectories. In particular, one subdirectory, VNMR/ASM/TCL, contains TCL (Tool Command Language) scripts, which are used to control the injection of samples.
  • The scripts include the following seven files: GET.TCL
  • TCL files had to be modified in order to effect the desired action.
  • The following files were changed: GET.TCL, INJECT.TCL, RETRIEVE.TCL, and WASH.TCL.
  • The other three files were not changed.
  • The modified code is as follows:
    source $env(vnmrsystem)/asm/tcl/wash.tcl
    source $env(vnmrsystem)/asm/tcl/mix.tcl
    source $env(vnmrsystem)/asm/tcl/transfer.tcl
    source $env(vnmrsystem)/asm/tcl/retrieve.tcl
    source $env(vnmrsystem)/asm/info/default
    source $env(vnmrsystem)/asm/info/racks
    set PushVolume [expr ($ProbeVolume - $SampleVolume)]
    if {$PushVolume < 0.0} { set PushVolume 0.0 }
  • The INJECT routine operates as follows. In the following discussion, command lines of the software code are indicated in italics. The two initial lines, proc inject ..., and global ... are instructions to the command parser. The first line states that this procedure is called inject, and defines the parameters that are passed. The second line identifies global variables that the subroutine requires. The first line of code, gStopTestAll, is called to ensure that no syringe or robot arm motion is taking place. This would happen if one call to INJECT.TCL overran another call. This should never happen.
  • The second command, gSetContacts 1 1, activates one of the sets of contacts on the back of the Gilson 215.
  • The first "1" indicates the contact number (there are 4 of them), and the second indicates that the contacts should be closed. This is followed by a 50-msec delay.
  • A pneumatic valve driver was constructed and mounted beneath the VAST accessory. This valve driver box contains two air valves, which push and pull the pneumatic valve. Only one of the two air valves is used in this project. The other exists in the event that a second valve is needed. The command is followed by a 50-msec delay that ensures that the contacts are closed and that any unequal pressures between the valve's ports have equalized.
  • The syringe pump (which is already primed) is connected directly to the NMR's sampling probe.
  • The next block of commands pushes sample through the slider valve and eventually into the probe.
  • The first command, gAspirate 0 0 0, resolves an apparent bug in the Gilson's programming.
  • The second command, gDispense [expr ($ProbeVolume + 25)] ProbeFastRate 0, injects "ProbeVolume" + 25 μL toward the probe.
  • The 25 μL accounts for the volume of an air bubble, which was aspirated and used to keep samples apart. This is followed by a 1/5-second delay, which ensures that the volume has been moved.
  • The ResumeAcq command is issued. This command sends a command to the NMR console, instructing it that the sample has been changed and that it may continue with data acquisition.
  • The INJECT.TCL script has not yet been completed; it still needs to load the next sample. However, loading occurs simultaneously with data acquisition, thus increasing the system's efficiency.
  • A variable, PushVolume, is calculated as the difference between ProbeVolume and SampleVolume. The PushVolume is the quantity of solvent (D2O) that is loaded into the syringe and pushed toward the probe after the sample volume. This helps wash out the sample.
  • The PushVolume is loaded by calling gFlush.
  • gFlush takes three parameters. The first parameter is the volume, the second parameter is the flow rate while filling the syringe, and the third parameter is the flow rate while dispensing from the syringe. In this case, the third parameter is zero. This causes solvent to flow into the syringe from the reservoir, but it is not dispensed.
  • The cannula is then moved to the next sample vial and into the sample.
  • A volume, SampleVolume, is aspirated from the sample well.
  • The cannula is then lifted out of the sample and a 25 μL plug of air is aspirated into the sample line.
  • The final step is gStopTestAll, which causes the system to wait until all arm and syringe movement has halted. The subroutine then exits.
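The volume bookkeeping in the INJECT routine can be restated compactly. The sketch below is a Python paraphrase of the arithmetic described in the text, not the TCL script itself: PushVolume is the probe volume minus the sample volume, clamped at zero, and the total injected volume adds the 25 μL air plug. The function names are hypothetical.

```python
AIR_PLUG_UL = 25.0  # air plug that separates consecutive samples (from the text)


def push_volume(probe_volume_ul, sample_volume_ul):
    """Solvent volume pushed after the sample; never negative."""
    return max(probe_volume_ul - sample_volume_ul, 0.0)


def injection_volume(probe_volume_ul, sample_volume_ul, air_plug_ul=AIR_PLUG_UL):
    """Total injected volume: SampleVolume + PushVolume + air plug."""
    return (sample_volume_ul
            + push_volume(probe_volume_ul, sample_volume_ul)
            + air_plug_ul)
```

Whenever the sample volume does not exceed the probe volume, the total equals ProbeVolume + 25 μL, which matches the argument passed to gDispense in the description above.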
  • The invention is not limited to the injection configuration described above.
  • One alternative configuration utilizes two valves.
  • One valve is placed on the liquid handling robot and the other is placed beneath the NMR jacket close to the flow-through cell. Because electric valves may interfere with the NMR or vice-versa, the valves are pneumatically actuated.
  • The valve actuator that was installed in the previous system contains two electrically actuated air valves. Each pneumatic line contains two tubes so the slider valves can be pushed and pulled into place rather than requiring a spring return on the valve.
  • Figure 15 shows one alternate configuration.
  • Slider Valve A is arranged similarly to that shown in Figure 13.
  • Slider Valve B is mounted beneath the NMR magnet as close as reasonable to the injection port.
  • A relatively large quantity of solution is drawn through the cannula and into the tubing connecting Slider Valve A to the syringe pump. This is followed by a large air bubble, drawn through the cannula and past the slider valve.
  • The air bubble displaces everything in the cannula and connecting tubing.
  • Sufficient air is drawn through the valve to act as a separator between the previous sample and the current sample.
  • Slider Valve A is toggled to the inject position and the syringe pushes the sample toward the NMR.
  • The "PushVolume" is aspirated into the syringe through a valve on the tip of the syringe.
  • The cannula is inserted into the next sample vial and a volume of that sample is drawn through the cannula into the tubing as previously described.
  • The cannula is then moved out of the sample and a large air bubble is drawn through the cannula and into the tubing to fill the slider valve and beyond. At least 50 μL of air should be drawn through Slider Valve A.
  • The lengths of connecting tubing are not drawn to scale. Very long pieces of tubing may be required to hold the sample.
  • The advantage of the alternate injection scheme is that the lines are flushed with a greater sample volume, thus reducing carryover from one sample to the next.
  • The separating air bubble and the initial plug of sample, which collects small amounts of the previous sample that may have adhered to the walls of the tubing, are pushed into the waste container. This greatly reduces sample contamination.
  • The relatively large volume of sample "washes" the tubing more effectively than a smaller sample.
  • A large sample also means that the "offset" between the current sample and the NMR spectrum number is reduced.
  • The aspirated air bubble provides a separator between samples. This barrier is much better than assuming that little or no diffusion takes place between samples. However, air bubbles are compressible and can reduce the pressure transmitted down the tubing. Thus, the air bubble in the line must be kept fairly small. This scheme provides a way to remove the air bubble before it reaches the NMR probe. An air bubble in the NMR probe will compromise the deuterium lock and the sample homogeneity.
  • Sample carry-over using the modified plug-flow approach is 4 samples. This amount of carryover is consistent with what was previously observed when no probe wash was used with the standard Gilson configuration. It is also evident from the intensities of the contours that subsequent concentrations are diluted nearly exponentially.
  • The development and implementation of the novel plug-flow approach to increase sample throughput using a flow-probe NMR system has been demonstrated.
  • The method allows for continuous NMR sampling using a direct-injection configuration, reduces the possibility of spectrometer errors during automated data acquisition, uses a minimal sample volume of 250-300 μL, allows for accurate mapping of sample location, and gives sample carry-over equivalent to what is observed utilizing the standard Varian VAST approach.
  • The relatively small sample volume of 250 μL allows for approximately 1000 μL per sample well to be retained.
  • Sample throughput can be increased by as much as a factor of five using the plug-flow approach.
  • The current method utilizes an air plug to separate the sample plugs.
  • An alternative approach has been presented that would use a second 3-way valve to facilitate sample isolation and reduce sample carryover. This approach would require large injection volumes. Further reductions in sample analysis time can be realized by increasing sample flow rates. This can be achieved by increasing the inner diameter of the flow-probe tubing, hence reducing the system backpressure.

Abstract

The present invention relates to methods and apparatus for screening for biocatalysts. It provides high-throughput methods based on nuclear magnetic resonance (NMR) for the systematic screening of biocatalysts, so as to identify biocatalytic activities and to discover biocatalysts. The invention also relates to improved high-throughput NMR-based analysis systems.
PCT/US2004/012447 2003-04-25 2004-04-22 Use of NMR and deuterium for the discovery of biocatalysts and biocatalytic activities WO2004097028A2 (fr)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US46569903P 2003-04-25 2003-04-25
US60/465,699 2003-04-25

Publications (2)

Publication Number Publication Date
WO2004097028A2 WO2004097028A2 (fr) 2004-11-11
WO2004097028A3 WO2004097028A3 (fr) 2009-03-26

Family

ID=33418273

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2004/012447 WO2004097028A2 (fr) 2003-04-25 2004-04-22 Use of NMR and deuterium for the discovery of biocatalysts and biocatalytic activities

Country Status (1)

Country Link
WO (1) WO2004097028A2 (fr)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6380737B1 (en) * 2001-07-10 2002-04-30 Varian, Inc. Apparatus and method utilizing sample transfer to and from NMR flow probes
US20030044800A1 (en) * 2000-09-05 2003-03-06 Connelly Patrick R. Drug discovery employing calorimetric target triage

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
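WO2011041892A1 (fr) * 2009-10-09 2011-04-14 Carolyn Slupsky Methods for the diagnosis, treatment and monitoring of patient health using metabolomics
CN108414562A (zh) * 2018-05-11 2018-08-17 Institute of Medicinal Biotechnology, Chinese Academy of Medical Sciences Method for determining the cimetidine content of cimetidine injection
CN108982390A (zh) * 2018-09-07 2018-12-11 South China Agricultural University Method for detecting pesticide residues in water based on atomic absorption spectral information
CN110609056A (zh) * 2019-06-25 2019-12-24 Peking University Method for preparing RNA solid samples that yield high-quality solid-state NMR spectra, and method for detecting their structure
CN110609056B (zh) * 2019-06-25 2020-09-29 Peking University Method for preparing RNA solid samples that yield high-quality solid-state NMR spectra, and method for detecting their structure
CN114018964A (zh) * 2021-10-15 2022-02-08 East China Normal University Method for detecting the degradation of tetracycline organic pollutants in wastewater based on in situ high-field NMR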

Also Published As

Publication number Publication date
WO2004097028A3 (fr) 2009-03-26

Legal Events

Date Code Title Description
AK Designated states. Kind code of ref document: A2. Designated state(s): AE AG AL AM AT AU AZ BA BB BG BR BW BY BZ CA CH CN CO CR CU CZ DE DK DM DZ EC EE EG ES FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MA MD MG MK MN MW MX MZ NA NI NO NZ OM PG PH PL PT RO RU SC SD SE SG SK SL SY TJ TM TN TR TT TZ UA UG US UZ VC VN YU ZA ZM ZW
AL Designated countries for regional patents. Kind code of ref document: A2. Designated state(s): BW GH GM KE LS MW MZ SD SL SZ TZ UG ZM ZW AM AZ BY KG KZ MD RU TJ TM AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HU IE IT LU MC NL PL PT RO SE SI SK TR BF BJ CF CG CI CM GA GN GQ GW ML MR NE SN TD TG
121 EP: the EPO has been informed by WIPO that EP was designated in this application
NENP Non-entry into the national phase in: Ref country code: DE
122 EP: PCT application non-entry in European phase