WO2004097027A2 - Systematic screening of biocatalysts in plants - Google Patents

Systematic screening of biocatalysts in plants

Publication number
WO2004097027A2
Authority
WO
WIPO (PCT)
Prior art keywords
plant
screening
enzyme
gene
gene library
Prior art date
Application number
PCT/US2004/012446
Other languages
English (en)
Other versions
WO2004097027A3 (fr)
Inventor
Venkiteswaran Subramanian
Lada Rasochova
Kai Li
Dipnath Baidyaroy
Claire B. Conboy
Original Assignee
Dow Global Technologies Inc.
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Dow Global Technologies Inc.
Publication of WO2004097027A2
Publication of WO2004097027A3

Classifications

    • CCHEMISTRY; METALLURGY
    • C12BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12NMICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09Recombinant DNA-technology
    • C12N15/63Introduction of foreign genetic material using vectors; Vectors; Use of hosts therefor; Regulation of expression
    • C12N15/79Vectors or expression systems specially adapted for eukaryotic hosts
    • C12N15/82Vectors or expression systems specially adapted for eukaryotic hosts for plant cells, e.g. plant artificial chromosomes (PACs)
    • C12N15/8241Phenotypically and genetically modified plants via recombinant DNA technology
    • C12N15/8242Phenotypically and genetically modified plants via recombinant DNA technology with non-agronomic quality (output) traits, e.g. for industrial processing; Value added, non-agronomic traits
    • C12N15/8257Phenotypically and genetically modified plants via recombinant DNA technology with non-agronomic quality (output) traits, e.g. for industrial processing; Value added, non-agronomic traits for the production of primary gene products, e.g. pharmaceutical products, interferon
    • C12N9/00Enzymes; Proenzymes; Compositions thereof; Processes for preparing, activating, inhibiting, separating or purifying enzymes
    • C12N9/14Hydrolases (3)
    • C12N9/16Hydrolases (3) acting on ester bonds (3.1)
    • C12N9/18Carboxylic ester hydrolases (3.1.1)
    • C12N9/20Triglyceride splitting, e.g. by means of lipase
    • C12N9/78Hydrolases (3) acting on carbon to nitrogen bonds other than peptide bonds (3.5)

Definitions

  • the present invention relates to compositions and methods for the screening of biocatalysts.
  • the present invention relates to methods of screening novel plant biocatalysts in plants and plant products.
  • One screening method involves the production of plant viral nucleic acids comprising exogenous nucleic acids encoding proteins or enzymes of interest and the infection of host plants with the virus.
  • the recombinant plant viral nucleic acids are stable, capable of systemic infection and capable of stable transcription or expression in the plant host of the non-native nucleic acid sequences.
  • screening requires the isolation of enzymes from plants, which is time consuming and can be difficult for enzymes expressed in small quantities, membrane bound enzymes, or multi-subunit enzymes.
  • the present invention provides a method, comprising providing a population of vectors encoding an exogenous plant gene library; and a plant host; and contacting the plant host with the population of vectors under conditions such that the plant host expresses the exogenous plant gene library; and screening the plant host for the presence or absence of plant enzymatic activity, wherein the presence of enzymatic activity is indicative of an activity encoded by the exogenous plant gene library.
  • the plant host comprises a plant or a plant part.
  • the plant part comprises leaves, roots, stems, protoplasts, or plant cell cultures.
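  • the exogenous plant gene library encodes a plurality of enzymes (e.g., including, but not limited to, a nitrilase, a P450 enzyme, a fatty acid modifying enzyme, a dehalogenase, a lipase, a phosphatase, and a kinase).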
  • the screening comprises an enzyme activity assay.
  • the enzyme assay is performed on whole plant tissue.
  • the enzyme activity assay comprises the detection of products of an enzymatic reaction.
  • the detection comprises nuclear magnetic resonance spectrometry.
  • the screening comprises a high-throughput screening method. In certain embodiments, the high-throughput screening method is configured for the screening of at least 600, preferably at least 2000, and even more preferably at least 10,000 plant hosts per day.
  • the plurality of vectors are plant viral vectors.
  • the method further comprises the step of overexpressing the enzyme in a host cell.
  • the host cell is a bacterium.
  • the host cell is a plant cell, wherein the plant cell is part of a plant.
  • the present invention further provides a system, comprising a population of plant hosts transformed with an exogenous plant gene library; and a screening component configured for the detection of the presence or absence of a plant enzyme activity encoded by the exogenous plant gene library in the population of plant hosts.
  • the plant host comprises a plant or a plant part.
  • the plant part comprises leaves, roots, stems, protoplasts, or plant cell cultures.
  • the exogenous plant gene library encodes a plurality of enzymes (e.g., including, but not limited to, a nitrilase, a P450 enzyme, a fatty acid modifying enzyme, a dehalogenase, a lipase, a phosphatase, and a kinase).
  • the screening component is configured for performing an enzyme activity assay.
  • the screening component is configured for performing the enzyme activity assay on whole plant tissue.
  • the screening component is configured for the detection of products of an enzymatic reaction.
  • the screening component is configured for nuclear magnetic resonance spectrometry.
  • the screening component is configured for high-throughput screening.
  • the screening component is configured for the screening of at least 600, preferably at least 2000, and even more preferably at least 10,000 plant hosts per day.
  • the plurality of vectors are plant viral vectors.
  • the system further comprises a production component configured for the production of an enzyme encoded by the exogenous plant gene library.
  • the production component comprises a host cell for production of the protein.
  • the host cell is a bacterium.
  • the host cell is a plant cell, wherein the plant cell is part of a plant.
  • Figure 1 shows the genome organization of the GENEWARE vector.
  • Figure 2 shows an alignment of nucleotide sequences between (A) the published AtNIT2 sequence and the sequence obtained from the cloned gene in the construct and (B) the published AtNIT4 sequence and the sequence obtained from the corresponding cloned gene in the GENEWARE construct.
  • the published sequences appear as the lower strand with the nucleotides that are different between the sequences appearing in bold. Nucleotide changes that result in a change in the amino acid sequence are indicated with an asterisk (*).
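By way of illustration only, the kind of comparison described for Figure 2 can be sketched in Python. This is a minimal sketch, not the procedure used to prepare the figure: the function name `compare_coding_sequences` and the short test sequences are hypothetical, and the sketch assumes two in-frame, equal-length, uppercase DNA coding sequences translated with the standard genetic code.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def compare_coding_sequences(published, cloned):
    """List nucleotide differences between two in-frame coding sequences.

    Returns tuples of (1-based position, published base, cloned base,
    changes_amino_acid). The last field corresponds to the asterisk (*)
    used to mark changes that alter the amino acid sequence.
    """
    if len(published) != len(cloned) or len(published) % 3 != 0:
        raise ValueError("sequences must be equal length and a multiple of 3")
    differences = []
    for i, (p, c) in enumerate(zip(published, cloned)):
        if p != c:
            start = i - i % 3  # first base of the codon containing position i
            aa_published = CODON_TABLE[published[start:start + 3]]
            aa_cloned = CODON_TABLE[cloned[start:start + 3]]
            differences.append((i + 1, p, c, aa_published != aa_cloned))
    return differences


if __name__ == "__main__":
    # Hypothetical toy sequences, not taken from the figures.
    for pos, p, c, changed in compare_coding_sequences("ATGGCTGAA", "ATGGCCGAT"):
        print(pos, p, "->", c, "*" if changed else "")
```

In the toy input, the change at position 6 is silent (GCT and GCC both encode alanine), while the change at position 9 replaces glutamate with aspartate and would carry an asterisk.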
  • Figure 3 shows the nucleic acid sequence of the dehalogenase gene from Rhodococcus rhodochrous.
  • Figure 4 shows the nucleic acid sequence of the CAL-B gene from Candida antarctica.
  • Figure 5 shows a schematic of an exemplary method of the present invention.
  • Figure 6 shows a comparison of NMR spectra of tobacco whole leaves and crude extracts.
  • Figure 7 shows NMR spectra of hydrolysis reactions catalyzed with whole tobacco leaves.
  • Figure 8 shows an NMR spectrum of a hydrolysis reaction catalyzed with recombinant E. coli.
  • plant host refers to a plant or plant part.
  • plant part refers to any portion of a plant or plant substructure, including, but not limited to, leaves (detached or non-detached), roots, stems, fruits, flowers, or protoplast and cell cultures.
  • gene refers to a nucleic acid (e.g., DNA) sequence that comprises coding sequences necessary for the production of a polypeptide, RNA (e.g., including, but not limited to, mRNA, tRNA and rRNA) or precursor.
  • the polypeptide, RNA, or precursor can be encoded by a full length coding sequence or by any portion of the coding sequence so long as the desired activity or functional properties (e.g., enzymatic activity, ligand binding, signal transduction, etc.) of the full-length or fragment are retained.
  • the term also encompasses the coding region of a structural gene and includes sequences located adjacent to the coding region on both the 5' and 3' ends for a distance of about 1 kb on either end such that the gene corresponds to the length of the full-length mRNA.
  • the sequences that are located 5' of the coding region and which are present on the mRNA are referred to as 5' untranslated sequences.
  • the sequences that are located 3' or downstream of the coding region and that are present on the mRNA are referred to as 3' untranslated sequences.
  • gene encompasses both cDNA and genomic forms of a gene.
  • a genomic form or clone of a gene contains the coding region interrupted with non-coding sequences termed "introns” or “intervening regions” or “intervening sequences.”
  • Introns are segments of a gene that are transcribed into nuclear RNA (hnRNA); introns may contain regulatory elements such as enhancers. Introns are removed or “spliced out” from the nuclear or primary transcript; introns therefore are absent in the messenger RNA (mRNA) transcript.
  • nucleotide sequence refers to the full-length nucleotide sequence (e.g., of an enzyme identified using the methods of the present invention). However, it is also intended that the term encompass fragments of the sequence, mutants as well as other domains within the full-length nucleotide sequence.
  • nucleotide sequence or “polynucleotide sequence” encompasses DNA, cDNA, and RNA (e.g., mRNA) sequences.
  • exogenous gene refers to a gene that is not naturally present in a host organism or cell, or is artificially introduced into a host organism or cell.
  • exogenous plant gene library refers to a plurality of exogenous plant genes.
  • the exogenous genes in the exogenous plant gene library are related based on origin (e.g., all of the genes are derived from the same plant or are derived from a directed evolution experiment) or function (e.g., all of the genes are from a similar class of enzymes).
  • enzyme activity refers to any detectable activity of an enzyme, including, but not limited to, alteration of a substrate to generate a product.
  • enzyme activity assay refers to any assay for detecting an enzymatic activity. Suitable enzyme activity assays include, but are not limited to, nuclear magnetic resonance (e.g., high throughput nuclear magnetic resonance).
  • amino acid sequence is recited herein to refer to an amino acid sequence of a naturally occurring protein molecule
  • amino acid sequence and like terms, such as “polypeptide” or “protein” are not meant to limit the amino acid sequence to the complete, native amino acid sequence associated with the recited protein molecule.
  • genomic forms of a gene may also include sequences located on both the 5' and 3' end of the sequences that are present on the RNA transcript. These sequences are referred to as "flanking" sequences or regions (these flanking sequences are located 5' or 3' to the non-translated sequences present on the mRNA transcript).
  • the 5' flanking region may contain regulatory sequences such as promoters and enhancers that control or influence the transcription of the gene.
  • the 3' flanking region may contain sequences that direct the termination of transcription, post-transcriptional cleavage and polyadenylation. In some embodiments, the flanking sequences are used as "tags" for protein purification.
  • biocatalyst refers to any biological entity capable of catalyzing the conversion of a substrate into a product, especially a catalytic biomolecule or biomolecular assemblage.
  • biocatalyst includes, but is not limited to, organisms (e.g., live or dead single-cell organisms, multi-cell organisms); organs (e.g., live or dead); tissues (e.g., live or dead); cells (e.g., live or dead cells, protoplasts, spheroplasts); organism, organ, tissue, and cell homogenates and lysates; organism, organ, tissue, and cell fractions and isolates (e.g., organelles, microsomes, cytoplasts); environmental samples (e.g., biological entity-containing soils, sediments); enzymes (e.g., catalytic proteins, catalytic polypeptides); ribozymes (e.g., catalytic polynucleotides, catalytic
  • biocatalyst also includes catalytic biological complexes and derivatives that contain chemically diverse atoms or groups, i.e., atoms and groups that are chemically different from the biomolecule(s) with which they are complexed or to which they are attached.
  • Such catalytic biomolecular complexes and derivatives include, but are not limited to, for example, catalytic biological metallo-, organo-, organometallo-, phospho-, sulfo-, nitro-, boro-, glyco-, peptido-, and lipo- biomolecular complexes and derivatives.
  • wild-type refers to a gene or gene product that has the characteristics of that gene or gene product when isolated from a naturally occurring source.
  • a wild-type gene is that which is most frequently observed in a population and is thus arbitrarily designated the “normal” or “wild-type” form of the gene.
  • modified refers to a gene or gene product that displays modifications in sequence and/or functional properties (i.e., altered characteristics) when compared to the wild-type gene or gene product. It is noted that naturally occurring mutants can be isolated; these are identified by the fact that they have altered characteristics when compared to the wild-type gene or gene product.
  • As used herein, the terms “nucleic acid molecule encoding,” “DNA sequence encoding,” and “DNA encoding” refer to the order or sequence of deoxyribonucleotides along a strand of deoxyribonucleic acid. The order of these deoxyribonucleotides determines the order of amino acids along the polypeptide (protein) chain. The DNA sequence thus codes for the amino acid sequence.
  • DNA molecules are said to have "5' ends" and "3' ends" because mononucleotides are reacted to make oligonucleotides or polynucleotides in a manner such that the 5' phosphate of one mononucleotide pentose ring is attached to the 3' oxygen of its neighbor in one direction via a phosphodiester linkage.
  • an end of an oligonucleotide or polynucleotide is referred to as the "5' end" if its 5' phosphate is not linked to the 3' oxygen of a mononucleotide pentose ring and as the "3' end" if its 3' oxygen is not linked to a 5' phosphate of a subsequent mononucleotide pentose ring.
  • a nucleic acid sequence even if internal to a larger oligonucleotide or polynucleotide, also may be said to have 5' and 3' ends.
  • an oligonucleotide having a nucleotide sequence encoding a gene and “polynucleotide having a nucleotide sequence encoding a gene,” means a nucleic acid sequence comprising the coding region of a gene or, in other words, the nucleic acid sequence that encodes a gene product.
  • the coding region may be present in a cDNA, genomic DNA, or RNA form.
  • the oligonucleotide or polynucleotide may be single-stranded (i.e., the sense strand) or double-stranded.
  • Suitable control elements such as enhancers/promoters, splice junctions, polyadenylation signals, etc. may be placed in close proximity to the coding region of the gene if needed to permit proper initiation of transcription and/or correct processing of the primary RNA transcript.
  • the coding region utilized in the expression vectors of the present invention may contain endogenous enhancers/promoters, splice junctions, intervening sequences, polyadenylation signals, etc. or a combination of both endogenous and exogenous control elements.
  • a gene may produce multiple RNA species that are generated by differential splicing of the primary RNA transcript.
  • cDNAs that are splice variants of the same gene will contain regions of sequence identity or complete homology (representing the presence of the same exon or portion of the same exon on both cDNAs) and regions of complete non-identity (for example, representing the presence of exon "A” on cDNA 1 wherein cDNA 2 contains exon "B" instead). Because the two cDNAs contain regions of sequence identity they will both hybridize to a probe derived from the entire gene or portions of the gene containing sequences found on both cDNAs; the two splice variants are therefore substantially homologous to such a probe and to each other.
  • fragment refers to a polypeptide that has an amino-terminal and/or carboxy-terminal deletion as compared to the native protein, but where the remaining amino acid sequence is identical to the corresponding positions in the amino acid sequence deduced from a full-length cDNA sequence. Fragments typically are at least 4 amino acids long, preferably at least 20 amino acids long, usually at least 50 amino acids long or longer, and span the portion of the polypeptide required for intermolecular binding of the compositions (claimed in the present invention) with its various ligands and/or substrates.
  • naturally-occurring refers to the fact that an object can be found in nature.
  • a polypeptide or polynucleotide sequence that is present in an organism (including viruses) that can be isolated from a source in nature and which has not been intentionally modified by man in the laboratory is naturally-occurring.
  • Amplification is a special case of nucleic acid replication involving template specificity. It is to be contrasted with non-specific template replication (i.e., replication that is template-dependent but not dependent on a specific template). Template specificity is here distinguished from fidelity of replication (i.e., synthesis of the proper polynucleotide sequence) and nucleotide (ribo- or deoxyribo-) specificity. Template specificity is frequently described in terms of “target” specificity. Target sequences are “targets” in the sense that they are sought to be sorted out from other nucleic acid. Amplification techniques have been designed primarily for this sorting out.
  • Amplification enzymes are enzymes that, under conditions they are used, will process only specific sequences of nucleic acid in a heterogeneous mixture of nucleic acid.
  • MDV-1 RNA is the specific template for the replicase (D.L. Kacian et al., Proc. Natl. Acad. Sci. USA 69:3038 [1972]).
  • Other nucleic acids will not be replicated by this amplification enzyme.
  • this amplification enzyme has a stringent specificity for its own promoters (Chamberlin et al., Nature 228:227 [1970]).
  • the enzyme will not ligate the two oligonucleotides or polynucleotides, where there is a mismatch between the oligonucleotide or polynucleotide substrate and the template at the ligation junction (Wu and Wallace, Genomics 4:560 [1989]).
  • Taq and Pfu polymerases, by virtue of their ability to function at high temperature, are found to display high specificity for the sequences bounded and thus defined by the primers; the high temperature results in thermodynamic conditions that favor primer hybridization with the target sequences and not hybridization with non-target sequences (H. A. Erlich (ed.), PCR Technology, Stockton Press [1989]).
  • amplifiable nucleic acid is used in reference to nucleic acids that may be amplified by any amplification method. It is contemplated that "amplifiable nucleic acid” will usually comprise "sample template.”
  • sample template refers to nucleic acid originating from a sample that is analyzed for the presence of “target” (defined below).
  • background template is used in reference to nucleic acid other than sample template that may or may not be present in a sample. Background template is most often inadvertent. It may be the result of carryover, or it may be due to the presence of nucleic acid contaminants sought to be purified away from the sample. For example, nucleic acids from organisms other than those to be detected may be present as background in a test sample.
  • the term "primer” refers to an oligonucleotide, whether occurring naturally as in a purified restriction digest or produced synthetically, which is capable of acting as a point of initiation of synthesis when placed under conditions in which synthesis of a primer extension product which is complementary to a nucleic acid strand is induced, (i.e., in the presence of nucleotides and an inducing agent such as DNA polymerase and at a suitable temperature and pH).
  • the primer is preferably single stranded for maximum efficiency in amplification, but may alternatively be double stranded. If double stranded, the primer is first treated to separate its strands before being used to prepare extension products.
  • the primer is an oligodeoxyribonucleotide.
  • the primer must be sufficiently long to prime the synthesis of extension products in the presence of the inducing agent. The exact lengths of the primers will depend on many factors, including temperature, source of primer and the use of the method.
  • This process for amplifying the target sequence consists of introducing a large excess of two oligonucleotide primers to the DNA mixture containing the desired target sequence, followed by a precise sequence of thermal cycling in the presence of a DNA polymerase.
  • the two primers are complementary to their respective strands of the double stranded target sequence.
  • the mixture is denatured and the primers then annealed to their complementary sequences within the target molecule.
  • the primers are extended with a polymerase so as to form a new pair of complementary strands.
  • the steps of denaturation, primer annealing, and polymerase extension can be repeated many times (i.e., denaturation, annealing and extension constitute one "cycle”; there can be numerous “cycles”) to obtain a high concentration of an amplified segment of the desired target sequence.
  • the length of the amplified segment of the desired target sequence is determined by the relative positions of the primers with respect to each other, and therefore, this length is a controllable parameter.
  • the method is referred to as the “polymerase chain reaction” (hereinafter "PCR”). Because the desired amplified segments of the target sequence become the predominant sequences (in terms of concentration) in the mixture, they are said to be “PCR amplified.”
  • With PCR, it is possible to amplify a single copy of a specific target sequence in genomic DNA to a level detectable by several different methodologies (e.g., hybridization with a labeled probe; incorporation of biotinylated primers followed by avidin-enzyme conjugate detection; incorporation of 32P-labeled deoxynucleotide triphosphates, such as dCTP or dATP, into the amplified segment).
  • any oligonucleotide or polynucleotide sequence can be amplified with the appropriate set of primer molecules.
  • the amplified segments created by the PCR process itself are, themselves, efficient templates for subsequent PCR amplifications.
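Because each amplified segment serves as a template in later cycles, the number of target copies grows roughly geometrically with the number of cycles. The following Python sketch illustrates this under a commonly used idealization that is an assumption here and is not stated in the text: every cycle multiplies the copy number by (1 + efficiency), with an efficiency of 1.0 meaning perfect doubling. The function name `pcr_copies` is hypothetical.

```python
def pcr_copies(initial_copies, cycles, efficiency=1.0):
    """Idealized copy number after a given number of PCR cycles.

    Assumes each cycle (denaturation, annealing, extension) multiplies
    the number of target copies by (1 + efficiency), where efficiency is
    between 0.0 (no amplification) and 1.0 (perfect doubling).
    """
    if cycles < 0:
        raise ValueError("cycles must be non-negative")
    if not 0.0 <= efficiency <= 1.0:
        raise ValueError("efficiency must be between 0.0 and 1.0")
    return initial_copies * (1.0 + efficiency) ** cycles


if __name__ == "__main__":
    # A single template copy after 30 ideal cycles.
    print(pcr_copies(1, 30))
    # The same reaction at a hypothetical 90% per-cycle efficiency.
    print(pcr_copies(1, 30, efficiency=0.9))
```

Real reactions plateau as primers and nucleotides are consumed, so the sketch only describes the early, exponential phase.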
  • PCR product refers to the resultant mixture of compounds after two or more cycles of the PCR steps of denaturation, annealing and extension are complete. These terms encompass the case where there has been amplification of one or more segments of one or more target sequences.
  • amplification reagents refers to those reagents (deoxyribonucleotide triphosphates, buffer, etc.) needed for amplification.
  • amplification reagents along with other reaction components are placed and contained in a reaction vessel (test tube, microwell, etc.).
  • the terms "restriction endonucleases” and “restriction enzymes” refer to bacterial enzymes, each of which cut double-stranded DNA at or near a specific nucleotide sequence.
  • the term “recombinant DNA molecule” as used herein refers to a DNA molecule that is comprised of segments of DNA joined together by means of molecular biological techniques.
  • isolated when used in relation to a nucleic acid, as in “an isolated oligonucleotide” or “isolated polynucleotide” refers to a nucleic acid sequence that is identified and separated from at least one contaminant nucleic acid with which it is ordinarily associated in its natural source. Isolated nucleic acid is present in a form or setting that is different from that in which it is found in nature. In contrast, non-isolated nucleic acids are nucleic acids such as DNA and RNA found in the state they exist in nature. For example, a given DNA sequence (e.g.
  • RNA sequences such as a specific mRNA sequence encoding a specific protein, are found in the cell as a mixture with numerous other mRNAs that encode a multitude of proteins.
  • isolated nucleic acid encoding an enzyme identified using the methods of the present invention includes, by way of example, such nucleic acid in cells ordinarily expressing the gene encoding the enzyme where the nucleic acid is in a chromosomal location different from that of natural cells, or is otherwise flanked by a different nucleic acid sequence than that found in nature.
  • the isolated nucleic acid, oligonucleotide, or polynucleotide may be present in single-stranded or double-stranded form.
  • the oligonucleotide or polynucleotide will contain at a minimum the sense or coding strand (i.e., the oligonucleotide or polynucleotide may be single-stranded), but may contain both the sense and anti-sense strands (i.e., the oligonucleotide or polynucleotide may be double-stranded).
  • a "portion of a chromosome” refers to a discrete section of the chromosome. Chromosomes are divided into sites or sections by cytogeneticists as follows: the short (relative to the centromere) arm of a chromosome is termed the "p" arm; the long arm is termed the "q” arm. Each arm is then divided into 2 regions termed region 1 and region 2 (region 1 is closest to the centromere). Each region is further divided into bands. The bands may be further divided into sub-bands.
  • the 11p15.5 portion of human chromosome 11 is the portion located on chromosome 11 (11) on the short arm (p) in the first region (1) in the 5th band (5) in sub-band 5 (.5).
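The naming scheme just described can be illustrated with a short Python sketch that splits a designation such as 11p15.5 into its parts. The function name `parse_chromosome_band` and the accepted pattern (a numbered or X/Y chromosome, one arm letter, one region digit, one band digit, and an optional sub-band) are assumptions made for illustration and do not cover every designation used by cytogeneticists.

```python
import re

# Chromosome, arm (p or q), region digit, band digit, optional sub-band.
BAND_PATTERN = re.compile(r"^(\d+|[XY])([pq])(\d)(\d)(?:\.(\d+))?$")


def parse_chromosome_band(designation):
    """Split a designation such as '11p15.5' into its components."""
    match = BAND_PATTERN.match(designation)
    if match is None:
        raise ValueError(f"unrecognized designation: {designation!r}")
    chromosome, arm, region, band, sub_band = match.groups()
    return {
        "chromosome": chromosome,
        "arm": arm,              # 'p' is the short arm, 'q' the long arm
        "region": int(region),   # region 1 is closest to the centromere
        "band": int(band),
        "sub_band": sub_band,    # None when no sub-band is given
    }


if __name__ == "__main__":
    print(parse_chromosome_band("11p15.5"))
```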
  • a portion of a chromosome may be "altered;" for instance the entire portion may be absent due to a deletion or may be rearranged (e.g., inversions, translocations, expanded or contracted due to changes in repeat regions).
  • a probe homologous to a particular portion of a chromosome could result in a negative result (i.e., the probe could not bind to the sample containing genetic material suspected of containing the missing portion of the chromosome).
  • hybridization of a probe homologous to a particular portion of a chromosome may be used to detect alterations in a portion of a chromosome.
  • sequences associated with a chromosome means preparations of chromosomes (e.g., spreads of metaphase chromosomes), nucleic acid extracted from a sample containing chromosomal DNA (e.g., preparations of genomic DNA); the RNA that is produced by transcription of genes located on a chromosome (e.g., hnRNA and mRNA), and cDNA copies of the RNA transcribed from the DNA located on a chromosome.
  • Sequences associated with a chromosome may be detected by numerous techniques including probing of Southern and Northern blots and in situ hybridization to RNA, DNA, or metaphase chromosomes with probes containing sequences homologous to the nucleic acids in the above listed preparations.
  • portion when in reference to a nucleotide sequence (as in “a portion of a given nucleotide sequence”) refers to fragments of that sequence. The fragments may range in size from four nucleotides to the entire nucleotide sequence minus one nucleotide (10 nucleotides, 20, 30, 40, 50, 100, 200, etc.).
  • coding region when used in reference to structural gene refers to the nucleotide sequences that encode the amino acids found in the nascent polypeptide as a result of translation of a mRNA molecule.
  • the coding region is bounded, in eukaryotes, on the 5' side by the nucleotide triplet "ATG" that encodes the initiator methionine and on the 3' side by one of the three triplets, which specify stop codons (i.e., TAA, TAG, TGA).
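As a simple illustration of these boundaries, the Python sketch below scans a DNA sequence for the first ATG that is followed by an in-frame stop codon and returns the sequence from the ATG through the stop codon. It is a minimal sketch under stated assumptions: the function name `find_coding_region` is hypothetical, only the given strand is searched, and introns are assumed to be absent (as in a cDNA).

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_coding_region(sequence):
    """Return the first ATG-to-stop coding region in a DNA sequence.

    Reads codons in frame from each candidate ATG, in order of position,
    and returns the region from the ATG through the first in-frame stop
    codon, or None if no such region exists.
    """
    sequence = sequence.upper()
    start = sequence.find("ATG")
    while start != -1:
        # Step through complete codons downstream of the ATG.
        for i in range(start + 3, len(sequence) - 2, 3):
            if sequence[i:i + 3] in STOP_CODONS:
                return sequence[start:i + 3]
        start = sequence.find("ATG", start + 1)
    return None


if __name__ == "__main__":
    # Hypothetical toy sequence: two flanking bases, ATG GCT TAA, two more bases.
    print(find_coding_region("CCATGGCTTAAGG"))
```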
  • recombinant DNA molecule refers to a DNA molecule that is comprised of segments of DNA joined together by means of molecular biological techniques.
  • recombinant protein or “recombinant polypeptide” as used herein refers to a protein molecule that is expressed from a recombinant DNA molecule.
  • native protein as used herein, is used to indicate that a protein does not contain amino acid residues encoded by vector sequences; that is the native protein contains only those amino acids found in the protein as it occurs in nature.
  • a native protein may be produced by recombinant means or may be isolated from a naturally occurring source.
  • portion when in reference to a protein (as in “a portion of a given protein”) refers to fragments of that protein.
  • the fragments may range in size from four consecutive amino acid residues to the entire amino acid sequence minus one amino acid.
  • transgene refers to a foreign, heterologous, or autologous gene that is placed into an organism by introducing the gene into newly fertilized eggs or early embryos.
  • foreign gene refers to any nucleic acid (e.g., gene sequence) that is introduced into the genome of an animal or plant by experimental manipulations and may include gene sequences found in that animal or plant so long as the introduced gene does not reside in the same location as does the naturally-occurring gene.
  • autologous gene is intended to encompass variants (e.g., polymorphisms or mutants) of the naturally occurring gene. The term transgene thus encompasses the replacement of the naturally occurring gene with a variant form of the gene.
  • vector is used in reference to nucleic acid known to replicate autonomously in a host cell, to which a segment of DNA may be spliced to allow its replication; for example, a plasmid or an artificial chromosome.
  • Vectors include viral vectors, which are viral nucleic acids altered so that they can act as a vector for recombinant DNA.
  • vehicle is sometimes used interchangeably with “vector.”
  • expression vector refers to a recombinant DNA molecule containing a desired coding sequence and appropriate nucleic acid sequences necessary for the expression of the operably linked coding sequence in a particular host organism.
  • Nucleic acid sequences necessary for expression in prokaryotes usually include a promoter, an operator (optional), and a ribosome binding site, often along with other sequences.
  • Eukaryotic cells are known to utilize promoters, enhancers, and termination and polyadenylation signals.
  • host cell refers to any eukaryotic or prokaryotic cell (e.g., bacterial cells such as E. coli, yeast cells, mammalian cells, avian cells, amphibian cells, plant cells, fish cells, and insect cells), whether located in vitro or in vivo.
  • host cells may be located in a transgenic animal or plant.
  • overexpression and “overexpressing” and grammatical equivalents are used in reference to levels of mRNA to indicate a level of expression approximately 3-fold higher than that typically observed in a given tissue in a control or non-transgenic animal or plant.
  • Levels of mRNA are measured using any of a number of techniques known to those skilled in the art including, but not limited to, Northern blot analysis. Appropriate controls are included on the Northern blot to control for differences in the amount of RNA loaded from each tissue analyzed (e.g., the amount of 28S rRNA, an abundant RNA transcript present at essentially the same amount in all tissues, present in each sample can be used as a means of normalizing or standardizing the mRNA-specific signal observed on Northern blots).
  • the amount of mRNA present in the band corresponding in size to the correctly spliced transgene RNA is quantified; other minor species of RNA which hybridize to the transgene probe are not considered in the quantification of the expression of the transgenic mRNA.
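  • The normalization described above can be illustrated with a short Python sketch. It divides each transgene band signal by the 28S rRNA signal from the same lane and then compares the sample with the control. The function names, the use of arbitrary densitometry units, and the hard 3.0 cutoff are assumptions made for illustration; the text itself says only "approximately 3-fold".

```python
# Illustrative sketch only: function names and the exact cutoff are assumptions.

def fold_expression(sample_mrna, sample_28s, control_mrna, control_28s):
    """Fold change of the transgene band after normalizing to 28S rRNA.

    All inputs are band intensities in the same arbitrary units.
    """
    sample_norm = sample_mrna / sample_28s      # corrects for RNA loading
    control_norm = control_mrna / control_28s
    return sample_norm / control_norm


def is_overexpressed(fold, threshold=3.0):
    """Apply an approximately 3-fold criterion as a simple cutoff."""
    return fold >= threshold
```

  • Only the band corresponding to the correctly spliced transgene RNA would be supplied as the mRNA signal; minor hybridizing species are excluded before this calculation.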
  • transfection refers to the introduction of foreign DNA or RNA into eukaryotic cells. Transfection may be accomplished by a variety of means known to the art including calcium phosphate-DNA co-precipitation, DEAE-dextran-mediated transfection, polybrene-mediated transfection, electroporation, microinjection, liposome fusion, lipofection, protoplast fusion, retroviral infection, polyethylene glycol treatment, and biolistics.
  • stable transfection or “stably transfected” refers to the introduction and integration of foreign DNA into the genome of the transfected cell.
  • stable transfectant refers to a cell that has stably integrated foreign DNA into the genomic DNA.
  • transient transfection or “transiently transfected” refers to the introduction of foreign DNA or RNA into a cell where the foreign DNA fails to integrate into the genome of the transfected cell.
  • the foreign DNA persists in the nucleus of the transfected cell for several days. During this time the foreign DNA is subject to the regulatory controls that govern the expression of endogenous genes in the chromosomes.
  • transient transfectant refers to cells that have taken up foreign DNA but have failed to integrate this DNA.
  • composition comprising a given polynucleotide sequence refers broadly to any composition containing the given polynucleotide sequence.
  • the composition may comprise an aqueous solution.
  • compositions comprising polynucleotide sequences encoding an enzyme identified using the methods of the present invention or fragments thereof may be employed as hybridization probes.
  • the enzyme encoding polynucleotide sequences are typically employed in an aqueous solution containing salts (e.g., NaCl), detergents (e.g., SDS), and other components (e.g., Denhardt's solution, dry milk, salmon sperm DNA, etc.).
  • sample as used herein is used in its broadest sense.
  • a sample suspected of containing a human chromosome or sequences associated with a human chromosome may comprise a cell, chromosomes isolated from a cell (e.g., a spread of metaphase chromosomes), genomic DNA (in solution or bound to a solid support such as for Southern blot analysis), RNA (in solution or bound to a solid support such as for Northern blot analysis), cDNA (in solution or bound to a solid support) and the like.
  • a sample suspected of containing a protein may comprise a cell, a portion of a tissue, an extract containing one or more proteins and the like.
  • computer memory and “computer memory device” refer to any storage media readable by a computer processor.
  • Examples of computer memory include, but are not limited to, RAM, ROM, computer chips, digital video discs (DVDs), compact discs (CDs), hard disk drives (HDD), and magnetic tape.
  • computer readable medium refers to any device or system for storing and providing information (e.g., data and instructions) to a computer processor.
  • Examples of computer readable media include, but are not limited to, DVDs, CDs, hard disk drives, magnetic tape and servers for streaming media over networks.
  • processor and “central processing unit” or “CPU” are used interchangeably and refer to a device that is able to read a program from a computer memory (e.g., ROM or other computer memory) and perform a set of steps according to the program.
  • “Compatible”, as used herein, refers to the capability of operating with other components of a system.
  • a vector or plant viral nucleic acid that is compatible with a host is one that is capable of replicating in that host.
  • a coat protein that is compatible with a viral nucleotide sequence is one capable of encapsidating that viral sequence.
  • “Coding region”, as used herein, refers to that portion of a gene that codes for a protein.
  • non-coding region refers to that portion of a gene that is not a coding region.
  • the present invention provides methods for rapid screening of plant enzymes of various classes.
  • the present invention further provides methods of building enzyme libraries useful for identifying enzymes of interest.
  • plant enzymes identified using the methods of the present invention are cloned into microbial hosts for large-scale fermentation and enzyme production (e.g., for use in industrial processes for production of chemicals of interest).
  • the present invention provides methods of screening plant enzymes (e.g., biocatalysts) in plants. Any number of suitable expression vectors may be utilized, including, but not limited to, those disclosed herein.
  • NIT2 from Arabidopsis thaliana was cloned and expressed in N. benthamiana leaves. The function of NIT2 was confirmed by a function-based assay employing NMR technology.
  • Experiments conducted during the course of the development of the present invention further identified nitrilase 4 activity. To date, no report has been published that clearly demonstrated the nitrilase 4 function, which can catalyze the hydrolysis of hydrocinnamonitrile.
  • the present invention further provides the novel use of whole plant tissue for biocatalysis. Direct screening of plant enzymes in whole plants provides the further advantage of codon usage, promoter recognition, and post-translational modifications in the native environment.
  • the present invention provides vectors for expressing enzymes in plants or plant parts (e.g., for screening purposes). Any suitable vector/expression system may be utilized, including, but not limited to, those described herein.
  • the GENEWARE viral vector system (Biosource Technologies, Inc. Corp., Vacaville, California, USA), described in the illustrative examples below, is utilized. Briefly, in the GENEWARE system, tobacco mosaic virus (TMV)-derived transient RNA replicons are used to introduce and express heterologous genes in plants (Dawson et al., Virology 172:285 [1989]; Kumagai et al., PNAS 90:427 [1992]; Donson et al., PNAS 88:7204 [1991]). GENEWARE TMV has been modified to contain a duplicated copy of the subgenomic coat protein mRNA promoter positioned between the 30 kDa cell-to-cell movement protein gene and the native coat protein gene (Figure 1).
  • the present invention is not limited to the GENEWARE system. Any suitable expression system may be utilized.
  • alternative viral vectors are utilized. These include both DNA and RNA viruses. Plant gene expression vectors have been developed using single stranded DNA geminiviruses, double stranded non-integrating pararetroviruses and plus-sense RNA viruses such as bromoviruses (brome mosaic virus, cowpea chlorotic mottle virus), tombusviruses, other tobamoviruses, potexviruses (potato virus X), comoviruses (cowpea mosaic virus) and potyviruses (Granoff A, Webster, RG (1999) Encyclopedia of Virology. San Diego: Academic Press Volume 3, pp. 1885-99). Because these viruses have different host ranges, the testing of foreign genes can be performed in various plant species.
  • sequences encoding polypeptides may be driven by any of a number of promoters.
  • plant vectors are created using a recombinant plant virus containing a recombinant plant viral nucleic acid, as described in PCT publication WO 96/40867. Subsequently, the recombinant plant viral nucleic acid that contains one or more non-native nucleic acid sequences may be transcribed or expressed in the infected tissues of the plant host and the product of the coding sequences may be recovered from the plant, as described in WO 99/36516.
  • An important feature of this embodiment is the use of recombinant plant viral nucleic acids that contain one or more non-native subgenomic promoters capable of transcribing or expressing adjacent nucleic acid sequences in the plant host and that result in replication and local and/or systemic spread in a compatible plant host.
  • the recombinant plant viral nucleic acids have substantial sequence homology to plant viral nucleotide sequences and may be derived from an RNA, DNA, cDNA or a chemically synthesized RNA or DNA. A partial listing of suitable viruses is described below.
  • the first step in producing recombinant plant viral nucleic acids is to modify the nucleotide sequences of the plant viral nucleotide sequence by known conventional techniques such that one or more non-native subgenomic promoters are inserted into the plant viral nucleic acid without destroying the biological function of the plant viral nucleic acid.
  • the native coat protein coding sequence may be deleted in some embodiments, placed under the control of a non-native subgenomic promoter in other embodiments, or retained in a further embodiment. If it is deleted or otherwise inactivated, a non-native coat protein gene may be inserted under control of one of the non-native subgenomic promoters, or optionally under control of the native coat protein gene subgenomic promoter.
  • the non-native coat protein is capable of encapsidating the recombinant plant viral nucleic acid to produce a recombinant plant virus.
  • the recombinant plant viral nucleic acid contains a coat protein coding sequence, which may be a native or a non-native coat protein coding sequence, under control of one of the native or non-native subgenomic promoters.
  • the coat protein is involved in the systemic or local infection of the plant host or has other functions.
  • viruses that meet this requirement include viruses from the tobamovirus group such as Tobacco Mosaic virus (TMV), Ribgrass Mosaic Virus (RGM), Cowpea Mosaic virus (CMV), Alfalfa Mosaic virus (AMV), Cucumber Green Mottle Mosaic virus watermelon strain (CGMMV-W) and Oat Mosaic virus (OMV) and viruses from the brome mosaic virus group such as Brome Mosaic virus (BMV), broad bean mottle virus and cowpea chlorotic mottle virus.
  • Additional suitable viruses include Rice Necrosis virus (RNV), and geminiviruses such as tomato golden mosaic virus (TGMV), Cassava latent virus (CLV) and maize streak virus (MSV).
  • plant vectors used for the expression of sequences encoding polypeptides include, for example, viral promoters such as the 35S and 19S promoters of CaMV used alone or in combination with the omega leader sequence from TMV (Takamatsu, EMBO J. 3:1671 [1984]).
  • plant promoters such as the small subunit of RUBISCO or heat shock promoters may be used (Coruzzi et al., EMBO J. 3:1671 [1984]; Broglie et al., Science 224:838 [1984]; and Winter et al., Results Probl. Cell Differ. 17:85 [1991]).
  • These constructs can be introduced into plant cells by direct DNA transformation, transfection, or pathogen-mediated transfection. Such techniques are described in a number of generally available reviews (See, for example, Hobbs, S. or Murry, L. E. in McGraw Hill Yearbook of Science and Technology (1992) McGraw Hill, New York, N.Y.; pp. 191-196).
  • virus-derived gene expression systems developed to date are based mostly on autonomously replicating viruses, but other strategies, such as helper-dependent systems, have been explored as well.
  • an essential viral gene is replaced with the gene of interest thus rendering the virus non-infectious.
  • the disabled vector is then inoculated onto transgenic plants or cells expressing the missing viral gene where it replicates and expresses the foreign gene.
  • this strategy is used for overcoming problems with virus instability that may occur upon introduction of very large genes in the form of gene libraries from plants of interest for sourcing biocatalysts.
  • Other alternatives exist in addition to using viruses to rapidly express genes in plants.
  • the foreign genes or cDNA libraries are cloned into a DNA plasmid downstream of a DNA-dependent RNA polymerase promoter.
  • the promoter is recognized by host transcription machinery and the heterologous mRNAs are synthesized and translated upon inoculation into the host cell.
  • the mRNA is produced from the cDNA libraries in vitro and inoculated into the cell where it is translated by host translation machinery.
  • enzymes screened using the methods of the present invention include, but are not limited to, nitrilases, fatty acid modifying enzymes, dehalogenases, lipases, phosphatases, kinases, oxidoreductases, transferases, hydrolases, lyases, isomerases, and ligases.
  • libraries of enzymes are generated (e.g., using the vectors described above).
  • the libraries comprise genes from a particular plant. In other embodiments, the libraries comprise genes from a particular type of enzyme (e.g., from multiple species of plants). In yet other embodiments, the libraries comprise genes from bacteria and fungi as well as any other species whose genes are suitable for screening in plants.
  • the libraries are derived from directed evolution procedures. Directed evolution libraries may be screened at multiple points in a directed evolution experiment. Such embodiments are particularly suited for pooling experiments.
  • the present invention is not limited to a particular library or class of enzyme. Any enzyme that acts on a substrate to produce NMR distinct products may be screened using the methods of the present invention. Indeed, it is not necessary to know the identity of the enzyme or its nucleic acid or protein sequence. Libraries may be obtained from commercial sources (e.g., Invitrogen, Carlsbad, CA) or other sources known to one of skill in the art. In other embodiments, libraries are generated using PCR.
  • libraries may be generated by directed evolution (See e.g., U.S. Patents 6,395,547; 6,376,246; 6,391,640; 6,365,408; each of which is herein incorporated by reference).
  • artificial evolution is performed by random mutagenesis (for example, by utilizing error-prone PCR to introduce random mutations into a given coding sequence). This method requires that the frequency of mutation be finely tuned.
  • beneficial mutations are rare, while deleterious mutations are common. This is because the combination of a deleterious mutation and a beneficial mutation often results in an inactive enzyme.
  • the ideal number of base substitutions for a targeted gene is usually between 1.5 and 5 (Moore and Arnold, Nat.
  • libraries are generated using Gene Site Saturation Mutagenesis procedures (U.S. patent 6,171,820, herein incorporated by reference).
  • the procedure provides, from a parental template gene, a set of mutagenized progeny genes whereby at each original codon position there is produced at least one substitute codon encoding each of the 20 naturally encoded amino acids.
  • the procedure also provides, from a parental template polypeptide, a set of mutagenized progeny polypeptides wherein each of the 20 naturally encoded amino acids is represented at each original amino acid position.
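  • The combinatorial idea behind site saturation can be sketched in Python as follows. For every codon position of a parental coding sequence, the sketch emits 20 progeny sequences, one per naturally encoded amino acid. The choice of a single representative codon per amino acid and the function name are assumptions made for illustration; this is not the patented laboratory procedure, which relies on degenerate oligonucleotides rather than enumeration.

```python
# Illustrative sketch only: one representative codon per amino acid
# (standard genetic code); names are hypothetical.
REPRESENTATIVE_CODON = {
    "A": "GCT", "C": "TGT", "D": "GAT", "E": "GAA", "F": "TTT",
    "G": "GGT", "H": "CAT", "I": "ATT", "K": "AAA", "L": "CTG",
    "M": "ATG", "N": "AAT", "P": "CCT", "Q": "CAA", "R": "CGT",
    "S": "TCT", "T": "ACT", "V": "GTT", "W": "TGG", "Y": "TAT",
}


def saturation_library(gene: str):
    """Map each codon position to {amino acid: progeny sequence}.

    One entry per position may encode the parental residue, so the
    library represents all 20 amino acids at every position.
    """
    if len(gene) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    library = {}
    for pos in range(0, len(gene), 3):
        library[pos // 3] = {
            aa: gene[:pos] + codon + gene[pos + 3:]
            for aa, codon in REPRESENTATIVE_CODON.items()
        }
    return library
```

  • A parental sequence of N codons therefore yields 20 × N single-site progeny, which gives a sense of how library size grows with gene length.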
  • libraries are generated using gene shuffling or sexual PCR procedures (See e.g., Smith, Nature, 370:324-25 [1994]; U.S. Pat.
  • Gene shuffling involves random fragmentation of several mutant DNAs followed by their reassembly by PCR into full-length molecules. Examples of various gene shuffling procedures include, but are not limited to, assembly following DNase treatment, the staggered extension process (STEP), and random priming in vitro recombination.
  • in the DNase-mediated method, DNA segments isolated from a pool of positive mutants are cleaved into random fragments with DNase I and subjected to multiple rounds of PCR with no added primer.
  • DNA shuffling can be applied using multiple related DNA sequences or combination of the different mutants.
  • Such approaches mix multiple parent molecules in a shuffling process, generating a library of millions of different chimeric sequences (Crameri et al., Nature 391 (1998), pp. 288-291).
  • libraries of useful enzymes are generated based on the results of the screening methods of the present invention.
  • Libraries generated using the methods of the present invention may be utilized for initial or further screening (e.g., to identify enzymes with particular substrate specificities or other properties).
  • the present invention is not limited to a particular plant host. Any suitable plant host may be utilized. In preferred embodiments, the host is selected based on compatibility with the particular plant viral vector selected. In one illustrative embodiment of the present invention (Example 1), N. benthamiana plants are utilized as hosts.
  • the present invention is not limited to the use of whole plants. Plant parts, including, but not limited to, leaves (detached or non-detached), roots, stems, fruits, and flowers, as well as protoplasts and cell cultures, may also be utilized (Dijkstra J. and de Jager C.P. (1998) Practical Plant Virology, New York: Springer (Part I, Protocol 2)).
  • the use of protoplasts and cell cultures provides the additional advantage of reducing possible detrimental effects of increased genome size due to the presence of a foreign gene and can be automated for high throughput assays.
  • insects and animal host systems are utilized for screening enzymes.
  • Methods for the use of insect and animal hosts for expression of proteins of interest are known in the art (O'Reilly et al., (1992) Baculovirus Expression Vectors: A laboratory manual, New York: WH Freeman; Kreissig et al., (1995) J. Virol. Methods 53: 263-72; Rayner et al., (1994) Mol. Cell Biol. 14: 880-7).
  • the present invention provides methods of screening plants or plant parts for enzymatic activity.
  • the activity assay is a high-throughput assay (e.g., capable of analyzing at least 600, preferably at least 1000, even more preferably at least 5000, and still more preferably at least 10,000 samples per day). High-throughput assays allow for the screening of large numbers of enzymes in a short period of time.
  • the activity assay is an enzymatic assay.
  • the presence of a substrate or product is assayed using reagents that react with the product of the enzymatic reaction to generate a reporter molecule (e.g., with a recognizable spectra or other property).
  • enzymatic reactions are assayed by monitoring changes in substrate concentration.
  • enzyme activity is detected using other analytical methods, including, but not limited to, methods that can identify organic compounds such as mass spectrometry (MS), infrared (IR), and ultraviolet (UV) spectroscopy, or indirect methods such as colorimetric or fluorescence-labeled methods.
  • high-throughput nuclear magnetic resonance (HTP-NMR) is utilized to detect enzyme activity.
  • NMR is suitable for the analysis of intact or homogenized plants or plant parts.
  • the present invention utilizes direct-injection NMR technology.
  • pooling is utilized to increase the number of samples that can be analyzed consecutively.
  • in pooling, groups of samples (e.g., one row of a microtitre plate, or 8 samples) are combined into 1 well for HTP-NMR analysis. Individual samples from the pools that give positive results can then be screened further. This increases the throughput 8-fold, which corresponds to more than 50,000 samples per day.
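  • The two-stage pooling scheme can be sketched in Python as follows: each group of samples is measured once as a pool, and only the members of positive pools are measured individually. The function name is hypothetical, and the sketch assumes that a pool reads positive whenever any member is active, which ignores the dilution of signal that pooling causes in practice.

```python
# Illustrative sketch only: assumes a pool is positive if any member is active.

def screen_with_pooling(groups, is_active):
    """Return (active sample IDs, number of measurements performed).

    `groups` is a list of lists of sample IDs (e.g., 8 samples per group);
    `is_active` is a callable standing in for the assay readout.
    """
    measurements = 0
    hits = []
    for group in groups:
        measurements += 1                      # one measurement per pool
        if any(is_active(s) for s in group):   # pooled readout is positive
            for sample in group:               # deconvolute the pool
                measurements += 1
                if is_active(sample):
                    hits.append(sample)
    return hits, measurements
```

  • With 96 samples in 12 groups of 8 and a single active sample, the sketch needs 20 measurements instead of 96. The saving approaches the full 8-fold figure only when positive pools are rare.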
  • HTP-NMR methods are function-based, and can thus screen virtually any enzymatic reaction, without requiring a labeled substrate and with no knowledge of the enzyme of interest.
  • at least a 600 MHz NMR is used to provide extra sensitivity.
  • the methodology is advantageous compared to traditional screening methods since it requires neither fluorescent or radioactive 'tags' nor a method development process.
  • this technology benefits from the traditional analytical strengths of NMR spectroscopy, such as the ability to distinguish detailed changes in structure, concentration and stereo chemistry. These abilities make it possible to monitor and screen the desired reaction directly, screen multiple reactions from the same substrate or different substrates and track multiple products from the same or different substrates, track diastereomeric products, tautomers, epimers and other isoforms of products.
  • enantiomers are characterized by direct derivatization of the enantiomers to specific chiral reagents, which can be monitored by NMR.
  • the methods of the present invention further find use in the measurement of product increase and/or substrate decrease.
  • HTP-NMR methods are suitable for the quantitation of reaction product formation.
  • time points can easily be taken, allowing for an analysis of reaction kinetics.
  • plug-flow sample delivery is utilized.
  • in plug-flow sample delivery, the NMR probe is never empty and the system is continually locked.
  • Plug-flow sample delivery results in minimization of sampling errors, reduces required sample equilibration time, requires smaller sample volume, and reduces "overhead" time of the autosampler because the needle moves only from well to well and there is no sample retrieval.
  • the plug-flow method allows for continuous NMR sampling using a direct-injection configuration, reduces possibility for spectrometer errors during automated data acquisition, uses a minimal sample volume (e.g., 250-300 µL), allows for accurate mapping of sample location, and the sample carry-over is equivalent to what is observed utilizing the standard Varian VAST approach.
  • the plug-flow approach does not allow for the analyzed sample volume to be recovered. However, the relatively small sample volume of 250 µL allows for approximately 1000 µL per sample well to be retained. Sample throughput can be increased by as much as a factor of five using the plug-flow approach.
  • the entire analytical sequence for one sample can take several minutes in the factory configuration of an exemplary system. In the factory configuration, sample delivery and recovery requires as much as 80% of the time required per sample. The plug-flow methods described herein reduce this time to approximately 30 seconds per sample.
  • the NMR probe is continuously loaded. Therefore, the spectrometer retains a deuterium lock during the sample transport process, thus eliminating a common cause of NMR failures.
  • the total analysis time is significantly reduced to optimize sample throughput in the case where hundreds or thousands of samples must be screened (e.g., in biocatalyst screening).
  • the total analysis time is the sum of the sample transport and data acquisition.
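  • Because total analysis time is the sum of sample transport and data acquisition, the effect of shortening transport can be worked through with a small Python sketch. All timings in the usage note are hypothetical values chosen for illustration; only the additive relationship, the 80% transport share, and the factor-of-five ceiling come from the text above.

```python
# Illustrative sketch only: function names and example timings are assumptions.
SECONDS_PER_DAY = 24 * 60 * 60


def total_analysis_time(transport_s, acquisition_s):
    """Total time per sample = sample transport + data acquisition."""
    return transport_s + acquisition_s


def samples_per_day(transport_s, acquisition_s):
    """Samples that fit in 24 hours of continuous operation."""
    return SECONDS_PER_DAY / total_analysis_time(transport_s, acquisition_s)


def speedup(old_transport_s, new_transport_s, acquisition_s):
    """Throughput gain from shortening transport at fixed acquisition time."""
    old = total_analysis_time(old_transport_s, acquisition_s)
    new = total_analysis_time(new_transport_s, acquisition_s)
    return old / new
```

  • If transport takes 80% of a hypothetical 150-second cycle (120 s transport, 30 s acquisition), cutting transport to 30 s gives a 2.5-fold gain, and the gain approaches 5-fold as transport time approaches zero, in line with the "factor of five" upper bound stated above.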
  • the methods described herein utilize a system where it is possible to inject one sample after another, while minimizing the small amounts of mixing that might occur.
  • the present invention utilizes a method where samples are kept separated using a small bubble of air. The bubble minimizes sample mixing as the sample is transferred along the transfer tubing and the small bubble does not interfere with the acquisition of the NMR spectrum.
  • the present invention utilizes a second sampling valve for plug-flow NMR.
  • the present invention provides methods of producing and purifying enzymes (e.g., industrial enzymes) identified using the screening methods of the present invention.
  • Enzymes or other proteins of interest may be expressed in any number of suitable organisms.
  • enzymes of industrial utility are expressed and purified on a large scale.
  • a variety of expression vector/host systems may be utilized to contain and express sequences encoding a polypeptide of interest (e.g., encoding an enzyme of interest identified by the screening methods of the present invention).
  • These include, but are not limited to, microorganisms such as bacteria transformed with recombinant bacteriophage, plasmid, or cosmid DNA expression vectors; yeast transformed with yeast expression vectors; insect cell systems infected with insect expression vectors; plant cell systems transformed with plant expression vectors (for example, Ti or pBR322 plasmids); or animal cell expression systems.
  • control elements are those non-translated regions of the vector (for example, enhancers, promoters, 5' and 3' untranslated regions) that interact with host cellular proteins to carry out transcription and translation. Such elements may vary in their strength and specificity. Depending on the vector system and host utilized, any number of suitable transcription and translation elements, including constitutive and inducible promoters, may be used. For example, when cloning in bacterial systems, inducible promoters such as the hybrid lacZ promoter of the BLUESCRIPT phagemid (Stratagene, LaJolla, CA) or PSPORT1 plasmid (Life Technologies, Inc., Rockville, MD) and the like may be used.
  • the baculovirus polyhedrin promoter may be used in insect cells. Promoters or enhancers derived from the genomes of plant cells (for example, heat shock, RUBISCO, and storage protein genes) or from plant viruses (for example, viral promoters or leader sequences) may be cloned into the vector. In mammalian cell systems, promoters from mammalian genes or from mammalian viruses are preferable. If it is necessary to generate a cell line that contains multiple copies of the sequence encoding a polypeptide, vectors based on SV40 or EBV may be used with an appropriate selectable marker.
  • a number of expression vectors may be selected depending upon the use intended for the polypeptide of interest. For example, when large quantities of the polypeptide are desired (e.g., when producing an enzyme for industrial use), vectors that direct high level expression of fusion proteins that are readily purified may be used. Such vectors include, but are not limited to, the multifunctional E. coli cloning and expression vectors such as BLUESCRIPT phagemid (Stratagene, La Jolla, CA), in which the sequence encoding the polypeptide of interest may be ligated into the vector in frame with sequences for the amino-terminal Met and the subsequent 7 residues of beta-galactosidase so that a hybrid protein is produced; pIN vectors (Van Heeke and Schuster, J. Biol. Chem. 264:5503 [1989]); and the like.
  • pGEMX vectors (Promega Corporation, Madison, WI) may also be used to express polypeptides as fusion proteins with glutathione S-transferase (GST).
  • fusion proteins are soluble and can easily be purified from lysed cells by adsorption to glutathione-agarose beads followed by elution in the presence of free glutathione.
  • Proteins made in such systems may be designed to include heparin, thrombin, or factor XA protease cleavage sites so that the cloned polypeptide of interest can be released from the GST moiety at will.
  • in the yeast Saccharomyces cerevisiae, a number of vectors containing constitutive or inducible promoters such as alpha factor, alcohol oxidase, and PGH may be used.
  • plant expression vectors are used (See e.g., above description of plant expression vectors in Section I).
  • Agrobacterium mediated transfection is utilized to create transgenic plants for expression of a gene of interest. Since most dicotyledonous plants are natural hosts for Agrobacterium, almost every dicotyledonous plant may be transformed by Agrobacterium in vitro. Although monocotyledonous plants, and in particular, cereals and grasses, are not natural hosts to Agrobacterium, work to transform them using Agrobacterium has also been carried out (Hiei et al, Plant Mol. Biol. 35:205 [1997]; Komari et al, Curr. Opin. Plant Biol.
  • Plant genera that may be transformed by Agrobacterium include Arabidopsis, Chrysanthemum, Dianthus, Gerbera, Euphorbia, Pelargonium, Ipomoea, Passiflora, Cyclamen, Malus, Prunus, Rosa, Rubus, Populus, Santalum, Allium, Lilium, Narcissus, Ananas, Arachis, Phaseolus and Pisum.
  • For transformation with Agrobacterium, disarmed Agrobacterium cells are transformed with recombinant Ti plasmids of Agrobacterium tumefaciens or Ri plasmids of Agrobacterium rhizogenes (such as those described in U.S. Patent No. 4,940,838, the entire contents of which are herein incorporated by reference). The nucleic acid sequence of interest is then stably integrated into the plant genome by infection with the transformed Agrobacterium strain. For example, heterologous nucleic acid sequences have been introduced into plant tissues using the natural DNA transfer system of Agrobacterium tumefaciens and Agrobacterium rhizogenes bacteria (for review, see Newell, Mol.
  • the first method is co-cultivation of Agrobacterium with cultured isolated protoplasts. This method requires an established culture system that allows culturing protoplasts and plant regeneration from cultured protoplasts.
  • the second method is transformation of cells or tissues with Agrobacterium. This method requires (a) that the plant cells or tissues can be transformed by Agrobacterium and (b) that the transformed cells or tissues can be induced to regenerate into whole plants.
  • the third method is transformation of seeds, apices or meristems with Agrobacterium. This method requires micropropagation. The efficiency of transformation by Agrobacterium may be enhanced by using a number of methods known in the art.
  • a natural wound response molecule such as acetosyringone (AS)
  • transformation efficiency may be enhanced by wounding the target tissue to be transformed. Wounding of plant tissue may be achieved, for example, by punching, maceration, bombardment with microprojectiles, etc. (See e.g., Bidney et al, (1992) Plant Molec. Biol. 18:301-313).
  • the plant cells are transfected with vectors via particle bombardment (i. e.
  • Particle mediated gene transfer methods are known in the art, are commercially available, and include, but are not limited to, the gas driven gene delivery instrument described in McCabe, U.S. Pat. No. 5,584,807, the entire contents of which are herein incorporated by reference. This method involves coating the nucleic acid sequence of interest onto heavy metal particles, and accelerating the coated particles under the pressure of compressed gas for delivery to the target tissue.
  • Other particle bombardment methods are also available for the introduction of heterologous nucleic acid sequences into plant cells.
  • these methods involve depositing the nucleic acid sequence of interest upon the surface of small, dense particles of a material such as gold, platinum, or tungsten.
  • the coated particles are themselves then coated onto either a rigid surface, such as a metal plate, or onto a carrier sheet made of a fragile material such as MYLAR (E. I. du Pont De Nemours and Co. Corp., Wilmington, Delaware, USA).
  • the coated sheet is then accelerated toward the target biological tissue.
  • the use of the flat sheet generates a uniform spread of accelerated particles that maximizes the number of cells receiving particles under uniform conditions, resulting in the introduction of the nucleic acid sample into the target tissue.
  • An insect system may also be used to express polypeptides (for example, a polypeptide encoded by a nucleic acid identified using the methods of the present invention).
  • Autographa californica nuclear polyhedrosis virus (AcNPV) is used as a vector to express foreign genes in Spodoptera frugiperda cells or in Trichoplusia larvae.
  • the sequences encoding a polypeptide of interest may be cloned into a non-essential region of the virus, such as the polyhedrin gene, and placed under control of the polyhedrin promoter.
  • transcription enhancers, such as the Rous sarcoma virus (RSV) enhancer, may be used to increase expression in mammalian host cells.
  • Specific initiation signals may also be used to achieve more efficient translation of sequences encoding the polypeptide of interest. Such signals include the ATG initiation codon and adjacent sequences. In cases where sequences encoding the polypeptide of interest, its initiation codon, and upstream sequences are inserted into the appropriate expression vector, no additional transcriptional or translational control signals may be needed. However, in cases where only coding sequence, or a portion thereof, is inserted, exogenous translational control signals including the ATG initiation codon should be provided. Furthermore, the initiation codon should be in the correct reading frame to ensure translation of the entire insert. Exogenous translational elements and initiation codons may be of various origins, both natural and synthetic. The efficiency of expression may be enhanced by the inclusion of enhancers that are appropriate for the particular cell system that is used, such as those described in the literature (Scharf et al, Results Probl. Cell Differ., 20:125 [1994]).
  • a host cell strain may be chosen for its ability to modulate the expression of the inserted sequences or to process the expressed protein in the desired fashion.
  • modifications of the polypeptide include, but are not limited to, acetylation, carboxylation, glycosylation, phosphorylation, lipidation, and acylation.
  • Post-translational processing that cleaves a "prepro" form of the protein may also be used to facilitate correct insertion, folding and/or function.
  • Different host cells such as CHO, HeLa, MDCK, HEK293, and WI38, that have specific cellular machinery and characteristic mechanisms for such post-translational activities, may be chosen to ensure the correct modification and processing of the foreign protein.
  • cell lines that stably express the polypeptide of interest may be transformed using expression vectors that may contain viral origins of replication and/or endogenous expression elements and a selectable marker gene on the same or on a separate vector. Following the introduction of the vector, cells may be allowed to grow for 1-2 days in an enriched media before they are switched to selective media.
  • the purpose of the selectable marker is to confer resistance to selection, and its presence allows growth and recovery of cells that successfully express the introduced sequences.
  • Resistant clones of stably transformed cells may be proliferated using tissue culture techniques appropriate to the cell type.
  • Any number of selection systems may be used to recover transformed cell lines. These include, but are not limited to, the herpes simplex virus thymidine kinase (Wigler et al., Cell 11:223 [1977]) and adenine phosphoribosyltransferase (Lowy et al., Cell
  • genes that can be employed in tk- or aprt- cells, respectively.
  • antimetabolite, antibiotic, or herbicide resistance can be used as the basis for selection; for example, dhfr, which confers resistance to methotrexate (Wigler et al, Proc. Natl. Acad. Sci., 77:3567 [1980]); npt, which confers resistance to the aminoglycosides neomycin and G-418 (Colbere-Garapin et al, J. Mol.
  • Although marker gene expression suggests that the gene of interest is also present, its presence and expression may need to be confirmed.
  • If the sequence encoding a polypeptide is inserted within a marker gene sequence, recombinant cells containing sequences encoding the polypeptide can be identified by the absence of marker gene function.
  • a marker gene can be placed in tandem with a sequence encoding the polypeptide under the control of a single promoter. Expression of the marker gene in response to induction or selection usually indicates expression of the tandem gene as well.
  • host cells that contain the nucleic acid sequence encoding the polypeptide of interest may be identified by a variety of procedures known to those of skill in the art. These procedures include, but are not limited to, DNA-DNA or DNA-RNA hybridizations and protein bioassay or immunoassay techniques that include membrane, solution, or chip based technologies for the detection and/or quantification of nucleic acid or protein.
  • polynucleotide sequences encoding a polypeptide of interest can be detected by DNA-DNA or DNA-RNA hybridization or amplification using probes or portions or fragments of polynucleotides encoding the polypeptide.
  • Nucleic acid amplification based assays involve the use of oligonucleotides or oligomers based on the sequences encoding the polypeptide to detect transformants containing DNA or RNA encoding the polypeptide.
  • "oligonucleotides" or "oligomers" refer to a nucleic acid sequence of at least about 10 nucleotides and as many as about 60 nucleotides, preferably about 15 to 30 nucleotides, and more preferably about 20-25 nucleotides, that can be used as a probe or amplimer.
  • a variety of protocols for detecting and measuring the expression of a polypeptide (for example, a polypeptide encoded by a nucleic acid identified using the methods of the present invention), using either polyclonal or monoclonal antibodies specific for the protein, are known in the art. Examples include enzyme-linked immunosorbent assay (ELISA), radioimmunoassay (RIA), and fluorescence activated cell sorting (FACS).
  • a two-site, monoclonal-based immunoassay utilizing monoclonal antibodies reactive to two non-interfering epitopes on the polypeptide is preferred, but a competitive binding assay may be employed.
  • Means for producing labeled hybridization or PCR probes for detecting sequences related to polynucleotides encoding a polypeptide of interest include oligonucleotide labeling, nick translation, end-labeling or PCR amplification using a labeled nucleotide.
  • sequences encoding the polypeptide, or any portions thereof, may be cloned into a vector for the production of an mRNA probe.
  • RNA polymerase such as T7, T3, or SP6 and labeled nucleotides.
  • Suitable reporter molecules or labels include radionuclides, enzymes, fluorescent, chemiluminescent, or chromogenic agents as well as substrates, cofactors, inhibitors, magnetic particles, and the like.
  • Host cells transformed with nucleotide sequences encoding a polypeptide of interest may be cultured under conditions suitable for the expression and recovery of the protein from cell culture.
  • the protein produced by a recombinant cell may be secreted or contained intracellularly depending on the sequence and/or the vector used.
  • expression vectors containing polynucleotides that encode the polypeptide of interest may be designed to contain signal sequences that direct secretion of the polypeptide through a prokaryotic or eukaryotic cell membrane.
  • purification facilitating domains include, but are not limited to, metal chelating peptides such as histidine-tryptophan modules that allow purification on immobilized metals, protein A domains that allow purification on immobilized immunoglobulin, and the domain utilized in the FLAGS extension/affinity purification system (Immunex Corp., Seattle, WA).
  • cleavable linker sequences such as those specific for Factor XA or enterokinase (available from Invitrogen, San Diego, CA) between the purification domain and the polypeptide of interest may be used to facilitate purification.
  • One such expression vector provides for expression of a fusion protein containing the polypeptide of interest and a nucleic acid encoding 6 histidine residues preceding a thioredoxin or an enterokinase cleavage site. The histidine residues facilitate purification on IMAC (immobilized metal ion affinity chromatography) as described in Porath et al, (Prot. Exp.
  • enterokinase cleavage site provides a means for purifying the polypeptide from the fusion protein.
  • a discussion of vectors that contain fusion proteins is provided in Kroll et al, (DNA Cell Biol, 12:441 [1993]).
  • fragments of the polypeptide of interest may be produced by direct peptide synthesis using solid-phase techniques (Merrifield, J. Am. Chem. Soc., 85:2149 [1963]). Protein synthesis may be performed using manual techniques or by automation. Automated synthesis may be achieved, for example, using the Applied Biosystems 431A peptide synthesizer (Perkin Elmer). Various fragments of the polypeptide may be chemically synthesized separately and combined using chemical methods to produce the full-length molecule.
  • the selected promoter is induced by appropriate means (e.g., temperature shift or chemical induction) and cells are cultured for an additional period.
  • the produced whole cells and immobilized whole cells are used directly as biocatalysts without being disrupted.
  • cells are typically harvested by centrifugation, disrupted by physical or chemical means, and the resulting crude extract retained for further purification.
  • microbial cells employed in expression of proteins can be disrupted by any convenient method, including freeze-thaw cycling, sonication, mechanical disruption, or use of cell lysing agents.
  • production is carried out using large-scale fermentation. Large-scale bioreactors for use in such production are commercially available.
  • the enzyme is purified.
  • the cells are first disrupted and fractionated before subsequent enzyme purification; disruption and fractionation methods are well known.
  • Purification methods are also well known, and include, but are not limited to, ammonium sulfate or ethanol precipitation, acid extraction, anion or cation exchange chromatography, phosphocellulose chromatography, hydrophobic interaction chromatography, affinity chromatography, hydroxylapatite chromatography and lectin chromatography.
  • protein-refolding steps can be used as necessary, in completing configuration of the mature protein.
  • high performance liquid chromatography (HPLC)
  • the present invention further provides polynucleotides having the coding sequence fused in frame to a marker sequence that allows for purification of the polypeptide of the present invention.
  • a non-limiting example of a marker sequence is a hexahistidine tag which may be supplied by a vector, preferably a pQE-9 vector, which provides for purification of the polypeptide fused to the marker in the case of a bacterial host, or, for example, the marker sequence may be a hemagglutinin (HA) tag when a mammalian host (e.g., COS-7 cells) is used.
  • the HA tag corresponds to an epitope derived from the influenza hemagglutinin protein (Wilson et al, Cell, 37:767 [1984]).
  • Nitrilases are a class of enzymes that convert nitriles to corresponding acids (Kobayashi and Shimizu (2000), Current Opinion in Chemical Biology, 4:95-102). This class of enzyme has proven industrial applications such as production of acrylic acid and chiral intermediates for high value pharmaceuticals (Effenberger and Bohme, (1994) Bioorganic & Medicinal Chemistry; 2(7):715-721).
  • nitrilases available for screening and applications are of microbial origin (DeSantis et al, (2002) J Am. Chem. Soc. 124(31):9024-5). Plants are rich sources of nitrilases, but there is no method available to screen plant nitrilases, rapidly build a nitrilase-library and then screen the library for applications.
  • Nitrilases from Arabidopsis thaliana can be cloned into GENEWARE and the infected leaves can be readily assayed for nitrilase activity using an appropriate substrate, even if the proteins are not expressed at high levels and cannot be detected in SDS PAGE gels. This is the first report of directly measuring nitrilase activity in plants using a substrate of commercial interest.
  • These nitrilases can be functionally cloned in E. coli for production of the biocatalyst in large quantities by fermentation, to develop industrial processes.
  • Plant genomes, genomes of other organisms, or gene homologs from plants and other organisms can be screened using GENEWARE for reactions of interest. The identified genes can then be cloned in E. coli for production and process development.
  • An overall exemplary method for utilizing the methods of the present invention for discovery of plant enzymes for industrial applications is outlined in Figure 5.
  • M9 salts solution (1 L) contained Na2HPO4 (6 g), KH2PO4 (3 g), NH4Cl (1 g), and NaCl (0.5 g).
  • the Partial-Deu-Medium (1 L) consisted of 990 mL of M9 salts, MgSO4 (0.12 g), and 10 mL of BIO-EXPRESS 1000 (U-D, 98%, 10x concentrated) purchased from Cambridge Isotope Laboratory, Inc.
  • Terrific Broth (1 L) consisted of bacto-tryptone (12 g), bacto-yeast (24 g), and glycerol (4 mL).
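The per-litre recipes above scale linearly with volume. The following is a minimal sketch of that arithmetic, assuming simple proportional scaling; the function name `scale_recipe` and the dictionary layout are illustrative and not part of the described method.

```python
# Per-litre M9 salts recipe as stated in the text (grams per 1 L).
M9_SALTS_PER_L = {"Na2HPO4": 6.0, "KH2PO4": 3.0, "NH4Cl": 1.0, "NaCl": 0.5}


def scale_recipe(recipe_per_litre, volume_ml):
    """Scale a per-litre recipe to the requested volume (hypothetical helper)."""
    factor = volume_ml / 1000.0
    return {name: round(amount * factor, 3)
            for name, amount in recipe_per_litre.items()}


# Amounts needed for 250 mL of M9 salts solution.
m9_250 = scale_recipe(M9_SALTS_PER_L, 250)
for component, grams in m9_250.items():
    print(f"{component}: {grams} g")
```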
  • Arabidopsis thaliana cDNA library (Invitrogen, cat. # 11474-0112) was used for cloning of A. thaliana nitrilase genes NIT1, NIT2, NIT3, and NIT4.
  • the mRNA was obtained from total seedlings at the third flower-stage, 42 days post germination.
  • the library was prepared in the pUC derived pSPORT-P eukaryotic expression vector and the average insert size was 1.3 kb.
  • the titer of the library was 5 x 10^9 cfu/mL.
  • U09959, U09961 were designed to introduce EcoRI and BamHI sites at the 5' and 3' ends, respectively, of the nitrilase gene for cloning into the pGEM3Z plasmid.
  • the primers for the NIT1 gene were 5'-CCCGAATTCATGGGAGCCATAGAGAAGG-3' (SEQ ID NO:1) (Forward) and 5'-CGCGGATCCTTACTATTTGTTTGAGTCATC-3' (SEQ ID NO:2) (Reverse).
  • the primers for the NIT2 gene were 5'-CCCGAATTCATGTCAACTTCAGAAAACAC-3' (SEQ ID NO:3) (Forward) and 5'-CGCGGATCCTTACTTGTTTGAGTCATCTTC-3' (SEQ ID NO:4) (Reverse).
  • the primers for the NIT3 ORF were 5'-CCCGAATTCATGTCTAGTACTGAAGAAATG-3' (SEQ ID NO:5) (Forward) and 5'-CGCGGATCCCTATTTGTTTGATTCATCCTC-3' (SEQ ID NO:6) (Reverse).
  • the primers for the NIT4 ORF were 5'-CCCGAATTCATGTCCATGCAACAAGAAACG-3' (SEQ ID NO:7) (Forward) and 5'-CGCGGATCCTTAGACGGATTCATCTTCC-3' (SEQ ID NO:8) (Reverse).
  • the restriction enzyme cleavage sites in the primers are GAATTC (EcoRI, forward primers) and GGATCC (BamHI, reverse primers).
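As an illustration of how a restriction site is added to a gene-specific primer, the sketch below concatenates a short 5' leader, the recognition sequence, and the gene-specific portion. The recognition sequences are the standard ones for these enzymes; the helper names and the "CCC" leader used in the example are assumptions made for illustration, not a statement of how the primers were actually designed.

```python
# Standard recognition sequences for the enzymes named in the text.
RESTRICTION_SITES = {
    "EcoRI": "GAATTC",
    "BamHI": "GGATCC",
    "PacI": "TTAATTAA",
    "NotI": "GCGGCCGC",
}

_COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return "".join(_COMPLEMENT[base] for base in reversed(seq.upper()))


def build_primer(leader, enzyme, gene_specific):
    """Join 5' leader + restriction site + gene-specific sequence (5'->3')."""
    return leader.upper() + RESTRICTION_SITES[enzyme] + gene_specific.upper()


def contains_site(primer, enzyme):
    """Check that the primer carries the enzyme's recognition sequence."""
    return RESTRICTION_SITES[enzyme] in primer.upper()


# Hypothetical example: EcoRI-tagged forward primer for an ORF whose
# first 20 nucleotides are ATGTCAACTTCAGAAAACAC.
forward = build_primer("CCC", "EcoRI", "ATGTCAACTTCAGAAAACAC")
print(forward)
```

Because EcoRI, BamHI, PacI, and NotI sites are palindromic, `reverse_complement` returns each site unchanged, which is why the same recognition sequence can be written directly into a reverse primer.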
  • the Quick PCR System (Promega) was used for amplification of individual nitrilase genes with Tfu polymerase.
  • Each PCR reaction contained 50 pM of appropriate primers and 200 ng of template plasmid DNA from the A. thaliana cDNA library. The final volume of each reaction was 50 µl.
  • the following program was used to perform the PCR reaction: one cycle of 94°C for 2 min, followed by 50 cycles of denaturation at 94°C for 30 sec, annealing at 68°C for 1 min, and amplification at 68°C for 2 min. An additional amplification step at 68°C for 7 min was added at the end of the reaction.
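The cycling program above can be written down as data and its nominal run time summed. This is a minimal sketch that counts only the programmed hold times (thermal ramping between steps is ignored, which is an assumption); the variable and function names are illustrative.

```python
# Thermocycler program as described in the text: (temperature in C, seconds).
INITIAL_DENATURATION = (94, 120)
CYCLE_STEPS = [(94, 30), (68, 60), (68, 120)]  # denature, anneal, extend
CYCLE_COUNT = 50
FINAL_EXTENSION = (68, 420)


def program_seconds(initial, cycle_steps, cycle_count, final):
    """Sum the programmed hold times, ignoring ramp times between steps."""
    per_cycle = sum(seconds for _, seconds in cycle_steps)
    return initial[1] + cycle_count * per_cycle + final[1]


total_seconds = program_seconds(
    INITIAL_DENATURATION, CYCLE_STEPS, CYCLE_COUNT, FINAL_EXTENSION
)
total_minutes = total_seconds / 60
print(f"Programmed hold time: {total_minutes:.0f} min")
```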
  • Products obtained from the above-mentioned PCR regime were resolved on 1.2% agarose gels and purified using the QIAEX II Agarose Gel Extraction Kit (Qiagen) according to the manufacturer's directions.
  • the purified DNAs were then used for a second round of PCR amplification with the GENEWARE specific primers using similar conditions as described above.
  • the A.t.NIT1 ORF was amplified directly from the plasmid DNA using the GENEWARE specific primers (see below).
  • the primers for the second round of amplification were designed to amplify the A.t.NIT2, A.t.NIT3, and A.t.NIT4 genes and add PacI and NotI restriction sites at the 5' and 3' ends of each gene.
  • the primers for the A.t.NIT2 gene were 5'cattaattaaATGTCAACTTCAGAAAACAC3' (SEQ ID NO:9) (Forward) and 5'ggcggccgcTTACTTGTTTGAGTCATCTTC3' (SEQ ID NO:10) (Reverse).
  • the primers for the A.t.NIT3 gene were 5'cattaattaaATGTCTAGTACTGAAGAAAT3' (SEQ ID NO:11) (Forward) and 5'ggcggccgcCTATTTGTTTGATTCATCCTC3' (SEQ ID NO:12) (Reverse).
  • the primers for the A.t.NIT4 gene were 5'cattaattaaATGTCCATGCAACAAGAAAC3'
  • the vector fragment of appropriate size (7,169 bp) was gel purified using the QIAEX II Agarose Gel Extraction Kit (Qiagen).
  • the ligation reactions were performed using 200 ng of vector fragment at a 1:3 vector:insert ratio at 15°C overnight. 5 µl of each ligation reaction was transformed into TOP10 chemically competent cells (Invitrogen) according to the manufacturer's protocol. 100 µl of cells from each transformation reaction were spread on LB plates containing 100 µg/mL ampicillin and incubated at 37°C overnight.
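For a molar vector:insert ratio, the insert mass follows from the fragment lengths. The sketch below assumes the 1:3 ratio is molar and uses an illustrative insert length of 1,000 bp (the text gives the vector fragment as 7,169 bp but does not state the insert length at this point); the function name is hypothetical.

```python
def insert_mass_ng(vector_ng, vector_bp, insert_bp, insert_per_vector=3.0):
    """Mass of insert giving the requested molar insert:vector ratio.

    Moles are proportional to mass / length, so
    insert_ng = vector_ng * (insert_bp / vector_bp) * ratio.
    """
    return vector_ng * (insert_bp / vector_bp) * insert_per_vector


# 200 ng of the 7,169 bp vector fragment; the 1,000 bp insert length is an
# assumed, illustrative value.
needed_ng = insert_mass_ng(vector_ng=200, vector_bp=7169, insert_bp=1000)
print(f"Insert required: {needed_ng:.1f} ng")
```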
  • the positive clones were identified by colony PCR performed using the following procedure: individual colonies were picked using a pipette tip and placed directly into separate tubes containing 10 µl of TE buffer.
  • Plasmid DNA was isolated from clones containing the appropriate full-length insert using the Plasmid DNA Mini Kit (Qiagen) and the presence of correct insert was verified by restriction digestion. Restriction enzymes were purchased from New England Biolabs and reactions were carried out according to the manufacturer's recommendations.
  • DNA Sequencing. The recombinant DNA constructs containing the A.t.NIT2 and A.t.NIT4 genes in the GENEWARE vector 1057 were sequenced to ascertain whether the inserted foreign genes had undergone any deleterious mutations rendering them inactive. Only the regions of DNA consisting of the A.t.NIT2 and A.t.NIT4 ORFs were sequenced, using DNA primers homologous to vector sequences located immediately upstream of the unique PacI restriction enzyme site (5'-TGATGATTCGGAGGCTACTG-3'; SEQ ID NO:15), and downstream of the NotI site (5'-ACCACGTGTGATTACGGACA-3'; SEQ ID NO:16).
  • AMPLICAP T7 High Yield Message Maker Kit (Epicentre) was used to prepare capped RNA. The reactions were performed as suggested by the manufacturer, with an additional 2 µl of 2 mM GTP added to each sample to enrich for full-length capped transcripts. To verify the presence of RNA transcript, 1 µl of each sample was run on a 1.2% agarose/1x TBE buffer gel and stained with ethidium bromide to visualize the RNA.
  • Nicotiana benthamiana plants were grown in a growth chamber at 25°C with a 16 hour photoperiod. Plants were inoculated with GENEWARE RNA containing no insert (p1057, negative control), GENEWARE RNA containing the NIT2 (p2.2LR) and NIT4 insert (p4.8DBl), and a non-infectious unrelated RNA (ctrl2) at 4 weeks post germination. A lower leaf of individual plants was lightly dusted with carborundum and the whole transcription reaction (approximately 20 µL) was applied on the leaf surface and spread by applying light pressure with a finger. Gloves were used and changed between samples. Inoculated plants were observed for symptom development.
  • Samples were collected at 4 and 7 days post inoculation. At 4 days post inoculation about 1/3 to 1/2 of the inoculated leaf was removed with a razor blade and assayed immediately for nitrilase activity. A new razor blade was used for each sample. At 7 days post inoculation, the remaining part of the inoculated leaf was collected and assayed for activity.
  • the reaction was initiated by adding hydrocinnamonitrile (5 mM) and carried on in a 26°C shaker with agitation at 150 rpm. Samples at different time points were collected by directly pipetting the supernatant out of the reaction tube and stored at -20°C before analysis by NMR.
  • the biotransformation was initiated by adding the substrate hydrocinnamonitrile (2 mM). Samples (1 mL) were taken at timed intervals and centrifuged. The supernatants were transferred into a new Eppendorf tube and stored at -20°C before analysis by NMR.
  • Proteins were extracted from equal amounts by weight of green leaf tissue for every sample. The leaves were frozen under liquid nitrogen and ground to a fine powder using a mortar and a pestle. The powders were suspended in 100 µL of an extraction buffer containing 15 mM Tris-HCl, pH 7.5, 10 mM NaCl, 0.25 mM EDTA and 1 mM 2-mercaptoethanol. The samples were incubated on ice for 5 min and then spun down at full speed in a micro-centrifuge for 5 min. The supernatants were collected and further purified on a polyacrylamide copolymer column (Bio-Spin 6 Tris column, Bio-Rad) to exclude small molecules. The columns were previously washed three times with the extraction buffer. The proteins were resolved on a 4-12% Bis-Tris polyacrylamide gel (NuPAGE, Invitrogen Life Technologies) at 150 V for 1.5 h. Proteins were subsequently visualized with Coomassie stain.
  • the 3-indoleacetonitrile was used to mimic the substrate for nitrilases.
  • the tubes were incubated at room temperature with mild agitation. NMR samples were taken at 0 and 4 hour time points so that comparison of the spectra could reveal the "leakage" of the proteins and metabolites from the leaf to the supernatant.
  • the other method employed a "crude extract", obtained by grinding the leaf in liquid nitrogen. 260 mg of crude extract was mixed with 4 mL of Partial-Deu-Medium containing 3 mM 3-indoleacetonitrile. NMR samples were taken at 0 and 4 hours after separating the insoluble solid from the supernatant by centrifugation. Results are shown in Figure 6.
  • the nitrilase genes were amplified from the A. thaliana cDNA library DNA as described above.
  • the products obtained after the first round of PCR were not sufficient to be cloned into a cloning vector. This indicates that the nitrilase gene products were not highly represented in the cDNA library due to low levels of expression.
  • the purified products from the first round of PCR were then used as templates for another PCR reaction with primers that contain the unique PacI and NotI restriction enzyme sites for assistance in cloning of the PCR products directly into the GENEWARE vector.
  • the PCR products were examined on an agarose gel and found to be of the expected size.
  • the A.t.NIT1 gene was amplified from an A. thaliana cDNA library by PCR using the following primers: 5'-gccttaattaaATGTCTAGTACTAAAGATATGTC-3' (forward)
  • the ligated product was transformed into four different E. coli cell lines, namely DH5α, EC300, JM109, and BL21, and the transformed cells were grown either at 37°C or at 30°C, and then tested for the presence of the correct construct. However, again, none of the tested samples was found to contain a desired clone.
  • GENEWARE vector DNA containing A. thaliana nitrilase genes: constructs containing only A.t.NIT2 and A.t.NIT4 were recovered. Digestion of the GENEWARE vector simultaneously with EcoRI and NotI results in the following fragments: 0.3 kb, 1.6 kb, 2.1 kb, 2.4 kb, and 3.9 kb. Since neither of these enzymes cuts within the A.t.NIT2 or A.t.NIT4 sequences, GENEWARE clones containing A.t.NIT2 and A.t.NIT4 would have 2.5-kb and 2.6-kb fragments, respectively, instead of the vector 2.1-kb piece, with the other restriction fragments remaining the same. When these DNAs were digested accordingly, the pattern detailed above was observed, suggesting that the constructs do contain a full-length copy of the desired genes.
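The diagnostic-digest reasoning above amounts to swapping one vector fragment for a larger recombinant fragment and comparing the result with the bands seen on the gel. A minimal sketch follows, using the fragment sizes stated in the text; the helper names, the example "observed" band sizes, and the 0.1 kb matching tolerance are illustrative assumptions.

```python
# EcoRI + NotI fragments of the empty vector, in kb, as stated in the text.
VECTOR_FRAGMENTS_KB = [0.3, 1.6, 2.1, 2.4, 3.9]


def expected_fragments(vector_fragments_kb, replaced_kb, recombinant_kb):
    """Replace the insert-bearing vector fragment with the recombinant one."""
    fragments = list(vector_fragments_kb)
    fragments.remove(replaced_kb)  # raises ValueError if the fragment is absent
    fragments.append(recombinant_kb)
    return sorted(fragments)


def pattern_matches(observed_kb, expected_kb, tolerance_kb=0.1):
    """Compare two band patterns size by size within an assumed tolerance."""
    if len(observed_kb) != len(expected_kb):
        return False
    pairs = zip(sorted(observed_kb), sorted(expected_kb))
    return all(abs(obs - exp) <= tolerance_kb for obs, exp in pairs)


nit2_pattern = expected_fragments(VECTOR_FRAGMENTS_KB, 2.1, 2.5)
nit4_pattern = expected_fragments(VECTOR_FRAGMENTS_KB, 2.1, 2.6)

# Hypothetical band sizes read from a gel, for illustration only.
observed = [3.9, 2.45, 2.4, 1.6, 0.3]
print(pattern_matches(observed, nit2_pattern))
```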
  • the sequences of the A.t.NIT2 and A.t.NIT4 ORFs cloned in the GENEWARE vectors differed slightly from their published original sequences (A.t.NIT2: GenBank accession X68305; A.t.NIT4: GenBank accession U09961).
  • the A.t.NIT2 cloned ORF differed by four nucleotides from its corresponding published sequence (Figure 2A). These differences, however, did not result in any change in the translated protein sequence.
  • the A.t.NIT4 cloned ORF differed from its published sequence by two nucleotides that resulted in the substitution of a serine with a proline in the corresponding amino acid sequence (Figure 2B).
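Whether a nucleotide difference changes the protein can be checked codon by codon against the standard genetic code. The sketch below classifies each differing codon as silent or missense; the two short sequences are toy examples invented for illustration and are not the NIT2 or NIT4 sequences, and the function name is hypothetical.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_differences(reference, cloned):
    """List codons that differ between two in-frame ORFs of equal length.

    Returns tuples of (codon number, reference codon, cloned codon,
    reference residue, cloned residue, "silent" or "missense").
    """
    reference, cloned = reference.upper(), cloned.upper()
    if len(reference) != len(cloned) or len(reference) % 3 != 0:
        raise ValueError("sequences must be in frame and of equal length")
    differences = []
    for start in range(0, len(reference), 3):
        ref_codon = reference[start:start + 3]
        new_codon = cloned[start:start + 3]
        if ref_codon == new_codon:
            continue
        ref_aa, new_aa = CODON_TABLE[ref_codon], CODON_TABLE[new_codon]
        kind = "silent" if ref_aa == new_aa else "missense"
        differences.append(
            (start // 3 + 1, ref_codon, new_codon, ref_aa, new_aa, kind)
        )
    return differences


# Toy sequences: one serine-to-proline change and one silent change.
toy_differences = classify_differences("ATGTCTAAA", "ATGCCTAAG")
for entry in toy_differences:
    print(entry)
```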
  • benthamiana plants with this specific virus can often result in a rather benign infection that does not inhibit growth, and hence will probably not be an impediment in the process of detection and recovery of heterologous proteins being expressed through this process.
  • Leaf samples from these plants were taken at various points of time for analysis of nitrilase activity.
  • the leaves collected at 7 days post inoculation were also subjected to SDS PAGE analysis for protein profile. There was no difference observed in the protein profiles of leaf tissue obtained from plants that were infected with unrelated RNA and those infected with p2.2LR RNA. This result indicates that despite the detection of high levels of enzyme activity, the level of heterologous A.t.NIT2 protein expression remains at a level undetectable by Coomassie staining. This suggests that the nitrilase enzyme encoded by the A.t.NIT1 ORF is fairly stable and efficient and/or the method of detection of enzyme activity employed is very sensitive.
  • the protein profile of leaf tissue infected with the empty vector p1057 RNA also does not show any difference from the one infected with unrelated RNA. This result suggests that the expression level of the vector (TMV) itself was probably low and hence undetectable by protein staining. Thus, the low expression of the heterologous protein is likely due to low levels of expression of the vector itself.
  • the purpose of this study is to enable rapid screening of plant genes using a plant host.
  • the gene of interest can then be cloned and expressed in traditional microbial hosts such as E. coli.
  • the NIT2 gene from A. thaliana was also cloned into expression vector pET21b(+).
  • the vector pET21b(+) contains a T7lac promoter, which includes a 25 bp lac operator sequence immediately downstream from the T7 promoter.
  • the vector also contains a lacI gene, which produces lac repressor.
  • Binding of the lac repressor at the lac operator site can effectively reduce transcription by T7 RNA polymerase, thus providing a second lacI-based mechanism (besides the repression at lacUV5 in the genome) to suppress basal expression in the E. coli BL21(DE3) host.
  • the vector also contains an RBS site to ensure translation.
  • the ORF of the A.t.NIT2 was amplified using primers containing NdeI and BamHI as 5' and 3' linkers, respectively. Cloning of the resulting 1.0-kb PCR fragment into the NdeI and BamHI sites of pET21b(+) afforded the plasmid pSyngene12.
  • the plasmid pSyngene12 was transformed into E. coli BL21(DE3).
  • the strain BL21(DE3)/pSyngene12 was cultivated and induced as described in the Experimental Section. After the cells were harvested by centrifugation, the cells were rinsed once with the Partial-Deu-Medium to wash away any residual protonated compounds on the cell surface that can increase the noise level in the NMR spectrum.
  • the pellet was resuspended in 5 mL of Partial-Deu-Medium with ampicillin (100 µg/mL) added.
  • the biotransformation was initiated by adding 2 mM hydrocinnamonitrile. The reaction was carried out at 30°C with agitation at 250 rpm for 24 h, and subsequently analyzed by proton NMR.
  • As shown in Figure 8, a small amount of hydrocinnamic acid [δ 2.89 (triplet, 2H), 2.48 (triplet, 2H)] was produced after a 20 hour reaction time.
  • the titers of the substrate and the product were 1.70 mM and 0.04 mM, respectively, indicating a conversion of approximately 2%.
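The conversion figure can be reproduced from the two titers. The sketch below assumes a simple mass balance in which product and remaining substrate are the only species counted; the function name is illustrative.

```python
def percent_conversion(substrate_mm, product_mm):
    """Conversion as product / (remaining substrate + product), in percent."""
    return 100.0 * product_mm / (substrate_mm + product_mm)


# Titers reported in the text: 1.70 mM substrate, 0.04 mM product.
conversion = percent_conversion(substrate_mm=1.70, product_mm=0.04)
print(f"Conversion: {conversion:.1f}%")
```

Calculated instead against the 2 mM of hydrocinnamonitrile initially added, 0.04 mM of product corresponds to 2.0%, so both bases agree with the stated value of approximately 2%.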
  • the strain BL21(DE3) that was cultivated in the same condition did not show any product formation.
  • the formation of the hydrocinnamic acid demonstrates the activity of the A.t.NIT2 in the recombinant E. coli host.
  • the dehalogenase gene (dhl) is amplified by PCR from the Rhodococcus rhodochrous TDTM003 total DNA.
  • the inclusion of PacI and NotI sites at the 5' and 3' ends of the dehalogenase gene facilitates the cloning of the gene into pGEM3Z plasmid.
  • the amplified product is purified and digested with the PacI and NotI enzymes and ligated with the PacI and NotI digested vector p1057.
  • the ligated product is transformed into E. coli cells and ampicillin-resistant colonies are tested for the presence of the desired clone using restriction enzyme digestions.
  • the ORF of the cloned dhl is subjected to sequencing.
  • the sequence of the dhl is compared with the published sequence shown in Figure 3 (Kulakova et al., Microbiology, 1997, 143, 109). Inoculation of N. benthamiana plants
  • RNA is transcribed in vitro using the GENEWARE vector p1057, the dhl construct, and an unrelated DNA as templates (ctrl2). Nicotiana benthamiana plants are grown in a growth chamber at 25°C with a 16 hour photoperiod. Plants are inoculated with GENEWARE RNA containing no insert (p1057, negative control), GENEWARE RNA containing the dhl, and a non-infectious unrelated RNA (ctrl2) at 4 weeks post germination. A lower leaf of individual plants is lightly dusted with carborundum and the whole transcription reaction is applied on the leaf surface and spread by applying light pressure with a finger. Gloves are used and changed between samples.
  • Inoculated plants are observed for symptom development.
  • Samples (about 100 mg to 800 mg) are collected at 4 and 7 days post inoculation. At 4 days post inoculation about 1/3 to 1/2 of the inoculated leaf is removed with a razor blade and assayed immediately for dehalogenase activity. A new razor blade is used for each sample. At 7 days post inoculation the remaining part of the inoculated leaf is collected and assayed for activity.
  • Each leaf is chopped into 20 to 30 pieces in a plastic weighing boat using a razor blade so that the resulting small pieces of the leaf can fit into a 14-mL FALCON polypropylene round-bottom tube.
  • 6 mL of a partial-deu medium (see Example 1) is added.
  • The tube is incubated at 30°C with agitation at 250 rpm for 2 hours.
  • The samples are centrifuged at 4000 rpm for 10 min and the supernatant fluid is discarded.
  • A fresh 6 mL of the medium is added to resuspend the leaves.
  • The reaction is initiated by adding 1,2,3-trichloropropane (Gray et al., (2001) Adv. Synth.
  • The CAL-B gene is first amplified by PCR from a cDNA library of Candida antarctica.
  • The inclusion of PacI and NotI sites at the 5' and 3' ends of the lipase gene facilitates cloning of the gene into the pGEM3Z plasmid.
  • The amplified product is purified, digested with the PacI and NotI enzymes, and ligated with the PacI- and NotI-digested vector p1057.
  • The ligated product is transformed into E. coli cells, and ampicillin-resistant colonies are tested for the presence of the desired clone using restriction digestions.
  • The ORF of the cloned CAL-B is subjected to sequencing.
  • The sequence of the CAL-B is compared to the published sequence shown in Figure 4 (Uppenberg et al., Structure 2:293 [1994]).
  • RNA is transcribed in vitro using the GENEWARE vector p1057, the CAL-B construct, and an unrelated DNA (ctrl2) as templates. Nicotiana benthamiana plants are grown in a growth chamber at 25°C with a 16 hour photoperiod. At 4 weeks post germination, plants are inoculated with GENEWARE RNA containing no insert (p1057, negative control), GENEWARE RNA containing the CAL-B, or a non-infectious unrelated RNA (ctrl2). A lower leaf of each plant is lightly dusted with carborundum, and the whole transcription reaction is applied to the leaf surface and spread by applying light pressure with a finger. Gloves are used and changed between samples.
  • Inoculated plants are observed for symptom development.
  • Samples (about 100 mg to 800 mg) are collected at 4 and 7 days post inoculation.
  • At 4 days post inoculation, about 1/3 to 1/2 of the inoculated leaf is removed with a razor blade and assayed immediately for lipase activity. A new razor blade is used for each sample.
  • At 7 days post inoculation, the remaining part of the inoculated leaf is collected and assayed for activity.
  • The reaction is initiated by adding either 1,2,3 bis(trifluoromethyl)-alkanediol diacetate (5 mM), as shown, with 10% acetone to increase the reaction rate (Itoh et al., Tetrahedron Letters, 1996, 37(28), 5501-5502), or 2-octanol (5 mM) and S-ethyl thiooctanoate (5 mM) (Orrenius et al., Tetrahedron: Asymmetry, 6(5), 1217-1220).
  • The reaction is carried out in a 26°C shaker with agitation at 150 rpm. Samples are collected at different time points by pipetting the supernatant directly out of the reaction tube and are stored at -20°C before analysis.
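
The cloning steps above rely on PacI and NotI sites added at the ends of each gene, on digestion with both enzymes, and on comparing the sequenced ORF with a published sequence. A minimal Python sketch of the corresponding in-silico checks follows. The recognition sequences (PacI: TTAATTAA, NotI: GCGGCCGC) are standard; the function names, the 20-base annealing length, and the "GC" 5' clamp are illustrative assumptions and are not taken from the text.

```python
from __future__ import annotations

# Recognition sequences for the two enzymes named in the protocol (both palindromic).
PACI_SITE = "TTAATTAA"
NOTI_SITE = "GCGGCCGC"

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA string (A, C, G, T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def internal_sites(orf: str) -> dict[str, list[int]]:
    """Map each enzyme to the 0-based positions of its site inside the ORF.

    Any hit means the digest would also cut the insert itself.
    """
    orf = orf.upper()
    hits: dict[str, list[int]] = {}
    for name, site in (("PacI", PACI_SITE), ("NotI", NOTI_SITE)):
        positions = []
        start = orf.find(site)
        while start != -1:
            positions.append(start)
            start = orf.find(site, start + 1)
        hits[name] = positions
    return hits


def tailed_primers(orf: str, anneal_len: int = 20, clamp: str = "GC") -> tuple[str, str]:
    """Build primers with a PacI tail (5' end of gene) and a NotI tail (3' end).

    anneal_len and clamp are hypothetical defaults chosen for illustration.
    """
    orf = orf.upper()
    forward = clamp + PACI_SITE + orf[:anneal_len]
    reverse = clamp + NOTI_SITE + reverse_complement(orf[-anneal_len:])
    return forward, reverse


def mismatches(cloned: str, published: str) -> list[tuple[int, str, str]]:
    """List (position, cloned_base, published_base) where equal-length ORFs differ."""
    if len(cloned) != len(published):
        raise ValueError("sequences differ in length; align them first")
    pairs = zip(cloned.upper(), published.upper())
    return [(i, a, b) for i, (a, b) in enumerate(pairs) if a != b]
```

In use, `internal_sites` would be run on the published ORF before primers are ordered, and `mismatches` on the sequencing result afterwards. Real primer design would also need melting-temperature and secondary-structure checks, which this sketch omits.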

Abstract

The present invention relates to compositions and methods for screening for biocatalysts. More particularly, the invention relates to methods of screening for novel plant biocatalysts in plants and plant products.
PCT/US2004/012446 2003-04-25 2004-04-22 Recherche systematique de biocatalyseurs chez des plantes WO2004097027A2 (fr)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US46570103P 2003-04-25 2003-04-25
US60/465,701 2003-04-25

Publications (2)

Publication Number Publication Date
WO2004097027A2 true WO2004097027A2 (fr) 2004-11-11
WO2004097027A3 WO2004097027A3 (fr) 2006-04-20

Family

ID=33418275

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2004/012446 WO2004097027A2 (fr) 2003-04-25 2004-04-22 Recherche systematique de biocatalyseurs chez des plantes

Country Status (1)

Country Link
WO (1) WO2004097027A2 (fr)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO1998058085A1 (fr) * 1997-06-16 1998-12-23 Diversa Corporation Procede de criblage de nouvelles enzymes a haut rendement
US20030049841A1 (en) * 1997-06-16 2003-03-13 Short Jay M. High throughput or capillary-based screening for a bioactivity or biomolecule

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
BARTLING D ET AL: 'Cloning and expression of an Arabidopsis nitrilase which can convert indole-3-acetonitrile to the plant hormone, indole-3-acetic acid.' EUR J BIOCHEM. vol. 205, 1992, pages 417 - 424, XP001145402 *
FERNANDEZ-FERNANDEZ MR ET AL: 'Use of a pox potyvirus as an expression vector in plants.' MOLECULAR FARMINE, PROCEEDINGS OF THE OECD WORKSHOP. 03 September 2000 - 06 September 2000, pages 161 - 172, XP008062153 *
JALLAGEAS J C ET AL: 'Nitrilases and Amidases: Determination of Activity by Proton Magnetic Resonance Spectrometry.' ANALYTICAL BIOCHEMISTRY. vol. 95, 1979, pages 436 - 443, XP008062065 *
LAZO GR ET AL: 'A DNA transformation-competent Arabidopsis genomic library in Agrobacterium.' BIOTECHNOLOGY. vol. 9, no. 10, October 1991, pages 963 - 967, XP008062132 *

Also Published As

Publication number Publication date
WO2004097027A3 (fr) 2006-04-20

Similar Documents

Publication Publication Date Title
US6406910B1 (en) Recombination of insertion modified nucleic acids
US20060166319A1 (en) Charging tRNA with pyrrolysine
JP2004532014A (ja) チトクロームp450類及びその使用
WO2000061740A1 (fr) Production de lipides modifies
JP2003524394A (ja) ハイスループット質量分析法
CN111979163B (zh) 一种重组罗氏真氧菌及其制备方法和应用
CN113073089B (zh) 一种提高NMN生物合成酶Nampt的酶活的创新方法
US7635798B2 (en) Nucleic acid compositions conferring altered metabolic characteristics
DK2377930T3 (en) Group of esterases for the enantioselective production of fine and specialty chemicals
KR20160079865A (ko) 프테로스틸벤의 생합성 제조를 위한 o-메틸트랜스퍼라제의 사용 방법
JP2018536400A (ja) ドリメノールシンターゼiii
Xiong et al. High efficiency and throughput system in directed evolution in vitro of reporter gene
WO2004097027A2 (fr) Recherche systematique de biocatalyseurs chez des plantes
CN104293755A (zh) 一种珠子参达玛烯二醇合成酶基因及其应用
CN114085850B (zh) 水稻中一个芳香族酚胺合成基因簇的克隆及抗病方面的应用
CA2134261C (fr) Gene marqueur/de selection utilise dans la manipulation genetique des plantes et des cellules vegetales
Couch et al. Construction of expression vectors to produce affinity-tagged proteins in Pseudomonas
US20110289632A1 (en) Methylketone synthase, production of methylketones in plants and bacteria
CN104232597A (zh) 一种羽叶三七鲨烯环氧酶基因及其应用
CN104232601A (zh) 一种珠子参法呢基焦磷酸合酶基因及其应用
JP2021006049A (ja) 組換え型ヌクレオシド特異的リボヌクレアーゼ及びその生成法と使用法
CN113930407A (zh) 一种水解酶及其在生物催化合成4″-甲氨基-5′-羟基-阿维菌素中的应用
CN118048373A (zh) 提高磷酸调节子传感器激酶PhoR的表达水平的方法
EP1082441A1 (fr) N-acetyltransferases mycobacteriennes
WO1999013057A1 (fr) SYNTHETASE D'ARNt DE LYSYLE DE CLASSE-I

Legal Events

Date Code Title Description
AK Designated states

Kind code of ref document: A2

Designated state(s): AE AG AL AM AT AU AZ BA BB BG BR BW BY BZ CA CH CN CO CR CU CZ DE DK DM DZ EC EE EG ES FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MA MD MG MK MN MW MX MZ NA NI NO NZ OM PG PH PL PT RO RU SC SD SE SG SK SL SY TJ TM TN TR TT TZ UA UG US UZ VC VN YU ZA ZM ZW

AL Designated countries for regional patents

Kind code of ref document: A2

Designated state(s): BW GH GM KE LS MW MZ SD SL SZ TZ UG ZM ZW AM AZ BY KG KZ MD RU TJ TM AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HU IE IT LU MC NL PL PT RO SE SI SK TR BF BJ CF CG CI CM GA GN GQ GW ML MR NE SN TD TG

121 Ep: the epo has been informed by wipo that ep was designated in this application
DPEN Request for preliminary examination filed prior to expiration of 19th month from priority date (pct application filed from 20040101)
122 Ep: pct application non-entry in european phase