WO1999041395A1 - Methods and compositions for producing plants and microorganisms that express feedback insensitive threonine dehydratase deaminase - Google Patents

Info

Publication number
WO1999041395A1
WO1999041395A1 PCT/US1999/000560 US9900560W WO9941395A1 WO 1999041395 A1 WO1999041395 A1 WO 1999041395A1 US 9900560 W US9900560 W US 9900560W WO 9941395 A1 WO9941395 A1 WO 9941395A1
Authority
WO
WIPO (PCT)
Prior art keywords
seq
set forth
sequence set
leu
ala
Prior art date
Application number
PCT/US1999/000560
Other languages
French (fr)
Inventor
Georges S. Mourad
Donald J. Merlo
Dayakar Reddy Pareddy
Ignacio Mario Larrinua
Original Assignee
Dow Agro Sciences Llc
Purdue Research Foundation
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Priority claimed from PCT/US1998/014362 external-priority patent/WO1999002656A1/en
Application filed by Dow Agro Sciences Llc, Purdue Research Foundation filed Critical Dow Agro Sciences Llc
Priority to AU22202/99A priority Critical patent/AU2220299A/en
Publication of WO1999041395A1 publication Critical patent/WO1999041395A1/en

Classifications

    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00 Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09 Recombinant DNA-technology
    • C12N15/63 Introduction of foreign genetic material using vectors; Vectors; Use of hosts therefor; Regulation of expression
    • C12N15/79 Vectors or expression systems specially adapted for eukaryotic hosts
    • C12N15/82 Vectors or expression systems specially adapted for eukaryotic hosts for plant cells, e.g. plant artificial chromosomes (PACs)
    • C12N15/8201 Methods for introducing genetic material into plant cells, e.g. DNA, RNA, stable or transient incorporation, tissue culture methods adapted for transformation
    • C12N15/8209 Selection, visualisation of transformants, reporter constructs, e.g. antibiotic resistance markers
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00 Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09 Recombinant DNA-technology
    • C12N15/63 Introduction of foreign genetic material using vectors; Vectors; Use of hosts therefor; Regulation of expression
    • C12N15/79 Vectors or expression systems specially adapted for eukaryotic hosts
    • C12N15/82 Vectors or expression systems specially adapted for eukaryotic hosts for plant cells, e.g. plant artificial chromosomes (PACs)
    • C12N15/8241 Phenotypically and genetically modified plants via recombinant DNA technology
    • C12N15/8242 Phenotypically and genetically modified plants via recombinant DNA technology with non-agronomic quality (output) traits, e.g. for industrial processing; Value added, non-agronomic traits
    • C12N15/8243 Phenotypically and genetically modified plants via recombinant DNA technology with non-agronomic quality (output) traits, e.g. for industrial processing; Value added, non-agronomic traits involving biosynthetic or metabolic pathways, i.e. metabolic engineering, e.g. nicotine, caffeine
    • C12N15/8251 Amino acid content, e.g. synthetic storage proteins, altering amino acid biosynthesis
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00 Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09 Recombinant DNA-technology
    • C12N15/63 Introduction of foreign genetic material using vectors; Vectors; Use of hosts therefor; Regulation of expression
    • C12N15/79 Vectors or expression systems specially adapted for eukaryotic hosts
    • C12N15/82 Vectors or expression systems specially adapted for eukaryotic hosts for plant cells, e.g. plant artificial chromosomes (PACs)
    • C12N15/8241 Phenotypically and genetically modified plants via recombinant DNA technology
    • C12N15/8261 Phenotypically and genetically modified plants via recombinant DNA technology with agronomic (input) traits, e.g. crop yield
    • C12N15/8271 Phenotypically and genetically modified plants via recombinant DNA technology with agronomic (input) traits, e.g. crop yield for stress resistance, e.g. heavy metal resistance
    • C12N15/8274 Phenotypically and genetically modified plants via recombinant DNA technology with agronomic (input) traits, e.g. crop yield for stress resistance, e.g. heavy metal resistance for herbicide resistance
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N9/00 Enzymes; Proenzymes; Compositions thereof; Processes for preparing, activating, inhibiting, separating or purifying enzymes
    • C12N9/88 Lyases (4.)

Definitions

  • The present invention relates to methods and materials in the field of molecular biology and to the utilization of isolated nucleotide sequences to genetically engineer plants and/or microorganisms. More particularly, the invention relates in certain preferred aspects to novel nucleotide sequences and uses thereof, including their use in DNA constructs for transforming plants, fungi, yeast and bacteria. The nucleotide sequences are particularly useful as selectable markers for screening plants and/or microorganisms for successful transformants and also for improving the nutritional value of plants.
  • Threonine dehydratase/deaminase is the first enzyme in the biosynthetic pathway of isoleucine, and catalyzes the formation of 2-oxobutyrate from threonine ("Thr") in a two-step reaction.
  • The first step is a dehydration of Thr, followed by rehydration and liberation of ammonia. All reactions downstream from TD are catalyzed by enzymes that are shared by the two main branches of the biosynthetic pathway that lead to the production of the branched-chain amino acids isoleucine ("Ile"), leucine ("Leu"), and valine ("Val") (Figure 1).
  • The cellular levels of Ile are controlled by negative feedback inhibition. When the cellular levels of Ile are high, Ile binds to TD at a regulatory site (allosteric site) that is different from the substrate binding site (catalytic site) of the enzyme. The formation of this Ile-TD complex causes conformational changes to TD, which prevent the binding of substrate, thus inhibiting the Ile biosynthetic pathway.
  • selectable markers are widely used in methods for genetically transforming cells, tissues and organisms. Such markers are used to screen cells, most commonly bacteria, to determine whether a transformation procedure has been successful.
  • constructs for transforming a cell may include as a selectable marker a nucleotide sequence that confers antibiotic resistance to the transformed cell. After transformation, the cells may be contacted with an antibiotic in a screening procedure.
  • Antibiotic resistance in microorganisms is a growing medical concern as the efficacy of antibiotics in fighting bacterial infections is decreasing. Many infections, including meningitis, no longer respond well to drugs that once worked well against them. This phenomenon is attributed largely to the overuse of antibiotics, both as drugs and as a laboratory screening tool, and the resulting antibiotic resistance of a growing number of microorganisms. As an example, the bacterium that causes meningitis once was routinely controlled with ampicillin, a commonly prescribed antibiotic that is also very heavily used as a selectable marker in screening transformed bacterial cells for resistance. Now, however, about 20 percent of such infections are resistant to ampicillin.
  • the present invention addresses the aforementioned problems in screening genetic transformants and provides nucleotide sequences which may be advantageously used as selectable markers, and which may be inserted into the genome of a plant or microorganism to provide a transformed plant or microorganism.
  • a transformed plant or microorganism advantageously exhibits significantly increased levels of Ile synthesis and synthesis of intermediates of the Ile biosynthetic pathway and is therefore also capable of surviving in the presence of a toxic Ile analog.
  • the present invention provides nucleotide sequences, originally isolated and cloned from Arabidopsis thaliana, which encode feedback insensitive TD that may advantageously be used to transform a wide variety of plants, fungi, bacteria and yeast.
  • Inventive forms of TD are not only insensitive to feedback inhibition by isoleucine, but are also insensitive to structural analogs of isoleucine that are toxic to plants and microorganisms which synthesize only wild-type TD. Therefore, inventive nucleotide sequences encoding mutated forms of TD can be used to create cells that are
  • an inventive nucleotide sequence may be used in a DNA construct to provide a biochemical selectable marker
  • One aspect of the present invention is identification, isolation and purification of a gene encoding a wild-type form of TD.
  • the DNA sequence thereof can be used as disclosed herein to determine the complete amino acid sequence for the protein encoded thereby and thus allow identification of domains found therein that can be mutated to produce additional TD proteins having altered enzymatic characteristics.
  • there are provided isolated and purified polynucleotides, the polynucleotides encoding a mutated form of TD, or a portion thereof, as disclosed herein.
  • the invention provides isolated polynucleotides comprising the sequence set forth in SEQ ID NO: 3, SEQ ID NO: 5, SEQ ID NO: 7, SEQ ID NO: 9, SEQ ID NO: 11, SEQ ID NO: 13, SEQ ID NO: 15, SEQ ID NO: 17, SEQ ID NO: 19, SEQ ID NO: 21, SEQ ID NO: 23 and SEQ ID NO: 25, nucleotide sequences having substantial identity thereto, and nucleotide sequences encoding TD variants of the invention.
  • isolated polypeptides comprising the amino acid sequence set forth in SEQ ID NO: 4, SEQ ID NO: 6, SEQ ID NO: 8, SEQ ID NO: 10, SEQ ID NO: 12, SEQ ID NO: 14, SEQ ID NO: 16, SEQ ID NO: 18, SEQ ID NO: 20, SEQ ID NO: 22, SEQ ID NO: 24 and SEQ ID NO: 26, and variants thereof selected in accordance with the invention.
  • a chimeric DNA construct comprising a promoter operably linked to a nucleotide sequence encoding a threonine dehydratase/deaminase that is substantially resistant to feedback inhibition.
  • the nucleotide sequence can be transcribed to produce mRNA and said mRNA can be translated to produce either mature, mutated TD or a precursor mutated TD protein, said protein being functional in said cell.
  • a vector useful for transforming a cell, and plants and microorganisms transformed therewith comprising a DNA construct selected in accordance with the invention.
  • cells and plants having incorporated into their genome a foreign nucleotide sequence operably linked to a promoter, the foreign sequence comprising a nucleotide sequence having substantial identity to a sequence set forth herein or a foreign nucleotide sequence encoding an inventive polypeptide.
  • a method comprising incorporating into a plant's genome an inventive DNA construct to provide a transformed plant; wherein the transformed plant is capable of expressing the nucleotide sequence.
  • Yet another aspect of the invention is the production and propagation of cells transformed in accordance with the invention, wherein the cells express a mutated TD enzyme, thus making the cells resistant to feedback inhibition by isoleucine, and resistant to molecules that are toxic to a cell producing only the wild-type TD enzyme.
  • a method comprising providing a vector featuring a promoter operably linked to a nucleotide sequence encoding a threonine dehydratase/deaminase that is resistant to feedback inhibition, wherein the promoter regulates expression of the nucleotide sequence in a host plant cell; and transforming a target plant with the vector to provide a transformed plant, the transformed plant being capable of expressing the nucleotide sequence.
  • Plants transformed in accordance with the invention have within their chloroplasts a mature, mutated form of TD, which renders the cells resistant to toxic Ile analogs. Also provided are transformed plants obtained according to inventive methods and progeny thereof.
  • cells wherein at least one of the cells has in its genome an expressible foreign nucleotide sequence selected in accordance with the invention; and (2) contacting the plurality of cells with a substrate comprising a toxic isoleucine structural analog; wherein cells comprising the expressible foreign nucleotide sequence are capable of growing in the substrate and wherein cells not comprising the expressible foreign nucleotide sequence are incapable of growing in the substrate.
  • a construct comprising a primary nucleotide sequence to be introduced into the genome of a target cell, tissue and/or organism, and further comprising a biochemical selectable marker selected in accordance with the invention.
  • This aspect of the invention may be advantageously used to transform a wide variety of cells, including microorganisms and plant cells.
  • the plant or microorganism may be grown in a substrate comprising a toxic isoleucine analog (a "toxic substrate”), thereby providing a mechanism for the early determination whether the transformation was successful.
  • a method for reliably incorporating a first, expressible, foreign nucleotide sequence into a target cell comprising providing a vector comprising a promoter operably linked to a first primary nucleotide sequence and a second nucleotide sequence selected in accordance with the invention, the second sequence encoding an insensitive TD enzyme; transforming the target cell with the vector to provide a transformed cell; and contacting the cell with a substrate comprising L-O-methylthreonine wherein successfully transformed cells are capable of growing in the substrate, and wherein unsuccessfully transformed cells are incapable of growing in the substrate.
  • a method for growing a plurality of plants in the absence of undesirable plants comprising providing a plurality of plants, each having in its genome a foreign nucleotide sequence comprising a promoter operably linked to a nucleotide sequence selected in accordance with the invention; growing the plurality of plants in a substrate; and introducing a preselected amount of an isoleucine structural analog into the substrate.
  • TD enzymes described herein function in the chloroplasts of a plant cell. Therefore, it is readily appreciated by a skilled artisan that a nucleotide sequence inserted into a plant cell will necessarily encode a precursor TD peptide.
  • chimeric DNA constructs are described herein that comprise a first nucleotide sequence encoding a mature mutated form of TD and a second nucleotide sequence encoding a chloroplast transit peptide of choice, the second sequence being functionally attached to the 5' end of the first sequence. Expression of the chimeric DNA construct results in the production of a mutated precursor TD
  • nucleotide sequences which may be introduced into the genome of a plant or microorganism to increase the ability of the plant or microorganism to synthesize Ile and intermediates of the Ile biosynthetic pathway. Additionally, it is an object of the invention to provide nucleotide sequences which may be used as excellent biochemical selectable markers for identifying successful transformants in genetic engineering protocols. It is also an object of the invention to provide a novel, efficient, selective, environmentally-friendly herbicide system.
  • Figure 1 illustrates the biosynthetic pathway of the branched-chain amino acids valine, leucine and isoleucine.
  • Figure 2 sets forth the alignment of the amino acid sequence of TD of tomato and chickpea. C regions are highly conserved regions of the catalytic site of TD while R regions are highly conserved regions of the regulatory site of TD. Also shown are the locations of the degenerate oligonucleotide primers TD205 and TD206
  • Figure 3 sets forth the structure and degree of degeneracy of the two oligonucleotide primers TD205 and TD206 used in amplifying an Arabidopsis genomic DNA fragment of the TD gene omr1.
  • TD205 is anchored with an EcoRI site (underlined) at its 5' end and TD206 is anchored with a HindIII site (underlined) at its 5' end.
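The "degree of degeneracy" of a primer is the number of distinct sequences it represents, that is, the product of the number of bases allowed at each position. The sketch below illustrates that calculation with the standard IUPAC ambiguity codes. The primer sequence used here is hypothetical (the actual TD205 and TD206 sequences appear only in Figure 3 and are not reproduced in this text); only the EcoRI recognition site GAATTC at its 5' end is a known fact.

```python
from math import prod

# Number of bases each IUPAC nucleotide code stands for.
IUPAC_CHOICES = {
    "A": 1, "C": 1, "G": 1, "T": 1,
    "R": 2, "Y": 2, "S": 2, "W": 2, "K": 2, "M": 2,
    "B": 3, "D": 3, "H": 3, "V": 3,
    "N": 4,
}


def degeneracy(primer):
    """Return how many distinct sequences a degenerate primer represents."""
    return prod(IUPAC_CHOICES[base] for base in primer.upper())


# Hypothetical primer, NOT the real TD205: an EcoRI site (GAATTC)
# followed by an invented degenerate stretch.
ECORI_SITE = "GAATTC"
example_primer = ECORI_SITE + "GGNTAYATHCC"

# N allows 4 bases, Y allows 2, H allows 3; every other position is fixed.
example_fold = degeneracy(example_primer)
print(f"{example_primer}: {example_fold}-fold degenerate")
```

The restriction-site anchor contributes no degeneracy because each of its positions is a single fixed base.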
  • Figure 4 sets forth the DNA sequence of clone 23 (pGM-td23) isolated from a cDNA library of the mutated line GM11b (omr1/omr1) of Arabidopsis thaliana.
  • Figure 5 sets forth the nucleotide sequence and the predicted amino acid sequence of clone 23 as isolated from the cDNA library constructed from line GM11b of Arabidopsis (omr1/omr1).
  • The TD insert in clone 23 is in the pBluescript vector between the EcoRI and XhoI sites. An open reading frame (top reading frame) was observed which showed an ATG codon at nucleotide 166 and a termination codon at nucleotide 1801.
  • Figure 6a depicts the structure of the expression vector pCM35S-omr1 used in the transformation of wild-type Arabidopsis thaliana and which expressed a mutated form of TD capable of conferring resistance to the toxic analog L-O-methylthreonine upon transformants.
  • Figure 6b sets forth the nucleotide sequence and the predicted amino acid sequence of the chimeric mutant omr1 expressing resistance to L-O-methylthreonine in transgenic Arabidopsis plants that have been transformed with the expression vector pCM35S-omr1 (shown in Figure 6a).
  • The total length of the fusion (chimeric) mutant TD expressed in transgenic plants was 609 amino acid residues.
  • The first 9 amino-terminal residues start with methionine encoded by a start codon (ATG) furnished by the 3' end of the nucleotide sequence of the CaMV 35S promoter linked to the omr1 insert of clone 23.
  • The following 12 amino acid residues are generated by the nucleotide sequence of the polylinker region from the multiple cloning site of the vector, and finally the remaining 585 amino acid residues are encoded by the omr1 mutant allele of Arabidopsis as present in clone 23.
  • The first residue of the 585 amino acid long portion encoded by omr1 in pCM35S-omr1 corresponds to threonine (Thr), which is the amino-terminal residue number 8 of the full length omr1 cDNA shown in Figures 8 and 9 and SEQ ID NO: 2.
  • Figure 7 is the nucleotide sequence of the full length cDNA of the omr1 allele encoding mutated TD.
  • The total length of the cDNA of omr1 is 1779 nucleotides including the stop codon.
  • Figure 8 is the predicted amino acid sequence of the mutated TD encoded by omr1. The total length of the TD protein encoded by omr1 is 592 amino acids.
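Two of the lengths quoted in these figure descriptions can be cross-checked with simple arithmetic: a 1779-nucleotide coding sequence that includes the stop codon should encode 592 amino acids, and a fragment starting at residue 8 of a 592-residue protein should be 585 residues long. The following is a minimal sketch of that check; the function names are illustrative and not taken from the source.

```python
def protein_length_from_cds(cds_nt, includes_stop=True):
    """Number of amino acids encoded by a coding sequence of cds_nt nucleotides."""
    if cds_nt % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    codons = cds_nt // 3
    # The stop codon occupies one codon but adds no amino acid.
    return codons - 1 if includes_stop else codons


def residues_from(start_residue, full_length):
    """Count residues from start_residue (1-based, inclusive) to the C-terminus."""
    return full_length - start_residue + 1


full_length = protein_length_from_cds(1779)       # cDNA length stated for omr1
omr1_portion = residues_from(8, full_length)      # portion starting at Thr-8

print(full_length, omr1_portion)
```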
  • Figure 9 is the nucleotide sequence and the predicted amino acid sequence encoded by the mutated allele omr1 of line GM11b of Arabidopsis thaliana.
  • Figure 10 is the nucleotide sequence of the full length cDNA of the wild type allele OMR1 encoding wild type TD.
  • Figure 11 is the predicted amino acid sequence of the wild type TD encoded by OMR1.
  • Figure 12 is the nucleotide sequence and the predicted amino acid sequence encoded by the wild type allele OMR1 of Arabidopsis thaliana Columbia wild type.
  • Figure 13 sets forth the multi-alignment of the deduced amino acid sequence of the wild-type TD of Arabidopsis thaliana reported in this disclosure with that from other organisms obtained from GenBank with the following accession numbers: 940472 for chickpea; 10257 for tomato; 401179 for potato; 730940 for yeast 1; 134962 for yeast 2; 68318 for E. coli biosynthetic; 135723 for E. coli catabolic; I 174668 for Salmonella typhimurium.
  • Figure 14 is a portion of the DNA sequencing gel comparing the nucleotide sequence of the mutated omr1 allele and its wild-type allele OMR1 and showing the base
  • Figure 16 sets forth the amino acid sequence at the regulatory region R4 of TD encoded by mutated omr1 and wild type OMR1 alleles of Arabidopsis thaliana compared to that from several organisms.
  • The arrow points to the mutated amino acid residue in omr1.
  • Figure 17 is a portion of the DNA sequencing gel comparing the nucleotide sequence of the mutated omr1 allele and its wild-type allele OMR1 and showing the base substitution G (in OMR1) to A (in omr1) at nucleotide residue 1631.
  • The arrow is pointing to the base substitution.
  • Figure 18 depicts the point mutation in omr1 at nucleotide residue 1631, predicting an amino acid substitution, arginine (R) to histidine (H) at amino acid residue 544 at the TD level.
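The correspondence between nucleotide 1631 and amino acid residue 544 can be verified by mapping a coding-sequence position to its codon. The sketch below assumes that nucleotide 1 is the A of the ATG start codon, which the source does not state explicitly but which is consistent with the residue number given. It also uses the standard genetic code to list which arginine codons become histidine codons after a G-to-A change at the second codon position; the actual codon at this site is shown only in the figures and is not asserted here.

```python
# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def locate(nt_position):
    """Map a 1-based coding-sequence position to (codon number, position in codon)."""
    codon_number = (nt_position - 1) // 3 + 1
    position_in_codon = (nt_position - 1) % 3 + 1
    return codon_number, position_in_codon


# Assumption: position 1 is the A of the ATG start codon.
codon_number, position_in_codon = locate(1631)

# Arginine codons whose middle G, when replaced by A, yields a histidine codon.
compatible_arg_codons = sorted(
    codon
    for codon, amino_acid in CODON_TABLE.items()
    if amino_acid == "R"
    and codon[1] == "G"
    and CODON_TABLE[codon[0] + "A" + codon[2]] == "H"
)

print(codon_number, position_in_codon, compatible_arg_codons)
```

Under these assumptions the substitution falls on the middle base of codon 544, and a middle-base G-to-A change is one way an arginine codon can become a histidine codon.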
  • Figure 19 sets forth the amino acid sequence at the regulatory region R6 of TD encoded by mutated omr1 and wild type OMR1 alleles of Arabidopsis thaliana compared to that from several organisms.
  • The arrow points to the mutated amino acid residue in omr1.
  • Figure 20 is a map of plasmid pGMtd23.
  • Figure 21 is a map of plasmid pDAB1850.
  • Figure 22 is a map of plasmid pDAB1852.
  • Figure 23 is a map of plasmid pDAB311.
  • Figure 24 is a map of plasmid pDAB305.
  • Figure 25 is a map of plasmid pDAB1518.
  • The present invention relates to methods and compositions for obtaining transformed cells, said cells expressing therein a mutated form of threonine dehydratase/deaminase ("TD").
  • a gene sequence from Arabidopsis thaliana, designated omr1, which encodes a surprisingly advantageous mutated form of the enzyme TD.
  • the present invention relates in another aspect to amino acid sequences that comprise functional, feedback-insensitive TD enzymes. Aspects of the present invention thus relate to nucleotide sequences encoding mutated forms of TD, which sequences may be introduced into target plant cells or microorganisms to provide a transformed plant or microorganism having a number of desirable features.
  • Mutated forms of TD, unlike wild-type TD, are resistant to negative feedback inhibition by isoleucine ("Ile") and transformed cells are resistant to molecules which are toxic to cells that do not express feedback insensitive TD. Therefore, transformants harboring an expressible inventive nucleotide sequence demonstrate increased levels of isoleucine production and increased levels of production of intermediates in the Ile biosynthetic pathway, and the transformants are resistant to Ile structural analogs which are lethal to non-transformants, which express only wild-type TD.
  • The invention therefore provides isolated nucleotide sequences encoding mutated TD-functional polypeptides ("mutated TD") which are resistant to Ile feedback inhibition and are resistant to the toxic effect of Ile analogs.
  • inventive nucleotide sequences can be incorporated into vectors, which in turn can be used to transform other microorganisms and plant cells. Such transformation can be used, for instance, for purposes of providing a selectable marker, to increase plant nutritional value or to increase the production of commercially-important intermediates of the isoleucine biosynthetic pathway.
  • Expression of the mutated TD results in the cell having altered susceptibility to certain enzyme inhibitors relative to cells having wild-type TD only.
  • The term TD enzyme is used to refer generally to a wild-type TD amino acid sequence, to a mutated TD selected in accordance with the invention, and to variants of each which catalyze the reaction of threonine to 2-oxobutyrate in the Ile biosynthetic pathway, as described herein.
  • wild-type form is distinguished from a mutated form, where necessary, by usage of the terms wild-type TD and mutated TD.
  • The terms transit peptide, chloroplast leader sequence and signal peptide are used interchangeably to designate those amino acids that direct a passenger peptide to a chloroplast.
  • By mature peptide or enzyme, or passenger peptide or enzyme, is meant a polypeptide which is found after processing and passing into an organelle and which is functional in the organelle for its intended purpose.
  • the chloroplast leader sequence is covalently bound to the mature enzyme or passenger enzyme.
  • By precursor protein is meant a polypeptide having a transit peptide and a passenger peptide covalently attached to each other.
  • the carboxy terminus of the transit peptide is covalently attached to the amino terminus of the passenger peptide.
  • the passenger peptide and transit peptide can be encoded by the same gene locus, that is, homologous to each other, in that they are encoded in a manner isolated from a single source.
  • the transit peptide and passenger peptide can be heterologous to each other, i.e., the transit peptide and passenger peptide can be from different genes and/or different organisms.
  • Passenger peptides are originally made in a precursor form that includes a transit peptide and the passenger peptide.
  • Passenger peptides are the polypeptides typically obtained upon purification from a homogenate, the sequence of which can be determined as described herein.
  • transformed and transgenic are used interchangeably to refer to a cell or plant expressing a foreign nucleotide sequence introduced through transformation efforts.
  • the term foreign nucleotide sequence is intended to indicate a sequence encoding a polypeptide whose exact amino acid sequence is not normally found in the host cell, but is introduced therein through transformation techniques.
  • a structural gene is that portion of a gene comprising a DNA segment encoding a protein, polypeptide or a portion thereof, and excluding the 5' sequence which drives the initiation of transcription.
  • the structural gene may be one which is normally found in the cell or one which is not normally found in the cellular location wherein it is introduced, in which case it is termed a heterologous gene .
  • a heterologous gene may be derived in whole or in part from any source known to the art,
  • a structural gene may contain one or more modifications in either the coding or the untranslated regions which could affect the biological activity or the chemical structure of the expression product, the rate of expression or the manner of expression control. Such modifications include, but are not limited to, mutations, insertions, deletions and substitutions of one or more nucleotides.
  • the structural gene may constitute an uninterrupted coding sequence or it may include one or more introns, bounded by the appropriate splice junctions.
  • the structural gene may be a composite of segments derived from a plurality of sources (naturally occurring or synthetic, where synthetic refers to DNA that is chemically synthesized) .
  • the structural gene may also encode a fusion protein.
  • Plant tissue includes differentiated and undifferentiated tissues of plants, including, but not limited to, roots, shoots, leaves, pollen, seeds, tumor tissue and various forms of cells in culture, such as single cells, protoplasts, embryos and callus tissue.
  • the plant tissue may be in plants or in organ, tissue or cell culture.
  • Plant cell as used herein includes plant cells in plants and plant cells and protoplasts in culture.
  • Promoter regulatory element refers to nucleotide sequence elements within a nucleotide sequence which control the expression of that nucleotide sequence.
  • Promoter regulatory elements provide the nucleic acid sequences necessary for recognition of RNA polymerase and other transcriptional factors required for efficient transcription.
  • Promoter regulatory elements are meant to include constitutive, tissue-specific, developmental-specific, inducible promoters and the like.
  • Promoter regulatory elements may also include certain enhancer sequence elements that improve transcriptional efficiency.
  • Operably linked refers to a juxtaposition wherein the components are configured so as to perform their usual function.
  • control sequences operably linked to a coding sequence are capable of effecting the expression of the coding sequence.
  • Homology refers to identity or near identity of nucleotide or amino acid sequences.
  • nucleotide mismatches can occur at the third or wobble base in the codon without causing amino acid substitutions in the final polypeptide sequence.
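The point about the third, or wobble, base can be illustrated with the standard genetic code: for many codons, changing only the third base leaves the encoded amino acid unchanged. The sketch below lists such synonymous third-base changes for a given codon. The example codons are chosen for illustration and are not taken from the TD sequences.

```python
# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def synonymous_wobble_changes(codon):
    """Return codons that differ from `codon` only at the third base
    and still encode the same amino acid."""
    codon = codon.upper()
    original = CODON_TABLE[codon]
    return [
        codon[:2] + base
        for base in BASES
        if base != codon[2] and CODON_TABLE[codon[:2] + base] == original
    ]


# Alanine (GCN) tolerates any third base; methionine (ATG) tolerates none.
print(synonymous_wobble_changes("GCT"))
print(synonymous_wobble_changes("ATG"))
```

Not every codon behaves like alanine's, which is why the text says such mismatches can occur without a substitution rather than that they always do.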
  • minor nucleotide modifications e.g., substitutions, insertions or deletions
  • chemically synthesized copies of whole, or parts of, gene sequences can replace the corresponding regions in the natural gene without loss of gene function.
  • Homologs of specific DNA sequences may be identified by those skilled in the art using the test of cross-hybridization of nucleic acids under conditions of stringency as is well understood in the art (as described in Hames et al., Nucleic Acid Hybridisation, (1985) IRL Press, Oxford, UK). Extent of homology is often measured in terms of percentage of identity between the sequences compared.
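Percentage of identity can be computed in more than one way, and the source does not specify a formula. The following is a minimal sketch of one common convention: the two sequences are assumed to be already aligned to equal length, columns containing a gap character are skipped, and identity is the share of remaining columns that match. The sequences in the example are invented for illustration.

```python
def percent_identity(seq_a, seq_b, gap="-"):
    """Percent identity over the ungapped columns of two pre-aligned sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    matches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == gap or b == gap:
            continue  # assumption: gapped columns are excluded from the count
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Hypothetical aligned sequences differing at one of eight positions.
example_identity = percent_identity("ACGTACGT", "ACGTTCGT")
print(example_identity)
```

Other conventions count gaps as mismatches or divide by the length of the shorter sequence, so reported values are only comparable when the convention is stated.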
  • Similar to homology, the term substantial identity is used herein with respect to a nucleotide sequence to designate that the nucleotide sequence has a sequence sufficiently similar to a reference nucleotide sequence that it will hybridize therewith under moderately stringent conditions, this method of determining identity being well known in the art to which the invention pertains. Briefly, moderately stringent conditions are described in Sambrook et al., Molecular Cloning: a
  • nucleotide sequence is intended to refer to a natural or synthetic linear and sequential array of nucleotides and/or nucleosides, and derivatives thereof.
  • encoding and coding refer to the process by which a nucleotide sequence, through the mechanisms of transcription and translation, provides the information to a cell from which a series of amino acids can be assembled into a specific amino acid sequence to produce a functional polypeptide, such as, for example, an active enzyme.
  • the process of encoding a specific amino acid sequence may involve DNA sequences having one or more base changes (i.e., insertions, deletions, substitutions) that do not cause a change in the encoded amino acid, or which involve base changes which may alter one or more amino acids, but do not eliminate the functional properties of the polypeptide encoded by the DNA sequence.
  • amino acid sequence is used herein to designate a plurality of amino acids linked in a serial array.
  • Skilled artisans will recognize that through the process of mutation and/or evolution, polypeptides of different lengths and having differing constituents, e.g., with amino acid insertions, substitutions, deletions, and the like, may arise that are related to a sequence set forth herein by virtue of amino acid sequence homology and advantageous functionality as described in detail herein.
  • an amino acid sequence isolated from one species may differ to a certain degree from the wild-type TD sequence set forth in SEQ ID NO: 1 (nucleic acid sequence) and SEQ ID NO: 2 (corresponding amino acid sequence), and yet have similar functionality with respect to catalytic and regulatory function.
  • Amino acid sequences comprising such variations are included within the scope of the present invention and are considered substantially similar to a reference amino acid sequence. It is believed that the identity between amino acid sequences that is necessary to maintain proper functionality is related to maintenance of the tertiary structure of the polypeptide such that specific interactive sequences will be properly located and will have the desired activity.
  • a polypeptide including these interactive sequences in proper spatial context will have good activity, even where alterations exist in other portions thereof.
  • a TD variant is expected to be functionally similar to the wild-type TD set forth in SEQ ID NO: 2, for example, if it includes amino acids which are conserved among a variety of species or if it includes non-conserved amino acids which exist at a given location in another species that expresses functional TD.
  • Figure 13 sets forth an amino acid alignment of TD polypeptides of a number of species.
  • Arabidopsis is depicted which comprises the following sequence (with the corresponding three-letter codes, numbered as set forth in SEQ ID NO: 1): V N L T T S D L V K D H L R (Val Asn Leu Thr Thr Ser Asp Leu Val Lys Asp His Leu Arg), with numbering markers at residues 486, 490 and 495.
  • Group II Uncharged polar amino acids: serine, threonine, asparagine, glutamine, tyrosine;
  • Group III Charged polar acidic amino acids: aspartic, glutamic;
  • Group IV Charged polar basic amino acids: lysine, arginine, histidine.
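  • As a minimal sketch, the groups listed above can be encoded as a lookup to test whether a substitution stays within one group. Only Groups II to IV appear in this passage, so Group I is left out rather than guessed; the names `AMINO_ACID_GROUPS`, `group_of` and `is_same_group_substitution` are hypothetical and used for illustration only.

```python
# Groups II-IV exactly as listed above. Group I is not reproduced in this
# passage, so it is deliberately omitted rather than guessed.
AMINO_ACID_GROUPS = {
    "II": {"Ser", "Thr", "Asn", "Gln", "Tyr"},  # uncharged polar
    "III": {"Asp", "Glu"},                      # charged polar acidic
    "IV": {"Lys", "Arg", "His"},                # charged polar basic
}


def group_of(residue):
    """Return the group label for a three-letter residue code, or None if unlisted."""
    for label, members in AMINO_ACID_GROUPS.items():
        if residue in members:
            return label
    return None


def is_same_group_substitution(original, replacement):
    """True when both residues fall in the same listed group."""
    group = group_of(original)
    return group is not None and group == group_of(replacement)


print(is_same_group_substitution("Arg", "His"))  # both charged polar basic
print(is_same_group_substitution("Arg", "Asp"))  # basic versus acidic
```

    A residue outside the listed groups returns `None` from `group_of`, so the check reports `False` instead of assuming a group for it.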
  • insensitive TD enzymes are therefore not similar to wild-type TD, as that term is defined and used herein, because inhibition functionality is altered.
  • Insensitive TD enzymes feature one or more mutations in the regulatory site which mutations alter the functionality of the regulatory site without substantially altering the functionality of the catalytic site.
  • an amino acid sequence (SEQ ID NO: 4) having two substitutions, this sequence comprising a mutated TD which has good catalytic functionality but which does not exhibit regulatory functionality.
  • the enzyme set forth in SEQ ID NO: 4 comprises a feedback insensitive Arabidopsis thaliana TD.
  • Arg to His at residue 544 was a change from a charged, polar, basic amino acid (Arg) to another charged, polar, basic amino acid (His). While it is not intended that the present invention be limited by any theory by which it achieves its advantageous result, it is believed that the substitution at residue 544 alone may not have substantially altered the feedback site of TD, and, in contrast, that the substitution at residue 499 alone may have desensitized TD encoded thereby to feedback regulation. Certainly, when combined, the substitutions
  • were very effective in desensitizing TD encoded by omr1 to feedback regulation.
  • amino acid sequence set forth in SEQ ID NO: 6 (585 residues encoded by omr1) is a truncated version, missing 7 amino-terminal residues, of that set forth in SEQ ID NO: 4. It is seen from the following description, including the Examples set forth herein, that a significant amount of research was performed based upon this slightly shortened version, and that the slightly shortened version may be advantageously used to transform a wide variety of plants and microorganisms. It is believed that the portion of the amino acid sequence that is present in SEQ ID NO: 4 and absent in SEQ ID NO: 6 is a portion of the chloroplast leader sequence, and not present in the mature TD enzyme.
  • SEQ ID NO: 2 sets forth an amino acid sequence comprising a wild-type TD from Arabidopsis thaliana .
  • SEQ ID NOS: 4 and 6 set forth amino acid sequences comprising precursor proteins of differing lengths.
  • SEQ ID NO: 6 (see also Figure 6b) comprises a 609 amino acid fusion or chimeric polypeptide of which 585 amino acid residues are encoded by mutant omr1 of Arabidopsis. That is, SEQ ID NO: 6 comprises a mutant TD that is shorter than the full-length mutant TD shown in SEQ ID NO: 4 by 7 amino terminal residues.
  • SEQ ID NOS: 8, 10 and 12 set forth sequences comprising three predicted mature proteins.
  • SEQ ID NO: 14 sets forth the putative regulatory site of an inventive mutated TD enzyme, and SEQ ID NOS: 16 and 18 set forth regulatory regions harboring mutations in accordance with one aspect of the invention.
  • the wild-type TD enzyme features dual functionality. Specifically, the TD enzyme
  • has a catalytic site which is divided into catalytic regions C1-C5, as shown with respect to the analogous tomato TD enzyme and chickpea TD enzyme in Figure 2.
  • the catalytic site catalyzes the reaction of threonine to 2-oxobutyrate.
  • TD also has a regulatory site which is divided into regulatory regions R1-R7, as shown in Figure 2.
  • the regulatory site is responsible for the feedback inhibition which occurs when the regulatory site binds to an inhibitor, in this case isoleucine.
  • the present invention, therefore, provides, in alternative aspects, a feedback insensitive TD comprising the amino acid sequence set forth in SEQ ID NO: 4 or SEQ ID NO: 6 (precursor polypeptides); set forth in SEQ ID NO: 8, SEQ ID NO: 10 or SEQ ID NO: 12 (expected mature TD enzymes); SEQ ID NO: 14 (an insensitive TD regulatory site); or set forth in SEQ ID NO: 16 (regulatory region R4) or SEQ ID NO: 18 (regulatory region R6).
  • the amino acid sequence of SEQ ID NO: 14 or variants thereof as described above may be operably coupled to a TD catalytic site from a wide variety of species, including functionally similar variants thereof, to provide the advantageous result of the invention.
  • Amino acid sequences SEQ ID NOS: 16 and 18 may also be operably coupled to a wide variety of sequences to provide insensitive TD enzymes, and therefore comprise certain preferred aspects of the invention. Substitutions giving rise to similar amino acid sequences, as described herein, are particularly applicable to SEQ ID NO: 16, and the following sets forth a plurality of particularly preferred alternative sequences for SEQ ID NO: 16 in accordance with the invention:
  • the invention therefore also encompasses amino acid sequences similar to the amino acid sequences set forth herein that have at least about 50% identity thereto and
  • inventive amino acid sequences have at least about 75% identity to these sequences, more preferably at least about 85% identity and most preferably at least about 95% identity.
  • Percent identity may be determined, for example, by comparing sequence information using the GAP computer program, version 6.0, available from the University of Wisconsin Genetics Computer Group (UWGCG).
  • the GAP program utilizes the alignment method of Needleman and Wunsch (J. Mol. Biol. 48:443, 1970), as revised by Smith and Waterman (Adv. Appl. Math. 2:482, 1981). Briefly, the GAP program defines identity as the number of aligned symbols (i.e., nucleotides or amino acids) which are the same, divided by the total number of symbols in the shorter of the two sequences.
  • the preferred default parameters for the GAP program include: (1) a unary comparison matrix (containing a value of 1 for identities and 0 for non-identities), and the weighted comparison matrix of Gribskov and Burgess, Nucl. Acids Res. 14:6745, 1986, as described by Schwartz and Dayhoff, eds., Atlas of Protein Sequence and Structure, National Biomedical Research Foundation, pp. 353-358, 1979; (2) a penalty of 3.0 for each gap and an additional 0.10 penalty for each symbol in each gap; and (3) no penalty for end gaps.
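  • The identity definition and gap penalties stated above can be illustrated with a short sketch. It is not the GAP program itself: it assumes the two sequences have already been aligned (with `-` marking gap positions) and only applies the stated arithmetic. The function names and the toy sequences are hypothetical.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Identical aligned symbols divided by the length of the shorter sequence, as a percentage."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Count columns where both sequences carry the same non-gap symbol.
    matches = sum(1 for x, y in zip(aligned_a, aligned_b) if x == y and x != gap)
    # Sequence lengths exclude gap characters introduced by the alignment.
    len_a = sum(1 for x in aligned_a if x != gap)
    len_b = sum(1 for y in aligned_b if y != gap)
    shorter = min(len_a, len_b)
    if shorter == 0:
        raise ValueError("each sequence must contain at least one symbol")
    return 100.0 * matches / shorter


def gap_penalty(gap_lengths, per_gap=3.0, per_symbol=0.10):
    """Total penalty: 3.0 for each gap plus 0.10 for each symbol in each gap."""
    return sum(per_gap + per_symbol * length for length in gap_lengths)


# Toy aligned fragments (hypothetical, not taken from the Sequence Listing).
print(percent_identity("VNLTTSDL", "VNLATSDL"))  # 7 of 8 positions identical -> 87.5
print(gap_penalty([1, 4]))                       # one 1-symbol gap and one 4-symbol gap
```

    End gaps, which carry no penalty under the stated defaults, would simply be left out of `gap_lengths` by the caller.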
  • the invention also contemplates amino acid sequences having alternative mutations to those identified herein which also result in a feedback insensitive TD.
  • the cys at position 499 and the his at position 544 in SEQ ID NO: 4 could be substituted with alternative amino acids from the same amino acid group as cys and his, respectively (as described above) to provide an alternate inventive enzyme.
  • a skilled artisan can alter the nucleotide sequence set forth in SEQ ID NO: 1 by site-directed mutagenesis to provide a mutated sequence which encodes an enzyme having an alternate amino acid in a given location of the enzyme.
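  • The effect of such a codon change on a coding sequence can be sketched in silico as follows. This only models the sequence edit, not the laboratory procedure; the function name `substitute_codon` and the nine-base toy sequence are hypothetical and are not taken from SEQ ID NO: 1.

```python
def substitute_codon(coding_sequence, residue_number, new_codon):
    """Replace the codon for a 1-based residue number in an in-frame coding sequence."""
    if len(new_codon) != 3:
        raise ValueError("a codon must be exactly three bases")
    start = (residue_number - 1) * 3
    if start < 0 or start + 3 > len(coding_sequence):
        raise ValueError("residue number lies outside the coding sequence")
    # Splice the new codon in place so the downstream reading frame is unchanged.
    return coding_sequence[:start] + new_codon + coding_sequence[start + 3:]


# Toy example: residue 2 changed from CGT (Arg) to TGT (Cys).
toy_sequence = "ATGCGTCAT"
mutated = substitute_codon(toy_sequence, 2, "TGT")
print(mutated)  # ATGTGTCAT
```

    Because exactly three bases are exchanged for three, the edit cannot introduce a frame shift.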
  • a skilled artisan can synthesize an amino acid sequence having one or more additions, substitutions and/or deletions at a highly conserved location of the wild-type TD enzyme using techniques known in the art.
  • Such variants which exhibit functionality substantially similar to a polypeptide comprising the sequence set forth in SEQ ID NO: 4, are included within the scope of the present invention.
  • the present application finds advantageous use in a wide variety of plants, as well as in a wide variety of microorganisms.
  • the TD enzyme functions in the chloroplast and, therefore, the polypeptide encoded by the gene is a precursor protein which includes a portion identified herein as a chloroplast leader sequence or transit peptide.
  • the transit peptide may be derived from monocotyledonous or dicotyledonous plants upon choice of the artisan.
  • DNA sequences encoding said transit peptides may be obtained from chloroplast proteins such as Δ-9 desaturase, palmitoyl-ACP thioesterase, β-ketoacyl-ACP synthase, oleyl-ACP thioesterase, chlorophyll a/b binding protein, NADPH+ dependent glyceraldehyde-3-phosphate dehydrogenase, early light inducible protein, Clp protease regulatory protease, pyruvate orthophosphate dikinase, chlorophyll a/b binding protein, triose phosphate-3-phosphoglycerate phosphate translocator, 5-enolpyruvylshikimate-3-phosphate synthase, dihydrofolate reductase, thymidylate synthase, acetyl-coenzyme A carboxylase, Cu/Zn superoxide dismutase,
  • the chloroplast leader sequence is used to direct the passenger protein to the chloroplast; however, such leader sequences are typically cleaved and degraded upon entry of the passenger protein into the organelle of interest. Therefore, purification of a cleaved transit peptide from plant tissues is typically not possible.
  • transit peptide sequences can be determined by comparison of the precursor protein amino acid sequence obtained from the gene encoding the same to the amino acid sequence of the isolated passenger protein (mature protein) .
  • passenger protein sequences can also be determined from the transit peptide proteins associated therewith by comparison of sequences to other similar proteins isolated from different species.
  • genes encoding precursor forms of mutated TD protein, disclosed as SEQ ID NOS: 3-6, when compared to wild-type precursor and mature TD protein obtained from other species, can establish the expected sequence of the mature protein.
  • amino acid sequence and hence the nucleic acid sequence of a transit peptide can be determined in a variety of ways available to the skilled artisan.
  • passenger proteins of interest can be purified using a variety of techniques available to the person skilled in the art of protein
  • an amino terminal sequence of the protein can be determined using methods such as Edman degradation, mass spectroscopy, nuclear magnetic resonance spectroscopy and the like. Using this information and the genetic code, standard molecular biology techniques can be employed to clone the gene encoding the protein as exemplified herein. Comparison of the amino acid sequence determined from the cDNA to that obtained from the amino terminal sequence of the passenger protein can allow determination of the transit peptide sequence. In addition, many transit peptide sequences are available in the art and can easily be obtained from GenBank located in the Entrez Database at the National Center for Biotechnology Information web site.
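  • The comparison described above, locating the experimentally determined amino terminus of the passenger protein within the precursor sequence deduced from the cDNA, can be sketched as a simple string search. The function name and the toy one-letter sequences are hypothetical; real precursors may contain the amino-terminal motif more than once, in which case this sketch reports the first occurrence only.

```python
def split_precursor(precursor, mature_n_terminus):
    """Split a precursor into (transit peptide, mature protein) at the mature N-terminus."""
    index = precursor.find(mature_n_terminus)
    if index == -1:
        raise ValueError("mature N-terminal sequence not found in precursor")
    # Everything upstream of the match is the presumed transit peptide.
    return precursor[:index], precursor[index:]


# Toy one-letter sequences, for illustration only.
transit, mature = split_precursor("MSSTLAAKQVDE", "KQV")
print(transit)  # MSSTLAA
print(mature)   # KQVDE
```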
  • transit peptides in plants have been extensively reviewed by Keegstra et al. (1989) (Cell, 56:247-253), which is incorporated herein by reference.
  • transit peptides may show very little sequence homology at any level.
  • the length of transit peptides can vary, with some precursor proteins comprising transit peptides with as few as about 10 amino acids while others can be about 150 amino acids or longer. Additional descriptions of transit peptide characteristics in plants and mechanisms associated therewith can be found in Ko and Ko, (1992) J. Biol. Chem.
  • the first 90 amino acid residues in the N-terminal region of the Arabidopsis TD protein encoded by omr1 represent an expected region comprising the transit peptide, as indicated by:
  • This expected mature TD polypeptide comprises 502 sequential amino acid residues.
  • the only two other higher plant TD genes that have been cloned to date are those of tomato (Samach A., Hareven D., Gutfinger T., Ken-Dror S., Lifschitz E., 1991, Proc Nat Acad Sci USA 88:2678-2682) and chickpea (Jacob John S., Srivastava V., Guha-Mukherjee S., 1995, Plant Physiol 107:1023-1024).
  • the lengths of the transit peptides of the tomato TD and chickpea TD were predicted to be the first 80 and 91 amino terminal residues, respectively, and the full length precursor proteins were reported to be 595 residues and 590 residues, respectively (Samach et al., 1991; Jacob John et al., 1995).
  • the amino-terminus of the TD protein contained a typical two-domain transit peptide consistent with chloroplast lumen targeting sequences (Keegstra K., Olsen L.J., Theg S.M., 1989, Chloroplast precursors and their transport across the membrane. Annu Rev Plant Physiol Plant Mol Biol 40:471-501).
  • the first domain at the amino-terminal (45 residues) of the transit peptide was rich in serine
  • By analogy to tomato and chickpea, Arabidopsis TD also showed a typical two-domain transit peptide consistent with chloroplast lumen targeting sequences (as reviewed by Keegstra et al., 1989). The first 49 residues of the amino terminal end represented a domain that was rich in serine and threonine (31%) and other hydrophilic residues while the remaining 41 residues represented a second domain that contained 59% hydrophobic residues. The cleavage site of the transit peptide of Arabidopsis TD was not determined.
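  • The residue counts given in this description can be cross-checked with simple arithmetic. The sketch below assumes that the full-length precursor of SEQ ID NO: 4 is the 585 omr1-encoded residues of SEQ ID NO: 6 plus the 7 missing amino-terminal residues (a figure of 592 that is inferred here rather than stated), and that the expected mature polypeptide is what remains after removal of the 90-residue transit peptide region. The variable names are hypothetical.

```python
# Figures stated in the description.
seq6_omr1_residues = 585       # residues of SEQ ID NO: 6 encoded by omr1
missing_n_terminal = 7         # residues present in SEQ ID NO: 4 but absent in SEQ ID NO: 6
transit_peptide_region = 90    # expected transit peptide region
first_domain = 49              # serine/threonine-rich domain
second_domain = 41             # hydrophobic domain

# Inferred full-length precursor length (assumption: simple sum of the two figures).
full_length_precursor = seq6_omr1_residues + missing_n_terminal

# The two transit peptide domains should account for the whole 90-residue region.
assert first_domain + second_domain == transit_peptide_region

# Expected mature length after removing the transit peptide region.
expected_mature = full_length_precursor - transit_peptide_region
print(full_length_precursor, expected_mature)  # 592 502
```

    Under these assumptions the result agrees with the 502 residues given for the expected mature TD polypeptide.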
  • the cleavage site of the transit peptide of Arabidopsis TD may alternatively start at the lysine at residue 54 or at the lysine at residue 61. This is a presumptive cleavage site and one skilled in the art can readily determine the cleavage site in a similar fashion as in the case of
  • a transit peptide of choice is in the proper reading frame with the mature coding sequence of mutated TD.
  • the DNA encoding the transit peptide is placed 5' to, and in the proper reading frame with, the DNA encoding the mature, mutated TD protein. Placement of the chimeric DNA in correct relationship with promoter regulatory elements and other sequences as described herein can allow production of mRNA molecules that encode heterologous precursor proteins.
  • mRNA can then be translated thus producing a functional heterologous precursor protein which can be delivered to the chloroplast.
  • a DNA construct may be made in accordance with the invention to include a promoter that is native to the gene of a selected species that encodes that species' TD precursor polypeptide. Uptake of the protein by the chloroplast and cleavage of the associated transit peptide can result in a chloroplast containing a mature, mutated form of TD, thus rendering the cell resistant to feedback inhibition which would normally inhibit cells containing only the wild-type TD protein.
  • an inventive DNA construct for transforming, for example, bacteria may be made by simply attaching a start codon directly to, and in the proper reading frame with, a
  • nucleotide sequence encoding a mature peptide.
  • other elements are preferably present as described herein, such as a promoter upstream of the start codon and a termination sequence downstream of the coding region.
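  • The arrangement just described (promoter, start codon, mature coding sequence in frame, termination sequence) can be sketched as string assembly with a reading-frame check. This is an illustrative model under stated assumptions, not a cloning protocol: the function names are hypothetical, and the promoter, coding and terminator strings are toy placeholders rather than sequences from the Sequence Listing.

```python
START_CODON = "ATG"
STOP_CODONS = {"TAA", "TAG", "TGA"}


def build_open_reading_frame(mature_coding_sequence):
    """Attach a start codon directly to, and in frame with, a mature coding sequence."""
    sequence = mature_coding_sequence.upper()
    if len(sequence) % 3 != 0:
        raise ValueError("coding sequence length is not a multiple of three")
    codons = [sequence[i:i + 3] for i in range(0, len(sequence), 3)]
    # A stop codon anywhere but the final position would truncate the polypeptide.
    if any(codon in STOP_CODONS for codon in codons[:-1]):
        raise ValueError("internal stop codon found in coding sequence")
    return START_CODON + sequence


def assemble_cassette(promoter, open_reading_frame, terminator):
    """Place the promoter upstream and the termination sequence downstream of the ORF."""
    return promoter + open_reading_frame + terminator


# Toy placeholder sequences, for illustration only.
orf = build_open_reading_frame("aaacaggtttaa")
cassette = assemble_cassette("TTGACA", orf, "GGGCCC")
print(orf)       # ATGAAACAGGTTTAA
print(cassette)
```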
  • nucleotide sequences encoding preferred feedback insensitive precursor TD of the species Arabidopsis thaliana are set forth in SEQ ID NOS: 3 and 5 herein.
  • the mutated polynucleotides set forth therein, and the described polynucleotides related thereto, are referred to as omr1.
  • omr1 has been found to be a dominant allele, which imparts significant value to the invention. It is of course not intended that the present invention be limited to the exemplary nucleotide sequences, but that it include sequences having substantial identity thereto and sequences which encode variant forms of insensitive TD as described above.
  • a nucleic acid sequence encoding a variant amino acid sequence is within the scope of the invention.
  • Modifications to a sequence, such as deletions, insertions, or substitutions in the sequence which produce "silent" changes that do not substantially affect the functional properties of the resulting polypeptide molecule are expressly contemplated by the present invention.
  • alterations in a nucleotide sequence which reflect the degeneracy of the genetic code, or which result in the production of a chemically equivalent amino acid at a given site are contemplated.
  • a codon for the amino acid alanine, a hydrophobic amino acid may be substituted by a codon encoding another less hydrophobic residue, such as glycine, or a more hydrophobic residue,
  • nucleotide changes which result in alteration of the N-terminal and C-terminal portions of the polypeptide molecule would also not be expected to alter the activity of the polypeptide. In some cases, it may in fact be desirable to make mutations in the sequence in order to study the effect of alteration on the biological activity of the polypeptide. Each of the proposed modifications is well within the routine skill in the art. In a preferred aspect, therefore, the present invention contemplates nucleotide sequences having substantial identity to the sequences set forth herein and variants thereof as described herein. A further requirement of an inventive polynucleotide variant is that it must encode a polypeptide having similar functionality to the specific mutated TD enzymes recited herein, i.e., good catalytic functionality and insensitivity to feedback inhibition.
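  • The distinction drawn above between changes that reflect the degeneracy of the genetic code and changes that alter the encoded amino acid can be sketched with a small lookup. The table below is a deliberately partial excerpt of the standard genetic code, limited to a few residues for illustration; the names `PARTIAL_CODON_TABLE` and `classify_codon_change` are hypothetical.

```python
# Partial excerpt of the standard genetic code (DNA codons), for illustration only.
PARTIAL_CODON_TABLE = {
    "GCT": "Ala", "GCC": "Ala", "GCA": "Ala", "GCG": "Ala",
    "GGT": "Gly", "GGC": "Gly", "GGA": "Gly", "GGG": "Gly",
    "CGT": "Arg", "CGC": "Arg",
    "TGT": "Cys", "TGC": "Cys",
    "CAT": "His", "CAC": "His",
}


def classify_codon_change(original_codon, new_codon):
    """Return 'silent' if both codons encode the same residue, else 'substitution'."""
    original = PARTIAL_CODON_TABLE[original_codon.upper()]
    replacement = PARTIAL_CODON_TABLE[new_codon.upper()]
    return "silent" if original == replacement else "substitution"


print(classify_codon_change("GCT", "GCC"))  # Ala -> Ala
print(classify_codon_change("GCT", "GGT"))  # Ala -> Gly
```

    Codons outside the partial table raise a `KeyError`, which makes the limits of the excerpt explicit instead of returning a guessed residue.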
  • a suitable DNA sequence selected for use according to the invention may be obtained, for example, by cloning techniques using cDNA libraries corresponding to a wide variety of species, these techniques being well known in the relevant art.
  • Suitable nucleotide sequences may be isolated from DNA libraries obtained from a wide variety of species by means of nucleic acid hybridization or PCR, using as hybridization probes or primers nucleotide sequences selected in accordance with the invention, such as those set forth in the Sequence Listing included herewith; nucleotide sequences having substantial identity thereto; or portions thereof.
  • sequences encoding TD may then be altered as provided by the present invention by site-directed mutagenesis.
  • nucleic acid sequences encoding enzymes of the invention may be constructed using standard recombinant DNA technology, for example, by cutting or splicing nucleic acids which encode cytokines and/or other peptides using restriction enzymes and DNA ligase.
  • nucleic acid sequences may be constructed using chemical synthesis, such as solid-phase phosphoramidite technology.
  • polymerase chain reaction (PCR) is used to accomplish splicing of nucleic acid sequences by overlap extension as is known in the art.
  • Inventive DNA sequences can be incorporated into the genome of a plant or microorganism using conventional recombinant DNA technology, thereby making a transformed plant or microorganism having the excellent features described herein.
  • the term "genome" as used herein is intended to refer to DNA which is present in a plant or microorganism and which is inheritable by progeny during propagation thereof.
  • an inventive transformed plant or microorganism may alternatively be produced by producing F1 or higher generation progeny of a directly transformed plant or microorganism, wherein the progeny comprise the foreign nucleotide sequence.
  • Transformed plants or microorganisms and progeny thereof are all contemplated by the invention and are all intended to fall directly within the meaning of the terms "transformed plant" and "transformed microorganism."
  • the present invention contemplates the use of transformed plants which are selfed to produce an inbred plant.
  • the inbred plant produces seed containing the gene of interest. These seeds can be grown
  • inbred lines can also be crossed with other inbred lines to produce hybrids.
  • Parts obtained from the regenerated plant, such as flowers, seeds, leaves, branches, fruit, and the like are covered by the invention provided that said parts contain genes encoding and/or expressing the protein of interest. Progeny and variants, and mutants of the regenerated plants are also included within the scope of the invention.
  • In diploid plants, typically one parent may be transformed and the other parent is the wild type.
  • the first generation hybrids (F1) are selfed to produce second generation hybrids (F2). Those plants exhibiting the highest levels of expression can then be chosen for further breeding.
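  • The expected outcome of selfing an F1 plant can be sketched by enumerating gamete combinations. This is a textbook single-locus model under loudly stated assumptions: the transgene is inserted at one locus, the F1 plant carries one copy (written `T`) opposite a wild-type chromosome (written `t`), and segregation is Mendelian. The symbols and the function name `self_cross` are hypothetical.

```python
from collections import Counter
from itertools import product


def self_cross(genotype):
    """Return expected F2 genotype fractions from selfing a diploid genotype such as ('T', 't')."""
    # Each parent gamete carries one of the two alleles with equal probability.
    counts = Counter("".join(sorted(pair)) for pair in product(genotype, repeat=2))
    total = sum(counts.values())
    return {offspring: count / total for offspring, count in counts.items()}


f2 = self_cross(("T", "t"))
print(f2)  # expected 1:2:1 segregation of TT, Tt and tt

# Fraction of F2 plants carrying at least one copy of the introduced sequence.
carriers = sum(fraction for offspring, fraction in f2.items() if "T" in offspring)
print(carriers)  # 0.75
```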
  • By "expressing", as used herein, is meant the transcription and stable accumulation of mRNA inside a cell, the cell being of prokaryotic or eukaryotic origin.
  • Transit peptides of the present invention when covalently attached to the mature, mutated TD protein, can provide intracellular transport to the chloroplast.
  • a mutated mature form of TD found in a chloroplast of a cell renders the cell resistant to feedback inhibition and resistant to Ile structural analogs.
  • transformation of a plant or microorganism involves inserting a DNA sequence into an expression vector in proper orientation and correct reading frame.
  • the vector may desirably contain the necessary elements for the transcription of the inserted polypeptide-encoding sequence.
  • vector systems known in the art can be advantageously used in accordance with the invention, such as plasmids, bacteriophage viruses or other modified viruses.
  • Suitable vectors include, but are not limited to, the following viral vectors: lambda vector system gt11, gt10.
  • plasmid vectors such as pBI121, pBR322, pACYC177, pACYC184, pAR series, pKK223-3, pUC8, pUC9, pUC18, pUC19, pLG339, pRK290, pKC37, pKC101, pCDNAII, and other similar systems.
  • the DNA sequences may be cloned into the vector using standard cloning procedures in the art, for example, as described by Maniatis et al., Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Laboratory, Cold Spring Harbor, New York (1982), which is hereby incorporated by reference in its entirety.
  • the plasmid pBI121 is available from Clontech Laboratories, Palo Alto, California. It is understood that known techniques may be advantageously used according to the invention to transform microorganisms such as, for example,
  • a promoter be present in the expression vector.
  • the promoter is preferably a constitutive promoter, but may alternatively be a tissue-specific promoter or an inducible promoter.
  • the promoter is one isolated from a native gene which encodes a TD.
  • An expression vector according to the invention may be either naturally or artificially produced from parts derived from heterologous sources, which parts may be naturally occurring or chemically synthesized, and wherein the parts have been joined by ligation or other means known in the art.
  • the introduced coding sequence is preferably under control of the promoter and thus will be generally downstream from the promoter. Stated alternatively, the promoter sequence will be generally upstream (i.e., at the 5' end) of the coding sequence.
  • the phrase "under control of" contemplates the presence of such other elements as may be necessary to achieve transcription of the introduced sequence.
  • enhanced production of a feedback insensitive TD may be achieved by inserting an inventive nucleotide sequence in a vector downstream from and operably linked to a promoter sequence capable of driving expression in a host cell.
  • Two DNA sequences (such as a promoter region sequence and a feedback insensitive TD-encoding nucleotide sequence) are said to be operably linked if the nature of the linkage between the two DNA sequences does not (1) result in the introduction of a frame-shift mutation, (2) interfere with the ability of the promoter region sequence to direct the transcription of the desired nucleotide sequence, or (3) interfere with the ability of the desired nucleotide sequence to be transcribed by the promoter region sequence.
  • RNA polymerase normally binds to the promoter and initiates transcription of a DNA sequence or a group of linked DNA sequences and regulatory elements (operon) .
  • a transgene such as a nucleotide sequence selected in accordance with the present invention, is expressed in a transformed cell to produce in the cell a polypeptide encoded thereby. Briefly, transcription of the DNA sequence is initiated by the binding of RNA polymerase to the DNA sequence's promoter region.
  • movement of the RNA polymerase along the DNA sequence forms messenger RNA ("mRNA") and, as a result, the DNA sequence is transcribed into a corresponding mRNA.
  • tRNA transfer RNA
  • reporter nucleotide sequence elements which can stimulate promoter activity in a cell such as those found in plants as exemplified by the leader sequence of maize streak virus (MSV) , alcohol dehydrogenase intron 1, and the like.
  • the recombinant DNA will preferably include a transcriptional termination sequence downstream from the introduced sequence. It may also be desirable to use a reporter gene. In some instances, a reporter gene may be used with or without a selectable marker. Reporter genes are genes which are typically not present in the recipient organism or tissue and typically encode proteins resulting in some phenotypic change or enzymatic property. Examples of such genes are provided in K.
  • reporter genes include the beta-glucuronidase (GUS) of the uidA locus of E. coli, the green fluorescent protein from the bioluminescent jellyfish Aequorea victoria, and the luciferase genes from firefly Photinus pyralis.
  • An assay for detecting reporter gene expression may then be performed at a suitable time after the gene has been introduced into recipient cells.
  • a preferred such assay entails the use of the gene encoding beta-glucuronidase (GUS) of the uidA locus of E. coli, as described by Jefferson et al. (1987, Biochem. Soc. Trans. 15, 17-19), to identify transformed cells.
  • promoter regulatory elements from a wide variety of sources can be used efficiently in plant cells to express foreign genes.
  • promoter regulatory elements of bacterial origin such as the octopine synthase promoter, the nopaline synthase promoter, the mannopine synthase promoter, and promoters of viral origin, such as the cauliflower mosaic virus (35S and 19S), 35T (which is a re-engineered 35S promoter, WO 97/13402 published April 17, 1997) and the like may be used.
  • Plant promoter regulatory elements include, but are not limited to, ribulose-1,5-bisphosphate (RUBP) carboxylase small subunit (ssu), beta-conglycinin promoter, beta-phaseolin promoter, ADH promoter, heat-shock promoters, and tissue-specific promoters.
  • elements such as matrix attachment regions, scaffold attachment regions, introns, enhancers, polyadenylation sequences, and the like, may be present and thus may improve the transcription efficiency or DNA integration. Such elements may or may not be necessary for DNA function, although they can provide better expression or functioning of the DNA by affecting transcription, mRNA stability, and the like. Such elements may be included in the DNA as desired to obtain optimal performance of the transformed DNA in the plant. Typical elements include, but are not limited to, Adh-intron 1, Adh-intron 6, the alfalfa mosaic virus coat protein leader sequence, the maize streak virus coat protein leader sequence, as well as others available to a skilled artisan.
  • Constitutive promoter regulatory elements may be used thereby directing continuous gene expression in all cell types at all times (e.g., actin, ubiquitin, CaMV 35S, and the like) .
  • Tissue specific promoter regulatory elements are responsible for gene expression in specific cell or tissue types, such as the leaves or seeds (e.g., zein, oleosin, napin, ACP, globulin, and the like) and these may alternatively be used.
  • Promoter regulatory elements may also be active during a certain stage of the plant's development as well as active in plant tissues and organs. Examples of such include, but are not limited to, pollen-specific, embryo-specific, corn silk-specific, cotton fiber-specific, root-specific, seed endosperm-specific promoter regulatory elements, and the like. Under certain circumstances, it may be desirable to use an inducible promoter regulatory element, which is responsible for expression of genes in response to a specific signal, such as, for example, physical stimulus (heat shock genes), light (RUBP carboxylase), hormone (Em), metabolites, chemicals and stress. Other desirable transcription and translation elements that function in plants may also be used. Numerous plant-specific gene transfer vectors are known in the art.
  • Plant tissues suitable for transformation of a plant in accordance with certain preferred aspects of the invention include, for example, whole plants, leaf tissues, flower buds, root tissues, callus tissue types I, II and III, embryogenic tissue, meristems, protoplasts, hypocotyls and cotyledons. It is understood, however, that this list is not intended to be limiting, but only to provide examples of plant tissues which may be advantageously transformed in accordance with the present invention. A wide variety of plant tissues may be transformed during dedifferentiation using appropriate techniques described herein. Transformation of a plant or microorganism may be achieved using one of a wide variety of techniques known in the art. The manner in which the transcriptional unit
  • is introduced into the plant host is not critical to the invention. Any method which provides efficient transformation may be employed.
  • One technique of transforming plants with a DNA construct in accordance with the present invention is by contacting the tissue of such plants with an inoculum of bacteria transformed with a vector comprising the DNA construct. Generally, this procedure involves inoculating the plant tissue with a suspension of bacteria and incubating the tissue for about 48 to about 72 hours on regeneration medium without antibiotics at about 25-28 °C. Bacteria from the genus Agrobacterium may be advantageously utilized to transform plant cells.
  • Suitable species of such bacterium include Agrobacterium tumefaciens and Agrobacterium rhizogenes.
  • Agrobacterium tumefaciens (e.g., strains LBA4404 or EHA105)
  • Another technique which may advantageously be used is vacuum-infiltration of flower buds using Agrobacterium-based vectors.
  • Various methods for plant transformation include the use of Ti or Ri-plasmids and the like to perform Agrobacterium mediated transformation. In many instances, it will be desirable to have the construct used for transformation bordered on one or both sides by T-DNA borders, more specifically the right border.
  • Agrobacterium tumefaciens or Agrobacterium rhizogenes as a mode for transformation, although T-DNA borders may find use with other modes of transformation.
  • Agrobacterium is used for plant transformation
  • a vector may be used which may be introduced into the host for homologous recombination with T-DNA or the Ti or Ri plasmid present in the host. Introduction of the vector may be performed via electroporation, tri-parental mating and other techniques for transforming gram-negative bacteria which are known to those skilled in the art.
  • the manner of vector transformation into the Agrobacterium host is not critical to the invention.
  • In some cases where Agrobacterium is used for transformation, the expression construct being within the T-DNA borders will be inserted into a broad spectrum vector such as pRK2 or derivatives thereof as described in Ditta et al. (PNAS USA (1980) 77:7347-7351 and EPO 0 120 515), which are incorporated herein by reference. Explants may be combined and incubated with the transformed Agrobacterium for sufficient time to allow transformation thereof. After transformation, the Agrobacteria and plant cells are cultured with the appropriate selective medium. Once calli are formed, shoot formation can be encouraged by employing the appropriate plant hormones according to methods well known in the art of plant tissue culturing and plant regeneration.
  • the polynucleotide of interest is preferably incorporated into a transfer vector adapted to express the polynucleotide in a plant cell by including in the vector a plant promoter regulatory element, as well as 3' non-translated transcriptional termination regions such as Nos and the like.
  • Plant RNA viral based systems can also be used to express genes for the purposes disclosed herein.
  • the chimeric genes of interest can be inserted into the coat promoter regions of a suitable plant virus under the control of a subgenomic promoter which will infect the host plant of interest.
  • Plant RNA viral based systems are described, for example, in U.S. Patent Nos. 5,500,360; 5,316,931 and 5,589,367, each of which is hereby incorporated herein by reference in its entirety.
  • Another approach to transforming plant cells with a DNA sequence selected in accordance with the present invention involves propelling inert or biologically
  • the target cell can be surrounded by the vector so that the vector is carried into the cell by the wake of the particle.
  • Biologically active particles e.g., dried yeast cells, dried bacterium or a bacteriophage, each containing DNA material sought to be introduced
  • Biologically active particles can also be propelled into plant cells. It is not intended, however, that the present invention be limited by the choice of vector or host cell. It should of course be understood that not all vectors and expression control sequences will function equally well to express the DNA sequences of this invention. Neither will all hosts function equally well with the same vector expression system. However, one of skill in the art may make a selection among vectors, expression control sequences, and hosts without undue experimentation and without departing from the scope of this invention.
  • An isolated DNA construct selected in accordance with the present invention may be utilized in an expression vector to transform a wide variety of plants, including monocots and dicots.
  • the invention finds advantageous use, for example, in transforming the following plants: rice, wheat, barley, rye, corn, potato, carrot, sweet potato, bean, pea, chicory, lettuce, cabbage, cauliflower, broccoli, turnip, radish, spinach, asparagus, onion, garlic, eggplant, pepper, celery, squash, pumpkin, zucchini, cucumber, apple, pear, quince, melon, plum, cherry, peach, nectarine, apricot, strawberry, grape, raspberry, blackberry, pineapple,
  • Certain intermediates of the Ile biosynthetic pathway have significant commercial value, and production of these intermediates is advantageously increased in a transformant in accordance with the invention.
  • 2-oxobutyrate, the product of the reaction catalyzed by TD, is known to be a precursor for the production of polyhydroxybutyrate in plants that have been genetically engineered using techniques known in the art to include bacterial genes necessary to produce polyhydroxybutyrate.
  • Polyhydroxybutyrate is a desired biopolymer in the plastic industry because it may be biologically degraded. Because plants and microorganisms transformed in accordance with the invention feature increased production of 2-oxobutyrate, such plants and/or microorganisms may be advantageously utilized by plastic manufacturers in this manner.
  • plants that overproduce 2-oxobutyrate would be ideal for metabolic engineering by bacterial genes for polyhydroxybutyrate production because the overproduction of 2-oxobutyrate would provide plenty of substrate for both the natural Ile biosynthetic pathway and the engineered polyhydroxybutyrate pathway.
  • an inventive nucleotide sequence may be used in an expression vector as a selectable marker.
  • an inventive nucleotide sequence is incorporated into a vector such that it is expressed in a cell transformed
  • successful transformants will not only express the primary sequence, but will also express a feedback insensitive TD.
  • successful transformants can be screened in accordance with the invention by growing the plant or microorganism in a substrate comprising a toxic Ile analog, such as, for example, OMT (termed "toxic substrate" herein).
  • omr1 is also an excellent biochemical marker for use in genetic engineering of bacteria, replacing the traditionally used and environmentally hazardous antibiotic-resistance genes (such as ampicillin- and kanamycin-resistance marker genes). omr1 is very environmentally friendly and poses no risk to human health when included in a transformant, because it does not have an ortholog in humans. Humans do not synthesize isoleucine and may only obtain it by digesting food.
  • the mutation in the omr1 gene causes TD from GM11b to be insensitive to feedback control by Ile.
  • TD activity in extracts from GM11b plants was about 50-fold more resistant to feedback inhibition by Ile than TD in extracts from wild type plants.
  • the loss of Ile feedback sensitivity in GM11b led to a 20-fold overproduction of free Ile when compared to the wild type. This overproduction of Ile in GM11b had no effect on plant growth or reproduction.
  • RNA was extracted from 16-day-old GM11b (omr1/omr1) plants that were germinated in a minimal agar medium supplemented with 0.2 mM MTR.
  • Poly (A) RNA (mRNA) was extracted from the total RNA and complementary DNA (cDNA) was synthesized using reverse transcriptase.
  • the cDNA library was synthesized using the ZAP-cDNA synthesis kit of Stratagene. To prime the cDNA synthesis, a 50-base oligonucleotide linker primer containing an XhoI site and an 18-base poly(dT) was used.
  • a 13-mer oligonucleotide adaptor containing an EcoRI cohesive end was ligated to the double stranded cDNA molecules at the 5' end. This allowed unidirectional cloning of the cDNA molecules, in the sense orientation, into the EcoRI and XhoI sites of the Uni-ZAP XR vector of Stratagene.
  • the recombinant λ phage library was amplified using the XL1-Blue MRF' E. coli host cells, yielding a titer of 6.8 x 10^9 pfu/ml. The average insert size was approximately 1.4 kb. This was calculated from PCR analysis of 20 random, clear plaques isolated from the amplified library.
  • the Uni-ZAP XR vector contains the pBluescript SK(-), a plasmid containing the N-terminus of the lacZ gene.
  • ExAssist/SOLR system provided by Stratagene was used. This allowed the rescue of the cDNA inserts from the positive λ clones in pBluescript SK plasmids in a single step.
  • TD is conserved in a variety of organisms
  • degenerate primers were designed from conserved amino acid regions of TD. Such conserved regions were identified by aligning the amino acid sequence of TD from chickpea and tomato.
  • Figure 2 shows the location of the conserved amino acid sequences in tomato and chickpea and also the location of the degenerate oligonucleotide primers TD205 and TD206 that were designed to isolate a TD-DNA fragment from Arabidopsis.
  • Figure 4 shows the structure and degree of degeneracy of the PCR oligonucleotide primers, TD205 (the 5' end primer) and TD206 (the 3' end primer).
  • Both primers TD205 and TD206 were designed to accommodate the Arabidopsis codon usage bias.
  • Primer TD205 had 384-fold degeneracy and was a 28-mer anchored with an EcoRI site starting 2 bases downstream from the first nucleotide at the 5' end of the primer.
  • Primer TD206 had 324-fold degeneracy and was a 28-mer anchored with a HindIII site starting 2 bases downstream from the first nucleotide at the 5' end of the primer.
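  • The fold-degeneracy quoted for a degenerate primer is the product of the number of nucleotides permitted at each position. The following is a minimal sketch of that calculation using the standard IUPAC ambiguity codes. The example primer string is hypothetical and is not the sequence of TD205 or TD206, which are given only in the figures.

```python
from math import prod

# Number of nucleotides represented by each IUPAC ambiguity code.
IUPAC_CHOICES = {
    "A": 1, "C": 1, "G": 1, "T": 1,
    "R": 2, "Y": 2, "S": 2, "W": 2, "K": 2, "M": 2,
    "B": 3, "D": 3, "H": 3, "V": 3,
    "N": 4,
}

def degeneracy(primer: str) -> int:
    """Return the number of distinct sequences encoded by a degenerate primer."""
    return prod(IUPAC_CHOICES[base] for base in primer.upper())

# Hypothetical primer (illustration only): R (2) x Y (2) x N (4) = 16-fold.
example_primer = "ATGGARAAYTGN"
print(degeneracy(example_primer))
```

  • For orientation, 384 factors as 2^7 x 3 and 324 as 2^2 x 3^4, so values of this kind arise from combinations of two-fold, three-fold and four-fold positions.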
  • Genomic DNA was isolated from GM11b and used as a template in a PCR amplification with the primers TD205 and TD206. A 438 bp fragment was amplified.
  • the fragment was cloned into the EcoRI-HindIII sites of the plasmid pGEM3Zf(+). The fragment was sequenced to completion using the dideoxy chain termination method and the Sequenase kit of USB. The fragment showed a putative 280 bp intron. The remaining 158 bp of the PCR fragment had 60.1% identical nucleotide sequence with the chickpea TD gene. To eliminate the putative intron sequences, a second pair of primers, TD211 and TD212, was designed and used in a PCR reaction with the 438 bp fragment as a template. A DNA fragment of about 100 bp length, containing exon sequences, was amplified and purified. This was the homologous probe used for screening the cDNA library constructed from GM11b.
  • the 100 bp PCR fragment was labeled with [α-32P]dCTP (3000 Ci/mmol) using random priming (Prime-a-Gene labeling kit of Promega) and used as a probe to screen plaque lifts (two replicas per plate) of the plated GM11b cDNA library. Hybridization was done at 42°C in formamide for 2 days. The nylon membranes containing the plaque lifts were washed 3X at room temperature (25°C) in 7XSSPE and 0.5% SDS for 5 minutes. The nylon membranes were then put on X-ray film and exposed for 1 day. Two plaques hybridized and showed signal on the X-ray films of the two replicas taken from the same plate.
  • plugs were cut out of the agar plate and put in 1 ml of SM buffer with 20 µL chloroform.
  • a secondary, tertiary and quaternary screening was performed until about 90% of the plaques on the plate showed a strong signal on the X-ray film of both replicas of the same plate.
  • a well isolated plaque representing each clone was cut out from the plate and put in SM buffer.
  • the phage eluate was infected with the ExAssist helper phage to excise the pBluescript SK plasmid containing the cDNA insert and the resulting recombinant bacteria were plated on media with ampicillin (60 µg/ml).
  • plasmid DNA was prepared then digested with EcoRI and XhoI to release the inserts.
  • a Southern blot was prepared from the plasmid digests and probed with the 32P-labelled 100 bp TD fragment. All the clones, descendants from the two phage clones, showed very strong signal. This was a strong indication that the isolated clones contained the TD from the line GM11b.
  • One clone was named TD23 and was selected for DNA sequencing. The size of the cDNA insert in clone TD23 was 2229 nucleotides.
  • an oligonucleotide primer complementary to the T3 promoter of pBluescript SK was synthesized and used to obtain the sequence of the first few nucleotides of the insert. This sequence, 30 nucleotides, included the multiple cloning site downstream of the T3 promoter. The start of the cDNA sequence was immediately following the EcoRI site which starts at position 31. DNA sequencing was also performed on the opposite strand starting from the 3' end and using the T7 promoter of the pBluescript SK.
  • Both strands of the TD23 insert were sequenced to completion using a set of oligonucleotide primers designed from the DNA revealed after each sequencing reaction. A total of 19 oligonucleotide primers were synthesized and used in sequencing the cDNA insert.
  • the total length of the sequenced fragment was 2277 nucleotides of which 2229 were the cDNA insert. Of the remaining 48 nucleotides (2277-2229), 31 nucleotides were the multiple cloning site between the T3 promoter and the EcoRI site at the 5' end of the insert and 17 nucleotides were multiple cloning site between the T7 promoter and XhoI site at the 3' end of the insert (Figure 4).
  • Figure 5 shows the nucleotide sequence and the predicted amino acid sequence of clone 23 as isolated from the cDNA library constructed from line GM11b of Arabidopsis.
  • the TD insert in clone 23 is in pBluescript vector between the EcoRI and XhoI sites. An open reading frame (top reading frame) was observed which showed an ATG codon at nucleotide 166 and a termination codon at nucleotide 1801. The total coding region of the cDNA insert in clone 23 is 1758 nucleotides (including the stop codon) encoding a polypeptide of 585 amino acids.
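  • The length bookkeeping stated above can be checked with a few lines of arithmetic. The sketch below only restates the numbers given in the text (2277, 2229, 31, 17, 1758, 585); the function name is an illustrative assumption and the nucleotide coordinates of the start and stop codons are not re-derived here.

```python
def residues_from_coding_length(n_bases: int, includes_stop: bool) -> int:
    """Convert a coding-region length in nucleotides to a residue count."""
    if n_bases % 3 != 0:
        raise ValueError("coding length must be a multiple of 3")
    codons = n_bases // 3
    # The stop codon occupies one codon but encodes no amino acid.
    return codons - 1 if includes_stop else codons

# Flanking vector sequence: total sequenced length minus the cDNA insert.
flanking = 2277 - 2229
print(flanking, flanking == 31 + 17)

# Coding region of 1758 nucleotides, stop codon included.
print(residues_from_coding_length(1758, includes_stop=True))
```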
  • Figure 4 shows the DNA sequence of clone 23 and Figure 5 shows the DNA sequence and the open reading frame with the predicted amino acid sequence encoded by the cDNA insert.
  • the predicted amino acid sequence encoded by the TD23 cDNA gene shared greater than 50% identity with the amino acid sequence of TD of potato and tomato.
  • the E. coli strain TGXA is an auxotroph with a deletion in the ilvA gene encoding threonine dehydratase/deaminase.
  • Fisher KE, Eisenstein E (1993) An efficient approach to identify ilvA mutations reveals an amino-terminal catalytic domain in biosynthetic threonine deaminase from Escherichia coli. J Bacteriol 175:6605-6613. This strain cannot grow on a minimal medium without supplementation with Ile. This strain was a generous gift from Drs. Kathryn E. Fisher and Edward Eisenstein, University of Maryland Baltimore County, Maryland.
  • omr1 was cloned in front of the lacZ IPTG-inducible promoter while in pUCK2, omr1 was cloned in front of a constitutive promoter.
  • XhoI and SalI cohesive termini are compatible and therefore allowed the ligation of the inserts into the expression vectors.
  • the recombinant vectors pTrc-td23, pUCK-td23 or pBluescript-td23 all containing full length omr1 were transformed
  • the E. coli prototroph host DH5α was transformed with pTrc-td23 or pUCK-td23 and plated on minimal medium supplemented with varying concentrations of the toxic analog L-O-methylthreonine. Both of the constructs were able to confer upon DH5α resistance to 30 µM L-O-methylthreonine. No bacterial colonies grew on plates containing untransformed DH5α. This result provided strong evidence that the mutated omr1 gene of the line GM11b of Arabidopsis is able to confer resistance to L-O-methylthreonine present in the growth medium. Therefore omr1 provides a new environmentally friendly selectable marker for genetic transformation of bacteria.
  • the strategy for cloning the omr1 allele into a plant expression vector was as follows: A. The coding region of the omr1 allele was excised from pGM-td23 as an XbaI-KpnI fragment.
  • the 500 bp CaMV 35S promoter was cleaved out of the vector pBI121.1 (Jefferson et al. 1987) with HindIII and BamHI.
  • the pBIN19 vector was linearized by HindIII and BamHI then ligated to the CaMV 35S promoter so as to place the promoter into the multiple cloning site in the correct orientation. This vector was named pCM35S.
  • the plasmid pCM35S was digested with XbaI-KpnI and the omr1 fragment isolated in step A was cloned into the XbaI-KpnI sites placing the omr1 coding region sequence under the transcriptional control of the CaMV 35S promoter and creating a plasmid with the kanamycin
  • the NOS terminator of pBIN19 was PCR-amplified using a pair of oligonucleotide primers; the 5' primer was anchored with an XbaI site and the 3' primer was anchored with a SalI site. PCR amplification yielded a 300 bp NOS terminator fragment.
  • the plasmid pCM35S-omr1 therefore contained two constructs that could be expressed in plants, the CaMV35S.omr1.
  • L-O-methylthreonine-sensitive Arabidopsis thaliana Columbia wild type plants were transformed with pCM35S-omr1.
  • the T1 seeds from each pot were screened for expression of L-O-methylthreonine resistance by germinating in agar medium supplemented with 0.2 mM L-O-methylthreonine, a concentration previously determined and known to completely inhibit the growth of wild type seedlings beyond the cotyledonous stage (Mourad and King, 1995).
  • the T2 seed was harvested from each of the 5 positive T1 transformants and 50 T2 seeds/transformant were planted in a separate petri plate containing 0.2 mM L-O-methylthreonine agar medium.
  • the majority (75% or more) of the T2 seedlings were resistant to L-O-methylthreonine, indicating that a single copy of the transgene omr1 had been inserted in the parent T1 transgenic plant.
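  • The inference from roughly 75% resistant T2 seedlings to a single insertion rests on the 3:1 Mendelian ratio expected when a hemizygous T1 plant carrying one dominant locus is selfed. A goodness-of-fit check of this kind can be sketched as below. The seedling counts are hypothetical (the text gives only the plate size of 50 seeds and the approximate proportion), and the chi-square test itself is offered as an illustration, not as the analysis performed in the source.

```python
# Critical chi-square value for 1 degree of freedom at p = 0.05.
CRITICAL_DF1_P05 = 3.841

def chi_square_3_to_1(resistant: int, sensitive: int) -> float:
    """Chi-square statistic for a 3:1 resistant:sensitive expectation."""
    total = resistant + sensitive
    expected_resistant = 0.75 * total
    expected_sensitive = 0.25 * total
    return ((resistant - expected_resistant) ** 2 / expected_resistant
            + (sensitive - expected_sensitive) ** 2 / expected_sensitive)

# Hypothetical plate: 38 resistant and 12 sensitive seedlings out of 50.
statistic = chi_square_3_to_1(38, 12)
fits_single_locus = statistic < CRITICAL_DF1_P05
print(round(statistic, 4), fits_single_locus)
```

  • A statistic below the critical value means the counts do not depart significantly from 3:1; a plate with markedly more than 75% resistant seedlings would instead suggest more than one independently segregating insertion.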
  • Figure 6b shows that 585 amino acid residues of the total 592 residues representing the full length mutant TD were expressed in the transgenic plants. This slightly truncated precursor mutant TD was able to translocate to the chloroplast and confer upon transgenic plants resistance to OMT.
  • Example 3 Cloning of a Full-Length cDNA That Encodes a Mutated Threonine Dehydratase/Deaminase Enzyme
  • Plasmid pGMtd23 ( Figure 20) is a cDNA clone that contains a portion of a transcript that encodes a mutant threonine dehydratase/deaminase enzyme.
  • the sequence of the cDNA insert portion of pGMtd23, including the EcoRI (GAATTC, bases 1-6) and XhoI (CTCGAG, bases 2230-2235) recognition sites that were added in the preparation of the clone, is set forth as SEQ ID NO: 19. It is pertinent to this invention that an uninterrupted open reading frame (ORF) begins with base number 1 of SEQ ID NO: 19, and continues to base number 1770, where the ORF is terminated by a TGA stop codon.
  • amino acid sequence of the protein encoded by this ORF is given in SEQ ID NO: 19 as three letter designations of the amino acids underneath the cDNA sequence. It is seen that the first Met (methionine) residue in this deduced protein sequence occurs at amino acid number 46 (underlined) . It is well known in the field of eukaryotic gene expression that translation of proteins in most cases originates at the first ATG (methionine start codon) encountered by the ribosomes as they scan the mRNA from the 5' end.
  • bases 7-135 of SEQ ID NO: 19 represent a 5' untranslated sequence of the mRNA represented as a cDNA in pGMtd23, and that the Met encoded by bases 136-138 represents the first amino acid of a 545 amino acid encoded protein.
  • the first ATG codon is found at bases 125-127. This ATG codon is in a different reading frame from that which is found at bases 136-138.
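  • Whether two codons share a reading frame depends only on their start positions modulo 3. The sketch below applies this to the coordinates given in the text, taking base 1 of SEQ ID NO: 19 as the start of the ORF as stated above; the function names are illustrative assumptions.

```python
ORF_START = 1  # first base of the uninterrupted ORF in SEQ ID NO: 19

def frame_offset(codon_start: int, orf_start: int = ORF_START) -> int:
    """Return 0 if the codon is in frame with the ORF, otherwise 1 or 2."""
    return (codon_start - orf_start) % 3

def codon_number(codon_start: int, orf_start: int = ORF_START) -> int:
    """1-based codon index of an in-frame codon within the ORF."""
    if frame_offset(codon_start, orf_start) != 0:
        raise ValueError("codon is not in frame with the ORF")
    return (codon_start - orf_start) // 3 + 1

print(frame_offset(136))   # ATG at bases 136-138
print(frame_offset(125))   # ATG at bases 125-127
print(codon_number(136))   # position of the first in-frame Met
```

  • With these coordinates the ATG at 136-138 is in frame and falls at codon 46, matching the stated position of the first Met, whereas the ATG at 125-127 lies in a different frame.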
  • the presence of this out-of-frame ATG codon 11 bases 5' to the putative protein initiation codon is a highly unusual feature of the 5' untranslated sequences of plant mRNAs, and might indicate
  • bases 7-135 of SEQ ID NO: 19 are actually not the 5' untranslated leader sequence of the mRNA represented by SEQ ID NO: 19.
  • the alignments reveal that both the tomato and chickpea proteins are substantially longer at their amino terminus than the deduced protein encoded by bases 136-1770 of SEQ ID NO: 19. Specifically, the tomato protein is 49 amino acids longer, and the chickpea protein is 43 amino acids longer. It is known [Samach, A., Hareven, D., Gutfinger, T., Ken-Dror, S., and Lifschitz, E., Biosynthetic threonine deaminase gene of tomato: isolation, structure, and upregulation in floral organs. Proc. Natl. Acad. Sci.
  • amino acids 1-50 of the tomato protein comprise the chloroplast transit peptide which directs the transport of the preprotein form of threonine dehydratase/deaminase into the chloroplast where it is naturally found in plant cells.
  • Substantial homology is seen between the amino acids encoded by bases 7-135 of SEQ ID NO: 19 and the chloroplast transit peptides of the tomato and chickpea proteins. This analysis further suggests that bases 7-135 of SEQ ID NO: 19 do not represent a 5' untranslated leader sequence, but rather are part of a longer ORF that is incompletely represented in pGMtd23.
  • CACAGGAAACAGGACTCTAGA-3' (tdexF, complementary to the lambda YES cloning vector) and 5'-GGAGAGACCTTAAGACGTGG-3' (tdintR, the reverse complement of bases 166-185 of SEQ ID NO: 19) were used as forward and reverse primers, respectively, in Polymerase Chain Reactions (PCR).
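  • A reverse primer is written 5' to 3' as the reverse complement of the top-strand region it anneals to. The sketch below recovers the top-strand 20-mer implied by the tdintR sequence quoted above. It assumes the internal space in the extracted primer text is an artifact, and it does not check the result against SEQ ID NO: 19, which is not reproduced here.

```python
# Translation table for Watson-Crick complements.
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]

TDINTR = "GGAGAGACCTTAAGACGTGG"  # tdintR as quoted, space removed

# Top-strand sequence the primer would anneal opposite (bases 166-185
# of SEQ ID NO: 19 according to the text).
top_strand_target = reverse_complement(TDINTR)
print(top_strand_target, len(top_strand_target))
```

  • The 20-nucleotide length agrees with the stated span of bases 166-185.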
  • a typical reaction contained, in a volume of 100 µl, 50 pmole of tdintR primer, 185 pmole of tdexF primer, 1 µg of lambda YES library template DNA, 2.5 units of Amplitaq enzyme, and buffers as recommended by the manufacturer of the gene amplification PCR kit (Roche Molecular Systems, Branchburg, NJ, USA).
  • This reaction was cycled through 35 cycles of 94°, 1 min; 50°, 2 min; and 72°, 5 min, and then followed by incubation at 72° for 7 min.
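  • The cycling program above can be written down as data, which makes the programmed hold time easy to total. This is a minimal sketch: the step labels (denature, anneal, extend) are conventional names assumed here, and ramp times between temperatures are ignored.

```python
CYCLES = 35

# (label, temperature in degrees C, hold time in minutes) for one cycle.
CYCLE_STEPS = [
    ("denature", 94, 1),
    ("anneal", 50, 2),
    ("extend", 72, 5),
]

FINAL_EXTENSION = ("final extension", 72, 7)

def total_hold_minutes(cycles, steps, final_step):
    """Sum the programmed hold times, excluding temperature ramps."""
    per_cycle = sum(minutes for _, _, minutes in steps)
    return cycles * per_cycle + final_step[2]

program_minutes = total_hold_minutes(CYCLES, CYCLE_STEPS, FINAL_EXTENSION)
print(program_minutes, "min")
```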
  • Amplification products in a range of sizes less than about 500 base pairs (bp) in length were detected by agarose gel electrophoresis. Following cloning of the amplification products into the TOPO TA vector
  • the DNA sequences of the insert fragments were determined by standard dideoxy terminator methodologies. One clone had the DNA sequence set forth as SEQ ID NO: 21. It can be seen that bases 22-191 of SEQ ID NO: 21 correspond precisely to bases 16-186 of SEQ ID NO: 19, thus indicating that SEQ ID NO: 21 represents a partial clone of a cDNA derived from an Arabidopsis mRNA encoding threonine
  • bases 1-21 of SEQ ID NO: 21 are an upstream continuation of the ORF that is known to encode the mutated threonine dehydratase/deaminase as presented in SEQ ID NO: 19, thereby indicating that bases 7-136 of SEQ ID NO: 19 do not represent a 5' untranslated sequence.
  • SEQ ID NO: 23 contains the full-length 1776 base coding region for a mutated threonine dehydratase/deaminase protein of 592 amino acids, and which includes a chloroplast transit peptide sequence.
  • TCATGA Rca I recognition site
  • Synthetic oligonucleotides having the sequence (in the 5' to 3' direction) GCTCTAGATCATGA ATTCCGTTCAGCTTCCGACGGCGCAATCCTCTCTCCGTAGCCACATT (TD5XREL primer) and CTCGTTCGTACGTTCTGGTACAGCACCGAG (tdSPL R primer) were used as forward and reverse primers, respectively, in PCR reactions using pGMtd23 DNA as template.
  • the sequence TCTAGA comprises an XbaI recognition site, and the underlined bases correspond precisely to bases 15-45 of SEQ ID NO: 19.
  • the sequence CGTACG comprises a BsiWI recognition site, and the primer sequence in its entirety forms the reverse complement of bases 214-243 of SEQ ID NO: 19.
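  • The recognition sites named above can be located in the two primers by simple string search. The sketch below uses the primer sequences as quoted in the text with the internal space removed (assumed to be an extraction artifact); the dictionary and function names are illustrative assumptions.

```python
# Recognition sequences named in the text.
SITES = {"XbaI": "TCTAGA", "RcaI": "TCATGA", "BsiWI": "CGTACG"}

TD5XREL = "GCTCTAGATCATGAATTCCGTTCAGCTTCCGACGGCGCAATCCTCTCTCCGTAGCCACATT"
TDSPL_R = "CTCGTTCGTACGTTCTGGTACAGCACCGAG"

def find_sites(seq: str, sites=SITES) -> dict:
    """Map each enzyme whose site occurs in seq to its 1-based start position."""
    return {name: seq.find(motif) + 1
            for name, motif in sites.items() if motif in seq}

print(find_sites(TD5XREL))
print(find_sites(TDSPL_R))
print(len(TDSPL_R))  # compare with the span of bases 214-243
```

  • The reverse primer is 30 nucleotides long, which agrees with its stated correspondence to bases 214-243.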
  • the reaction contained, in a total volume of 50 µl, 50 pmol each of TD5XREL and tdSPL R primers, 20 ng of pGMtd23 DNA, 8 nmol each of dATP, dGTP, dCTP, and dTTP, with 2.5 units of Amplitaq Gold polymerase and 1X buffer as supplied by the manufacturer (Roche Molecular Systems, Branchburg, NJ, USA).
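  • The amounts listed for this reaction convert directly to final concentrations, since 1 pmol per µl equals 1 µM. The following sketch performs that conversion for the quantities stated in the text; the function name is an illustrative assumption.

```python
def micromolar(amount_pmol: float, volume_ul: float) -> float:
    """Final concentration in µM (1 pmol/µl is 1 µmol/L)."""
    return amount_pmol / volume_ul

REACTION_VOLUME_UL = 50

primer_um = micromolar(50, REACTION_VOLUME_UL)        # 50 pmol of each primer
dntp_um = micromolar(8 * 1000, REACTION_VOLUME_UL)    # 8 nmol of each dNTP
template_ng_per_ul = 20 / REACTION_VOLUME_UL          # 20 ng of pGMtd23 DNA

print(primer_um, "µM each primer")
print(dntp_um, "µM each dNTP")
print(template_ng_per_ul, "ng/µl template")
```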
  • Amplification products of the expected size (259 bp) were detected by agarose gel
  • the 245 bp XbaI/BsiWI fragment from this clone was used to replace the corresponding XbaI/BsiWI fragment of pGMtd23, generating a plasmid named pDAB5017, which has as a portion of its sequence the DNA sequence set forth as SEQ ID NO: 23.
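  • The stated product size (259 bp) and XbaI/BsiWI fragment size (245 bp) can be reproduced from the primer sequences and the coordinates given in the text. The sketch below rests on two assumptions that are not explicit in the extracted text: that the bases of TD5XREL matching bases 15-45 of SEQ ID NO: 19 are its 3'-terminal 31 nucleotides, and that fragment length is counted on the top strand between the cut positions T^CTAGA (XbaI) and C^GTACG (BsiWI).

```python
FWD = "GCTCTAGATCATGAATTCCGTTCAGCTTCCGACGGCGCAATCCTCTCTCCGTAGCCACATT"  # TD5XREL
REV = "CTCGTTCGTACGTTCTGGTACAGCACCGAG"                                  # tdSPL R

FWD_ANNEAL = (15, 45)    # template bases matched by the forward primer
REV_ANNEAL = (214, 243)  # template bases covered by the reverse primer

def span(interval):
    """Number of bases in an inclusive (start, end) interval."""
    start, end = interval
    return end - start + 1

def amplicon_length():
    """Template span between primer ends plus any non-annealing 5' tails."""
    fwd_tail = len(FWD) - span(FWD_ANNEAL)
    rev_tail = len(REV) - span(REV_ANNEAL)
    template_part = REV_ANNEAL[1] - FWD_ANNEAL[0] + 1
    return fwd_tail + template_part + rev_tail

def xbai_bsiwi_fragment_length():
    """Top-strand length between the XbaI and BsiWI cut positions."""
    product = amplicon_length()
    # XbaI cuts after the first base of TCTAGA (1-based product position).
    xbai_cut = FWD.index("TCTAGA") + 1
    # The reverse primer forms the 3' end of the top strand as its reverse
    # complement; CGTACG is palindromic, so the site reads the same there.
    site_end_in_rev = REV.index("CGTACG") + 6
    bsiwi_site_start = product - site_end_in_rev + 1
    bsiwi_cut = bsiwi_site_start  # BsiWI cuts after the first base of CGTACG
    return bsiwi_cut - xbai_cut

print(amplicon_length())
print(xbai_bsiwi_fragment_length())
```

  • Under these assumptions both values agree with the sizes reported in the text.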
  • the NheI site of SEQ ID NO: 23 (underlined bases 2005-2010) was converted to an SmaI recognition site, generating plasmid pDAB5018.
  • Following cleavage of plasmid pDAB5018 with RcaI and SmaI, a 2013 bp DNA fragment that includes the entire full-length coding region for the mutated threonine dehydratase/deaminase enzyme and 228 bp corresponding to the 3' untranslated region of the mRNA was isolated.
  • Plasmid pDAB1850 therefore is capable of independent expression of the coding region for the full-length mutated threonine dehydratase/deaminase enzyme in a transformed plant cell.
  • the coding region for the full-length mutated threonine dehydratase/deaminase enzyme, whose expression is regulated by the maize ubiquitin 1 promoter/intron and Nos terminator, is covalently linked to a plant selectable marker gene, specifically, the phosphinothricin acetyl transferase resistance (bar) coding region.
  • the bar coding region is under the transcriptional control of a highly modified version of the cauliflower mosaic virus (CaMV) 35S promoter (the modified version being called 35T) , with transcription termination and polyadenylation being mediated by the Nos terminator.
  • the 35T promoter is comprised of a doubly-enhanced version of the basic 35S promoter, which is placed in front of a chimeric 5' untranslated leader sequence.
  • This leader consists of the 5' untranslated leader of the Maize Streak Virus (MSV) coat protein gene, into which has been ligated an internally deleted version of intron 1 of the maize alcohol dehydrogenase 1S gene.
  • Plasmid pDAB1852 has utility in testing the expression of the coding region for the full-length mutated threonine dehydratase/deaminase enzyme in transgenic plant cells.
  • By virtue of the simultaneous introduction of both plant-expressible genes present on pDAB1852 into transformed plant cells, it is possible to first select for transformed cells using the bar selectable marker gene, and then screen the transformed plant cells for production of the mutant threonine dehydratase/deaminase enzyme.
  • the presence of the mutated threonine dehydratase/deaminase enzyme can be exemplified by biochemical methods described elsewhere.
  • plasmid pDAB1850 a nonselectable, but scorable, marker gene is included in such experiments.
  • One such scorable gene is the Escherichia coli uidA gene, which encodes the GUS protein.
  • Figure 24 presents the map of plasmid pDAB311, which contains the 35T/bar/Nos and 35T/GUS/Nos genes, and is used in co-transformation experiments with plasmid pDAB1850.
  • Figure 25 presents the map of plasmid pDAB305, which contains the 35T/GUS/Nos gene, and is used in co-transformations with plasmid pDAB1852.
  • the plasmids pDAB1850, pDAB1852, pDAB311 and pDAB305 may be used for the production of transgenic plants, such as maize. Examples of the production of transgenic lines are described further in the Examples.
  • Part A Initiation, establishment and maintenance of embryogenic maize suspension cultures.
  • H9CP+ is liquid MS medium (Murashige and Skoog, 1962, A revised medium for rapid growth and bioassays with tobacco tissue cultures. Physiol. Plant. 15:473-497) plus 2 mg/L 2,4-D, 2 mg/L NAA, 100 mg/L myo-inositol, 0.69 g/L L-proline, 200 mg/L casein hydrolysate, 30 mg/L sucrose and 5% coconut water (added at subculture) adjusted to pH 6 prior to autoclaving. Cultures were maintained in 125 ml Erlenmeyer flasks in the dark at 28°C on a 125 rpm shaker. Cultures typically became established 2 to 3 months after initiation, and were maintained by subculture every 3.5 days. For subculture, 3 ml packed cell volume (pcv) of cells was measured in a 10 ml wide bore pipet. The measured cells plus 7 ml of old medium were pipetted into 20 ml of fresh medium.
  • Part B Preparation of silicon carbide whiskers for use in transformation experiments.
  • a 5% w/v whisker suspension was made by adding an appropriate amount of an osmotic culture medium per tube of sterile whiskers.
  • the osmotic medium was liquid N6 medium plus 45 g/L D-sorbitol, 45 g/L D-mannitol, and 30 g/L sucrose, adjusted to pH 6.0 before autoclaving.
  • the whisker suspension was vortexed 1 to 2 minutes immediately prior to use.
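  • A 5% w/v suspension corresponds to 5 g per 100 ml, that is 50 mg of whiskers per ml of medium, so the "appropriate amount" of osmotic medium follows from the mass of whiskers in each tube. The sketch below shows the conversion. The 50 mg tube mass is a hypothetical value chosen for illustration; the text does not state how much each tube holds.

```python
WHISKER_MG_PER_ML = 50  # 5% w/v = 5 g per 100 ml = 50 mg per ml

def medium_volume_ml(whisker_mass_mg: float) -> float:
    """Volume of osmotic medium needed to bring whiskers to 5% w/v."""
    return whisker_mass_mg / WHISKER_MG_PER_ML

def whisker_mass_mg(suspension_volume_ul: float) -> float:
    """Mass of whiskers contained in a given volume of 5% w/v suspension."""
    return suspension_volume_ul * WHISKER_MG_PER_ML / 1000

# Hypothetical tube holding 50 mg of sterile whiskers.
print(medium_volume_ml(50), "ml of medium")

# Whisker mass in a 160 µl aliquot of the suspension.
print(whisker_mass_mg(160), "mg of whiskers")
```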
  • Part C Preparation of maize suspension cultures for use in transformation experiments. Approximately 16 to 24 hours prior to transformation experiments, maize suspension cultures were each subcultured into 20 ml of liquid N6 medium. On the day of the experiment, all cells of a given line are pooled, to reduce flask to flask variability.
  • the N6 osmotic medium described in part B of this example is added to sterile 125 ml Erlenmeyer flasks (12 ml/flask) . Typically 6 or 8 flasks were used per experiment. To each flask, 2 ml pcv from the pool was added. The flasks were placed on a shaker in the dark for 45 minutes. After that time period, the contents of each flask were transferred to a 15 ml centrifuge tube. After the cells had settled to the bottom of the tube, all but 1 ml of liquid was drawn off and added back to the original flask.
  • the 5% w/v whisker suspension was prepared and vortexed as outlined in part B. Using a wide bore pipet tip, 160 µl was added to each centrifuge tube of cells. In approximately half the experiments, 20 µl of plasmid pDAB1852, adjusted to a concentration of 1 mg/ml, was added to each tube. In the other half of the experiments, 10 µl of pDAB311 and 10 µl of pDAB1850, both adjusted to 1 mg/ml, were added to each tube. The tubes were securely capped and swirled or tapped to mix the contents. Each tube was placed on a modified Vari-Mix dental amalgamator. The amalgamator was adapted to hold a
  • Part E Plating of whisker-treated suspension cells, selection and recovery of stable transformants.
  • each flask was transferred to a 50 ml centrifuge tube.
  • a sterilized glass cell collector unit was connected to a vacuum, and a sterile Whatman #4 filter paper was placed on the unit.
  • Six ml of cell suspension was pipetted into the unit, with the vacuum drawing through the liquid, leaving the cells on the filter paper.
  • One flask yielded 5 filters of cells.
  • Each filter paper was placed on a 60 x 20 plate of N6 solid medium. Plates were wrapped with 3M micropore tape and placed in the dark for 1 week at 28° C.
  • the filter papers were transferred to plates containing solid N6 medium + 1 mg/L bialaphos. This step was repeated after an additional week.
  • the tissue was embedded on 100 x 15, which also contained solid N6 medium + bialaphos.
  • 5 ml of melted 37° C agarose was added to a sterile test tube which contained 50 µl of bialaphos stock. The contents of each filter were scraped into the test tube and pipetted up and down to mix. Approximately 2.5 ml (1/2 of the contents) was pipetted over the surface of individual 100 x 15 selection plates. Each test tube yielded 2 plates. Plates were wrapped with parafilm and incubated in the dark at 28°C.
  • Bialaphos-resistant transformants were recovered 2 to 8 weeks post-embedding. They appeared as light yellow sectors proliferating against a background of dark yellow to brown growth-inhibited tissue. The growing tissue was
  • Radiolabeled probe DNA was hybridized to the genomic DNA on the blots using 50 ml of minimal hybridization buffer (10% polyethylene glycol, 7% sodium dodecyl sulfate, 0.6x SSC, 10 mM sodium phosphate, 5 mM EDTA and 100 mg/ml denatured salmon sperm DNA) and was heated to 60° C and mixed with the denatured radiolabeled hybridization at 60° C. The blots were washed at 60 °C in 0.25X SSC and 2% SDS for 45 minutes, blotted dry and exposed to XAR-5 film with two intensifying screens overnight.
  • callus was produced which was transformed with the Arabidopsis thaliana mutated threonine dehydratase/deaminase (TD) denoted as omr1 with either pDAB1852 (plasmid containing BAR and omr1) or cotransformed with pDAB1850 (plasmid with omr1) and pDAB311 (plasmid containing BAR and GUS).
  • Maize callus material was selected on bialaphos and analyzed for threonine dehydratase/deaminase activity in the presence and absence of isoleucine.
  • Maize callus was homogenized, proteins extracted, and normalized for protein concentration (BioRAD Protein assay, Hercules, CA) .
  • Threonine dehydratase/deaminase assays were conducted according to Strauss et al., ((1985) Planta 163:554-562) with slight modifications.
  • a standard reaction contained 0.15 M Tris-HCl, pH 9.0, 60 mM threonine, 0.3 M K2HPO4, 0.3 mM Na2EDTA, pH 9.0, 0.3 mM DTT, 2-5 mM L-isoleucine in treated assays, and enzyme in total volume of 500 µL.
  • Ketoacid produced was determined according to Friedmann and Haugen (1943) (?) by adding 200 µL of 0.1% (w/v) 2,4-dinitrophenylhydrazine in 2 N HCl and incubated for 20 min at room temperature. KOH (900 µL of 2.5 N) was then added and mixed, the tubes were incubated for 15 min at room temperature, and the A515 was determined. Natural variations in threonine dehydratase/deaminase activity were determined using nontransformed callus lines as a control. The results are shown in Table 1.
  • omr1-2 2.205 1.632 1141 +
  • omr1-27 0.521 0.401 280 +
  • omr1-30 0.669 0.536 374 +
  • TD activity was measured by absorbance at 515 nm.
  • b The effect of 2 mM isoleucine on TD activity.
  • c Percent change of TD activity relative to the control. Southern analysis: presence of omr1 gene (+) or no band determined (ND). A significant correlation was observed between the presence and absence of isoleucine on TD activity. Callus lines omr1-2, 27, and 30 were insensitive to isoleucine and overall showed an increase in TD activity as compared to the control lines and callus lines omr1-3, 6, and 10. One callus line, omr1-3, was determined to contain the gene of interest; however, it was not shown to have a difference in TD activity. The results described above demonstrate that transformation of maize callus
  • Callus maintenance medium consisted of N6 salts and vitamins (Chu et al, (1978) The N6 medium and its application to anther culture of cereal crops. Proc. Symp. Plant Tissue Culture, Peking Press, 43-56), 1.0 mg/L 2,4-D, 2.5 g/L GELRITE, and 20 g/L sucrose, with a pH of 5.8. After 2 and 4 weeks of culture, fresh weight of the callus was measured. Growth responses of callus lines with and without increase in threonine dehydratase (TD) activity are presented in Table X.
  • Transgenics which contained the omr1 gene and showed increased TD activity (i.e., omr1-2, 27 and 30) were found to grow at lethal or sub-lethal concentrations of OMT (0.5 and 1.0 mM), although at different levels. No or very little growth was observed in the case of transgenic lines with TD enzyme activity similar to that of the non-transgenic controls as described previously. These results demonstrate that the omr1 gene is functional in maize transgenics and confers resistance to lethal or sub-lethal concentrations of OMT.
  • Omr1-2, 27 and 28 have increased levels of TD activity, which are significantly different from that of the controls. These lines show resistance to OMT as shown by their growth at 0.5 and 1.0 mM OMT.
  • NT control non-transgenic control
  • omr1-3, 6, and 10 lines have low levels of TD activity, which are not significantly different from each other.
  • Maize 'Type II' callus was used as tissue targets for transformation via helium blasting (Pareddy et al., 1987, Maize transformation via helium blasting, Maydica, 42: 143-154).
  • 'Type II' callus cultures were initiated from immature zygotic embryos of the genotype "Hi-II" (Armstrong et al, (1991) Maize Cooperation Newsletter, pp. 92-93). Embryos were isolated from greenhouse-grown ears from crosses between Hi-II parent A and Hi-II parent B or F2 embryos derived from a self- or sib-pollination of a Hi-II plant.
  • Immature embryos (1.5 to 3.5 mm) were cultured on initiation medium consisting of N6 salts and vitamins (Chu et al, (1978) The N6 medium and its application to anther culture of cereal crops. Proc. Symp. Plant Tissue Culture, Peking Press, 43-56), 1.0 mg/L 2,4-D, 25 mM L-proline, 100 mg/L casein hydrolysate, 10 mg/L AgNO3, 2.5 g/L GELRITE, and 20 g/L sucrose, with a pH of 5.8. Selection for Type II callus took place for ca. 2-12 weeks. After four weeks callus was subcultured onto maintenance medium (initiation medium in which AgNO3 was omitted and L-proline was reduced to 6 mM).
  • Part B Precipitation of gold particles for use in helium blasting.
  • plasmid DNA (two plasmids, i.e., pDAB1852 and pDAB305, in 1:1 molar ratio)
  • spherical gold particles (Bio-Rad 1.0 µm diameter or Aldrich 1.0-1.5 µm diameter)
  • the solution was immediately vortexed and the DNA-coated gold particles were allowed to settle.
  • the resulting clear supernatant was removed and the gold particles were resuspended in 1 mL of absolute ethanol. This suspension was diluted with absolute ethanol to obtain 15 mg DNA-coated gold/mL.
  • embryogenic callus tissue was spread over the surface of 'Type II' callus maintenance medium as described herein lacking casein hydrolysate and L-proline, but supplemented with 0.2 M sorbitol and 0.2 M mannitol as an osmoticum.
  • tissue was transferred to culture dishes containing blasting medium (osmotic media solidified with 20 g/L tissue culture agar (JRH Biosciences, Lenexa, KS) instead of 7 g/L GELRITE (Schweizerhall)).
  • Helium blasting accelerated suspended DNA-coated gold particles towards and into the prepared tissue targets.
  • the device used was an earlier prototype of that described in US Patent #5,141,131 which is incorporated herein by reference. Tissues were covered with a stainless steel screen (230 µm openings) and placed under a partial vacuum of 25 inches of Hg in the device chamber.
  • the DNA-coated gold particles were further diluted 1:1 with absolute ethanol prior to blasting and were accelerated at the callus target once using a helium pressure of 1500 psi, with each blast delivering 20 µL of the DNA/gold suspension.
  • tissue was transferred to osmotic media for a 16-24 h recovery period. Afterwards, the tissue was divided into small pieces and transferred to selection medium (maintenance medium lacking casein hydrolysate and L-proline but having 0.5 mM concentration of O-methyl threonine (Sigma, St. Louis, MO)). Every three weeks for 3 months, tissue pieces were non-selectively transferred to fresh selection medium containing either 0.5 or 1.0 mM OMT. After 6-8 weeks, callus sectors found proliferating against a background of growth-inhibited tissue were removed and isolated. The resulting OMT-resistant tissue was subcultured biweekly onto fresh selection medium.
  • Radiolabeled probe DNA was hybridized to the genomic DNA on the blots using 50 ml of minimal hybridization buffer (10% polyethylene glycol, 7% sodium dodecyl sulfate, 0.6x SSC, 10 mM sodium phosphate, 5 mM EDTA and 100 mg/ml denatured salmon sperm DNA) and was heated to 60° C and mixed with the denatured radiolabeled hybridization at 60° C. The blots were washed at 60° C in 0.25X SSC and 2% SDS for 45 minutes, blotted dry and exposed to XAR-5 film with two intensifying screens overnight.
  • callus was produced which was transformed with the Arabidopsis thaliana mutated threonine dehydratase/deaminase (TD)
  • Maize callus material was selected on L-O-methylthreonine and analyzed for threonine dehydratase/deaminase activity in the presence and absence of isoleucine.
  • Threonine dehydratase/deaminase activity assays were performed on proteins extracted from each individual callus line and normalized for protein concentration (BioRAD Protein assay, Hercules, CA).
  • TD Threonine dehydratase/deaminase
  • the transgenic lines produced here were transformed with pDAB1852 and pDAB305 containing the GUS reporter gene, which is the gene of interest in this study.
  • callus samples of each line were subjected to histochemical GUS analysis (Jefferson, 1987, Plant Mol. Biol. Rep. 5:387-405).
  • tissues were placed in 24-well microtiter plates (Corning) containing 500 µL of assay buffer [0.1 M sodium phosphate, pH 8.0, 0.5 mM potassium ferricyanide, 0.5 mM potassium ferrocyanide, 10 mM sodium EDTA, 1.9 mM 5-bromo-4-chloro-3-indolyl-beta-D-glucuronide, and 0.06% TRITON X-100] per well and incubated in the dark for 1-2 days at 37° C before analysis.
  • GUS expression was observed under a microscope as blue spots on callus or as intense blue staining of the entire callus; the results are presented in Table 4.
  • Four transgenic lines displayed GUS expression.
  • Example 7 Use Of omr1 As A Selectable Marker For Rice
  • NB 'callus induction' medium
  • the NB medium consisted of N6 macro elements (Chu, 1978, The N6 medium and its application to anther culture of cereal crops. Proc. Symp. Plant Tissue Culture, Peking Press, p43-56) , B5 micro elements and vitamins (Gamborg et al., 1968,
  • Part B Precipitation of gold particles for use in helium blasting.
  • callus was transferred back to the media with high osmoticum overnight before placing on selection medium, which consisted of NB medium with 1.5 mM O-methyl threonine (OMT) .
  • callus was produced which was transformed with the Arabidopsis thaliana mutated threonine dehydratase/deaminase (TD) denoted as omr1, cobombarded with pDAB1850 (plasmid containing omr1) and pDAB1518 (plasmid containing GUS).
  • Rice callus material was selected on L-O-methylthreonine and analyzed for threonine dehydratase/deaminase activity.
  • Threonine dehydratase/deaminase activity assays were performed on proteins extracted from each individual callus line and normalized for protein concentration (BioRAD Protein assay, Hercules, CA). Analysis was performed using threonine as the substrate, as described previously, and is shown in Table 5.
  • TD Threonine dehydratase/deaminase
  • have GUS activity; however, two of these lines did have increased TD activity compared to the nontransformed control lines.
  • the results described above demonstrate that transformation of rice callus with omr1 increased the overall TD activity.
  • the transgenic lines produced here were transformed with pDAB1850 and pDAB1518 containing the GUS reporter gene.
  • callus samples of each line were subjected to histochemical GUS analysis (Jefferson, 1987, Plant Mol. Biol. Rep. 5:387-405) as described herein.
  • the results in Table 5 provide clear evidence that the transgenics selected on OMT are true transformants and contain the gene of interest, i.e., GUS reporter gene.
  • omr1 can also be used as a selectable marker in rice and other monocots.
  • the recombinant plasmid containing the wild type allele OMR1 was named pGM-td54 and the OMR1 allele was manually sequenced using the Sequenase kit of USB and the
  • the full length cDNA of the omr1 locus was found to be 1779 nucleotides (Figure 7) encoding a TD protein of 592 amino acids (Figures 8 and 9).
  • the omr1 insert as shown in Figure 6b (SEQ ID NOS: 5-6) was not only strongly expressed in the first transgenic plants (T1) but was also inherited and strongly expressed in their progeny (T2 plants).
  • the full length cDNA of the OMR1 allele of the omr1 locus was 1779 nucleotides (Figure 10) encoding a wild type TD of 592 amino acids (Figures 11 and 12).
  • E. coli biosynthetic (Wek RC, Hatfield GC (1986) Nucleotide sequence and in vivo expression of ilvY and ilvC genes in Escherichia coli K12. Transcription from
  • the degree of similarity between amino acid residues of Arabidopsis threonine dehydratase/deaminase and those of other organisms was calculated by the Lipman-Pearson protein alignment method using the Lasergene software and was found to be 46.2% with chickpea, 52.7% with tomato, 55.0% with potato (partial), 45.0% with yeast 1, 24.7% with yeast 2, 43.4% with E. coli (biosynthetic), 39.3% with E. coli (catabolic) and 43.3% with Salmonella.
  • This point mutation resides in a conserved regulatory region of amino acids designated R6 (regulatory) by Taillon et al. (1988) where the mutated amino acid is normally an arginine residue in TD of Arabidopsis, chickpea, tomato, potato (partial), yeast 1, E. coli (biosynthetic) and Salmonella (Figure 19).
  • acg aac gag gct gag aac gga agc atc gcg gaa gct atg gag tat ttg 288 Thr Asn Glu Ala Glu Asn Gly Ser Ile Ala Glu Ala Met Glu Tyr Leu
  • gtt att gct gga caa ggg act gtt ggg atg gag atc act cgt cag gct 768 Val Ile Ala Gly Gln Gly Thr Val Gly Met Glu Ile Thr Arg Gln Ala 245 250 255
  • gtt cac aca gct gga gag ctc aaa gca cta cag aag aga atg gaa tct 1440 Val His Thr Ala Gly Glu Leu Lys Ala Leu Gln Lys Arg Met Glu Ser 465 470 475 480
  • acg aac gag gct gag aac gga agc atc gcg gaa gct atg gag tat ttg 288 Thr Asn Glu Ala Glu Asn Gly Ser Ile Ala Glu Ala Met Glu Tyr Leu 85 90 95
  • ttg gct aag aag cta tct aag aga tta ggt gtt cgt atg tat 384 Leu Gln Leu Ala Lys Lys Leu Ser Lys Arg Leu Gly Val Arg Met Tyr 115 120 125
  • gtt att gct gga caa ggg act gtt ggg atg gag atc act cgt cag gct 768 Val Ile Ala Gly Gln Gly Thr Val Gly Met Glu Ile Thr Arg Gln Ala
  • gtt cac aca gct gga gag ctc aaa gca cta cag aag aga atg gaa tct 1440 Val His Thr Ala Gly Glu Leu Lys Ala Leu Gln Lys Arg Met Glu Ser 465 470 475 480
  • cca ctc caa ttg gct aag aag cta tct aag aga tta ggt gtt cgt atg 432
  • gga gct tac aat atg atg gtg aaa ctt cca gca gat caa ttg gca aaa 528
  • cga gct gaa gaa gag ggt ctg acg ttt ata cct cct ttt gat cac cct 768 Arg Ala Glu Glu Gly Leu Thr Phe Ile Pro Pro Phe Asp His Pro 245 250 255
  • gga gtt cac aca gct gga gag ctc aaa gca cta cag aag aga atg gaa 1488 Gly Val His Thr Ala Gly Glu Leu Lys Ala Leu Gln Lys Arg Met Glu 485 490 495
  • gca cat gct aag ata cga gct gaa gaa gag ggt ctg acg ttt ata cct 432 Ala His Ala Lys Ile Arg Ala Glu Glu Glu Gly Leu Thr Phe Ile Pro 130 135 140
  • cct ttt gat cac cct gat gtt att gct gga caa ggg act gtt ggg atg 480 Pro Phe Asp His Pro Asp Val Ile Ala Gly Gln Gly Thr Val Gly Met 145 150 155 160
  • gag atc act cgt cag gct aag ggt cca ttg cat gct ata ttt gtg cca 528 Glu Ile Thr Arg Gln Ala Lys Gly Pro Leu His Ala Ile Phe Val Pro 165 170 175
  • gca atg gct ttg tcg ctg cat cac ggt gag agg gtg ata ttg gac cag 672 Ala Met Ala Leu Ser Leu His His Gly Glu Arg Val Ile Leu Asp Gln 210 215 220
  • agt ggc gct aac atg aac ttt gac aag cta agg att gtg aca gaa ctc 960 Ser Gly Ala Asn Met Asn Phe Asp Lys Leu Arg Ile Val Thr Glu Leu 305 310 315 320
  • cct gcc ggt tac ctc ggt gct gta cca gaa cgt acg aac gag gct gag 96 Pro Ala Gly Tyr Leu Gly Ala Val Pro Glu Arg Thr Asn Glu Ala Glu 20 25 30
  • tgc act gct gtg att gtt atg cct gtt acg act cct gag ata aag tgg 432 Cys Thr Ala Val Ile Val Met Pro Val Thr Thr Pro Glu Ile Lys Trp 130 135 140
  • gaa gtt ggt gaa gag act ttt cgt ata agc aga aat cta atg gat ggt 864 Glu Val Gly Glu Glu Thr Phe Arg Ile Ser Arg Asn Leu Met Asp Gly 275 280 285
  • gtt gtt ctt gtc act cgt gat gct att tgt gca tca ata aag gat atg 912 Val Val Leu Val Thr Arg Asp Ala Ile Cys Ala Ser Ile Lys Asp Met 290 295 300
  • gaa aag gag gct gtt gta cta tac agt gtc gga gtt cac aca gct gga 1248 Glu Lys Glu Ala Val Val Leu Tyr Ser Val Gly Val His Thr Ala Gly 405 410 415
  • gta cca gaa cgt acg aac gag gct gag aac gga agc atc gcg gaa gct 96
  • aag ctt cgt gga gct tac aat atg atg gtg aaa ctt cca gca gat caa 288 Lys Leu Arg Gly Ala Tyr Asn Met Met Val Lys Leu Pro Ala Asp Gln
  • cct gtt acg act cct gag ata aag tgg caa gct gta gag aat ttg ggt 432 Pro Val Thr Thr Pro Glu Ile Lys Trp Gln Ala Val Glu Asn Leu Gly 130 135 140
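
The sequence-listing excerpts above pair each row of codons with its three-letter amino acid translation. As an illustrative aid only, and not part of the disclosure, the following Python sketch shows how such a row can be checked against the standard genetic code. The function names `translate` and `cds_length` are hypothetical, and the length check assumes that the reported 1779 nucleotides correspond to the 592 codons plus a single stop codon.

```python
# Illustrative sketch: translate a row of codons with the standard genetic code.
# The table is built from the conventional t, c, a, g ordering of bases.
BASES = "tcag"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

# One-letter to three-letter amino acid codes ("Ter" marks a stop codon).
THREE_LETTER = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
    "*": "Ter",
}


def translate(codons: str) -> list[str]:
    """Translate whitespace-separated DNA codons into three-letter residues."""
    return [THREE_LETTER[CODON_TABLE[c]] for c in codons.lower().split()]


def cds_length(n_residues: int) -> int:
    """Nucleotides needed for n_residues codons plus one stop codon (assumed)."""
    return 3 * n_residues + 3


if __name__ == "__main__":
    row = "acg aac gag gct gag aac gga agc atc gcg gaa gct atg gag tat ttg"
    print(" ".join(translate(row)))
    print(cds_length(592))
```

Under these assumptions, a 592-residue protein corresponds to 1779 nucleotides, which agrees with the cDNA length reported for the omr1 locus.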

Abstract

The present invention relates to methods and materials in the field of molecular biology and the regulation of polypeptide synthesis through genetic engineering of plants and/or microorganisms. More particularly, the invention relates to newly-isolated nucleotide sequences, nucleotide sequences having substantial identity thereto and equivalents thereof, as well as polypeptides encoded thereby. The invention also involves the introduction of foreign nucleotide sequences into the genome of a plant and/or microorganism, wherein the introduction of the nucleotide sequence effects an increase in the transformant's resistance to toxic isoleucine structural analogs. Inventive sequences may therefore be used as excellent molecular markers for screening successful transformants, thereby replacing antibiotic resistance genes used in the prior art. Transformants harboring a nucleotide sequence comprising a promoter operably linked to an inventive nucleotide sequence demonstrate increased levels of isoleucine production, thereby providing an improved nutrient source.

Description

METHODS AND COMPOSITIONS FOR PRODUCING PLANTS AND MICROORGANISMS THAT EXPRESS FEEDBACK INSENSITIVE THREONINE DEHYDRATASE DEAMINASE
The present application is a continuation-in-part application of number PCT/US98/14362, filed July 10, 1998, designating the U.S., which claims the benefit of U.S. provisional application number 60/052,096, filed July 10, 1997 and U.S. provisional application number 60/074,875, filed February 17, 1998; all of which are incorporated herein by reference.
The present invention relates to methods and materials in the field of molecular biology and to the utilization of isolated nucleotide sequences to genetically engineer plants, and/or microorganisms. More particularly, the invention relates in certain preferred aspects to novel nucleotide sequences and uses thereof, including their use in DNA constructs for transforming plants, fungi, yeast & bacteria. The nucleotide sequences are particularly useful as selectable markers for screening plants and/or microorganisms for successful transformants and also for improving the nutritional value of plants.
BACKGROUND OF THE INVENTION
Threonine dehydratase/deaminase ("TD") is the first enzyme in the biosynthetic pathway of isoleucine, and catalyzes the formation of 2-oxobutyrate from threonine ("Thr") in a two-step reaction. The first step is a dehydration of Thr, followed by rehydration and liberation of ammonia. All reactions downstream from TD are catalyzed by enzymes that are shared by the two main branches of the biosynthetic pathway that lead to the production of the branched-chain amino acids, isoleucine ("Ile"), leucine ("Leu"), and valine ("Val"). An illustration of the biosynthetic pathway is set forth in
Figure 1. The cellular levels of Ile are controlled by negative feedback inhibition. When the cellular levels of Ile are high, Ile binds to TD at a regulatory site (allosteric site) that is different from the substrate binding site (catalytic site) of the enzyme. The formation of this Ile-TD complex causes conformational changes to TD, which prevent the binding of substrate, thus inhibiting the Ile biosynthetic pathway.
It is known that certain structural analogs of Ile exist which are toxic to a wide variety of plants and microorganisms. It is believed that these Ile analogs are toxic because cells incorporate the analogs into polypeptides in place of Ile, thereby synthesizing defective polypeptides. In this regard, L-O-methylthreonine ("OMT") was reported in 1955 to be a structural analog of Ile that inhibits growth of mammalian cell cultures by inhibiting incorporation of Ile into protein (Rabinovitz M, et al., Steric relationship between threonine and isoleucine as indicated by an antimetabolite study. J Am Chem Soc. 77:3109-3111 (1955)). It is believed that the same phenomenon explains growth inhibition caused by other structural analogs of Ile such as, for example, thiaIle. Certain strains of bacteria and yeast and certain plant lines have been identified which are resistant to the toxicity of the above-noted Ile structural analogs, and this resistance has been attributed to a mutation in the TD enzyme. The mutated TD apparently features a loss or decrease of Ile feedback sensitivity (referred to herein as "insensitivity"). As a result of this insensitivity, cells harboring insensitive TD produce increased amounts of Ile, thereby outcompeting the toxic Ile analog during incorporation into cellular proteins. For example, resistance to thiaIle has been associated in certain strains of bacteria and yeast with a loss of feedback sensitivity of TD to Ile. In Rosa cells, resistance to OMT was also associated with a TD that had
reduced sensitivity to feedback inhibition by Ile. Because the Rosa variant, the only known plant mutant with an Ile-insensitive TD, was maintained in tissue culture and had a high ploidy level, however, it was not possible to determine the genetic basis of its feedback insensitivity to Ile.
Turning to a field of research where the present invention finds advantageous application, selectable markers are widely used in methods for genetically transforming cells, tissues and organisms. Such markers are used to screen cells, most commonly bacteria, to determine whether a transformation procedure has been successful. As a specific example, it is widely known that constructs for transforming a cell may include as a selectable marker a nucleotide sequence that confers antibiotic resistance to the transformed cell. After transformation, the cells may be contacted with an antibiotic in a screening procedure. Only successful transformants, i.e., those which possess the antibiotic resistance gene, survive and continue to grow and proliferate in the presence of the antibiotic. This technique provides a manner whereby successful transformants may be identified and propagated, thereby eliminating the time consuming and costly alternative of growing and working with cells which were not successfully transformed.
The above-described screening technique is becoming less advantageous, however, because, due to prolonged exposure to antibiotics, an ever-increasing number of naturally-occurring microorganisms are developing antibiotic resistance by spontaneous mutation. The reliability of this screening technique is therefore compromised because the continuous exposure to antibiotics causes microorganisms that are not transformed to develop spontaneous mutations that confer antibiotic resistance.
In addition to the decreasing viability of this screening technique, the overuse of antibiotics, and the resulting resistance spontaneously developed by
microorganisms, is a growing medical concern as the efficacy of antibiotics in fighting bacterial infections is decreasing. Many infections, including meningitis, no longer respond well to drugs that once worked well against them. This phenomenon is attributed largely to the overuse of antibiotics, both as drugs and as a laboratory screening tool, and the resulting antibiotic resistance of a growing number of microorganisms. As an example, the bacteria that cause meningitis once were routinely controlled with ampicillin, a commonly prescribed antibiotic and one very heavily used as a selectable marker in screening transformed bacterial cells. Now, however, about 20 percent of such infections are resistant to ampicillin.
The present invention addresses the aforementioned problems in screening genetic transformants and provides nucleotide sequences which may be advantageously used as selectable markers, and which may be inserted into the genome of a plant or microorganism to provide a transformed plant or microorganism. Such a transformed plant or microorganism advantageously exhibits significantly increased levels of Ile synthesis and synthesis of intermediates of the Ile biosynthetic pathway and is therefore also capable of surviving in the presence of a toxic Ile analog.
SUMMARY OF THE INVENTION
The present invention provides nucleotide sequences, originally isolated and cloned from Arabidopsis thaliana , which encode feedback insensitive TD that may advantageously be used to transform a wide variety of plants, fungi, bacteria and yeast. Inventive forms of TD are not only insensitive to feedback inhibition by isoleucine, but are also insensitive to structural analogs of isoleucine that are toxic to plants and microorganisms which synthesize only wild-type TD. Therefore, inventive nucleotide sequences encoding mutated forms of TD can be used to create cells that are
insensitive to compounds normally toxic to cells expressing only wild-type TD enzymes. In this regard, an inventive nucleotide sequence may be used in a DNA construct to provide a biochemical selectable marker.
One aspect of the present invention is identification, isolation and purification of a gene encoding a wild-type form of TD. The DNA sequence thereof can be used as disclosed herein to determine the complete amino acid sequence for the protein encoded thereby and thus allow identification of domains found therein that can be mutated to produce additional TD proteins having altered enzymatic characteristics.
In another aspect of the invention, there are provided isolated and purified polynucleotides, the polynucleotides encoding a mutated form of TD, or a portion thereof, as disclosed herein. For example, the invention provides isolated polynucleotides comprising the sequence set forth in SEQ ID NO: 3, SEQ ID NO: 5, SEQ ID NO: 7, SEQ ID NO: 9, SEQ ID NO: 11, SEQ ID NO: 13, SEQ ID NO: 15, SEQ ID NO: 17, SEQ ID NO: 19, SEQ ID NO: 21, SEQ ID NO: 23 and SEQ ID NO: 25, nucleotide sequences having substantial identity thereto, and nucleotide sequences encoding TD variants of the invention. Also provided are isolated polypeptides comprising the amino acid sequence set forth in SEQ ID NO: 4, SEQ ID NO: 6, SEQ ID NO: 8, SEQ ID NO: 10, SEQ ID NO: 12, SEQ ID NO: 14, SEQ ID NO: 16, SEQ ID NO: 18, SEQ ID NO: 20, SEQ ID NO: 22, SEQ ID NO: 24 and SEQ ID NO: 26, and variants thereof selected in accordance with the invention.
In an alternate aspect of the invention, there is provided a chimeric DNA construct comprising a promoter operably linked to a nucleotide sequence encoding a threonine dehydratase/deaminase that is substantially resistant to feedback inhibition.
In a cell harboring the construct, the nucleotide sequence can be transcribed to produce mRNA and said mRNA can be translated to produce either mature, mutated TD or a precursor mutated TD protein, said protein being functional in said cell. Also
provided, therefore, is a vector useful for transforming a cell, and plants and microorganisms transformed therewith, the vector comprising a DNA construct selected in accordance with the invention. In alternate aspects of the invention, there are provided cells and plants having incorporated into their genome a foreign nucleotide sequence operably linked to a promoter, the foreign sequence comprising a nucleotide sequence having substantial identity to a sequence set forth herein or a foreign nucleotide sequence encoding an inventive polypeptide.
In another aspect of the invention, there is provided a method comprising incorporating into a plant's genome an inventive DNA construct to provide a transformed plant; wherein the transformed plant is capable of expressing the nucleotide sequence.
Yet another aspect of the invention is the production and propagation of cells transformed in accordance with the invention, wherein the cells express a mutated TD enzyme, thus making the cells resistant to feedback inhibition by isoleucine, and resistant to molecules that are toxic to a cell producing only the wild-type TD enzyme. In this regard, there is provided a method comprising providing a vector featuring a promoter operably linked to a nucleotide sequence encoding a threonine dehydratase/deaminase that is resistant to feedback inhibition, wherein the promoter regulates expression of the nucleotide sequence in a host plant cell; and transforming a target plant with the vector to provide a transformed plant, the transformed plant being capable of expressing the nucleotide sequence. Plants transformed in accordance with the invention have within their chloroplasts a mature, mutated form of TD, which renders the cells resistant to toxic Ile analogs. Also provided are transformed plants obtained according to inventive methods and progeny thereof.
Also provided is a method for screening potential transformants, comprising (1) providing a plurality of
cells, wherein at least one of the cells has in its genome an expressible foreign nucleotide sequence selected in accordance with the invention; and (2) contacting the plurality of cells with a substrate comprising a toxic isoleucine structural analog; wherein cells comprising the expressible foreign nucleotide sequence are capable of growing in the substrate and wherein cells not comprising the expressible foreign nucleotide sequence are incapable of growing in the substrate.
In another aspect of the invention, there is provided a construct comprising a primary nucleotide sequence to be introduced into the genome of a target cell, tissue and/or organism, and further comprising a biochemical selectable marker selected in accordance with the invention. This aspect of the invention may be advantageously used to transform a wide variety of cells, including microorganisms and plant cells. After introducing the DNA construct, which also includes an appropriate promoter and such other regulatory sequences as may be selected by a skilled artisan, into a target plant or microorganism, the plant or microorganism may be grown in a substrate comprising a toxic isoleucine analog (a "toxic substrate"), thereby providing a mechanism for the early determination whether the transformation was successful. Where a plurality of plants or microorganisms are transformed, placing potential transformants into a toxic substrate provides an early screening step whereby successful transformants may be identified. It is readily understood by a person skilled in the relevant field, in view of the present specification, that successful transformants will grow normally in the toxic substrate by virtue of expression of the insensitive TD; however, unsuccessfully transformed plants and/or microorganisms will die due to the toxic effect of the substrate. Transformed plants may thereby be identified quickly in accordance with the invention, and transformed microorganisms may be
identified in accordance with the invention without using antibiotic resistance genes.
In another aspect of the invention, there is provided a method for reliably incorporating a first, expressible, foreign nucleotide sequence into a target cell, comprising providing a vector comprising a promoter operably linked to a first primary nucleotide sequence and a second nucleotide sequence selected in accordance with the invention, the second sequence encoding an insensitive TD enzyme; transforming the target cell with the vector to provide a transformed cell; and contacting the cell with a substrate comprising L-O-methylthreonine wherein successfully transformed cells are capable of growing in the substrate, and wherein unsuccessfully transformed cells are incapable of growing in the substrate.
In an alternate aspect of the invention, there is provided a method for growing a plurality of plants in the absence of undesirable plants, such as, for example, weeds, the method comprising providing a plurality of plants, each having in its genome a foreign nucleotide sequence comprising a promoter operably linked to a nucleotide sequence selected in accordance with the invention; growing the plurality of plants in a substrate; and introducing a preselected amount of an isoleucine structural analog into the substrate. TD enzymes described herein function in the chloroplasts of a plant cell. Therefore, it is readily appreciated by a skilled artisan that a nucleotide sequence inserted into a plant cell will necessarily encode a precursor TD peptide. Thus, chimeric DNA constructs are described herein that comprise a first nucleotide sequence encoding a mature mutated form of TD and a second nucleotide sequence encoding a chloroplast transit peptide of choice, the second sequence being functionally attached to the 5' end of the first sequence. Expression of the chimeric DNA construct results in the production of a mutated precursor TD
enzyme that can be translocated to a chloroplast. The presence of a mature mutated TD in the chloroplast results in a plant cell having characteristics described herein.
It is an object of the present invention to provide isolated nucleotide sequences, which may be introduced into the genome of a plant or microorganism to increase the ability of the plant or microorganism to synthesize Ile and intermediates of the Ile biosynthetic pathway. Additionally, it is an object of the invention to provide nucleotide sequences, which may be used as excellent biochemical selectable markers for identifying successful transformants in genetic engineering protocols. It is also an object of the invention to provide a novel, efficient, selective, environmentally-friendly herbicide system.
Further objects, advantages and features of the present invention will be apparent from the detailed description herein.
BRIEF DESCRIPTION OF THE FIGURES
Although the characteristic features of this invention will be particularly pointed out in the claims, the invention itself, and the manner in which it may be made and used, may be better understood by referring to the following description taken in connection with the accompanying figures forming a part hereof.
Figure 1 illustrates the biosynthetic pathway of the branched-chain amino acids valine, leucine and isoleucine.
Figure 2 sets forth the alignment of the amino acid sequence of TD of tomato and chickpea. C regions are highly conserved regions of the catalytic site of TD while R regions are highly conserved regions of the regulatory site of TD. Also shown are the locations of the degenerate oligonucleotide primers TD205 and TD206
used to PCR-amplify an Arabidopsis TD genomic DNA fragment.
Figure 3 sets forth the structure and degree of degeneracy of the two oligonucleotide primers TD205 and TD206 used in amplifying an Arabidopsis genomic DNA fragment of the TD gene omr1. TD205 is anchored with an EcoRI site (underlined) at its 5' end and TD206 is anchored with a HindIII site (underlined) at its 5' end.
Figure 4 sets forth the DNA sequence of clone 23 (pGM-td23) isolated from a cDNA library of the mutated line GM11b (omr1/omr1) of Arabidopsis thaliana.
Figure 5 sets forth the nucleotide sequence and the predicted amino acid sequence of clone 23 as isolated from the cDNA library constructed from line GM11b of Arabidopsis (omr1/omr1). The TD insert in clone 23 is in the pBluescript vector between the EcoRI and XhoI sites. An open reading frame (top reading frame) was observed which showed an ATG codon at nucleotide 166 and a termination codon at nucleotide 1801.
Figure 6a depicts the structure of the expression vector pCM35S-omr1 used in the transformation of wild-type Arabidopsis thaliana and which expressed a mutated form of TD capable of conferring resistance to the toxic analog L-O-methylthreonine upon transformants.
Figure 6b sets forth the nucleotide sequence and the predicted amino acid sequence of the chimeric mutant omr1 expressing resistance to L-O-methylthreonine in transgenic Arabidopsis plants that have been transformed with the expression vector pCM35s-omr1 (shown in Figure 6a). The total length of the fusion (chimeric) mutant TD expressed in transgenic plants was 609 amino acid residues. The first 9 amino-terminal residues start with a methionine encoded by a start codon (ATG) furnished by the 3' end of the nucleotide sequence of the CaMV 35S promoter linked to the omr1 insert of clone 23. The following 12 amino acid residues are generated by the nucleotide sequence of the polylinker region from the multiple cloning site of the vector, and finally the remaining 585 amino acid residues are encoded by the omr1 mutant allele of Arabidopsis as present in clone 23. The first residue of the 585 amino acid long portion encoded by omr1 in pCM35s-omr1 corresponds to threonine (Thr), which is amino-terminal residue number 8 of the full length omr1 cDNA shown in Figures 8 and 9 and SEQ ID NO: 2.
Figure 7 is the nucleotide sequence of the full length cDNA of the omr1 allele encoding mutated TD. The total length of the cDNA of omr1 is 1779 nucleotides including the stop codon.
Figure 8 is the predicted amino acid sequence of the mutated TD encoded by omr1. The total length of the TD protein encoded by omr1 is 592 amino acids.
Figure 9 is the nucleotide sequence and the predicted amino acid sequence encoded by the mutated allele omr1 of line GM11b of Arabidopsis thaliana.
Figure 10 is the nucleotide sequence of the full length cDNA of the wild type allele OMR1 encoding wild type TD.
Figure 11 is the predicted amino acid sequence of the wild type TD encoded by OMR1.
Figure 12 is the nucleotide sequence and the predicted amino acid sequence encoded by the wild type allele OMR1 of Arabidopsis thaliana Columbia wild type.
Figure 13 sets forth the multi-alignment of the deduced amino acid sequence of the wild-type TD of Arabidopsis thaliana reported in this disclosure with that from other organisms obtained from GenBank with the following accession numbers: 940472 for chickpea; 10257 for tomato; 401179 for potato; 730940 for yeast 1; 134962 for yeast 2; 68318 for E. coli biosynthetic; 135723 for E. coli catabolic; 1174668 for Salmonella typhimurium. The MegAlign program of the Lasergene software, DNASTAR Inc., Madison, Wisconsin was used.
Figure 14 is a portion of the DNA sequencing gel comparing the nucleotide sequence of the mutated omr1 allele and its wild-type allele OMR1 and showing the base substitution C (in OMR1) to T (in omr1) at nucleotide residue 1495 starting from the beginning of the coding sequence. The arrow points to the base substitution.
Figure 15 depicts the point mutation in omr1 at nucleotide residue 1495, predicting an amino acid substitution from arginine (R) to cysteine (C) at amino acid residue 499 at the TD level.
Figure 16 sets forth the amino acid sequence at the regulatory region R4 of TD encoded by mutated omr1 and wild type OMR1 alleles of Arabidopsis thaliana compared to that from several organisms. The arrow points to the mutated amino acid residue in omr1.
Figure 17 is a portion of the DNA sequencing gel comparing the nucleotide sequence of the mutated omr1 allele and its wild-type allele OMR1 and showing the base substitution G (in OMR1) to A (in omr1) at nucleotide residue 1631. The arrow points to the base substitution.
Figure 18 depicts the point mutation in omr1 at nucleotide residue 1631, predicting an amino acid substitution from arginine (R) to histidine (H) at amino acid residue 544 at the TD level.
Figure 19 sets forth the amino acid sequence at the regulatory region R6 of TD encoded by mutated omr1 and wild type OMR1 alleles of Arabidopsis thaliana compared to that from several organisms. The arrow points to the mutated amino acid residue in omr1.
Figure 20 is a map of plasmid pGMtd23.
Figure 21 is a map of plasmid pDAB1850.
Figure 22 is a map of plasmid pDAB1852.
Figure 23 is a map of plasmid pDAB311.
Figure 24 is a map of plasmid pDAB305.
Figure 25 is a map of plasmid pDAB1518.
DETAILED DESCRIPTION
The present invention relates to methods and compositions for obtaining transformed cells, said cells expressing therein a mutated form of threonine dehydratase/deaminase ("TD"). One feature of the present invention involves the discovery, isolation and characterization of a gene sequence from Arabidopsis thaliana, designated omr1, which encodes a surprisingly advantageous mutated form of the enzyme TD. The present invention relates in another aspect to amino acid sequences that comprise functional, feedback-insensitive TD enzymes. Aspects of the present invention thus relate to nucleotide sequences encoding mutated forms of TD, which sequences may be introduced into target plant cells or microorganisms to provide a transformed plant or microorganism having a number of desirable features. The mutated forms of TD, unlike wild-type TD, are resistant to negative feedback inhibition by isoleucine ("Ile"), and transformed cells are resistant to molecules which are toxic to cells that do not express feedback insensitive TD. Therefore, transformants harboring an expressible inventive nucleotide sequence demonstrate increased levels of isoleucine production and increased levels of production of intermediates in the Ile biosynthetic pathway, and the transformants are resistant to Ile structural analogs which are lethal to non-transformants, which express only wild-type TD.
The invention therefore provides isolated nucleotide sequences encoding mutated TD-functional polypeptides ("mutated TD") which are resistant to Ile feedback inhibition and are resistant to the toxic effect of Ile analogs. These inventive nucleotide sequences can be incorporated into vectors, which in turn can be used to transform other microorganisms and plant cells. Such transformation can be used, for instance, for purposes of providing a selectable marker, to increase plant nutritional value or to increase the production of commercially-important intermediates of the isoleucine biosynthetic pathway. Expression of the mutated TD results in the cell having altered susceptibility to certain enzyme inhibitors relative to cells having wild-type TD only.
These and other features of the invention are described in further detail below. For purposes of promoting an understanding of the principles of the invention, reference will now be made to particular embodiments of the invention and specific language will be used to describe the same. It will nevertheless be understood that no limitation of the scope of the invention is thereby intended, such alterations and further modifications in the invention, and such further applications of the principles of the invention as described herein being contemplated as would normally occur to one skilled in the art to which the invention pertains.
Definitions
The following definitions are provided in order to provide clarity as to the intent or scope of their usage in the specification and claims. All patents and publications referred to herein are incorporated by reference herein.
The term TD enzyme is used to refer generally to a wild-type TD amino acid sequence, to a mutated TD selected in accordance with the invention, and to variants of each which catalyze the reaction of threonine to 2-oxobutyrate in the Ile biosynthetic pathway, as described herein. For purposes of clarity, the wild-type form is distinguished from a mutated form, where necessary, by usage of the terms wild-type TD and mutated TD.
The terms transit peptide, chloroplast leader sequence, and signal peptide are used interchangeably to designate those amino acids that direct a passenger peptide to a chloroplast. By mature peptide or enzyme or passenger peptide or enzyme is meant a polypeptide which is found after processing and passing into an organelle and which is functional in the organelle for its intended purpose. The chloroplast leader sequence is covalently bound to the mature enzyme or passenger enzyme. By precursor protein is meant a polypeptide having a transit peptide and a passenger peptide covalently attached to each other. Typically, the carboxy terminus of the transit peptide is covalently attached to the amino terminus of the passenger peptide. The passenger peptide and transit peptide can be encoded by the same gene locus, that is, homologous to each other, in that they are encoded in a manner isolated from a single source. Alternatively, the transit peptide and passenger peptide can be heterologous to each other, i.e., the transit peptide and passenger peptide can be from different genes and/or different organisms. Passenger peptides are originally made in a precursor form that includes a transit peptide and the passenger peptide. Upon entry into an organelle, the transit peptide portion is cleaved, thus leaving the "passenger" or "mature" peptide. Passenger peptides are the polypeptides typically obtained upon purification from a homogenate, the sequence of which can be determined as described herein.
As used herein in connection with cells and plants, the terms transformed and transgenic are used interchangeably to refer to a cell or plant expressing a foreign nucleotide sequence introduced through transformation efforts. The term foreign nucleotide sequence is intended to indicate a sequence encoding a polypeptide whose exact amino acid sequence is not normally found in the host cell, but is introduced therein through transformation techniques. A structural gene is that portion of a gene comprising a DNA segment encoding a protein, polypeptide or a portion thereof, and excluding the 5' sequence which drives the initiation of transcription. The structural gene may be one which is normally found in the cell or one which is not normally found in the cellular location wherein it is introduced, in which case it is termed a heterologous gene. A heterologous gene may be derived in whole or in part from any source known to the art, including a bacterial genome or episome, eukaryotic, nuclear or plasmid DNA, cDNA, viral DNA or chemically synthesized DNA. A structural gene may contain one or more modifications in either the coding or the untranslated regions which could affect the biological activity or the chemical structure of the expression product, the rate of expression or the manner of expression control. Such modifications include, but are not limited to, mutations, insertions, deletions and substitutions of one or more nucleotides. The structural gene may constitute an uninterrupted coding sequence or it may include one or more introns, bounded by the appropriate splice junctions. The structural gene may be a composite of segments derived from a plurality of sources (naturally occurring or synthetic, where synthetic refers to DNA that is chemically synthesized). The structural gene may also encode a fusion protein.
Plant tissue includes differentiated and undifferentiated tissues of plants, including, but not limited to, roots, shoots, leaves, pollen, seeds, tumor tissue and various forms of cells in culture, such as single cells, protoplasts, embryos and callus tissue. The plant tissue may be in plants or in organ, tissue or cell culture. Plant cell as used herein includes plant cells in plants and plant cells and protoplasts in culture.
As used herein, the terms chimeric polynucleotide, chimeric DNA construct and chimeric DNA are used to refer to recombinant DNA. By promoter regulatory element is meant nucleotide sequence elements within a nucleotide sequence which control the expression of that nucleotide sequence. Promoter regulatory elements provide the nucleic acid sequences necessary for recognition of RNA polymerase and other transcriptional factors required for efficient transcription. Promoter regulatory elements are meant to include constitutive, tissue-specific, developmental-specific, inducible promoters and the like. Promoter regulatory elements may also include certain enhancer sequence elements that improve transcriptional efficiency.
Operably linked refers to a juxtaposition wherein the components are configured so as to perform their usual function. Thus, control sequences operably linked to a coding sequence are capable of effecting the expression of the coding sequence.
Homology refers to identity or near identity of nucleotide or amino acid sequences. As is understood in the art, nucleotide mismatches can occur at the third or wobble base in the codon without causing amino acid substitutions in the final polypeptide sequence. Also, minor nucleotide modifications (e.g., substitutions, insertions or deletions) in certain regions of the gene sequence can be tolerated whenever such modifications result in changes in amino acid sequence that do not alter functionality of the final product. It has been shown that chemically synthesized copies of whole, or parts of, gene sequences can replace the corresponding regions in the natural gene without loss of gene function. Homologs of specific DNA sequences may be identified by those skilled in the art using the test of cross-hybridization of nucleic acids under conditions of stringency as is well understood in the art (as described in Hames et al., Nucleic Acid Hybridisation, (1985) IRL Press, Oxford, UK). Extent of homology is often measured in terms of percentage of identity between the sequences compared.
Similar to homology, the term substantial identity is used herein with respect to a nucleotide sequence to designate that the nucleotide sequence has a sequence sufficiently similar to a reference nucleotide sequence that it will hybridize therewith under moderately stringent conditions, this method of determining identity being well known in the art to which the invention pertains. Briefly, moderately stringent conditions are described in Sambrook et al., Molecular Cloning: A Laboratory Manual, 2nd ed., Vol. 1, pp. 101-104, Cold Spring Harbor Laboratory Press (1989) as including the use of a prewashing solution of 5 x SSC, 0.5% SDS, 1.0 mM EDTA (pH 8.0) and hybridization and washing conditions of about 55°C, 5 x SSC.
The term nucleotide sequence, as used herein, is intended to refer to a natural or synthetic linear and sequential array of nucleotides and/or nucleosides, and derivatives thereof. The terms encoding and coding refer to the process by which a nucleotide sequence, through the mechanisms of transcription and translation, provides the information to a cell from which a series of amino acids can be assembled into a specific amino acid sequence to produce a functional polypeptide, such as, for example, an active enzyme. The process of encoding a specific amino acid sequence may involve DNA sequences having one or more base changes (i.e., insertions, deletions, substitutions) that do not cause a change in the encoded amino acid, or which involve base changes which may alter one or more amino acids, but do not eliminate the functional properties of the polypeptide encoded by the DNA sequence.
The term "amino acid sequence" is used herein to designate a plurality of amino acids linked in a serial array. Skilled artisans will recognize that through the process of mutation and/or evolution, polypeptides of different lengths and having differing constituents, e.g., with amino acid insertions, substitutions, deletions, and the like, may arise that are related to a sequence set forth herein by virtue of amino acid sequence homology and advantageous functionality as described in detail herein.
Amino Acid Sequences
It is not intended that the present invention be limited to the specific sequences set forth herein. It is well known that plants and microorganisms of a wide variety of species commonly express and utilize analogous
enzymes and/or polypeptides which have varying degrees of degeneracy, and yet which effectively provide the same or a similar function. For example, an amino acid sequence isolated from one species may differ to a certain degree from the wild-type TD sequence set forth in SEQ ID NO: 1 (nucleic acid sequence) and SEQ ID NO: 2 (corresponding amino acid sequence), and yet have similar functionality with respect to catalytic and regulatory function. Amino acid sequences comprising such variations are included within the scope of the present invention and are considered substantially similar to a reference amino acid sequence. It is believed that the identity between amino acid sequences that is necessary to maintain proper functionality is related to maintenance of the tertiary structure of the polypeptide such that specific interactive sequences will be properly located and will have the desired activity. While it is not intended that the present invention be limited by any theory by which it achieves its advantageous result, it is contemplated that a polypeptide including these interactive sequences in proper spatial context will have good activity, even where alterations exist in other portions thereof. In this regard, a TD variant is expected to be functionally similar to the wild-type TD set forth in SEQ ID NO: 2, for example, if it includes amino acids which are conserved among a variety of species or if it includes non-conserved amino acids which exist at a given location in another species that expresses functional TD. Figure 13 sets forth an amino acid alignment of TD polypeptides of a number of species.
Two significant observations which may be made based upon Figure 13 are (1) that there is a high degree of conservation of amino acids at many locations among the species shown, and (2) a number of insertions, substitutions and/or deletions are represented in the TD of certain species and/or strains, which do not eliminate the dual functionality of the respective TD enzymes. For example, on Page 4 of Figure 13, Regulatory Region 4 ("R4") of wild-type Arabidopsis is depicted which comprises the following sequence (corresponding to the underlying three-letter codes numbered as set forth in SEQ ID NO: 1):
V   N   L   T   T   S   D   L   V   K   D   H   L   R   Y   L   M   G   G
Val Asn Leu Thr Thr Ser Asp Leu Val Lys Asp His Leu Arg Tyr Leu Met Gly Gly
486             490                 495                 500
The degeneracy shown in Figure 13 in this portion of the sequence provides examples of substitutions which may be made without substantially altering the functionality of the wild-type sequence set forth in SEQ ID NO: 2. For example, it is expected that the Asp ("D") at position 492 could be substituted with a Glu ("E") and that the Leu ("L") at position 493 could be substituted with a Met ("M") without substantially altering the functionality of the amino acid sequence.
The following sets forth a plurality of sequences of R4, depicted such that acceptable substitutions are set forth at various amino acid locations. The sequences encompassed thereby are expected to exhibit similar functionality to the corresponding portion of SEQ ID NO: 1. A slash ("/") between two or in a series of amino acids indicates that any one of the amino acids indicated may be present at that location. Each entry is preceded by its amino acid position.
486 Val/Leu/Phe/Ile
487 Asn/Asp/Glu/Ser
488 Leu/Ile/Phe/Val/Gly
489 Thr/Ser/Ala/Gly
490 Thr/His/Asp/Asn
491 Ser/Asn/Asp/Ile
492 Asp/Glu
493 Leu/Met
494 Val/Ala
495 Lys/Val/Ala
496 Asp/Ile/Glu/Ser
497 His
498 Leu/Gly/Ile/Val
499 Arg/Lys
500 Tyr/His
501 Leu/Met
502 Met/Val
503 Gly
504 Gly
It is understood that analogous substitutions throughout the sequence are encompassed within the scope of the invention, and that Region R4 is simply used above for purposes of illustration.
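By way of illustration only, the R4 substitution listing above may be treated as a per-position lookup. The following minimal sketch assumes that the allowed residue sets are transcribed exactly as listed for positions 486 to 504; the names R4_ALLOWED, R4_START and r4_mismatches are hypothetical and form no part of the disclosure. The sketch only reports residues falling outside the listed sets and makes no prediction of enzyme function.

```python
# Allowed residues at each R4 position (486-504), transcribed from the
# substitution listing in the text. Names here are illustrative assumptions.
R4_ALLOWED = [
    "Val/Leu/Phe/Ile", "Asn/Asp/Glu/Ser", "Leu/Ile/Phe/Val/Gly",
    "Thr/Ser/Ala/Gly", "Thr/His/Asp/Asn", "Ser/Asn/Asp/Ile",
    "Asp/Glu", "Leu/Met", "Val/Ala", "Lys/Val/Ala",
    "Asp/Ile/Glu/Ser", "His", "Leu/Gly/Ile/Val", "Arg/Lys",
    "Tyr/His", "Leu/Met", "Met/Val", "Gly", "Gly",
]
R4_START = 486  # position of the first R4 residue


def r4_mismatches(residues):
    """Return (position, residue) pairs that fall outside the listed sets."""
    if len(residues) != len(R4_ALLOWED):
        raise ValueError("expected %d residues" % len(R4_ALLOWED))
    bad = []
    for offset, (res, allowed) in enumerate(zip(residues, R4_ALLOWED)):
        if res not in allowed.split("/"):
            bad.append((R4_START + offset, res))
    return bad


# Wild-type Arabidopsis R4 as given in the text.
WILD_TYPE_R4 = ("Val Asn Leu Thr Thr Ser Asp Leu Val Lys "
                "Asp His Leu Arg Tyr Leu Met Gly Gly").split()

# Variant with the two substitutions named in the text (Asp492Glu, Leu493Met).
variant = list(WILD_TYPE_R4)
variant[492 - R4_START] = "Glu"
variant[493 - R4_START] = "Met"

print(r4_mismatches(WILD_TYPE_R4))
print(r4_mismatches(variant))
```

Both calls are expected to report no mismatches, since the wild-type residues and the two named substitutions all appear in the listing.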
Another manner in which similarity may exist between two amino acid sequences is where a given amino acid is substituted with another amino acid from the same amino acid group. In this manner, it is known that serine may commonly be substituted with threonine in a polypeptide without substantially altering the functionality of the polypeptide. The following sets forth groups of amino acids which are believed to be interchangeable in inventive amino acid sequences at a wide variety of locations without substantially altering the functionality thereof:
Group I: Nonpolar amino acids: alanine, valine, proline, leucine, phenylalanine, tryptophan, methionine, isoleucine, cysteine, glycine;
Group II: Uncharged polar amino acids: serine, threonine, asparagine, glutamine, tyrosine;
Group III: Charged polar acidic amino acids: aspartic, glutamic; and
Group IV: Charged polar basic amino acids: lysine, arginine, histidine.
Where one is unsure whether a given substitution will affect the functionality of the enzyme, this may be determined without undue experimentation using synthesis techniques and screening assays known in the art.
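The four amino acid groups set forth above lend themselves to a simple membership test. The following is a minimal sketch, assuming the grouping exactly as listed in this specification (other classification schemes place some residues, such as cysteine, differently); the names GROUPS, group_of and same_group are hypothetical. Membership in the same group indicates only that a substitution is of the kind described above as expected to be tolerated, and is no substitute for the screening assays mentioned.

```python
# Amino acid groups as listed in the text (three-letter codes).
GROUPS = {
    "I": {"Ala", "Val", "Pro", "Leu", "Phe", "Trp", "Met", "Ile", "Cys", "Gly"},
    "II": {"Ser", "Thr", "Asn", "Gln", "Tyr"},
    "III": {"Asp", "Glu"},
    "IV": {"Lys", "Arg", "His"},
}


def group_of(residue):
    """Return the group label (I-IV) containing the given residue."""
    for label, members in GROUPS.items():
        if residue in members:
            return label
    raise ValueError("unknown residue: %s" % residue)


def same_group(a, b):
    """True when both residues belong to the same group."""
    return group_of(a) == group_of(b)


print(same_group("Ser", "Thr"))  # both in Group II
print(same_group("Arg", "His"))  # both in Group IV
print(same_group("Arg", "Cys"))  # Group IV versus Group I
```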
Having established the meaning of similarity with respect to an amino acid sequence, it is important to note that the invention features mutated amino acid sequences comprising one or more amino acid substitutions that do alter the functionality of the wild-type TD enzyme. Inventive insensitive TD enzymes are therefore not similar to wild-type TD, as that term is defined and used herein, because inhibition functionality is altered. Insensitive TD enzymes feature one or more mutations in the regulatory site which mutations alter the functionality of the regulatory site without substantially altering the functionality of the catalytic site. In one specific aspect of the invention, there is provided an amino acid sequence (SEQ ID NO: 4) having two substitutions, this sequence comprising a mutated TD which has good catalytic functionality but which does not exhibit regulatory functionality. In other words, the enzyme set forth in SEQ ID NO: 4 comprises a feedback insensitive Arabidopsis thaliana TD. It is seen upon comparing the wild type TD set forth in SEQ ID NO: 2 and the mutated sequence of SEQ ID NO: 4, which comprises a specific embodiment of the invention, that the sequences differ only by two point mutations in the respective nucleotide sequences (C to T at nucleotide 1495; and G to A at nucleotide 1631), which result in two amino acid substitutions in the TD polypeptide (Arg to Cys at amino acid location 499; and Arg to His at amino acid location 544). The first mutation is in regulatory region R4 of TD, and the second is in regulatory region R6 of TD. The Arg to Cys substitution at amino acid residue 499 changed a charged, polar, basic amino acid (Arg) to a nonpolar amino acid (Cys) which altered the feedback site in TD. On the other hand, the change of Arg to His at residue 544 was a change from a charged, polar, basic amino acid (Arg) to another charged, polar, basic amino acid (His). While it is not intended that the present invention be limited by any theory by which it achieves its advantageous result, it is believed that the substitution at residue 544 alone may not have substantially altered the feedback site of TD, and, in contrast, that the substitution at residue 499 alone may have desensitized TD encoded thereby to feedback regulation. Certainly, when combined, the substitutions were very effective in desensitizing TD encoded by omr1 to feedback regulation.
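The correspondence between the nucleotide positions and the amino acid positions recited above can be checked by simple arithmetic. The following minimal sketch assumes that nucleotide positions are counted from the first base of the coding sequence and that the standard genetic code applies; the names CODON_TABLE, locate and arg_codons_giving are hypothetical. The actual codons present in OMR1 are not recited in this passage, so the sketch merely enumerates which arginine codons would be consistent with each reported substitution.

```python
from itertools import product

# Standard genetic code, built in T, C, A, G order ("*" marks a stop codon).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def locate(nt_pos):
    """Map a 1-based coding position to (codon number, 1-based base in codon)."""
    return (nt_pos - 1) // 3 + 1, (nt_pos - 1) % 3 + 1


def arg_codons_giving(nt_pos, old_base, new_base, target_aa):
    """List Arg codons that yield target_aa after old_base -> new_base."""
    _, pos = locate(nt_pos)
    hits = []
    for codon, aa in CODON_TABLE.items():
        if aa == "R" and codon[pos - 1] == old_base:
            mutated = codon[:pos - 1] + new_base + codon[pos:]
            if CODON_TABLE[mutated] == target_aa:
                hits.append(codon)
    return sorted(hits)


print(locate(1495))  # codon 499, first base
print(locate(1631))  # codon 544, second base
print(arg_codons_giving(1495, "C", "T", "C"))  # Arg -> Cys candidates
print(arg_codons_giving(1631, "G", "A", "H"))  # Arg -> His candidates
```

Under these assumptions, nucleotide 1495 falls on the first base of codon 499 and nucleotide 1631 on the second base of codon 544, in agreement with the amino acid positions recited above.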
It is recognized that the amino acid sequence set forth in SEQ ID NO: 6 (585 residues encoded by omr1) is a truncated version, missing 7 amino-terminal residues, of that set forth in SEQ ID NO: 4. It is seen from the following description, including the Examples set forth herein, that a significant amount of research was performed based upon this slightly shortened version, and that the slightly shortened version may be advantageously used to transform a wide variety of plants and microorganisms. It is believed that the portion of the amino acid sequence that is present in SEQ ID NO: 4 and absent in SEQ ID NO: 6 is a portion of the chloroplast leader sequence, and not present in the mature TD enzyme.
As mentioned above, to assist in the description of the present invention, SEQ ID NO: 2 is provided which sets forth an amino acid sequence comprising a wild-type TD from Arabidopsis thaliana. SEQ ID NOS: 4 and 6 set forth amino acid sequences comprising precursor proteins of differing lengths. SEQ ID NO: 6 (see also Figure 6b) comprises a 609 amino acid fusion or chimeric polypeptide of which 585 amino acid residues are encoded by mutant omr1 of Arabidopsis. That is, SEQ ID NO: 6 comprises a mutant TD that is shorter than the full-length mutant TD shown in SEQ ID NO: 4 by 7 amino terminal residues. Since transgenic plants transformed with pCM35s-omr1 were capable of expressing OMT resistance, the 585 amino acid-long truncated precursor was fully capable of translocation from the cytoplasm to the chloroplast. SEQ ID NOS: 8, 10 and 12 set forth sequences comprising three predicted mature proteins. SEQ ID NO: 14 sets forth the putative regulatory site of an inventive mutated TD enzyme, and SEQ ID NOS: 16 and 18 set forth regulatory regions harboring mutations in accordance with one aspect of the invention.
It is understood that the wild-type TD enzyme features dual functionality. Specifically, the TD enzyme has a catalytic site which is divided into catalytic regions C1-C5, as shown with respect to the analogous tomato TD enzyme and chickpea TD enzyme in Figure 2. The catalytic site catalyzes the reaction of threonine to 2-oxobutyrate. TD also has a regulatory site which is divided into regulatory regions R1-R7, as shown in Figure 2. The regulatory site is responsible for the feedback inhibition which occurs when the regulatory site binds to an inhibitor, in this case isoleucine. The present invention, therefore, provides, in alternative aspects, a feedback insensitive TD comprising the amino acid sequence set forth in SEQ ID NO: 4 or SEQ ID NO: 6 (precursor polypeptides); set forth in SEQ ID NO: 8, SEQ ID NO: 10 or SEQ ID NO: 12 (expected mature TD enzymes); SEQ ID NO: 14 (an insensitive TD regulatory site); or set forth in SEQ ID NO: 16 (regulatory region R4) or SEQ ID NO: 18 (regulatory region R6). The amino acid sequence of SEQ ID NO: 14 or variants thereof as described above, may be operably coupled to a TD catalytic site from a wide variety of species, including functionally similar variants thereof, to provide the advantageous result of the invention.
Amino acid sequences SEQ ID NOS: 16 and 18 may also be operably coupled to a wide variety of sequences to provide insensitive TD enzymes, and therefore comprise certain preferred aspects of the invention. Substitutions giving rise to similar amino acid sequences, as described herein, are particularly applicable to SEQ ID NO: 16, and the following sets forth a plurality of particularly preferred alternative sequences for SEQ ID NO: 16 in accordance with the invention:
Val/Leu/Phe/Ile Asn/Asp/Glu/Ser Leu/Ile/Phe/Val/Gly Thr/Ser/Ala/Gly Thr/His/Asp/Asn Ser/Asn/Asp/Ile Asp/Glu Leu/Met Val/Ala Lys/Val/Ala Asp/Ile/Glu/Ser His Leu/Gly/Ile/Val Cys Tyr/His Leu/Met Met/Val
The invention therefore also encompasses amino acid sequences similar to the amino acid sequences set forth herein that have at least about 50% identity thereto and that are insensitive to feedback inhibition by Ile. Preferably, inventive amino acid sequences have at least about 75% identity to these sequences, more preferably at least about 85% identity and most preferably at least about 95% identity.
Percent identity may be determined, for example, by comparing sequence information using the GAP computer program, version 6.0, available from the University of Wisconsin Genetics Computer Group (UWGCG). The GAP program utilizes the alignment method of Needleman and Wunsch (J. Mol. Biol. 48:443, 1970), as revised by Smith and Waterman (Adv. Appl. Math. 2:482, 1981). Briefly, the GAP program defines identity as the number of aligned symbols (i.e., nucleotides or amino acids) which are the same, divided by the total number of symbols in the shorter of the two sequences. The preferred default parameters for the GAP program include: (1) a unary comparison matrix (containing a value of 1 for identities and 0 for non-identities), and the weighted comparison matrix of Gribskov and Burgess, Nucl. Acids Res. 14:6745, 1986, as described by Schwartz and Dayhoff, eds., Atlas of Protein Sequence and Structure, National Biomedical Research Foundation, pp. 353-358, 1979; (2) a penalty of 3.0 for each gap and an additional 0.10 penalty for each symbol in each gap; and (3) no penalty for end gaps.
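The identity ratio described above (identical aligned symbols divided by the length of the shorter sequence) may be illustrated as follows. This is a minimal sketch only: it assumes the two sequences have already been aligned and padded with a gap character, it does not perform the alignment itself, and it is not the GAP program. The function name percent_identity and the gap symbol "-" are hypothetical choices.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity of two pre-aligned, equal-length sequences.

    Identical aligned symbols are counted and divided by the number of
    symbols (excluding gaps) in the shorter of the two sequences.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    matches = sum(1 for x, y in zip(aligned_a, aligned_b)
                  if x == y and x != gap)
    len_a = sum(1 for x in aligned_a if x != gap)
    len_b = sum(1 for y in aligned_b if y != gap)
    shorter = min(len_a, len_b)
    if shorter == 0:
        raise ValueError("sequences must contain at least one symbol")
    return 100.0 * matches / shorter


# Wild-type R4 (one-letter codes from the text) versus the same region
# carrying the Arg-to-Cys substitution at position 499.
wild_type_r4 = "VNLTTSDLVKDHLRYLMGG"
mutant_r4 = "VNLTTSDLVKDHLCYLMGG"
print(round(percent_identity(wild_type_r4, mutant_r4), 1))  # 18 of 19 match
```

Because the denominator is the shorter sequence, a sequence aligned in full against a longer one that contains an insertion can still score 100% under this definition.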
The invention also contemplates amino acid sequences having alternative mutations to those identified herein which also result in a feedback insensitive TD. For example, it is expected that the cys at position 499 and the his at position 544 in SEQ ID NO: 4 could be substituted with alternative amino acids from the same amino acid group as cys and his, respectively (as described above) to provide an alternate inventive enzyme. Further, it is well within the purview of a person skilled in the art to engineer a feedback insensitive TD by providing a wild-type TD and substituting a highly conserved amino acid at a given location in the regulatory site with a diverse amino acid (i.e., one from a different amino acid group), and to assay the resulting enzyme for catalytic activity and feedback sensitivity. For example, a skilled artisan can alter the nucleotide sequence set forth in SEQ ID NO: 1 by site-directed mutagenesis to provide a mutated sequence which encodes an enzyme having an alternate amino acid in a given location of the enzyme. Alternatively, a skilled artisan can synthesize an amino acid sequence having one or more additions, substitutions and/or deletions at a highly conserved location of the wild-type TD enzyme using techniques known in the art. Such variants, which exhibit functionality substantially similar to a polypeptide comprising the sequence set forth in SEQ ID NO: 4, are included within the scope of the present invention.
The Chloroplast Leader Sequence
The present application finds advantageous use in a wide variety of plants, as well as in a wide variety of microorganisms. With respect to plants, it is important to recognize that the TD enzyme functions in the chloroplast, and, therefore, that the polypeptide initially produced is a precursor protein which includes a portion identified herein as a chloroplast leader sequence or transit peptide. The transit peptide may be derived from monocotyledonous or dicotyledonous plants at the choice of the artisan. DNA sequences encoding said transit peptides may be obtained from chloroplast proteins such as Δ-9 desaturase, palmitoyl-ACP thioesterase, β-ketoacyl-ACP synthase, oleyl-ACP thioesterase, chlorophyll a/b binding protein, NADPH+ dependent glyceraldehyde-3-phosphate dehydrogenase, early light inducible protein, clp protease regulatory protein, pyruvate orthophosphate dikinase, triose phosphate-3-phosphoglycerate phosphate translocator, 5-enolpyruvylshikimate-3-phosphate synthase, dihydrofolate reductase, thymidylate synthase, acetyl-coenzyme A carboxylase, Cu/Zn superoxide dismutase, cysteine synthase, rubisco activase, ferritin, granule bound starch synthase, pyrophosphate, glutamine synthase, aldolase, glutathione reductase, nitrite reductase, 2-oxoglutarate/malate translocator, ADP-glucose pyrophosphorylase, ferredoxin, carbonic anhydrase, polyphenol oxidase, ferredoxin NADP oxidoreductase, plastocyanin, glycerol-3-phosphate dehydrogenase, lipoxygenase, O-acetylserine (thiol)-lyase, acyl carrier protein, 3-deoxy-D-arabino-heptulosonate 7-phosphate synthase, chloroplast-localized heat shock protein, starch phosphorylase, starch glycosyltransferase, and the like, of which the transit peptide portion has been defined in GenBank.
In plants, the chloroplast leader sequence is used to direct the passenger protein to the chloroplast; however, such sequences are typically cleaved and degraded upon entry of the passenger protein into the organelle of interest. Therefore, purification of a cleaved transit peptide from plant tissues is typically not possible. In some cases, however, transit peptide sequences can be determined by comparison of the precursor protein amino acid sequence obtained from the gene encoding the same to the amino acid sequence of the isolated passenger protein (mature protein). Furthermore, passenger protein sequences can also be determined from the transit peptide proteins associated therewith by comparison of sequences to other similar proteins isolated from different species. As exemplified herein, genes encoding precursor forms of mutated TD protein, disclosed as SEQ ID NOS: 3-6, when compared to wild type precursor and mature TD protein obtained from other species, can establish the expected sequence of the mature protein.
As previously discussed, the amino acid sequence and hence the nucleic acid sequence of a transit peptide can be determined in a variety of ways available to the skilled artisan. For example, passenger proteins of interest can be purified using a variety of techniques available to the person skilled in the art of protein biochemistry. Once purified, an amino terminal sequence of the protein can be determined using methods such as Edman degradation, mass spectroscopy, nuclear magnetic resonance spectroscopy and the like. Using this information and the genetic code, standard molecular biology techniques can be employed to clone the gene encoding the protein as exemplified herein. Comparison of the amino acid sequence determined from the cDNA to that obtained from the amino terminal sequence of the passenger protein can allow determination of the transit peptide sequence. In addition, many transit peptide sequences are available in the art and can easily be obtained from GenBank located in the Entrez Database at the National Center for Biotechnology Information web site. The subject of transit peptides in plants has been extensively reviewed by Keegstra et al., (1989) (Cell, 56:247-253), which is incorporated herein by reference. Typically, there is very little primary amino acid sequence homology between different plant transit peptides. Even though passenger proteins may have amino acid and nucleic acid sequence similarities between cultivars, lines, and species, transit peptides may show very little sequence homology at any level. Furthermore, the length of transit peptides can vary, with some precursor proteins comprising transit peptides with as few as about 10 amino acids while others can be about 150 amino acids or longer. Additional descriptions of transit peptide characteristics in plants and mechanisms associated therewith can be found in Ko and Ko, (1992) J. Biol. Chem. 267:13910-13916; Bascomb et al. (1992) Plant Microb. Biotechnol. Res. Ser. 1:142-163; and Bukau et al., (1996) Trends in Cell Biol. 6:480-486; which are incorporated herein by reference.
In this regard, the first 90 amino acid residues in the N-terminal region of the Arabidopsis TD protein encoded by omr1 (in SEQ ID NO: 4) represent an expected region comprising the transit peptide, as indicated by: (i) the dissimilarity with the yeast, Salmonella and E. coli TD proteins, (ii) the comparison of the sizes of TD of Arabidopsis, tomato, chickpea, yeast, Salmonella and E. coli, and (iii) the amino acid composition which contains 12 proline residues and 33 other hydrophobic residues constituting a total of 50% hydrophobic residues. Therefore, it is expected that, for the mature/passenger TD of Arabidopsis encoded by the omr1 locus, cleavage of the transit peptide may occur at the peptide bond between the alanine at residue 90 and the glutamic acid at residue 91, leaving behind a mature/passenger TD that starts at the glutamic acid at residue 91. As such, SEQ ID NOS: 7 identify an expected mature TD for Arabidopsis that starts at the glutamic acid at residue 91 of SEQ ID NO: 4 (clone 592). This expected mature TD polypeptide comprises 502 sequential amino acid residues.
The only two other higher plant TD genes that have been cloned to date are those of tomato (Samach A., Hareven D., Gutfinger T., Ken-Dror S., Lifschitz E., 1991, Proc Nat Acad Sci USA 88:2678-2682) and chickpea (Jacob John S., Srivastava V., Guha-Mukherjee S., 1995, Plant Physiol 107:1023-1024). The lengths of the transit peptides of the tomato TD and chickpea TD were predicted to be the first 80 and 91 amino terminal residues, respectively, and the full length precursor proteins were reported to be 595 residues and 590 residues, respectively (Samach et al., 1991; Jacob John et al., 1995). In both tomato and chickpea, the amino-terminus of the TD protein contained a typical two-domain transit peptide consistent with chloroplast lumen targeting sequences (Keegstra K., Olsen L.J., Theg S.M., 1989, Chloroplast precursors and their transport across the membrane. Annu Rev Plant Physiol Plant Mol Biol 40:471-501). In tomato, the first domain at the amino-terminal end (45 residues) of the transit peptide was rich in serine and threonine (33%) while the following sequence of 35 residues contained 8 regularly spaced proline and other hydrophobic residues (Samach et al., 1991). By sequencing the first ten amino-terminal residues of a purified tomato TD from flowers, Samach et al., (1991) found that lysine at residue 52 is the first amino acid at the amino-terminal end of the mature/passenger protein. According to Samach et al., (1991), the hydrophobic domain of the transit peptide of tomato TD is not cleaved and remains as part of the mature TD in the chloroplast. Samach et al., (1991) also explained that "it is possible that only a fraction of the tomato TD protein is cleaved at position 52, while the rest of the transit peptide is cleaved elsewhere and remain refractory to amino-terminal sequencing." In chickpea, the first domain at the amino-terminal end of the transit peptide was deduced to be 45 residues and rich in threonine and serine (37%) while the remaining 46 residues contained 8 regularly spaced proline residues and 19 other hydrophobic residues (Jacob John et al., 1995). The cleavage site of the transit peptide of chickpea TD was not determined.
By analogy to tomato and chickpea, Arabidopsis TD also showed a typical two-domain transit peptide consistent with chloroplast lumen targeting sequences (as reviewed by Keegstra et al., 1989). The first 49 residues of the amino terminal end represented a domain that was rich in serine and threonine (31%) and other hydrophilic residues while the remaining 41 residues represented a second domain that contained 59% hydrophobic residues. The cleavage site of the transit peptide of Arabidopsis TD was not determined. Therefore, by analogy to tomato, it is expected that the cleavage site of the transit peptide of Arabidopsis TD may alternatively start at the lysine at residue 54 or at the lysine at residue 61. This is a presumptive cleavage site and one skilled in the art can readily determine the cleavage site in a similar fashion as in the case of tomato (Samach et al., 1991) by purifying Arabidopsis TD then sequencing the first ten amino acids in the amino-terminal end. Therefore, two additional sequences are provided as SEQ ID NOS: 10 and 12 that alternatively identify two expected mature TD in Arabidopsis.
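The residue arithmetic in the preceding paragraphs (a 90-residue transit peptide followed by a 502-residue mature protein, with alternative start positions at residues 54 and 61) can be sketched in Python as follows. This is only an illustration: the sequences used are placeholders rather than SEQ ID NO: 4, the 592-residue precursor length is inferred from 90 + 502, and the residue groupings are simplifying assumptions.

```python
from typing import Tuple

# Illustrative residue groupings (assumptions, not definitions from the text).
HYDROXYLATED = set("ST")          # serine and threonine
HYDROPHOBIC = set("AVLIMFWP")     # proline counted with hydrophobic residues


def split_precursor(precursor: str, first_mature_residue: int) -> Tuple[str, str]:
    """Split a precursor into (transit peptide, mature protein).

    first_mature_residue is the 1-based position of the first residue
    retained in the mature/passenger protein (e.g. 91, 54 or 61).
    """
    if not 1 <= first_mature_residue <= len(precursor):
        raise ValueError("cleavage position lies outside the precursor")
    cut = first_mature_residue - 1
    return precursor[:cut], precursor[cut:]


def fraction(sequence: str, residues: set) -> float:
    """Fraction of positions in sequence occupied by the given residues."""
    return sum(1 for aa in sequence if aa in residues) / len(sequence)


# Placeholder precursor: 90 residues of 'A' followed by 502 residues of 'E'.
transit, mature = split_precursor("A" * 90 + "E" * 502, 91)
print(len(transit), len(mature))
print(fraction("MSTSAPLVPA", HYDROXYLATED), fraction("MSTSAPLVPA", HYDROPHOBIC))
```

Calling `split_precursor` with 54 or 61 in place of 91 would model the two alternative presumptive cleavage sites in the same way.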
It is within the scope of the present invention to create chimeric polynucleotides encoding precursor proteins wherein a transit peptide of choice is in the proper reading frame with the mature coding sequence of mutated TD. In creating a chimeric DNA construct encoding a transit peptide as disclosed herein, the transit peptide being heterologous to the mature, mutated TD, the DNA encoding the transit peptide is placed 5' to, and in the proper reading frame with, the DNA encoding the mature, mutated TD protein. Placement of the chimeric DNA in correct relationship with promoter regulatory elements and other sequences as described herein can allow production of mRNA molecules that encode heterologous precursor proteins. The mRNA can then be translated, thus producing a functional heterologous precursor protein which can be delivered to the chloroplast. It is, of course, understood that a DNA construct may be made in accordance with the invention to include a promoter that is native to the gene of a selected species that encodes that species' TD precursor polypeptide. Uptake of the protein by the chloroplast and cleavage of the associated transit peptide can result in a chloroplast containing a mature, mutated form of TD, thus rendering the cell resistant to feedback inhibition which would normally inhibit cells containing only the wild-type TD protein.
It is readily understood that, in the case of transforming prokaryotes, it is not necessary to include a transit peptide in the coding region of the vector. Rather, since such cells do not possess chloroplasts, an inventive DNA construct for transforming, for example, bacteria, may be made by simply attaching a start codon directly to, and in the proper reading frame with, a nucleotide sequence encoding a mature peptide. Of course, other elements are preferably present as described herein, such as a promoter upstream of the start codon and a termination sequence downstream of the coding region.
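The reading-frame requirement described above, for both the plant construct (transit peptide coding DNA placed 5' to the mature coding DNA) and the prokaryotic construct (a start codon attached directly to the mature coding DNA), can be sketched in Python. The function names and the short sequences are hypothetical; the check covers only frame, start codon and stop codons, not promoters or terminators.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def check_orf(dna: str) -> None:
    """Raise ValueError unless dna is an ATG-initiated, in-frame open
    reading frame with exactly one stop codon, located at its 3' end."""
    dna = dna.upper()
    if len(dna) < 6 or len(dna) % 3 != 0:
        raise ValueError("length must be a multiple of three (frame shift?)")
    codons = [dna[i:i + 3] for i in range(0, len(dna), 3)]
    if codons[0] != "ATG":
        raise ValueError("open reading frame must begin with ATG")
    if codons[-1] not in STOP_CODONS:
        raise ValueError("open reading frame must end with a stop codon")
    if any(codon in STOP_CODONS for codon in codons[1:-1]):
        raise ValueError("premature stop codon in frame")


def fuse_plant_orf(transit_cds: str, mature_cds: str) -> str:
    """Place transit peptide coding DNA 5' to, and in frame with,
    the mature coding DNA (which carries the stop codon)."""
    orf = transit_cds.upper() + mature_cds.upper()
    check_orf(orf)
    return orf


def fuse_prokaryotic_orf(mature_cds: str) -> str:
    """Attach a start codon directly to the mature coding DNA."""
    orf = "ATG" + mature_cds.upper()
    check_orf(orf)
    return orf


# Hypothetical toy sequences: transit = Met-Ala-Ser, mature = Glu-Lys-stop.
print(fuse_plant_orf("ATGGCTTCT", "GAAAAGTAA"))
print(fuse_prokaryotic_orf("GAAAAGTAA"))
```

A transit peptide fragment whose length is not a multiple of three would shift the frame of the mature coding sequence, and the sketch rejects such a fusion.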
Nucleotide Sequences: omr1
Turning now to nucleotide sequences encoding inventive insensitive TD enzymes, nucleotide sequences encoding preferred feedback insensitive precursor TD of the species Arabidopsis thaliana are set forth in SEQ ID NOS: 3 and 5 herein. The mutated polynucleotides set forth therein and described polynucleotides related thereto are referred to as omr1. omr1 has been found to be a dominant allele, which imparts significant value to the invention. It is of course not intended that the present invention be limited to the exemplary nucleotide sequences, but rather that it include sequences having substantial identity thereto and sequences which encode variant forms of insensitive TD as described above. It is therefore understood that the invention encompasses more than the specific exemplary nucleotide sequence of omr1. For example, a nucleic acid sequence encoding a variant amino acid sequence, as discussed above, is within the scope of the invention. Modifications to a sequence, such as deletions, insertions, or substitutions in the sequence which produce "silent" changes that do not substantially affect the functional properties of the resulting polypeptide molecule are expressly contemplated by the present invention. For example, it is understood that alterations in a nucleotide sequence which reflect the degeneracy of the genetic code, or which result in the production of a chemically equivalent amino acid at a given site, are contemplated. Thus, a codon for the amino acid alanine, a hydrophobic amino acid, may be substituted by a codon encoding another less hydrophobic residue, such as glycine, or a more hydrophobic residue, such as valine, leucine, or isoleucine. Similarly, changes which result in substitution of one negatively charged residue for another, such as aspartic acid for glutamic acid, or one positively charged residue for another, such as lysine for arginine, can also be expected to produce a biologically equivalent product.
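The distinction drawn above between changes that reflect the degeneracy of the genetic code and changes that yield a chemically equivalent amino acid can be illustrated with a short Python sketch using the standard genetic code. The three residue groups are taken from the examples in the preceding paragraph; treating them as the complete set of equivalence groups is an assumption made purely for illustration, as are the function names and test sequences.

```python
from itertools import product

# Standard genetic code, codons enumerated with bases in the order T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Groups drawn from the examples in the text (illustrative, not exhaustive).
EQUIVALENCE_GROUPS = [set("DE"), set("KR"), set("AGVLI")]


def translate(dna: str) -> str:
    """Translate an in-frame coding sequence into one-letter amino acids."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


def classify_change(original: str, variant: str) -> str:
    """Label a nucleotide variant as silent, conservative or non-conservative."""
    before, after = translate(original), translate(variant)
    if len(before) != len(after):
        raise ValueError("this sketch handles substitutions only")
    if before == after:
        return "silent"
    for x, y in zip(before, after):
        if x != y and not any(x in g and y in g for g in EQUIVALENCE_GROUPS):
            return "non-conservative"
    return "conservative"


print(classify_change("GCTGATAAA", "GCCGACAAG"))  # Ala-Asp-Lys, synonymous codons
print(classify_change("GCTGATAAA", "GTTGAAAGA"))  # Val-Glu-Arg
print(classify_change("GCTGATAAA", "CCTGATAAA"))  # Pro-Asp-Lys
```

Whether a given conservative change actually preserves catalytic activity and feedback insensitivity would still have to be established by assay, as the specification notes.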
Nucleotide changes which result in alteration of the N-terminal and C-terminal portions of the polypeptide molecule would also not be expected to alter the activity of the polypeptide. In some cases, it may in fact be desirable to make mutations in the sequence in order to study the effect of alteration on the biological activity of the polypeptide. Each of the proposed modifications is well within the routine skill in the art. In a preferred aspect, therefore, the present invention contemplates nucleotide sequences having substantial identity to the sequences set forth herein and variants thereof as described herein. A further requirement of an inventive polynucleotide variant is that it must encode a polypeptide having similar functionality to the specific mutated TD enzymes recited herein, i.e., good catalytic functionality and insensitivity to feedback inhibition.
Preparation of Sequences
A suitable DNA sequence selected for use according to the invention may be obtained, for example, by cloning techniques using cDNA libraries corresponding to a wide variety of species, these techniques being well known in the relevant art. Suitable nucleotide sequences may be isolated from DNA libraries obtained from a wide variety of species by means of nucleic acid hybridization or PCR, using as hybridization probes or primers nucleotide sequences selected in accordance with the invention, such as those set forth in the Sequence Listing included herewith; nucleotide sequences having substantial identity thereto; or portions thereof. Isolated wild-type sequences encoding TD may then be altered as provided by the present invention by site-directed mutagenesis.
Alternatively, a suitable sequence may be made by techniques which are also well known in the art. For example, nucleic acid sequences encoding enzymes of the invention may be constructed using standard recombinant DNA technology, for example, by cutting or splicing nucleic acids which encode cytokines and/or other peptides using restriction enzymes and DNA ligase. Alternatively, nucleic acid sequences may be constructed using chemical synthesis, such as solid-phase phosphoramidite technology. In preferred embodiments of the invention, polymerase chain reaction (PCR) is used to accomplish splicing of nucleic acid sequences by overlap extension as is known in the art.
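The sequence-level outcome of splicing by overlap extension can be sketched in Python. The sketch models only the result on one strand (two fragments sharing a terminal region are joined through that region); it does not model primers, complementary strands or polymerase extension. The function name, the minimum overlap length and the fragments are hypothetical.

```python
def join_by_overlap(upstream: str, downstream: str, min_overlap: int = 6) -> str:
    """Join two fragments through the longest region shared between the
    3' end of upstream and the 5' end of downstream (top strand only)."""
    if min_overlap < 1:
        raise ValueError("min_overlap must be at least 1")
    upstream, downstream = upstream.upper(), downstream.upper()
    longest = min(len(upstream), len(downstream))
    # Try the longest candidate overlap first, then progressively shorter ones.
    for size in range(longest, min_overlap - 1, -1):
        if upstream[-size:] == downstream[:size]:
            return upstream + downstream[size:]
    raise ValueError("fragments share no terminal overlap of sufficient length")


# Hypothetical fragments sharing the six-base region GAAAAG.
print(join_by_overlap("ATGGCTTCTGAAAAG", "GAAAAGGATCTGTAA"))
```

In practice the shared region is introduced through the design of the inner primers, so that the two first-round products can prime one another in a second round.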
Incorporation of Sequences into Microorganisms and Plants
Inventive DNA sequences can be incorporated into the genome of a plant or microorganism using conventional recombinant DNA technology, thereby making a transformed plant or microorganism having the excellent features described herein. In this regard, the term "genome" as used herein is intended to refer to DNA which is present in a plant or microorganism and which is inheritable by progeny during propagation thereof. As such, an inventive transformed plant or microorganism may alternatively be produced by producing F1 or higher generation progeny of a directly transformed plant or microorganism, wherein the progeny comprise the foreign nucleotide sequence. Transformed plants or microorganisms and progeny thereof are all contemplated by the invention and are all intended to fall directly within the meaning of the terms "transformed plant" and "transformed microorganism."
In this manner, the present invention contemplates the use of transformed plants which are selfed to produce an inbred plant. The inbred plant produces seed containing the gene of interest. These seeds can be grown to produce plants that express the protein of interest. The inbred lines can also be crossed with other inbred lines to produce hybrids. Parts obtained from the regenerated plant, such as flowers, seeds, leaves, branches, fruit, and the like are covered by the invention provided that said parts contain genes encoding and/or expressing the protein of interest. Progeny, variants, and mutants of the regenerated plants are also included within the scope of the invention. In diploid plants, typically one parent may be transformed and the other parent is the wild type. After crossing the parents, the first generation hybrids (F1) are selfed to produce second generation hybrids (F2). Those plants exhibiting the highest levels of expression can then be chosen for further breeding.
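The F1 selfing step described above can be illustrated with a small Python enumeration of expected F2 genotype frequencies. The sketch assumes a single insertion locus inherited in simple Mendelian fashion, with "T" denoting the transgene-bearing chromosome and "t" the wild-type chromosome; these symbols and the single-locus assumption are illustrative and are not stated in the specification.

```python
from collections import Counter
from fractions import Fraction
from itertools import product
from typing import Dict


def self_cross(genotype: str) -> Dict[str, Fraction]:
    """Expected genotype frequencies among progeny of a selfed diploid plant.

    genotype is a two-character string, one character per allele,
    e.g. "Tt" for an F1 plant carrying one copy of the transgene.
    """
    if len(genotype) != 2:
        raise ValueError("genotype must name exactly two alleles")
    # Each parent gamete carries one allele; combine every pair of gametes.
    offspring = Counter("".join(sorted(pair)) for pair in product(genotype, repeat=2))
    total = sum(offspring.values())
    return {g: Fraction(n, total) for g, n in offspring.items()}


f2 = self_cross("Tt")
carriers = sum(freq for g, freq in f2.items() if "T" in g)
print(f2, carriers)
```

Because omr1 is described as a dominant allele, every F2 plant carrying at least one copy would be expected to show the resistance phenotype under these assumptions.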
Genes encoding precursor mutated TD polypeptides, as disclosed herein as SEQ ID NOS: 3 and 5, can be used in conjunction with other plant regulatory elements to create plant cells expressing the polypeptides. By "expressing" as used herein, is meant the transcription and stable accumulation of mRNA inside a cell, the cell being of prokaryotic or eukaryotic origin. Furthermore, it is within the scope of the invention to place mutated mature TD from Arabidopsis into a wide variety of other species including monocotyledonous and dicotyledonous plants. In so doing, chimeric gene constructs encoding the mature, mutated TD proteins having transit peptides heterologous thereto (transit peptides from a different protein or species) can be used. Transit peptides of the present invention, when covalently attached to the mature, mutated TD protein, can provide intracellular transport to the chloroplast. In plants, a mutated mature form of TD found in a chloroplast of a cell renders the cell resistant to feedback inhibition and to Ile structural analogs.
Expression Vectors
Generally, transformation of a plant or microorganism involves inserting a DNA sequence into an expression vector in proper orientation and correct reading frame. The vector may desirably contain the necessary elements for the transcription of the inserted polypeptide-encoding sequence. A wide variety of vector systems known in the art can be advantageously used in accordance with the invention, such as plasmids, bacteriophage viruses or other modified viruses. Suitable vectors include, but are not limited to the following viral vectors: lambda vector system gt11, gt10, Charon 4, and plasmid vectors such as pBI121, pBR322, pACYC177, pACYC184, pAR series, pKK223-3, pUC8, pUC9, pUC18, pUC19, pLG339, pRK290, pKC37, pKC101, pCDNAII, and other similar systems. The DNA sequences may be cloned into the vector using standard cloning procedures in the art, for example, as described by Maniatis et al. Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Laboratory, Cold Spring Harbor, New York (1982), which is hereby incorporated by reference in its entirety. The plasmid pBI121 is available from Clontech Laboratories, Palo Alto, California. It is understood that known techniques may be advantageously used according to the invention to transform microorganisms such as, for example, Agrobacterium sp., yeast, E. coli and Pseudomonas sp.
In order to obtain satisfactory expression of a nucleotide sequence which encodes an inventive feedback insensitive TD in a plant or microorganism, it is preferred that a promoter be present in the expression vector. The promoter is preferably a constitutive promoter, but may alternatively be a tissue-specific promoter or an inducible promoter. Preferably, the promoter is one isolated from a native gene which encodes a TD. Although promoters for certain classes of genes commonly differ between species, it is understood that the present invention includes promoters which regulate expression of a wide variety of genes in a wide variety of plant or microorganism species.
An expression vector according to the invention may be either naturally or artificially produced from parts derived from heterologous sources, which parts may be naturally occurring or chemically synthesized, and wherein the parts have been joined by ligation or other means known in the art. The introduced coding sequence is preferably under control of the promoter and thus will be generally downstream from the promoter. Stated alternatively, the promoter sequence will be generally upstream (i.e., at the 5' end) of the coding sequence. The phrase "under control of" contemplates the presence of such other elements as may be necessary to achieve transcription of the introduced sequence. As such, in one representative example, enhanced production of a feedback insensitive TD may be achieved by inserting an inventive nucleotide sequence in a vector downstream from and operably linked to a promoter sequence capable of driving expression in a host cell. Two DNA sequences (such as a promoter region sequence and a feedback insensitive TD-encoding nucleotide sequence) are said to be operably linked if the nature of the linkage between the two DNA sequences does not (1) result in the introduction of a frame-shift mutation, (2) interfere with the ability of the promoter region sequence to direct the transcription of the desired nucleotide sequence, or (3) interfere with the ability of the desired nucleotide sequence to be transcribed by the promoter region sequence. RNA polymerase normally binds to the promoter and initiates transcription of a DNA sequence or a group of linked DNA sequences and regulatory elements (operon). A transgene, such as a nucleotide sequence selected in accordance with the present invention, is expressed in a transformed cell to produce in the cell a polypeptide encoded thereby. Briefly, transcription of the DNA sequence is initiated by the binding of RNA polymerase to the DNA sequence's promoter region. During transcription, movement of the RNA polymerase along the DNA sequence forms messenger RNA ("mRNA") and, as a result, the DNA sequence is transcribed into a corresponding mRNA. This mRNA then moves to the ribosomes of the cytoplasm or rough endoplasmic reticulum which, with transfer RNA ("tRNA"), translates the mRNA into the polypeptide encoded thereby.
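The positional relationships described above (promoter upstream of the coding sequence, termination sequence downstream) can be sketched as a simple ordering check in Python. The data structure and function are hypothetical, and the check addresses element order only; it does not test the three functional criteria for operable linkage recited in the preceding paragraph.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Element:
    kind: str   # "promoter", "enhancer", "cds" or "terminator"
    name: str


def is_correctly_ordered(cassette: List[Element]) -> bool:
    """Return True if the 5'-to-3' list has exactly one promoter upstream of
    exactly one coding sequence, and any terminator downstream of it."""
    kinds = [element.kind for element in cassette]
    if kinds.count("promoter") != 1 or kinds.count("cds") != 1:
        return False
    promoter_at, cds_at = kinds.index("promoter"), kinds.index("cds")
    if promoter_at > cds_at:
        return False
    return all(i > cds_at for i, kind in enumerate(kinds) if kind == "terminator")


# Illustrative cassette; element names are examples mentioned in this document.
cassette = [
    Element("promoter", "CaMV 35S"),
    Element("cds", "omr1 (feedback insensitive TD)"),
    Element("terminator", "Nos 3'"),
]
print(is_correctly_ordered(cassette))
print(is_correctly_ordered(list(reversed(cassette))))
```

Optional elements such as enhancers are tolerated at any position by this sketch, since the text places no positional constraint on them.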
It is well known that there may or may not be other regulatory elements (e.g., enhancer sequences) which cooperate with the promoter and a transcriptional start site to achieve transcription of the introduced (i.e., foreign) coding sequence. By "enhancer" is meant nucleotide sequence elements which can stimulate promoter activity in a cell such as those found in plants as exemplified by the leader sequence of maize streak virus (MSV), alcohol dehydrogenase intron 1, and the like. Also, the recombinant DNA will preferably include a transcriptional termination sequence downstream from the introduced sequence. It may also be desirable to use a reporter gene. In some instances, a reporter gene may be used with or without a selectable marker. Reporter genes are genes which are typically not present in the recipient organism or tissue and typically encode proteins resulting in some phenotypic change or enzymatic property. Examples of such genes are provided in K. Weising et al. (1988) Ann. Rev. Genetics, 22:421, which is incorporated herein by reference. Preferred reporter genes include the beta-glucuronidase (GUS) of the uidA locus of E. coli, the green fluorescent protein from the bioluminescent jellyfish Aequorea victoria, and the luciferase genes from firefly Photinus pyralis. An assay for detecting reporter gene expression may then be performed at a suitable time after the gene has been introduced into recipient cells. A preferred such assay entails the use of the gene encoding beta-glucuronidase (GUS) of the uidA locus of E. coli, as described by Jefferson et al., (1987 Biochem. Soc. Trans. 15, 17-19) to identify transformed cells.
Plant promoter regulatory elements from a wide variety of sources can be used efficiently in plant cells to express foreign genes. For example, promoter regulatory elements of bacterial origin, such as the octopine synthase promoter, the nopaline synthase promoter, the mannopine synthase promoter, and promoters of viral origin, such as the cauliflower mosaic virus (35S and 19S), 35T (which is a re-engineered 35S promoter, WO 97/13402 published April 17, 1997) and the like may be used. Plant promoter regulatory elements include, but are not limited to, ribulose-1,5-bisphosphate (RUBP) carboxylase small subunit (ssu), beta-conglycinin promoter, beta-phaseolin promoter, ADH promoter, heat-shock promoters, and tissue-specific promoters.
Other elements such as matrix attachment regions, scaffold attachment regions, introns, enhancers, polyadenylation sequences, and the like, may be present and thus may improve the transcription efficiency or DNA integration. Such elements may or may not be necessary for DNA function, although they can provide better expression or functioning of the DNA by affecting transcription, mRNA stability, and the like. Such elements may be included in the DNA as desired to obtain optimal performance of the transformed DNA in the plant. Typical elements include, but are not limited to, Adh-intron 1, Adh-intron 6, the alfalfa mosaic virus coat protein leader sequence, the maize streak virus coat protein leader sequence, as well as others available to a skilled artisan.
Constitutive promoter regulatory elements may be used thereby directing continuous gene expression in all cell types at all times (e.g., actin, ubiquitin, CaMV 35S, and the like). Tissue specific promoter regulatory elements are responsible for gene expression in specific cell or tissue types, such as the leaves or seeds (e.g., zein, oleosin, napin, ACP, globulin, and the like) and these may alternatively be used.
Promoter regulatory elements may also be active during a certain stage of the plant's development as well as active in plant tissues and organs. Examples of such include, but are not limited to, pollen-specific, embryo-specific, corn silk-specific, cotton fiber-specific, root-specific, seed endosperm-specific promoter regulatory elements, and the like. Under certain circumstances, it may be desirable to use an inducible promoter regulatory element, which is responsible for expression of genes in response to a specific signal, such as, for example, physical stimulus (heat shock genes), light (RUBP carboxylase), hormone (Em), metabolites, chemicals and stress. Other desirable transcription and translation elements that function in plants may also be used. Numerous plant-specific gene transfer vectors are known in the art.
Transformation
Once the DNA construct of the present invention has been cloned into an expression vector, it may then be transformed into a host cell. In addition to numerous technologies for transforming plants, the type of tissue which is contacted with foreign polynucleotides may vary as well. Plant tissue suitable for transformation of a plant in accordance with certain preferred aspects of the invention includes, for example, whole plants, leaf tissues, flower buds, root tissues, callus tissue types I, II and III, embryogenic tissue, meristems, protoplasts, hypocotyls and cotyledons. It is understood, however, that this list is not intended to be limiting, but only to provide examples of plant tissues which may be advantageously transformed in accordance with the present invention. A wide variety of plant tissues may be transformed during dedifferentiation using appropriate techniques described herein.
Transformation of a plant or microorganism may be achieved using one of a wide variety of techniques known in the art. The manner in which the transcriptional unit is introduced into the plant host is not critical to the invention. Any method which provides efficient transformation may be employed.
One technique of transforming plants with a DNA construct in accordance with the present invention is by contacting the tissue of such plants with an inoculum of bacteria transformed with a vector comprising the DNA construct. Generally, this procedure involves inoculating the plant tissue with a suspension of bacteria and incubating the tissue for about 48 to about 72 hours on regeneration medium without antibiotics at about 25-28 °C. Bacteria from the genus Agrobacterium may be advantageously utilized to transform plant cells. Suitable species of such bacterium include Agrobacterium tumefaciens and Agrobacterium rhizogenes. Agrobacterium tumefaciens (e.g., strains LBA4404 or EHA105) is particularly useful due to its well-known ability to transform plants. Another technique which may advantageously be used is vacuum-infiltration of flower buds using Agrobacterium-based vectors. Various methods for plant transformation include the use of Ti or Ri-plasmids and the like to perform Agrobacterium mediated transformation. In many instances, it will be desirable to have the construct used for transformation bordered on one or both sides by T-DNA borders, more specifically the right border. This is particularly useful when the construct uses Agrobacterium tumefaciens or Agrobacterium rhizogenes as a mode for transformation, although T-DNA borders may find use with other modes of transformation. Where Agrobacterium is used for plant transformation, a vector may be used which may be introduced into the host for homologous recombination with T-DNA or the Ti or Ri plasmid present in the host. Introduction of the vector may be performed via electroporation, tri-parental mating and other techniques for transforming gram-negative bacteria which are known to those skilled in the art.
The manner of vector transformation into the Agrobacterium host is not critical to the invention.
In some cases where Agrobacterium is used for transformation, the expression construct being within the T-DNA borders will be inserted into a broad spectrum vector such as pRK2 or derivatives thereof as described in Ditta et al. (PNAS USA (1980) 77:7347-7351 and EPO 0 120 515), which are incorporated herein by reference. Explants may be combined and incubated with the transformed Agrobacterium for sufficient time to allow transformation thereof. After transformation, the Agrobacteria and plant cells are cultured with the appropriate selective medium. Once calli are formed, shoot formation can be encouraged by employing the appropriate plant hormones according to methods well known in the art of plant tissue culturing and plant regeneration. However, a callus intermediate stage is not always necessary. After shoot formation, said plant cells can be transferred to medium which encourages root formation thereby completing plant regeneration. The plants may then be grown to seed and the seed can be used to establish future generations. Regardless of transformation technique, the polynucleotide of interest is preferably incorporated into a transfer vector adapted to express the polynucleotide in a plant cell by including in the vector a plant promoter regulatory element, as well as 3' non-translated transcriptional termination regions such as Nos and the like.
Plant RNA viral based systems can also be used to express genes for the purposes disclosed herein. In so doing, the chimeric genes of interest can be inserted into the coat promoter regions of a suitable plant virus under the control of a subgenomic promoter which will infect the host plant of interest. Plant RNA viral based systems are described, for example, in U.S. Patent Nos. 5,500,360; 5,316,931 and 5,589,367, each of which is hereby incorporated herein by reference in its entirety. Another approach to transforming plant cells with a DNA sequence selected in accordance with the present invention involves propelling inert or biologically
active particles at plant tissues or cells. This technique is disclosed in U.S. Patent Nos. 4,945,050, 5,036,006 and 5,100,792, all to Sanford et al., which are hereby incorporated by reference. Generally, this procedure involves propelling inert or biologically active particles at the cells under conditions effective to penetrate the outer surface of the cell and to be incorporated within the interior thereof. When inert particles are utilized, the vector can be introduced into the cell by coating the particles with the vector.
Alternatively, the target cell can be surrounded by the vector so that the vector is carried into the cell by the wake of the particle. Biologically active particles (e.g., dried yeast cells, dried bacterium or a bacteriophage, each containing DNA material sought to be introduced) can also be propelled into plant cells. It is not intended, however, that the present invention be limited by the choice of vector or host cell. It should of course be understood that not all vectors and expression control sequences will function equally well to express the DNA sequences of this invention. Neither will all hosts function equally well with the same vector expression system. However, one of skill in the art may make a selection among vectors, expression control sequences, and hosts without undue experimentation and without departing from the scope of this invention.
An isolated DNA construct selected in accordance with the present invention may be utilized in an expression vector to transform a wide variety of plants, including monocots and dicots. The invention finds advantageous use, for example, in transforming the following plants: rice, wheat, barley, rye, corn, potato, carrot, sweet potato, bean, pea, chicory, lettuce, cabbage, cauliflower, broccoli, turnip, radish, spinach, asparagus, onion, garlic, eggplant, pepper, celery, squash, pumpkin, zucchini, cucumber, apple, pear, quince, melon, plum, cherry, peach, nectarine, apricot, strawberry, grape, raspberry, blackberry, pineapple, avocado, papaya, mango, banana, soybean, tobacco, tomato, sorghum and sugarcane.
Additional literature describing plant and/or microorganism transformation includes the following, each of which is incorporated herein by reference in its entirety: Zhijian Li et al., "A Sulfonylurea Herbicide Resistance Gene from Arabidopsis thaliana as a New Selectable Marker for Production of Fertile Transgenic Rice Plants," Plant Physiol. 100, 662-668 (1992); Parsons et al. (1997) Proc. Natl. Acad. Sci. USA 84:4161-4165; Daboussi et al. (1989) Curr. Genet. 15:453-456; Leung et al. (1990) Curr. Genet. 17:409-411; Koetter et al., "Isolation and characterization of the Pichia stipitis xylitol dehydrogenase gene, XYL2, and construction of a xylose-utilizing Saccharomyces cerevisiae transformant," Curr. Genet. 18:493-500 (1990); Strasser et al., "Cloning of yeast xylose reductase and xylitol dehydrogenase genes and their use," German patent application (1990); Hallborn et al., "Xylitol production by recombinant Saccharomyces cerevisiae," Bio/Technology 9:1090 (1991); Becker and Guarente, "High efficiency transformation of yeast by electroporation," Methods in Enzymol. 194:182-186 (1991); Ammerer, "Expression of genes in yeast using the ADC1 promoter," Methods in Enzymol. 101:192-201 (1983); Sarthy et al., "Expression of the E. coli xylose isomerase gene in S. cerevisiae," Appl. Environ. Microb. 53:1996-2000 (1987); U.S. Patent Nos. 4,945,050, 5,141,131, 5,177,010, 5,104,310, 5,149,645, 5,469,976, 5,464,763, 4,940,838, 4,693,976, 5,591,616, 5,231,019, 5,463,174, 4,762,785, 5,004,863, 5,159,135, 5,302,523, 5,464,765, 5,472,869, 5,384,253; European Patent Application Nos. 0131624B1, 120516, 159418B1, 176112, 116718, 290799, 320500, 604662, 627752, 0267159, 0292435; WO 87/06614; WO 92/09696; and WO 93/21335.
Enhanced Nutritional Value
Those skilled in the art will recognize the commercial and agricultural advantages inherent in plants
transformed to express feedback insensitive TD. Such plants have the improved ability to synthesize Ile and, therefore, are expected to be more valuable nutritionally, compared to a corresponding non-transformed plant. Humans do not synthesize Ile and can only obtain Ile from their diet.
Commercial Intermediates
Certain intermediates of the Ile biosynthetic pathway have significant commercial value, and production of these intermediates is advantageously increased in a transformant in accordance with the invention. For example, 2-oxobutyrate, the reaction product of the reaction catalyzed by TD, is known to be a precursor for the production of polyhydroxybutyrate in plants that have been genetically engineered using techniques known in the art to include bacterial genes necessary to produce polyhydroxybutyrate. Polyhydroxybutyrate is a desired biopolymer in the plastic industry because it may be biologically degraded. Because plants and microorganisms transformed in accordance with the invention feature increased production of 2-oxobutyrate, such plants and/or microorganisms may be advantageously utilized by plastic manufacturers in this manner. For example, plants that overproduce 2-oxobutyrate would be ideal for metabolic engineering by bacterial genes for polyhydroxybutyrate production because the overproduction of 2-oxobutyrate would provide plenty of substrate for both the natural Ile biosynthetic pathway and the engineered polyhydroxybutyrate pathway.
Selectable Markers
Perhaps the most significant advantage of the present invention is that an inventive nucleotide sequence may be used in an expression vector as a selectable marker. In this aspect of the invention, an inventive nucleotide sequence is incorporated into a vector such that it is expressed in a cell transformed
thereby, along with a second pre-selected nucleotide sequence (i.e., the primary sequence) which is desired to be incorporated into the genome of the target cell. In this inventive selection protocol, successful transformants will not only express the primary sequence, but will also express a feedback insensitive TD. Thus, once the recombinant DNA is introduced into the plant tissue or microorganism, successful transformants can be screened in accordance with the invention by growing the plant or microorganism in a substrate comprising a toxic Ile analog, such as, for example, OMT (termed "toxic substrate" herein). The Ile structural analog is toxic to cells expressing only wild-type TD, and only the successful transformants, i.e., those expressing feedback insensitive TD, will live, grow and/or proliferate in the toxic substrate. In this manner, omr1 is also an excellent biochemical marker to be used in experiments of genetic engineering of bacteria, replacing the traditionally used and environmentally-hazardous antibiotic-resistance genes (such as ampicillin- and kanamycin-resistance marker genes). omr1 is very environmentally friendly and poses no risk to human health when included in a transformant, because it does not have an ortholog in humans. Humans do not synthesize isoleucine and may only obtain it by digesting food.
Herbicidal Uses
Based upon the advantageous features of the invention, there is also provided a novel herbicide system. In accordance with this system, agriculturally valuable plant lines comprising an expressible nucleotide sequence encoding an insensitive TD ("transformed plant line") are grown in a substrate and an Ile structural analog selected in accordance with the invention is contacted with the substrate or with the plants themselves. As a result, only the transformed plants will continue to grow and other plants contacted with the analog will die.
The invention will be further described with reference to the following specific Examples. It will be understood that these Examples are illustrative and not restrictive in nature. Restriction enzyme digestions, phosphorylations, ligations and bacterial transformations were done as described in Sambrook et al., Molecular Cloning: A Laboratory Manual, Second Edition (1989) Cold Spring Harbor Laboratory Press. Plant transformations were done according to Bent et al., "RPS2 of Arabidopsis thaliana: A leucine-rich repeat class of plant disease resistance genes," Science 265:1856-1860 (1994). Each reference is incorporated herein by reference in its entirety.
EXAMPLES
Example 1: GM11b
As reported in Mourad G, King J (1995) L-O-methylthreonine-resistant mutant of Arabidopsis defective in isoleucine feedback regulation. Plant Physiol. 107:43-52, the mutated line GM11b of Arabidopsis thaliana was obtained, using EMS-mutagenesis, by selection in the presence of the toxic Ile structural analog, L-O-methylthreonine (OMT). The basis of mutant selection was that OMT is incorporated into cellular proteins in place of Ile, causing loss of protein function and, thus, cell death. GM11b was rescued because of a dominant mutation in the single gene omr1 which encodes TD. The mutation in the omr1 gene causes TD from GM11b to be insensitive to feedback control by Ile. TD activity in extracts from GM11b plants was about 50-fold more resistant to feedback inhibition by Ile than TD in extracts from wild type plants. The loss of Ile feedback sensitivity in GM11b led to a 20-fold overproduction of free Ile when compared to the wild type. This overproduction of Ile in GM11b had no effect on plant growth or reproduction.
Example 2: Cloning, Sequencing and Testing omr1 as a Selectable Marker in Genetic Engineering Experiments.
1. The construction of a cDNA library from GM11b (omr1/omr1):
Total RNA was extracted from 16-day-old GM11b (omr1/omr1) plants that were germinated in a minimal agar medium supplemented with 0.2 mM MTR. Poly(A) RNA (mRNA) was extracted from the total RNA and complementary DNA (cDNA) was synthesized using reverse transcriptase. The cDNA library was synthesized using the ZAP-cDNA synthesis kit of Stratagene. To prime the cDNA synthesis, a 50-base oligonucleotide linker primer containing an XhoI site and an 18-base poly(dT) was used. A 13-mer oligonucleotide adaptor containing an EcoRI cohesive end was ligated to the double stranded cDNA molecules at the 5' end. This allowed unidirectional cloning of the cDNA molecules, in the sense orientation, into the EcoRI and XhoI sites of the Uni-ZAP XR vector of Stratagene. The recombinant λ phage library was amplified using the XL1-Blue MRF' E. coli host cells, yielding a titer of 6.8 x 10^9 pfu/ml. The average insert size was approximately 1.4 kb. This was calculated from PCR analysis of 20 random, clear plaques isolated from the amplified library. The Uni-ZAP XR vector contains the pBluescript SK(-), a plasmid containing the N-terminus of the lacZ gene. To excise the pBluescript phagemid containing the cloned cDNA insert, the ExAssist/SOLR system provided by Stratagene was used. This allowed the rescue of the cDNA inserts from the positive λ clones in pBluescript SK plasmids in a single step.
2. The isolation of a small TD-DNA fragment to use as a homologous probe:
To isolate the omr1 gene encoding TD from the cDNA library of the line GM11b, a homologous oligonucleotide isolated from Arabidopsis DNA was used as a probe against the cDNA library. Taking into consideration that TD is conserved in a variety of organisms, degenerate primers were designed from conserved amino acid regions of TD. Such conserved regions were identified by aligning the amino acid sequence of TD from chickpea and tomato. Figure 2 shows the location of the conserved amino acid sequences in tomato and chickpea and also the location of the degenerate oligonucleotide primers TD205 and TD206 that were designed to isolate a TD-DNA fragment from Arabidopsis. Figure 4 shows the structure and degree of degeneracy of the PCR oligonucleotide primers, TD205 (the 5' end primer) and TD206 (the 3' end primer). Both primers TD205 and TD206 were designed to accommodate the Arabidopsis codon usage bias. Primer TD205 had 384-fold degeneracy and was a 28-mer anchored with an EcoRI site starting 2 bases downstream from the first nucleotide at the 5' end of the primer. TD206 had 324-fold degeneracy and was a 28-mer anchored with a HindIII site starting 2 bases downstream from the first nucleotide at the 5' end of the primer. Genomic DNA was isolated from GM11b and used as a template in a PCR amplification with the primers TD205 and TD206. A 438 bp fragment was amplified. The fragment was cloned into the EcoRI-HindIII sites of the plasmid pGEM3Zf(+). The fragment was sequenced to completion using the dideoxy chain termination method and the Sequenase kit of USB. The fragment showed a putative 280 bp intron. The remaining 158 bp of the PCR fragment had 60.1% nucleotide sequence identity with the chickpea TD gene. To eliminate the putative intron sequences, a second pair of primers, TD211 and TD212, was designed and used in a PCR reaction with the 438 bp fragment as a template. A DNA fragment of about 100 bp length, containing exon sequences, was amplified and purified. This was the homologous probe used for screening the cDNA library constructed from GM11b.
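The fold degeneracy quoted for a primer is the product, over all positions of the primer, of the number of bases permitted at each position. By way of illustration only, the following Python sketch computes that product for a primer written in standard IUPAC nucleotide codes. The 28-mer used here is hypothetical; it is not the sequence of TD205 or TD206, and merely follows the layout described above (a restriction site starting 2 bases downstream from the first nucleotide).

```python
# Number of bases each IUPAC nucleotide code stands for.
IUPAC_FOLD = {
    "A": 1, "C": 1, "G": 1, "T": 1,
    "R": 2, "Y": 2, "S": 2, "W": 2, "K": 2, "M": 2,
    "B": 3, "D": 3, "H": 3, "V": 3,
    "N": 4,
}


def primer_degeneracy(primer: str) -> int:
    """Return how many distinct sequences a degenerate primer represents."""
    fold = 1
    for code in primer.upper():
        fold *= IUPAC_FOLD[code]  # a KeyError flags a non-IUPAC character
    return fold


# Hypothetical primer, for illustration only: 2 leading bases, an EcoRI
# site (GAATTC), then 20 bases containing three N, one Y and one H.
HYPOTHETICAL_PRIMER = "GC" + "GAATTC" + "GGNGCNACNGAYATHTGGGA"

if __name__ == "__main__":
    print(len(HYPOTHETICAL_PRIMER), primer_degeneracy(HYPOTHETICAL_PRIMER))
```

In this hypothetical layout, three four-fold positions, one two-fold position and one three-fold position give 4 x 4 x 4 x 2 x 3 = 384 possible sequences.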
3. Screening the cDNA library of GM11b:
The 100 bp PCR fragment was labeled with [α-32P]dCTP (3000 Ci/mmol) using random priming (Prime-a-Gene labeling kit of Promega) and used as a probe to screen plaque lifts (two replicas per plate) of the plated GM11b cDNA library. Hybridization was done at 42°C in formamide for 2 days. The nylon membranes containing the plaque lifts were washed 3X at room temperature (25°C) in 7X SSPE and 0.5% SDS for 5 minutes. The nylon membranes were then put on X-ray film and exposed for 1 day. Two plaques hybridized and showed signal on the X-ray films of the two replicas taken from the same plate. At the site of positive hybridization, plugs were cut out of the agar plate and put in 1 ml of SM buffer with 20 μL chloroform. A secondary, tertiary and quaternary screening was performed until about 90% of the plaques on the plate showed a strong signal on the X-ray film of both replicas of the same plate. A well isolated plaque representing each clone was cut out from the plate and put in SM buffer. The phage eluate was infected with the ExAssist helper phage to excise the pBluescript SK plasmid containing the cDNA insert and the resulting recombinant bacteria were plated on media with ampicillin (60 μg/ml). A few bacterial colonies were selected, and plasmid DNA was prepared then digested with EcoRI and XhoI to release the inserts. A Southern blot was prepared from the plasmid digests and probed with the 32P-labelled 100 bp TD fragment. All the clones, descendants from the two phage clones, showed very strong signal. This was a strong indication that the isolated clones contained the TD cDNA from the line GM11b. One clone was named TD23 and was selected for DNA sequencing. The size of the cDNA insert in clone TD23 was 2229 nucleotides.
4. Sequencing of the 2229 bp fragment of the clone TD23:
Sequencing of the cDNA insert of clone TD23 was performed by the dideoxy chain termination method using the Sequenase kit of USB. To start the sequencing project, an oligonucleotide primer complementary to the T3 promoter of pBluescript SK was synthesized and used to obtain the sequence of the first few nucleotides of the insert. This sequence, 30 nucleotides, included the multiple cloning site downstream of the T3 promoter. The start of the cDNA sequence was immediately following the EcoRI site which starts at position 31. DNA sequencing was also performed on the opposite strand starting from the 3' end and using the T7 promoter of the pBluescript SK. Both strands of the TD23 insert were sequenced to completion using a set of oligonucleotide primers designed from the DNA revealed after each sequencing reaction. A total of 19 oligonucleotide primers were synthesized and used in sequencing the cDNA insert.
The total length of the sequenced fragment was 2277 nucleotides of which 2229 were the cDNA insert. Of the remaining 48 nucleotides (2277 - 2229), 31 nucleotides were the multiple cloning site between the T3 promoter and the EcoRI site at the 5' end of the insert and 17 nucleotides were multiple cloning site between the T7 promoter and XhoI site at the 3' end of the insert (Figure 4). Figure 5 shows the nucleotide sequence and the predicted amino acid sequence of clone 23 as isolated from the cDNA library constructed from line GM11b of Arabidopsis (omr1/omr1). The TD insert in clone 23 is in pBluescript vector between the EcoRI and XhoI sites. An open reading frame (top reading frame) was observed which showed an ATG codon at nucleotide 166 and a termination codon at nucleotide 1801. The total coding region of the cDNA insert in clone 23 is 1758 nucleotides (including the stop codon) encoding a polypeptide of 585 amino acids. Figure 4 shows the DNA sequence of clone 23 and Figure 5 shows the DNA sequence and the open reading frame with the predicted amino acid sequence encoded by the cDNA insert. The predicted amino acid sequence encoded by the TD23 cDNA gene shared greater than 50% identity with the amino acid sequence of TD of potato and tomato respectively. This was strong evidence that the cDNA insert of the clone TD23 is indeed the gene encoding threonine dehydratase/deaminase, omr1, of the L-O-methylthreonine-resistant line GM11b of Arabidopsis thaliana.
5. Test of functionality of the cDNA insert (omr1) encoding TD of Arabidopsis:
To test that the cloned cDNA insert of the clone TD23 is indeed encoding a functional threonine dehydratase/deaminase, a complementation test was performed. The E. coli strain TGXA is an Ile auxotroph with a deletion in the ilvA gene encoding threonine dehydratase/deaminase (Fisher KE, Eisenstein E (1993) An efficient approach to identify ilvA mutations reveals an amino-terminal catalytic domain in biosynthetic threonine deaminase from Escherichia coli. J Bacteriol 175:6605-6613). This strain cannot grow on a minimal medium without supplementation with Ile. This strain was a generous gift from Drs. Kathryn E. Fisher and Edward Eisenstein, University of Maryland Baltimore County, Maryland.
First complementation experiments were done to test the ability of omr1 to revert the bacterial Ile auxotroph TGXA to prototrophy. This was done by transforming TGXA with pGM-td23, containing the cDNA insert omr1 in pBluescript SK under the control of the T3 promoter. In addition, the cDNA insert containing omr1 was subcloned in two different prokaryotic expression vectors. An XbaI-XhoI fragment, containing the cDNA sequence of omr1, was excised from pGM-td23 and cloned into XbaI-SalI linearized prokaryotic expression vectors pTrc99A and pUCK2. In pTrc99A, omr1 was cloned in front of the lacZ IPTG-inducible promoter while in pUCK2, omr1 was cloned in front of a constitutive promoter. XhoI and SalI cohesive termini are compatible and therefore allowed the ligation of the inserts into the expression vectors. The recombinant vectors pTrc-td23, pUCK-td23 or pBluescript-td23, all containing full length omr1, were transformed into the strain TGXA and plated on minimal media without supplementation. All three constructs were able to revert the Ile auxotrophy of the host TGXA to prototrophy. These experiments confirmed that omr1 encoding Arabidopsis thaliana (line GM11b) TD is functional and able to unblock the Ile biosynthetic pathway of the E. coli strain TGXA.
In the second complementation experiment, the E. coli prototroph host DH5α was transformed with pTrc-td23 or pUCK-td23 and plated on minimal medium supplemented with varying concentrations of the toxic analog L-O-methylthreonine. Both of the constructs were able to confer upon DH5α resistance to 30 μM L-O-methylthreonine. No bacterial colonies grew on plates containing untransformed DH5α. This result provided strong evidence that the mutated omr1 gene of the line GM11b of Arabidopsis is able to confer resistance to L-O-methylthreonine present in the growth medium. Therefore omr1 provides a new environmentally friendly selectable marker for genetic transformation of bacteria.
6. Construction of the pCM35S-omr1 expression vector for plant transformation:
The strategy for cloning the omr1 allele into a plant expression vector was as follows:
A. The coding region of the omr1 allele was excised from pGM-td23 as an XbaI-KpnI fragment.
B. The 500 bp CaMV 35S promoter was cleaved out of the vector pBI121.1 (Jefferson et al. 1987) with HindIII and BamHI. The pBIN19 vector was linearized with HindIII and BamHI then ligated to the CaMV 35S promoter so as to place the promoter into the multiple cloning site in the correct orientation. This vector was named pCM35S.
C. The plasmid pCM35S was digested with XbaI and KpnI and the omr1 fragment isolated in step A was cloned into the XbaI-KpnI sites, placing the omr1 coding region sequence under the transcriptional control of the CaMV 35S promoter and creating a plasmid with the kanamycin resistance gene (NOS:NPTII:NOS) close to the right border RB of the T-DNA region of the Ti plasmid and 35S:omr1 downstream and close to the left border LB of the T-DNA region of the Ti plasmid. This plasmid was named pCM35S-omr1-nos (ca. 13 kb).
D. The NOS terminator of pBIN19 was PCR-amplified using a pair of oligonucleotide primers; the 5' primer was anchored with an XbaI site and the 3' primer was anchored with a SalI site. PCR amplification yielded a 300 bp NOS terminator fragment.
E. To clone a NOS terminator to the 3' end of the omr1 gene, the recombinant plasmid pCM35S-omr1-nos was digested with NheI and XhoI. This yielded three fragments: (i) a 5 kb NheI-NheI fragment containing part of the NOS promoter of the NPTII gene, the 35S promoter and the full length omr1 cDNA except 200 bp of non-translated sequences at the 3' end which include the poly A tail; (ii) a 200 bp NheI-XhoI fragment consisting of the 200 bp of poly A tail and non-translated sequences at the 3' end of omr1 that are excluded from fragment (i); and (iii) an 8 kb NheI-XhoI fragment containing the 5' end NOS promoter of the NPTII gene and the remaining sequences outside LB and RB of the pCM35S-omr1-nos.
F. To clone the NOS terminator immediately downstream from the omr1 gene in pCM35S-omr1-nos, a triple ligation was performed including the 5 kb NheI-NheI fragment containing part of the NOS promoter of the NPTII gene mentioned above in E(i), the 300 bp XbaI-SalI NOS terminator fragment mentioned in D, and the 8 kb NheI-XhoI fragment containing the 5' end NOS promoter of the NPTII gene and the remaining sequences outside LB and RB of the pCM35S-omr1. The result of this triple cloning was the ligation of the 5 kb fragment at one NheI end (the NOS promoter end) to the NheI site of the 8 kb fragment (NheI/NheI), and the other NheI end (at the 3' end of the omr1 coding sequence) of the 5 kb fragment was ligated to the XbaI end (a compatible cohesive end) of the 300 bp NOS terminator fragment. The SalI end of the 300 bp NOS terminator was ligated to the XhoI end (a compatible cohesive end) of the 8 kb fragment. This generated the recombinant plasmid pCM35S-omr1 containing the omr1 gene driven by the CaMV 35S promoter and terminated by the NOS terminator and the kanamycin resistance gene (NOS promoter:NPTII:NOS terminator) between the LB and RB (Figure 16). To confirm the cloning of the three fragments in the proper orientation, a diagnostic digestion with XbaI and KpnI produced a 2.3-2.4 kb fragment. The plasmid pCM35S-omr1 therefore contained two constructs that could be expressed in plants, the CaMV35S:omr1:NOS terminator expressing L-O-methylthreonine-resistance and the NOS promoter:NPTII:NOS terminator expressing kanamycin-resistance.
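The ligations described above rely on NheI and XbaI ends, and on XhoI and SalI ends, carrying matching single-stranded overhangs even though the enzymes recognize different sites. As a minimal sketch only, the following Python derives the 5' overhang from each recognition site and compares the overhangs. The recognition sites and cut positions are the generally known ones for these enzymes; the function names are illustrative and are not part of any described method.

```python
# Recognition site (5'->3', top strand) and the cut offset within it.
# All five enzymes cut after the first base, leaving a 4-base 5' overhang.
ENZYMES = {
    "NheI": ("GCTAGC", 1),
    "XbaI": ("TCTAGA", 1),
    "XhoI": ("CTCGAG", 1),
    "SalI": ("GTCGAC", 1),
    "EcoRI": ("GAATTC", 1),
}


def five_prime_overhang(enzyme: str) -> str:
    """Return the single-stranded 5' overhang left by a palindromic cutter."""
    site, cut = ENZYMES[enzyme]
    return site[cut:len(site) - cut]


def ends_are_compatible(enzyme_a: str, enzyme_b: str) -> bool:
    """Palindromic overhangs anneal to each other when they are identical."""
    return five_prime_overhang(enzyme_a) == five_prime_overhang(enzyme_b)


if __name__ == "__main__":
    for pair in [("NheI", "XbaI"), ("XhoI", "SalI"), ("NheI", "XhoI")]:
        print(pair, five_prime_overhang(pair[0]),
              five_prime_overhang(pair[1]), ends_are_compatible(*pair))
```

It may be noted that a ligated NheI/XbaI junction (GCTAGA) or XhoI/SalI junction (CTCGAC) recreates neither parental recognition site, so such a junction is not recut by either enzyme of the pair.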
7. Plant transformation using pCM35S-omr1:
Using the vacuum infiltration method of Bent et al. (1994), L-O-methylthreonine-sensitive Arabidopsis thaliana Columbia wild type plants were transformed with pCM35S-omr1. Ten pots, each with 3-4 plants, were transformed and T1 seeds were harvested from the T0 transformed plants of each pot separately. The T1 seeds from each pot were screened for expression of L-O-methylthreonine resistance by germinating in agar medium supplemented with 0.2 mM L-O-methylthreonine, a concentration previously determined and known to completely inhibit the growth of wild type seedlings beyond the cotyledonous stage (Mourad and King, 1995). Half of the T1 seeds from each of the ten pots were screened for L-O-methylthreonine resistance and 5 independent transformants were able to germinate and continue to grow healthy roots and shoots among thousands of seedlings that were completely bleached immediately after the emergence of the cotyledons. In a crowded plate, it is possible to identify the transformants by looking at the bottom of the plate: the transformants show root growth while the nontransformants have none. After three weeks of growth in the 0.2 mM L-O-methylthreonine agar medium, each of the 5 positive transformants was transferred to soil, kept separately and allowed to self-fertilize to produce the T2 seed.
8. Genetic characterization of the omr1 transformants:
The T2 seed was harvested from each of the 5 positive T1 transformants and 50 T2 seeds/transformant were planted in a separate petri plate containing 0.2 mM L-O-methylthreonine agar medium. In each of the 5 petri plates, the majority (75% or more) of the T2 seedlings were resistant to L-O-methylthreonine, indicating that a single copy of the transgene omr1 had been inserted in the parent T1 transgenic plant. Figure 6b shows that 585 amino acid residues of the total 592 residues representing the full length mutant TD were expressed in the transgenic plants. This slightly truncated precursor mutant TD was able to translocate to the chloroplast and confer upon transgenic plants resistance to OMT.
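A single dominant insertion in a self-fertilized T1 plant is expected to segregate about 3 resistant to 1 sensitive in the T2 generation, which is the basis for reading "75% or more" as a single-copy insertion. As an illustrative sketch only, the following Python computes a chi-square goodness-of-fit statistic against a 3:1 ratio. The seedling counts used are hypothetical (no per-plate counts are reported here), and 3.841 is the standard 5% critical value for one degree of freedom.

```python
CRITICAL_5PCT_1DF = 3.841  # chi-square critical value, 1 degree of freedom


def chi_square_3_to_1(resistant: int, sensitive: int) -> float:
    """Chi-square statistic for observed counts against a 3:1 expectation."""
    total = resistant + sensitive
    expected_resistant = 0.75 * total
    expected_sensitive = 0.25 * total
    return ((resistant - expected_resistant) ** 2 / expected_resistant
            + (sensitive - expected_sensitive) ** 2 / expected_sensitive)


def fits_single_locus(resistant: int, sensitive: int) -> bool:
    """True when the counts do not differ significantly from 3:1 at 5%."""
    return chi_square_3_to_1(resistant, sensitive) < CRITICAL_5PCT_1DF


if __name__ == "__main__":
    # Hypothetical plate of 50 T2 seedlings: 38 resistant, 12 sensitive.
    print(round(chi_square_3_to_1(38, 12), 4), fits_single_locus(38, 12))
```

With only 50 seeds per plate such a test has limited power, so it is best regarded as a consistency check rather than proof of copy number.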
9. Molecular characterization of the omr1 transformants:
Two to three leaves of each of the five T1 transformants were excised from the plants at the rosette stage and total DNA was extracted according to a modification of the procedure of Konieczny and Ausubel (1993). A PCR approach was used to confirm the presence of the introduced transgene omr1. For that, a pair of oligonucleotide primers was synthesized such that one primer was complementary to the start of omr1 and the other primer was complementary to the end of the NOS terminator. DNA extracted from each of the five T1 transformants was PCR amplified and each produced a 2.5 kb fragment, confirming the presence of the transgene omr1 followed by the NOS terminator in each of the transformants. The native wild type allele OMR1 did not PCR amplify because it is not followed by the NOS terminator and therefore no PCR reaction could take place. DNA extracted from untransformed Arabidopsis plants failed to amplify using such primers.
Example 3: Cloning of a Full-Length cDNA That Encodes a Mutated Threonine Dehydratase/Deaminase Enzyme
Plasmid pGMtd23 (Figure 20) is a cDNA clone that contains a portion of a transcript that encodes a mutant threonine dehydratase/deaminase enzyme. The sequence of the cDNA insert portion of pGMtd23, including the EcoRI (GAATTC, bases 1-6) and XhoI (CTCGAG, bases 2230-2235) recognition sites that were added in the preparation of the clone, is set forth as SEQ ID NO: 19. It is pertinent to this invention that an uninterrupted open reading frame (ORF) begins with base number 1 of SEQ ID NO: 19, and continues to base number 1770, where the ORF is terminated by a TGA stop codon. The amino acid sequence of the protein encoded by this ORF is given in SEQ ID NO: 19 as three letter designations of the amino acids underneath the cDNA sequence. It is seen that the first Met (methionine) residue in this deduced protein sequence occurs at amino acid number 46 (underlined). It is well known in the field of eukaryotic gene expression that translation of proteins in most cases originates at the first ATG (methionine start codon) encountered by the ribosomes as they scan the mRNA from the 5' end. This observation suggests that bases 7-135 of SEQ ID NO: 19 represent a 5' untranslated sequence of the mRNA represented as a cDNA in pGMtd23, and that the Met encoded by bases 136-138 represents the first amino acid of a 545 amino acid encoded protein. However, in the case of the cDNA sequence presented in SEQ ID NO: 19, the first ATG codon is found at bases 125-127. This ATG codon is in a different reading frame from that which is found at bases 136-138. The presence of this out-of-frame ATG codon 11 bases 5' to the putative protein initiation codon is a highly unusual feature of the 5' untranslated sequences of plant mRNAs, and might indicate that bases 7-135 of SEQ ID NO: 19 are actually not the 5' untranslated leader sequence of the mRNA represented by SEQ ID NO: 19.
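The reading-frame relationships stated above follow from simple modular arithmetic on the base positions. By way of a minimal sketch, the following Python uses only the positions quoted in the text (ORF starting at base 1, ATG codons at bases 125 and 136, last coding base 1770); the function and constant names are illustrative assumptions.

```python
def reading_frame(first_base: int) -> int:
    """Frame (0, 1 or 2) of a codon whose first base is at a 1-based position."""
    return (first_base - 1) % 3


def codon_number(first_base: int, orf_start: int = 1) -> int:
    """1-based index of the codon starting at first_base within the ORF."""
    return (first_base - orf_start) // 3 + 1


ORF_START = 1            # ORF begins at base 1 of SEQ ID NO: 19
UPSTREAM_ATG = 125       # first ATG encountered, bases 125-127
IN_FRAME_ATG = 136       # Met codon, bases 136-138
LAST_CODING_BASE = 1770  # end of the region said to encode the protein

if __name__ == "__main__":
    print("Met codon number:", codon_number(IN_FRAME_ATG, ORF_START))
    print("frames:", reading_frame(ORF_START), reading_frame(IN_FRAME_ATG),
          reading_frame(UPSTREAM_ATG))
    print("bases between ATGs:", IN_FRAME_ATG - UPSTREAM_ATG)
    print("codons in 136-1770:", (LAST_CODING_BASE - IN_FRAME_ATG + 1) // 3)
```

The ATG at base 136 shares frame 0 with the ORF and falls at codon 46, whereas the ATG at base 125 lies in frame 1, 11 bases upstream; bases 136-1770 span 545 codons.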
There exist in the field of molecular biology several computer programs which can compare the amino acid sequences of proteins from various sources. One such computer program is the GAP algorithm of the Wisconsin Package Version 9.1, Genetics Computer Group (GCG), Madison, Wisc. When the GAP program is used to compare the deduced amino acid sequence of the protein encoded by bases 136-1770 of SEQ ID NO: 19 with the amino acid sequences of threonine dehydratase/deaminase proteins deduced from a tomato genomic clone (Accession Number M61915 in the Entrez database of the National Center for Biotechnology Information, NCBI) and a chickpea cDNA clone (NCBI Accession Number X78575), it is found that the three proteins share a great deal of amino acid sequence homology. Further, the alignments reveal that both the tomato and chickpea proteins are substantially longer at their amino terminus than the deduced protein encoded by bases 136-1770 of SEQ ID NO: 19. Specifically, the tomato protein is 49 amino acids longer, and the chickpea protein is 43 amino acids longer. It is known [Samach, A., Hareven, D., Gutfinger, T., Ken-Dror, S., and Lifschitz, E., Biosynthetic threonine deaminase gene of tomato: isolation, structure, and upregulation in floral organs. Proc. Natl. Acad. Sci. USA 88 (7), 2678-2682 (1991)] that amino acids 1-50 of the tomato protein comprise the chloroplast transit peptide which directs the transport of the preprotein form of threonine dehydratase/deaminase into the chloroplast where it is naturally found in plant cells. This observation suggests that the 545 amino acid protein encoded by bases 136-1770 of SEQ ID NO: 19 does not include the complete chloroplast transit peptide.
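Programs of the GAP type perform a global alignment of two sequences in the manner of Needleman and Wunsch. Purely as an illustration of that class of comparison, the following Python sketch aligns two short peptide strings and reports percent identity. It is not the GAP program: the match, mismatch and linear gap scores are simplifying assumptions (GAP uses a substitution matrix and separate gap creation and extension penalties), and the two peptides are invented for the example.

```python
def global_align(a: str, b: str, match: int = 1, mismatch: int = -1,
                 gap: int = -2) -> tuple[str, str]:
    """Needleman-Wunsch global alignment with a linear gap penalty."""
    n, m = len(a), len(b)
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)
    # Trace back from the bottom-right corner to recover one best alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        sub = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            out_a.append(a[i - 1]); out_b.append(b[j - 1]); i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-"); i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1]); j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b))


def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Identical aligned positions as a percentage of alignment length."""
    same = sum(x == y and x != "-" for x, y in zip(aligned_a, aligned_b))
    return 100.0 * same / len(aligned_a)


if __name__ == "__main__":
    top, bottom = global_align("MASLLR", "MASLR")  # invented toy peptides
    print(top, bottom, round(percent_identity(top, bottom), 1))
```

Percent identity computed this way depends on the scoring scheme and on how gapped positions are counted, so values from different programs are not directly comparable.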
Further support for this conclusion is provided when one aligns the deduced protein sequence encoded by bases 7-1770 of SEQ ID NO: 19 with the tomato and chickpea proteins. Substantial homology is seen between the amino acids encoded by bases 7-135 of SEQ ID NO: 19 and the chloroplast transit peptides of the tomato and chickpea proteins. This analysis further suggests that bases 7-135 of SEQ ID NO: 19 do not represent a 5' untranslated leader sequence, but rather are part of a longer ORF that is incompletely represented in pGMtd23.
One may obtain the base sequence of the 5' end of the putative ORF in the following manner. Synthetic oligonucleotide molecules having the sequences 5'-CACAGGAAACAGGAC TCTAGA-3' (tdexF, complementary to the lambda YES cloning vector) and 5'-GGAGAGACC TTAAGACGTGG-3' (tdintR, the reverse complement of bases 166-185 of SEQ ID NO: 19) were used as forward and reverse primers, respectively, in Polymerase Chain Reactions (PCR). A bulk cDNA library in the lambda YES vector, prepared from mRNA isolated from Arabidopsis thaliana plants, was used to provide template for the amplifications. A typical reaction contained, in a volume of 100 μl, 50 pmole of tdintR primer, 185 pmole of tdexF primer, 1 μg of lambda YES library template DNA, 2.5 units of Amplitaq enzyme, and buffers as recommended by the manufacturer of the gene amplification PCR kit (Roche Molecular Systems, Branchburg, NJ, USA). This reaction was cycled through 35 cycles of 94°, 1 min; 50°, 2 min; and 72°, 5 min, and then followed by incubation at 72° for 7 min. Amplification products in a range of sizes less than about 500 base pairs (bp) in length were detected by agarose gel electrophoresis. Following cloning of the amplification products into the TOPO TA vector (Invitrogen, Carlsbad, CA), the DNA sequences of the insert fragments were determined by standard dideoxy terminator methodologies. One clone had the DNA sequence set forth as SEQ ID NO: 21. It can be seen that bases 22-191 of SEQ ID NO: 21 correspond precisely to bases 16-186 of SEQ ID NO: 19, thus indicating that SEQ ID NO: 21 represents a partial clone of a cDNA derived from an Arabidopsis mRNA encoding threonine dehydratase/deaminase. It is important to note that bases 1-21 of SEQ ID NO: 21 are an upstream continuation of the ORF that is known to encode the mutated threonine dehydratase/deaminase as presented in SEQ ID NO: 19, thereby indicating that bases 7-135 of SEQ ID NO: 19 do not represent a 5' untranslated sequence. By replacing bases 1-15 of SEQ ID NO: 19 with bases 1-21 of SEQ ID NO: 21, one derives SEQ ID NO: 23, which contains the full-length 1776 base coding region for a mutated threonine dehydratase/deaminase protein of 592 amino acids, and which includes a chloroplast transit peptide sequence.
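The derivation of SEQ ID NO: 23 and the orientation of the tdintR primer can be illustrated with a short calculation. This is a sketch under stated assumptions, not part of the original disclosure: the function names are hypothetical, and the space printed within the primer sequence is assumed to be a typographical break rather than part of the oligonucleotide.

```python
# Complement table for unambiguous DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


def spliced_length(orf_end: int, bases_removed: int, bases_added: int) -> int:
    """Length of a coding region after replacing its first bases_removed bases."""
    return orf_end - bases_removed + bases_added


# tdintR as recited in the text, with the internal space removed (assumption).
TDINTR = "GGAGAGACCTTAAGACGTGG"

# Per the text, this strand should read the same as bases 166-185 of SEQ ID NO: 19.
print(reverse_complement(TDINTR))

# Replace bases 1-15 of SEQ ID NO: 19 (ORF ends at base 1770) with 21 bases.
coding_length = spliced_length(orf_end=1770, bases_removed=15, bases_added=21)
print(coding_length, coding_length % 3, coding_length // 3)
```

The calculated length of 1776 bases is a whole number of codons (592), in agreement with the coding region length recited for SEQ ID NO: 23.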
Example 4: Modification Of A Full-Length Coding Region Encoding A Mutated Threonine Dehydratase/Deaminase Enzyme For Expression In Transformed Plant Cells.
By standard molecular biology methods, an RcaI recognition site (TCATGA) was introduced at the beginning of the sequence set forth as SEQ ID NO: 23 in the following manner. Synthetic oligonucleotides having the sequences (in the 5' to 3' direction) GCTCTAGATCATGA ATTCCGTTCAGCTTCCGACGGCGCAATCCTCTCTCCGTAGCCACATT (TD5XREL primer) and CTCGTTCGTACGTTCTGGTACAGCACCGAG (tdSPL R primer) were used as forward and reverse primers, respectively, in PCR reactions using pGMtd23 DNA as template. It should be noted that in the TD5XREL primer, the sequence TCTAGA comprises an XbaI recognition site, and the underlined bases correspond precisely to bases 15-45 of SEQ ID NO: 19. In the tdSPL R primer, the sequence CGTACG comprises a BsiWI recognition site, and the primer sequence in its entirety forms the reverse complement of bases 214-243 of SEQ ID NO: 19. For PCR amplification the reaction contained, in a total volume of 50 μl, 50 pmol each of TD5XREL and tdSPL R primers, 20 ng of pGMtd23 DNA, 8 nmol each of dATP, dGTP, dCTP, and dTTP, with 2.5 units of Amplitaq Gold polymerase and 1X buffer as supplied by the manufacturer (Roche Molecular Systems, Branchburg, NJ, USA). Amplification products of the expected size (259 bp) were detected by agarose gel electrophoresis after 35 cycles of 94°, 30 sec; 60°, 30 sec; and 72°, 45 sec; followed by 72° for 10 min. The PCR products were cloned into the TOPO TA vector as above, and the DNA sequence of one of the isolates was determined to be precisely as expected. It is to be noted that this engineering step was done in such a manner that the ATG which forms part of the RcaI recognition site is also the ATG start codon represented as bases 1-3 of SEQ ID NO: 23. The 245 bp XbaI/BsiWI fragment from this clone was used to replace the corresponding XbaI/BsiWI fragment of pGMtd23, generating a plasmid named pDAB5017, which has as a portion of its sequence the DNA sequence set forth as SEQ ID NO: 23.
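The placement of the recognition sites within the two primers can be verified by a simple string search. The following sketch is illustrative only and is not part of the original disclosure; the helper name is hypothetical, and the space printed within the TD5XREL sequence is assumed to be a typographical break.

```python
# Recognition sequences named in the text.
SITES = {"XbaI": "TCTAGA", "RcaI": "TCATGA", "BsiWI": "CGTACG"}

# Primer sequences as recited, 5' to 3', with the printed space removed (assumption).
TD5XREL = "GCTCTAGATCATGAATTCCGTTCAGCTTCCGACGGCGCAATCCTCTCTCCGTAGCCACATT"
TDSPL_R = "CTCGTTCGTACGTTCTGGTACAGCACCGAG"


def find_sites(seq: str) -> dict:
    """Map each enzyme whose site occurs in seq to the 1-based start of the first match."""
    return {name: seq.find(site) + 1 for name, site in SITES.items() if site in seq}


forward_sites = find_sites(TD5XREL)
reverse_sites = find_sites(TDSPL_R)
print(forward_sites)  # XbaI and RcaI sites in the forward primer
print(reverse_sites)  # BsiWI site in the reverse primer

# The ATG embedded in the RcaI site (TCATGA) begins two bases into the site.
atg_start = forward_sites["RcaI"] + 2
print(TD5XREL[atg_start - 1:atg_start + 2])  # ATG
```

In this sketch the XbaI site lies immediately 5' to the RcaI site in the forward primer, so that the ATG within TCATGA can serve as the start codon, as the text describes.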
In a series of cloning steps the NheI site of SEQ ID NO: 23 (underlined bases 2005-2010) was converted to an SmaI recognition site, generating plasmid pDAB5018. Following cleavage of plasmid pDAB5018 with RcaI and SmaI, a 2013 bp DNA fragment that includes the entire full-length coding region for the mutated threonine dehydratase/deaminase enzyme and 228 bp corresponding to the 3' untranslated region of the mRNA was isolated. This RcaI/SmaI fragment was ligated to DNA of plasmid pDAB1538 that had been digested with restriction enzymes NcoI and Ecl136II to generate plasmid pDAB1850 (Figure 21). It is of note that DNA fragments generated by cleavage with RcaI and NcoI have compatible overhanging ends, and that fragments generated by digestion with SmaI and Ecl136II have blunt ends. In this way, the coding region for the full-length mutated threonine dehydratase/deaminase enzyme was placed, in plasmid pDAB1850, under the transcriptional control of the maize ubiquitin 1 promoter, and transcription was terminated and polyadenylation mediated by the nopaline synthase (Nos) terminator region present in plasmid pDAB1850. Plasmid pDAB1850 therefore is capable of independent expression of the coding region for the full-length mutated threonine dehydratase/deaminase enzyme in a transformed plant cell.
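The end compatibility noted above follows from the cleavage positions of the four enzymes. The sketch below is illustrative only and is not part of the original disclosure; the class and method names are hypothetical, the cleavage positions are the commonly catalogued ones (T^CATGA, C^CATGG, CCC^GGG, GAG^CTC), and the overhang calculation is valid only for palindromic sites producing 5' overhangs or blunt ends.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Enzyme:
    name: str
    site: str  # recognition sequence, 5' to 3'
    cut: int   # number of bases 5' of the cut on the top strand

    def overhang(self) -> str:
        """5' overhang left by cleavage of a palindromic site ('' for a blunt cut)."""
        bottom_cut = len(self.site) - self.cut
        return self.site[self.cut:bottom_cut]


RCA_I = Enzyme("RcaI", "TCATGA", 1)
NCO_I = Enzyme("NcoI", "CCATGG", 1)
SMA_I = Enzyme("SmaI", "CCCGGG", 3)
ECL136_II = Enzyme("Ecl136II", "GAGCTC", 3)


def compatible(a: Enzyme, b: Enzyme) -> bool:
    """Two ends can be ligated if both are blunt or both carry the same overhang."""
    return a.overhang() == b.overhang()


print(RCA_I.overhang(), NCO_I.overhang())  # CATG CATG
print(compatible(RCA_I, NCO_I))            # True: cohesive ends match
print(compatible(SMA_I, ECL136_II))        # True: both ends are blunt
print(compatible(RCA_I, SMA_I))            # False: cohesive versus blunt
```

Because the two junctions differ in kind (one cohesive, one blunt), the fragment can be expected to insert in a single orientation relative to the promoter.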
A 4381 bp DNA fragment prepared from DNA of plasmid pDAB1850 by NotI digestion was ligated to DNA of plasmid pDAB367 to generate plasmid pDAB1852 (Figure 23). In this plasmid context, the coding region for the full-length mutated threonine dehydratase/deaminase enzyme, whose expression is regulated by the maize ubiquitin 1 promoter/intron and Nos terminator, is covalently linked to a plant selectable marker gene, specifically, the phosphinothricin acetyl transferase resistance (bar) coding region. The bar coding region is under the transcriptional control of a highly modified version of the cauliflower mosaic virus (CaMV) 35S promoter (the modified version being called 35T), with transcription termination and polyadenylation being mediated by the Nos terminator. The 35T promoter is comprised of a doubly-enhanced version of the basic 35S promoter, which is placed in front of a chimeric 5' untranslated leader sequence. This leader consists of the 5' untranslated leader of the Maize Streak Virus (MSV) coat protein gene, into which has been ligated an internally deleted version of intron 1 of the maize alcohol dehydrogenase 1S gene.
Plasmid pDAB1852 has utility in testing the expression of the coding region for the full-length mutated threonine dehydratase/deaminase enzyme in transgenic plant cells. By virtue of the simultaneous introduction of both plant-expressible genes present on pDAB1852 into transformed plant cells, it is possible to first select for transformed cells using the bar selectable marker gene, and then screen the transformed plant cells for production of the mutant threonine dehydratase/deaminase enzyme. In this instance, the presence of the mutated threonine dehydratase/deaminase enzyme can be exemplified by biochemical methods described elsewhere. It is also possible to introduce the coding region for the full-length mutated threonine dehydratase/deaminase enzyme by co-transformation methods. In these instances, the coding region for the
full-length mutated threonine dehydratase/deaminase enzyme, whose expression is regulated by the maize ubiquitin 1 promoter/intron and Nos terminator, as borne on plasmid pDAB1850, is mixed with another plasmid that contains the bar gene as described for plasmid pDAB1852. In such instance, transformed plant cells independently incorporate and express the plant-expressible genes. It is often the case that a nonselectable, but scorable, marker gene is included in such experiments. One such scorable gene is the Escherichia coli uidA gene, which encodes the GUS protein. It is well known to those in the field of plant transformation that production of the GUS protein at high levels in transformed plant cells is easily accomplished when the GUS coding region is placed under the control of a plant-expressible promoter such as the CaMV 35S promoter, or modified versions of it (e.g. 35T). Figure 24 presents the map of plasmid pDAB311, which contains the 35T/bar/Nos and 35T/GUS/Nos genes, and is used in co-transformation experiments with plasmid pDAB1850. Figure 25 presents the map of plasmid pDAB305, which contains the 35T/GUS/Nos gene, and is used in co-transformations with plasmid pDAB1852.
The plasmids pDAB1850, pDAB1852, pDAB311 and pDAB305 may be used for the production of transgenic plants, such as maize. Examples of the production of transgenic lines are described further in the Examples.
EXAMPLE 5 : Functionality of omrl in Transgenic Maize
1. Cloning and plasmid construction.
Details of cloning of long or full length omrl, sequence information, subcloning, and descriptions of plasmids used in this study (i.e., pDAB1850, pDAB1852, pDAB311, and pDAB305) are described above.
2. Production of transgenic lines
Part A. Initiation, establishment and maintenance of embryogenic maize suspension cultures.
Greenhouse-grown plants of two maize inbred lines, A188 and B73, were crossed and ears were harvested 10 to 12 days post-pollination. Immature zygotic embryos were aseptically excised and cultured on N6 based medium (Chu et al, 1978, The N6 medium and its application to anther culture of cereal crops. Proc. Symp. Plant Tissue Culture, Peking Press, p43-56) plus AgNO3 (10 mg/L). To initiate suspension cultures, approximately 3 g of healthy 'Type II' callus originating from a single embryo were added to 20 mL of H9CP+ liquid medium. H9CP+ is liquid MS medium (Murashige and Skoog, 1962, A revised medium for rapid growth and bioassays with tobacco tissue cultures. Physiol. Plant. 15: 473-497) plus 2 mg/L 2,4-D, 2 mg/L NAA, 100 mg/L myo-inositol, 0.69 g/L L-proline, 200 mg/L casein hydrolysate, 30 mg/L sucrose and 5% coconut water (added at subculture), adjusted to pH 6 prior to autoclaving. Cultures were maintained in 125 ml Erlenmeyer flasks in the dark at 28°C on a 125 rpm shaker. Cultures typically became established 2 to 3 months after initiation, and were maintained by subculture every 3.5 days. For subculture, 3 ml packed cell volume (pcv) of cells was measured in a 10 ml wide bore pipet. The measured cells plus 7 ml of old medium were pipetted into 20 ml of fresh medium.
Part B. Preparation of silicon carbide whiskers for use in transformation experiments.
Flat top 2.0 mL microcentrifuge tubes (Fisher Scientific) were labeled and weighed. Approximately 45 to 95 mg of dry whiskers were added to each tube in a fume hood. An N100 particulate respirator (3M), gloves and a lab coat were worn during this procedure. Each tube was capped and re-weighed to determine the whisker weight. Tubes were placed in Magenta boxes, and autoclaved for 30 minutes on a gravity cycle. Once cooled, tubes remained in the Magenta box in a fume hood until used.
During transformation experiments, a 5% w/v whisker suspension was made by adding an appropriate amount of an osmotic culture medium per tube of sterile whiskers. The osmotic medium was liquid N6 medium plus 45 g/L D- sorbitol, 45g/L D-mannitol, and 30 g/L sucrose, adjusted to pH 6.0 before autoclaving. The whisker suspension was vortexed 1 to 2 minutes immediately prior to use.
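The "appropriate amount" of osmotic medium for a given tube follows from the recorded whisker weight. The following is a minimal sketch, not part of the original disclosure, assuming that 5% w/v denotes 5 g per 100 ml (50 mg/ml); the function names are hypothetical.

```python
def suspension_volume_ml(whisker_mass_mg: float, percent_w_v: float = 5.0) -> float:
    """Volume of medium (ml) needed to bring a whisker mass to the given % w/v."""
    mg_per_ml = percent_w_v * 10.0  # 1% w/v corresponds to 10 mg/ml
    return whisker_mass_mg / mg_per_ml


def whisker_mass_in_aliquot_mg(aliquot_ul: float, percent_w_v: float = 5.0) -> float:
    """Mass of whiskers (mg) carried in an aliquot of the suspension."""
    return aliquot_ul / 1000.0 * percent_w_v * 10.0


# Range of whisker weights recited for each tube.
for mass_mg in (45.0, 95.0):
    print(mass_mg, "mg ->", round(suspension_volume_ml(mass_mg), 2), "ml of osmotic medium")

# The 160 ul aliquot of suspension added to each tube of cells in Part D.
print(round(whisker_mass_in_aliquot_mg(160.0), 2), "mg of whiskers per tube")
```

On these assumptions, tubes holding 45 to 95 mg of whiskers receive roughly 0.9 to 1.9 ml of medium, which fits within the 2.0 mL tubes described.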
Part C. Preparation of maize suspension cultures for use in transformation experiments. Approximately 16 to 24 hours prior to transformation experiments, maize suspension cultures were each subcultured into 20 ml of liquid N6 medium. On the day of the experiment, all cells of a given line were pooled to reduce flask to flask variability. The N6 osmotic medium described in part B of this example was added to sterile 125 ml Erlenmeyer flasks (12 ml/flask). Typically 6 or 8 flasks were used per experiment. To each flask, 2 ml pcv from the pool was added. The flasks were placed on a shaker in the dark for 45 minutes. After that time period, the contents of each flask were transferred to a 15 ml centrifuge tube. After the cells had settled to the bottom of the tube, all but 1 ml of liquid was drawn off and added back to the original flask.
Part D. Whisker-mediated Transformation. The 5% w/v whisker suspension was prepared and vortexed as outlined in part B. Using a wide bore pipet tip, 160 μl was added to each centrifuge tube of cells. In approximately half the experiments, 20 μl of plasmid pDAB1852, adjusted to a concentration of 1 mg/ml, was added to each tube. In the other half of the experiments, 10 μl of pDAB311 and 10 μl of pDAB1850, both adjusted to 1 mg/ml, were added to each tube. The tubes were securely capped and swirled or tapped to mix the contents. Each tube was placed on a modified Vari-Mix dental amalgamator. The amalgamator was adapted to hold a centrifuge tube by the addition of a lightweight metal arm extension. The tubes were agitated for 60 seconds on either low or medium speed. After agitation, the contents of each tube were transferred back to the original flask. An additional 18 ml of liquid N6 medium was added per flask to reduce the osmoticum. The flasks were placed on a 125 rpm shaker in the dark for a 2 hour recovery.
Part E. Plating of whisker-treated suspension cells, selection and recovery of stable transformants.
The contents of each flask were transferred to a 50 ml centrifuge tube. A sterilized glass cell collector unit was connected to a vacuum, and a sterile Whatman #4 filter paper was placed on the unit. Six ml of cell suspension was pipetted into the unit, with the vacuum drawing through the liquid, leaving the cells on the filter paper. One flask yielded 5 filters of cells. Each filter paper was placed on a 60 x 20 mm plate of N6 solid medium. Plates were wrapped with 3M micropore tape and placed in the dark for 1 week at 28° C.
After 1 week, the filter papers were transferred to plates containing solid N6 medium + 1 mg/L bialaphos. This step was repeated after an additional week. At 3 weeks post-experiment, the tissue was embedded on 100 x 15 mm plates, which also contained solid N6 medium + bialaphos. To embed, 5 ml of melted 37° C agarose was added to a sterile test tube which contained 50 μl of bialaphos stock. The contents of each filter were scraped into the test tube and pipetted up and down to mix. Approximately 2.5 ml (1/2 of the contents) was pipetted over the surface of individual 100 x 15 mm selection plates. Each test tube yielded 2 plates. Plates were wrapped with parafilm and incubated in the dark at 28° C.
Bialaphos-resistant transformants were recovered 2 to 8 weeks post-embedding. They appeared as light yellow sectors proliferating against a background of dark yellow to brown growth-inhibited tissue. The growing tissue was
placed on fresh selection medium. Transgenic cultures were established after 1 to 3 additional subcultures.
3. Southern Analysis of Maize Transgenic Callus Lines Produced on Bialaphos. Southern analysis was used to identify maize Type II callus lines that contained intact copies of the omrl gene. Callus samples of transgenic colonies produced on bialaphos selection and negative controls (non-transformed callus) were collected and rinsed in deionized water for 30 minutes prior to lyophilization. Genomic DNA from the callus material was prepared from lyophilized tissue as described by Saghai-Maroof et al. (1984, Proc. Natl. Acad. Sci. USA 81:8014). Eight micrograms of DNA from the callus tissue was digested with the restriction enzymes NcoI and EcoRI using conditions suggested by the manufacturer (Bethesda Research Laboratory, Gaithersburg, MD) and separated by agarose gel electrophoresis; this digestion should result in a 1.4 kb hybridization product when hybridized with the radiolabeled probe specific for the omrl coding sequence. The DNA was blotted onto nylon membranes as described by Southern (1975, J. Mol. Biol. 98:503). Radiolabeled probe DNA was hybridized to the genomic DNA on the blots using 50 ml of minimal hybridization buffer (10% polyethylene glycol, 7% sodium dodecyl sulfate, 0.6x SSC, 10 mM sodium phosphate, 5 mM EDTA and 100 mg/ml denatured salmon sperm DNA), which was heated to 60° C and mixed with the denatured radiolabeled probe; hybridization was carried out at 60° C. The blots were washed at 60° C in 0.25X SSC and 2% SDS for 45 minutes, blotted dry and exposed to XAR-5 film with two intensifying screens overnight.
The expected 1.4 kb hybridization product was detected in transgenic lines as shown in Table 1.
4. Biochemical Analysis of Mutated Threonine Dehydratase/Deaminase in Transgenic Lines Selected on Bialaphos.
As described herein, callus was produced which was transformed with the Arabidopsis thaliana mutated threonine dehydratase/deaminase (TD) gene, denoted omrl, either with pDAB1852 (plasmid containing BAR and omrl) or by cotransformation with pDAB1850 (plasmid with omrl) and pDAB311 (plasmid containing BAR and GUS). Maize callus material was selected on bialaphos and analyzed for threonine dehydratase/deaminase activity in the presence and absence of isoleucine. Maize callus was homogenized, and the proteins were extracted and normalized for protein concentration (BioRAD Protein assay, Hercules, CA). Threonine dehydratase/deaminase assays were conducted according to Strauss et al. ((1985) Planta 163:554-562) with slight modifications. A standard reaction contained 0.15 M Tris-HCl, pH 9.0, 60 mM threonine, 0.3 M K2HPO4, 0.3 mM Na2EDTA, pH 9.0, 0.3 mM DTT, 2-5 mM L-isoleucine in treated assays, and enzyme in a total volume of 500 μL. Reactions were conducted at 30° C for 20 min and terminated with 200 μL of 50% (w/v) TCA. Ketoacid produced was determined according to Friedmann and Haugen (1943) (?) by adding 200 μL of 0.1% (w/v) 2,4-dinitrophenylhydrazine in 2 N HCl and incubating for 20 min at room temperature. KOH (900 μL of 2.5 N) was then added and mixed, the tubes were incubated for 15 min at room temperature, and the A515 was determined. Natural variations in threonine dehydratase/deaminase activity were determined using nontransformed callus lines as a control. The results are shown in Table 1.
Table 1. Increased, isoleucine-insensitive threonine dehydratase/deaminase activity in maize callus transformed with either pDAB1852 (BAR + OMRl) or pDAB1850 (OMRl) and pDAB311 (BAR/GUS) and selected on bialaphos.

Callus Line    TD Activity(a)    Isoleucine(b)    Percent Control(c)    Southern(d)
omrl-2         2.205             1.632            1141                  +
omrl-27        0.521             0.401            280                   +
omrl-30        0.669             0.536            374                   +
omrl-3         0.412             0.155            108                   +
omrl-6         0.412             0.162            113                   ND
omrl-10        0.344             0.148            103                   ND
NT control     0.372             0.143            100                   ND
NT control     0.389             0.152            106                   ND
(a) Threonine dehydratase/deaminase (TD) activity was measured by absorbance at 515 nm. (b) The effect of 2 mM isoleucine on TD activity. (c) Percent change of TD activity relative to the control. (d) Southern analysis: presence of omrl gene (+) or no band determined (ND).
A clear difference in TD activity in the presence and absence of isoleucine was observed among the lines. Callus lines omrl-2, 27, and 30 were insensitive to isoleucine and overall showed an increase in TD activity as compared to the control lines and callus lines omrl-3, 6, and 10. One callus line, omrl-3, was determined to contain the gene of interest; however, it was not shown to have a difference in TD activity. The results described above demonstrate that transformation of maize callus with omrl increased the overall TD activity and rendered that activity insensitive to isoleucine.
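The "Percent Control" column of Table 1 can be approximately reproduced from the tabulated absorbances. The sketch below is illustrative only and is not part of the original disclosure; it assumes, without confirmation in the text, that each percentage is the activity measured in the presence of isoleucine divided by that of the first NT control (0.143). The function and variable names are hypothetical.

```python
# (line, TD activity, TD activity with 2 mM isoleucine, reported percent of control)
TABLE_1 = [
    ("omrl-2", 2.205, 1.632, 1141),
    ("omrl-27", 0.521, 0.401, 280),
    ("omrl-30", 0.669, 0.536, 374),
    ("omrl-3", 0.412, 0.155, 108),
    ("omrl-6", 0.412, 0.162, 113),
    ("omrl-10", 0.344, 0.148, 103),
    ("NT control 1", 0.372, 0.143, 100),
    ("NT control 2", 0.389, 0.152, 106),
]

CONTROL_WITH_ILE = 0.143  # first NT control, assumed to be the reference value


def percent_of_control(value: float, control: float = CONTROL_WITH_ILE) -> float:
    """Activity expressed as a percentage of the reference control."""
    return 100.0 * value / control


def fraction_retained(without_ile: float, with_ile: float) -> float:
    """Fraction of activity that remains when isoleucine is present."""
    return with_ile / without_ile


for line, activity, activity_ile, reported in TABLE_1:
    computed = percent_of_control(activity_ile)
    retained = fraction_retained(activity, activity_ile)
    print(f"{line:13s} computed {computed:7.1f}  reported {reported:5d}  retained {retained:.2f}")
```

Under this assumption each computed value falls within one percentage point of the reported figure; the small differences may reflect rounding of the tabulated absorbances.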
5. Growth Responses of Stable Transformants at Different Concentrations of OMT. To determine the functionality of the omrl gene, i.e., the ability to confer resistance to lethal concentrations of O-methyl threonine (OMT), growth response studies were conducted for some bialaphos-resistant transformants with or without omrl and with varying levels of TD activity. Pre-weighed samples of selected lines were transferred to 'callus maintenance' medium with 0, 0.5 and 1.0 mM concentrations of O-methyl threonine (OMT) (Sigma, St. Louis, MO) and incubated in the dark at 28° C. Callus maintenance medium consisted of N6 salts and vitamins (Chu et al, (1978) The N6 medium and its application to anther culture of cereal crops. Proc. Symp. Plant Tissue Culture, Peking Press, 43-56), 1.0 mg/L 2,4-D, 2.5 g/L GELRITE, and 20 g/L sucrose, with a pH of 5.8. After 2 and 4 weeks of culture, fresh weight of the callus was measured. Growth responses of callus lines with and without an increase in threonine dehydratase (TD) activity are presented in Table 2. Transgenics which contained the omrl gene and showed increased TD activity (i.e., omrl-2, 27 and 30) were found to grow at lethal or sub-lethal concentrations of OMT (0.5 and 1.0 mM), although at different levels. No or very little growth was observed in the case of transgenic lines with TD enzyme activity similar to that of the non-transgenic controls as described previously. These results demonstrate that the omrl gene is functional in maize transgenics and confers resistance to lethal or sub-lethal concentrations of OMT.
Table 2. Growth responses of transgenic maize callus lines at different concentrations of OMT after 4 weeks of culture.

               Increase in Fresh Weight
Callus Line    0 mM OMT    0.5 mM OMT    1.0 mM OMT
omrl-2         2125.5      431.5         232.1
omrl-27        2858.6      907.7         981.1
omrl-30        3358.9      1777.1        311.0
omrl-3         2481.5      -37.9         -61.35
omrl-6         3209.9      -10.5         -9.2
omrl-10        3583.4      4.1           -49.9
NT control     1560.6      82.7          23.6
NT control     1252.9      -46.5         -39.8
Omrl-2, 27 and 30 have increased levels of TD activity, which are significantly different from that of the controls. These lines show resistance to OMT as shown by their growth at 0.5 and 1.0 mM OMT.
NT control (non-transgenic control) as well as omrl-3, 6, and 10 lines have low levels of TD activity, which are not significantly different from each other.
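One way to read Table 2 is to ask which lines gained more fresh weight on OMT than either non-transgenic control did at the same concentration. The sketch below is illustrative only and is not part of the original disclosure; this comparison rule is an assumption adopted for the example and is not stated to be the criterion applied by the inventors. The names are hypothetical.

```python
# line -> increase in fresh weight at (0 mM, 0.5 mM, 1.0 mM) OMT, from Table 2
GROWTH = {
    "omrl-2": (2125.5, 431.5, 232.1),
    "omrl-27": (2858.6, 907.7, 981.1),
    "omrl-30": (3358.9, 1777.1, 311.0),
    "omrl-3": (2481.5, -37.9, -61.35),
    "omrl-6": (3209.9, -10.5, -9.2),
    "omrl-10": (3583.4, 4.1, -49.9),
    "NT control 1": (1560.6, 82.7, 23.6),
    "NT control 2": (1252.9, -46.5, -39.8),
}


def resistant_lines(growth: dict, control_prefix: str = "NT control") -> list:
    """Lines whose gain exceeds the best control gain at both OMT concentrations."""
    controls = [v for name, v in growth.items() if name.startswith(control_prefix)]
    best_at_half_mm = max(c[1] for c in controls)
    best_at_one_mm = max(c[2] for c in controls)
    return sorted(
        name
        for name, (_, half_mm, one_mm) in growth.items()
        if not name.startswith(control_prefix)
        and half_mm > best_at_half_mm
        and one_mm > best_at_one_mm
    )


print(resistant_lines(GROWTH))  # ['omrl-2', 'omrl-27', 'omrl-30']
```

With this rule the three lines identified are the same three lines that showed elevated, isoleucine-insensitive TD activity in Table 1.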
Example 6: Use of omrl as a Selectable Marker for Maize Transformation
1. Production of Stable Transformants on O-Methyl Threonine Selection.
Part A. Initiation and establishment of Maize 'Type II' Callus.
Maize 'Type II' callus was used as tissue targets for transformation via helium blasting (Pareddy et al., 1997, Maize transformation via helium blasting, Maydica, 42: 143-154). 'Type II' callus cultures were initiated from immature zygotic embryos of the genotype "Hi-II" (Armstrong et al, (1991) Maize Cooperation Newsletter, pp. 92-93). Embryos were isolated from greenhouse-grown ears from crosses between Hi-II parent A and Hi-II parent B or F2 embryos derived from a self- or sib-pollination of a Hi-II plant. Immature embryos (1.5 to 3.5 mm) were cultured on initiation medium consisting of N6 salts and vitamins (Chu et al, (1978) The N6 medium and its application to anther culture of cereal crops. Proc. Symp. Plant Tissue Culture, Peking Press, 43-56), 1.0 mg/L 2,4-D, 25 mM L-proline, 100 mg/L casein hydrolysate, 10 mg/L AgNO3, 2.5 g/L GELRITE, and 20 g/L sucrose, with a pH of 5.8. Selection for Type II callus took place for ca. 2-12 weeks. After four weeks callus was subcultured onto maintenance medium (initiation medium in which AgNO3 was omitted and L-proline was reduced to 6 mM).
Part B. Precipitation of gold particles for use in helium blasting.
About 140 μg of plasmid DNA (two plasmids, i.e., pDAB1852 and pDAB305, in 1:1 molar ratio) was precipitated onto 60 mg of alcohol-rinsed, spherical gold particles (Bio-Rad 1.0 μm diameter or Aldrich 1.0-1.5 μm diameter) by adding 74 μL of 2.5 M CaCl2 and 30 μL of 0.1 M spermidine (free base) to 300 μL of plasmid DNA. The solution was immediately vortexed and the DNA-coated gold particles were allowed to settle. The resulting clear supernatant was removed and the gold particles were resuspended in 1 mL of absolute ethanol. This suspension was diluted with absolute ethanol to obtain 15 mg DNA-coated gold/mL.
Part C. Tissue Preparation and Helium Blasting.
Ca. 600 mg of embryogenic callus tissue was spread over the surface of 'Type II' callus maintenance medium as described herein lacking casein hydrolysate and L-proline, but supplemented with 0.2 M sorbitol and 0.2 M mannitol as an osmoticum. Following a 4-16 h pre-treatment, tissue was transferred to culture dishes containing blasting medium (osmotic media solidified with 20 g/L tissue culture agar (JRH Biosciences, Lenexa, KS) instead of 7 g/L GELRITE (Schweizerhall)).
Helium blasting accelerated suspended DNA-coated gold particles towards and into the prepared tissue targets. The device used was an earlier prototype of that described in US Patent #5,141,131 which is incorporated herein by reference. Tissues were covered with a stainless steel screen (230 μm openings) and placed under a partial vacuum of 25 inches of Hg in the device chamber. The DNA-coated gold particles were further diluted 1:1 with absolute ethanol prior to blasting and were accelerated at the callus target once using a helium pressure of 1500 psi, with each blast delivering 20 μL of the DNA/gold suspension.
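The quantities of gold and DNA delivered by a single blast can be estimated from the figures recited above. This is a minimal sketch, not part of the original disclosure; it assumes that all 140 μg of DNA bind uniformly to the 60 mg of gold and that "diluted 1:1" means mixing with an equal volume of ethanol (halving the concentration). The function name is hypothetical.

```python
def per_blast_amounts(
    dna_ug: float = 140.0,
    gold_mg: float = 60.0,
    stock_mg_per_ml: float = 15.0,
    dilution_factor: float = 2.0,
    blast_volume_ul: float = 20.0,
) -> tuple:
    """Return (mg of gold, ug of DNA) delivered in one blast under the stated assumptions."""
    working_mg_per_ml = stock_mg_per_ml / dilution_factor
    gold_per_blast_mg = working_mg_per_ml * blast_volume_ul / 1000.0
    dna_per_mg_gold_ug = dna_ug / gold_mg
    return gold_per_blast_mg, gold_per_blast_mg * dna_per_mg_gold_ug


gold_mg_per_blast, dna_ug_per_blast = per_blast_amounts()
print(round(gold_mg_per_blast, 3), "mg gold per blast")
print(round(dna_ug_per_blast, 3), "ug DNA per blast")
```

On these assumptions each 20 μL blast carries about 0.15 mg of gold and about 0.35 μg of plasmid DNA; actual delivery would be lower to the extent that DNA binding to the gold is incomplete.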
Part D. Callus Selection.
Immediately post-blasting, the tissue was transferred to osmotic media for a 16-24 h recovery period. Afterwards, the tissue was divided into small pieces and transferred to selection medium (maintenance medium lacking casein hydrolysate and L-proline but having 0.5 mM concentration of O-methyl threonine (Sigma, St. Louis, MO)). Every three weeks for 3 months, tissue pieces were non-selectively transferred to fresh selection medium containing either 0.5 or 1.0 mM OMT. After 6-8 weeks, callus sectors found proliferating against a background of growth-inhibited tissue were removed and isolated. The resulting OMT-resistant tissue was subcultured biweekly onto fresh selection medium.
2. Southern Analysis Of Transgenic Maize Callus Lines Recovered On OMT Selection.
To determine the presence of intact copies of the omrl gene in the transgenic callus lines selected on OMT, Southern analysis was employed as described herein. Callus samples of transgenic colonies produced on OMT selection and negative controls (non-transformed callus) were collected and rinsed in deionized water for 30 minutes prior to lyophilization. Genomic DNA from the callus material was prepared from lyophilized tissue as described by Saghai-Maroof et al. (1984, Proc. Natl. Acad. Sci. USA 81:8014) .
Eight micrograms of DNA from the callus tissue was digested with the restriction enzymes NcoI and EcoRI using conditions suggested by the manufacturer (Bethesda Research Laboratory, Gaithersburg, MD) and separated by agarose gel electrophoresis; this digestion should result in a 1.4 kb hybridization product when hybridized with the radiolabeled probe specific for the omrl coding sequence. The DNA was blotted onto nylon membranes as described by Southern (1975, J. Mol. Biol. 98:503). Radiolabeled probe DNA was hybridized to the genomic DNA on the blots using 50 ml of minimal hybridization buffer (10% polyethylene glycol, 7% sodium dodecyl sulfate, 0.6x SSC, 10 mM sodium phosphate, 5 mM EDTA and 100 mg/ml denatured salmon sperm DNA), which was heated to 60° C and mixed with the denatured radiolabeled probe; hybridization was carried out at 60° C. The blots were washed at 60° C in 0.25X SSC and 2% SDS for 45 minutes, blotted dry and exposed to XAR-5 film with two intensifying screens overnight.
The expected 1.4 kb hybridization product was detected in transgenic lines as shown in Table 3.
3. Biochemical Analysis of Mutated Threonine Dehydratase/Deaminase in Transgenic Lines Selected on L-O-Methyl Threonine
As described herein, callus was produced which was transformed with the Arabidopsis thaliana mutated
threonine dehydratase/deaminase (TD) gene, denoted omrl, by cobombardment with pDAB1852 (plasmid containing BAR and omrl) and pDAB305 (plasmid containing GUS). Maize callus material was selected on L-O-methylthreonine and analyzed for threonine dehydratase/deaminase activity in the presence and absence of isoleucine. Threonine dehydratase/deaminase assays, conducted as described herein, were performed on extracted proteins from each individual callus line, normalized for protein concentration (BioRAD Protein assay, Hercules, CA).
Analysis was performed using threonine as the substrate, as described previously, and is shown in Table 3. Natural variations in TD activity were determined using the nontransformed callus line as a control.
Table 3. Increased, isoleucine-insensitive threonine dehydratase/deaminase activity in maize callus co-bombarded with pDAB1852 (BAR + OMRl) and pDAB305 (GUS) and selected on L-O-methylthreonine.

Callus Line    TD Activity(a)    Isoleucine(b)    Percent Control(c)    Southern(d)
NT-control     0.221             0.142            100                   -
NT-control     0.220             0.151            106                   -
omt-01         0.429             0.363            255                   +
omt-02         0.180             0.136            96                    ND
omt-03         0.348             0.301            210                   +
omt-04         0.166             0.138            97                    ND
omt-05         0.182             0.129            91                    +
omt-06         0.227             0.167            118                   +
omt-07         0.246             0.188            132                   ND
omt-08         0.421             0.355            250                   +
omt-09         0.327             0.289            203                   ND
omt-10         0.164             0.128            90                    ND
omt-11         0.176             0.129            91                    ND
omt-12         0.737             0.612            431                   ND
omt-13         0.300             0.253            178                   ND
omt-14         0.211             0.148            104                   ND

(a) Threonine dehydratase/deaminase (TD) activity was measured by absorbance at 515 nm. (b) The effect of 2 mM isoleucine on TD activity. (c) Percent change of TD activity relative to the control. (d) Southern analysis: presence of omrl gene (+), absence of omrl gene (-), or no band determined (ND).
A clear difference in TD activity in the presence and absence of isoleucine was observed among the lines. Eight callus lines were insensitive to isoleucine and overall showed an increase in TD activity as compared to the control lines and callus lines omt-02, 04, 05, 10, 11, and 14. The results described above demonstrate that transformation of maize callus with omrl increased the overall TD activity and rendered that activity insensitive to isoleucine.
4. Histochemical GUS Assay of Transgenic Maize Callus Lines Recovered on OMT Selection
As described herein, the transgenic lines produced here were transformed with pDAB1852 and pDAB305 containing the GUS reporter gene, which is the gene of interest in this study. To determine the expression of this gene in the transformed lines selected on OMT, callus samples of each line were subjected to histochemical GUS analysis (Jefferson, 1987, Plant Mol. Biol. Rep. 5:387-405). Briefly, tissues were placed in 24-well microtiter plates (Corning) containing 500 μL of assay buffer [0.1 M sodium phosphate, pH 8.0, 0.5 mM' potassium ferricyanide, 0.5 mM potassium ferrocyanide, 10 mM sodium EDTA, 1.9 mM 5-bromo-4-chloro-3-indolyl-beta-D- glucuronide, and 0.06% TRITON X-100] per well and incubated in the dark for 1-2 days at 37° C before analysis. GUS expression was observed as blue spots on callus or intense blue of entire callus under a microscope and are presented in Table 4. Four transgenic lines displayed GUS expression. These results provide additional proof that the transgenics selected on OMT are true transformants and thus, the omrl gene can be used as a selectable marker, in conjunction with OMT as the selection agent, for transformation of maize and possibly other monocot species.
Table 4: GUS expression in maize transgenic lines transformed with the omr1 gene and selected on O-methyl threonine.
Transgenic Line    Histochemical GUS Expression (a)
OMT-01             -
OMT-02             -
OMT-03             -
OMT-04             -
OMT-05             +
OMT-06             +
OMT-07             -
OMT-08             +
OMT-09             -
OMT-10             -
OMT-11             -
OMT-12             +
OMT-13             -
OMT-14             -
(a) GUS expression present (+) or absent (-).
Example 7: Use of omr1 as a Selectable Marker for Rice Transformation
1. Cloning and Plasmid Construction
Details of the cloning of long or full length omr1, sequence information, subcloning, and descriptions of the plasmids used in this study, i.e., pDAB1850 and pDAB1518, are described herein.
2. Production of Rice Transgenic Lines
Part A. Callus Initiation and Maintenance.
For initiation of embryogenic callus, mature seeds of a Japonica cultivar, Taipei 309, were dehusked and surface-sterilized in 70% ethanol for 2-5 min, followed by a 30-45 min soak in 50% commercial bleach (2.6% sodium hypochlorite) with a few drops of 'Liquinox' soap. The seeds were then rinsed 3 times in sterile distilled water and placed on filter paper before being transferred to 'callus induction' medium (i.e., NB). The NB medium consisted of N6 macro elements (Chu, 1978, The N6 medium and its application to anther culture of cereal crops. Proc. Symp. Plant Tissue Culture, Peking Press, p43-56), B5 micro elements and vitamins (Gamborg et al., 1968, Nutrient requirements of suspension cultures of soybean root cells. Exp. Cell Res. 50: 151-158), 300 mg/L casein hydrolysate, 500 mg/L L-proline, 500 mg/L L-glutamine, 30 g/L sucrose, 2 mg/L 2,4-dichlorophenoxyacetic acid (2,4-D), and 2.5 g/L gelrite (Schweizerhall, NJ), with the pH adjusted to 5.8. The mature seeds cultured on 'induction' medium were incubated in the dark at 28°C. After 3 weeks of culture, the emerging primary callus induced from the scutellar region of the mature embryo was transferred to fresh NB medium for further maintenance.
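The per-litre supplement concentrations given for the NB medium can be scaled to any batch volume. The following is a minimal sketch of that arithmetic only; the function name and the 250 mL batch size are illustrative assumptions, and the N6 macro elements and B5 micro elements and vitamins are not included because their individual concentrations are not listed here.

```python
# Supplements of the NB callus-induction medium as listed in the text (mg per litre).
# N6 macro elements and B5 micro elements/vitamins are omitted: their individual
# concentrations are given in the cited references, not in this passage.
NB_SUPPLEMENTS_MG_PER_L = {
    "casein hydrolysate": 300.0,
    "L-proline": 500.0,
    "L-glutamine": 500.0,
    "sucrose": 30000.0,   # 30 g/L
    "2,4-D": 2.0,
    "gelrite": 2500.0,    # 2.5 g/L
}


def amounts_for_volume(volume_ml):
    """Return the mass in mg of each supplement needed for volume_ml of medium."""
    if volume_ml <= 0:
        raise ValueError("volume must be positive")
    litres = volume_ml / 1000.0
    return {name: mg_per_l * litres
            for name, mg_per_l in NB_SUPPLEMENTS_MG_PER_L.items()}


if __name__ == "__main__":
    # Hypothetical batch size chosen only for illustration.
    for name, mg in amounts_for_volume(250).items():
        print(f"{name}: {mg:g} mg")
```

The pH adjustment to 5.8 is a separate bench step and is not modelled in this sketch.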
Part B. Precipitation of gold particles for use in helium blasting.
About 70 μg of plasmid DNA (pDAB1850 and pDAB1518) was precipitated onto 30 mg of 1.0 micron gold particles (Bio-Rad) as described herein.
Part C. Tissue Preparation and Helium Blasting.
For helium blasting, actively growing embryogenic callus cultures, 2-4 mm in size, were subjected to a high osmoticum treatment. This treatment consisted of placing the callus on NB medium with 0.2 M mannitol and 0.2 M sorbitol (Vain et al., 1993, Osmoticum treatment enhances particle bombardment-mediated transient and stable transformation of maize. Plant Cell Rep. 12: 84-88) for 4 h before helium blasting. Following osmoticum treatment, callus cultures were transferred to 'blasting' medium (NB + 2% agar) and covered with a stainless steel screen (230 micron). The callus cultures were blasted twice per target at a helium pressure of 2,000 psi.
Part D. Callus Selection.
After blasting, callus was transferred back to the high osmoticum medium overnight before being placed on selection medium, which consisted of NB medium with 1.5 mM O-methyl threonine (OMT). The cultures were incubated in the dark at 28°C. Every 2 weeks, the cultures were transferred to fresh selection medium with the same concentration of OMT (i.e., 1.5 mM).
3. Biochemical Analysis of Mutated Threonine dehydratase/deaminase Selected on L-O-methylthreonine
As described herein, callus was produced which was transformed with the Arabidopsis thaliana mutated threonine dehydratase/deaminase (TD) gene, denoted omr1, by co-bombardment with pDAB1850 (plasmid containing omr1) and pDAB1518 (plasmid containing GUS). Rice callus material was selected on L-O-methylthreonine and analyzed for threonine dehydratase/deaminase activity. The TD assay, carried out as described herein, was performed on proteins extracted from each individual callus line and normalized for protein concentration (Bio-Rad Protein Assay, Hercules, CA). Analysis was performed using threonine as the substrate, as described previously, and the results are shown in Table 5.
Natural variations in TD activity were determined using the nontransformed callus line as a control.
Table 5. Increased threonine dehydratase/deaminase activity and GUS expression in rice callus co-bombarded with pDAB1850 (OMR1) and pDAB1518 (GUS), and selected on L-O-methyl threonine.
Callus Line    TD Activity (a)    GUS Expression (c)
OMT-01         0.381              +
OMT-02         0.520              +
OMT-03         0.484              +
OMT-04         0.761              +
OMT-05         0.456              +
OMT-06         0.528              +
OMT-07         0.656              +
OMT-08         0.179              -
OMT-09         0.343              -
OMT-10         0.231              -
OMT-11         0.430              -
NT (b)         0.203              ND
NT             0.246              ND
NT             0.183              ND
NT             0.214              ND
(a) Threonine dehydratase/deaminase (TD) activity was measured by absorbance at 515 nm.
(b) NT denotes lines not transformed.
(c) GUS expression present (+) or absent (-).
Seven callus lines had increased TD activity, which correlated with GUS activity. Four callus lines did not have GUS activity; however, two of these lines did have increased TD activity compared to the nontransformed control lines. The results described above demonstrate that transformation of rice callus with omr1 increased the overall TD activity.
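The comparison summarized above can be reproduced from the Table 5 values. The sketch below is illustrative only: the criterion used for "increased" activity (a reading above the highest nontransformed control) and the function names are assumptions made here, not a statistical method stated in this description.

```python
from statistics import mean

# (TD activity as absorbance at 515 nm, GUS-positive?) transcribed from Table 5.
TRANSFORMED = {
    "OMT-01": (0.381, True),
    "OMT-02": (0.520, True),
    "OMT-03": (0.484, True),
    "OMT-04": (0.761, True),
    "OMT-05": (0.456, True),
    "OMT-06": (0.528, True),
    "OMT-07": (0.656, True),
    "OMT-08": (0.179, False),
    "OMT-09": (0.343, False),
    "OMT-10": (0.231, False),
    "OMT-11": (0.430, False),
}
# The four nontransformed (NT) control readings from Table 5.
CONTROLS = [0.203, 0.246, 0.183, 0.214]


def elevated_lines(lines, controls):
    """Names of lines whose TD activity exceeds every control reading.

    Assumed criterion for illustration; the text does not define a threshold.
    """
    ceiling = max(controls)
    return [name for name, (td, _gus) in lines.items() if td > ceiling]


def fold_over_control_mean(lines, controls):
    """TD activity of each line divided by the mean control activity."""
    baseline = mean(controls)
    return {name: td / baseline for name, (td, _gus) in lines.items()}


if __name__ == "__main__":
    elevated = elevated_lines(TRANSFORMED, CONTROLS)
    gus_negative = [n for n in elevated if not TRANSFORMED[n][1]]
    print("Elevated TD activity:", ", ".join(elevated))
    print("Elevated but GUS-negative:", ", ".join(gus_negative))
    for name, fold in fold_over_control_mean(TRANSFORMED, CONTROLS).items():
        print(f"{name}: {fold:.2f}x control mean")
```

Under this assumed criterion, all seven GUS-positive lines and two of the four GUS-negative lines read above the control range, which matches the counts given in the text.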
4. Histochemical GUS Assay of Transgenic Rice Callus Lines Recovered on OMT Selection
As described herein, the transgenic lines produced here were transformed with pDAB1850 and pDAB1518, the latter containing the GUS reporter gene. To determine expression of this gene in the transformed lines selected on OMT, callus samples of each line were subjected to histochemical GUS analysis (Jefferson, 1987, Plant Mol. Biol. Rep. 5:387-405) as described herein. The results in Table 5 provide clear evidence that the transgenics selected on OMT are true transformants and contain the gene of interest, i.e., the GUS reporter gene. Thus, omr1 can also be used as a selectable marker in rice and other monocots.
Example 8: The Molecular Basis of L-O-Methylthreonine Resistance Encoded by the omr1 Allele of Line GM11b of Arabidopsis thaliana
1. Isolation of the wild type OMR1 allele:
An Arabidopsis thaliana Columbia wild type cDNA library, constructed from 3-day-old seedlings in Stratagene's λ ZAP II vector, was screened using as a probe a 32P-labeled 1080 base pair DNA fragment PCR-amplified from the cDNA sequence of omr1 (described above). The screening yielded a positive clone, TD54, which was purified and shown to be the wild type allele OMR1 by PCR and Southern analysis.
2. Sequencing of the OMR1 wild type allele:
The recombinant plasmid containing the wild type allele OMR1 was named pGM-td54, and the OMR1 allele was manually sequenced using the Sequenase kit of USB and the same set of oligonucleotide primers that were previously used in sequencing the omr1 allele. The DNA sequence of the wild type OMR1 was similar to that of omr1 except for two different base substitutions predicting two amino acid substitutions in the mutated TD encoded by omr1. In an attempt to clone the 5' upstream sequences from the ATG start codon of clone 23 (Figure 5) using a PCR approach, a new ATG codon was detected 141 nucleotides upstream from the ATG codon reported in clone 23. This was confirmed in both the wild type allele OMR1 and the mutated allele omr1. Therefore the full length cDNA of the omr1 locus was found to be 1779 nucleotides (Figure 7) encoding a TD protein of 592 amino acids (Figures 8 and 9). The omr1 insert as shown in Figure 6b (SEQ ID NOS: 5-6) was not only strongly expressed in the first transgenic plants (T1) but was also inherited and strongly expressed in their progeny (12 plants). As expected, the full length cDNA of the OMR1 allele of the omr1 locus was 1779 nucleotides (Figure 10) encoding a wild type TD of 592 amino acids (Figures 11 and 12).
Amino acid alignment of wild type threonine dehydratase/deaminase of Arabidopsis thaliana with that of chickpea (John et al., 1995), tomato (Samach et al., 1991), potato (Hildmann T, Ebneth M, Pena-Cortes H, Sanchez-Serrano JJ, Willmitzer L, Prat S (1992) General roles of abscisic and jasmonic acids in gene activation as a result of mechanical wounding. Plant Cell 4:1157-1170), yeast 1 (Kielland-Brandt MC, Holmberg S, Petersen JGL, Nilsson-Tillgren T (1984) Nucleotide sequence of the gene for threonine deaminase (ilv) of Saccharomyces cerevisiae. Carlsberg Res Commun 49:567-575), yeast 2 (Bornaes C, Petersen JG, Holmberg S (1992) Serine and threonine catabolism in Saccharomyces cerevisiae: the CHA1 polypeptide is homologous with other serine and threonine dehydratases. Genetics 131:531-539), E. coli biosynthetic (Wek RC, Hatfield GC (1986) Nucleotide sequence and in vivo expression of ilvY and ilvC genes in Escherichia coli K12. Transcription from divergent overlapping promoters. J Biol Chem 261:2441-2450), E. coli catabolic (Datta P, Goss TJ, Omnaas JR, Patil RV (1987) Covalent structure of biodegradative threonine dehydratase of Escherichia coli: homology with other dehydratases. Proc Natl Acad Sci USA 84:393-397), and Salmonella typhimurium (Taillon BE, Little R, Lawther RP (1988) Analysis of the functional domains of biosynthetic threonine deaminase by comparison of the amino acid sequences of three wild type alleles to the amino acid of biodegradative threonine deaminase. Gene 62:245-252) is set forth in Figure 13. The Megalign program of the Lasergene software (DNASTAR Inc., Madison, Wisconsin) was used.
The degree of similarity between amino acid residues of Arabidopsis threonine dehydratase/deaminase and those of other organisms was calculated by the Lipman-Pearson protein alignment method using the Lasergene software and was found to be 46.2% with chickpea, 52.7% with tomato, 55.0% with potato (partial), 45.0% with yeast 1, 24.7% with yeast 2, 43.4% with E. coli (biosynthetic), 39.3% with E. coli (catabolic) and 43.3% with Salmonella.
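The relationship between the 1779-nucleotide coding sequence and the 592-residue protein reported above is simple reading-frame arithmetic. The sketch below illustrates it; the function names are hypothetical, and it assumes that the final codon is a stop codon (as the tga at positions 1777-1779 of the sequence listing suggests) and that the upstream ATG is in frame with the ATG of clone 23.

```python
def protein_length(cds_nt, has_stop_codon=True):
    """Number of amino acids encoded by a coding sequence of cds_nt nucleotides."""
    if cds_nt <= 0 or cds_nt % 3 != 0:
        raise ValueError("coding sequence length must be a positive multiple of 3")
    codons = cds_nt // 3
    # The stop codon is counted in the nucleotide length but encodes no residue.
    return codons - 1 if has_stop_codon else codons


def in_frame_codons(offset_nt):
    """Number of whole codons spanned by an in-frame offset of offset_nt nucleotides."""
    if offset_nt % 3 != 0:
        raise ValueError("offset is not in frame")
    return offset_nt // 3


if __name__ == "__main__":
    print(protein_length(1779))   # residues encoded by the full length cDNA
    print(in_frame_codons(141))   # codons added by the upstream ATG
```

With these inputs the first call gives 592 residues and the second gives 47 codons.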
3. Comparing DNA sequences of omr1 and OMR1 revealed the point mutations involved:
With reference to the nucleotide residue numbering in SEQ ID NO:1 and SEQ ID NO:3, the first base substitution occurred at nucleotide 1519, where C (cytosine) in the wild type allele OMR1 was substituted by T (thymine) in the mutated allele omr1 (Figures 14 & 15). This base substitution predicted an amino acid substitution at amino acid residue 452 at the polypeptide level, where the arginine residue in the wild type TD encoded by OMR1 was substituted by a cysteine residue in the mutated isoleucine-insensitive TD encoded by omr1 (Figure 15). This point mutation resides in a conserved regulatory region of amino acids designated R4 (regulatory) by Taillon et al. (1988), where the mutated amino acid is normally an arginine residue in the TD of Arabidopsis, yeast 1, E. coli (biosynthetic) and Salmonella and a lysine residue in the TD of chickpea, tomato, and potato (partial) (Figure 16). The second base substitution occurred at nucleotide 1655, where G (guanine) in the wild type allele OMR1 was substituted by A (adenine) in the mutated allele omr1 (Figures 17 & 18). This base substitution predicted an amino acid substitution at residue 597 at the polypeptide level, where the arginine residue in the wild type TD encoded by OMR1 was substituted by a histidine residue in the mutated isoleucine-insensitive TD encoded by omr1 (Figure 18). This point mutation resides in a conserved regulatory region of amino acids designated R6 (regulatory) by Taillon et al. (1988), where the mutated amino acid is normally an arginine residue in the TD of Arabidopsis, chickpea, tomato, potato (partial), yeast 1, E. coli (biosynthetic) and Salmonella (Figure 19).
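The two predicted amino acid substitutions can be checked against the standard genetic code. The following is a minimal sketch: it assumes, from the codons as they appear in the sequence listing, that both affected arginine codons are cgt in the wild type, that the C to T change falls on the first codon base and that the G to A change falls on the second. The helper names are hypothetical.

```python
BASES = "tcag"
# Standard genetic code, one-letter amino acids, codons ordered t, c, a, g
# at each of the three positions ("*" marks a stop codon).
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def substitute(codon, position, new_base):
    """Return codon with the base at position (1, 2 or 3) replaced by new_base."""
    if position not in (1, 2, 3) or new_base not in BASES:
        raise ValueError("invalid position or base")
    return codon[:position - 1] + new_base + codon[position:]


def describe(codon, position, new_base):
    """Summarize the codon change and the amino acid change it predicts."""
    mutant = substitute(codon, position, new_base)
    return f"{codon}->{mutant}: {CODON_TABLE[codon]}->{CODON_TABLE[mutant]}"


if __name__ == "__main__":
    print(describe("cgt", 1, "t"))  # C to T at the first base: Arg to Cys
    print(describe("cgt", 2, "a"))  # G to A at the second base: Arg to His
```

Both changes are single-base transitions, each replacing the arginine encoded by cgt with a different residue.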
Numerous modifications and variations in practice of the invention are expected to occur to those skilled in the art upon consideration of the foregoing detailed description of the invention. Consequently, such modifications and variations are intended to be included within the scope of the following claims.
SEQUENCE LISTING
<110> Mourad, George
      Merlo, Donald
      Pareddy, Dayakar
      Larrinua, Ignacio
<120> Methods and Compositions for Producing Plants and Microorganisms that Express Feedback Insensitive Threonine Dehydratase Deaminase
<130> 65011
<160> 26
<170> FastSEQ for Windows Version 3.0
<210> 1
<211> 1779
<212> DNA
<213> Unknown
<220>
<221> CDS
<222> (1)...(1779)
<400> 1
atg aat tcc gtt cag ctt ccg acg gcg caa tcc tct ctc cgt agc cac 48
Met Asn Ser Val Gln Leu Pro Thr Ala Gln Ser Ser Leu Arg Ser His
1 5 10 15
att cac cgt cca tca aaa cca gtg gtc gga ttc act cac ttc tcc tcc 96
Ile His Arg Pro Ser Lys Pro Val Val Gly Phe Thr His Phe Ser Ser
20 25 30
cgt tct cgg atc gca gtg gcg gtt ctg tcc cga gat gaa aca tct atg 144
Arg Ser Arg Ile Ala Val Ala Val Leu Ser Arg Asp Glu Thr Ser Met
35 40 45
act cca ccg cct cca aag ctt cct tta cca cgt ctt aag gtc tct ccg 192
Thr Pro Pro Pro Pro Lys Leu Pro Leu Pro Arg Leu Lys Val Ser Pro
50 55 60
aat tcg ttg caa tac cct gcc ggt tac ctc ggt gct gta cca gaa cgt 240
Asn Ser Leu Gln Tyr Pro Ala Gly Tyr Leu Gly Ala Val Pro Glu Arg
65 70 75 80
acg aac gag gct gag aac gga agc atc gcg gaa gct atg gag tat ttg 288
Thr Asn Glu Ala Glu Asn Gly Ser Ile Ala Glu Ala Met Glu Tyr Leu
85 90 95
acg aat ata ctg tcc act aag gtt tac gac atc gcc att gag tca cca 336
Thr Asn Ile Leu Ser Thr Lys Val Tyr Asp Ile Ala Ile Glu Ser Pro
100 105 110
ctc caa ttg gct aag aag cta tct aag aga tta ggt gtt cgt atg tat 384
Leu Gln Leu Ala Lys Lys Leu Ser Lys Arg Leu Gly Val Arg Met Tyr
115 120 125
ctt aaa aga gaa gac ttg caa cct gta ttc tcg ttt aag ctt cgt gga 432
Leu Lys Arg Glu Asp Leu Gln Pro Val Phe Ser Phe Lys Leu Arg Gly
130 135 140
gct tac aat atg atg gtg aaa ctt cca gca gat caa ttg gca aaa gga 480
Ala Tyr Asn Met Met Val Lys Leu Pro Ala Asp Gln Leu Ala Lys Gly
145 150 155 160
gtt atc tgc tct tca gct gga aac cat gct caa gga gtt gct tta tct 528
Val Ile Cys Ser Ser Ala Gly Asn His Ala Gln Gly Val Ala Leu Ser
165 170 175
gct agt aaa ctc ggc tgc act gct gtg att gtt atg cct gtt acg act 576
Ala Ser Lys Leu Gly Cys Thr Ala Val Ile Val Met Pro Val Thr Thr
180 185 190
cct gag ata aag tgg caa gct gta gag aat ttg ggt gca acg gtt gtt 624
Pro Glu Ile Lys Trp Gln Ala Val Glu Asn Leu Gly Ala Thr Val Val
195 200 205
ctt ttc gga gat tcg tat gat caa gca caa gca cat gct aag ata cga 672
Leu Phe Gly Asp Ser Tyr Asp Gln Ala Gln Ala His Ala Lys Ile Arg
210 215 220
gct gaa gaa gag ggt ctg acg ttt ata cct cct ttt gat cac cct gat 720
Ala Glu Glu Glu Gly Leu Thr Phe Ile Pro Pro Phe Asp His Pro Asp
225 230 235 240
gtt att gct gga caa ggg act gtt ggg atg gag atc act cgt cag gct 768
Val Ile Ala Gly Gln Gly Thr Val Gly Met Glu Ile Thr Arg Gln Ala
245 250 255
aag ggt cca ttg cat gct ata ttt gtg cca gtt ggt ggt ggt ggt tta 816
Lys Gly Pro Leu His Ala Ile Phe Val Pro Val Gly Gly Gly Gly Leu
260 265 270
ata gct ggt att gct gct tat gtg aag agg gtt tct ccc gag gtg aag 864
Ile Ala Gly Ile Ala Ala Tyr Val Lys Arg Val Ser Pro Glu Val Lys
275 280 285
atc att ggt gta gaa cca gct gac gca aat gca atg gct ttg tcg ctg 912
Ile Ile Gly Val Glu Pro Ala Asp Ala Asn Ala Met Ala Leu Ser Leu
290 295 300
cat cac ggt gag agg gtg ata ttg gac cag gtt ggg gga ttt gca gat 960
His His Gly Glu Arg Val Ile Leu Asp Gln Val Gly Gly Phe Ala Asp
305 310 315 320
ggt gta gca gtt aaa gaa gtt ggt gaa gag act ttt cgt ata agc aga 1008
Gly Val Ala Val Lys Glu Val Gly Glu Glu Thr Phe Arg Ile Ser Arg
325 330 335
aat cta atg gat ggt gtt gtt ctt gtc act cgt gat gct att tgt gca 1056
Asn Leu Met Asp Gly Val Val Leu Val Thr Arg Asp Ala Ile Cys Ala
340 345 350
tca ata aag gat atg ttt gag gag aaa cgg aac ata ttg gaa cca gca 1104
Ser Ile Lys Asp Met Phe Glu Glu Lys Arg Asn Ile Leu Glu Pro Ala
355 360 365
ggg gct ctt gca ctc gct gga gct gag gca tac tgt aaa tat tat ggc 1152
Gly Ala Leu Ala Leu Ala Gly Ala Glu Ala Tyr Cys Lys Tyr Tyr Gly
370 375 380
cta aag gac gtg aat gtc gta gcc ata acc agt ggc gct aac atg aac 1200
Leu Lys Asp Val Asn Val Val Ala Ile Thr Ser Gly Ala Asn Met Asn
385 390 395 400
ttt gac aag cta agg att gtg aca gaa ctc gcc aat gtc ggt agg caa 1248
Phe Asp Lys Leu Arg Ile Val Thr Glu Leu Ala Asn Val Gly Arg Gln
405 410 415
cag gaa gct gtt ctt gct act ctc atg ccg gaa aaa cct gga agc ttt 1296
Gln Glu Ala Val Leu Ala Thr Leu Met Pro Glu Lys Pro Gly Ser Phe
420 425 430
aag caa ttt tgt gag ctg gtt gga cca atg aac ata agc gag ttc aaa 1344
Lys Gln Phe Cys Glu Leu Val Gly Pro Met Asn Ile Ser Glu Phe Lys
435 440 445
tat aga tgt agc tcg gaa aag gag gct gtt gta cta tac agt gtc gga 1392
Tyr Arg Cys Ser Ser Glu Lys Glu Ala Val Val Leu Tyr Ser Val Gly
450 455 460
gtt cac aca gct gga gag ctc aaa gca cta cag aag aga atg gaa tct 1440
Val His Thr Ala Gly Glu Leu Lys Ala Leu Gln Lys Arg Met Glu Ser
465 470 475 480
tct caa ctc aaa act gtc aat ctc act acc agt gac tta gtg aaa gat 1488
Ser Gln Leu Lys Thr Val Asn Leu Thr Thr Ser Asp Leu Val Lys Asp
485 490 495
cac ctg cgt tac ttg atg gga gga aga tct act gtt gga gac gag gtt 1536
His Leu Arg Tyr Leu Met Gly Gly Arg Ser Thr Val Gly Asp Glu Val
500 505 510
cta tgc cga ttc acc ttt ccc gag aga cct ggt gct cta atg aac ttc 1584
Leu Cys Arg Phe Thr Phe Pro Glu Arg Pro Gly Ala Leu Met Asn Phe
515 520 525
ttg gac tct ttc agt cca cgg tgg aac atc acc ctt ttc cat tac cgt 1632
Leu Asp Ser Phe Ser Pro Arg Trp Asn Ile Thr Leu Phe His Tyr Arg
530 535 540
gga cag ggt gag acg ggc gcg aat gtg ctg gtc ggg atc caa gtc ccc 1680
Gly Gln Gly Glu Thr Gly Ala Asn Val Leu Val Gly Ile Gln Val Pro
545 550 555 560
gag caa gaa atg gag gaa ttt aaa aac cga gct aaa gct ctt gga tac 1728
Glu Gln Glu Met Glu Glu Phe Lys Asn Arg Ala Lys Ala Leu Gly Tyr
565 570 575
gac tac ttc tta gta agt gat gac gac tat ttt aag ctt ctg atg cac 1776
Asp Tyr Phe Leu Val Ser Asp Asp Asp Tyr Phe Lys Leu Leu Met His
580 585 590
tga 1779
<210> 2
<211> 592
<212> PRT
<213> Unknown
<400> 2
Met Asn Ser Val Gln Leu Pro Thr Ala Gln Ser Ser Leu Arg Ser His
1 5 10 15
Ile His Arg Pro Ser Lys Pro Val Val Gly Phe Thr His Phe Ser Ser
20 25 30
Arg Ser Arg Ile Ala Val Ala Val Leu Ser Arg Asp Glu Thr Ser Met
35 40 45
Thr Pro Pro Pro Pro Lys Leu Pro Leu Pro Arg Leu Lys Val Ser Pro
50 55 60
Asn Ser Leu Gln Tyr Pro Ala Gly Tyr Leu Gly Ala Val Pro Glu Arg
65 70 75 80
Thr Asn Glu Ala Glu Asn Gly Ser Ile Ala Glu Ala Met Glu Tyr Leu
85 90 95
Thr Asn Ile Leu Ser Thr Lys Val Tyr Asp Ile Ala Ile Glu Ser Pro
100 105 110
Leu Gln Leu Ala Lys Lys Leu Ser Lys Arg Leu Gly Val Arg Met Tyr
115 120 125
Leu Lys Arg Glu Asp Leu Gln Pro Val Phe Ser Phe Lys Leu Arg Gly
130 135 140
Ala Tyr Asn Met Met Val Lys Leu Pro Ala Asp Gln Leu Ala Lys Gly
145 150 155 160
Val Ile Cys Ser Ser Ala Gly Asn His Ala Gln Gly Val Ala Leu Ser
165 170 175
Ala Ser Lys Leu Gly Cys Thr Ala Val Ile Val Met Pro Val Thr Thr
180 185 190
Pro Glu Ile Lys Trp Gln Ala Val Glu Asn Leu Gly Ala Thr Val Val
195 200 205
Leu Phe Gly Asp Ser Tyr Asp Gln Ala Gln Ala His Ala Lys Ile Arg
210 215 220
Ala Glu Glu Glu Gly Leu Thr Phe Ile Pro Pro Phe Asp His Pro Asp
225 230 235 240
Val Ile Ala Gly Gln Gly Thr Val Gly Met Glu Ile Thr Arg Gln Ala
245 250 255
Lys Gly Pro Leu His Ala Ile Phe Val Pro Val Gly Gly Gly Gly Leu
260 265 270
Ile Ala Gly Ile Ala Ala Tyr Val Lys Arg Val Ser Pro Glu Val Lys
275 280 285
Ile Ile Gly Val Glu Pro Ala Asp Ala Asn Ala Met Ala Leu Ser Leu
290 295 300
His His Gly Glu Arg Val Ile Leu Asp Gln Val Gly Gly Phe Ala Asp
305 310 315 320
Gly Val Ala Val Lys Glu Val Gly Glu Glu Thr Phe Arg Ile Ser Arg
325 330 335
Asn Leu Met Asp Gly Val Val Leu Val Thr Arg Asp Ala Ile Cys Ala
340 345 350
Ser Ile Lys Asp Met Phe Glu Glu Lys Arg Asn Ile Leu Glu Pro Ala
355 360 365
Gly Ala Leu Ala Leu Ala Gly Ala Glu Ala Tyr Cys Lys Tyr Tyr Gly
370 375 380
Leu Lys Asp Val Asn Val Val Ala Ile Thr Ser Gly Ala Asn Met Asn
385 390 395 400
Phe Asp Lys Leu Arg Ile Val Thr Glu Leu Ala Asn Val Gly Arg Gln
405 410 415
Gln Glu Ala Val Leu Ala Thr Leu Met Pro Glu Lys Pro Gly Ser Phe
420 425 430
Lys Gln Phe Cys Glu Leu Val Gly Pro Met Asn Ile Ser Glu Phe Lys
435 440 445
Tyr Arg Cys Ser Ser Glu Lys Glu Ala Val Val Leu Tyr Ser Val Gly
450 455 460
Val His Thr Ala Gly Glu Leu Lys Ala Leu Gln Lys Arg Met Glu Ser
465 470 475 480
Ser Gln Leu Lys Thr Val Asn Leu Thr Thr Ser Asp Leu Val Lys Asp
485 490 495
His Leu Arg Tyr Leu Met Gly Gly Arg Ser Thr Val Gly Asp Glu Val
500 505 510
Leu Cys Arg Phe Thr Phe Pro Glu Arg Pro Gly Ala Leu Met Asn Phe
515 520 525
Leu Asp Ser Phe Ser Pro Arg Trp Asn Ile Thr Leu Phe His Tyr Arg
530 535 540
Gly Gln Gly Glu Thr Gly Ala Asn Val Leu Val Gly Ile Gln Val Pro
545 550 555 560
Glu Gln Glu Met Glu Glu Phe Lys Asn Arg Ala Lys Ala Leu Gly Tyr
565 570 575
Asp Tyr Phe Leu Val Ser Asp Asp Asp Tyr Phe Lys Leu Leu Met His
580 585 590
<210> 3
<211> 1779
<212> DNA
<213> Unknown
<220>
<221> CDS
<222> (1)...(1779)
<400> 3
atg aat tcc gtt cag ctt ccg acg gcg caa tcc tct ctc cgt agc cac 48
Met Asn Ser Val Gln Leu Pro Thr Ala Gln Ser Ser Leu Arg Ser His
1 5 10 15
att cac cgt cca tca aaa cca gtg gtc gga ttc act cac ttc tcc tcc 96
Ile His Arg Pro Ser Lys Pro Val Val Gly Phe Thr His Phe Ser Ser
20 25 30
cgt tct cgg atc gca gtg gcg gtt ctg tcc cga gat gaa aca tct atg 144
Arg Ser Arg Ile Ala Val Ala Val Leu Ser Arg Asp Glu Thr Ser Met
35 40 45
act cca ccg cct cca aag ctt cct tta cca cgt ctt aag gtc tct ccg 192
Thr Pro Pro Pro Pro Lys Leu Pro Leu Pro Arg Leu Lys Val Ser Pro
50 55 60
aat tcg ttg caa tac cct gcc ggt tac ctc ggt gct gta cca gaa cgt 240
Asn Ser Leu Gln Tyr Pro Ala Gly Tyr Leu Gly Ala Val Pro Glu Arg
65 70 75 80
acg aac gag gct gag aac gga agc atc gcg gaa gct atg gag tat ttg 288
Thr Asn Glu Ala Glu Asn Gly Ser Ile Ala Glu Ala Met Glu Tyr Leu
85 90 95
acg aat ata ctg tcc act aag gtt tac gac atc gcc att gag tca cca 336
Thr Asn Ile Leu Ser Thr Lys Val Tyr Asp Ile Ala Ile Glu Ser Pro
100 105 110
ctc caa ttg gct aag aag cta tct aag aga tta ggt gtt cgt atg tat 384
Leu Gln Leu Ala Lys Lys Leu Ser Lys Arg Leu Gly Val Arg Met Tyr
115 120 125
ctt aaa aga gaa gac ttg caa cct gta ttc tcg ttt aag ctt cgt gga 432
Leu Lys Arg Glu Asp Leu Gln Pro Val Phe Ser Phe Lys Leu Arg Gly
130 135 140
gct tac aat atg atg gtg aaa ctt cca gca gat caa ttg gca aaa gga 480
Ala Tyr Asn Met Met Val Lys Leu Pro Ala Asp Gln Leu Ala Lys Gly
145 150 155 160
gtt atc tgc tct tca gct gga aac cat gct caa gga gtt gct tta tct 528
Val Ile Cys Ser Ser Ala Gly Asn His Ala Gln Gly Val Ala Leu Ser
165 170 175
gct agt aaa ctc ggc tgc act gct gtg att gtt atg cct gtt acg act 576
Ala Ser Lys Leu Gly Cys Thr Ala Val Ile Val Met Pro Val Thr Thr
180 185 190
cct gag ata aag tgg caa gct gta gag aat ttg ggt gca acg gtt gtt 624
Pro Glu Ile Lys Trp Gln Ala Val Glu Asn Leu Gly Ala Thr Val Val
195 200 205
ctt ttc gga gat tcg tat gat caa gca caa gca cat gct aag ata cga 672
Leu Phe Gly Asp Ser Tyr Asp Gln Ala Gln Ala His Ala Lys Ile Arg
210 215 220
gct gaa gaa gag ggt ctg acg ttt ata cct cct ttt gat cac cct gat 720
Ala Glu Glu Glu Gly Leu Thr Phe Ile Pro Pro Phe Asp His Pro Asp
225 230 235 240
gtt att gct gga caa ggg act gtt ggg atg gag atc act cgt cag gct 768
Val Ile Ala Gly Gln Gly Thr Val Gly Met Glu Ile Thr Arg Gln Ala
245 250 255
aag ggt cca ttg cat gct ata ttt gtg cca gtt ggt ggt ggt ggt tta 816
Lys Gly Pro Leu His Ala Ile Phe Val Pro Val Gly Gly Gly Gly Leu
260 265 270
ata gct ggt att gct gct tat gtg aag agg gtt tct ccc gag gtg aag 864
Ile Ala Gly Ile Ala Ala Tyr Val Lys Arg Val Ser Pro Glu Val Lys
275 280 285
atc att ggt gta gaa cca gct gac gca aat gca atg gct ttg tcg ctg 912
Ile Ile Gly Val Glu Pro Ala Asp Ala Asn Ala Met Ala Leu Ser Leu
290 295 300
cat cac ggt gag agg gtg ata ttg gac cag gtt ggg gga ttt gca gat 960
His His Gly Glu Arg Val Ile Leu Asp Gln Val Gly Gly Phe Ala Asp
305 310 315 320
ggt gta gca gtt aaa gaa gtt ggt gaa gag act ttt cgt ata agc aga 1008
Gly Val Ala Val Lys Glu Val Gly Glu Glu Thr Phe Arg Ile Ser Arg
325 330 335
aat cta atg gat ggt gtt gtt ctt gtc act cgt gat gct att tgt gca 1056
Asn Leu Met Asp Gly Val Val Leu Val Thr Arg Asp Ala Ile Cys Ala
340 345 350
tca ata aag gat atg ttt gag gag aaa cgg aac ata ttg gaa cca gca 1104
Ser Ile Lys Asp Met Phe Glu Glu Lys Arg Asn Ile Leu Glu Pro Ala
355 360 365
ggg gct ctt gca ctc gct gga gct gag gca tac tgt aaa tat tat ggc 1152
Gly Ala Leu Ala Leu Ala Gly Ala Glu Ala Tyr Cys Lys Tyr Tyr Gly
370 375 380
cta aag gac gtg aat gtc gta gcc ata acc agt ggc gct aac atg aac 1200
Leu Lys Asp Val Asn Val Val Ala Ile Thr Ser Gly Ala Asn Met Asn
385 390 395 400
ttt gac aag cta agg att gtg aca gaa ctc gcc aat gtc ggt agg caa 1248
Phe Asp Lys Leu Arg Ile Val Thr Glu Leu Ala Asn Val Gly Arg Gln
405 410 415
cag gaa gct gtt ctt gct act ctc atg ccg gaa aaa cct gga agc ttt 1296
Gln Glu Ala Val Leu Ala Thr Leu Met Pro Glu Lys Pro Gly Ser Phe
420 425 430
aag caa ttt tgt gag ctg gtt gga cca atg aac ata agc gag ttc aaa 1344
Lys Gln Phe Cys Glu Leu Val Gly Pro Met Asn Ile Ser Glu Phe Lys
435 440 445
tat aga tgt agc tcg gaa aag gag gct gtt gta cta tac agt gtc gga 1392
Tyr Arg Cys Ser Ser Glu Lys Glu Ala Val Val Leu Tyr Ser Val Gly
450 455 460
gtt cac aca gct gga gag ctc aaa gca cta cag aag aga atg gaa tct 1440
Val His Thr Ala Gly Glu Leu Lys Ala Leu Gln Lys Arg Met Glu Ser
465 470 475 480
tct caa ctc aaa act gtc aat ctc act acc agt gac tta gtg aaa gat 1488
Ser Gln Leu Lys Thr Val Asn Leu Thr Thr Ser Asp Leu Val Lys Asp
485 490 495
cac ctg tgt tac ttg atg gga gga aga tct act gtt gga gac gag gtt 1536
His Leu Cys Tyr Leu Met Gly Gly Arg Ser Thr Val Gly Asp Glu Val
500 505 510
cta tgc cga ttc acc ttt ccc gag aga cct ggt gct cta atg aac ttc 1584
Leu Cys Arg Phe Thr Phe Pro Glu Arg Pro Gly Ala Leu Met Asn Phe
515 520 525
ttg gac tct ttc agt cca cgg tgg aac atc acc ctt ttc cat tac cat 1632
Leu Asp Ser Phe Ser Pro Arg Trp Asn Ile Thr Leu Phe His Tyr His
530 535 540
gga cag ggt gag acg ggc gcg aat gtg ctg gtc ggg atc caa gtc ccc 1680
Gly Gln Gly Glu Thr Gly Ala Asn Val Leu Val Gly Ile Gln Val Pro
545 550 555 560
gag caa gaa atg gag gaa ttt aaa aac cga gct aaa gct ctt gga tac 1728
Glu Gln Glu Met Glu Glu Phe Lys Asn Arg Ala Lys Ala Leu Gly Tyr
565 570 575
gac tac ttc tta gta agt gat gac gac tat ttt aag ctt ctg atg cac 1776
Asp Tyr Phe Leu Val Ser Asp Asp Asp Tyr Phe Lys Leu Leu Met His
580 585 590
tga 1779
<210> 4
<211> 592
<212> PRT
<213> Unknown
<400> 4
Met Asn Ser Val Gln Leu Pro Thr Ala Gln Ser Ser Leu Arg Ser His
1 5 10 15
Ile His Arg Pro Ser Lys Pro Val Val Gly Phe Thr His Phe Ser Ser
20 25 30
Arg Ser Arg Ile Ala Val Ala Val Leu Ser Arg Asp Glu Thr Ser Met
35 40 45
Thr Pro Pro Pro Pro Lys Leu Pro Leu Pro Arg Leu Lys Val Ser Pro
50 55 60
Asn Ser Leu Gln Tyr Pro Ala Gly Tyr Leu Gly Ala Val Pro Glu Arg
65 70 75 80
Thr Asn Glu Ala Glu Asn Gly Ser Ile Ala Glu Ala Met Glu Tyr Leu
85 90 95
Thr Asn Ile Leu Ser Thr Lys Val Tyr Asp Ile Ala Ile Glu Ser Pro
100 105 110
Leu Gln Leu Ala Lys Lys Leu Ser Lys Arg Leu Gly Val Arg Met Tyr
115 120 125
Leu Lys Arg Glu Asp Leu Gln Pro Val Phe Ser Phe Lys Leu Arg Gly
130 135 140
Ala Tyr Asn Met Met Val Lys Leu Pro Ala Asp Gln Leu Ala Lys Gly
145 150 155 160
Val Ile Cys Ser Ser Ala Gly Asn His Ala Gln Gly Val Ala Leu Ser
165 170 175
Ala Ser Lys Leu Gly Cys Thr Ala Val Ile Val Met Pro Val Thr Thr
180 185 190
Pro Glu Ile Lys Trp Gln Ala Val Glu Asn Leu Gly Ala Thr Val Val
195 200 205
Leu Phe Gly Asp Ser Tyr Asp Gln Ala Gln Ala His Ala Lys Ile Arg
210 215 220
Ala Glu Glu Glu Gly Leu Thr Phe Ile Pro Pro Phe Asp His Pro Asp
225 230 235 240
Val Ile Ala Gly Gln Gly Thr Val Gly Met Glu Ile Thr Arg Gln Ala
245 250 255
Lys Gly Pro Leu His Ala Ile Phe Val Pro Val Gly Gly Gly Gly Leu
260 265 270
Ile Ala Gly Ile Ala Ala Tyr Val Lys Arg Val Ser Pro Glu Val Lys
275 280 285
Ile Ile Gly Val Glu Pro Ala Asp Ala Asn Ala Met Ala Leu Ser Leu
290 295 300
His His Gly Glu Arg Val Ile Leu Asp Gln Val Gly Gly Phe Ala Asp
305 310 315 320
Gly Val Ala Val Lys Glu Val Gly Glu Glu Thr Phe Arg Ile Ser Arg
325 330 335
Asn Leu Met Asp Gly Val Val Leu Val Thr Arg Asp Ala Ile Cys Ala
340 345 350
Ser Ile Lys Asp Met Phe Glu Glu Lys Arg Asn Ile Leu Glu Pro Ala
355 360 365
Gly Ala Leu Ala Leu Ala Gly Ala Glu Ala Tyr Cys Lys Tyr Tyr Gly
370 375 380
Leu Lys Asp Val Asn Val Val Ala Ile Thr Ser Gly Ala Asn Met Asn
385 390 395 400
Phe Asp Lys Leu Arg Ile Val Thr Glu Leu Ala Asn Val Gly Arg Gln
405 410 415
Gln Glu Ala Val Leu Ala Thr Leu Met Pro Glu Lys Pro Gly Ser Phe
420 425 430
Lys Gln Phe Cys Glu Leu Val Gly Pro Met Asn Ile Ser Glu Phe Lys
435 440 445
Tyr Arg Cys Ser Ser Glu Lys Glu Ala Val Val Leu Tyr Ser Val Gly
450 455 460
Val His Thr Ala Gly Glu Leu Lys Ala Leu Gln Lys Arg Met Glu Ser
465 470 475 480
Ser Gln Leu Lys Thr Val Asn Leu Thr Thr Ser Asp Leu Val Lys Asp
485 490 495
His Leu Cys Tyr Leu Met Gly Gly Arg Ser Thr Val Gly Asp Glu Val
500 505 510
Leu Cys Arg Phe Thr Phe Pro Glu Arg Pro Gly Ala Leu Met Asn Phe
515 520 525
Leu Asp Ser Phe Ser Pro Arg Trp Asn Ile Thr Leu Phe His Tyr His
530 535 540
Gly Gln Gly Glu Thr Gly Ala Asn Val Leu Val Gly Ile Gln Val Pro
545 550 555 560
Glu Gln Glu Met Glu Glu Phe Lys Asn Arg Ala Lys Ala Leu Gly Tyr
565 570 575
Asp Tyr Phe Leu Val Ser Asp Asp Asp Tyr Phe Lys Leu Leu Met His
580 585 590
<210> 5
<211> 1830
<212> DNA
<213> Unknown
<220>
<221> CDS
<222> (1)...(1830)
<400> 5
atg ggc gag ctc ggt acc cgg gga tcc tct aga act agt gga tcc ccc 48
Met Gly Glu Leu Gly Thr Arg Gly Ser Ser Arg Thr Ser Gly Ser Pro
1 5 10 15
ggg ctg cag gaa ttc ggc acg agg acg gcg caa tcc tct ctc cgt agc 96
Gly Leu Gln Glu Phe Gly Thr Arg Thr Ala Gln Ser Ser Leu Arg Ser
20 25 30
cac att cac cgt cca tca aaa cca gtg gtc gga ttc act cac ttc tcc 144
His Ile His Arg Pro Ser Lys Pro Val Val Gly Phe Thr His Phe Ser
35 40 45
tcc cgt tct cgg atc gca gtg gcg gtt ctg tcc cga gat gaa aca tct 192
Ser Arg Ser Arg Ile Ala Val Ala Val Leu Ser Arg Asp Glu Thr Ser
50 55 60
atg act cca ccg cct cca aag ctt cct tta cca cgt ctt aag gtc tct 240
Met Thr Pro Pro Pro Pro Lys Leu Pro Leu Pro Arg Leu Lys Val Ser
65 70 75 80
ccg aat tcg ttg caa tac cct gcc ggt tac ctc ggt gct gta cca gaa 288
Pro Asn Ser Leu Gln Tyr Pro Ala Gly Tyr Leu Gly Ala Val Pro Glu
85 90 95
cgt acg aac gag gct gag aac gga agc atc gcg gaa gct atg gag tat 336
Arg Thr Asn Glu Ala Glu Asn Gly Ser Ile Ala Glu Ala Met Glu Tyr
100 105 110
ttg acg aat ata ctg tcc act aag gtt tac gac atc gcc att gag tca 384
Leu Thr Asn Ile Leu Ser Thr Lys Val Tyr Asp Ile Ala Ile Glu Ser
115 120 125
cca ctc caa ttg gct aag aag cta tct aag aga tta ggt gtt cgt atg 432
Pro Leu Gln Leu Ala Lys Lys Leu Ser Lys Arg Leu Gly Val Arg Met
130 135 140
tat ctt aaa aga gaa gac ttg caa cct gta ttc tcg ttt aag ctt cgt 480
Tyr Leu Lys Arg Glu Asp Leu Gln Pro Val Phe Ser Phe Lys Leu Arg
145 150 155 160
gga gct tac aat atg atg gtg aaa ctt cca gca gat caa ttg gca aaa 528
Gly Ala Tyr Asn Met Met Val Lys Leu Pro Ala Asp Gln Leu Ala Lys
165 170 175
gga gtt atc tgc tct tca gct gga aac cat gct caa gga gtt gct tta 576
Gly Val Ile Cys Ser Ser Ala Gly Asn His Ala Gln Gly Val Ala Leu
180 185 190
tct gct agt aaa ctc ggc tgc act gct gtg att gtt atg cct gtt acg 624
Ser Ala Ser Lys Leu Gly Cys Thr Ala Val Ile Val Met Pro Val Thr
195 200 205
act cct gag ata aag tgg caa gct gta gag aat ttg ggt gca acg gtt 672
Thr Pro Glu Ile Lys Trp Gln Ala Val Glu Asn Leu Gly Ala Thr Val
210 215 220
gtt ctt ttc gga gat tcg tat gat caa gca caa gca cat gct aag ata 720
Val Leu Phe Gly Asp Ser Tyr Asp Gln Ala Gln Ala His Ala Lys Ile
225 230 235 240
cga gct gaa gaa gag ggt ctg acg ttt ata cct cct ttt gat cac cct 768
Arg Ala Glu Glu Glu Gly Leu Thr Phe Ile Pro Pro Phe Asp His Pro
245 250 255
gat gtt att gct gga caa ggg act gtt ggg atg gag atc act cgt cag 816
Asp Val Ile Ala Gly Gln Gly Thr Val Gly Met Glu Ile Thr Arg Gln
260 265 270
gct aag ggt cca ttg cat gct ata ttt gtg cca gtt ggt ggt ggt ggt 864
Ala Lys Gly Pro Leu His Ala Ile Phe Val Pro Val Gly Gly Gly Gly
275 280 285
tta ata gct ggt att gct gct tat gtg aag agg gtt tct ccc gag gtg 912
Leu Ile Ala Gly Ile Ala Ala Tyr Val Lys Arg Val Ser Pro Glu Val
290 295 300
aag atc att ggt gta gaa cca gct gac gca aat gca atg gct ttg tcg 960
Lys Ile Ile Gly Val Glu Pro Ala Asp Ala Asn Ala Met Ala Leu Ser
305 310 315 320
ctg cat cac ggt gag agg gtg ata ttg gac cag gtt ggg gga ttt gca 1008
Leu His His Gly Glu Arg Val Ile Leu Asp Gln Val Gly Gly Phe Ala
325 330 335
gat ggt gta gca gtt aaa gaa gtt ggt gaa gag act ttt cgt ata agc 1056
Asp Gly Val Ala Val Lys Glu Val Gly Glu Glu Thr Phe Arg Ile Ser
340 345 350
aga aat cta atg gat ggt gtt gtt ctt gtc act cgt gat gct att tgt 1104
Arg Asn Leu Met Asp Gly Val Val Leu Val Thr Arg Asp Ala Ile Cys
355 360 365
gca tca ata aag gat atg ttt gag gag aaa cgg aac ata ttg gaa cca 1152
Ala Ser Ile Lys Asp Met Phe Glu Glu Lys Arg Asn Ile Leu Glu Pro
370 375 380
gca ggg gct ctt gca ctc gct gga gct gag gca tac tgt aaa tat tat 1200
Ala Gly Ala Leu Ala Leu Ala Gly Ala Glu Ala Tyr Cys Lys Tyr Tyr
385 390 395 400
ggc cta aag gac gtg aat gtc gta gcc ata acc agt ggc gct aac atg 1248
Gly Leu Lys Asp Val Asn Val Val Ala Ile Thr Ser Gly Ala Asn Met
405 410 415
aac ttt gac aag cta agg att gtg aca gaa ctc gcc aat gtc ggt agg 1296
Asn Phe Asp Lys Leu Arg Ile Val Thr Glu Leu Ala Asn Val Gly Arg
420 425 430
caa cag gaa get gtt ctt get act etc atg ccg gaa aaa cct gga age 1344 Gin Gin Glu Ala Val Leu Ala Thr Leu Met Pro Glu Lys Pro Gly Ser 435 440 445
ttt aag caa ttt tgt gag ctg gtt gga cca atg aac ata age gag ttc 1392 Phe Lys Gin Phe Cys Glu Leu Val Gly Pro Met Asn He Ser Glu Phe 450 455 460
aaa tat aga tgt age teg gaa aag gag get gtt gta eta tac agt gtc 1440 Lys Tyr Arg Cys Ser Ser Glu Lys Glu Ala Val Val Leu Tyr Ser Val 465 470 475 480
gga gtt cac aca get gga gag etc aaa gca eta cag aag aga atg gaa 1488 Gly Val His Thr Ala Gly Glu Leu Lys Ala Leu Gin Lys Arg Met Glu 485 490 495
tet tet caa etc aaa act gtc aat etc act ace agt gac tta gtg aaa 1536 Ser Ser Gin Leu Lys Thr Val Asn Leu Thr Thr Ser Asp Leu Val Lys 500 505 510
gat cac ctg tgt tac ttg atg gga gga aga tet act gtt gga gac gag 1584 Asp His Leu Cys Tyr Leu Met Gly Gly Arg Ser Thr Val Gly Asp Glu 515 520 525
gtt eta tgc cga ttc ace ttt ccc gag aga cct ggt get eta atg aac 1632 Val Leu Cys Arg Phe Thr Phe Pro Glu Arg Pro Gly Ala Leu Met Asn 530 535 540
ttc ttg gac tet ttc agt cca egg tgg aac ate ace ctt ttc cat tac 1680 Phe Leu Asp Ser Phe Ser Pro Arg Trp Asn He Thr Leu Phe His Tyr 545 550 555 560
cat gga cag ggt gag acg ggc gcg aat gtg ctg gtc ggg ate caa gtc 1728 His Gly Gin Gly Glu Thr Gly Ala Asn Val Leu Val Gly He Gin Val 565 570 575
ccc gag caa gaa atg gag gaa ttt aaa aac cga get aaa get ctt gga 1776 Pro Glu Gin Glu Met Glu Glu Phe Lys Asn Arg Ala Lys Ala Leu Gly 580 585 590
tac gac tac ttc tta gta agt gat gac gac tat ttt aag ctt ctg atg 1824 Tyr Asp Tyr Phe Leu Val Ser Asp Asp Asp Tyr Phe Lys Leu Leu Met 595 600 605
cac tga 1830
His *
<210> 6 <211> 609 <212> PRT <213> Unknown <400> 6
Met Gly Glu Leu Gly Thr Arg Gly Ser Ser Arg Thr Ser Gly Ser Pro
1 5 10 15
Gly Leu Gin Glu Phe Gly Thr Arg Thr Ala Gin Ser Ser Leu Arg Ser 20 25 30 His He His Arg Pro Ser Lys Pro Val Val Gly Phe Thr His Phe Ser 35 40 45
Ser Arg Ser Arg He Ala Val Ala Val Leu Ser Arg Asp Glu Thr Ser
50 55 60
Met Thr Pro Pro Pro Pro Lys Leu Pro Leu Pro Arg Leu Lys Val Ser 65 70 75 80
Pro Asn Ser Leu Gin Tyr Pro Ala Gly Tyr Leu Gly Ala Val Pro Glu
85 90 95
Arg Thr Asn Glu Ala Glu Asn Gly Ser He Ala Glu Ala Met Glu Tyr 100 105 110 Leu Thr Asn He Leu Ser Thr Lys Val Tyr Asp He Ala He Glu Ser 115 120 125
Pro Leu Gin Leu Ala Lys Lys Leu Ser Lys Arg Leu Gly Val Arg Met
130 135 140
Tyr Leu Lys Arg Glu Asp Leu Gin Pro Val Phe Ser Phe Lys Leu Arg 145 150 155 160
Gly Ala Tyr Asn Met Met Val Lys Leu Pro Ala Asp Gin Leu Ala Lys
165 170 175
Gly Val He Cys Ser Ser Ala Gly Asn His Ala Gin Gly Val Ala Leu 180 185 190 Ser Ala Ser Lys Leu Gly Cys Thr Ala Val He Val Met Pro Val Thr 195 200 205
Thr Pro Glu He Lys Trp Gin Ala Val Glu Asn Leu Gly Ala Thr Val
210 215 220
Val Leu Phe Gly Asp Ser Tyr Asp Gin Ala Gin Ala His Ala Lys He 225 230 235 240
Arg Ala Glu Glu Glu Gly Leu Thr Phe He Pro Pro Phe Asp His Pro
245 250 255
Asp Val He Ala Gly Gin Gly Thr Val Gly Met Glu He Thr Arg Gin 260 265 270 Ala Lys Gly Pro Leu His Ala He Phe Val Pro Val Gly Gly Gly Gly 275 280 285
Leu He Ala Gly He Ala Ala Tyr Val Lys Arg Val Ser Pro Glu Val
290 295 300
Lys He He Gly Val Glu Pro Ala Asp Ala Asn Ala Met Ala Leu Ser 305 310 315 320
Leu His His Gly Glu Arg Val Ile Leu Asp Gln Val Gly Gly Phe Ala
325 330 335
Asp Gly Val Ala Val Lys Glu Val Gly Glu Glu Thr Phe Arg He Ser 340 345 350 Arg Asn Leu Met Asp Gly Val Val Leu Val Thr Arg Asp Ala He Cys 355 360 365
Ala Ser He Lys Asp Met Phe Glu Glu Lys Arg Asn He Leu Glu Pro
370 375 380
Ala Gly Ala Leu Ala Leu Ala Gly Ala Glu Ala Tyr Cys Lys Tyr Tyr 385 390 395 400
Gly Leu Lys Asp Val Asn Val Val Ala He Thr Ser Gly Ala Asn Met
405 410 415
Asn Phe Asp Lys Leu Arg He Val Thr Glu Leu Ala Asn Val Gly Arg 420 425 430 Gin Gin Glu Ala Val Leu Ala Thr Leu Met Pro Glu Lys Pro Gly Ser 435 440 445
Phe Lys Gin Phe Cys Glu Leu Val Gly Pro Met Asn He Ser Glu Phe
450 455 460
Lys Tyr Arg Cys Ser Ser Glu Lys Glu Ala Val Val Leu Tyr Ser Val 465 470 475 480
Gly Val His Thr Ala Gly Glu Leu Lys Ala Leu Gin Lys Arg Met Glu
485 490 495
Ser Ser Gln Leu Lys Thr Val Asn Leu Thr Thr Ser Asp Leu Val Lys 500 505 510 Asp His Leu Cys Tyr Leu Met Gly Gly Arg Ser Thr Val Gly Asp Glu 515 520 525
Val Leu Cys Arg Phe Thr Phe Pro Glu Arg Pro Gly Ala Leu Met Asn
530 535 540
Phe Leu Asp Ser Phe Ser Pro Arg Trp Asn He Thr Leu Phe His Tyr 545 550 555 560
His Gly Gin Gly Glu Thr Gly Ala Asn Val Leu Val Gly He Gin Val
565 570 575
Pro Glu Gin Glu Met Glu Glu Phe Lys Asn Arg Ala Lys Ala Leu Gly 580 585 590 Tyr Asp Tyr Phe Leu Val Ser Asp Asp Asp Tyr Phe Lys Leu Leu Met 595 600 605
His
<210> 7 <211> 1509
<212> DNA
<213> Unknown
<220>
<221> CDS <222> (1) ... (1509)
<400> 7 gaa gct atg gag tat ttg acg aat ata ctg tcc act aag gtt tac gac 48 Glu Ala Met Glu Tyr Leu Thr Asn Ile Leu Ser Thr Lys Val Tyr Asp 1 5 10 15
ate gcc att gag tea cca etc caa ttg get aag aag eta tet aag aga 96 He Ala He Glu Ser Pro Leu Gin Leu Ala Lys Lys Leu Ser Lys Arg 20 25 30
tta ggt gtt cgt atg tat ctt aaa aga gaa gac ttg caa cct gta ttc 144 Leu Gly Val Arg Met Tyr Leu Lys Arg Glu Asp Leu Gin Pro Val Phe 35 40 45
teg ttt aag ctt cgt gga get tac aat atg atg gtg aaa ctt cca gca 192 Ser Phe Lys Leu Arg Gly Ala Tyr Asn Met Met Val Lys Leu Pro Ala 50 55 60
gat caa ttg gca aaa gga gtt ate tgc tet tea get gga aac cat get 240 Asp Gin Leu Ala Lys Gly Val He Cys Ser Ser Ala Gly Asn His Ala 65 70 75 80
caa gga gtt get tta tet get agt aaa etc ggc tgc act get gtg att 288
Gin Gly Val Ala Leu Ser Ala Ser Lys Leu Gly Cys Thr Ala Val He 85 90 95
gtt atg cct gtt acg act cct gag ata aag tgg caa get gta gag aat 336
Val Met Pro Val Thr Thr Pro Glu He Lys Trp Gin Ala Val Glu Asn 100 105 110
ttg ggt gca acg gtt gtt ctt ttc gga gat teg tat gat caa gca caa 384 Leu Gly Ala Thr Val Val Leu Phe Gly Asp Ser Tyr Asp Gin Ala Gin 115 120 125
gca cat get aag ata cga get gaa gaa gag ggt ctg acg ttt ata cct 432 Ala His Ala Lys He Arg Ala Glu Glu Glu Gly Leu Thr Phe He Pro 130 135 140
cct ttt gat cac cct gat gtt att get gga caa ggg act gtt ggg atg 480 Pro Phe Asp His Pro Asp Val He Ala Gly Gin Gly Thr Val Gly Met 145 150 155 160
gag ate act cgt cag get aag ggt cca ttg cat get ata ttt gtg cca 528 Glu He Thr Arg Gin Ala Lys Gly Pro Leu His Ala He Phe Val Pro 165 170 175
gtt ggt ggt ggt ggt tta ata get ggt att get get tat gtg aag agg 576 Val Gly Gly Gly Gly Leu He Ala Gly He Ala Ala Tyr Val Lys Arg 180 185 190
gtt tet ccc gag gtg aag ate att ggt gta gaa cca get gac gca aat 624 Val Ser Pro Glu Val Lys He He Gly Val Glu Pro Ala Asp Ala Asn 195 200 205
gca atg get ttg teg ctg cat cac ggt gag agg gtg ata ttg gac cag 672 Ala Met Ala Leu Ser Leu His His Gly Glu Arg Val He Leu Asp Gin 210 215 220
gtt ggg gga ttt gca gat ggt gta gca gtt aaa gaa gtt ggt gaa gag 720 Val Gly Gly Phe Ala Asp Gly Val Ala Val Lys Glu Val Gly Glu Glu 225 230 235 240
act ttt cgt ata age aga aat eta atg gat ggt gtt gtt ctt gtc act 768 Thr Phe Arg He Ser Arg Asn Leu Met Asp Gly Val Val Leu Val Thr 245 250 255
cgt gat get att tgt gca tea ata aag gat atg ttt gag gag aaa egg 816 Arg Asp Ala He Cys Ala Ser He Lys Asp Met Phe Glu Glu Lys Arg 260 265 270
aac ata ttg gaa cca gca ggg get ctt gca etc get gga get gag gca 864 Asn He Leu Glu Pro Ala Gly Ala Leu Ala Leu Ala Gly Ala Glu Ala 275 280 285
tac tgt aaa tat tat ggc eta aag gac gtg aat gtc gta gcc ata ace 912 Tyr Cys Lys Tyr Tyr Gly Leu Lys Asp Val Asn Val Val Ala He Thr 290 295 300
agt ggc get aac atg aac ttt gac aag eta agg att gtg aca gaa etc 960 Ser Gly Ala Asn Met Asn Phe Asp Lys Leu Arg He Val Thr Glu Leu 305 310 315 320
gcc aat gtc ggt agg caa cag gaa get gtt ctt get act etc atg ccg 1008 Ala Asn Val Gly Arg Gin Gin Glu Ala Val Leu Ala Thr Leu Met Pro 325 330 335
gaa aaa cct gga age ttt aag caa ttt tgt gag ctg gtt gga cca atg 1056 Glu Lys Pro Gly Ser Phe Lys Gin Phe Cys Glu Leu Val Gly Pro Met 340 345 350
aac ata age gag ttc aaa tat aga tgt age teg gaa aag gag get gtt 1104 Asn He Ser Glu Phe Lys Tyr Arg Cys Ser Ser Glu Lys Glu Ala Val 355 360 365
gta eta tac agt gtc gga gtt cac aca get gga gag etc aaa gca eta 1152 Val Leu Tyr Ser Val Gly Val His Thr Ala Gly Glu Leu Lys Ala Leu 370 375 380
cag aag aga atg gaa tet tet caa etc aaa act gtc aat etc act ace 1200 Gin Lys Arg Met Glu Ser Ser Gin Leu Lys Thr Val Asn Leu Thr Thr 385 390 395 400
agt gac tta gtg aaa gat cac ctg tgt tac ttg atg gga gga aga tet 1248 Ser Asp Leu Val Lys Asp His Leu Cys Tyr Leu Met Gly Gly Arg Ser 405 410 415
act gtt gga gac gag gtt eta tgc cga ttc ace ttt ccc gag aga cct 1296 Thr Val Gly Asp Glu Val Leu Cys Arg Phe Thr Phe Pro Glu Arg Pro 420 425 430
ggt get eta atg aac ttc ttg gac tet ttc agt cca egg tgg aac ate 1344 Gly Ala Leu Met Asn Phe Leu Asp Ser Phe Ser Pro Arg Trp Asn He 435 440 445
ace ctt ttc cat tac cat gga cag ggt gag acg ggc gcg aat gtg ctg 1392 Thr Leu Phe His Tyr His Gly Gin Gly Glu Thr Gly Ala Asn Val Leu 450 455 460
gtc ggg ate caa gtc ccc gag caa gaa atg gag gaa ttt aaa aac cga 1440 Val Gly He Gin Val Pro Glu Gin Glu Met Glu Glu Phe Lys Asn Arg 465 470 475 480
get aaa get ctt gga tac gac tac ttc tta gta agt gat gac gac tat 1488 Ala Lys Ala Leu Gly Tyr Asp Tyr Phe Leu Val Ser Asp Asp Asp Tyr 485 490 495
ttt aag ctt ctg atg cac tga 1509
Phe Lys Leu Leu Met His * 500
<210> 8 <211> 502 <212> PRT <213> Unknown <400> 8
Glu Ala Met Glu Tyr Leu Thr Asn Ile Leu Ser Thr Lys Val Tyr Asp
1 5 10 15
He Ala He Glu Ser Pro Leu Gin Leu Ala Lys Lys Leu Ser Lys Arg 20 25 30 Leu Gly Val Arg Met Tyr Leu Lys Arg Glu Asp Leu Gin Pro Val Phe 35 40 45
Ser Phe Lys Leu Arg Gly Ala Tyr Asn Met Met Val Lys Leu Pro Ala
50 55 60
Asp Gin Leu Ala Lys Gly Val He Cys Ser Ser Ala Gly Asn His Ala 65 70 75 80
Gin Gly Val Ala Leu Ser Ala Ser Lys Leu Gly Cys Thr Ala Val He
85 90 95
Val Met Pro Val Thr Thr Pro Glu He Lys Trp Gin Ala Val Glu Asn 100 105 110 Leu Gly Ala Thr Val Val Leu Phe Gly Asp Ser Tyr Asp Gin Ala Gin 115 120 125
Ala His Ala Lys He Arg Ala Glu Glu Glu Gly Leu Thr Phe He Pro
130 135 140
Pro Phe Asp His Pro Asp Val He Ala Gly Gin Gly Thr Val Gly Met 145 150 155 160
Glu He Thr Arg Gin Ala Lys Gly Pro Leu His Ala He Phe Val Pro
165 170 175
Val Gly Gly Gly Gly Leu He Ala Gly He Ala Ala Tyr Val Lys Arg 180 185 190 Val Ser Pro Glu Val Lys He He Gly Val Glu Pro Ala Asp Ala Asn 195 200 205
Ala Met Ala Leu Ser Leu His His Gly Glu Arg Val He Leu Asp Gin
210 215 220
Val Gly Gly Phe Ala Asp Gly Val Ala Val Lys Glu Val Gly Glu Glu 225 230 235 240
Thr Phe Arg He Ser Arg Asn Leu Met Asp Gly Val Val Leu Val Thr
245 250 255
Arg Asp Ala He Cys Ala Ser He Lys Asp Met Phe Glu Glu Lys Arg 260 265 270 Asn He Leu Glu Pro Ala Gly Ala Leu Ala Leu Ala Gly Ala Glu Ala 275 280 285
Tyr Cys Lys Tyr Tyr Gly Leu Lys Asp Val Asn Val Val Ala He Thr
290 295 300
Ser Gly Ala Asn Met Asn Phe Asp Lys Leu Arg He Val Thr Glu Leu 305 310 315 320
Ala Asn Val Gly Arg Gin Gin Glu Ala Val Leu Ala Thr Leu Met Pro
325 330 335
Glu Lys Pro Gly Ser Phe Lys Gin Phe Cys Glu Leu Val Gly Pro Met 340 345 350 Asn He Ser Glu Phe Lys Tyr Arg Cys Ser Ser Glu Lys Glu Ala Val 355 360 365
Val Leu Tyr Ser Val Gly Val His Thr Ala Gly Glu Leu Lys Ala Leu
370 375 380
Gin Lys Arg Met Glu Ser Ser Gin Leu Lys Thr Val Asn Leu Thr Thr 385 390 395 400 Ser Asp Leu Val Lys Asp His Leu Cys Tyr Leu Met Gly Gly Arg Ser
405 410 415
Thr Val Gly Asp Glu Val Leu Cys Arg Phe Thr Phe Pro Glu Arg Pro
420 425 430
Gly Ala Leu Met Asn Phe Leu Asp Ser Phe Ser Pro Arg Trp Asn He 435 440 445
Thr Leu Phe His Tyr His Gly Gin Gly Glu Thr Gly Ala Asn Val Leu
450 455 460
Val Gly He Gin Val Pro Glu Gin Glu Met Glu Glu Phe Lys Asn Arg 465 470 475 480 Ala Lys Ala Leu Gly Tyr Asp Tyr Phe Leu Val Ser Asp Asp Asp Tyr
485 490 495
Phe Lys Leu Leu Met His 500
<210> 9 <211> 1620 <212> DNA <213> Unknown <220> <221> CDS <222> (1) ... (1620)
<400> 9 aag ctt cct tta cca cgt ctt aag gtc tet ccg aat teg ttg caa tac 48 Lys Leu Pro Leu Pro Arg Leu Lys Val Ser Pro Asn Ser Leu Gin Tyr 1 5 10 15
cct gcc ggt tac etc ggt get gta cca gaa cgt acg aac gag get gag 96 Pro Ala Gly Tyr Leu Gly Ala Val Pro Glu Arg Thr Asn Glu Ala Glu 20 25 30
aac gga age ate gcg gaa get atg gag tat ttg acg aat ata ctg tec 144 Asn Gly Ser He Ala Glu Ala Met Glu Tyr Leu Thr Asn He Leu Ser 35 40 45
act aag gtt tac gac ate gcc att gag tea cca etc caa ttg get aag 192 Thr Lys Val Tyr Asp He Ala He Glu Ser Pro Leu Gin Leu Ala Lys 50 55 60
aag eta tet aag aga tta ggt gtt cgt atg tat ctt aaa aga gaa gac 240 Lys Leu Ser Lys Arg Leu Gly Val Arg Met Tyr Leu Lys Arg Glu Asp 65 70 75 80
ttg caa cct gta ttc tcg ttt aag ctt cgt gga gct tac aat atg atg 288 Leu Gln Pro Val Phe Ser Phe Lys Leu Arg Gly Ala Tyr Asn Met Met 85 90 95
gtg aaa ctt cca gca gat caa ttg gca aaa gga gtt ate tgc tet tea 336 Val Lys Leu Pro Ala Asp Gin Leu Ala Lys Gly Val He Cys Ser Ser 100 105 110
get gga aac cat get caa gga gtt get tta tet get agt aaa etc ggc 384 Ala Gly Asn His Ala Gin Gly Val Ala Leu Ser Ala Ser Lys Leu Gly 115 120 125
tgc act get gtg att gtt atg cct gtt acg act cct gag ata aag tgg 432 Cys Thr Ala Val He Val Met Pro Val Thr Thr Pro Glu He Lys Trp 130 135 140
caa get gta gag aat ttg ggt gca acg gtt gtt ctt ttc gga gat teg 480
Gin Ala Val Glu Asn Leu Gly Ala Thr Val Val Leu Phe Gly Asp Ser
145 150 155 160
tat gat caa gca caa gca cat get aag ata cga get gaa gaa gag ggt 528
Tyr Asp Gin Ala Gin Ala His Ala Lys He Arg Ala Glu Glu Glu Gly
165 170 175
ctg acg ttt ata cct cct ttt gat cac cct gat gtt att gct gga caa 576 Leu Thr Phe Ile Pro Pro Phe Asp His Pro Asp Val Ile Ala Gly Gln 180 185 190
ggg act gtt ggg atg gag ate act cgt cag get aag ggt cca ttg cat 624 Gly Thr Val Gly Met Glu He Thr Arg Gin Ala Lys Gly Pro Leu His 195 200 205
get ata ttt gtg cca gtt ggt ggt ggt ggt tta ata get ggt att get 672 Ala He Phe Val Pro Val Gly Gly Gly Gly Leu He Ala Gly He Ala 210 215 220
get tat gtg aag agg gtt tet ccc gag gtg aag ate att ggt gta gaa 720 Ala Tyr Val Lys Arg Val Ser Pro Glu Val Lys He He Gly Val Glu 225 230 235 240
cca get gac gca aat gca atg get ttg teg ctg cat cac ggt gag agg 768 Pro Ala Asp Ala Asn Ala Met Ala Leu Ser Leu His His Gly Glu Arg 245 250 255
gtg ata ttg gac cag gtt ggg gga ttt gca gat ggt gta gca gtt aaa 816 Val He Leu Asp Gin Val Gly Gly Phe Ala Asp Gly Val Ala Val Lys 260 265 270
gaa gtt ggt gaa gag act ttt cgt ata age aga aat eta atg gat ggt 864 Glu Val Gly Glu Glu Thr Phe Arg He Ser Arg Asn Leu Met Asp Gly 275 280 285
gtt gtt ctt gtc act cgt gat get att tgt gca tea ata aag gat atg 912 Val Val Leu Val Thr Arg Asp Ala He Cys Ala Ser He Lys Asp Met 290 295 300
ttt gag gag aaa egg aac ata ttg gaa cca gca ggg get ctt gca etc 960 Phe Glu Glu Lys Arg Asn He Leu Glu Pro Ala Gly Ala Leu Ala Leu 305 310 315 320
get gga get gag gca tac tgt aaa tat tat ggc eta aag gac gtg aat 1008
Ala Gly Ala Glu Ala Tyr Cys Lys Tyr Tyr Gly Leu Lys Asp Val Asn
325 330 335
gtc gta gcc ata ace agt ggc get aac atg aac ttt gac aag eta agg 1056
Val Val Ala He Thr Ser Gly Ala Asn Met Asn Phe Asp Lys Leu Arg
340 345 350
att gtg aca gaa etc gcc aat gtc ggt agg caa cag gaa get gtt ctt 1104 He Val Thr Glu Leu Ala Asn Val Gly Arg Gin Gin Glu Ala Val Leu 355 360 365
get act etc atg ccg gaa aaa cct gga age ttt aag caa ttt tgt gag 1152 Ala Thr Leu Met Pro Glu Lys Pro Gly Ser Phe Lys Gin Phe Cys Glu 370 375 380
ctg gtt gga cca atg aac ata age gag ttc aaa tat aga tgt age teg 1200 Leu Val Gly Pro Met Asn He Ser Glu Phe Lys Tyr Arg Cys Ser Ser 385 390 395 400
gaa aag gag get gtt gta eta tac agt gtc gga gtt cac aca get gga 1248 Glu Lys Glu Ala Val Val Leu Tyr Ser Val Gly Val His Thr Ala Gly 405 410 415
gag etc aaa gca eta cag aag aga atg gaa tet tet caa etc aaa act 1296 Glu Leu Lys Ala Leu Gin Lys Arg Met Glu Ser Ser Gin Leu Lys Thr 420 425 430
gtc aat ctc act acc agt gac tta gtg aaa gat cac ctg tgt tac ttg 1344 Val Asn Leu Thr Thr Ser Asp Leu Val Lys Asp His Leu Cys Tyr Leu 435 440 445
atg gga gga aga tet act gtt gga gac gag gtt eta tgc cga ttc ace 1392 Met Gly Gly Arg Ser Thr Val Gly Asp Glu Val Leu Cys Arg Phe Thr 450 455 460
ttt ccc gag aga cct ggt get eta atg aac ttc ttg gac tet ttc agt 1440 Phe Pro Glu Arg Pro Gly Ala Leu Met Asn Phe Leu Asp Ser Phe Ser 465 470 475 480
cca egg tgg aac ate ace ctt ttc cat tac cat gga cag ggt gag acg 1488 Pro Arg Trp Asn He Thr Leu Phe His Tyr His Gly Gin Gly Glu Thr 485 490 495
ggc gcg aat gtg ctg gtc ggg ate caa gtc ccc gag caa gaa atg gag 1536
Gly Ala Asn Val Leu Val Gly He Gin Val Pro Glu Gin Glu Met Glu
500 505 510
gaa ttt aaa aac cga get aaa get ctt gga tac gac tac ttc tta gta 1584
Glu Phe Lys Asn Arg Ala Lys Ala Leu Gly Tyr Asp Tyr Phe Leu Val
515 520 525
agt gat gac gac tat ttt aag ctt ctg atg cac tga 1620
Ser Asp Asp Asp Tyr Phe Lys Leu Leu Met His * 530 535
<210> 10 <211> 539 <212> PRT <213> Unknown <400> 10 Lys Leu Pro Leu Pro Arg Leu Lys Val Ser Pro Asn Ser Leu Gin Tyr 1 5 10 15
Pro Ala Gly Tyr Leu Gly Ala Val Pro Glu Arg Thr Asn Glu Ala Glu
20 25 30
Asn Gly Ser He Ala Glu Ala Met Glu Tyr Leu Thr Asn He Leu Ser 35 40 45 Thr Lys Val Tyr Asp He Ala He Glu Ser Pro Leu Gin Leu Ala Lys 50 55 60
Lys Leu Ser Lys Arg Leu Gly Val Arg Met Tyr Leu Lys Arg Glu Asp 65 70 75 80
Leu Gin Pro Val Phe Ser Phe Lys Leu Arg Gly Ala Tyr Asn Met Met 85 90 95
Val Lys Leu Pro Ala Asp Gin Leu Ala Lys Gly Val He Cys Ser Ser
100 105 110
Ala Gly Asn His Ala Gin Gly Val Ala Leu Ser Ala Ser Lys Leu Gly 115 120 125 Cys Thr Ala Val He Val Met Pro Val Thr Thr Pro Glu He Lys Trp 130 135 140
Gin Ala Val Glu Asn Leu Gly Ala Thr Val Val Leu Phe Gly Asp Ser 145 150 155 160
Tyr Asp Gin Ala Gin Ala His Ala Lys He Arg Ala Glu Glu Glu Gly 165 170 175
Leu Thr Phe He Pro Pro Phe Asp His Pro Asp Val He Ala Gly Gin
180 185 190
Gly Thr Val Gly Met Glu He Thr Arg Gin Ala Lys Gly Pro Leu His 195 200 205 Ala He Phe Val Pro Val Gly Gly Gly Gly Leu He Ala Gly He Ala 210 215 220
Ala Tyr Val Lys Arg Val Ser Pro Glu Val Lys He He Gly Val Glu 225 230 235 240
Pro Ala Asp Ala Asn Ala Met Ala Leu Ser Leu His His Gly Glu Arg 245 250 255
Val He Leu Asp Gin Val Gly Gly Phe Ala Asp Gly Val Ala Val Lys
260 265 270
Glu Val Gly Glu Glu Thr Phe Arg He Ser Arg Asn Leu Met Asp Gly 275 280 285 Val Val Leu Val Thr Arg Asp Ala He Cys Ala Ser He Lys Asp Met 290 295 300
Phe Glu Glu Lys Arg Asn He Leu Glu Pro Ala Gly Ala Leu Ala Leu 305 310 315 320
Ala Gly Ala Glu Ala Tyr Cys Lys Tyr Tyr Gly Leu Lys Asp Val Asn 325 330 335
Val Val Ala He Thr Ser Gly Ala Asn Met Asn Phe Asp Lys Leu Arg
340 345 350
He Val Thr Glu Leu Ala Asn Val Gly Arg Gin Gin Glu Ala Val Leu 355 360 365 Ala Thr Leu Met Pro Glu Lys Pro Gly Ser Phe Lys Gin Phe Cys Glu 370 375 380
Leu Val Gly Pro Met Asn He Ser Glu Phe Lys Tyr Arg Cys Ser Ser 385 390 395 400
Glu Lys Glu Ala Val Val Leu Tyr Ser Val Gly Val His Thr Ala Gly 405 410 415
Glu Leu Lys Ala Leu Gin Lys Arg Met Glu Ser Ser Gin Leu Lys Thr
420 425 430
Val Asn Leu Thr Thr Ser Asp Leu Val Lys Asp His Leu Cys Tyr Leu 435 440 445 Met Gly Gly Arg Ser Thr Val Gly Asp Glu Val Leu Cys Arg Phe Thr 450 455 460
Phe Pro Glu Arg Pro Gly Ala Leu Met Asn Phe Leu Asp Ser Phe Ser
465 470 475 480
Pro Arg Trp Asn He Thr Leu Phe His Tyr His Gly Gin Gly Glu Thr
485 490 495 Gly Ala Asn Val Leu Val Gly He Gin Val Pro Glu Gin Glu Met Glu
500 505 510
Glu Phe Lys Asn Arg Ala Lys Ala Leu Gly Tyr Asp Tyr Phe Leu Val
515 520 525
Ser Asp Asp Asp Tyr Phe Lys Leu Leu Met His 530 535
<210> 11
<211> 1599
<212> DNA
<213> Unknown <220>
<221> CDS
<222> (1) ... (1599)
<400> 11 aag gtc tet ccg aat teg ttg caa tac cct gcc ggt tac etc ggt get 48 Lys Val Ser Pro Asn Ser Leu Gin Tyr Pro Ala Gly Tyr Leu Gly Ala 1 5 10 15
gta cca gaa cgt acg aac gag get gag aac gga age ate gcg gaa get 96
Val Pro Glu Arg Thr Asn Glu Ala Glu Asn Gly Ser He Ala Glu Ala 20 25 30
atg gag tat ttg acg aat ata ctg tec act aag gtt tac gac ate gcc 144
Met Glu Tyr Leu Thr Asn He Leu Ser Thr Lys Val Tyr Asp He Ala 35 40 45
att gag tea cca etc caa ttg get aag aag eta tet aag aga tta ggt 192
He Glu Ser Pro Leu Gin Leu Ala Lys Lys Leu Ser Lys Arg Leu Gly 50 55 60
gtt cgt atg tat ctt aaa aga gaa gac ttg caa cct gta ttc teg ttt 240 Val Arg Met Tyr Leu Lys Arg Glu Asp Leu Gin Pro Val Phe Ser Phe 65 70 75 80
aag ctt cgt gga get tac aat atg atg gtg aaa ctt cca gca gat caa 288 Lys Leu Arg Gly Ala Tyr Asn Met Met Val Lys Leu Pro Ala Asp Gin
85 90 95
ttg gca aaa gga gtt ate tgc tet tea get gga aac cat get caa gga 336 Leu Ala Lys Gly Val He Cys Ser Ser Ala Gly Asn His Ala Gin Gly 100 105 110
gtt gct tta tct gct agt aaa ctc ggc tgc act gct gtg att gtt atg 384 Val Ala Leu Ser Ala Ser Lys Leu Gly Cys Thr Ala Val Ile Val Met 115 120 125
cct gtt acg act cct gag ata aag tgg caa get gta gag aat ttg ggt 432 Pro Val Thr Thr Pro Glu He Lys Trp Gin Ala Val Glu Asn Leu Gly 130 135 140
gca acg gtt gtt ctt ttc gga gat teg tat gat caa gca caa gca cat 480
Ala Thr Val Val Leu Phe Gly Asp Ser Tyr Asp Gin Ala Gin Ala His
145 150 155 160
get aag ata cga get gaa gaa gag ggt ctg acg ttt ata cct cct ttt 528 Ala Lys He Arg Ala Glu Glu Glu Gly Leu Thr Phe He Pro Pro Phe
165 170 175
gat cac cct gat gtt att get gga caa ggg act gtt ggg atg gag ate 576
Asp His Pro Asp Val He Ala Gly Gin Gly Thr Val Gly Met Glu He 180 185 190
act cgt cag get aag ggt cca ttg cat get ata ttt gtg cca gtt ggt 624
Thr Arg Gin Ala Lys Gly Pro Leu His Ala He Phe Val Pro Val Gly
195 200 205
ggt ggt ggt tta ata get ggt att get get tat gtg aag agg gtt tet 672
Gly Gly Gly Leu He Ala Gly He Ala Ala Tyr Val Lys Arg Val Ser
210 215 220
ccc gag gtg aag ate att ggt gta gaa cca get gac gca aat gca atg 720
Pro Glu Val Lys He He Gly Val Glu Pro Ala Asp Ala Asn Ala Met
225 230 235 240
get ttg teg ctg cat cac ggt gag agg gtg ata ttg gac cag gtt ggg 768 Ala Leu Ser Leu His His Gly Glu Arg Val He Leu Asp Gin Val Gly
245 250 255
gga ttt gca gat ggt gta gca gtt aaa gaa gtt ggt gaa gag act ttt 816
Gly Phe Ala Asp Gly Val Ala Val Lys Glu Val Gly Glu Glu Thr Phe 260 265 270
cgt ata age aga aat eta atg gat ggt gtt gtt ctt gtc act cgt gat 864
Arg He Ser Arg Asn Leu Met Asp Gly Val Val Leu Val Thr Arg Asp
275 280 285
gct att tgt gca tca ata aag gat atg ttt gag gag aaa cgg aac ata 912
Ala Ile Cys Ala Ser Ile Lys Asp Met Phe Glu Glu Lys Arg Asn Ile 290 295 300
ttg gaa cca gca ggg get ctt gca etc get gga get gag gca tac tgt 960
Leu Glu Pro Ala Gly Ala Leu Ala Leu Ala Gly Ala Glu Ala Tyr Cys 305 310 315 320
aaa tat tat ggc eta aag gac gtg aat gtc gta gcc ata ace agt ggc 1008 Lys Tyr Tyr Gly Leu Lys Asp Val Asn Val Val Ala He Thr Ser Gly
325 330 335
get aac atg aac ttt gac aag eta agg att gtg aca gaa etc gcc aat 1056
Ala Asn Met Asn Phe Asp Lys Leu Arg He Val Thr Glu Leu Ala Asn 340 345 350
gtc ggt agg caa cag gaa get gtt ctt get act etc atg ccg gaa aaa 1104
Val Gly Arg Gin Gin Glu Ala Val Leu Ala Thr Leu Met Pro Glu Lys
355 360 365
cct gga age ttt aag caa ttt tgt gag ctg gtt gga cca atg aac ata 1152
Pro Gly Ser Phe Lys Gin Phe Cys Glu Leu Val Gly Pro Met Asn He 370 375 380
age gag ttc aaa tat aga tgt age teg gaa aag gag get gtt gta eta 1200
Ser Glu Phe Lys Tyr Arg Cys Ser Ser Glu Lys Glu Ala Val Val Leu 385 390 395 400
tac agt gtc gga gtt cac aca get gga gag etc aaa gca eta cag aag 1248 Tyr Ser Val Gly Val His Thr Ala Gly Glu Leu Lys Ala Leu Gin Lys
405 410 415
aga atg gaa tet tet caa etc aaa act gtc aat etc act ace agt gac 1296
Arg Met Glu Ser Ser Gin Leu Lys Thr Val Asn Leu Thr Thr Ser Asp 420 425 430
tta gtg aaa gat cac ctg tgt tac ttg atg gga gga aga tet act gtt 1344
Leu Val Lys Asp His Leu Cys Tyr Leu Met Gly Gly Arg Ser Thr Val
435 440 445
gga gac gag gtt eta tgc cga ttc ace ttt ccc gag aga cct ggt get 1392 Gly Asp Glu Val Leu Cys Arg Phe Thr Phe Pro Glu Arg Pro Gly Ala 450 455 460
cta atg aac ttc ttg gac tct ttc agt cca cgg tgg aac atc acc ctt 1440 Leu Met Asn Phe Leu Asp Ser Phe Ser Pro Arg Trp Asn Ile Thr Leu 465 470 475 480
ttc cat tac cat gga cag ggt gag acg ggc gcg aat gtg ctg gtc ggg 1488 Phe His Tyr His Gly Gin Gly Glu Thr Gly Ala Asn Val Leu Val Gly 485 490 495
ate caa gtc ccc gag caa gaa atg gag gaa ttt aaa aac cga get aaa 1536 He Gin Val Pro Glu Gin Glu Met Glu Glu Phe Lys Asn Arg Ala Lys 500 505 510
get ctt gga tac gac tac ttc tta gta agt gat gac gac tat ttt aag 1584 Ala Leu Gly Tyr Asp Tyr Phe Leu Val Ser Asp Asp Asp Tyr Phe Lys 515 520 525
ctt ctg atg cac tga 1599
Leu Leu Met His * 530
<210> 12 <211> 532 <212> PRT <213> Unknown <400> 12
Lys Val Ser Pro Asn Ser Leu Gin Tyr Pro Ala Gly Tyr Leu Gly Ala
1 5 10 15
Val Pro Glu Arg Thr Asn Glu Ala Glu Asn Gly Ser He Ala Glu Ala 20 25 30 Met Glu Tyr Leu Thr Asn He Leu Ser Thr Lys Val Tyr Asp He Ala 35 40 45
He Glu Ser Pro Leu Gin Leu Ala Lys Lys Leu Ser Lys Arg Leu Gly
50 55 60
Val Arg Met Tyr Leu Lys Arg Glu Asp Leu Gin Pro Val Phe Ser Phe 65 70 75 80
Lys Leu Arg Gly Ala Tyr Asn Met Met Val Lys Leu Pro Ala Asp Gin
85 90 95
Leu Ala Lys Gly Val He Cys Ser Ser Ala Gly Asn His Ala Gin Gly 100 105 110 Val Ala Leu Ser Ala Ser Lys Leu Gly Cys Thr Ala Val He Val Met 115 120 125
Pro Val Thr Thr Pro Glu He Lys Trp Gin Ala Val Glu Asn Leu Gly
130 135 140
Ala Thr Val Val Leu Phe Gly Asp Ser Tyr Asp Gin Ala Gin Ala His 145 150 155 160
Ala Lys Ile Arg Ala Glu Glu Glu Gly Leu Thr Phe Ile Pro Pro Phe
165 170 175
Asp His Pro Asp Val He Ala Gly Gin Gly Thr Val Gly Met Glu He 180 185 190 Thr Arg Gin Ala Lys Gly Pro Leu His Ala He Phe Val Pro Val Gly 195 200 205
Gly Gly Gly Leu He Ala Gly He Ala Ala Tyr Val Lys Arg Val Ser
210 215 220
Pro Glu Val Lys He He Gly Val Glu Pro Ala Asp Ala Asn Ala Met 225 230 235 240
Ala Leu Ser Leu His His Gly Glu Arg Val He Leu Asp Gin Val Gly
245 250 255
Gly Phe Ala Asp Gly Val Ala Val Lys Glu Val Gly Glu Glu Thr Phe 260 265 270 Arg He Ser Arg Asn Leu Met Asp Gly Val Val Leu Val Thr Arg Asp 275 280 285
Ala He Cys Ala Ser He Lys Asp Met Phe Glu Glu Lys Arg Asn He
290 295 300
Leu Glu Pro Ala Gly Ala Leu Ala Leu Ala Gly Ala Glu Ala Tyr Cys 305 310 315 320
Lys Tyr Tyr Gly Leu Lys Asp Val Asn Val Val Ala He Thr Ser Gly
325 330 335
Ala Asn Met Asn Phe Asp Lys Leu Arg He Val Thr Glu Leu Ala Asn 340 345 350 Val Gly Arg Gin Gin Glu Ala Val Leu Ala Thr Leu Met Pro Glu Lys 355 360 365
Pro Gly Ser Phe Lys Gln Phe Cys Glu Leu Val Gly Pro Met Asn Ile
370 375 380
Ser Glu Phe Lys Tyr Arg Cys Ser Ser Glu Lys Glu Ala Val Val Leu 385 390 395 400
Tyr Ser Val Gly Val His Thr Ala Gly Glu Leu Lys Ala Leu Gin Lys
405 410 415
Arg Met Glu Ser Ser Gin Leu Lys Thr Val Asn Leu Thr Thr Ser Asp 420 425 430 Leu Val Lys Asp His Leu Cys Tyr Leu Met Gly Gly Arg Ser Thr Val 435 440 445
Gly Asp Glu Val Leu Cys Arg Phe Thr Phe Pro Glu Arg Pro Gly Ala
450 455 460
Leu Met Asn Phe Leu Asp Ser Phe Ser Pro Arg Trp Asn He Thr Leu 465 470 475 480
Phe His Tyr His Gly Gin Gly Glu Thr Gly Ala Asn Val Leu Val Gly
485 490 495
He Gin Val Pro Glu Gin Glu Met Glu Glu Phe Lys Asn Arg Ala Lys 500 505 510 Ala Leu Gly Tyr Asp Tyr Phe Leu Val Ser Asp Asp Asp Tyr Phe Lys 515 520 525
Leu Leu Met His 530
<210> 13 <211> 720 <212> DNA
<213> Unknown <220> <221> CDS <222> (1) ... (720) <400> 13 tea ata aag gat atg ttt gag gag aaa egg aac ata ttg gaa cca gca 48 Ser He Lys Asp Met Phe Glu Glu Lys Arg Asn He Leu Glu Pro Ala 1 5 10 15
ggg get ctt gca etc get gga get gag gca tac tgt aaa tat tat ggc 96 Gly Ala Leu Ala Leu Ala Gly Ala Glu Ala Tyr Cys Lys Tyr Tyr Gly 20 25 30
eta aag gac gtg aat gtc gta gcc ata ace agt ggc get aac atg aac 144 Leu Lys Asp Val Asn Val Val Ala He Thr Ser Gly Ala Asn Met Asn 35 40 45
ttt gac aag cta agg att gtg aca gaa ctc gcc aat gtc ggt agg caa 192 Phe Asp Lys Leu Arg Ile Val Thr Glu Leu Ala Asn Val Gly Arg Gln 50 55 60
cag gaa get gtt ctt get act etc atg ccg gaa aaa cct gga age ttt 240
Gin Glu Ala Val Leu Ala Thr Leu Met Pro Glu Lys Pro Gly Ser Phe
65 70 75 80
aag caa ttt tgt gag ctg gtt gga cca atg aac ata age gag ttc aaa 288
Lys Gin Phe Cys Glu Leu Val Gly Pro Met Asn He Ser Glu Phe Lys
85 90 95
tat aga tgt age teg gaa aag gag get gtt gta eta tac agt gtc gga 336 Tyr Arg Cys Ser Ser Glu Lys Glu Ala Val Val Leu Tyr Ser Val Gly 100 105 110
gtt cac aca get gga gag etc aaa gca eta cag aag aga atg gaa tet 384 Val His Thr Ala Gly Glu Leu Lys Ala Leu Gin Lys Arg Met Glu Ser 115 120 125
tet caa etc aaa act gtc aat etc act ace agt gac tta gtg aaa gat 432 Ser Gin Leu Lys Thr Val Asn Leu Thr Thr Ser Asp Leu Val Lys Asp 130 135 140
cac ctg tgt tac ttg atg gga gga aga tct act gtt gga gac gag gtt 480 His Leu Cys Tyr Leu Met Gly Gly Arg Ser Thr Val Gly Asp Glu Val 145 150 155 160
eta tgc cga ttc ace ttt ccc gag aga cct ggt get eta atg aac ttc 528 Leu Cys Arg Phe Thr Phe Pro Glu Arg Pro Gly Ala Leu Met Asn Phe 165 170 175
ttg gac tet ttc agt cca egg tgg aac ate ace ctt ttc cat tac cat 576 Leu Asp Ser Phe Ser Pro Arg Trp Asn He Thr Leu Phe His Tyr His 180 185 190
gga cag ggt gag acg ggc gcg aat gtg ctg gtc ggg ate caa gtc ccc 624 Gly Gin Gly Glu Thr Gly Ala Asn Val Leu Val Gly He Gin Val Pro 195 200 205
gag caa gaa atg gag gaa ttt aaa aac cga get aaa get ctt gga tac 672
Glu Gin Glu Met Glu Glu Phe Lys Asn Arg Ala Lys Ala Leu Gly Tyr 210 215 220
gac tac ttc tta gta agt gat gac gac tat ttt aag ctt ctg atg cac 720
Asp Tyr Phe Leu Val Ser Asp Asp Asp Tyr Phe Lys Leu Leu Met His 225 230 235 240
<210> 14 <211> 240 <212> PRT <213> Unknown <400> 14 Ser He Lys Asp Met Phe Glu Glu Lys Arg Asn He Leu Glu Pro Ala 1 5 10 15
Gly Ala Leu Ala Leu Ala Gly Ala Glu Ala Tyr Cys Lys Tyr Tyr Gly
20 25 30
Leu Lys Asp Val Asn Val Val Ala He Thr Ser Gly Ala Asn Met Asn 35 40 45
Phe Asp Lys Leu Arg He Val Thr Glu Leu Ala Asn Val Gly Arg Gin
50 55 60
Gin Glu Ala Val Leu Ala Thr Leu Met Pro Glu Lys Pro Gly Ser Phe 65 70 75 80 Lys Gin Phe Cys Glu Leu Val Gly Pro Met Asn He Ser Glu Phe Lys
85 90 95
Tyr Arg Cys Ser Ser Glu Lys Glu Ala Val Val Leu Tyr Ser Val Gly
100 105 110
Val His Thr Ala Gly Glu Leu Lys Ala Leu Gin Lys Arg Met Glu Ser 115 120 125
Ser Gln Leu Lys Thr Val Asn Leu Thr Thr Ser Asp Leu Val Lys Asp
130 135 140
His Leu Cys Tyr Leu Met Gly Gly Arg Ser Thr Val Gly Asp Glu Val 145 150 155 160 Leu Cys Arg Phe Thr Phe Pro Glu Arg Pro Gly Ala Leu Met Asn Phe
165 170 175
Leu Asp Ser Phe Ser Pro Arg Trp Asn He Thr Leu Phe His Tyr His
180 185 190
Gly Gin Gly Glu Thr Gly Ala Asn Val Leu Val Gly He Gin Val Pro 195 200 205
Glu Gin Glu Met Glu Glu Phe Lys Asn Arg Ala Lys Ala Leu Gly Tyr
210 215 220
Asp Tyr Phe Leu Val Ser Asp Asp Asp Tyr Phe Lys Leu Leu Met His 225 230 235 240
<210> 15
<211> 81
<212> DNA
<213> Unknown
<220> <221> CDS
<222> (1) ... (81)
<400> 15 gtc aat ctc act acc agt gac tta gtg aaa gat cac ctg tgt tac ttg 48 Val Asn Leu Thr Thr Ser Asp Leu Val Lys Asp His Leu Cys Tyr Leu 1 5 10 15
atg gga gga aga tct act gtt gga gac gag gtt 81
Met Gly Gly Arg Ser Thr Val Gly Asp Glu Val 20 25
<210> 16 <211> 27 <212> PRT <213> Unknown <400> 16
Val Asn Leu Thr Thr Ser Asp Leu Val Lys Asp His Leu Cys Tyr Leu
1 5 10 15
Met Gly Gly Arg Ser Thr Val Gly Asp Glu Val 20 25
<210> 17 <211> 75 <212> DNA <213> Unknown <220>
<221> CDS <222> (1) ... (75) <400> 17 tgg aac atc acc ctt ttc cat tac cat gga cag ggt gag acg ggc gcg 48 Trp Asn Ile Thr Leu Phe His Tyr His Gly Gln Gly Glu Thr Gly Ala 1 5 10 15
aat gtg ctg gtc ggg atc caa gtc ccc 75
Asn Val Leu Val Gly Ile Gln Val Pro 20 25
<210> 18 <211> 25 <212> PRT <213> Unknown <400> 18
Trp Asn He Thr Leu Phe His Tyr His Gly Gin Gly Glu Thr Gly Ala
1 5 10 15
Asn Val Leu Val Gly He Gin Val Pro 20 25
<210> 19
<211> 2235
<212> DNA
<213> Unknown
<220> <221> CDS
<222> (1) ... (1773)
<400> 19 gaa ttc ggc acg agg acg gcg caa tec tet etc cgt age cac att cac 48 Glu Phe Gly Thr Arg Thr Ala Gin Ser Ser Leu Arg Ser His He His 1 5 10 15
cgt cca tea aaa cca gtg gtc gga ttc act cac ttc tec tec cgt tet 96
Arg Pro Ser Lys Pro Val Val Gly Phe Thr His Phe Ser Ser Arg Ser 20 25 30
egg ate gca gtg gcg gtt ctg tec cga gat gaa aca tet atg act cca 144
Arg He Ala Val Ala Val Leu Ser Arg Asp Glu Thr Ser Met Thr Pro 35 40 45
ccg cct cca aag ctt cct tta cca cgt ctt aag gtc tet ccg aat teg 192 Pro Pro Pro Lys Leu Pro Leu Pro Arg Leu Lys Val Ser Pro Asn Ser 50 55 60
-119- ttg caa tac cct gcc ggt tac etc ggt get gta cca gaa cgt acg aac 240
Leu Gin Tyr Pro Ala Gly Tyr Leu Gly Ala Val Pro Glu Arg Thr Asn
65 70 75 80
gag get gag aac gga age ate gcg gaa get atg gag tat ttg acg aat 288
Glu Ala Glu Asn Gly Ser He Ala Glu Ala Met Glu Tyr Leu Thr Asn 65 90 95
ata ctg tec act aag gtt tac gac ate gcc att gag tea cca etc caa 336 He Leu Ser Thr Lys Val Tyr Asp He Ala He Glu Ser Pro Leu Gin
100 105 110
ttg get aag aag eta tet aag aga tta ggt gtt cgt atg tat ctt aaa 384
Leu Ala Lys Lys Leu Ser Lys Arg Leu Gly Val Arg Met Tyr Leu Lys 115 120 125
aga gaa gac ttg caa cct gta ttc teg ttt aag ctt cgt gga get tac 432 Arg Glu Asp Leu Gin Pro Val Phe Ser Phe Lys Leu Arg Gly Ala Tyr 130 135 140
aat atg atg gtg aaa ctt cca gca gat caa ttg gca aaa gga gtt ate 480 Asn Met Met Val Lys Leu Pro Ala Asp Gin Leu Ala Lys Gly Val He 145 150 155 160
tgc tet tea get gga aac cat get caa gga gtt get tta tet get agt 528 Cys Ser Ser Ala Gly Asn His Ala Gin Gly Val Ala Leu Ser Ala Ser 165 170 175
aaa etc ggc tgc act get gtg att gtt atg cct gtt acg act cct gag 576 Lys Leu Gly Cys Thr Ala Val He Val Met Pro Val Thr Thr Pro Glu 180 185 190
ata aag tgg caa get gta gag aat ttg ggt gca acg gtt gtt ctt ttc 624 He Lys Trp Gin Ala Val Glu Asn Leu Gly Ala Thr Val Val Leu Phe 195 200 205
gga gat teg tat gat caa gca caa gca cat get aag ata cga get gaa 672 Gly Asp Ser Tyr Asp Gin Ala Gin Ala His Ala Lys He Arg Ala Glu 210 215 220
gaa gag ggt ctg acg ttt ata cct cct ttt gat cac cct gat gtt att 720 Glu Glu Gly Leu Thr Phe He Pro Pro Phe Asp His Pro Asp Val He 225 230 235 240
cgt gga caa ggg act gtt ggg atg gag atc act cgt cag gct aag ggt 768
Arg Gly Gln Gly Thr Val Gly Met Glu Ile Thr Arg Gln Ala Lys Gly
245 250 255
cca ttg cat gct ata ttt gtg cca gtt ggt ggt ggt ggt tta ata gct 816
Pro Leu His Ala Ile Phe Val Pro Val Gly Gly Gly Gly Leu Ile Ala
260 265 270
ggt att gct gct tat gtg aag agg gtt tct ccc gag gtg aag atc att 864
Gly Ile Ala Ala Tyr Val Lys Arg Val Ser Pro Glu Val Lys Ile Ile
275 280 285
ggt gta gaa cca gct gac gca aat gca atg gct ttg tcg ctg cat cac 912
Gly Val Glu Pro Ala Asp Ala Asn Ala Met Ala Leu Ser Leu His His
290 295 300
ggt gag agg gtg ata ttg gac cag gtt ggg gga ttt gca gat ggt gta 960
Gly Glu Arg Val Ile Leu Asp Gln Val Gly Gly Phe Ala Asp Gly Val
305 310 315 320
gca gtt aaa gaa gtt ggt gaa gag act ttt cgt ata agc aga aat cta 1008
Ala Val Lys Glu Val Gly Glu Glu Thr Phe Arg Ile Ser Arg Asn Leu
325 330 335
atg gat ggt gtt gtt ctt gtc act cgt gat gct att tgt gca tca ata 1056
Met Asp Gly Val Val Leu Val Thr Arg Asp Ala Ile Cys Ala Ser Ile
340 345 350
aag gat atg ttt gag gag aaa cgg aac ata ttg gaa cca gca ggg gct 1104
Lys Asp Met Phe Glu Glu Lys Arg Asn Ile Leu Glu Pro Ala Gly Ala
355 360 365
ctt gca ctc gct gga gct gag gca tac tgt aaa tat tat ggc cta aag 1152
Leu Ala Leu Ala Gly Ala Glu Ala Tyr Cys Lys Tyr Tyr Gly Leu Lys
370 375 380
gac gtg aat gtc gta gcc ata acc agt ggc gct aac atg aac ttt gac 1200
Asp Val Asn Val Val Ala Ile Thr Ser Gly Ala Asn Met Asn Phe Asp
385 390 395 400
aag cta agg att gtg aca gaa ctc gcc aat gtc ggt agg caa cag gaa 1248
Lys Leu Arg Ile Val Thr Glu Leu Ala Asn Val Gly Arg Gln Gln Glu
405 410 415
gct gtt ctt gct act ctc atg ccg gaa aaa cct gga agc ttt aag caa 1296
Ala Val Leu Ala Thr Leu Met Pro Glu Lys Pro Gly Ser Phe Lys Gln
420 425 430
ttt tgt gag ctg gtt gga cca atg aac ata agc gag ttc aaa tat aga 1344
Phe Cys Glu Leu Val Gly Pro Met Asn Ile Ser Glu Phe Lys Tyr Arg
435 440 445
tgt agc tcg gaa aag gag gct gtt gta cta tac agt gtc gga gtt cac 1392
Cys Ser Ser Glu Lys Glu Ala Val Val Leu Tyr Ser Val Gly Val His
450 455 460
aca gct gga gag ctc aaa gca cta cag aag aga atg gaa tct tct caa 1440
Thr Ala Gly Glu Leu Lys Ala Leu Gln Lys Arg Met Glu Ser Ser Gln
465 470 475 480
ctc aaa act gtc aat ctc act acc agt gac tta gtg aaa gat cac ctg 1488
Leu Lys Thr Val Asn Leu Thr Thr Ser Asp Leu Val Lys Asp His Leu
485 490 495
tgt tac ttg atg gga gga aga tct act gtt gga gac gag gtt cta tgc 1536
Cys Tyr Leu Met Gly Gly Arg Ser Thr Val Gly Asp Glu Val Leu Cys
500 505 510
cga ttc acc ttt ccc gag aga cct ggt gct cta atg aac ttc ttg gac 1584
Arg Phe Thr Phe Pro Glu Arg Pro Gly Ala Leu Met Asn Phe Leu Asp
515 520 525
tct ttc agt cca cgg tgg aac atc acc ctt ttc cat tac cat gga cag 1632
Ser Phe Ser Pro Arg Trp Asn Ile Thr Leu Phe His Tyr His Gly Gln
530 535 540
ggt gag acg ggc gcg aat gtg ctg gtc ggg atc caa gtc ccc gag caa 1680
Gly Glu Thr Gly Ala Asn Val Leu Val Gly Ile Gln Val Pro Glu Gln
545 550 555 560
gaa atg gag gaa ttt aaa aac cga gct aaa gct ctt gga tac gac tac 1728
Glu Met Glu Glu Phe Lys Asn Arg Ala Lys Ala Leu Gly Tyr Asp Tyr
565 570 575
ttc tta gta agt gat gac gac tat ttt aag ctt ctg atg cac tga 1773
Phe Leu Val Ser Asp Asp Asp Tyr Phe Lys Leu Leu Met His *
580 585 590
gtttgaagct gtggtggata atccaaatct caggaagaag aagaacccat gagagtcttc 1833
ctcgtgatca tggttgttct tgagattctt tagtctgttt tctctcgggt ctgtgtctgt 1893
cggatgagcg ttttagccac tgtagttcaa tgagtaacct ctatttgcta cgaactctca 1953
ttcctagatc gtgggttacc ttttggtttc tccaagcaat ttgaggctag cctccaataa 2013
aaaatagtat ttctagtatt tgaaaaaacg ctactttcgt ggtatagaga aagataaaga 2073
gagagagaga gagagagaga gagagagaga gagagagaga gagagagaga gagagatgct 2133
cttgatattg ctcttgatac aactctatta ttattgctct taatccataa tgaaagtgct 2193
ttatgaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaactcg ag 2235
<210> 20
<211> 590
<212> PRT
<213> Unknown
<400> 20
Glu Phe Gly Thr Arg Thr Ala Gln Ser Ser Leu Arg Ser His Ile His
1 5 10 15
Arg Pro Ser Lys Pro Val Val Gly Phe Thr His Phe Ser Ser Arg Ser
20 25 30
Arg Ile Ala Val Ala Val Leu Ser Arg Asp Glu Thr Ser Met Thr Pro
35 40 45
Pro Pro Pro Lys Leu Pro Leu Pro Arg Leu Lys Val Ser Pro Asn Ser
50 55 60
Leu Gln Tyr Pro Ala Gly Tyr Leu Gly Ala Val Pro Glu Arg Thr Asn
65 70 75 80
Glu Ala Glu Asn Gly Ser Ile Ala Glu Ala Met Glu Tyr Leu Thr Asn
85 90 95
Ile Leu Ser Thr Lys Val Tyr Asp Ile Ala Ile Glu Ser Pro Leu Gln
100 105 110
Leu Ala Lys Lys Leu Ser Lys Arg Leu Gly Val Arg Met Tyr Leu Lys
115 120 125
Arg Glu Asp Leu Gln Pro Val Phe Ser Phe Lys Leu Arg Gly Ala Tyr
130 135 140
Asn Met Met Val Lys Leu Pro Ala Asp Gln Leu Ala Lys Gly Val Ile
145 150 155 160
Cys Ser Ser Ala Gly Asn His Ala Gln Gly Val Ala Leu Ser Ala Ser
165 170 175
Lys Leu Gly Cys Thr Ala Val Ile Val Met Pro Val Thr Thr Pro Glu
180 185 190
Ile Lys Trp Gln Ala Val Glu Asn Leu Gly Ala Thr Val Val Leu Phe
195 200 205
Gly Asp Ser Tyr Asp Gln Ala Gln Ala His Ala Lys Ile Arg Ala Glu
210 215 220
Glu Glu Gly Leu Thr Phe Ile Pro Pro Phe Asp His Pro Asp Val Ile
225 230 235 240
Arg Gly Gln Gly Thr Val Gly Met Glu Ile Thr Arg Gln Ala Lys Gly
245 250 255
Pro Leu His Ala Ile Phe Val Pro Val Gly Gly Gly Gly Leu Ile Ala
260 265 270
Gly Ile Ala Ala Tyr Val Lys Arg Val Ser Pro Glu Val Lys Ile Ile
275 280 285
Gly Val Glu Pro Ala Asp Ala Asn Ala Met Ala Leu Ser Leu His His
290 295 300
Gly Glu Arg Val Ile Leu Asp Gln Val Gly Gly Phe Ala Asp Gly Val
305 310 315 320
Ala Val Lys Glu Val Gly Glu Glu Thr Phe Arg Ile Ser Arg Asn Leu
325 330 335
Met Asp Gly Val Val Leu Val Thr Arg Asp Ala Ile Cys Ala Ser Ile
340 345 350
Lys Asp Met Phe Glu Glu Lys Arg Asn Ile Leu Glu Pro Ala Gly Ala
355 360 365
Leu Ala Leu Ala Gly Ala Glu Ala Tyr Cys Lys Tyr Tyr Gly Leu Lys
370 375 380
Asp Val Asn Val Val Ala Ile Thr Ser Gly Ala Asn Met Asn Phe Asp
385 390 395 400
Lys Leu Arg Ile Val Thr Glu Leu Ala Asn Val Gly Arg Gln Gln Glu
405 410 415
Ala Val Leu Ala Thr Leu Met Pro Glu Lys Pro Gly Ser Phe Lys Gln
420 425 430
Phe Cys Glu Leu Val Gly Pro Met Asn Ile Ser Glu Phe Lys Tyr Arg
435 440 445
Cys Ser Ser Glu Lys Glu Ala Val Val Leu Tyr Ser Val Gly Val His
450 455 460
Thr Ala Gly Glu Leu Lys Ala Leu Gln Lys Arg Met Glu Ser Ser Gln
465 470 475 480
Leu Lys Thr Val Asn Leu Thr Thr Ser Asp Leu Val Lys Asp His Leu
485 490 495
Cys Tyr Leu Met Gly Gly Arg Ser Thr Val Gly Asp Glu Val Leu Cys
500 505 510
Arg Phe Thr Phe Pro Glu Arg Pro Gly Ala Leu Met Asn Phe Leu Asp
515 520 525
Ser Phe Ser Pro Arg Trp Asn Ile Thr Leu Phe His Tyr His Gly Gln
530 535 540
Gly Glu Thr Gly Ala Asn Val Leu Val Gly Ile Gln Val Pro Glu Gln
545 550 555 560
Glu Met Glu Glu Phe Lys Asn Arg Ala Lys Ala Leu Gly Tyr Asp Tyr
565 570 575
Phe Leu Val Ser Asp Asp Asp Tyr Phe Lys Leu Leu Met His
580 585 590
<210> 21
<211> 191
<212> DNA
<213> Unknown
<220>
<221> CDS
<222> (1)...(189)
<400> 21
atg aat tcc gtt cag ctt ccg acg gcg caa tcc tct ctc cgt agc cac 48
Met Asn Ser Val Gln Leu Pro Thr Ala Gln Ser Ser Leu Arg Ser His
1 5 10 15
att cac cgt cca tca aaa cca gtg gtc gga ttc act cac ttc tcc tcc 96
Ile His Arg Pro Ser Lys Pro Val Val Gly Phe Thr His Phe Ser Ser
20 25 30
cgt tct cgg atc gca gtg gcg gtt ctg tcc cga gat gaa aca tct atg 144
Arg Ser Arg Ile Ala Val Ala Val Leu Ser Arg Asp Glu Thr Ser Met
35 40 45
act cca ccg cct cca aag ctt cct tta cca cgt ctt aag gtc tct 189
Thr Pro Pro Pro Pro Lys Leu Pro Leu Pro Arg Leu Lys Val Ser
50 55 60
cc 191
<210> 22
<211> 63
<212> PRT
<213> Unknown
<400> 22
Met Asn Ser Val Gln Leu Pro Thr Ala Gln Ser Ser Leu Arg Ser His
1 5 10 15
Ile His Arg Pro Ser Lys Pro Val Val Gly Phe Thr His Phe Ser Ser
20 25 30
Arg Ser Arg Ile Ala Val Ala Val Leu Ser Arg Asp Glu Thr Ser Met
35 40 45
Thr Pro Pro Pro Pro Lys Leu Pro Leu Pro Arg Leu Lys Val Ser
50 55 60
<210> 23
<211> 2241
<212> DNA
<213> Unknown
<220>
<221> CDS
<222> (1)...(1779)
<400> 23
atg aat tcc gtt cag ctt ccg acg gcg caa tcc tct ctc cgt agc cac 48
Met Asn Ser Val Gln Leu Pro Thr Ala Gln Ser Ser Leu Arg Ser His
1 5 10 15
att cac cgt cca tca aaa cca gtg gtc gga ttc act cac ttc tcc tcc 96
Ile His Arg Pro Ser Lys Pro Val Val Gly Phe Thr His Phe Ser Ser
20 25 30
cgt tct cgg atc gca gtg gcg gtt ctg tcc cga gat gaa aca tct atg 144
Arg Ser Arg Ile Ala Val Ala Val Leu Ser Arg Asp Glu Thr Ser Met
35 40 45
act cca ccg cct cca aag ctt cct tta cca cgt ctt aag gtc tct ccg 192
Thr Pro Pro Pro Pro Lys Leu Pro Leu Pro Arg Leu Lys Val Ser Pro
50 55 60
aat tcg ttg caa tac cct gcc ggt tac ctc ggt gct gta cca gaa cgt 240
Asn Ser Leu Gln Tyr Pro Ala Gly Tyr Leu Gly Ala Val Pro Glu Arg
65 70 75 80
acg aac gag gct gag aac gga agc atc gcg gaa gct atg gag tat ttg 288
Thr Asn Glu Ala Glu Asn Gly Ser Ile Ala Glu Ala Met Glu Tyr Leu
85 90 95
acg aat ata ctg tcc act aag gtt tac gac atc gcc att gag tca cca 336
Thr Asn Ile Leu Ser Thr Lys Val Tyr Asp Ile Ala Ile Glu Ser Pro
100 105 110
ctc caa ttg gct aag aag cta tct aag aga tta ggt gtt cgt atg tat 384
Leu Gln Leu Ala Lys Lys Leu Ser Lys Arg Leu Gly Val Arg Met Tyr
115 120 125
ctt aaa aga gaa gac ttg caa cct gta ttc tcg ttt aag ctt cgt gga 432
Leu Lys Arg Glu Asp Leu Gln Pro Val Phe Ser Phe Lys Leu Arg Gly
130 135 140
gct tac aat atg atg gtg aaa ctt cca gca gat caa ttg gca aaa gga 480
Ala Tyr Asn Met Met Val Lys Leu Pro Ala Asp Gln Leu Ala Lys Gly
145 150 155 160
gtt atc tgc tct tca gct gga aac cat gct caa gga gtt gct tta tct 528
Val Ile Cys Ser Ser Ala Gly Asn His Ala Gln Gly Val Ala Leu Ser
165 170 175
gct agt aaa ctc ggc tgc act gct gtg att gtt atg cct gtt acg act 576
Ala Ser Lys Leu Gly Cys Thr Ala Val Ile Val Met Pro Val Thr Thr
180 185 190
cct gag ata aag tgg caa gct gta gag aat ttg ggt gca acg gtt gtt 624
Pro Glu Ile Lys Trp Gln Ala Val Glu Asn Leu Gly Ala Thr Val Val
195 200 205
ctt ttc gga gat tcg tat gat caa gca caa gca cat gct aag ata cga 672
Leu Phe Gly Asp Ser Tyr Asp Gln Ala Gln Ala His Ala Lys Ile Arg
210 215 220
gct gaa gaa gag ggt ctg acg ttt ata cct cct ttt gat cac cct gat 720
Ala Glu Glu Glu Gly Leu Thr Phe Ile Pro Pro Phe Asp His Pro Asp
225 230 235 240
gtt att cgt gga caa ggg act gtt ggg atg gag atc act cgt cag gct 768
Val Ile Arg Gly Gln Gly Thr Val Gly Met Glu Ile Thr Arg Gln Ala
245 250 255
aag ggt cca ttg cat gct ata ttt gtg cca gtt ggt ggt ggt ggt tta 816
Lys Gly Pro Leu His Ala Ile Phe Val Pro Val Gly Gly Gly Gly Leu
260 265 270
ata gct ggt att gct gct tat gtg aag agg gtt tct ccc gag gtg aag 864
Ile Ala Gly Ile Ala Ala Tyr Val Lys Arg Val Ser Pro Glu Val Lys
275 280 285
atc att ggt gta gaa cca gct gac gca aat gca atg gct ttg tcg ctg 912
Ile Ile Gly Val Glu Pro Ala Asp Ala Asn Ala Met Ala Leu Ser Leu
290 295 300
cat cac ggt gag agg gtg ata ttg gac cag gtt ggg gga ttt gca gat 960
His His Gly Glu Arg Val Ile Leu Asp Gln Val Gly Gly Phe Ala Asp
305 310 315 320
ggt gta gca gtt aaa gaa gtt ggt gaa gag act ttt cgt ata agc aga 1008
Gly Val Ala Val Lys Glu Val Gly Glu Glu Thr Phe Arg Ile Ser Arg
325 330 335
aat cta atg gat ggt gtt gtt ctt gtc act cgt gat gct att tgt gca 1056
Asn Leu Met Asp Gly Val Val Leu Val Thr Arg Asp Ala Ile Cys Ala
340 345 350
tca ata aag gat atg ttt gag gag aaa cgg aac ata ttg gaa cca gca 1104
Ser Ile Lys Asp Met Phe Glu Glu Lys Arg Asn Ile Leu Glu Pro Ala
355 360 365
ggg gct ctt gca ctc gct gga gct gag gca tac tgt aaa tat tat ggc 1152
Gly Ala Leu Ala Leu Ala Gly Ala Glu Ala Tyr Cys Lys Tyr Tyr Gly
370 375 380
cta aag gac gtg aat gtc gta gcc ata acc agt ggc gct aac atg aac 1200
Leu Lys Asp Val Asn Val Val Ala Ile Thr Ser Gly Ala Asn Met Asn
385 390 395 400
ttt gac aag cta agg att gtg aca gaa ctc gcc aat gtc ggt agg caa 1248
Phe Asp Lys Leu Arg Ile Val Thr Glu Leu Ala Asn Val Gly Arg Gln
405 410 415
cag gaa gct gtt ctt gct act ctc atg ccg gaa aaa cct gga agc ttt 1296
Gln Glu Ala Val Leu Ala Thr Leu Met Pro Glu Lys Pro Gly Ser Phe
420 425 430
aag caa ttt tgt gag ctg gtt gga cca atg aac ata agc gag ttc aaa 1344
Lys Gln Phe Cys Glu Leu Val Gly Pro Met Asn Ile Ser Glu Phe Lys
435 440 445
tat aga tgt agc tcg gaa aag gag gct gtt gta cta tac agt gtc gga 1392
Tyr Arg Cys Ser Ser Glu Lys Glu Ala Val Val Leu Tyr Ser Val Gly
450 455 460
gtt cac aca gct gga gag ctc aaa gca cta cag aag aga atg gaa tct 1440
Val His Thr Ala Gly Glu Leu Lys Ala Leu Gln Lys Arg Met Glu Ser
465 470 475 480
tct caa ctc aaa act gtc aat ctc act acc agt gac tta gtg aaa gat 1488
Ser Gln Leu Lys Thr Val Asn Leu Thr Thr Ser Asp Leu Val Lys Asp
485 490 495
cac ctg tgt tac ttg atg gga gga aga tct act gtt gga gac gag gtt 1536
His Leu Cys Tyr Leu Met Gly Gly Arg Ser Thr Val Gly Asp Glu Val
500 505 510
cta tgc cga ttc acc ttt ccc gag aga cct ggt gct cta atg aac ttc 1584
Leu Cys Arg Phe Thr Phe Pro Glu Arg Pro Gly Ala Leu Met Asn Phe
515 520 525
ttg gac tct ttc agt cca cgg tgg aac atc acc ctt ttc cat tac cat 1632
Leu Asp Ser Phe Ser Pro Arg Trp Asn Ile Thr Leu Phe His Tyr His
530 535 540
gga cag ggt gag acg ggc gcg aat gtg ctg gtc ggg atc caa gtc ccc 1680
Gly Gln Gly Glu Thr Gly Ala Asn Val Leu Val Gly Ile Gln Val Pro
545 550 555 560
gag caa gaa atg gag gaa ttt aaa aac cga gct aaa gct ctt gga tac 1728
Glu Gln Glu Met Glu Glu Phe Lys Asn Arg Ala Lys Ala Leu Gly Tyr
565 570 575
gac tac ttc tta gta agt gat gac gac tat ttt aag ctt ctg atg cac 1776
Asp Tyr Phe Leu Val Ser Asp Asp Asp Tyr Phe Lys Leu Leu Met His
580 585 590
tga gtttgaagct gtggtggata atccaaatct caggaagaag aagaacccat 1829
gagagtcttc ctcgtgatca tggttgttct tgagattctt tagtctgttt tctctcgggt 1889
ctgtgtctgt cggatgagcg ttttagccac tgtagttcaa tgagtaacct ctatttgcta 1949
cgaactctca ttcctagatc gtgggttacc ttttggtttc tccaagcaat ttgaggctag 2009
cctccaataa aaaatagtat ttctagtatt tgaaaaaacg ctactttcgt ggtatagaga 2069
aagataaaga gagagagaga gagagagaga gagagagaga gagagagaga gagagagaga 2129
gagagatgct cttgatattg ctcttgatac aactctatta ttattgctct taatccataa 2189
tgaaagtgct ttatgaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaactcg ag 2241
<210> 24
<211> 592
<212> PRT
<213> Unknown
<400> 24
Met Asn Ser Val Gln Leu Pro Thr Ala Gln Ser Ser Leu Arg Ser His
1 5 10 15
Ile His Arg Pro Ser Lys Pro Val Val Gly Phe Thr His Phe Ser Ser
20 25 30
Arg Ser Arg Ile Ala Val Ala Val Leu Ser Arg Asp Glu Thr Ser Met
35 40 45
Thr Pro Pro Pro Pro Lys Leu Pro Leu Pro Arg Leu Lys Val Ser Pro
50 55 60
Asn Ser Leu Gln Tyr Pro Ala Gly Tyr Leu Gly Ala Val Pro Glu Arg
65 70 75 80
Thr Asn Glu Ala Glu Asn Gly Ser Ile Ala Glu Ala Met Glu Tyr Leu
85 90 95
Thr Asn Ile Leu Ser Thr Lys Val Tyr Asp Ile Ala Ile Glu Ser Pro
100 105 110
Leu Gln Leu Ala Lys Lys Leu Ser Lys Arg Leu Gly Val Arg Met Tyr
115 120 125
Leu Lys Arg Glu Asp Leu Gln Pro Val Phe Ser Phe Lys Leu Arg Gly
130 135 140
Ala Tyr Asn Met Met Val Lys Leu Pro Ala Asp Gln Leu Ala Lys Gly
145 150 155 160
Val Ile Cys Ser Ser Ala Gly Asn His Ala Gln Gly Val Ala Leu Ser
165 170 175
Ala Ser Lys Leu Gly Cys Thr Ala Val Ile Val Met Pro Val Thr Thr
180 185 190
Pro Glu Ile Lys Trp Gln Ala Val Glu Asn Leu Gly Ala Thr Val Val
195 200 205
Leu Phe Gly Asp Ser Tyr Asp Gln Ala Gln Ala His Ala Lys Ile Arg
210 215 220
Ala Glu Glu Glu Gly Leu Thr Phe Ile Pro Pro Phe Asp His Pro Asp
225 230 235 240
Val Ile Arg Gly Gln Gly Thr Val Gly Met Glu Ile Thr Arg Gln Ala
245 250 255
Lys Gly Pro Leu His Ala Ile Phe Val Pro Val Gly Gly Gly Gly Leu
260 265 270
Ile Ala Gly Ile Ala Ala Tyr Val Lys Arg Val Ser Pro Glu Val Lys
275 280 285
Ile Ile Gly Val Glu Pro Ala Asp Ala Asn Ala Met Ala Leu Ser Leu
290 295 300
His His Gly Glu Arg Val Ile Leu Asp Gln Val Gly Gly Phe Ala Asp
305 310 315 320
Gly Val Ala Val Lys Glu Val Gly Glu Glu Thr Phe Arg Ile Ser Arg
325 330 335
Asn Leu Met Asp Gly Val Val Leu Val Thr Arg Asp Ala Ile Cys Ala
340 345 350
Ser Ile Lys Asp Met Phe Glu Glu Lys Arg Asn Ile Leu Glu Pro Ala
355 360 365
Gly Ala Leu Ala Leu Ala Gly Ala Glu Ala Tyr Cys Lys Tyr Tyr Gly
370 375 380
Leu Lys Asp Val Asn Val Val Ala Ile Thr Ser Gly Ala Asn Met Asn
385 390 395 400
Phe Asp Lys Leu Arg Ile Val Thr Glu Leu Ala Asn Val Gly Arg Gln
405 410 415
Gln Glu Ala Val Leu Ala Thr Leu Met Pro Glu Lys Pro Gly Ser Phe
420 425 430
Lys Gln Phe Cys Glu Leu Val Gly Pro Met Asn Ile Ser Glu Phe Lys
435 440 445
Tyr Arg Cys Ser Ser Glu Lys Glu Ala Val Val Leu Tyr Ser Val Gly
450 455 460
Val His Thr Ala Gly Glu Leu Lys Ala Leu Gln Lys Arg Met Glu Ser
465 470 475 480
Ser Gln Leu Lys Thr Val Asn Leu Thr Thr Ser Asp Leu Val Lys Asp
485 490 495
His Leu Cys Tyr Leu Met Gly Gly Arg Ser Thr Val Gly Asp Glu Val
500 505 510
Leu Cys Arg Phe Thr Phe Pro Glu Arg Pro Gly Ala Leu Met Asn Phe
515 520 525
Leu Asp Ser Phe Ser Pro Arg Trp Asn Ile Thr Leu Phe His Tyr His
530 535 540
Gly Gln Gly Glu Thr Gly Ala Asn Val Leu Val Gly Ile Gln Val Pro
545 550 555 560
Glu Gln Glu Met Glu Glu Phe Lys Asn Arg Ala Lys Ala Leu Gly Tyr
565 570 575
Asp Tyr Phe Leu Val Ser Asp Asp Asp Tyr Phe Lys Leu Leu Met His
580 585 590
<210> 25
<211> 1638
<212> DNA
<213> Unknown
<220>
<221> CDS
<222> (1)...(1638)
<400> 25
atg act cca ccg cct cca aag ctt cct tta cca cgt ctt aag gtc tct 48
Met Thr Pro Pro Pro Pro Lys Leu Pro Leu Pro Arg Leu Lys Val Ser
1 5 10 15
ccg aat tcg ttg caa tac cct gcc ggt tac ctc ggt gct gta cca gaa 96
Pro Asn Ser Leu Gln Tyr Pro Ala Gly Tyr Leu Gly Ala Val Pro Glu
20 25 30
cgt acg aac gag gct gag aac gga agc atc gcg gaa gct atg gag tat 144
Arg Thr Asn Glu Ala Glu Asn Gly Ser Ile Ala Glu Ala Met Glu Tyr
35 40 45
ttg acg aat ata ctg tcc act aag gtt tac gac atc gcc att gag tca 192
Leu Thr Asn Ile Leu Ser Thr Lys Val Tyr Asp Ile Ala Ile Glu Ser
50 55 60
cca ctc caa ttg gct aag aag cta tct aag aga tta ggt gtt cgt atg 240
Pro Leu Gln Leu Ala Lys Lys Leu Ser Lys Arg Leu Gly Val Arg Met
65 70 75 80
tat ctt aaa aga gaa gac ttg caa cct gta ttc tcg ttt aag ctt cgt 288
Tyr Leu Lys Arg Glu Asp Leu Gln Pro Val Phe Ser Phe Lys Leu Arg
85 90 95
gga gct tac aat atg atg gtg aaa ctt cca gca gat caa ttg gca aaa 336
Gly Ala Tyr Asn Met Met Val Lys Leu Pro Ala Asp Gln Leu Ala Lys
100 105 110
gga gtt atc tgc tct tca gct gga aac cat gct caa gga gtt gct tta 384
Gly Val Ile Cys Ser Ser Ala Gly Asn His Ala Gln Gly Val Ala Leu
115 120 125
tct gct agt aaa ctc ggc tgc act gct gtg att gtt atg cct gtt acg 432
Ser Ala Ser Lys Leu Gly Cys Thr Ala Val Ile Val Met Pro Val Thr
130 135 140
act cct gag ata aag tgg caa gct gta gag aat ttg ggt gca acg gtt 480
Thr Pro Glu Ile Lys Trp Gln Ala Val Glu Asn Leu Gly Ala Thr Val
145 150 155 160
gtt ctt ttc gga gat tcg tat gat caa gca caa gca cat gct aag ata 528
Val Leu Phe Gly Asp Ser Tyr Asp Gln Ala Gln Ala His Ala Lys Ile
165 170 175
cga gct gaa gaa gag ggt ctg acg ttt ata cct cct ttt gat cac cct 576
Arg Ala Glu Glu Glu Gly Leu Thr Phe Ile Pro Pro Phe Asp His Pro
180 185 190
gat gtt att gct gga caa ggg act gtt ggg atg gag atc act cgt cag 624
Asp Val Ile Ala Gly Gln Gly Thr Val Gly Met Glu Ile Thr Arg Gln
195 200 205
gct aag ggt cca ttg cat gct ata ttt gtg cca gtt ggt ggt ggt ggt 672
Ala Lys Gly Pro Leu His Ala Ile Phe Val Pro Val Gly Gly Gly Gly
210 215 220
tta ata gct ggt att gct gct tat gtg aag agg gtt tct ccc gag gtg 720
Leu Ile Ala Gly Ile Ala Ala Tyr Val Lys Arg Val Ser Pro Glu Val
225 230 235 240
aag atc att ggt gta gaa cca gct gac gca aat gca atg gct ttg tcg 768
Lys Ile Ile Gly Val Glu Pro Ala Asp Ala Asn Ala Met Ala Leu Ser
245 250 255
ctg cat cac ggt gag agg gtg ata ttg gac cag gtt ggg gga ttt gca 816
Leu His His Gly Glu Arg Val Ile Leu Asp Gln Val Gly Gly Phe Ala
260 265 270
gat ggt gta gca gtt aaa gaa gtt ggt gaa gag act ttt cgt ata agc 864
Asp Gly Val Ala Val Lys Glu Val Gly Glu Glu Thr Phe Arg Ile Ser
275 280 285
aga aat cta atg gat ggt gtt gtt ctt gtc act cgt gat gct att tgt 912
Arg Asn Leu Met Asp Gly Val Val Leu Val Thr Arg Asp Ala Ile Cys
290 295 300
gca tca ata aag gat atg ttt gag gag aaa cgg aac ata ttg gaa cca 960
Ala Ser Ile Lys Asp Met Phe Glu Glu Lys Arg Asn Ile Leu Glu Pro
305 310 315 320
gca ggg gct ctt gca ctc gct gga gct gag gca tac tgt aaa tat tat 1008
Ala Gly Ala Leu Ala Leu Ala Gly Ala Glu Ala Tyr Cys Lys Tyr Tyr
325 330 335
ggc cta aag gac gtg aat gtc gta gcc ata acc agt ggc gct aac atg 1056
Gly Leu Lys Asp Val Asn Val Val Ala Ile Thr Ser Gly Ala Asn Met
340 345 350
aac ttt gac aag cta agg att gtg aca gaa ctc gcc aat gtc ggt agg 1104
Asn Phe Asp Lys Leu Arg Ile Val Thr Glu Leu Ala Asn Val Gly Arg
355 360 365
caa cag gaa gct gtt ctt gct act ctc atg ccg gaa aaa cct gga agc 1152
Gln Gln Glu Ala Val Leu Ala Thr Leu Met Pro Glu Lys Pro Gly Ser
370 375 380
ttt aag caa ttt tgt gag ctg gtt gga cca atg aac ata agc gag ttc 1200
Phe Lys Gln Phe Cys Glu Leu Val Gly Pro Met Asn Ile Ser Glu Phe
385 390 395 400
aaa tat aga tgt agc tcg gaa aag gag gct gtt gta cta tac agt gtc 1248
Lys Tyr Arg Cys Ser Ser Glu Lys Glu Ala Val Val Leu Tyr Ser Val
405 410 415
gga gtt cac aca gct gga gag ctc aaa gca cta cag aag aga atg gaa 1296
Gly Val His Thr Ala Gly Glu Leu Lys Ala Leu Gln Lys Arg Met Glu
420 425 430
tct tct caa ctc aaa act gtc aat ctc act acc agt gac tta gtg aaa 1344
Ser Ser Gln Leu Lys Thr Val Asn Leu Thr Thr Ser Asp Leu Val Lys
435 440 445
gat cac ctg tgt tac ttg atg gga gga aga tct act gtt gga gac gag 1392
Asp His Leu Cys Tyr Leu Met Gly Gly Arg Ser Thr Val Gly Asp Glu
450 455 460
gtt cta tgc cga ttc acc ttt ccc gag aga cct ggt gct cta atg aac 1440
Val Leu Cys Arg Phe Thr Phe Pro Glu Arg Pro Gly Ala Leu Met Asn
465 470 475 480
ttc ttg gac tct ttc agt cca cgg tgg aac atc acc ctt ttc cat tac 1488
Phe Leu Asp Ser Phe Ser Pro Arg Trp Asn Ile Thr Leu Phe His Tyr
485 490 495
cat gga cag ggt gag acg ggc gcg aat gtg ctg gtc ggg atc caa gtc 1536
His Gly Gln Gly Glu Thr Gly Ala Asn Val Leu Val Gly Ile Gln Val
500 505 510
ccc gag caa gaa atg gag gaa ttt aaa aac cga gct aaa gct ctt gga 1584
Pro Glu Gln Glu Met Glu Glu Phe Lys Asn Arg Ala Lys Ala Leu Gly
515 520 525
tac gac tac ttc tta gta agt gat gac gac tat ttt aag ctt ctg atg 1632
Tyr Asp Tyr Phe Leu Val Ser Asp Asp Asp Tyr Phe Lys Leu Leu Met
530 535 540
cac tga 1638
His *
545
<210> 26
<211> 545
<212> PRT
<213> Unknown
<400> 26
Met Thr Pro Pro Pro Pro Lys Leu Pro Leu Pro Arg Leu Lys Val Ser
1 5 10 15
Pro Asn Ser Leu Gln Tyr Pro Ala Gly Tyr Leu Gly Ala Val Pro Glu
20 25 30
Arg Thr Asn Glu Ala Glu Asn Gly Ser Ile Ala Glu Ala Met Glu Tyr
35 40 45
Leu Thr Asn Ile Leu Ser Thr Lys Val Tyr Asp Ile Ala Ile Glu Ser
50 55 60
Pro Leu Gln Leu Ala Lys Lys Leu Ser Lys Arg Leu Gly Val Arg Met
65 70 75 80
Tyr Leu Lys Arg Glu Asp Leu Gln Pro Val Phe Ser Phe Lys Leu Arg
85 90 95
Gly Ala Tyr Asn Met Met Val Lys Leu Pro Ala Asp Gln Leu Ala Lys
100 105 110
Gly Val Ile Cys Ser Ser Ala Gly Asn His Ala Gln Gly Val Ala Leu
115 120 125
Ser Ala Ser Lys Leu Gly Cys Thr Ala Val Ile Val Met Pro Val Thr
130 135 140
Thr Pro Glu Ile Lys Trp Gln Ala Val Glu Asn Leu Gly Ala Thr Val
145 150 155 160
Val Leu Phe Gly Asp Ser Tyr Asp Gln Ala Gln Ala His Ala Lys Ile
165 170 175
Arg Ala Glu Glu Glu Gly Leu Thr Phe Ile Pro Pro Phe Asp His Pro
180 185 190
Asp Val Ile Ala Gly Gln Gly Thr Val Gly Met Glu Ile Thr Arg Gln
195 200 205
Ala Lys Gly Pro Leu His Ala Ile Phe Val Pro Val Gly Gly Gly Gly
210 215 220
Leu Ile Ala Gly Ile Ala Ala Tyr Val Lys Arg Val Ser Pro Glu Val
225 230 235 240
Lys Ile Ile Gly Val Glu Pro Ala Asp Ala Asn Ala Met Ala Leu Ser
245 250 255
Leu His His Gly Glu Arg Val Ile Leu Asp Gln Val Gly Gly Phe Ala
260 265 270
Asp Gly Val Ala Val Lys Glu Val Gly Glu Glu Thr Phe Arg Ile Ser
275 280 285
Arg Asn Leu Met Asp Gly Val Val Leu Val Thr Arg Asp Ala Ile Cys
290 295 300
Ala Ser Ile Lys Asp Met Phe Glu Glu Lys Arg Asn Ile Leu Glu Pro
305 310 315 320
Ala Gly Ala Leu Ala Leu Ala Gly Ala Glu Ala Tyr Cys Lys Tyr Tyr
325 330 335
Gly Leu Lys Asp Val Asn Val Val Ala Ile Thr Ser Gly Ala Asn Met
340 345 350
Asn Phe Asp Lys Leu Arg Ile Val Thr Glu Leu Ala Asn Val Gly Arg
355 360 365
Gln Gln Glu Ala Val Leu Ala Thr Leu Met Pro Glu Lys Pro Gly Ser
370 375 380
Phe Lys Gln Phe Cys Glu Leu Val Gly Pro Met Asn Ile Ser Glu Phe
385 390 395 400
Lys Tyr Arg Cys Ser Ser Glu Lys Glu Ala Val Val Leu Tyr Ser Val
405 410 415
Gly Val His Thr Ala Gly Glu Leu Lys Ala Leu Gln Lys Arg Met Glu
420 425 430
Ser Ser Gln Leu Lys Thr Val Asn Leu Thr Thr Ser Asp Leu Val Lys
435 440 445
Asp His Leu Cys Tyr Leu Met Gly Gly Arg Ser Thr Val Gly Asp Glu
450 455 460
Val Leu Cys Arg Phe Thr Phe Pro Glu Arg Pro Gly Ala Leu Met Asn
465 470 475 480
Phe Leu Asp Ser Phe Ser Pro Arg Trp Asn Ile Thr Leu Phe His Tyr
485 490 495
His Gly Gln Gly Glu Thr Gly Ala Asn Val Leu Val Gly Ile Gln Val
500 505 510
Pro Glu Gln Glu Met Glu Glu Phe Lys Asn Arg Ala Lys Ala Leu Gly
515 520 525
Tyr Asp Tyr Phe Leu Val Ser Asp Asp Asp Tyr Phe Lys Leu Leu Met
530 535 540
His
545

Claims

WHAT IS CLAIMED IS:
1. An isolated polynucleotide comprising a nucleotide sequence having substantial identity to a member selected from the group consisting of the sequence set forth in SEQ ID NO: 3, the sequence set forth in SEQ ID NO: 5, the sequence set forth in SEQ ID NO: 7, the sequence set forth in SEQ ID NO: 9, the sequence set forth in SEQ ID NO: 11, the sequence set forth in SEQ ID NO: 13, the sequence set forth in SEQ ID NO: 15, the sequence set forth in SEQ ID NO: 17 and the sequence set forth in SEQ ID NO: 25.
2. The polynucleotide in accordance with claim 1, wherein said nucleotide sequence has substantial identity to the sequence set forth in SEQ ID NO: 3.
3. The polynucleotide in accordance with claim 1, wherein said nucleotide sequence has substantial identity to the sequence set forth in SEQ ID NO: 5.
4. The polynucleotide in accordance with claim 1, wherein said nucleotide sequence has substantial identity to the sequence set forth in SEQ ID NO: 7.
5. The polynucleotide in accordance with claim 1, wherein said nucleotide sequence has substantial identity to the sequence set forth in SEQ ID NO: 9.
6. The polynucleotide in accordance with claim 1, wherein said nucleotide sequence has substantial identity to the sequence set forth in SEQ ID NO: 11.
7. The polynucleotide in accordance with claim 1, wherein said nucleotide sequence has substantial identity to the sequence set forth in SEQ ID NO: 13.
8. The polynucleotide in accordance with claim 1, wherein said nucleotide sequence has substantial identity to the sequence set forth in SEQ ID NO: 15.
9. The polynucleotide in accordance with claim 1, wherein said nucleotide sequence has substantial identity to the sequence set forth in SEQ ID NO: 17.
10. The polynucleotide in accordance with claim 1, wherein said nucleotide sequence has substantial identity to the sequence set forth in SEQ ID NO: 25.
11. A polynucleotide comprising a nucleotide sequence selected from the group consisting of the sequence set forth in SEQ ID NO: 3, the sequence set forth in SEQ ID NO: 5, the sequence set forth in SEQ ID NO: 7, the sequence set forth in SEQ ID NO: 9, the sequence set forth in SEQ ID NO: 11, the sequence set forth in SEQ ID NO: 13, the sequence set forth in SEQ ID NO: 15, the sequence set forth in SEQ ID NO: 17 and the sequence set forth in SEQ ID NO: 25.
12. A polynucleotide having a nucleotide sequence that encodes a functional, feedback-insensitive threonine dehydratase/deaminase enzyme and that hybridizes under moderately stringent conditions with a member selected from the group consisting of the nucleotide sequence set forth in SEQ ID NO: 3, the sequence set forth in SEQ ID NO: 5, the sequence set forth in SEQ ID NO: 7, the sequence set forth in SEQ ID NO: 9, the sequence set forth in SEQ ID NO: 11, the sequence set forth in SEQ ID NO: 13, the sequence set forth in SEQ ID NO: 15, the sequence set forth in SEQ ID NO: 17 and the sequence set forth in SEQ ID NO: 25.
13. A nucleotide sequence encoding an amino acid sequence selected from the group consisting of the amino acid sequence set forth in SEQ ID NO: 4, the sequence set forth in SEQ ID NO: 6, the sequence set forth in SEQ ID NO: 8, the sequence set forth in SEQ ID NO: 10, the sequence set forth in SEQ ID NO: 12, the sequence set forth in SEQ ID NO: 14, the sequence set forth in SEQ ID NO: 16, the sequence set forth in SEQ ID NO: 18, the sequence set forth in SEQ ID NO: 26 and amino acid sequences substantially similar thereto.
14. A method for producing cells resistant to structural analogs of isoleucine, comprising: placing into a cell a construct comprising in the 5' to 3' direction of transcription a promoter functional in the cell, a first nucleotide sequence that encodes a transit peptide operably attached to the promoter, a second nucleotide sequence that encodes a mutant, feedback insensitive form of threonine deaminase/dehydratase operably attached to the first sequence, and a termination region functional in the cell operably attached to the second sequence; and growing the transformed cell whereby the first and second nucleotide sequences are expressed to provide a precursor polypeptide; wherein expression of the precursor polypeptide allows the cell to be resistant to structural analogs of isoleucine.
15. The method according to claim 14, wherein the precursor polypeptide comprises an amino acid sequence selected from the group consisting of the amino acid sequence set forth in SEQ ID NO: 3, the sequence set forth in SEQ ID NO: 5, the sequence set forth in SEQ ID NO: 7, the sequence set forth in SEQ ID NO: 9, the sequence set forth in SEQ ID NO: 11, the sequence set forth in SEQ ID NO: 13, the sequence set forth in SEQ ID NO: 15, the sequence set forth in SEQ ID NO: 17, the sequence set forth in SEQ ID NO: 25 and amino acid sequences substantially similar thereto.
16. The method according to claim 14, wherein the cell is selected from the group consisting of a plant cell, a bacterial cell, a fungal cell and a yeast cell.
17. A cell produced in accordance with the method of claim 14.
18. A DNA construct comprising a promoter operably linked to a nucleotide sequence encoding a threonine dehydratase/deaminase that is substantially resistant to feedback inhibition.
19. The DNA construct according to claim 18, wherein the nucleotide sequence has substantial identity to a member selected from the group consisting of the sequence set forth in SEQ ID NO: 3, the sequence set forth in SEQ ID NO: 5, the sequence set forth in SEQ ID NO: 7, the sequence set forth in SEQ ID NO: 9, the sequence set forth in SEQ ID NO: 11, the sequence set forth in SEQ ID NO: 13, the sequence set forth in SEQ ID NO: 15, the sequence set forth in SEQ ID NO: 17 and the sequence set forth in SEQ ID NO: 25.
20. The DNA construct according to claim 18, wherein the promoter is a plant promoter.
21. The DNA construct according to claim 18, wherein the promoter has substantial identity to a native threonine dehydratase/deaminase promoter.
22. A vector useful for transforming a cell, said vector comprising a nucleotide sequence having substantial identity to a member selected from the group consisting of the sequence set forth in SEQ ID NO: 3, the sequence set forth in SEQ ID NO: 5, the sequence set forth in SEQ ID NO: 7, the sequence set forth in SEQ ID NO: 9, the sequence set forth in SEQ ID NO: 11, the sequence set forth in SEQ ID NO: 13, the sequence set forth in SEQ ID NO: 15, the sequence set forth in SEQ ID NO: 17 and the sequence set forth in SEQ ID NO: 25.
23. A plant transformed with the vector of claim 22, or progeny thereof, the plant being capable of expressing the nucleotide sequence.
24. The plant according to claim 23, the plant being selected from the group consisting of gymnosperms, rice, wheat, barley, rye, corn, potato, carrot, sweet potato, bean, pea, chicory, lettuce, cabbage, cauliflower, broccoli, turnip, radish, spinach, asparagus, onion, garlic, eggplant, pepper, celery, squash, pumpkin, zucchini, cucumber, apple, pear, quince, melon, plum, cherry, peach, nectarine, apricot, strawberry, grape, raspberry, blackberry, pineapple, avocado, papaya, mango, banana, soybean, tobacco, tomato, sorghum and sugarcane.
25. A microorganism transformed with the vector of claim 22, or progeny thereof, the microorganism being capable of expressing the nucleotide sequence.
26. The microorganism of claim 25, wherein said microorganism is a yeast cell.
27. The microorganism of claim 25, wherein said microorganism is a bacterial cell.
28. The microorganism of claim 25, wherein said microorganism is a fungal cell.
29. A cell having incorporated therein a foreign nucleotide sequence comprising a promoter operably linked to a nucleotide sequence having substantial identity to a member selected from the group consisting of the sequence set forth in SEQ ID NO: 3, the sequence set forth in SEQ ID NO: 5, the sequence set forth in SEQ ID NO: 7, the sequence set forth in SEQ ID NO: 9, the sequence set forth in SEQ ID NO: 11, the sequence set forth in SEQ ID NO: 13, the sequence set forth in SEQ ID NO: 15, the sequence set forth in SEQ ID NO: 17 and the sequence set forth in SEQ ID NO: 25.
30. The cell according to claim 29, wherein the cell is a microorganism.
31. The cell according to claim 29, wherein the cell is a bacterial cell.
32. The cell according to claim 29, wherein the cell is a fungal cell.
33. The cell according to claim 29, wherein the cell is a yeast cell.
34. The cell according to claim 29, wherein the cell is a plant cell.
35. A plant having incorporated into its genome a foreign DNA construct comprising a promoter operably linked to a nucleotide sequence having substantial identity to a member selected from the group consisting of the sequence set forth in SEQ ID NO: 3, the sequence set forth in SEQ ID NO: 5, the sequence set forth in SEQ ID NO: 7, the sequence set forth in SEQ ID NO: 9, the sequence set forth in SEQ ID NO: 11, the sequence set forth in SEQ ID NO: 13, the sequence set forth in SEQ ID NO: 15, the sequence set forth in SEQ ID NO: 17 and the sequence set forth in SEQ ID NO: 25.
36. A cell having incorporated into its genome a foreign nucleotide sequence encoding a threonine dehydratase/deaminase that is substantially resistant to feedback inhibition.
37. A method comprising: incorporating into a plant's genome a DNA construct to provide a transformed plant, the construct comprising a promoter operably linked to a nucleotide sequence having substantial identity to a member selected from the group consisting of the sequence set forth in SEQ ID NO: 3, the sequence set forth in SEQ ID NO: 5, the sequence set forth in SEQ ID NO: 7, the sequence set forth in SEQ ID NO: 9, the sequence set forth in SEQ ID NO: 11, the sequence set forth in SEQ ID NO: 13, the sequence set forth in SEQ ID NO: 15, the sequence set forth in SEQ ID NO: 17 and the sequence set forth in SEQ ID NO: 25; wherein the transformed plant is capable of expressing the nucleotide sequence.
38. A method comprising: providing a vector comprising a promoter operably linked to a nucleotide sequence encoding a threonine dehydratase/deaminase that is resistant to feedback inhibition, wherein the promoter regulates expression of the nucleotide sequence in a host plant cell; and transforming a target plant with the vector to provide a transformed plant, the transformed plant being capable of expressing the nucleotide sequence.
39. The method according to claim 38, wherein the threonine dehydratase/deaminase comprises an amino acid sequence having substantial similarity to a member selected from the group consisting of the sequence set forth in SEQ ID NO: 4, the sequence set forth in SEQ ID NO: 6, the sequence set forth in SEQ ID NO: 8, the sequence set forth in SEQ ID NO: 10, the sequence set forth in SEQ ID NO: 12, the sequence set forth in SEQ ID NO: 14, the sequence set forth in SEQ ID NO: 16, the sequence set forth in SEQ ID NO: 18 and the sequence set forth in SEQ ID NO:26.
40. The method according to claim 38, wherein the nucleotide sequence has substantial identity to the nucleotide sequence of SEQ ID NO: 3.
41. A transgenic plant obtained according to the method of claim 38 or progeny thereof.
42. A method for screening potential transformants, comprising: providing a plurality of cells, wherein at least one of the cells has in its genome an expressible foreign nucleotide sequence having substantial identity to a member selected from the group consisting of the sequence set forth in SEQ ID NO: 3, the sequence set forth in SEQ ID NO: 5, the sequence set forth in SEQ ID NO: 7, the sequence set forth in SEQ ID NO: 9, the sequence set forth in SEQ ID NO: 11, the sequence set forth in SEQ ID NO: 13, the sequence set forth in SEQ ID NO: 15, the sequence set forth in SEQ ID NO: 17 and the sequence set forth in SEQ ID NO: 25; and contacting the plurality of cells with a substrate comprising a toxic isoleucine structural analog; wherein cells comprising the expressible foreign nucleotide sequence are capable of growing in the substrate, and wherein cells not comprising the expressible foreign nucleotide sequence are incapable of growing in the substrate.
43. A method for reliably incorporating a first, expressible, foreign nucleotide sequence into a target cell, comprising: providing a vector comprising a promoter operably linked to a first primary polynucleotide and a second polynucleotide comprising a nucleotide sequence having substantial identity to a member selected from the group consisting of the sequence set forth in SEQ ID NO: 3, the sequence set forth in SEQ ID NO: 5, the sequence set forth in SEQ ID NO: 7, the sequence set forth in SEQ ID NO: 9, the sequence set forth in SEQ ID NO: 11, the sequence set forth in SEQ ID NO: 13, the sequence set forth in SEQ ID NO: 15, the sequence set forth in SEQ ID NO: 17 and the sequence set forth in SEQ ID NO: 25; transforming the target cell with the vector to provide a transformed cell; and contacting the cell with a substrate comprising L-O-methylthreonine; wherein successfully transformed cells are capable of growing in the substrate, and wherein unsuccessfully transformed cells are incapable of growing in the substrate.
44. A method according to claim 43, wherein the cell is selected from the group comprising a plant cell, a yeast cell, a bacterial cell and a fungal cell.
45. A method for growing a plurality of plants in the absence of undesirable plants, comprising: providing a plurality of plants, each having in its genome a foreign nucleotide sequence comprising a promoter operably linked to a nucleotide sequence encoding a threonine dehydratase/deaminase that is resistant to feedback inhibition; growing the plurality of plants in a substrate; and introducing a preselected amount of an isoleucine structural analog into the substrate.
46. A method according to claim 45, wherein the nucleotide sequence has substantial identity to a member selected from the group consisting of the sequence set forth in SEQ ID NO: 3, the sequence set forth in SEQ ID NO: 5, the sequence set forth in SEQ ID NO: 7, the sequence set forth in SEQ ID NO: 9, the sequence set forth in SEQ ID NO: 11, the sequence set forth in SEQ ID NO: 13, the sequence set forth in SEQ ID NO: 15, the sequence set forth in SEQ ID NO: 17 and the sequence set forth in SEQ ID NO:25.
47. The method in accordance with claim 45, wherein the analog is L-O-methylthreonine.
48. A method comprising: providing a nucleotide sequence having substantial identity to the nucleotide sequence set forth in SEQ ID NO: 1 or a portion thereof; and mutating the sequence so that the sequence encodes a feedback insensitive threonine dehydratase/deaminase; wherein said mutating comprises site-directed mutagenesis.
49. The method according to claim 48, wherein the feedback insensitive threonine dehydratase/deaminase comprises an amino acid other than the wild-type at the amino acid location corresponding to location 452 of SEQ ID NO: 4, and at the amino acid location corresponding to location 497 of SEQ ID NO: 4.
50. A method comprising: providing a vector comprising a promoter operably linked to a nucleotide sequence encoding a threonine dehydratase/deaminase that is resistant to feedback inhibition, wherein the promoter regulates expression of the nucleotide sequence in a host cell; and transforming a target cell with the vector to provide a transformed cell, the transformed cell being capable of expressing the nucleotide sequence.
51. A polynucleotide comprising a nucleotide sequence having substantial identity to a sequence set forth in SEQ ID NO:23.
52. A polynucleotide comprising a nucleotide sequence having substantial identity to a sequence set forth in SEQ ID NO: 21.
53. A polynucleotide comprising a nucleotide sequence having substantial identity to a sequence set forth in SEQ ID NO: 19.
54. A polynucleotide having a sequence effective for encoding a functional, feedback-insensitive threonine dehydratase/deaminase enzyme that hybridizes under moderately stringent conditions with a nucleotide sequence set forth in SEQ ID NO: 23.
55. A polynucleotide having a sequence effective for encoding a functional, feedback-insensitive threonine dehydratase/deaminase enzyme that hybridizes under moderately stringent conditions with a nucleotide sequence set forth in SEQ ID NO: 19.
56. A DNA construct capable of expression in monocots comprising SEQ ID NO. 23.
57. A DNA construct capable of expression in dicots comprising SEQ ID NO. 23.
58. A DNA construct capable of expression in monocots comprising SEQ ID NO. 19.
59. A DNA construct capable of expression in dicots comprising SEQ ID NO. 19.
60. A transgenic monocot plant wherein cells of the plant have been transformed with a DNA construct capable of expression in plant cells, the synthetic gene construct comprising SEQ ID NO. 23.
61. A transgenic monocot plant wherein cells of the plant have been transformed with a DNA construct capable of expression in plant cells, the synthetic gene construct comprising SEQ ID NO. 19.
62. A transgenic dicot plant wherein cells of the plant have been transformed with a DNA construct capable of expression in plant cells, the synthetic gene construct comprising SEQ ID NO. 23.
63. A transgenic dicot plant wherein cells of the plant have been transformed with a DNA construct capable of expression in plant cells, the synthetic gene construct comprising SEQ ID NO. 19.
64. A transgenic maize plant wherein cells of the plant have been transformed with a DNA construct capable of expression in plant cells, the synthetic gene construct comprising SEQ ID NO. 23.
65. A transgenic maize plant wherein cells of the plant have been transformed with a DNA construct capable of expression in plant cells, the synthetic gene construct comprising SEQ ID NO. 19.
66. A transgenic rice plant wherein cells of the plant have been transformed with a DNA construct capable of expression in plant cells, the synthetic gene construct comprising SEQ ID NO. 23.
67. A transgenic rice plant wherein cells of the plant have been transformed with a DNA construct capable of expression in plant cells, the synthetic gene construct comprising SEQ ID NO. 19.
68. A plant seed having in its genome an inheritable synthetic gene, the synthetic gene comprising SEQ ID NO. 23.
69. A plant seed in accordance with claim 68 wherein the plant seed is a monocot.
70. A plant seed in accordance with claim 68 wherein the plant seed is a dicot.
71. A plant seed in accordance with claim 68 wherein the plant seed is maize.
72. A plant seed in accordance with claim 68 wherein the plant seed is rice.
73. A synthetic gene construct capable of expression in plant cells comprising in sequence from 5' to 3': a promoter sequence effective to initiate transcription in plant cells; a translational enhancer sequence; a nucleotide sequence having substantial identity to a sequence set forth in SEQ ID NO: 23; and wherein said promoter sequence, translational enhancer sequence, and nucleotide sequence having substantial identity to a sequence set forth in SEQ ID NO: 23 are operably linked.
74. A synthetic gene construct capable of expression in plant cells comprising in sequence from 5' to 3': a promoter sequence effective to initiate transcription in plant cells; a translational enhancer sequence; a nucleotide sequence having substantial identity to a sequence set forth in SEQ ID NO: 19; and wherein said promoter sequence, translational enhancer sequence, and nucleotide sequence having substantial identity to a sequence set forth in SEQ ID NO: 19 are operably linked.
PCT/US1999/000560 1998-02-17 1999-01-08 Methods and compositions for producing plants and microorganisms that express feedback insensitive threonine dehydratase deaminase WO1999041395A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
AU22202/99A AU2220299A (en) 1998-02-17 1999-01-08 Methods and compositions for producing plants and microorganisms that express feedback insensitive threonine dehydratase deaminase

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
US7487598P 1998-02-17 1998-02-17
US60/074,875 1998-02-17
USPCT/US98/14362 1998-07-10
PCT/US1998/014362 WO1999002656A1 (en) 1997-07-10 1998-07-10 Methods and compositions for producing plants and microorganisms that express feedback insensitive threonine dehydratase/deaminase

Publications (1)

Publication Number Publication Date
WO1999041395A1 WO1999041395A1 (en) 1999-08-19

Family

ID=26756155

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US1999/000560 WO1999041395A1 (en) 1998-02-17 1999-01-08 Methods and compositions for producing plants and microorganisms that express feedback insensitive threonine dehydratase deaminase

Country Status (2)

Country Link
AU (1) AU2220299A (en)
WO (1) WO1999041395A1 (en)

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP1012237A1 (en) * 1997-07-10 2000-06-28 Purdue Research Foundation Methods and compositions for producing plants and microorganisms that express feedback insensitive threonine dehydratase/deaminase
US6451564B1 (en) 1999-07-02 2002-09-17 Massachusetts Institute Of Technology Methods for producing L-isoleucine
WO2003046193A2 (en) * 2001-11-29 2003-06-05 Societe Des Produits Nestle S.A. PRODUCTION OF α-KETO BUTYRATE
WO2006050313A2 (en) * 2004-10-29 2006-05-11 The Board Of Trustees Operating Michigan State University Protection against herbivores
CN109182319A (en) * 2018-08-20 2019-01-11 浙江大学 A kind of threonine deaminase mutant and its preparation method and application

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
SHAUL O, GALILI G: "CONCERTED REGULATION OF LYSINE AND THREONINE SYNTHESIS IN TOBACCO PLANTS EXPRESSING BACTERIAL FEEDBACK-INSENSITIVE ASPARTATE KINASE AND DIHYDRODIPICOLINATE SYNTHASE", PLANT MOLECULAR BIOLOGY, SPRINGER, DORDRECHT., NL, vol. 23, no. 04, 1 November 1993 (1993-11-01), NL, pages 01, XP002914145, ISSN: 0167-4412, DOI: 10.1007/BF00021531 *

Cited By (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP1012237A1 (en) * 1997-07-10 2000-06-28 Purdue Research Foundation Methods and compositions for producing plants and microorganisms that express feedback insensitive threonine dehydratase/deaminase
EP1012237A4 (en) * 1997-07-10 2002-01-09 Purdue Research Foundation Methods and compositions for producing plants and microorganisms that express feedback insensitive threonine dehydratase/deaminase
US6451564B1 (en) 1999-07-02 2002-09-17 Massachusetts Institute Of Technology Methods for producing L-isoleucine
US6987017B2 (en) 1999-07-02 2006-01-17 Massachusetts Institute Of Technology Methods for producing L-isoleucine
WO2003046193A2 (en) * 2001-11-29 2003-06-05 Societe Des Produits Nestle S.A. PRODUCTION OF α-KETO BUTYRATE
WO2003046193A3 (en) * 2001-11-29 2004-02-19 Nestle Sa PRODUCTION OF α-KETO BUTYRATE
US7144715B2 (en) 2001-11-29 2006-12-05 Nestec S.A. Production of α-keto butyrate
WO2006050313A2 (en) * 2004-10-29 2006-05-11 The Board Of Trustees Operating Michigan State University Protection against herbivores
WO2006050313A3 (en) * 2004-10-29 2007-03-01 Univ Michigan State Protection against herbivores
US8871999B2 (en) 2004-10-29 2014-10-28 Board Of Trustees Of Michigan State University Protection against herbivores
US9796984B2 (en) 2004-10-29 2017-10-24 Board Of Trustees Of Michigan State University Protection against herbivores
CN109182319A (en) * 2018-08-20 2019-01-11 浙江大学 A kind of threonine deaminase mutant and its preparation method and application

Also Published As

Publication number Publication date
AU2220299A (en) 1999-08-30

Similar Documents

Publication Publication Date Title
US6563025B1 (en) Nucleotide sequences encoding anthranilate synthase
US7022895B2 (en) Plant amino acid biosynthetic enzymes
US7038108B2 (en) Polynucleotide encoding lysyl-tRNA synthetase from Zea mays
HU222085B1 (en) Chimeric genes and methods for increasing the lysine content of the seeds of corn, soybean and rapeseed plants
WO1999005902A1 (en) Transgenic plants tolerant of salinity stress
AU759068B2 (en) Methods and compositions for producing plants and microorganisms that express feedback insensitive threonine dehydratase/deaminase
US5965727A (en) For selectable markers and promoters for plant tissue culture transformation
US7176354B2 (en) Genes encoding sulfate assimilation proteins
US7943753B2 (en) Auxin transport proteins
US7195887B2 (en) Rice 1-deoxy-D-xylulose 5-phosphate synthase and DNA encoding thereof
WO1999041395A1 (en) Methods and compositions for producing plants and microorganisms that express feedback insensitive threonine dehydratase deaminase
US7122717B2 (en) Enzymes involved in squalene metabolism
US6864077B1 (en) Membrane-bound desaturases
US6204039B1 (en) Plant isocitrate dehydrogenase homologs
AU2005224325A1 (en) Post harvest control of genetically modified crop growth employing D-amino acid compounds
EP1098963A2 (en) Chorismate synthase from plants
EP0996734A1 (en) Plant transcription coactivators with histone acetyltransferase activity
US7098379B2 (en) Plant UDP-glucose dehydrogenase
WO2000006756A1 (en) Sulfur metabolism enzymes
US6600089B1 (en) Carotenoid biosynthesis enzymes
US7112722B2 (en) Plant genes encoding pantothenate synthetase
US7176353B2 (en) Genes encoding sulfate assimilation proteins
US20030145349A1 (en) Plant transcription coactivators with histone acetyl transferase activity
WO2003014373A2 (en) Metal-binding proteins
US7192758B2 (en) Polynucleotides encoding phosphoribosylanthranilate isomerase

Legal Events

Date Code Title Description
AK Designated states

Kind code of ref document: A1

Designated state(s): AL AM AT AU AZ BA BB BG BR BY CA CH CN CU CZ DE DK EE ES FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MD MG MK MN MW MX NO NZ PL PT RO RU SD SE SG SI SK SL TJ TM TR TT UA UG UZ VN YU ZW

AL Designated countries for regional patents

Kind code of ref document: A1

Designated state(s): GH GM KE LS MW SD SZ UG ZW AM AZ BY KG KZ MD RU TJ TM AT BE CH CY DE DK ES FI FR GB GR IE IT LU MC NL PT SE BF BJ CF CG CI CM GA GN GW ML MR NE SN TD TG

121 Ep: the epo has been informed by wipo that ep was designated in this application
REG Reference to national code

Ref country code: DE

Ref legal event code: 8642

122 Ep: pct application non-entry in european phase