MXPA99011066A

MXPA99011066A - Plant amino acid biosynthetic enzymes

Info

Publication number: MXPA99011066A
Application number: MXPA/A/1999/011066A
Authority: MX
Inventors: Carl Falco Saverio; M Allen Stephen; Antoni Rafalski J; D Hitz William; John Kinney Anthony; Marie Abell Lynn; Jane Thorpe Catherine
Original assignee: Ei Du Pont De Nemours And Company
Priority date: 1997-06-06
Filing date: 1999-11-30
Publication date: 2000-09-04

Abstract

This invention relates to an isolated nucleic acid fragment encoding a plant enzyme that catalyzes steps in the biosynthesis of lysine, threonine, methionine, cysteine and isoleucine from aspartate, the enzyme being a member selected from the group consisting of: dihydrodipicolinate reductase, diaminopimelate epimerase, threonine synthase, threonine deaminase and S-adenosylmethionine synthetase. The invention also relates to the construction of a chimeric gene encoding all or a portion of the enzyme, in sense or antisense orientation, wherein expression of the chimeric gene results in production of altered levels of the enzyme in a transformed host cell.

Description

PLANT AMINO ACID BIOSYNTHETIC ENZYMES

This application claims the benefit of U.S. Provisional Application No. 60/048,771, filed on June 6, 1997, and U.S. Provisional Application No. 60/049,443, filed on June 12, 1997.

FIELD OF THE INVENTION

This invention is in the field of plant molecular biology. More specifically, this invention relates to nucleic acid fragments that encode enzymes involved in the biosynthesis of amino acids in plants and seeds.

BACKGROUND OF THE INVENTION

Many vertebrates, including man, lack the ability to manufacture a number of amino acids and therefore require these amino acids preformed in their diet. These are called essential amino acids. Human foods and animal feeds derived from many grains are deficient in essential amino acids such as lysine, the sulfur amino acids methionine and cysteine, threonine and tryptophan. For example, in corn (Zea mays L.) lysine is the most limiting amino acid for the dietary requirements of many animals. Soybean (Glycine max L.) meal is used as an additive to corn-based animal feeds primarily as a lysine supplement. Thus, an increase in the lysine content of either corn or soybean would reduce or eliminate the need to supplement mixed grain feeds with lysine produced via microbial fermentation. Furthermore, in corn the sulfur amino acids are the third most limiting amino acids, after lysine and tryptophan, for the dietary requirements of many animals. The use of soybean meal, which is rich in lysine and tryptophan, to supplement corn in animal feed is limited by the low sulfur amino acid content of the legume. Thus, an increase in the sulfur amino acid content of both corn and soybean would improve the nutritional quality of the mixtures and reduce the need for further supplementation through the addition of more expensive methionine.

Lysine, threonine, methionine, cysteine and isoleucine are amino acids derived from aspartate. The regulation of the biosynthesis of each member of this family is interconnected (see Figure 1). One approach to increasing the nutritional quality of human foods and animal feeds is to increase production and accumulation of specific free amino acids via genetic engineering of this biosynthetic pathway.

Alteration of the activity of enzymes in this pathway could lead to altered levels of lysine, threonine, methionine, cysteine and isoleucine. However, few of the genes encoding enzymes that regulate this pathway in plants, especially corn, soybean and wheat, are available.

The organization of the pathway leading to the biosynthesis of lysine, threonine, methionine, cysteine and isoleucine indicates that overexpression or reduced expression of genes encoding, inter alia, threonine synthase, dihydrodipicolinate reductase, diaminopimelate epimerase, threonine deaminase and S-adenosylmethionine synthetase in corn, soybean, wheat and other crop plants could be used to alter the levels of these amino acids in human food and animal feed. Accordingly, the availability of nucleic acid sequences encoding all or a portion of these enzymes would facilitate the development of nutritionally enhanced crop plants.

BRIEF DESCRIPTION OF THE INVENTION

The present invention relates to isolated nucleic acid fragments encoding plant enzymes involved in amino acid biosynthesis. Specifically, this invention concerns isolated nucleic acid fragments encoding the following plant enzymes that catalyze steps in the biosynthesis of lysine, threonine, methionine, cysteine and isoleucine from aspartate: dihydrodipicolinate reductase, diaminopimelate epimerase, threonine synthase, threonine deaminase and S-adenosylmethionine synthetase. In addition, this invention relates to nucleic acid fragments that are complementary to nucleic acid fragments encoding the listed plant biosynthetic enzymes.

In another embodiment, the present invention relates to chimeric genes encoding the amino acid biosynthetic enzymes listed above, or to chimeric genes comprising nucleic acid fragments that are complementary to the nucleic acid fragments encoding the enzymes, operably linked to suitable regulatory sequences, wherein expression of the chimeric genes results in production of levels of the encoded enzymes in transformed host cells that are altered (i.e., increased or decreased) from the levels produced in untransformed host cells.

In a further embodiment, the present invention relates to a transformed host cell comprising in its genome a chimeric gene encoding a plant amino acid biosynthetic enzyme, operably linked to suitable regulatory sequences, the enzyme selected from the group consisting of: dihydrodipicolinate reductase, diaminopimelate epimerase, threonine synthase, threonine deaminase and S-adenosylmethionine synthetase. Expression of the chimeric gene results in production of altered levels of the biosynthetic enzyme in the transformed host cell. The transformed host cells may be of eukaryotic or prokaryotic origin, and include cells derived from higher plants and microorganisms. The invention also includes transformed plants that arise from transformed host cells of higher plants, and seeds derived from such transformed plants.

A further embodiment of the present invention relates to a method of altering the level of expression of a plant biosynthetic enzyme in a transformed host cell comprising: a) transforming a host cell with a chimeric gene comprising a nucleic acid fragment encoding a plant biosynthetic enzyme selected from the group consisting of dihydrodipicolinate reductase, diaminopimelate epimerase, threonine synthase, threonine deaminase and S-adenosylmethionine synthetase, operably linked to suitable regulatory sequences; and b) growing the transformed host cell under conditions that are suitable for expression of the chimeric gene, wherein expression of the chimeric gene results in production of altered levels of the biosynthetic enzyme in the transformed host cell.

A further embodiment of the present invention relates to a method for obtaining a nucleic acid fragment that encodes all or substantially all of the amino acid sequence of a plant dihydrodipicolinate reductase, diaminopimelate epimerase, threonine synthase, threonine deaminase or S-adenosylmethionine synthetase.

A further embodiment of the present invention is a method for evaluating at least one compound for its ability to inhibit the activity of a plant biosynthetic enzyme selected from the group consisting of dihydrodipicolinate reductase, diaminopimelate epimerase, threonine synthase, threonine deaminase and S-adenosylmethionine synthetase, the method comprising the steps of: (a) transforming a host cell with a chimeric gene comprising a nucleic acid fragment encoding a plant biosynthetic enzyme selected from the group consisting of dihydrodipicolinate reductase, diaminopimelate epimerase, threonine synthase, threonine deaminase and S-adenosylmethionine synthetase, operably linked to suitable regulatory sequences; (b) growing the transformed host cell under conditions that are suitable for expression of the chimeric gene, wherein expression of the chimeric gene results in production of the biosynthetic enzyme in the transformed host cell; (c) optionally purifying the biosynthetic enzyme expressed by the transformed host cell; (d) treating the biosynthetic enzyme with a compound to be tested; and (e) comparing the activity of the biosynthetic enzyme that has been treated with the test compound to the activity of an untreated biosynthetic enzyme, thereby selecting compounds with potential inhibitory activity.

BRIEF DESCRIPTION OF THE DRAWINGS AND SEQUENCE DESCRIPTIONS

The invention can be more fully understood from the following detailed description and the accompanying drawings and sequence descriptions, which form a part of this application.

Figure 1 depicts the biosynthetic pathway for the aspartate family of amino acids. The following abbreviations are used: AK = aspartokinase; ASADH = aspartic semialdehyde dehydrogenase; DHDPS = dihydrodipicolinate synthase; DHDPR = dihydrodipicolinate reductase; DAPEP = diaminopimelate epimerase; DAPDC = diaminopimelate decarboxylase; HDH = homoserine dehydrogenase; HK = homoserine kinase; TS = threonine synthase; TD = threonine deaminase; CγS = cystathionine γ-synthase; CβL = cystathionine β-lyase; MS = methionine synthase; CS = cysteine synthase; and SAMS = S-adenosylmethionine synthetase.

Figure 2 shows a multiple alignment of the amino acid sequence fragments reported herein for dihydrodipicolinate reductase (SEQ ID NOs: 2 and 4) and the Synechocystis sp. dihydrodipicolinate reductase sequence set forth in DDBJ Accession No. D90899 (SEQ ID NO: 5).

Figure 3 shows a multiple alignment of the amino acid sequence fragments reported herein for diaminopimelate epimerase (SEQ ID NOs: 7, 9, 11 and 13) and the Synechocystis sp. diaminopimelate epimerase sequence set forth in DDBJ Accession No. D90917 (SEQ ID NO: 14).

Figure 4 shows a multiple alignment of the amino acid sequence fragments reported herein for threonine synthase (SEQ ID NOs: 16, 18, 20, 22, 24 and 26) and the Arabidopsis thaliana threonine synthase sequence reported in GenBank Accession No. L41666 (SEQ ID NO: 27).

Figure 5 shows a multiple alignment of the amino acid sequence fragments reported herein for threonine deaminase (SEQ ID NOs: 29, 31 and 33) and the Burkholderia cepacia threonine deaminase sequence reported in GenBank Accession No. U40630 (SEQ ID NO: 34).

Figure 6 shows an alignment of the S-adenosylmethionine synthetase nucleotide sequence reported herein for corn (SEQ ID NO: 35) with the Oryza sativa S-adenosylmethionine synthetase nucleotide sequence set forth in EMBL Accession No. Z26867 (SEQ ID NO: 37).

Figure 7 shows an alignment of the S-adenosylmethionine synthetase nucleotide sequence reported herein for soybean (SEQ ID NO: 38) with the Lycopersicon esculentum S-adenosylmethionine synthetase nucleotide sequence reported in EMBL Accession No. Z24741 (SEQ ID NO: 40).

Figure 8 shows an alignment of the S-adenosylmethionine synthetase nucleotide sequence reported herein for wheat (SEQ ID NO: 41) with the Hordeum vulgare S-adenosylmethionine synthetase nucleotide sequence set forth in DDBJ Accession No. D63835 (SEQ ID NO: 43).

The amino acid sequence alignments were performed using the Clustal method of alignment (Higgins, D.G. and Sharp, P.M. (1989) CABIOS 5:151-153) in the Megalign program of the LASERGENE bioinformatics computing suite (DNASTAR Inc., Madison, WI). The nucleotide sequence alignments resulted from a BLASTN search performed with each individual S-adenosylmethionine synthetase sequence.

The following sequence descriptions and the sequence listings attached hereto comply with the rules governing nucleotide and/or amino acid sequence disclosures in patent applications as set forth in 37 C.F.R. §1.821-1.825.

SEQ ID NO: 1 is the nucleotide sequence comprising the entire cDNA insert in clone csiln.pk0042.a3 encoding a corn dihydrodipicolinate reductase.

SEQ ID NO: 2 is the deduced amino acid sequence of a portion of a corn dihydrodipicolinate reductase derived from the nucleotide sequence of SEQ ID NO: 1.

SEQ ID NO: 3 is the nucleotide sequence comprising a portion of the cDNA insert in clone rls2.pk0017.d3 encoding a rice dihydrodipicolinate reductase.

SEQ ID NO: 4 is the deduced amino acid sequence of a portion of a rice dihydrodipicolinate reductase derived from the nucleotide sequence of SEQ ID NO: 3.

SEQ ID NO: 5 is the entire amino acid sequence of dihydrodipicolinate reductase from Synechocystis sp. DDBJ Accession No. D90899.

SEQ ID NO: 6 is the nucleotide sequence comprising the entire cDNA insert in clone chp2.pk0008.h4 encoding a corn diaminopimelate epimerase.

SEQ ID NO: 7 is the deduced amino acid sequence of a portion of a corn diaminopimelate epimerase derived from the nucleotide sequence of SEQ ID NO: 6.

SEQ ID NO: 8 is the nucleotide sequence comprising a portion of the cDNA insert in clone rls48.pk0036.hl0 encoding a rice diaminopimelate epimerase.

SEQ ID NO: 9 is the deduced amino acid sequence of a portion of a rice diaminopimelate epimerase derived from the nucleotide sequence of SEQ ID NO: 8.

SEQ ID NO: 10 is the nucleotide sequence comprising a contig assembled from portions of the cDNA inserts in clones sfll.pk0031.h3 and sgslc.pk002.kl2 and the entire cDNA inserts in clones se2.pk0005.fl and ses8w.pk0010.hll encoding a soybean diaminopimelate epimerase.

SEQ ID NO: 11 is the deduced amino acid sequence of a soybean diaminopimelate epimerase derived from the nucleotide sequence of SEQ ID NO: 10.

SEQ ID NO: 12 is the nucleotide sequence comprising a portion of the cDNA insert in clone wlm24.pk0030.g4 encoding a wheat diaminopimelate epimerase.

SEQ ID NO: 13 is the deduced amino acid sequence of a portion of a wheat diaminopimelate epimerase derived from the nucleotide sequence of SEQ ID NO: 12.

SEQ ID NO: 14 is the entire amino acid sequence of a Synechocystis sp. diaminopimelate epimerase found in DDBJ Accession No. D90917.

SEQ ID NO: 15 is the nucleotide sequence comprising the entire cDNA insert in clone cc2.pk0031.c9 encoding a corn threonine synthase.

SEQ ID NO: 16 is the deduced amino acid sequence of a portion of a corn threonine synthase derived from the nucleotide sequence set forth in SEQ ID NO: 15.

SEQ ID NO: 17 is the nucleotide sequence comprising a portion of the cDNA insert in clone csl.pk0058.g5 encoding a corn threonine synthase.

SEQ ID NO: 18 is the deduced amino acid sequence of a portion of a corn threonine synthase derived from the nucleotide sequence of SEQ ID NO: 17.

SEQ ID NO: 19 is the nucleotide sequence comprising a portion of the cDNA insert in clone rls72.pk0018.e7 encoding a rice threonine synthase.

SEQ ID NO: 20 is the deduced amino acid sequence of a portion of a rice threonine synthase derived from the nucleotide sequence set forth in SEQ ID NO: 19.

SEQ ID NO: 21 is the nucleotide sequence comprising a portion of the cDNA insert in clone sel.06a03 encoding a soybean threonine synthase.

SEQ ID NO: 22 is the deduced amino acid sequence of a portion of a soybean threonine synthase derived from the nucleotide sequence set forth in SEQ ID NO: 21.

SEQ ID NO: 23 is the nucleotide sequence comprising the entire cDNA insert in clone srl.pk0003.f6 encoding a soybean threonine synthase.

SEQ ID NO: 24 is the deduced amino acid sequence of a portion of a soybean threonine synthase derived from the nucleotide sequence set forth in SEQ ID NO: 23.

SEQ ID NO: 25 is the nucleotide sequence comprising a portion of the cDNA insert in clone wrl.pk0085.h2 encoding a wheat threonine synthase.

SEQ ID NO: 26 is the deduced amino acid sequence of a portion of a wheat threonine synthase derived from the nucleotide sequence set forth in SEQ ID NO: 25.

SEQ ID NO: 27 is the entire amino acid sequence of an Arabidopsis thaliana threonine synthase, found in GenBank Accession No. L41666.

SEQ ID NO: 28 is the nucleotide sequence comprising the insertion of the entire cDNA in the clone above .pk0064. f4 encoding a threonine deaminase from corn.

SEQ ID NO: 29 is the amino acid sequence deduced from a portion of a threonine deaminase of corn derived from the nucleotide sequence declared in SEQ ID NO: 28.

SEQ ID NO: 30 is the nucleotide sequence comprising a portion of the cDNA insert in clone sfll.pk0055.h7 encoding a threonine deaminase of soy.

SEQ ID NO: 31 is the amino acid sequence deduced from a portion of a threonine deaminase of soybean derived from the nucleotide sequence declared in SEQ ID NO: 30.

SEQ ID NO: 32 is the nucleotide sequence comprising the entire cDNA insert in clone sre.pk0044.f3 encoding a soybean threonine deaminase.

SEQ ID NO: 33 is the deduced amino acid sequence of a portion of a soybean threonine deaminase derived from the nucleotide sequence set forth in SEQ ID NO: 32.

SEQ ID NO: 34 is the entire amino acid sequence of a Burkholderia cepacia threonine deaminase found in GenBank Accession No. U49630.

SEQ ID NO: 35 is the nucleotide sequence comprising the entire cDNA insert in clone cc3.mn0002.d2 encoding an entire corn S-adenosylmethionine synthetase.

SEQ ID NO: 36 is the deduced amino acid sequence of a corn S-adenosylmethionine synthetase derived from the nucleotide sequence set forth in SEQ ID NO: 35.

SEQ ID NO: 37 is the entire nucleotide sequence of an Oryza sativa S-adenosylmethionine synthetase found in EMBL Accession No. Z26867.

SEQ ID NO: 38 is the entire nucleotide sequence of the entire cDNA insert in clone s2.12b206 encoding the whole soy S-adenosylmethionine synthetase.

SEQ ID NO: 39 is the deduced amino acid sequence of an entire soybean S-adenosylmethionine synthetase derived from the nucleotide sequence set forth in SEQ ID NO: 38.

SEQ ID NO: 40 is the entire nucleotide sequence of a Lycopersicon esculentum S-adenosylmethionine synthetase found in EMBL Accession No. Z24741.

SEQ ID NO: 41 is the nucleotide sequence comprising a contig assembled from portions of the cDNA inserts in clones wrel.pk0002.cl2, wleln.pk0070.b8, wkmlc.pk0003.g4, wlkl.pk0028.d3, wreln.pkl70.d8, wrl.pk0086.d5, wrl.pk0103.h8 and wreln.pk0082.b2 encoding a portion of a wheat S-adenosylmethionine synthetase.

SEQ ID NO: 42 is the deduced amino acid sequence of a wheat S-adenosylmethionine synthetase derived from the nucleotide sequence declared in SEQ ID NO: 41.

SEQ ID NO: 43 is the entire nucleotide sequence of an S-adenosylmethionine synthetase from Hordeum vulgare found in DDBJ Accession No. D63835.

The Sequence Descriptions contain the one-letter code for nucleotide sequence characters and the three-letter codes for amino acids as defined in conformity with the IUPAC-IUB standards described in Nucleic Acids Research 13:3021-3030 (1985) and in the Biochemical Journal 219 (No. 2):345-373 (1984), which are incorporated herein by reference. The symbols and format used for nucleotide and amino acid sequence data comply with the rules set forth in 37 C.F.R. §1.822.

DETAILED DESCRIPTION OF THE INVENTION

In the context of this disclosure, a number of terms shall be used. As used herein, an "isolated nucleic acid fragment" is a polymer of RNA or DNA that is single- or double-stranded, optionally containing synthetic, non-natural or altered nucleotide bases. An isolated nucleic acid fragment in the form of a polymer of DNA may comprise one or more segments of cDNA, genomic DNA or synthetic DNA. As used herein, "contig" refers to an assemblage of overlapping nucleic acid sequences that form one contiguous nucleotide sequence. For example, several DNA sequences can be compared and aligned to identify common or overlapping regions. The individual sequences can then be assembled into a single contiguous nucleotide sequence.
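
As a purely illustrative sketch, and not part of the disclosed methods, the assembly of overlapping sequences into a single contiguous nucleotide sequence can be mimicked with a greedy suffix-to-prefix merge. The function names (`overlap_length`, `merge_pair`, `assemble_contig`), the `min_overlap` parameter and the example fragments are hypothetical. The sketch assumes error-free fragments that all read in the same orientation, which real assembly software does not assume.

```python
def overlap_length(left: str, right: str, min_overlap: int = 3) -> int:
    """Return the length of the longest suffix of `left` equal to a prefix of `right`."""
    max_len = min(len(left), len(right))
    for size in range(max_len, min_overlap - 1, -1):
        if left[-size:] == right[:size]:
            return size
    return 0


def merge_pair(left: str, right: str, min_overlap: int = 3):
    """Join two fragments through their overlap, or return None if they do not overlap."""
    size = overlap_length(left, right, min_overlap)
    if size == 0:
        return None
    return left + right[size:]


def assemble_contig(fragments, min_overlap: int = 3):
    """Greedily extend the first fragment with any fragment overlapping either end.

    Returns the assembled sequence and the list of fragments that could not be placed.
    """
    contig = fragments[0]
    remaining = list(fragments[1:])
    progress = True
    while remaining and progress:
        progress = False
        for frag in remaining:
            merged = merge_pair(contig, frag, min_overlap)
            if merged is None:
                merged = merge_pair(frag, contig, min_overlap)
            if merged is not None:
                contig = merged
                remaining.remove(frag)
                progress = True
                break
    return contig, remaining


if __name__ == "__main__":
    # Hypothetical fragments, not taken from the Sequence Listing.
    contig, unplaced = assemble_contig(["TAGCTTGAA", "GAAACCG", "ATGGCTAGC"])
    print(contig, unplaced)
```

The greedy strategy is chosen only for brevity; it can misassemble repetitive sequences, which is one reason dedicated assembly programs are used in practice.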

As used herein, "substantially similar" refers to nucleic acid fragments wherein changes in one or more nucleotide bases result in substitution of one or more amino acids, but do not affect the functional properties of the protein encoded by the DNA sequence. "Substantially similar" also refers to nucleic acid fragments wherein changes in one or more nucleotide bases do not affect the ability of the nucleic acid fragment to mediate alteration of gene expression by antisense or co-suppression technology. "Substantially similar" also refers to modifications of the nucleic acid fragments of the present invention, such as deletion or insertion of one or more nucleotides, that do not substantially affect the functional properties of the resulting transcript vis-à-vis the ability to mediate alteration of gene expression by antisense or co-suppression technology, or alteration of the functional properties of the resulting protein molecule. It is therefore understood that the invention encompasses more than the specific exemplary sequences.

For example, it is well known in the art that antisense suppression and co-suppression of gene expression may be accomplished using nucleic acid fragments representing less than the entire coding region of a gene, and by nucleic acid fragments that do not share 100% identity with the gene to be suppressed. Moreover, alterations in a gene which result in the production of a chemically equivalent amino acid at a given site, but do not affect the functional properties of the encoded protein, are well known in the art. Thus, a codon for the amino acid alanine, a hydrophobic amino acid, may be substituted by a codon encoding another less hydrophobic residue, such as glycine, or a more hydrophobic residue, such as valine, leucine or isoleucine. Similarly, changes which result in substitution of one negatively charged residue for another, such as aspartic acid for glutamic acid, or one positively charged residue for another, such as lysine for arginine, can also be expected to produce a functionally equivalent product. Nucleotide changes which result in alteration of the N-terminal and C-terminal portions of the protein molecule would also not be expected to alter the activity of the protein. Each of the proposed modifications is well within the routine skill in the art, as is determination of retention of biological activity of the encoded products. Moreover, the skilled artisan recognizes that substantially similar sequences encompassed by this invention are also defined by their ability to hybridize, under stringent conditions (0.1X SSC, 0.1% SDS, 65 °C), with the sequences exemplified herein. Preferred substantially similar nucleic acid fragments of the present invention are those nucleic acid fragments whose DNA sequences are 80% identical to the DNA sequence of the nucleic acid fragments reported herein. More preferred nucleic acid fragments are 90% identical to the DNA sequence of the nucleic acid fragments reported herein. Most preferred are nucleic acid fragments that are 95% identical to the DNA sequence of the nucleic acid fragments reported herein. The Clustal multiple alignment algorithm (Higgins, D.G. and Sharp, P.M. (1989) CABIOS 5:151-153) was used here with a GAP PENALTY of 10 and a GAP LENGTH PENALTY of 10.
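
The 80%, 90% and 95% identity figures above can be made concrete with a short sketch. It does not implement the Clustal algorithm or its gap penalties; it assumes that two sequences have already been aligned by some other tool and are supplied as equal-length strings with `-` marking gaps. The function names, the choice to count every aligned column (except columns that are gaps in both sequences) in the denominator, and the reading of each figure as a minimum are assumptions made for illustration only.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of aligned columns holding the same base in both sequences.

    Assumption: columns that are gaps in both sequences are ignored, and a
    column with a gap in only one sequence counts as a mismatch.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    columns = [
        (a, b)
        for a, b in zip(aligned_a.upper(), aligned_b.upper())
        if not (a == "-" and b == "-")
    ]
    if not columns:
        return 0.0
    matches = sum(1 for a, b in columns if a == b and a != "-")
    return 100.0 * matches / len(columns)


def highest_threshold_met(identity: float, thresholds=(95, 90, 80)):
    """Return the highest identity figure from the text that is reached, or None."""
    for threshold in sorted(thresholds, reverse=True):
        if identity >= threshold:
            return threshold
    return None


if __name__ == "__main__":
    # Hypothetical pre-aligned sequences, not taken from the Sequence Listing.
    identity = percent_identity("ATGCATGCAT", "ATGCATGCAA")
    print(identity, highest_threshold_met(identity))
```

Different programs define percent identity differently (for example, by excluding gapped columns or normalizing by the shorter sequence), so values from this sketch are not directly comparable to Megalign output.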

A "substantial portion" of an amino acid or nucleotide sequence comprises enough of the amino acid sequence of a polypeptide or the nucleotide sequence of a gene to afford putative identification of that polypeptide or gene, either by manual evaluation of the sequence by one skilled in the art, or by computer-automated sequence comparison and identification using algorithms such as BLAST (Basic Local Alignment Search Tool; Altschul, S.F., et al., (1993) J. Mol. Biol. 215:403-410; see also www.ncbi.nlm.nih.gov/BLAST/). In general, a sequence of ten or more contiguous amino acids or thirty or more nucleotides is necessary in order to putatively identify a polypeptide or nucleic acid sequence as homologous to a known protein or gene. Moreover, with respect to nucleotide sequences, gene-specific oligonucleotide probes comprising 20-30 contiguous nucleotides may be used in sequence-dependent methods of gene identification (e.g., Southern hybridization) and isolation (e.g., in situ hybridization of bacterial colonies or bacteriophage plaques). In addition, short oligonucleotides of 12-15 bases may be used as amplification primers in PCR in order to obtain a particular nucleic acid fragment comprising the primers. Accordingly, a "substantial portion" of a nucleotide sequence comprises enough of the sequence to afford specific identification and/or isolation of a nucleic acid fragment comprising the sequence. The present specification teaches partial or complete amino acid and nucleotide sequences encoding one or more particular plant proteins. The skilled artisan, having the benefit of the sequences as reported herein, may now use all or a substantial portion of the disclosed sequences for purposes known to those skilled in the art. Accordingly, the present invention comprises the complete sequences as reported in the Sequence Listing, as well as substantial portions of those sequences as defined above.

"Codon degeneracy" refers to divergence in the genetic code permitting variation of the nucleotide sequence without affecting the amino acid sequence of an encoded polypeptide. Accordingly, the present invention relates to any nucleic acid fragment that encodes all or a substantial portion of the amino acid sequence of the amino acid biosynthetic enzymes as set forth in SEQ ID NOs: 2, 4, 7, 9, 11, 13, 16, 18, 20, 22, 24, 26, 29, 31 and 33. The skilled artisan is well aware of the "codon bias" exhibited by a specific host cell in its usage of nucleotide codons to specify a given amino acid. Therefore, when synthesizing a gene for improved expression in a host cell, it is desirable to design the gene such that its frequency of codon usage approaches the frequency of preferred codon usage of the host cell.

"Synthetic genes" can be assembled from oligonucleotide building blocks that are chemically synthesized using procedures known to those skilled in the art. These building blocks are ligated and annealed to form gene segments, which are then enzymatically assembled to construct the entire gene. "Chemically synthesized", as related to a sequence of DNA, means that the component nucleotides were assembled in vitro. Manual chemical synthesis of DNA may be accomplished using well-established procedures, or automated chemical synthesis can be performed using one of a number of commercially available machines. Accordingly, the genes can be tailored for optimal gene expression based on optimization of the nucleotide sequence to reflect the codon bias of the host cell. The skilled artisan appreciates the likelihood of successful gene expression if codon usage is biased towards those codons favored by the host. Determination of preferred codons can be based on a survey of genes derived from the host cell where sequence information is available.
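
The survey of host genes to determine preferred codons, as described above, can be sketched as follows. This is a minimal illustration under stated assumptions, not a method taken from the specification: it uses the standard genetic code, counts codons in frame from the first base of each supplied coding sequence, and simply picks the most frequent codon for each amino acid. The function names and the example sequence are hypothetical, and practical codon optimization usually weighs additional factors.

```python
from collections import Counter
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G at each position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def codon_counts(coding_sequences) -> Counter:
    """Count in-frame codons across a collection of host coding sequences."""
    counts = Counter()
    for seq in coding_sequences:
        seq = seq.upper()
        for i in range(0, len(seq) - len(seq) % 3, 3):
            codon = seq[i:i + 3]
            if codon in CODON_TABLE:  # skip codons with ambiguous bases
                counts[codon] += 1
    return counts


def preferred_codons(counts: Counter) -> dict:
    """Map each amino acid to its most frequently used codon in the survey.

    Ties, and amino acids absent from the survey, resolve to the first codon
    in table order; this tie-breaking rule is an arbitrary assumption.
    """
    best = {}
    for codon, aa in CODON_TABLE.items():
        if aa == "*":
            continue
        if aa not in best or counts[codon] > counts[best[aa]]:
            best[aa] = codon
    return best


def back_translate(protein: str, preferred: dict) -> str:
    """Write a coding sequence for `protein` (one-letter codes) using preferred codons."""
    return "".join(preferred[aa] for aa in protein.upper())


if __name__ == "__main__":
    # Hypothetical host coding sequence, not taken from the Sequence Listing.
    table = preferred_codons(codon_counts(["ATGGCCGCCGCTAAGAAGAAA"]))
    print(back_translate("MAK", table))
```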

"Gene" refers to a nucleic acid fragment that expresses a specific protein, including regulatory sequences preceding (5' non-coding sequences) and following (3' non-coding sequences) the coding sequence. "Native gene" refers to a gene as found in nature with its own regulatory sequences. "Chimeric gene" refers to any gene that is not a native gene, comprising regulatory and coding sequences that are not found together in nature. Accordingly, a chimeric gene may comprise regulatory sequences and coding sequences that are derived from different sources, or regulatory sequences and coding sequences derived from the same source, but arranged in a manner different from that found in nature. "Endogenous gene" refers to a native gene in its natural location in the genome of an organism. A "foreign" gene refers to a gene not normally found in the host organism, but that is introduced into the host organism by gene transfer. Foreign genes can comprise native genes inserted into a non-native organism, or chimeric genes. A "transgene" is a gene that has been introduced into the genome by a transformation procedure.

"Coding sequence" refers to a DNA sequence that codes for a specific amino acid sequence. "Regulatory sequences" refer to nucleotide sequences located upstream (5' non-coding sequences), within, or downstream (3' non-coding sequences) of a coding sequence, and which influence the transcription, RNA processing or stability, or translation of the associated coding sequence. Regulatory sequences may include promoters, translation leader sequences, introns and polyadenylation recognition sequences.

"Promoter" refers to a DNA sequence capable of controlling the expression of a coding sequence or functional RNA. In general, a coding sequence is located 3' to a promoter sequence. The promoter sequence consists of proximal and more distal upstream elements, the latter elements often referred to as enhancers. Accordingly, an "enhancer" is a DNA sequence which can stimulate promoter activity and may be an innate element of the promoter or a heterologous element inserted to enhance the level or tissue specificity of a promoter. Promoters may be derived in their entirety from a native gene, or be composed of different elements derived from different promoters found in nature, or even comprise synthetic DNA segments. It is understood by those skilled in the art that different promoters may direct the expression of a gene in different tissues or cell types, or at different stages of development, or in response to different environmental conditions. Promoters which cause a gene to be expressed in most cell types at most times are commonly referred to as "constitutive promoters". New promoters of various types useful in plant cells are constantly being discovered; numerous examples may be found in the compilation by Okamuro and Goldberg, (1989) Biochemistry of Plants 15:1-82. It is further recognized that since in most cases the exact boundaries of regulatory sequences have not been completely defined, DNA fragments of different lengths may have identical promoter activity.

The "translation leader sequence" refers to a DNA sequence located between the promoter sequence of a gene and the coding sequence. The translation leader sequence is present in the fully processed mRNA upstream of the translation start sequence. The translation leader sequence may affect processing of the primary transcript to mRNA, mRNA stability or translation efficiency. Examples of translation leader sequences have been described (Turner, R. and Foster, G.D. (1985) Molecular Biotechnology 3:225).

The "3 'non-coding sequences" refer to DNA sequences located at the lower end of a coding sequence and include polyadenylation recognition sequences and other coding regulatory signal signals capable of affecting mRNA processing or gene expression. The polyadenylation signal is usually characterized in that it affects the addition of polyadenylic acid channels to the 3 'end of the mRNA precursor. The use of different 3 'non-coding sequences is exemplified by Ingelbrecht et. al., (1989) Plant Cell 1: 671-680.

"RNA transcription" refers to the product resulting from transcription catalyzed by an RNA polymerase to a DNA sequence. When the RNA transcript is a perfect complementary copy of a DNA sequence, it is referred to as a primary transcript or it can be an RNA sequence derived from a post-transcriptional process of the primary transcript and is referred to as mature RNA. "Messenger RNA (mRNA)" refers to RNA without introns and can be translated into proteins by the cell. "cDNA" refers to a double strand of DNA that is complementary and derived from mRNA.

"Sense RNA" refers to an RNA transcript that includes the mRNA and so can be translated into protein by the cell. "Antisense RNA" refers to an RNA transcript that is complementary to all or part of a target primary transcript or mRNA and that blocks the expression of a target gene (U.S. Pat. No. 5,107,065). The complementarity of an antisense RNA may be with any part of the specific gene transcript, i.e., the 5' non-coding sequence, 3' non-coding sequence, introns, or the coding sequence. "Functional RNA" refers to antisense RNA, ribozyme RNA, or other RNA that is not translated yet has an effect on cellular processes.
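The complementarity described in this definition can be illustrated with a minimal sketch. The helper name `antisense_rna` and the example sequence are hypothetical and are not taken from the sequences of the invention; the sketch only shows that an antisense strand, written 5' to 3', is the reverse complement of the sense strand.

```python
# Hypothetical helper (not from the source): returns the antisense RNA,
# written 5'->3', for a sense RNA that is also written 5'->3'.
_RNA_COMPLEMENT = str.maketrans("ACGU", "UGCA")


def antisense_rna(sense_rna: str) -> str:
    """Complement each base, then reverse so the result reads 5'->3'."""
    return sense_rna.upper().translate(_RNA_COMPLEMENT)[::-1]


if __name__ == "__main__":
    # Made-up 9-nucleotide example sequence.
    print(antisense_rna("AUGGCUUAA"))
```

Applying the function twice returns the original sequence, which is a quick check that the two strands are mutually complementary.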

The term "operably linked" refers to the association of nucleic acid sequences on a single nucleic acid fragment so that the function of one is affected by the other. For example, a promoter is operably linked to a coding sequence when it is capable of affecting the expression of that coding sequence (i.e., the coding sequence is under the transcriptional control of the promoter). Coding sequences can be operably linked to regulatory sequences in sense or antisense orientation.

The term "expression", as used herein, refers to the transcription and stable accumulation of sense (mRNA) or antisense RNA derived from the nucleic acid fragment of the invention. Expression may also refer to the translation of mRNA into a polypeptide. "Antisense inhibition" refers to the production of antisense RNA transcripts capable of suppressing the expression of the target protein. "Overexpression" refers to the production of a gene product in transgenic organisms that exceeds levels of production in normal or non-transformed organisms. "Co-suppression" refers to the production of sense RNA transcripts capable of suppressing the expression of identical or substantially similar foreign or endogenous genes (U.S. Pat. No. 5,231,020).

"Altered levels" refers to the production of gene products in transgenic organisms in amounts or proportions that differ from those of normal or non-transformed organisms. "Mature protein" refers to a post-translationally processed polypeptide; i.e., one from which any pre- or propeptides present in the primary translation product have been removed.

"Precursor protein" refers to the primary product of translation of mRNA; i.e., with pre- and propeptides still present. Pre- and propeptides may be, but are not limited to, intracellular localization signals.

A "chloroplast transit peptide" is an amino acid sequence which is translated in conjunction with a protein and directs the protein to the chloroplast or other plastid types present in the cell in which the protein is made. "Chloroplast transit sequence" refers to a nucleotide sequence that encodes a chloroplast transit peptide. A "signal peptide" is an amino acid sequence which is translated in conjunction with a protein and directs the protein to the secretory system (Chrispeels, J.J., (1991) Ann. Rev. Plant Phys. Plant Mol. Biol. 42:21-53). If the protein is to be directed to a vacuole, a vacuolar targeting signal (supra) can further be added, or if to the endoplasmic reticulum, an endoplasmic reticulum retention signal (supra) may be added. If the protein is to be directed to the nucleus, any signal peptide present should be removed and a nuclear localization signal included instead (Raikhel (1992) Plant Phys. 100:1627-1632).

"Transformation" refers to the transfer of a nucleic acid fragment into the genome of a host organism, resulting in genetically stable inheritance. Host organisms containing the transformed nucleic acid fragments are referred to as "transgenic" organisms. Examples of methods of plant transformation include Agrobacterium-mediated transformation (De Blaere et al. (1987) Meth. Enzymol. 143:277) and particle-accelerated or "gene gun" transformation technology (Klein et al. (1987) Nature (London) 327:70-73; U.S. Pat. No. 4,945,050).

Standard recombinant DNA and molecular cloning techniques used herein are well known in the art and are described more fully in Sambrook, J., Fritsch, E.F. and Maniatis, T. Molecular Cloning: A Laboratory Manual; Cold Spring Harbor Laboratory Press: Cold Spring Harbor, 1989 (hereinafter "Maniatis").

Nucleic acid fragments encoding at least a portion of several plant amino acid biosynthetic enzymes have been isolated and identified by comparison of random plant cDNA sequences to public databases containing nucleotide and protein sequences, using the BLAST algorithms well known to those skilled in the art. Table 1 lists the amino acid biosynthetic enzymes described herein, and the designation of the cDNA clones that comprise the nucleic acid fragments encoding these enzymes.

TABLE 1
Amino acid biosynthetic enzymes

The nucleic acid fragments of the present invention can be used to isolate cDNAs and genes encoding homologous enzymes from the same or other plant species. The isolation of homologous genes using sequence-dependent protocols is well known in the art. Examples of sequence-dependent protocols include, but are not limited to, methods of nucleic acid hybridization, and methods of DNA and RNA amplification as exemplified by various uses of nucleic acid amplification technologies (e.g., polymerase chain reaction, ligase chain reaction).

For example, genes encoding other amino acid biosynthetic enzymes, either as cDNAs or genomic DNAs, could be isolated directly by using all or a portion of the instant nucleic acid fragments as DNA hybridization probes to screen libraries from any desired plant, employing methodology well known to those skilled in the art. Specific oligonucleotide probes based upon the instant nucleic acid sequences can be designed and synthesized by methods known in the art (Maniatis). Moreover, the entire sequences can be used directly to synthesize DNA probes by methods known to the skilled artisan, such as random-primer DNA labeling, nick translation, or end-labeling techniques, or RNA probes using available in vitro transcription systems. In addition, specific primers can be designed and used to amplify a part or the full length of the instant sequences. The resulting amplification products can be labeled directly during amplification reactions or labeled after amplification reactions, and used as probes to isolate full-length cDNA or genomic fragments under conditions of appropriate stringency.

In addition, two short segments of the instant nucleic acid fragments may be used in polymerase chain reaction protocols to amplify longer nucleic acid fragments encoding homologous genes from DNA or RNA. The polymerase chain reaction may also be performed on a library of cloned nucleic acid fragments wherein the sequence of one primer is derived from the instant nucleic acid fragments, and the sequence of the other primer takes advantage of the presence of the polyadenylic acid tracts at the 3' end of the mRNA precursor encoding plant genes. Alternatively, the second primer sequence may be based upon sequences derived from the cloning vector. For example, the skilled artisan can follow the RACE protocol (Frohman et al., (1988) PNAS USA 85:8998) to generate cDNAs by using PCR to amplify copies of the region between a single point in the transcript and the 3' or 5' end. Primers oriented in the 3' and 5' directions can be designed from the instant sequences.
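As an illustration of the primer arrangement described above, the following sketch predicts the product of a PCR in which one primer is gene-specific and the other anneals to the poly(A) tract. It is a simplified, exact-match model under stated assumptions: the function names, the template and both primer sequences are hypothetical, and real primer design must also consider melting temperature, mismatches and secondary structure.

```python
# Simplified in-silico PCR sketch; all sequences below are made up.
_DNA_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (5'->3')."""
    return seq.upper().translate(_DNA_COMPLEMENT)[::-1]


def predicted_amplicon(template: str, forward_primer: str, reverse_primer: str):
    """Return the top-strand region bounded by the two primers, or None.

    The forward primer must match the top strand exactly; the reverse
    primer must match the bottom strand, so its reverse complement is
    searched for on the top strand, downstream of the forward primer.
    """
    template = template.upper()
    start = template.find(forward_primer.upper())
    if start == -1:
        return None
    reverse_site = reverse_complement(reverse_primer)
    end = template.find(reverse_site, start + len(forward_primer))
    if end == -1:
        return None
    return template[start:end + len(reverse_site)]


if __name__ == "__main__":
    # Hypothetical cDNA ending in a poly(A) tract.
    cdna = "GGCCATGGCTAGCAAGGTTCTGAGCTAAAAAAAAAAAA"
    gene_specific = "ATGGCTAGC"   # stands in for a primer from the instant sequences
    oligo_dt = "TTTTTTTT"         # anneals to the poly(A) tract
    print(predicted_amplicon(cdna, gene_specific, oligo_dt))
```

Replacing `oligo_dt` with a primer derived from the cloning vector would model the alternative mentioned in the text.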

Using commercially available 3' RACE or 5' RACE systems (BRL), specific 3' or 5' cDNA fragments can be isolated (Ohara et al., (1989) PNAS USA 86:5673; Loh et al., (1989) Science 243:217). Products generated by the 3' and 5' RACE procedures can be combined to generate full-length cDNAs (Frohman, M.A. and Martin, G.R., (1989) Techniques 1:165).

The availability of the instant nucleotide and deduced amino acid sequences facilitates immunological screening of cDNA expression libraries. Synthetic peptides representing portions of the instant amino acid sequences may be synthesized. These peptides can be used to immunize animals to produce polyclonal or monoclonal antibodies with specificity for peptides or proteins comprising the amino acid sequences. These antibodies can then be used to screen cDNA expression libraries to isolate full-length cDNA clones of interest (Lerner, R.A. (1984) Adv. Immunol. 36:1; Maniatis).

The nucleic acid fragments of the present invention can be used to create transgenic plants in which the disclosed biosynthetic enzymes are present at higher or lower levels than normal, or in cell types or developmental stages in which they are not normally found. This would have the effect of altering the level of free amino acids in those cells.

Overexpression of the biosynthetic enzymes of the present invention may be accomplished by first constructing chimeric genes in which the coding regions are operably linked to promoters capable of directing expression of a gene in the desired tissues at the desired stage of development. For reasons of convenience, the chimeric genes may comprise promoter sequences and translation leader sequences derived from the same genes. 3' non-coding sequences encoding transcription termination signals can also be provided. The instant chimeric genes may also comprise one or more introns in order to facilitate gene expression.

Plasmid vectors comprising the instant chimeric genes can then be constructed. The choice of plasmid vector depends upon the method that will be used to transform host plants. The skilled artisan is well aware of the genetic elements that must be present on the plasmid vector in order to successfully transform, select and propagate host cells containing the chimeric genes. The skilled artisan will also recognize that different independent transformation events will result in different levels and patterns of expression (Jones et al., (1985) EMBO J. 4:2411-2418; De Almeida et al., (1989) Mol. Gen. Genetics 218:78-86), and thus that multiple events must be screened in order to obtain lines displaying the desired expression level and pattern. Such screening may be accomplished by Southern analysis of DNA, Northern analysis of mRNA expression, Western analysis of protein expression, or phenotypic analysis.

For some applications it may be useful to direct the instant biosynthetic enzymes to different cellular compartments, or to facilitate their secretion from the cell. It is thus envisioned that the chimeric genes described above may be further supplemented by altering the coding sequences to encode enzymes with appropriate intracellular targeting sequences, such as transit sequences (Keegstra, K. (1989) Cell 56:247-253), signal sequences or sequences encoding endoplasmic reticulum localization (Chrispeels, J.J., (1991) Ann. Rev. Plant Phys. Plant Mol. Biol. 42:21-53), or nuclear localization signals (Raikhel, N. (1992) Plant Phys. 100:1627-1632) added, and/or with targeting sequences that are already present removed. While the references cited give examples of each of these, the list is not exhaustive and more useful targeting signals may be discovered in the future.

It may also be desirable to reduce or eliminate expression of the genes encoding the instant biosynthetic enzymes in plants for some applications. In order to accomplish this, chimeric genes designed for co-suppression of the instant biosynthetic enzymes can be constructed by linking the genes or gene fragments encoding the enzymes to plant promoter sequences. Alternatively, chimeric genes designed to express antisense RNA for all or part of the instant nucleic acid fragments can be constructed by linking the genes or gene fragments in reverse orientation to plant promoter sequences. Either the co-suppression or the antisense chimeric genes can be introduced into plants via transformation, whereby expression of the corresponding endogenous genes is reduced or eliminated.

The instant amino acid biosynthetic enzymes (or portions of the enzymes) may be produced in heterologous host cells, particularly in the cells of microbial hosts, and can be used to prepare antibodies to the enzymes by methods well known to those skilled in the art. The antibodies are useful for detecting the enzymes in situ in cells or in vitro in cell extracts. Preferred heterologous host cells for production of the instant amino acid biosynthetic enzymes are microbial hosts. Microbial expression systems and expression vectors containing regulatory sequences that direct high-level expression of foreign proteins are well known to those skilled in the art. Any of these could be used to construct chimeric genes for production of the instant amino acid biosynthetic enzymes. These chimeric genes could then be introduced into appropriate microorganisms via transformation to provide high-level expression of the enzymes. An example of a vector for high-level expression of the instant amino acid biosynthetic enzymes in a bacterial host is provided (Example 11).

Additionally, the instant plant amino acid biosynthetic enzymes can be used as targets to facilitate the design and/or identification of inhibitors of the enzymes that may be useful as herbicides. This is desirable because the enzymes described herein catalyze various steps in a pathway leading to production of several essential amino acids. Accordingly, inhibition of the activity of one or more of the enzymes described herein could lead to inhibition of amino acid biosynthesis sufficient to inhibit plant growth. Thus, the instant plant amino acid biosynthetic enzymes could be appropriate targets for the discovery and design of new herbicides.

All or a substantial portion of the nucleic acid fragments of the present invention may also be used as probes for genetically and physically mapping the genes of which they are a part, and as markers for traits linked to those genes. Such information may be useful in plant breeding in order to develop lines with desired phenotypes. For example, the instant nucleic acid fragments may be used as restriction fragment length polymorphism (RFLP) markers. Southern blots (Maniatis) of restriction-digested plant genomic DNA may be probed with the nucleic acid fragments of the present invention. The resulting banding patterns may then be subjected to genetic analysis using computer programs such as MapMaker (Lander et al., (1987) Genomics 1:174-181) in order to construct a genetic map. In addition, the nucleic acid fragments of the present invention may be used to probe Southern blots containing restriction endonuclease-treated genomic DNAs of a set of individuals representing the parents and progeny of a defined genetic cross. Segregation of the DNA polymorphisms is noted and used to calculate the position of the instant nucleic acid sequence in the genetic map previously obtained using this population (Botstein, D. et al., (1980) Am. J. Hum. Genet. 32:314-331).
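The way segregation data translate into a map position can be sketched with a two-point calculation. This is a deliberately simplified illustration and not the multipoint maximum-likelihood procedure used by programs such as MapMaker: it assumes a backcross population, markers in coupling phase, no missing data, and the Haldane mapping function. The genotype strings and function names are hypothetical.

```python
import math


def recombination_fraction(marker_a: str, marker_b: str) -> float:
    """Fraction of progeny whose genotype differs between two markers.

    Each string holds one genotype code per backcross individual
    (e.g. 'A' = recurrent-parent homozygote, 'H' = heterozygote).
    """
    if len(marker_a) != len(marker_b) or not marker_a:
        raise ValueError("markers must be scored on the same, non-empty progeny set")
    recombinants = sum(1 for a, b in zip(marker_a, marker_b) if a != b)
    return recombinants / len(marker_a)


def haldane_cm(r: float) -> float:
    """Haldane map distance in centimorgans; infinite when r >= 0.5 (unlinked)."""
    if r >= 0.5:
        return math.inf
    return -50.0 * math.log(1.0 - 2.0 * r)


if __name__ == "__main__":
    # Made-up scores for ten backcross progeny.
    rflp_probe = "AAHHAHHAAH"     # banding pattern obtained with the new probe
    known_marker = "AAHHAHHAHH"   # previously mapped marker
    r = recombination_fraction(rflp_probe, known_marker)
    print(f"r = {r:.2f}, distance = {haldane_cm(r):.1f} cM")
```

With real populations, the probe would be compared against every marker on the existing map and placed in the interval that gives the smallest distances.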

The production and use of plant gene-derived probes for use in genetic mapping is described in Bernatzky, R. and Tanksley, S.D. (1986) Plant Mol. Biol. Reporter 4(1):37-41. Numerous publications describe genetic mapping of specific cDNA clones using the methodology outlined above or variations thereof. For example, F2 intercross populations, backcross populations, randomly mated populations, near-isogenic lines, and other sets of individuals may be used for mapping. Such methodologies are well known to those skilled in the art.

Nucleic acid probes derived from the instant nucleic acid sequences may also be used for physical mapping (i.e., placement of sequences on physical maps; see Hoheisel, J.D., et al., In: Nonmammalian Genomic Analysis: A Practical Guide, Academic Press 1996, pp. 319-346, and references cited therein).

In another embodiment, nucleic acid probes derived from the instant nucleic acid sequences may be used in direct fluorescence in situ hybridization (FISH) mapping (Trask, B.J. (1991) Trends Genet. 7:149-154). Although current methods of FISH mapping favor the use of large clones (several to several hundred kb; see Laan, M. et al. (1995) Genome Research 5:13-20), improvements in sensitivity may allow FISH mapping to be performed using shorter probes.

A variety of nucleic acid amplification-based methods of genetic and physical mapping may be carried out using the instant nucleic acid sequences. Examples include allele-specific amplification (Kazazian, H.H. (1989) J. Lab. Clin. Med. 114(2):95-96), polymorphism of PCR-amplified fragments (CAPS; Sheffield, V.C. et al. (1993) Genomics 16:325-332), allele-specific ligation (Landegren, U. et al. (1988) Science 241:1077-1080), nucleotide extension reactions (Sokolov, B.P. (1990) Nucleic Acid Res. 18:3671), Radiation Hybrid Mapping (Walter, M.A. et al. (1997) Nature Genetics 7:22-28) and Happy Mapping (Dear, P.H. and Cook, P.R. (1989) Nucleic Acid Res. 17:6795-6807). For these methods, the sequence of a nucleic acid fragment is used to design and produce primer pairs for use in the amplification reaction or in primer extension reactions. The design of such primers is well known to those skilled in the art. In methods employing PCR-based genetic mapping, it may be necessary to identify DNA sequence differences between the parents of the mapping cross in the region corresponding to the instant nucleic acid sequence. This, however, is generally not necessary for mapping methods.

Loss-of-function mutant phenotypes may be identified for the instant cDNA clones either by targeted gene disruption protocols or by identifying specific mutants for these genes contained in a maize population carrying mutations in all possible genes (Ballinger and Benzer, (1989) Proc. Natl. Acad. Sci. USA 86:9402; Koes et al., (1995) Proc. Natl. Acad. Sci. USA 92:8149; Bensen et al., (1995) Plant Cell 7:75). The latter approach may be accomplished in two ways. First, short segments of the instant nucleic acid fragments may be used in polymerase chain reaction protocols in conjunction with a mutation tag sequence primer on DNAs prepared from a population of plants in which Mutator transposons or some other mutation-causing DNA element has been introduced (see Bensen, supra). The amplification of a specific DNA fragment with these primers indicates the insertion of the mutation tag element in or near the plant gene encoding dihydrodipicolinate reductase, diaminopimelate epimerase, threonine synthase, threonine deaminase, or S-adenosylmethionine synthetase. Alternatively, the instant nucleic acid fragment may be used as a hybridization probe against PCR amplification products generated from the mutation population using the mutation tag sequence primer in conjunction with an arbitrary genomic site primer, such as that for a restriction enzyme site-anchored synthetic adaptor. With either method, a plant containing a mutation in the endogenous gene encoding a dihydrodipicolinate reductase, diaminopimelate epimerase, threonine synthase, threonine deaminase, or S-adenosylmethionine synthetase can be identified and obtained. This mutant plant can then be used to determine or confirm the natural function of the dihydrodipicolinate reductase, diaminopimelate epimerase, threonine synthase, threonine deaminase, or S-adenosylmethionine synthetase gene product.

EXAMPLES The present invention is further defined in the following Examples, in which all parts and percentages are by weight and degrees are Celsius, unless otherwise stated. It should be understood that these Examples, while indicating preferred embodiments of the invention, are given by way of illustration only. From the above discussion and these Examples, one skilled in the art can ascertain the essential characteristics of this invention and, without departing from the spirit and scope thereof, can make various changes and modifications of the invention to adapt it to various uses and conditions.

EXAMPLE 1 Composition of cDNA libraries; isolation and sequencing of cDNA clones

cDNA libraries representing mRNAs from various corn, rice, soybean and wheat tissues were prepared. The characteristics of the libraries are described below.

TABLE 2

* These libraries were normalized essentially as described in U.S. Pat. No. 5,482,845.
** Application of 6-iodo-2-propoxy-3-propyl-4(3H)-quinazolinone; synthesis and methods of using this compound are described in USSN 08/545,827, incorporated herein by reference.

The cDNA libraries were prepared in UNI-ZAP™ XR vectors according to the manufacturer's protocol (Stratagene Cloning Systems, La Jolla, CA). Conversion of the UNI-ZAP™ XR libraries into plasmid libraries was accomplished according to the protocol provided by Stratagene. Upon conversion, the cDNA inserts were contained in the plasmid vector pBluescript. cDNA inserts from randomly picked bacterial colonies containing recombinant pBluescript plasmids were amplified via the polymerase chain reaction using primers specific for vector sequences flanking the inserted cDNA sequences, or plasmid DNA was prepared from cultured bacterial cells. Amplified insert DNAs or plasmid DNAs were sequenced in dye-primer sequencing reactions to generate partial cDNA sequences (expressed sequence tags or "ESTs"; see Adams, M.D. et al., (1991) Science 252:1651). The resulting ESTs were analyzed using a Perkin Elmer Model 377 fluorescent sequencer.

EXAMPLE 2 Identification and characterization of cDNA clones

ESTs encoding plant amino acid biosynthetic enzymes were identified by conducting BLAST (Basic Local Alignment Search Tool; Altschul, S.F., et al., (1993) J. Mol. Biol. 215:403-410; see also www.ncbi.nlm.nih.gov/BLAST/) searches for similarity to sequences contained in the BLAST "nr" database (comprising all non-redundant GenBank CDS translations, sequences derived from the 3-dimensional structure Brookhaven Protein Data Bank, the last major release of the SWISS-PROT protein sequence database, and the EMBL and DDBJ databases). The cDNA sequences obtained in Example 1 were analyzed for similarity to all publicly available DNA sequences contained in the "nr" database using the BLASTN algorithm provided by the National Center for Biotechnology Information (NCBI). The DNA sequences were translated in all reading frames and compared for similarity to all publicly available protein sequences contained in the "nr" database using the BLASTX algorithm (Gish, W. and States, D.J. (1993) Nature Genetics 3:266-272) provided by the NCBI. For convenience, the P-value (probability) of observing a match of a cDNA sequence to a sequence contained in the searched databases merely by chance, as calculated by BLAST, is reported herein as a "pLog" value, which represents the negative of the logarithm of the reported P-value. Accordingly, the greater the pLog value, the greater the likelihood that the cDNA sequence and the BLAST "hit" represent homologous proteins.
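The relationship between a P-value and its pLog can be sketched as follows. The text states only "the negative of the logarithm"; base 10 is assumed here, and the function names are hypothetical.

```python
import math


def plog(p_value: float) -> float:
    """Negative logarithm of a BLAST P-value (base 10 is an assumption)."""
    if not 0.0 < p_value <= 1.0:
        raise ValueError("P-value must lie in the interval (0, 1]")
    return -math.log10(p_value)


def p_value_from_plog(plog_value: float) -> float:
    """Inverse conversion, under the same base-10 assumption."""
    return 10.0 ** (-plog_value)


if __name__ == "__main__":
    # Under the base-10 assumption, a P-value of 1e-12 gives a pLog of 12.
    print(plog(1e-12))
    # A pLog of 12.60, as reported in Example 3, corresponds to P of roughly 2.5e-13.
    print(p_value_from_plog(12.60))
```

Because the transformation is monotonic, ranking hits by descending pLog is equivalent to ranking them by ascending P-value.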

EXAMPLE 3 Characterization of cDNA clones encoding polypeptides homologous to dihydrodipicolinate reductase

The BLASTX search using the nucleotide sequences from clones csin.pk0042.a3 and rls2.pk0017.d3 revealed similarity of the proteins encoded by the cDNAs to dihydrodipicolinate reductase of Synechocystis sp. (DDBJ Accession No. D90899). The BLAST pLog values were 12.60 and 11.68 for csin.pk0042.a3 and rls2.pk0017.d3, respectively.

The sequence of the entire cDNA insert in clone csin.pk0042.a3 was determined and is shown in SEQ ID NO: 1; the deduced amino acid sequence of this cDNA is shown in SEQ ID NO: 2. The amino acid sequence set forth in SEQ ID NO: 2 was evaluated by BLASTP, yielding a pLog value of 36.72 against the dihydrodipicolinate reductase sequence of Synechocystis sp. The sequence of a portion of the cDNA insert from clone rls2.pk0017.d3 is shown in SEQ ID NO: 3; the deduced amino acid sequence of this cDNA is shown in SEQ ID NO: 4. Figure 2 shows an alignment of the amino acid sequence set forth in SEQ ID NO: 2 and the dihydrodipicolinate reductase sequence of Synechocystis sp. (SEQ ID NO: 5). SEQ ID NO: 2 is 40% identical to the dihydrodipicolinate reductase sequence of Synechocystis sp. (SEQ ID NO: 5). Sequence alignments were performed by the Clustal method of alignment (Higgins, D.G. and Sharp, P.M. (1989) CABIOS 5:151-153) using the Megalign program of the LASARGENE bioinformatics computing suite (DNASTAR Inc., Madison, WI). Percent sequence identity calculations were performed by the Jotun Hein method (Hein, J.J. (1990) Meth. Enz. 183:626-645) using the Megalign program of the LASARGENE bioinformatics computing suite (DNASTAR Inc., Madison, WI).
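To make the notion of percent identity concrete, the sketch below counts identical residues in a pairwise alignment that has already been computed. It is not the Jotun Hein method used in the Examples; it assumes one common convention (identical residues divided by the number of columns in which neither sequence has a gap), and the two aligned fragments are made up rather than taken from the Sequence Listing.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over the gap-free columns of a pairwise alignment."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for res_a, res_b in zip(aligned_a, aligned_b):
        if res_a == gap or res_b == gap:
            continue  # columns containing a gap are not counted
        compared += 1
        if res_a == res_b:
            matches += 1
    if compared == 0:
        return 0.0
    return 100.0 * matches / compared


if __name__ == "__main__":
    # Hypothetical aligned peptide fragments.
    print(percent_identity("MKT-AYL", "MRTQA-L"))
```

Other conventions divide by the full alignment length or by the length of the shorter sequence, so identity values are comparable only when the same convention is used throughout.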

The sequence alignments and the BLAST scores and probabilities indicate that the instant nucleic acid fragments encode a nearly entire corn dihydrodipicolinate reductase and a portion of a rice dihydrodipicolinate reductase. These sequences represent the first plant sequences encoding dihydrodipicolinate reductase.

EXAMPLE 4 Characterization of cDNA clones encoding polypeptides homologous to diaminopimelate epimerase

The BLASTX search using the nucleotide sequences from clones chp2.pk0008.h4, rls48.pk0036.h10 and wlm24.pk0030.g4, and the contig sequence assembled from clones se2.pk0005.f1, ses8w.pk0010.h11, sfl1.pk0031.h3 and sgslc.pk002.k12, revealed similarity of the proteins encoded by the cDNAs to diaminopimelate epimerase of Synechocystis sp. (DDBJ Accession No. D90917). The BLAST results for each of these ESTs are shown in Table 3:

TABLE 3
BLAST results for clones encoding polypeptides homologous to diaminopimelate epimerase

The sequence of the entire cDNA insert in clone chp2.pk0008.h4 was determined and is shown in SEQ ID NO: 6; the deduced amino acid sequence of this cDNA is shown in SEQ ID NO: 7. The amino acid sequence set forth in SEQ ID NO: 7 was evaluated by BLASTP, yielding a pLog value of 75.66 against the Synechocystis sp. sequence. The sequence of a portion of the cDNA insert from clone rls48.pk0036.h10 is shown in SEQ ID NO: 8; the deduced amino acid sequence of this cDNA is shown in SEQ ID NO: 9. The nucleotide sequence of the contig assembled from clones se2.pk0005.f1, ses8w.pk0010.h11, sfl1.pk0031.h3 and sgslc.pk002.k12 was determined and is shown in SEQ ID NO: 10; the deduced amino acid sequence of this cDNA is shown in SEQ ID NO: 11. The amino acid sequence set forth in SEQ ID NO: 11 was evaluated by BLASTP, yielding a pLog value of 98.57 against the Synechocystis sp. sequence. The sequence of a portion of the cDNA insert from clone wlm24.pk0030.g4 is shown in SEQ ID NO: 12; the deduced amino acid sequence of this cDNA is shown in SEQ ID NO: 13. Figure 3 shows an alignment of the amino acid sequences set forth in SEQ ID NOs: 7, 9, 11 and 13 and the Synechocystis sp. sequence (SEQ ID NO: 14). The data in Table 4 represent a calculation of the percent identity of the amino acid sequences set forth in SEQ ID NOs: 7, 9, 11 and 13 and the Synechocystis sp. sequence.

TABLE 4
Percent identity of amino acid sequences deduced from the nucleotide sequences of cDNA clones encoding polypeptides homologous to diaminopimelate epimerase

Sequence alignments were performed by the Clustal method of alignment (Higgins, D.G. and Sharp, P.M. (1989) CABIOS 5:151-153) using the Megalign program of the LASARGENE bioinformatics computing suite (DNASTAR Inc., Madison, WI). Percent sequence identity calculations were performed by the Jotun Hein method (Hein, J.J. (1990) Meth. Enz. 183:626-645) using the Megalign program of the LASARGENE bioinformatics computing suite (DNASTAR Inc., Madison, WI).

The sequence alignments and the BLAST scores and probabilities indicate that the instant nucleic acid fragments encode a nearly entire corn diaminopimelate epimerase (chp2.pk0008.h4), a portion of a rice diaminopimelate epimerase (rls48.pk0036.h10), an entire soybean diaminopimelate epimerase (se2.pk0005.f1, ses8w.pk0010.h11, sfl1.pk0031.h3 and sgslc.pk002.k12), and a portion of a wheat diaminopimelate epimerase (wlm24.pk0030.g4). These sequences represent the first plant sequences encoding diaminopimelate epimerase.

EXAMPLE 5 Characterization of cDNA clones encoding threonine synthase

The BLASTX search using the EST sequences from clones cc2.pk0031.c9, csl.pk0058.g5, rls72.pk0018.e7, sel.06a03, srl.pk0003.f6 and wrl.pk0085.h2 revealed similarity of the proteins encoded by the cDNAs to threonine synthase of Arabidopsis thaliana (GenBank Accession No. L41666). The BLAST results for each of these ESTs are shown in Table 5:

TABLE 5
BLAST results for clones encoding polypeptides homologous to threonine synthase

The sequence of the entire cDNA insert in clone cc2.pk0031.c9 was determined and is shown in SEQ ID NO: 15; the deduced amino acid sequence of this cDNA is shown in SEQ ID NO: 16. The amino acid sequence set forth in SEQ ID NO: 16 was evaluated by BLASTP, yielding a pLog value of 166.11 against the Arabidopsis thaliana sequence. BLASTN against dbEST indicated identity of nucleotides 520 to 684 of cc2.pk0031.c9 with nucleotides 1 to 162 of a maize EST (GenBank Accession No. T18847). The sequence of a portion of the cDNA insert from clone csl.pk0058.g5 is shown in SEQ ID NO: 17; the deduced amino acid sequence of this cDNA is shown in SEQ ID NO: 18. The sequence of a portion of the cDNA insert from clone rls72.pk0018.e7 is shown in SEQ ID NO: 19; the deduced amino acid sequence of this cDNA is shown in SEQ ID NO: 20. The sequence of a portion of the cDNA insert from clone sel.06a03 is shown in SEQ ID NO: 21; the deduced amino acid sequence of this cDNA is shown in SEQ ID NO: 22. The sequence of the entire cDNA insert in clone srl.pk0003.f6 was determined and is shown in SEQ ID NO: 23; the deduced amino acid sequence of this cDNA is shown in SEQ ID NO: 24. The amino acid sequence set forth in SEQ ID NO: 24 was evaluated by BLASTP, yielding a pLog value of 275.06 against the Arabidopsis thaliana sequence. The sequence of a portion of the cDNA insert from clone wrl.pk0085.h2 is shown in SEQ ID NO: 25; the deduced amino acid sequence of this cDNA is shown in SEQ ID NO: 26. Figure 4 presents an alignment of the amino acid sequences set forth in SEQ ID NOs: 16, 18, 20, 22, 24 and 26 and the Arabidopsis thaliana sequence. The data in Table 6 represent a calculation of the percent identity of the amino acid sequences set forth in SEQ ID NOs: 16, 18, 20, 22, 24 and 26 and the Arabidopsis thaliana sequence (SEQ ID NO: 27).

TABLE 6
Percent identity of amino acid sequences deduced from the nucleotide sequences of cDNA clones encoding polypeptides homologous to threonine synthase

Sequence alignments were performed by the Clustal method of alignment (Higgins, D.G. and Sharp, P.M. (1989) CABIOS 5:151-153) using the Megalign program of the LASARGENE bioinformatics computing suite (DNASTAR Inc., Madison, WI). Percent sequence identity calculations were performed by the Jotun Hein method (Hein, J.J. (1990) Meth. Enz. 183:626-645) using the Megalign program of the LASARGENE bioinformatics computing suite (DNASTAR Inc., Madison, WI).

The sequence alignments and the BLAST scores and probabilities indicate that the instant nucleic acid fragments encode portions of a corn threonine synthase (cc2.pk0031.c9 and csl.pk0058.g5), a portion of a rice threonine synthase (rls72.pk0018.e7), portions of a soybean threonine synthase (sel.06a03 and srl.pk0003.f6), and a portion of a wheat threonine synthase (wrl.pk0085.h2). These sequences represent the first corn, rice, soybean and wheat sequences encoding threonine synthase.

EXAMPLE 6 Characterization of cDNA clones encoding threonine deaminase

The BLASTX search using the EST sequence from clone cenl.pk0064.f4 revealed similarity of the protein encoded by the cDNA to threonine deaminase of Burkholderia cepacia (GenBank Accession No. U40630, pLog = 31.38). The BLASTX search using the EST sequences from clones sfl1.pk0055.h7 and sre.pk0044.f3 revealed similarity of the proteins encoded by the cDNAs to threonine deaminase of Solanum tuberosum and Burkholderia cepacia (EMBL Accession No. X67846 and GenBank Accession No. U40630, respectively). The BLAST pLog values were 36.55 and 31.79 for sfl1.pk0055.h7, and 19.47 and 14.51 for sre.pk0044.f3.

The sequence of the entire cDNA insert in clone cenl.pk0064.f4 was determined and is shown in SEQ ID NO: 28; the deduced amino acid sequence of this cDNA is shown in SEQ ID NO: 29. The amino acid sequence set forth in SEQ ID NO: 29 was evaluated by BLASTP, yielding a pLog value of 134.85 versus the Burkholderia cepacia sequence. The sequence of a portion of the cDNA insert in clone sfll.pk0055.h7 is shown in SEQ ID NO: 30; the deduced amino acid sequence of this cDNA is shown in SEQ ID NO: 31. The sequence of the entire cDNA insert in clone sre.pk0044.f3 was determined and is shown in SEQ ID NO: 32; the deduced amino acid sequence of this cDNA is shown in SEQ ID NO: 33. The amino acid sequence set forth in SEQ ID NO: 33 was evaluated by BLASTP, yielding a pLog value of 19.24 versus the Solanum tuberosum sequence and 15.19 versus the Burkholderia cepacia threonine deaminase sequence. Figure 5 presents an alignment of the amino acid sequences set forth in SEQ ID NOs: 29, 31 and 33 and the Burkholderia cepacia sequence (SEQ ID NO: 34). The data in Table 7 represent a calculation of the percent identity of the amino acid sequences set forth in SEQ ID NOs: 29, 31 and 33 and the Burkholderia cepacia sequence.

TABLE 7 Percent Identity of Amino Acid Sequences Deduced From the Nucleotide Sequences of cDNA Clones Encoding Polypeptides Homologous to Threonine Deaminase

Sequence alignments were performed by the Clustal method of alignment (Higgins, D.G. and Sharp, P.M. (1989) CABIOS 5: 151-153) using the Megalign program of the LASERGENE bioinformatics computing suite (DNASTAR Inc., Madison, WI). Percent identity calculations were performed by the Jotun Hein method (Hein, J.J. (1990) Meth. Enz. 183: 626-645) using the Megalign program of the LASERGENE bioinformatics computing suite (DNASTAR Inc., Madison, WI).

Sequence alignments and BLAST scores and probabilities indicate that the instant nucleic acid fragments encode all or nearly all of a corn threonine deaminase (cenl.pk0064.f4) and portions of soybean threonine deaminases (sfll.pk0055.h7 and sre.pk0044.f3). These sequences represent the first corn and soybean sequences encoding threonine deaminase.

EXAMPLE 7 Characterization of cDNA Clones Encoding S-adenosylmethionine synthetase.

The BLASTX search using the nucleotide sequence from clone cc3.mn0002.d2 revealed similarity of the protein encoded by the cDNA to S-adenosylmethionine synthetase from Oryza sativa (EMBL Accession No. Z26867; pLog = 99.03). The sequence of the entire cDNA insert in clone cc3.mn0002.d2 was determined and is shown in SEQ ID NO: 35; the deduced amino acid sequence of this cDNA is shown in SEQ ID NO: 36. The nucleotide sequence set forth in SEQ ID NO: 35 was evaluated by BLASTN, yielding a pLog value greater than 200 versus the Oryza sativa sequence. Figure 6 presents an alignment of the nucleotide sequence set forth in SEQ ID NO: 35 and the Oryza sativa sequence (SEQ ID NO: 37). The nucleotide sequence in SEQ ID NO: 35 is 88.5% identical over 1216 nucleotides to the nucleotide sequence of the Oryza sativa S-adenosylmethionine synthetase.

The BLASTX search using the nucleotide sequence from clone s2.12b06 revealed similarity of the protein encoded by the cDNA to S-adenosylmethionine synthetase from Lycopersicon esculentum (EMBL Accession No. Z24741; pLog = 62.62). The sequence of the entire cDNA insert in clone s2.12b06 was determined and is shown in SEQ ID NO: 38; the deduced amino acid sequence of this cDNA is shown in SEQ ID NO: 39. The nucleotide sequence set forth in SEQ ID NO: 38 was evaluated by BLASTN, yielding a pLog value greater than 200 versus the Lycopersicon esculentum sequence. Figure 7 presents an alignment of the nucleotide sequence set forth in SEQ ID NO: 38 and the Lycopersicon esculentum sequence (SEQ ID NO: 40). The nucleotide sequence set forth in SEQ ID NO: 38 is 82% identical over 1210 nucleotides to the Lycopersicon esculentum sequence.

The BLASTX search using the nucleotide sequence of the contig assembled from clones wrel.pk0002.cl2, wleln.pk0070.b8, wkmlc.pk0003.g4, wlkl.pk0028.d3, wrelnpkl70.d8, wrl.pk0086.d5, wrl.pk0103.h8, and wreln.pk0082.b2 revealed similarity of the protein encoded by the contig to S-adenosylmethionine synthetase from Hordeum vulgare (DDBJ Accession No. 63835), with a pLog value greater than 200. The nucleotide sequence of the contig assembled from clones wrel.pk0002.cl2, wleln.pk0070.b8, wkmlc.pk0003.g4, wlkl.pk0028.d3, wrelnpkl70.d8, wrl.pk0086.d5, wrl.pk0103.h8, and wreln.pk0082.b2 is shown in SEQ ID NO: 41; the deduced amino acid sequence of this cDNA is shown in SEQ ID NO: 42. Figure 8 presents an alignment of the nucleotide sequence set forth in SEQ ID NO: 41 and the Hordeum vulgare sequence (SEQ ID NO: 43). SEQ ID NO: 41 is 92% identical to the Hordeum vulgare sequence.

Sequence alignments and BLAST scores and probabilities indicate that the instant nucleic acid fragments encode all or nearly all of the S-adenosylmethionine synthetase from corn, soybean or wheat. These sequences represent the first corn, soybean and wheat sequences encoding S-adenosylmethionine synthetase.

EXAMPLE 8 Expression of Chimeric Genes in Monocotyledon Cells.

A chimeric gene comprising a cDNA encoding a plant amino acid biosynthetic enzyme in sense orientation with respect to the maize 27 kD zein promoter that is located 5' to the cDNA fragment, and the 10 kD zein 3' end that is located 3' to the cDNA fragment, can be constructed. The cDNA fragment of this gene can be generated by polymerase chain reaction (PCR) of the cDNA clone using appropriate oligonucleotide primers under appropriate experimental conditions. Cloning sites (NcoI or SmaI) can be incorporated into the oligonucleotides to provide proper orientation of the DNA fragment when inserted into the digested vector pML103 as described below. The amplified DNA can then be digested with the restriction enzymes NcoI and SmaI and fractionated on a 0.7% low melting point agarose gel in 40 mM Tris-acetate, pH 8.5, 1 mM EDTA. The appropriate band can be excised from the gel, melted at 68 °C and combined with a 4.9 kb NcoI-SmaI fragment of plasmid pML103. Plasmid pML103 has been deposited under the terms of the Budapest Treaty at the ATCC (American Type Culture Collection, 10801 University Boulevard, Manassas, VA 20110-2209), and bears accession number ATCC 97366. The DNA segment from pML103 contains a 1.05 kb NcoI-SmaI promoter fragment of the maize 27 kD zein gene and a 0.96 kb SmaI-SalI fragment from the 3' end of the maize 10 kD zein gene in the vector pGem9Zf(+) (Promega). Vector and insert DNA can be ligated at 15 °C overnight, essentially as described (Maniatis). The ligated DNA can then be used to transform E. coli XL1-Blue (Epicurian Coli XL-1 Blue; Stratagene). Bacterial transformants can be screened by restriction enzyme digestion of plasmid DNA and limited nucleotide sequence analysis using the dideoxy chain termination method (Sequenase™ DNA Sequencing Kit; U.S. Biochemical). The resulting plasmid construct comprises a chimeric gene encoding, in the 5' to 3' direction, the maize 27 kD zein promoter, a cDNA fragment encoding a plant amino acid biosynthetic enzyme, and the 10 kD zein 3' region.
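Because the amplified fragment is cut with NcoI and SmaI before ligation, it is useful to confirm that the cDNA itself does not contain either recognition site. The following is a minimal sketch of that check, assuming the standard recognition sequences CCATGG (NcoI) and CCCGGG (SmaI). The helper names and the toy insert are hypothetical, and the sketch ignores reading-frame and primer-design details.

```python
NCOI_SITE = "CCATGG"  # NcoI recognition sequence (contains an ATG)
SMAI_SITE = "CCCGGG"  # SmaI recognition sequence (blunt cutter)


def find_sites(sequence: str, site: str) -> list[int]:
    """Return 0-based start positions of every occurrence of `site`."""
    sequence = sequence.upper()
    positions = []
    start = sequence.find(site)
    while start != -1:
        positions.append(start)
        start = sequence.find(site, start + 1)
    return positions


def internal_sites(insert: str) -> dict[str, list[int]]:
    """Map each enzyme name to the positions of its sites inside the insert."""
    return {
        "NcoI": find_sites(insert, NCOI_SITE),
        "SmaI": find_sites(insert, SMAI_SITE),
    }


def add_flanking_sites(insert: str) -> str:
    """Model primer-added sites: NcoI at the 5' end, SmaI at the 3' end."""
    return NCOI_SITE + insert.upper() + SMAI_SITE


# Toy insert (made up); a real cDNA would be read from the sequence listing.
toy_insert = "ATGGCTAAGGTTCTGACT"
print(internal_sites(toy_insert))      # empty lists mean no internal sites
print(add_flanking_sites(toy_insert))
```

If an internal site were found, digestion would cut within the coding region, so a different pair of cloning sites or a partial digest would have to be considered.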

The chimeric gene described above can then be introduced into maize cells by the following procedure. Immature maize embryos can be dissected from developing caryopses derived from crosses of the inbred maize lines H99 and LH132. The embryos are isolated 10 to 11 days after pollination, when they are 1.0 to 1.5 mm long. The embryos are then placed with the axis side facing down and in contact with agarose-solidified N6 medium (Chu et al. (1975) Sci. Sin. Peking 16: 659-668). The embryos are kept in the dark at 27 °C. Friable embryogenic callus, consisting of undifferentiated masses of cells with somatic proembryoids and embryoids borne on suspensor structures, proliferates from the scutellum of these immature embryos. The embryogenic callus isolated from the primary explant can be cultured on N6 medium and subcultured on this medium every 2 to 3 weeks.

Plasmid p35S/Ac (obtained from Dr. Peter Eckes, Hoechst AG, Frankfurt, Germany) can be used in transformation experiments to provide a selectable marker. This plasmid contains the pat gene (see European Patent Publication 0 242 236), which encodes phosphinothricin acetyl transferase (PAT). The enzyme PAT confers resistance to herbicidal glutamine synthetase inhibitors such as phosphinothricin. The pat gene in p35S/Ac is under the control of the 35S promoter from Cauliflower Mosaic Virus (Odell et al. (1985) Nature 313: 810-812) and the 3' region of the nopaline synthase gene from the T-DNA of the Ti plasmid of Agrobacterium tumefaciens.

The particle bombardment method (Klein et al. (1987) Nature 327: 70-73) can be used to transfer genes to the callus culture cells. According to this method, gold particles (1 μm in diameter) are coated with DNA using the following technique. Ten μg of plasmid DNAs are added to 50 μL of a suspension of gold particles (60 mg per mL). Calcium chloride (50 μL of a 2.5 M solution) and spermidine free base (20 μL of a 1.0 M solution) are added to the particles. The suspension is vortexed during the addition of these solutions. After 10 minutes, the tubes are briefly centrifuged (5 sec at 15,000 rpm) and the supernatant removed. The particles are resuspended in 200 μL of absolute ethanol, centrifuged again and the supernatant removed. The ethanol rinse is performed again and the particles resuspended in a final volume of 30 μL of ethanol. An aliquot (5 μL) of the DNA-coated gold particles can be placed in the center of a Kapton™ flying disc (Bio-Rad Labs). The particles are then accelerated into the corn tissue with a Biolistic™ PDS-1000/He (Bio-Rad Instruments, Hercules CA), using a helium pressure of 1000 psi, a gap distance of 0.5 cm and a flying distance of 1.0 cm.
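The quantities in this coating protocol imply a nominal load per bombardment, which can be worked out as follows. This is a rough sketch: it assumes that all of the gold and all of the DNA are recovered through the ethanol washes, which the text does not state, and the function and key names are illustrative.

```python
def particle_load(gold_ul: float, gold_mg_per_ml: float, dna_ug: float,
                  final_ul: float, aliquot_ul: float) -> dict[str, float]:
    """Nominal gold and DNA per aliquot, assuming no loss during washing."""
    gold_mg = gold_ul / 1000.0 * gold_mg_per_ml  # uL -> mL, then mg
    aliquots = final_ul / aliquot_ul             # shots per preparation
    return {
        "gold_mg_total": gold_mg,
        "aliquots": aliquots,
        "gold_mg_per_aliquot": gold_mg / aliquots,
        "dna_ug_per_aliquot": dna_ug / aliquots,
    }


# Values from the text: 50 uL of 60 mg/mL gold, 10 ug DNA,
# 30 uL final ethanol volume, 5 uL loaded per disc.
maize_prep = particle_load(gold_ul=50, gold_mg_per_ml=60, dna_ug=10,
                           final_ul=30, aliquot_ul=5)
for key, value in maize_prep.items():
    print(f"{key}: {value:.3f}")
```

Under these assumptions one preparation yields six aliquots, each carrying about 0.5 mg of gold and at most about 1.7 μg of DNA.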

For bombardment, the embryogenic tissue is placed on filter paper over agarose-solidified N6 medium. The tissue is arranged as a thin lawn and covers a circular area of about 5 cm in diameter. The petri dish containing the tissue can be placed in the chamber of the PDS-1000/He approximately 8 cm from the stopping screen. The air in the chamber is then evacuated to a vacuum of 28 inches of Hg. The macrocarrier is accelerated with a helium shock wave using a rupture membrane that bursts when the helium pressure in the shock tube reaches 1000 psi.
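The bombardment settings are given in psi and inches of mercury. For readers working in SI units, a small conversion sketch follows. The conversion factors are the standard ones (1 psi = 6.894757 kPa; 1 inHg = 3.386389 kPa); the function names are illustrative.

```python
KPA_PER_PSI = 6.894757   # kilopascals per pound-force per square inch
KPA_PER_INHG = 3.386389  # kilopascals per inch of mercury


def psi_to_kpa(psi: float) -> float:
    """Convert a pressure in psi to kilopascals."""
    return psi * KPA_PER_PSI


def inhg_to_kpa(inches_hg: float) -> float:
    """Convert a pressure in inches of mercury to kilopascals."""
    return inches_hg * KPA_PER_INHG


# Settings quoted in the text for the PDS-1000/He.
print(f"1000 psi  = {psi_to_kpa(1000) / 1000:.2f} MPa")
print(f"28 in. Hg = {inhg_to_kpa(28):.1f} kPa")
```

The 28 inches of Hg figure is converted here only as a magnitude; the text does not say whether it is read as gauge vacuum or as absolute pressure.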

Seven days after bombardment the tissue can be transferred to N6 medium that contains glufosinate (2 mg per liter) and lacks casein and proline. The tissue continues to grow slowly on this medium. After an additional 2 weeks the tissue can be transferred to fresh N6 medium containing glufosinate. After 6 weeks, areas of actively growing callus about 1 cm in diameter can be identified on some of the plates containing the glufosinate-supplemented medium. These calli may continue to grow when subcultured on the selective medium.

Plants can be regenerated from the transgenic callus by first transferring clusters of tissue to N6 medium supplemented with 0.2 mg per liter of 2,4-D. After two weeks the tissue can be transferred to regeneration medium (Fromm et al. (1990) Bio/Technology 8: 833-839).

EXAMPLE 9 Expression of Chimeric Genes in Dicotyledonous Cells

A seed-specific expression cassette composed of the promoter and transcription terminator from the gene encoding the β subunit of the seed storage protein phaseolin from the bean Phaseolus vulgaris (Doyle et al. (1986) J. Biol. Chem. 261: 9228-9238) can be used for expression of the instant amino acid biosynthetic enzymes in transformed soybean. The phaseolin cassette includes about 500 nucleotides upstream (5') from the translation initiation codon and about 1650 nucleotides downstream (3') from the translation stop codon of phaseolin. Between the 5' and 3' regions are the unique restriction endonuclease sites NcoI (which includes the ATG translation initiation codon), SmaI, KpnI and XbaI. The entire cassette is flanked by HindIII sites.

The cDNA fragment of this gene can be generated by polymerase chain reaction (PCR) of the cDNA clone using appropriate oligonucleotide primers. Cloning sites can be incorporated into the oligonucleotides to provide proper orientation of the DNA fragment when inserted into the expression vector. Amplification is then performed, and the isolated fragment is inserted into a pUC18 vector carrying the seed expression cassette.

Plant amino acid biosynthetic enzymes are known to be localized in the chloroplasts. Accordingly, for those enzymes (or polypeptides representing part of the instant amino acid biosynthetic enzymes) that lack a chloroplast targeting signal, the DNA fragment to be inserted into the expression vector can be synthesized by PCR with primers encoding a chloroplast targeting signal. For example, a chloroplast transit sequence equivalent to the cts of the small subunit of ribulose 1,5-bisphosphate carboxylase from soybean (Berry-Lowe et al. (1982) J. Mol. Appl. Genet. 1: 483-498) may be used.

Soybean embryos can then be transformed with the expression vector comprising sequences encoding a plant amino acid biosynthetic enzyme. To induce somatic embryos, cotyledons, 3-5 mm in length, dissected from surface-sterilized, immature seeds of the soybean cultivar A2872, can be cultured in the light or dark at 26 °C on an appropriate agar medium for 6-10 weeks. Somatic embryos which produce secondary embryos are then excised and placed into a suitable liquid medium. After repeated selection for clusters of somatic embryos which multiplied as early, globular-stage embryos, the suspensions are maintained as described below.

Soybean embryogenic suspension cultures can be maintained in 35 mL of liquid medium on a rotary shaker, at 150 rpm, at 26 °C with fluorescent lights on a 16:8 hour day/night schedule. The cultures are subcultured every two weeks by inoculating approximately 35 mg of tissue into 35 mL of liquid medium.

Soybean embryogenic suspension cultures can then be transformed by the method of particle gun bombardment (Klein et al. (1987) Nature (London) 327: 70, U.S. Patent No. 4,945,050). A Du Pont Biolistic™ PDS1000/HE instrument (helium retrofit) can be used for these transformations.

A selectable marker gene which can be used to facilitate soybean transformation is a chimeric gene composed of the 35S promoter from Cauliflower Mosaic Virus (Odell et al. (1985) Nature 313: 810-812), the hygromycin phosphotransferase gene from plasmid pJR225 (from E. coli; Gritz et al. (1983) Gene 25: 179-188) and the 3' region of the nopaline synthase gene from the T-DNA of the Ti plasmid of Agrobacterium tumefaciens. The seed expression cassette comprising the phaseolin 5' region, the fragment encoding the biosynthetic enzyme and the phaseolin 3' region can be isolated as a restriction fragment. This fragment can then be inserted into a unique restriction site of the vector carrying the marker gene.

To 50 μL of a 60 mg/mL suspension of gold particles are added (in order): 5 μL of DNA (1 μg/μL), 20 μL of spermidine (0.1 M), and 50 μL of CaCl2 (2.5 M). The particle preparation is then agitated for three minutes, spun in a microfuge for 10 seconds and the supernatant removed. The DNA-coated particles are then washed once in 400 μL of 70% ethanol and resuspended in 40 μL of anhydrous ethanol. The DNA/particle suspension can be sonicated three times for one second each. Five μL of the DNA-coated gold particles are then loaded on each macrocarrier disk.
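The final reagent concentrations in this coating mixture follow from simple dilution arithmetic (C1·V1 = C2·V2). The sketch below works them out from the volumes given in the text, assuming the volumes are additive; the function name is illustrative.

```python
def final_concentration(stock_conc: float, added_ul: float,
                        total_ul: float) -> float:
    """Concentration after dilution: stock_conc * added volume / total volume."""
    return stock_conc * added_ul / total_ul


# Volumes from the text (all in uL), assumed to be additive.
gold_ul, dna_ul, spermidine_ul, cacl2_ul = 50.0, 5.0, 20.0, 50.0
total_ul = gold_ul + dna_ul + spermidine_ul + cacl2_ul  # 125 uL

cacl2_molar = final_concentration(2.5, cacl2_ul, total_ul)
spermidine_molar = final_concentration(0.1, spermidine_ul, total_ul)
dna_ug_per_ul = final_concentration(1.0, dna_ul, total_ul)

print(f"total volume: {total_ul:.0f} uL")
print(f"CaCl2:        {cacl2_molar:.3f} M")
print(f"spermidine:   {spermidine_molar * 1000:.0f} mM")
print(f"DNA:          {dna_ug_per_ul * 1000:.0f} ng/uL")
```

With these volumes the mixture is about 1.0 M in CaCl2 and 16 mM in spermidine, with 5 μg of DNA on 3 mg of gold.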

Approximately 300-400 mg of a two-week-old suspension culture is placed in an empty 60x15 mm petri dish and the residual liquid removed from the tissue with a pipette. For each transformation experiment, approximately 5-10 plates of tissue are normally bombarded. The membrane rupture pressure is set at 1100 psi and the chamber is evacuated to a vacuum of 28 inches of mercury. The tissue is placed approximately 3.5 inches away from the retaining screen and bombarded three times. Following bombardment, the tissue can be divided in half, placed back into liquid and cultured as described above.

Five to seven days after bombardment, the liquid medium can be exchanged with fresh medium, and eleven to twelve days after bombardment with fresh medium containing 50 mg/mL hygromycin. This selective medium can be refreshed weekly. Seven to eight weeks after bombardment, green, transformed tissue can be observed growing from untransformed, necrotic embryogenic clusters. The isolated green tissue is removed and inoculated into individual flasks to generate new, clonally propagated, transformed embryogenic suspension cultures. Each new line can be treated as an independent transformation event. These suspensions can then be subcultured and maintained as clusters of immature embryos, or regenerated into whole plants by maturation and germination of individual somatic embryos.

EXAMPLE 10 Analysis of the Amino Acid Content of the Seeds of Transformed Plants

To analyze for expression of the chimeric genes in seeds and for the consequences of expression on the amino acid content of the seeds, a seed meal can be prepared by any of a number of suitable methods known to those skilled in the art. The seed meal can be partially or completely defatted, via hexane extraction for example, if desired. Protein extracts can be prepared from the meal and analyzed for enzyme activity. Alternatively, the presence of any of the expressed enzymes can be tested immunologically by methods well known to those skilled in the art. To measure the free amino acid composition of the seeds, free amino acids can be extracted from the meal and analyzed by methods known to those skilled in the art (Bieleski et al. (1966) Anal. Biochem. 17: 278-293). The amino acid composition can then be determined using any commercially available amino acid analyzer. To measure the total amino acid composition of the seeds, meal containing both protein-bound and free amino acids can be acid hydrolyzed to release the protein-bound amino acids, and the composition can then be determined using any commercially available amino acid analyzer. Seeds expressing the instant amino acid biosynthetic enzymes and with altered lysine, threonine, methionine, cysteine and/or isoleucine content as compared to wild type seeds can thus be identified and propagated.

To measure the free amino acid composition of the seeds, free amino acids can be extracted from 8-10 milligrams of seed meal in 1.0 mL of methanol/chloroform/water mixed in a ratio of 12v/5v/3v (MCW) at room temperature. The mixture can be vortexed and then centrifuged in an Eppendorf microcentrifuge for about 3 minutes; approximately 0.8 mL of the supernatant is then decanted. To this supernatant, 0.2 mL of chloroform is added, followed by 0.3 mL of water. The mixture is then vortexed and centrifuged in an Eppendorf microcentrifuge for about 3 minutes. The upper aqueous phase, approximately 1.0 mL, can then be removed and dried down in a Savant Speed Vac concentrator. The samples are then hydrolyzed in 6 N hydrochloric acid, 0.4% β-mercaptoethanol under nitrogen for 24 h at 110-120 °C. Ten percent of the sample can then be analyzed using a Beckman Model 6300 amino acid analyzer with post-column ninhydrin detection. Relative free amino acid levels in the seeds are then compared as ratios of lysine, threonine, methionine, cysteine and/or isoleucine to leucine, thus using leucine as an internal standard.
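Two pieces of arithmetic in this procedure can be sketched in code: splitting the 1.0 mL extraction volume into its 12:5:3 methanol/chloroform/water components, and expressing amino acid levels as ratios to leucine. The function names and the numerical amino acid amounts below are hypothetical, chosen only to show the calculation; they are not measured data.

```python
def mcw_volumes(total_ml: float,
                ratio: tuple[int, int, int] = (12, 5, 3)) -> tuple[float, ...]:
    """Volumes (mL) of methanol, chloroform and water for a given total."""
    parts = sum(ratio)
    return tuple(total_ml * part / parts for part in ratio)


def ratios_to_leucine(amounts: dict[str, float]) -> dict[str, float]:
    """Express each amino acid amount relative to leucine (internal standard)."""
    leucine = amounts["Leu"]
    if leucine <= 0:
        raise ValueError("leucine amount must be positive")
    return {name: value / leucine
            for name, value in amounts.items() if name != "Leu"}


methanol_ml, chloroform_ml, water_ml = mcw_volumes(1.0)
print(f"methanol {methanol_ml:.2f} mL, chloroform {chloroform_ml:.2f} mL, "
      f"water {water_ml:.2f} mL")

# Hypothetical analyzer readings in arbitrary units, for illustration only.
example_amounts = {"Lys": 2.0, "Thr": 1.0, "Leu": 4.0}
print(ratios_to_leucine(example_amounts))
```

Normalizing to leucine removes sample-to-sample differences in the amount of meal extracted, so a transformed line and a wild type control can be compared directly on the ratio.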

EXAMPLE 11 Expression of Chimeric Genes in Microbial Cells

The cDNAs encoding the instant plant amino acid biosynthetic enzymes can be inserted into the T7 E. coli expression vector pET24d (Novagen). Plasmid DNA containing a cDNA can be appropriately digested to release a nucleic acid fragment encoding the enzyme. This fragment can then be purified on a 1% NuSieve GTG™ low melting agarose gel (FMC). The buffer and the agarose contain 10 μg/mL ethidium bromide for visualization of the DNA fragment. The fragment can then be purified from the agarose gel by digestion with GELase™ (Epicentre Technologies) according to the manufacturer's instructions, ethanol precipitated, dried and resuspended in 20 μL of water. Appropriate oligonucleotide adapters can be ligated to the fragment using T4 DNA ligase (New England Biolabs, Beverly, MA). The fragment containing the ligated adapters can be purified from the excess adapters using low melting agarose as described above. The vector pET24d is digested, dephosphorylated with alkaline phosphatase (NEB) and deproteinized with phenol/chloroform as described above. The prepared vector pET24d and the fragment can then be ligated at 16 °C for 15 hours, followed by transformation into DH5 electrocompetent cells (GIBCO BRL). Transformants can be selected on agar plates containing 2xYT medium and 50 μg/mL kanamycin. Transformants containing the gene encoding the enzyme are then screened for the correct orientation with respect to the pET24d T7 promoter by restriction enzyme analysis.

Clones in the correct orientation with respect to the T7 promoter can be transformed into BL21(DE3) competent cells (Novagen) and selected on 2xYT agar plates containing 50 μg/mL kanamycin. A colony arising from this transformation can be grown overnight at 30 °C in 2xYT medium containing 50 μg/mL kanamycin. The culture is then diluted two-fold with fresh medium, allowed to re-grow for 1 h, and induced by adding isopropyl-thiogalactopyranoside to a final concentration of 1 mM. The cells are then harvested by centrifugation after 3 h and resuspended in 50 μL of 50 mM Tris-HCl at pH 8.0 containing 0.1 mM DTT and 0.2 mM phenylmethylsulfonyl fluoride. A small amount of 1 mm glass beads can be added and the mixture sonicated 3 times for about 5 seconds each time with a microprobe sonicator. The mixture is centrifuged and the protein concentration of the supernatant determined. One μg of protein from the soluble fraction of the culture can be separated by SDS-polyacrylamide gel electrophoresis. Gels can be observed for protein bands migrating at the expected molecular weight.
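The induction step calls for a final inducer concentration of 1 mM but gives neither the stock concentration nor the culture volume. The sketch below shows the dilution arithmetic under assumed values (a 100 mM stock and a 10 mL culture, both hypothetical), accounting for the volume contributed by the stock itself; the function name is illustrative.

```python
def stock_volume_ul(final_mm: float, culture_ml: float,
                    stock_mm: float) -> float:
    """Volume of stock (uL) to add so that culture + stock reaches final_mm.

    Solves stock_mm * v = final_mm * (culture_ml + v) for v.
    """
    if stock_mm <= final_mm:
        raise ValueError("stock must be more concentrated than the target")
    volume_ml = final_mm * culture_ml / (stock_mm - final_mm)
    return volume_ml * 1000.0


# Assumed example values: 10 mL culture and 100 mM stock (not from the text).
print(f"{stock_volume_ul(1.0, 10.0, 100.0):.1f} uL of stock")
```

For a stock that is much more concentrated than the target, the result is close to the simpler estimate of final concentration times culture volume divided by stock concentration.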

EXAMPLE 12 Evaluating Compounds for Their Ability to Inhibit the Activity of a Plant Amino Acid Biosynthetic Enzyme

The plant amino acid biosynthetic enzymes described herein can be produced using any number of methods known to those skilled in the art. Such methods include, but are not limited to, bacterial expression as described in Example 11, or expression in eukaryotic cell culture, in planta, and using viral expression systems in suitably infected organisms or cell lines. The instant enzymes can be expressed separately as mature proteins, or can be co-expressed in E. coli or another suitable expression host. In addition, whether expressed separately or in combination, the instant enzymes can be expressed either as mature forms of the proteins as observed in vivo or as fusion proteins covalently attached to a variety of enzymes, proteins or affinity tags. Common fusion protein partners include glutathione S-transferase ("GST"), thioredoxin ("Trx"), maltose binding protein, and C- and/or N-terminal hexahistidine polypeptide ("(His)6").

The fusion proteins can be engineered with a protease recognition site at the fusion point so that the fusion partners can be separated by protease digestion to yield intact mature enzymes. Examples of such proteases include thrombin, enterokinase and factor Xa. However, any protease can be used which specifically cleaves the peptide connecting the fusion protein and the biosynthetic enzyme.

Purification of the instant enzymes, if desired, can utilize any number of separation technologies familiar to those skilled in the art of protein purification. Examples of such methods include, but are not limited to, homogenization, filtration, centrifugation, heat denaturation, ammonium sulfate precipitation, desalting, pH precipitation, ion exchange chromatography, hydrophobic interaction chromatography and affinity chromatography, wherein the affinity ligand represents a substrate, substrate analog or inhibitor. When the enzymes are expressed as fusion proteins, the purification protocol may include the use of an affinity resin which is specific for the fusion protein tag attached to the expressed enzyme, or an affinity resin containing ligands which are specific for the enzyme. For example, an enzyme can be expressed as a fusion protein coupled to the C-terminus of thioredoxin. In addition, a (His)6 peptide can be engineered into the N-terminus of the fused thioredoxin moiety to afford additional opportunities for affinity purification. Other suitable affinity resins could be synthesized by linking the appropriate ligands to any suitable resin such as Sepharose-4B. In an alternate embodiment, a thioredoxin fusion protein can be eluted using dithiothreitol; however, elution can also be accomplished using other reagents which interact to displace the thioredoxin from the resin. These reagents include β-mercaptoethanol or other reduced thiol.

The eluted fusion protein can be subjected to further purification by the traditional means stated above, if desired. Proteolytic cleavage of the thioredoxin fusion protein and the biosynthetic enzyme can be accomplished after the fusion protein is purified or while the protein is still bound to the ThioBond™ affinity resin or another resin.

Crude, partially purified or purified enzyme, either alone or as a fusion protein, can be utilized in assays for the evaluation of compounds for their ability to inhibit the enzymatic activity of the plant amino acid biosynthetic enzymes described herein. Assays can be conducted under well-known experimental conditions which permit optimal enzymatic activity. Examples of assays for many of these enzymes can be found in Methods in Enzymology Vol. V (Colowick and Kaplan eds.) Academic Press, New York, or Methods in Enzymology Vol. XVII (Tabor and Tabor eds.) Academic Press, New York. Specific examples can be found in the following references, each of which is incorporated herein by reference: dihydrodipicolinate reductase can be assayed as described in Farkas et al. (1965) J. Biol. Chem. 240: 4717-4722, or Cremer et al. (1988) J. Gen. Microbiol. 134: 3221-3229; diaminopimelate epimerase can be assayed as described in Work (1962) Methods in Enzymology Vol. V (Colowick and Kaplan eds.) 858-864, Academic Press, New York; threonine synthase can be assayed as described in Giovanelli et al. (1984) Plant Physiol. 76: 285-292 or Curien et al. (1996) FEBS Lett. 390: 85-90; threonine deaminase can be assayed as described in Tomova et al. (1968) Biochemistry (USSR) 33: 200-208 or Dougal (1970) Phytochemistry 5: 959-964; and S-adenosylmethionine synthetase can be assayed as described in Mudd (1960) Biochim. Biophys. Acta 38: 354-355 or Boerjan et al. (1994) Plant Cell 5: 1401-1414.
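The text does not specify how inhibition is to be quantified. One common convention, shown here as a sketch under that assumption, is to express the activity measured in the presence of a test compound as a percentage reduction relative to an uninhibited control. The function name and the example rates are hypothetical.

```python
def percent_inhibition(control_rate: float, treated_rate: float) -> float:
    """Percent reduction in enzyme activity relative to an untreated control."""
    if control_rate <= 0:
        raise ValueError("control rate must be positive")
    return 100.0 * (1.0 - treated_rate / control_rate)


# Hypothetical reaction rates in arbitrary units, for illustration only.
control = 10.0
with_compound = 2.5
print(f"{percent_inhibition(control, with_compound):.1f}% inhibition")
```

A compound that leaves the rate unchanged scores 0%, and one that abolishes activity scores 100%; screening thresholds would be set by the experimenter.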

SEQUENCE LIST (1) GENERAL INFORMATION (i) APPLICANT (A) CONSIGNEE: EIDU PONT DE NEMOURS AND COMPANY (B) STREET: 1007 MARKET STREET (C) CITY: WILMINGTON (D) STATE: DELAWARE (E) COUNTRY: USA (F) ZIP: 19898 (G) TELEPHONE: 302-992-4926 (H) TELEFAX: 302-773-0164 Ti) TELEX: 6717325; ii) TITLE OF THE INVENTION: BIOSYNTHETIC ENZYMES OF PLANT AMINO ACIDS (iii) NUMBER OF SEQUENCES: 43 (iv) LEGIBLE FORMAT IN COMPUTING: (A) TYPE OF MEDIUM: DISKETTE, 3.5 INCH (B) COMPUTER: IBM COMPATIBLE PC (C) OPERATING SYSTEM: MICROSOFT WINDOWS 95 (D) SOFTWARE: MICROSOFT WORD VERSION 7.0A (v) CURRENT APPLICATION REFERENCE: (A) APPLICATION NUMBER: (B) REGISTRATION DATE: (vi) CLASSIFICATION OF PREVIOUS APPLICATION: (A) APPLICATION NUMBER: 60 / 048,771 (B) REGISTRATION DATE: JUNE 6, 1997 (vii) INFORMATION OF THE LAWYER / AGENT (A) NAME: MAJARÍAN, WILLIAM R. (B) REGISTRATION NUMBER: 41,173 (C) REFERENCE / NUMBER OF CEDULA: BB-1087 (2) INFORMATION FOR SEC ID NO: l (i) ) CHARACTERISTICS OF THE SEQUENCE: (A) LENGTH: 908 base pairs (B) TYPE: nucleic acid (C) CHAIN: simple (D) TOPOLOGY: linear (ii) TYPE OF MOLECULE: cDNA (vii) IMMEDIATE SOURCE (B) CLON : csiln.pk0042.a3 (xi) DESCRIPTION OF THE SEQUENCE: SEQ ID NO: l: ACGCGGGACA GATAAGTGGC ATGGACGAGC CGCTGGAGAT CCCTGTGCTG AACGACCTCA 60 CCATGGTTCT GGGCTCCATA GCGCAGTCGA GAGCAACCGG CGTGGTGGTC GACTTCAGCG 120 AGCCTTCAGC TGTTTACGAC AATGTCAAGC AGGCAGCGGC GTTTGGTCTG AGCAGCGTCG 180 TCTACGTTCC GAAAATCGAG CTAGAGACAG TGACTGAACT GTCAGCGTTC TGCGAGAAGG 2 0 CAAGCGGCTG CTTGGTTGCG CCAACGCTGT CGATTGGGTC CGTGCTCCTT CAGCAAGCGG 300 CTATACAGGC CTCGTTCCAC TACAGCAACG TTGAGATTGT GGAATCGAGA CCAAACCCAT 360 CGGATCTTCC ATCGCAAGAT GCAATCCAGA TTGCAAACAA CATATCAGAC CTTGGTCAGA 420 TATACAACAG GGAAGATATG GATTCCAGCA GTCCAGCCAG AGGCCAGCTG CTCGGGGAAG 480 ACGGAGTGCG CGTGCACAGC ATGGTTCTCC CTGGTCTCGT CTCCAGCACG TCGATCAACT 540 TCTCTGGCCC AGGAGAGATG TACACCTTAC GGCATGACGT TGCGAATGTT CAGTGCCTGA 600 TGCCAGGACT GATCCTGGCG ATACGGAAGG TGGTGCGGTT CAAGAACTTG ATTTATGGGC 660 TAGAGAAGTT CTTGTAGTGA 
ACAACAAACA ACCAATGCAA AACATCGACA GGCAACAGGC 720 AAGGCAGATA TCATCTGACG TCGCAACAAC CAAAACGACA GAGATTTGGA AAATAAAGGC 780 TGCACAGAAG ACGTCTGGGG TTTTGTGTGC ACCAGGCTGC GCAGAGAACG TCTGTCATTT 840 TGTGTGCACC ACTACGGCAC TACCTGCTGA GCGCGATTTT TATAAAAAAG GCATGGGAGG 900 GAGATCAT 908 INFORMATION FOR SEQ ID NO: 2: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 224 amino acids (B) TYPE: amino acids (C) STRING: not relevant (D) TOPOLOGY: linear (ii) TYPE OF MOLECULE: peptide (vii) IMMEDIATE SOURCE (B) CLON: csiln .pk0042. a3 (xi) DESCRIPTION OF THE SEQUENCE: SEQ ID NO: 2; Ala Gly Gln He Ser Gly Met Asp Glu "Pro'-Leu - '? L-u He Pro Val -Leu 1 5 10 15. í .-- .A "ll • '. - * 7 J.. * Asn Asp Leu Thr Met Val Leu Gly Ser He Wing G n Ser Arg Wing Thr 20 25 30 Gly Val Val Val Asp Phe Ser Glu Pro Ser Val Val Tyr Asp Asn Val 35 40 45 Lys Gln Wing Wing Wing Phe Gly Leu Ser Val Val Tyr Val Pro Lys 50 55 60 He Glu Leu Glu Thr Val Thr Glu Leu Ser Wing Phe Cys Glu Lys Wing 65 70 75 80 Be Gly Cys Leu Val Wing Pro Thr Leu Be He Gly Ser Val Leu Leu 85 90 95 Gln Gln Wing Wing Gln Wing Being Phe His Tyr Being Asn Val Glu He 100 105 110 Val Glu Ser Arg Pro Asn Pro Ser Asp Leu Pro Ser Gln Asp Ala He 115 120 125 Gln He Ala Asn Asn He Ser Asp Leu Gly Gln He Tyr Asn Arg Glu 130 135 140 Asp Met Asp Being Ser Pro Pro Wing Arg Gly Gln Leu Leu Gly Glu Asp 145 150 155 160 Gly Val Arg Val His Ser Met Val Leu Pro Gly Leu Val Ser Ser Thr 165 170 175 Be He Asn Phe Be Gly Pro Gly Glu Met Tyr Thr Leu Arg His Asp 180 185 190 Val Wing Asn Val Gln Cys Leu Met Pro Gly Leu He Leu Wing He Arg 195 200 205 Lys Val Val Arg Phe Lys Asn Leu He Tyr Gly Leu Glu Lys Phe Leu 210 215 220 (2) INFORMATION FOR SEQ ID NO: 3: (i) CHARACTERISTICS OF THE SEQUENCE: (A) LENGTH: 339 base pairs (B) TYPE: nucleic acids (C) CHAIN: single (D) TOPOLOGY: linear (ii) TYPE OF MOLECULE: cDNA (vii) IMMEDIATE SOURCE (B) CLON: rls2.pk0017.d3 (i) DESCRIPTION OF THE SEQUENCE: SEQ ID NO: 3:GGAGAAATGC AGCAAAGGTC CTCTGCTCAA CGCAGATGCC GCCATCTCAG 60 AGCACAATCA 
AGGTTGTTAT CATTGGGGCG ACAAAAGAGA TTGGAAGAAC GGCAATAGCG 120 GCAGTAAGTA AAGCAAGGGG AATGGAGCTT GCAGGGGCCA TAGATTCTCA GTGTATAGGC 180 CTAGATGCAG GAGAGATAAG TGGCATGGGA AGAACCCTGG AAATTCCGGT GCTCAATGAT 240 CTCACAATGG TTCTGGGCTC AATTGCACAA ACCAGAGCAA CTGGAGTGGT GGTTGATTTT 300 AGTGAACCTT CAACTGTTTA TGATAATGTC AAACAGGCA 339 (2) INFORMATION FOR SEQ ID NO: (i) CHARACTERISTICS OF THE SEQUENCE (A) LENGTH: 113 amino acids (B) TYPE: amino acids (C) CHAIN: not relevant (D) TOPOLOGY: linear (ii) TYPE OF MOLECULE: peptide (vii) IMMEDIATE SOURCE (B) CLON: rls2.pk0017.d3 (xi) DESCRIPTION OF THE SEQUENCE: SEQ ID NO: 4: Lys He Gly Arg Arg Asn Wing Wing Lys Val Leu Cys Ser Thr Gln Met 1 5 10 15 Pro Pro Ser Gln Ser Thr He Lys Val Val He He Gly Wing Thr Lys 20 25 30 Glu He Gly Arg Thr Wing He Wing Wing Val Ser Lys Wing Arg Gly Met 35 40 45 Glu Leu Wing Gly Wing He Asp Ser Gln Cys He Gly Leu Asp Wing Gly 50 55 60 Glu He Be Gly Met Gly Arg Thr Leu Glu He Pro Val Leu Asn Asp 65 70 75 80 Leu Thr Met Val Leu Gly Ser He Wing Gln Thr Arg Wing Thr Gly Val 85 90 95 Val Val Asp Phe Ser Glu Pro Ser Thr Val Tyr Asp Asn Val Lys Gln 100 105 110 Wing (2) INFORMATION FOR SEQ ID NO: 5: (i) CHARACTERISTICS OF THE SEQUENCE (A) LENGTH: 275 amino acids (B) TYPE: amino acid (C) CHAIN: not relevant (D) TOPOLOGY: linear (ii) TYPE OF MOLECULE : peptide (vi) ORIGINAL SOURCE (A) Synechocystus sp (xi) DESCRIPTION OF SEQUENCE: SEQ ID NO: 5: Met Ala Asn Gln Asp Leu He Pro Val Val Val Asn Gly Ala Ala Glv 5 • 10. - - fifteen _. * Lys Met Gly Arg Glu Val He Lys Wing Val Wing Gln Ala Pro Asp Leu. 20 - ^ 25 3p Gln Leu Val Gly Ala Val Asp His Asn Pro Ser ¿eu 'dlñ Gly Gln sp 45 He Gly Glu Val Val Gly He Ala Pro Leu Glu Val ro Val.Leu Ala 50 55 60 Asp Leu Gln Ser Val Leu Val Leu Ala Thr Gln Glu Lys He G n Gly 65 70 75 80 Val Met Val Asp Phe Thr His Pro Ser Gly Val 'Tyr Asp Asn Val Arg 85 90 -' 95 Ser Ala He Ala Tyr Gly Val Arg Pro Val Val Gly Thr Thr Gly I? 
U 100 105 110 Ser Glu Gln Gln Ile Gln Asp Leu Gly Asp Phe Ala Glu Lys Ala Ser 115 120 125 Thr Gly Cys Leu Ile Ala Pro Asn Phe Ala Ile Gly Val Leu Leu Met 130 135 140 Gln Gln Ala Ala Val Gln Ala Cys Gln Tyr Phe Asp His Val Glu Ile 145 150 155 160 Ile Glu Leu His His Asn Gln Lys Ala Asp Ala Pro Ser Gly Thr Ala 165 170 175 Ile Lys Thr Ala Gln Met Leu Ala Glu Met Gly Lys Thr Phe Asn Pro 180 185 190 Pro Ala Val Glu Glu Lys Glu Thr Ile Ala Gly Ala Lys Gly Gly Leu 195 200 205 Gly Pro Gly Gln Ile Pro Ile His Ser Ile Arg Leu Pro Gly Leu Ile 210 215 220 Ala His Gln Glu Val Leu Phe Gly Ser Pro Gly Gln Leu Tyr Thr Ile 225 230 235 240 Arg His Asp Thr Thr Asp Arg Ala Cys Tyr Met Pro Gly Val Leu Leu 245 250 255 Gly Ile Arg Lys Val Val Glu Leu Lys Gly Leu Val Tyr Gly Leu Glu 260 265 270 Lys Leu Leu 275
(2) INFORMATION FOR SEQ ID NO: 6: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 1012 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: cDNA (vii) IMMEDIATE SOURCE: (B) CLONE: chp2.pk0008.
h4 (xi) DESCRIPTION OF THE SEQUENCE: SEQ ID NO: 6: TATTGCCAGA GATGTGTGGT AATGGAGTCC GTTGCTTCGC TCGGTTTATA GCCGAGATTG 60 AAAATCTGCA GGGGACAAAT AGATTCACTA TTCATACTGG TGCTGGAAAG ATCGTTCCTG 120 AAATACAAAG TGATGGGCAG GTAAAGGTTG ATATGGGCGA GCCTATCCTT TCTGGACTAG_180_ACATCCCCAC AAAACTGCTA GCTACCAAGA ACAAAGCTGT TGTTCAAGCT GAATTGGCAG 2 0 TTGAGGGCTT AACATGGCAT GTCACATGTG TTAGCATGGG AAACCCTCAC TGTGTCACAT 300 TTGGTGCAAA TGAGTTAAAG GTATTGCAGG TCGACGATTT AAAACTTAGC GaAATTGGGC 360 CTAAATTTGA GCATCATGAA ATGTTTCCTG CTCGCACAAA CACAGAATTC GTACAGGTTT 420 TGTCTCGCTC ACACCTCAAA ATGCGGGTCT GGGAACGTGG TGCTGGAGCA ACTCTTGCCT 80 GTGGTACTGG TGCTTGTGCA GTGGTTGTTG CAGCTGTTCT TGAGGGTCGA GCTGAGCGGA 540 AATGTGTAGT TGATTTGCCT GGCGGGCCAT TGGAAATTGA GTGGAGGGAG GATGACAATC 600 ATGTTTACAT GACTGGTCCT GCAGAGGTCG TCTTTTATGG ATCTGTTGTT CACTAGGTAC 660 TGGGGACCAA GATAGAAGGG TTGGCTGCCA CTCAGAGCTT GTGAGATTGG TTATAGTATC 720 CATGAAACAG AGTGTTCTGG TACCAGTACA CTTGTXCAGA TATTCTTAAT TATGATTGCT 780 TGATTTGGGT AGCMGTAGAG GCTTCCTTTT GAAGCATTCT AGTGTTCMCC TTTTGTACTC 840 CTTTAGTTTG TCAGGTTTGA ACACTACATG GGTAACATGT CYTTCCCACC ATTTTCYGTT 900 TCTTTTCTTT GTAAGTGAAC GCCAATGCAG TTTTAGTATT GTTTTCTATA GATTTGTCTT 960 GATGCACTGG GCTTACTACT TATTTTCTGG TATGAATGCT GCCTATTTCC TG 1012 (2) INFORMATION FOR SEQ ID NO: 7: (i) CHARACTERISTICS OF THE SEQUENCE: (A) LENGTH: 217 amino acids (B) TYPE: amino acid (C) CHAIN: not relevant (D) TOPOLOGY: linear (ii) TYPE OF MOLECULE: peptide (vii) IMMEDIATE SOURCE (B) CLON: chp2.pk0008.h4 (xi) DESCRIPTION OF SEQUENCE: SEQ ID NO: 7: Leu Pro Glu Met Cys Gly Asn Gly Val Arg Cys Phe Ala Arg Phe He 1 5 1 i r0s 1.5- Wing Glu He Glu Asn Leu Gln Gly Thr Asn? Rg Phe Thr He His Thr 20 5 30 Gly Wing Gly Lys He Val Pro Glu He Gln Ser? 
Asp Gly Gln Val Lys 35 40 45 Val Asp Met Gly Glu Pro Ile Leu Ser Gly Leu Asp Ile Pro Thr Lys 50 55 60 Leu Leu Ala Thr Lys Asn Lys Ala Val Val Gln Ala Glu Leu Ala Val 65 70 75 80 Glu Gly Leu Thr Trp His Val Thr Cys Val Ser Met Gly Asn Pro His 85 90 95 Cys Val Thr Phe Gly Ala Asn Glu Leu Lys Val Leu Gln Val Asp Asp 100 105 110 Leu Lys Leu Ser Glu Ile Gly Pro Lys Phe Glu His His Glu Met Phe 115 120 125 Pro Ala Arg Thr Asn Thr Glu Phe Val Gln Val Leu Ser Arg Ser His 130 135 140 Leu Lys Met Arg Val Trp Glu Arg Gly Ala Gly Ala Thr Leu Ala Cys 145 150 155 160 Gly Thr Gly Ala Cys Ala Val Val Val Ala Ala Val Leu Glu Gly Arg 165 170 175 Ala Glu Arg Lys Cys Val Val Asp Leu Pro Gly Gly Pro Leu Glu Ile 180 185 190 Glu Trp Arg Glu Asp Asp Asn His Val Tyr Met Thr Gly Pro Ala Glu 195 200 205 Val Val Phe Tyr Gly Ser Val Val His 210 215
(2) INFORMATION FOR SEQ ID NO: 8: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 481 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: cDNA (vii) IMMEDIATE SOURCE: (B) CLONE: rls48.pk0036.h10
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8:
TGTATCCGGC GCCGACGGTG TGATCTTCGT CATGCCGGGG GTCAATGGCG CGGACTACAC 60 CATGAGGATC TTCAACTCGG ACGGCAGTGA GCCGGAGATG TGTGGCAATG GAGTCCGTTG 120 CTTTGCCCGG TTTATAGCTG AGCTTGAAAA CCTACAGGGA ACACATAGCT TCAAAATTCA 180 CACTGGCGCT GGGCTAATCA TTCCTGAAAT ACAAAATGAT GGCAAGGTAA AGGTTGATAT 240 GGGCCAGCCC ATTCTCTCTG GACCAGATAT TCCAACAAAA CTGCCATCCA CCAAGAATGA 300 AGCCGTTGTC CAAGCTGATT TGGGCAGTTG ATGGCTCAAC ATGGCAAGTA ACCTGTGTTA 360 GCATGGGCAA TCCACATTGT GTCACATTTG GCACAAAGGA GCTCAAGGTT TTGCATGTTG 420 ATGATTAAAG CTTAATGATA TTGGGGCCTA AATTCAGCAT CATGAAATGT TCCTGCCCCA 480 C 481
(2) INFORMATION FOR SEQ ID NO: 9: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 85 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: not relevant (D) TOPOLOGY: linear (ii) MOLECULE TYPE: peptide (vii) IMMEDIATE SOURCE: (B) CLONE: rls48.pk0036.h10 (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9:
Val Ser Gly Ala Asp Gly Val Ile Phe Val Met Pro Gly Val Asn Gly 1 5 10 15 Ala Asp Tyr Thr Met Arg Ile Phe Asn Ser Asp Gly Ser Glu Pro Glu 20 25 30 Met Cys Gly Asn Gly Val Arg Cys Phe Ala Arg Phe Ile Ala Glu Leu 35 40 45 Glu Asn Leu Gln Gly Thr His Ser Phe Lys Ile His Thr Gly Ala Gly 50 55 60 Leu Ile Ile Pro Glu Ile Gln Asn Asp Gly Lys Val Lys Val Asp Met 65 70 75 80 Gly Gln Pro Ile Leu 85
(2) INFORMATION FOR SEQ ID NO: 10: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 1301 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: cDNA (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10:
ATCCCTTATT AAGCAGGGGT TTCGCGGCGC GAGACGGTGA CACTGGCAGA GTGGAATTTC 60 CGCCGCCATT CGAAGCTACA GCGATGGCCA TAACCGCCAC CATTTCCGTT CCCCTCACAT 120 CCCCCAGTCG CCGCACTCTC ACCTCCGTCA ATAGCCTCTC TCCCCTTTCT ACCCGATCCA 180 CTTTGCCCAC ACCGCAACGC ACTTTCAAAT ACCCTAATTC GCGCCTCGTC GTGTCTTCCA 240 TGAGCACCGA AACAGCCGTC AAAACTTCAT CCGCCTCCTT CCTCAACCGC AAGGAGTCCG 300 GCTTCCTCCA TTTCGCCAAG TACCACGGCC TCGGAAACGA CTTCGTTTTG ATTGACAATA 360 CTC CGAGCCCAAG ATCAGTGCTG AGAAAGCGGT
GCAACTGTGT GATCGGAACT 420 TCGGCGTTGG AGCTGACGGA GTTATCTTTG TCTTGCCTGG CATCAGTGGC ACCGATTATA 480 CCATGAGGAT TTTTAACTCT GATGGTAGTG AGCCTGAGAT GTGTGGCAAT GGAGTTCGAT 540 GCTTTGCCAA ATTTGTTTCT CAGCTTGAGA ATTTACATGG GAGGCATAGT TTTACCATTC 600 ATACTGGTGC TGGTCTGATT ATTCCTGAAG TCTTGGAGGA TGGAAATGTC AGAGTTGATA 660 TGGGGGAGCC AGTTCTTAAA GCCTTGGATG TGCCTACTAA ATTACCTGCA AATAAGGATA 720 ATGCTGTTGT TAAATCACAG CTAGTTGTAG ATGGAGTTAT TTGGCATGTG ACCTGTGTTA 780 GCATGGGGAA TCCACACTGT GTAACTTTCA GTAGAGAAGG AAGCCAGAAT TTGCTTGTTG 8 0 ATGAA TGAA GCTAGCAGAA ATTGGGCCAA AATTTGAACA TCATGAGGTG TTCCCTGCAC 900 GAACTAACAC AGAGTTTGTG CAAGTATTAT CTAACTCTCA CTTGAAAATG CGTGTTTGGG 960 AGCGGGGAGC AGGAGCAACC CTAGCCTGTG GAACTGGAGC TTGTGCTACT GTTGTTGCAG 1020 CAGTTCTTGA GGGTCGTGCT GGGAGGAATT GCACGGTTGA TCTACCTGGA GGGCCTCTTC 1080 AGATTGAGTG GAGGGAGGAA GATAATCATG TTTATATGAC AGGCTCAGCC GATGTÁGTTT 1140 ATTATGGTTC TTTGCCCCTT TGATATGTTG CCCCCATTGT TAAACCCAAT ATGGAATTAG_1200_GAATTGGTGA ATAATATTTG TATGAGAGGT GGACTTTCTG CTTGTTCCTA ATATTTTGCC 1260 ACGTCTTTAT AAAAAAAAAA AAAAAAAAAA AAAAAAAAAAA TO 1301 (2) INFORMATION FOR SEQ ID NO: 11: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 359 amino acids (B) TYPE: amino acid (C) CHAIN: not relevant (D) TOPOLOGY: linear (ii) TYPE OF MOLECULE: peptide (xi) DESCRIPTION OF SEQUENCE: SEQ ID NO: 11: Met Wing He Thr Wing Thr He Ser Val Pro Leu Thr Ser Pro Ser Arg 1 5 - 10 15 Arg Thr Leu Thr Ser Val Asn Ser Leu Ser Pro Leu Ser Thr Arg Ser 20 25 30 Thr Leu Pro Thr Pro Gln Arg Thr Phe Lys Tyr Pro Asn Ser Arg Leu 35 40 45 Val Val Ser Ser Met Ser Thr Glu Thr Wing Val Lys Thr Ser Wing 50 55 60 Ser Phe Leu Asn Arg Lys Glu Ser Gly Phe Leu His Phe Wing Lys Tyr 65 70 75 80 His Gly Leu Gly Asn Asp Phe Val Leu As Asn Arg Asp Be Ser 85 90 95 Glu Pro Lys He Ser Wing Glu Lys Wing Val Gln Leu Cys Asp Arg Asn 100 105"110 Phe Gly Val Gly Wing Asp Gly Val He Phe Val Leu Pro Gly He Ser 115 120 125 Gly Thr Asp Tyr Thr Met Arg He Phe Asn Ser Asp Gly Ser Glu Pro 130 135 140 Glu Met Cys Gly Asn Gly Val Arg Cys 
Phe Ala Lys Phe Val Ser Gln 145 150 155 160 Leu Glu Asn Leu His Gly Arg His Ser Phe Thr He His Thr Gly Wing 165 170 175 Gly Leu He He Pro Glu Val Leu Glu Asp Gly Asn Val Arg Val Asp 180 185 190 Met Gly Glu Pro Val Leu Lys Ala Leu Asp Val Pro Thr Lys Leu Pro 195 200 205 Wing Asn Lys Asp Asn Wing Val Val Lys Ser Gln Leu Val Val Asp Gly 210 215 220 Val He Trp His Val Thr Cys Val Ser Met Gly Asn Pro His Cys Val 225 230 235 240 Thr Phe Ser Arg Glu Gly Ser Gln Asn Leu Leu Val Asp Glu Leu Lys 245 250 255 Leu Wing Glu He Gly Pro Lys Phe Glu His His Glu Val Phe Pro Wing 260 265 270 Arg Thr Asn Thr Glu Phe Val Gln Val Leu Ser Asn Ser His Leu Lys 275 280 285 Met Arg Val Trp Glu Arg Gly Ala Gly Ala Thr Leu Ala Cys Gly Thr 290 295 300 Gly Ala Cys Ala Thr Val Val Ala Ala Ala Val Leu Glu Gly Arg Ala Gly 305 310 315 320 Arg Asn Cys Thr Val Asp Leu Pro Gly Gly Pro Leu Gln He Glu Trp 325 330 335 Arg Glu Glu Asp Asn His Val Tyr Met Thr Gly Ser Wing Asp Val Val 340 345 350 Tyr Tyr Gly Ser Leu Pro Leu 355 (2) INFORMATION FOR SEQ ID NO: 12: (i) CHARACTERISTICS OF SEQUENCE: (A) LENGTH: 602 base pairs (B) TYPE: nucleic acid (C) CHAIN: simple (D) TOPOLOGY: linear (i) TYPE OF MOLECULE: cDNA (Ü) IMMEDIATE SOURCE (B) CLON: Wlm24.pk0030.g4 (xi) DESCRIPTION OF SEQUENCE: SEQ ID NO: 12 CTCCACCGCC CCCTCCTCGG GCGGTCGCCT CCTCCGTCCG TTCTGTGGGA ATCCGCGCCC 60 CCGCCGCGCC GTCGCCTCGA TGGCCGTGTC CGCTCCCAAG TCGCCAGCCG CCGCCTCGTT 120 CCTCGAGCGC CGCGAGTCCG AGCGCGCGCT CCACTTCGTG AAGTACCAGG GCCTCGGCAA 180 CGACTTCATA ATGGTCGACA ACAGGGATTC GGCCGTACCG AAGGTGACAC CGGAGGAGGC 240 GGCGAAGCTA TGCGACCGAA ACTTTGGGTA TTGGGTGCTG ATGGCGTCAT CTTCGTCCTG 300 CCGGGGGGTCA ACGGCGCGGA CTACACTATG AGGATATTCA ACTCCGATGG CAGCAACCGG 360 AATGTNTGGN ATGGATTCGT TGCTTGCTCG CTTTATACGG AGTTGAAATC TACANGGAAA 420 CATACTTCAA AACAANAGGG GGCTGGATTA ATATCCTGAA ATANAHACAT GNAAGTTANG 480 TNATATGGGC AACAATCTTA TGGCANATTT CA AAAATGC ATCACAAGAT AACTTNTAAA 540 ACGATTGAAT TAGGCAANAG AANTACCGTT ATAGGAACCC ATGAAMCTTG TNAAATTAAG 600 GT "602 (2) INFORMATION 
FOR SEQ ID NO: 13: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 80 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: not relevant (D) TOPOLOGY: linear (ii) MOLECULE TYPE: peptide (vii) IMMEDIATE SOURCE: (B) CLONE: wlm24.pk0030.g4 (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13:
Ala Leu His Phe Val Lys Tyr Gln Gly Leu Gly Asn Asp Phe Ile Met 1 5 10 15 Val Asp Asn Arg Asp Ser Ala Val Pro Lys Val Thr Pro Glu Glu Ala 20 25 30 Ala Lys Leu Cys Asp Arg Asn Phe Gly Xaa Gly Ala Asp Gly Val Ile 35 40 45 Phe Val Leu Pro Gly Val Asn Gly Ala Asp Tyr Thr Met Arg Ile Phe 50 55 60 Asn Ser Asp Gly Ser Asn Arg Asn Val Trp Xaa Gly Phe Val Ala Cys 65 70 75 80
(2) INFORMATION FOR SEQ ID NO: 14: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 279 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: not relevant (D) TOPOLOGY: linear (ii) MOLECULE TYPE: peptide (vi) ORIGINAL SOURCE: (A) ORGANISM: Synechocystis sp. (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14:
Met Ala Leu Ser Phe Ser Lys Tyr His Gly Leu Gly Asn Asp Phe Ile 1 5 10 15 Leu Val Asp Asn Arg Gln Ser Thr Glu Pro Cys Leu Thr Pro Asp Gln 20 25 30 Ala Gln Gln Leu Cys Asp Arg His Phe Gly Ile Gly Ala Asp Gly Val 35 40 45 Ile Phe Ala Leu Pro Gly Gln Gly Gly Thr Asp Tyr Thr Met Arg Ile 50 55 60 Phe Asn Ser Asp Gly Ser Glu Pro Glu Met Cys Gly Asn Gly Ile Arg 65 70 75 80 Cys Leu Ala Lys Phe Leu Ala Asp Leu Glu Gly Val Glu Glu Lys Thr 85 90 95 Tyr Arg Ile His Thr Leu Ala Gly Val Ile Thr Pro Gln Leu Leu Ala 100 105 110 Asp Gly Gln Val Lys Val Asp Met Gly Glu Pro Gln Leu Leu Ala Glu 115 120 125 Leu Ile Pro Thr Thr Leu Ala Pro Ala Gly Glu Lys Val Val Asp Leu 130 135 140 Pro Leu Ala Val Ala Gly Gln Thr Trp Ala Val Thr Cys Val Ser Met 145 150 155 160 Gly Asn Pro His Cys Leu Thr Phe Val Asp Asp Val Asp Ser Leu Asn 165 170 175 Leu Thr Glu Ile Gly Pro Leu Phe Glu His His Pro Gln Phe Ser Gln 180 185 190 Arg Thr Asn Thr Glu Phe Ile Gln Val Leu Gly Ser Asp Arg Leu Lys 195 200 205 Met Arg Val Trp Glu Arg Gly Ala Gly Ile Thr Leu Ala Cys Gly Thr 210 215 220
Gly Wing Cys Wing Thr Val Val Wing Wing Val Leu Thr Gly Arg Gly Asp 225 230 235 240 Arg Arg Cys Thr Val Glu Leu Pro Gly Gly Asn Leu - Glu He Glu Trp 245 250 255 Be Wing Gln Asp Asn Arg Leu Tyr Met Thr Gly Pro Wing Gln Arg Val 260 265 270 Phe Ser Gly Gln Wing Glu He 275 (2) INFORMATION FOR SEQ ID NO: 15: (i) CHARACTERISTICS OF THE SEQUENCE: (A) LENGTH : 1160 base pairs (B) TYPE: nucleic acid (C) STRING: simple (D) TOPOLOGY: linear (ii) TYPE OF MOLECULE: cDNA (vii) IMMEDIATE SOURCE (vii) IMMEDIATE SOURCE (B) CLON: cc2.pk0031. c9 (xi) DESCRIPTION OF THE SEQUENCE: SEQ ID NO: 15: GTCGGCTGCG CGTCCACGGG AGACACCTCC GCCGCGCTCT CGGCCTACTG CGCAGCCGCG 60 GGAATCCCCG CCATCGTGTT CCTGCCAGCG GACCGCATCT CGCTGCAGCA GCTCATCCAG 120 CCGATCGCCA ACGGCGCCAC CGTGCTCTCT CTAGACACTG ATTTTGATGG CTGCATGCGG 180 CTCATTCGCG AGGTCACTGC AGAGCTGCCA ATCTACCTTG CCAATTCGCT CAACCCGCTC 240 CGCCTTGAGG GGCAGAAGAC AGCGGCCATC GAGATATTGC AGCAGTTCAA TTGGCAGGTG 300 CCAGATTGGG TCATTGTTCC AGGAGGCAAT CTTGGGAATA TCTATGCATT CTACAAGGGG 360 TTTGAGATGT GCCGCGTTCT TGGACTTGTT GATCGCGTGC CACGGCTTGT CTGCGCACAG 420 GCTGCAAATG CAAATCCATT GTACCGGTAC TACAAGTCAG GTTGGACTGA GTTTGAGCCA 480 CAAACTGCCG AGACTACATT TGCATCTGCG ATACAGATTG GTGATCCTGT ATCTGTTGAC 540 CGTGCGGTGG TCGCGCTGAA GGCCACTGAC GGTATTGTGG AGGAGGCTAC AGAGGAGGAG 600 CTAATGGATG CAACGGCGCT TGCTGACCGC ACTGGGATGT TTGCTTGCCC ACATACTGGG 660 GTTGCACTTG CTGCTTTGTT TAAGCTTCAG GGTCAGCGTA TAATTGGCCC TAATGACCGC 720 ACTGTGGTTG TTAGCACAGC TCATGGGCTG AAGTTCACGC AGTCAAAGAT TGACTACCAT 780 GACAAAAACA TCAAAGACAT GGTTTGCCAG TATGCTAATC CACCGATCAG TGTGAAGGCT 840 GACTTTGGTT CTGTGATGGA TGTTCTCCAG AAAAATCTCA ATGGTAAGAT ATAAAGTTAT 900 ATGATTAATT AACCCTCCAA ACTGTTTTTT TTTGTTTTTT CGTTCCAGGA ATTTTATTCC 960 TGAGTC TTC AACTTTGTTT GGTGAACATG GTATGGTGCT AAAATCTAGA CCTAATACCT 1020 TGTAGTACTA GTTCTGGAGG TCTTTTTGGT TGTAGGTCGA AGTGGATAGA GCTGTTCCTT 1080 GTACTTTATC TGTTTCATGT AATATGAATA ATAAATTATG GTCTAAATAT TTGAATAAAA 1140 AATCGTTTGG AATGACCCAC 1160 (2) INFORMATION FOR SEQ ID NO: 16: (i) CHARACTERISTICS OF 
THE SEQUENCE: (A) LENGTH: 297 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: not relevant (D) TOPOLOGY: linear (ii) MOLECULE TYPE: peptide (vii) IMMEDIATE SOURCE: (B) CLONE: cc2.pk0031.c9 (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16:
Val Gly Cys Ala Ser Thr Gly Asp Thr Ser Ala Ala Leu Ser Ala Tyr 1 5 10 15 Cys Ala Ala Ala Gly Ile Pro Ala Ile Val Phe Leu Pro Ala Asp Arg 20 25 30 Ile Ser Leu Gln Gln Leu Ile Gln Pro Ile Ala Asn Gly Ala Thr Val 35 40 45 Leu Ser Leu Asp Thr Asp Phe Asp Gly Cys Met Arg Leu Ile Arg Glu 50 55 60 Val Thr Ala Glu Leu Pro Ile Tyr Leu Ala Asn Ser Leu Asn Pro Leu 65 70 75 80 Arg Leu Glu Gly Gln Lys Thr Ala Ala Ile Glu Ile Leu Gln Gln Phe 85 90 95 Asn Trp Gln Val Pro Asp Trp Val Ile Val Pro Gly Gly Asn Leu Gly 100 105 110 Asn Ile Tyr Ala Phe Tyr Lys Gly Phe Glu Met Cys Arg Val Leu Gly 115 120 125 Leu Val Asp Arg Val Pro Arg Leu Val Cys Ala Gln Ala Ala Asn Ala 130 135 140 Asn Pro Leu Tyr Arg Tyr Tyr Lys Ser Gly Trp Thr Glu Phe Glu Pro 145 150 155 160 Gln Thr Ala Glu Thr Thr Phe Ala Ser Ala Ile Gln Ile Gly Asp Pro 165 170 175 Val Ser Val Asp Arg Ala Val Val Ala Leu Lys Ala Thr Asp Gly Ile 180 185 190 Val Glu Glu Ala Thr Glu Glu Glu Leu Met Asp Ala Thr Ala Leu Ala 195 200 205 Asp Arg Thr Gly Met Phe Ala Cys Pro His Thr Gly Val Ala Leu Ala 210 215 220 Ala Leu Phe Lys Leu Gln Gly Gln Arg Ile Ile Gly Pro Asn Asp Arg 225 230 235 240 Thr Val Val Val Ser Thr Ala His Gly Leu Lys Phe Thr Gln Ser Lys 245 250 255 Ile Asp Tyr His Asp Lys Asn Ile Lys Asp Met Val Cys Gln Tyr Ala 260 265 270 Asn Pro Pro Ile Ser Val Lys Ala Asp Phe Gly Ser Val Met Asp Val 275 280 285 Leu Gln Lys Asn Leu Asn Gly Lys Ile 290 295
(2) INFORMATION FOR SEQ ID NO: 17: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 325 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: cDNA (vii) IMMEDIATE SOURCE: (B) CLONE: csl.pk0058.g5 (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17:
ATGGCTTGCA AGTACTCCAA CCCGCCTCTG AGCGTGAAGG CTGACTTTGG CGCCGTGATG 60 GATGTGCTGA AGAAGAGGCT CAAGGGCAAG CTCTGAGCGC CTGTGCCTGG CTAATGCAAT 120 CAACTGATTG GAATGCAGTG GTTTCGTCGG TATCGGGGGG TCTTTTAGGC TTCAGAAATT 180 CTGTCTGGGT TAGACTATTT GTTTGTGGAG TTTAGCAGGA GAATGGCTAT CTCTCCTGCA 240 AGACTGGCGC TCTTTCTTGT GCTACGAATG TGTTACCATG GATAATAAGT GTAGTCGCTG 300 TCGGATTGAA TAATCAAAAA AAAAR 325
(2) INFORMATION FOR SEQ ID NO: 18: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 31 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: not relevant (D) TOPOLOGY: linear (ii) MOLECULE TYPE: peptide (vii) IMMEDIATE SOURCE: (B) CLONE: csl.pk0058.g5 (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 18:
Met Ala Cys Lys Tyr Ser Asn Pro
Pro Val Ser Val Lys Ala Asp Phe 1 5 10 15 Gly Ala Val Met Asp Val Leu Lys Lys Arg Leu Lys Gly Lys Leu 20 25 30
(2) INFORMATION FOR SEQ ID NO: 19: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 528 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: cDNA (vii) IMMEDIATE SOURCE: (B) CLONE: rls72.pk0018.e7 (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 19:
ACACCCAACA CGCAGACTTG ACAGATTCTG CTACTACAAA TCCTGCATAT TTAACAGCGC 60 TGCAACTCGA CGATGGAGAA CGGTGCTGCA ACCAACGGGG CGTCGGAGAA GTCGCACTCT 120 CCTTCACAGA CCTACCTCTC CACAAGGGGA GACGATTATG GGCTCTCATT CGAGACCGTC 180 GTCCTCAAAG GTCTTGCGGC TGACGGGGGT CTTTTCCTGC CCGAGGAAGT GCCCGCGGCA 240 ACCGAGTGGC AAAGCTGGAA AGACCTGCCC TACACCGAGC TTGCCGTCAA GGTTCTCAGC 300 TTGTACATCT CCCCCGCCGA GGTGCCGACG GAAGACCTCA GGGCGCTCGT CGAGCGCAGC 360 TACTCGACCT TCCGATCCAA GGAGGTTGTG CCGCTGGTGA AGCTGGAGGA CAACCTTCAC 420 CTGCTGGAGC TATTCCACGG CCCCAACTAC TCGTTCAAGG ACTGCGCGCT GCAATTCCTT 480 GG AACCTCN TCGAGTACTT TTGACTCNCA AGAACAAGGG AAAGGAGG 528
(2) INFORMATION FOR SEQ ID NO: 20: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 143 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: not relevant (D) TOPOLOGY: linear (ii) MOLECULE TYPE: peptide (vii) IMMEDIATE SOURCE: (B) CLONE: rls72.pk0018.
e7 (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 20:
Met Glu Asn Gly Ala Ala Thr Asn Gly Ala Ser Glu Lys Ser His Ser 1 5 10 15 Pro Ser Gln Thr Tyr Leu Ser Thr Arg Gly Asp Asp Tyr Gly Leu Ser 20 25 30 Phe Glu Thr Val Val Leu Lys Gly Leu Ala Ala Asp Gly Gly Leu Phe 35 40 45 Leu Pro Glu Glu Val Pro Ala Ala Thr Glu Trp Gln Ser Trp Lys Asp 50 55 60 Leu Pro Tyr Thr Glu Leu Ala Val Lys Val Leu Ser Leu Tyr Ile Ser 65 70 75 80 Pro Ala Glu Val Pro Thr Glu Asp Leu Arg Ala Leu Val Glu Arg Ser 85 90 95 Tyr Ser Thr Phe Arg Ser Lys Glu Val Val Pro Leu Val Lys Leu Glu 100 105 110 Asp Asn Leu His Leu Leu Glu Leu Phe His Gly Pro Asn Tyr Ser Phe 115 120 125 Lys Asp Cys Ala Leu Gln Phe Leu Gly Asn Leu Xaa Glu Tyr Phe 130 135 140
(2) INFORMATION FOR SEQ ID NO: 21: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 571 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: cDNA (vii) IMMEDIATE SOURCE: (B) CLONE: sel.06a03 (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 21:
GGATGCAATG GTGCAGGCTG ATTCCACTGG AATGTTCATA TGTCCACACA CTGGGGTGGC 60 TCTGGCGGCG CTTATTAAGC TGAGGAATCG TGGGGTTATC GGTGCCGGTG AGAGGGTTGT 120 GGTGGTGAGC ACTGCACATG GATTGAAGTT TGCACAGAGC AAGATTGATT ATCATTCTGG 180 GCTCATTCCT GGAATGGGCC GCTATGCTAA CCCGCTGGTT TCGGTTAAGG CGGATTTTGG 240 ATCGGTCATG GATGTTCTCA AGGATTCTTG CACAACAAGT CCCCCGACTT TAACAAGTCT 300 TGACGTTGCC AAGTAAGTTT TAGTTCGGGG TTTTTTCTGA TTAAAGATGT TTTTAAACAT 360 GTTTGTGTNC ACTTTCGGTC GTTATTATGG ATTTGTAAGA TTGGGCCCAA GTATTCGAGG 420 GTTTGATTTC AAACAACATG CTTCTGGTGA
CGCAATGCAA ATTTCGGNGC ATAACATCAT 480 TGTCGAAGAT GGATCNCGAC CGATGAAACT GTGTGGCAAG TAATGAGAAG AAAATAGGGC 540 ACTTGTACAG AGATTTAAA GNTTAATTTC N 571 (2) INFORMATION FOR SEQ ID NO: 22: (i) CHARACTERISTICS OF THE SEQUENCE: (A) LENGTH: 104 amino acids (B) TYPE: amino acid (C) CHAIN: not relevant (D) TOPOLOGY: linear (ii) TYPE OF MOLECULE: peptide (vii) IMMEDIATE SOURCE: (B) CLON: sel.06a03 (xi) DESCRIPTION OF THE SEQUENCE: SEQ ID NO: 22: Asp Ala Met Val Gln Ala Asp Ser Thr Gly Met Phe He Cys Pro His 1 .. 5 - - 10-- ••• ': t .. -15.7 .---. Thr Gly Val Ala Leu Ala Ala Leu Xle Lys Leu Arg Asn Arg Gly Val 20 25 30 He Gly Wing Gly Glu Arg Val Val Val. Val Ser Thr Wing His Gly Leu 35 40 45 Lys Phe Wing Gln Ser Lys He Asp Tyr His Ser Gly Leu He Pro Gly 50 55 60 Met Gly Arg Tyr Wing Asn Pro Leu Val Ser Val Lys Wing Asp Phe Gly 65 70 75 80 Ser Val Met Asp Val Leu Lys Asp Ser Cys Thr Thr Ser Pro Pro Thr 85 90 95 Leu Thr Ser Leu Asp Val Ala Lys 100 (2) INFORMATION FOR SEQ ID NO: 23: (i) CHARACTERISTICS OF THE SEQUENCE: (A) LENGTH: 2191 base pairs (B) TYPE: nucleic acid (C) CHAIN: simple (D) TOPOLOGY : linear (ii) TYPE OF MOLECULE: cDNA (vii) IMMEDIATE SOURCE: (B) CLON: srl.pk0003.f6 (xi) DESCRIPTION OF THE SEQUENCE: SEQ ID NO: 23: GCTTCCTCTT CTCTGTTTCA GTCTCTCCCT TTCTCTCTCC AAACCTCTAA ACCCTACGCG 60 CCTCCCAAAC CCGCCGCCCA CTTCGTTGTC CGCGCCCAATENTOTCAC TCAGAACAAC 120 AACTCCTCCT CCAAGCATCG CCGCCCCGCC GACGAGAACA TCCGCGACGA GGCCCGCCGC 180 ATCAATGCGC CCCACGACCA CCACCTCTTC TCGGCCAAGT ACGTCCCCTT CAACGCCGAC 240 TCCTCCTCCT CCTCCTCCAC GGAGTCCTAC TCGCTCGACG AGATCGTCTA CCGCTCCCAA 300 TCCGGCGGCC TCCTGGACGT CCAGCACGAC ATGGATGCCC TCAAGCGTTT CGACGGCGAG 360 TACTGGCGCA ACCTCTTCGA CTCGCGCGTG GGCAAAACCA CCTGGCCTTA CGGCTCCGGC 420 GTCTGGAGCA AAAAAGAATG GGTCCTCCCC GAGATCCACG ACGACGATAT CGTCTCCGCC 480 TTCGAGGGTA ACTCCAACCT CTTCTGGGCC GAGCGTTTCG GCAAACAGTT CCTCGGCATG 540 AACGATTTGT GGGTCAAACA CTGCGGAATC AGCCACACGG GCAGCTTCAA GGATCTCGGC 600 ATGACCGTCC TCGTCAGCCA GGTCAATCGC TTGAGAAAAA TGAACCGCCC CGTCGTCGGT 660 
GTTGGTTGCG CCTCCACCGG TGACACATCG GCCGCTTTAT CCGCCTATTG CGCTTCCGCT 720 GCCATTCCTT CCATTGTGTT TTTGCCTGCT AATAAAATCT CTCTTGCCCA ACTTGTTCAG 780 CCTATTGCCA ATGGAGCCTT TGTGTTGAGT ATCGACACTG ATTTTGATGG TTGCATGCAG 840 TTGATCAGAG AAGTCACTGC TGAATTGCCT ATTTATTTGG CTAACTCTCT CAACAGTTTG 900 AAGTTGGAAG GGCAGAAAAC TGCTGCTATT GAGATTCTGC AGCAGTTTGA TTGGCAGGTT 960 CCTGATTGGG TCATTGTGCC TGGAAGCAAC CTTGGCAACA TTTATGCCTT TTACAAAGGG 1020 TTTAAGATGT TTCAAGAGCT TGGGCTTGTG GATAAGATTC CAAGGCTTGT TTGTGCTCAG 1080 GCTGCCAATG CTGATCCTTT GTATTTGTAC TTTAAATCCG GGTGGAAGGA GTTTAAGCCT 1140 GTGAAGTCGA GCACTACATT TGCTTCTGCC ATTCAAATTG GTGATCCTGT TTCCATTGAC 1200 AGGGCGGTTC ACGCGCTAAA GAGTTGCGAT GGGATTGTGG AGGAGGCCAC GGAGGAGGAG 1260 TTGATGGATG CTACAGCGCA GGCGGATTCT ACTGGGATGT TTATTTGCCC CCACACCGGG 1320 GTTGCTTTAA CTGCATTGTT TAAGCTCAGG AACAGCGGGG TTATTAAGGC CACTGATAGG 1380 ACTGTGGTGG TTAGCACTGC TCATGGCTTG AAGTTCACTC AGTCCAAGAT TGATTACCAT 1440 TCTAAGGACA TCAAGGACAT GGCTTGCCGC TATGCTAACC CGCCCATGCA AGTGAAGGCA 1500 GACTTTGGCT CGGTTATGGA TGTTTTGAAG ACGTATTTGC AGAGTAAGGC TCATTAGGTT 1560 AGCATTGCAA GTTTTGCTCC TCCTGAGTTT GCTCATTATT TACTTACTTT TAGGCACTAC 1620 TGCTGTATTG TCTTTTCTAT GAGCTAGGTT TGAGTGTTGT AATAATTTGC TTGCTGCATT 1680 ATGTATGCCG TCTAGTGTTC CATATTGGGC ATCATCCTTA GTATTTGTTG TAGATTTTCT 1740 TTGCTGAGCA TTTGATATAA TAGCTCAAGT AGGAAAATGA ATTGGGTACT ATGAGGAATG 1800 CATATCATTG GCTTGTTATT ACTGGATTCC AGACCACCCC AAAAGAAAAT AATTCCAAAA 1860 AATATAATTA GAACAAATTT CGTCCTTGTT ATGCTGTTGG CATTAAGCTC AGTGTGGGTA 1920 TTACCAAGCA ACTCGAAATC AAGAGAAAAA AAAATTGACA GCAAAGGAGC TGCATTGTTG 1980 GACTGAGTCA CATCACTTCA TTGCTATGTC GTCATATTTC GTTGAATTAC GGGAAGGCAG 2040 CATGCACAGC AATATGCAGC GATTAACTGA AGCCACACCG CACACATTGA AGTAGTAGTC 2100 AATTTAGACA CTCCATCTTG TACTTTCTAC AAAAATGAAT TTTTCTTAGC CATTAAGTAT 2160 AATATTTTAT TCTAAAAAAA AAAAAAAAAA TO 2191 (2) INFORMATION FOR SEQ ID NO: 24: () SEQUENCE CHARACTERISTICS: (A) LENGTH: 518 amino acids (B) TYPE: amino acid (C) CHAIN: not relevant (D) TOPOLOGY: linear (ii) TYPE OF MOLECULE : 
peptide (vii) IMMEDIATE SOURCE: (B) CLONE: srl.pk0003.f6 (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 24:
Ala Ser Ser Ser Leu Phe Gln Ser Leu Pro Phe Ser Leu Gln Thr Ser 1 5 10 15 Lys Pro Tyr Ala Pro Pro Lys Pro Ala Ala His Phe Val Val Arg Ala 20 25 30 Gln Ser Pro Leu Thr Gln Asn Asn Asn Ser Ser Ser Lys His Arg Arg 35 40 45
Pro Wing Asp Glu Asn He Arg Asp Glu Wing Arg Arg He Asn Wing Pro 50 55 60 His Asp His His Leu Phe Ser.Ala Lys Tyr Val Pro Phe Asn Wing Asp 65 70 75 80 Being Being Being Being Thr Glu Being Tyr Ssr Leu Asp Glu He Val 85 = 90 95 Tyr Arg Ser Gln Ser Gly Gly Leu Leu Asp Val Gln His Asp Met Asp 100 105 110 Wing Leu Lys Arg Phe Asp Gly Glu Tyr Trp Arg Asn Leu Phe Asp Ser 115 120 125 Arg Val Gly Lys Thr Thr Trp Pro Tyr Gly Ser Gly Val Trp Ser Lys 130 135 140 Lys Glu Trp Val Leu Pro Glu He His Asp Asp Asp He Val Ser Wing 145 150 155 '160 Phe Glu Gly Asn Ser Asn Leu Phe Trp Wing Glu Arg Phe Gly Lys Gln 165 170 175 Phe Leu Gly Met As Asp Leu Trp Val Lys His Cys Gly He Ser His 180 185 190 Thr Gly Ser Phe Lys Asp Leu Gly Met Thr Val Leu Val Ser Gln Val 195 200 • 205 Asn Arg Leu Arg Lys Met Asn Arg Pro Val Val Gly Val Gly Cys Wing 210 215 220 Ser Thr Gly Asp Thr Ser Wing Wing Leu Wing Wing Tyr Cys Wing Wing Wing 225 230 235 240 Ala He Pro Ser He Vai Phe Leu Pro Ala Asn Lys He Ser Leu Ala 245 250 255 Gln Leu Val Gln Pro He Wing Asn Gly Wing Phe Val Leu Ser He Asp 260 265 270 Thr Asp Phe Asp Gly Cys Met Gln Leu He Arg Glu Val Thr Wing Glu 275 280 285 Leu Pro He Tyr Leu Wing Asn Ser Leu Asn Ser Leu Lys Leu Glu Gly 290 295 300 Gln Lys Thr Wing Wing He Glu He Leu Gln Gln Phe Asp Trp Gln Val 305 310 315 320 Pro Asp Trp Val He Val Pro Gly Ser Asn Leu Gly Asn He Tyr Ala 325 330 -335 Phe Tyr Lys Gly Phe Lys Met Phe Gln Glu Leu Gly Leu Val Asp Lys 340 345 350 He Pro Arg Leu Val Cys Ala Gln Ala Ala Asn Ala Asp Pro Leu Tyr 355 360 365 Leu Tyr Phe Lys Ser Gly Trp Lys Glu Phe Lys Pro Val Lys Ser Ser 370 375 380 Thr Thr Phe Wing Being Wing He Gln He Gly Asp Pro Val Ser He Asp 385 390 395 400 Arg Wing Val His Wing Leu Lys Ser Cys Asp Gly He Val Glu Glu Wing 405 410 415 Thr Glu Glu Glu Leu Met Asp Wing Thr Wing Gln Wing Asp Be Thr Gly 420 425 430 Met Phe He Cys Pro His Thr Gly Val Ala Leu Thr Ala Leu Phe Lys 435 440 445 Leu Arg Asn Ser Gly Val He Lys Ala Thr Asp Arg Thr Val Val Val 450 455 460 
Ser Thr Ala His Gly Leu Lys Phe Thr Gln Ser Lys He Asp Tyr His 465 470 475 480 Ser Lys Asp He Lys Asp Met Wing Cys Arg Tyr Wing Asn Pro Pro Met 485 490. 495 Gln Val Lys Wing Asp Phe Gly Ser Val Met Asp Val Leu Lys Thr Tyr 500 505 510 Leu Gln Ser Lys Wing His 515 (2) INFORMATION FOR SEQ ID NO: 25: (i) CHARACTERISTICS OF THE SEQUENCE: (A) LENGTH: 643 base pairs (B) TYPE: nucleic acid (C) CHAIN: simple (D) TOPOLOGY: linear (ii) TYPE OF MOLECULE: cDNA (vii) IMMEDIATE SOURCE: (B) CLON: wrl.pk0085.h2 (xi) DESCRIPTION OF THE SEQUENCE: SEQ ID NO: 25: GCTCATCCAG CCCATCGCCA ACGGCGCCAC GGTGCTCTCG CTTGACACGG ATTTCGACGG 60 ATGCATGCGG CTÍATCAGGG "AGGTGACAGC TGAGCTGCCC ATATACCTCG CAAACTCACT 120 CAACTCGCTT CCGGCTGGAG GGGCAGAAGA CTGCAGCCAT CCGAGATATT GCAACANTCA 180 ATTGGCAGGT GCCCGGACTG GGTCACATCC CAAGGAGGCA ATCTGGGGGA ACATTTTATG 240 CTTTCCTACA AGGATTTNAA TTTCCGTGTC CTTNGCTAGT TGATTNCCTT CCNACTCCTT 300 GTTANTNCAA AGGCCGCCA ACGCAAACCC ACTGTACCCG TACTACAATC CTGGGGTGAC 360 TGATTTCCAT CCACTTGNTT GCCGGGACAA TTTNCATCCK GCAACAATTT GGGGATTCCA 420 TATCNATTAC CNTCGGTTTT TTCNCCCTNA AAGGACNNAT GATTNTCCNA GGAACTCCNN 480 AGGNGGATCA AGGATCCAAA GGCTTTCTAC TCACTGGAAN TTGCTTCCCA ANACGGGGTT 540 CACTNCCGCC CGTTAAACCC NTGACAAGTA TAATGGACAA CACNCCGGGG TNTATKACAA 600 CGGCAANTTN AAANCAAGTT NATCATTAGA ACNGGAANTT NCC "643 (2) INFORMATION FOR SEQ ID NO: 26: (i) CHARACTERISTICS OF THE SEQUENCE: (A) LENGTH: 84 amino acids (B) TYPE: amino acid (C) CHAIN: not relevant (D) TOPOLOGY: linear (ii) TYPE OF MOLECULE: peptide (vii) IMMEDIATE SOURCE: (B) CLON: wrl.pk0085.h2 (Xi) DESCRIPTION OF THE SEQUENCE: SEQ ID NO: 26: Leu He Gln Pro He Wing Asn Gly Wing Thr Val Leu Ser Leu Asp Thr 1 5 10 15 Asp Phe Asp Gly Cys Met Arg Leu He Arg Glu Val Thr Wing Glu Leu 20 25 30 Pro He Tyr Leu Wing Asn Ser Leu Asn Ser Leu Xaa Leu Glu Gly Gln 35 40 45 Lys Thr Wing Wing He Arg Asp He Wing Thr Xaa Asn Trp Gln Val Pro 50 55 60 Gly Leu Gly His He Pro Arg Arg Gln Ser Xaa Thr Phe Tyr Wing Phe 65 1 75 80 Leu Gln Gly Phe (2) INFORMATION FOR 
SEQ ID NO: 27: (i) CHARACTERISTICS OF THE SEQUENCE: (A) LENGTH: 525 amino acids (B) TYPE: amino acid (C) CHAIN: not relevant (D) TOPOLOGY: linear (ii) TYPE OF MOLECULE: peptide (ii) OROGINAL SOURCE: (A) ORGANISM: Arabidopsis thaliana (xi) DESCRIPTION OF THE SEQUENCE: SEQ ID NO: 27 Leu Be Ser Cys Leu Phe Asn Wing Be Val Be Ser Leu Asn Pro Lys 1 5 10 15 Gln Asp Pro He Arg Arg His Arg Ser Thr Ser Leu Leu Arg His Arg 20 25 30 Pro Val Val He Ser Cys Thr Wing Asp Gly Asn Asn He Lys Wing Pro 35 40 45 He Glu Thr Wing Val Lys Pro Pro His Arg Thr Glu Asp Asn lie Arg 50 55 60 Asp Glu Wing Arg Arg Asn Arg Ser Asn Wing Val Asn Pro Phe Ser Wing 65 70 75 80 Lys Tyr Val Pro Phe Asn Wing Wing Pro Gly Ser Thr Glu Ser Tyr Ser 85 90 95 Leu Asp Glu He Val Tyr Arg Ser Arg Ser Gly Gly Leu Leu Asp Val 100 105 110 Glu His Asp Met Glu Ala Leu Lys Arg Phe Asp Gly Wing Tyr Trp Arg 115 120 125 Asp Leu Phe Asp Ser Arg Val Gly Lys Ser Thr Trp Pro Tyr Gly Ser 130 135 140 Gly Val Trp Ser Lys Lys Glu Trp Val Leu Pro Glu He Asp Asp Asp 145 150 155 160 Asp He Val Ser Ala Phe Glu Gly Asn Ser Asn Leu Phe Trp Wing Glu 165 170 175 Arg Phe Gly Lys Gln Phe Leu Gly Met Asn Asp Leu Trp Val Lys His 180 185 190 Cys Gly He Ser HIS Thr Gly Ser Phe Lys Asp Leu Gly Met Thr Val 195 200 205 Leu Val Se Gln Val Asn Arg Leu Arg Lys Met Lys Arg Pro Val Val 210 215 220 Gly Val Gly Cys Ala Ser Thr Gly Asp Thr Ser Ala Ala Leu Ser Ala 225 230 235 240 Tyr Cys Wing Being Wing Gly He Pro Being He Val Phe Leu Pro Wing Asn 245 250 255 Lys He Ser Met Wing Gln Leu Val Gln Pro He Wing Asn Gly Wing Phe 260 265 270 Val Leu Ser He Asp Thr Asp Phe Asp Gly Cys Met Lys Leu He Arg 275 280 285 Glu He Thr Wing Glu Leu Pro He Tyr Leu Wing Asn Ser Leu Asn Ser 290 295 300 Leu Arg Leu Glu Gly Gln Lys Thr Ala Wing He Glu He Leu Gln Gin 305 310 315 320 Phe Asp Trp Gln Val Pro Asp Trp Val He Val Pro Gly Gly Asn Leu 325 330 335 Gly Asn lie Tyr Wing Phe Tyr Lys Gly Phe Lys Met Cys Gln Glu Leu 340 345 350 Gly Leu Val Asp Arg He Pro Arg Met Val Cys 
Wing Gln Wing Wing Asn 355 360 365 Wing Asn Pro Leu Tyr Leu His Tyr Lys Ser Gly Trp Lys Asp Phe Lys 370 375 380 Pro Met Thr Wing Being Thr Thr Phe Wing Being Wing He Gln He Gly Asp 385 390 395 400 Pro Val Ser He Asp Arg Ala Val Tyr Ala Leu Lys Lys Cys Asn Gly 405 410 415 He Val Glu Glu Ala Thr Glu Glu Glu Leu Met Asp Ala Met Wing Gln 420 425 430 Wing Asp Ser Tfax Gly Met Phe He Cys Pro His Thr Gly Val Ala Leu 435 440 445 Thr Ala Leu Phe Lys Leu Arg Asn Gln Gly Val He Wing Pro Thr Asp 450 455 460 Arg Thr Val Val Val Ser Thr Wing His Gly Leu Lys Phe Thr Gln Ser 465 470 - 475 - 480 Lys He Asp Tyr His Ser Asn Wing He Pro Asp Met Wing Cys Arg Phe 485 490 495 Ser Asn Pro Pro Val Asp Val Lys Wing Asp Phe Gly Wing Val Met Asp 500 505 510 Val Leu Lys Ser Tyr Leu Gly Ser Asn Thr Leu Thr Ser 515 • 520 525 (2) INFORMATION FOR SEQ ID NO: 28: (i) CHARACTERISTICS OF THE SEQUENCE: (A) LENGTH: 1478 base pairs (B) TYPE: nucleic acid (C) CHAIN: simple (D) TOPOLOGY: linear (ii) TYPE OF MOLECULE: cDNA (vii) IMMEDIATE SOURCE: (E) CLON: cenl.pk0064.f (xi) DESCRIPTION OF THE SEQUENCE: SEQ ID NO: 28 CAACAGTGGT CCTTGAGGGG GACTCATATG ATGAAGCTCA GTCATATGCA AAATTGCGTT 60 GCCAGCAGGA AGGCCGCACA TTTGTACCTC CTTTTGACCA TCCTGATGTC ATCACTGGAC 120 AAGGAACTAT CGGCATGGAA ATTGTTAGGC AGCTGCAAGG TCCACTGCAT GCAATATTTG 180 TACCTGTTGG AGGTGGTGGA TTAATTGCTG GAATTGCTGC CTATGTAAAA CGGGTTCGCC 240 CAGAGGTGAA AATAATTGGA GTGGAACCCT CAGATGCAAA TGCAATGGCA TTATCCTTGT 300 GTCATGGTAA GAGGGTCATG TTGGAGCATG TTGGTGGGTT TGCTGATGGT GTAGCTGTCA 360 AAGCTGTTGG GGAAGAAACA TTTCGCCTGT GCAGAGAGCT AGTAGATGGC ATTGTTATGG 420 TCAGTCGAGA TGCTATTTGT GCTTCAATAA AGGATATGTT TGAGGAGAAA AGAAGTATCC 80 TTGAACCTGC TGGTGCCCTT GCATTGGCTG GGGCTGAAGC CTACTGCAAA TACTATAACT 540 TGAAAGGAGA AACTGTGGTT GCAATAACTA GTGGGGCAAA TATGAACTTT GATCGACTTA 600 GACTAGTAAC CGAGCTAGCT GATGTTGGCC GAAAACGGGA AGCAGTGTTA GCTACATTTC 660 TGCCAGAGCG GCAGGGAAGC TTCAAAAAAT TCACAGAATT GGTTGGCAGG ATGAATATTA 720 CTGAATTCAA ATACAGATAC GATTCTAATG CAAAAGATGC CCTTGTTCTT TACAGTGTTG 780 
GCATCTACAC TGACAATGAG CTTGGAGCAA TGATGGATCG CATGGAATCT GCGAAACTGA 840 GGACTGTTAA CCTTACTGAC AATGATTTGG CAAAGGACCA CCTTAGATAC TTTATTGGAG 900 GAAGATCAGA AATAAAAGAT GAACTGGTTT ACCGGTTCAT TTTCCCGGAA AGGCCTGGGG 960 CCCTTATGAA ATTTTTGGAC ACGTTTAGTC CTCGTTGGAA CATCAGCCTT TTCCATTACC 1020 GTGCACAGGG TGAAGCTGGA GCAAATGTAT TAGTTGGTAT ACAAGTGCCG CCAGCAGAAT 1080 TTGATGAAT7 CAAGAGTCAT GCCAACAATC TTGGGTACGA GTACATGTCA GAGCACAACA 1140 ATGAGATATA CCGGTTGCTG TTGCGTGACC CAAAGGTCTA ATGTATATGC CTTTGCTCCC 1200 ATAATAAGTT GGTGACACTT TTCAAGGAAG ATTTTGCTCC AAGGTAGAAG TTGCGAGTTT 1260 CTTCAAGTTG AAATGAAGCC ATCACCAAAT GTAGCTTCGG TGTGCCATCT GTTTACTCAG 1320 TTAGATCATG TAGTGTATCA GTTGTGTATC TTTGTTGTTG TGCTTCGTGA TCTCAATTTA 1380 TTGCTTTGTG CACCTAGAGG TTGTCAAATA ATGATAACCG ATATGTTATC TAAATATCTA 1 0 ATAATGATTA TGTGATTGTG ATTAAAAAGG GGGGGCCC 1 78 (2) INFORMATION FOR SEQ ID NO: 29: (i) CHARACTERISTICS OF THE SEQUENCE: (A) LENGTH: 392 amino acids (B) TYPE: amino acid (C) CHAIN: not relevant (D) TOPOLOGY: linear (ii) TYPE OF MOLECULE: peptide (vii) IMMEDIATE SOURCE: (B) CLON: cenl .pk0064. f4 (xi) DESCRIPTION OF THE SEQUENCE: SEQ ID NO: 29: Thr Val Val Leu Glu Gly Asp Ser Tyr Asp Glu Wing Gln Ser Tyr Wing 1 5 10 15 Lys Leu Arg Cys Gln Gln Glu Gly Arg Thr Phe Val Pro- Pro Phe Asp 20 2 - > c5 30 His Pro Asp Val lie Thr Gly Gln Gly Thr He Gly Met Glu He Val 35 40 45 Arg Gin Leu Gln Gly Pro Leu His Wing He Phe Val Pro Val Gly Gly 50 5 «i 6n0 Gly Gly Leu He Wing Gly He Ala Wing Tyr Val Lys Arg Val Arg Pro 65 70 75 80 Glu Val Lys He He Gly Val Glu Pro Ser Asp Wing Asn Wing Met Wing 85 90 9 * Leu Ser Leu Cys His Gly Lys Arg Val Met Leu Glu Hi = Val Gly Gly 100 105 110 Phe Wing Asp Gly Val Wing Val Lys Wing Val Gly Glu Glu Thr Phe Arg 120 125 Leu Cys Arg Glu Leu Val Asp Gly He Val Val Val Ser Arg Asp Ala 130 1T3í5 1.4 * 0? 
Ile Cys Ala Ser Ile Lys Asp Met Phe Glu Glu Lys Arg Ser Ile Leu 145 150 155 160 Glu Pro Ala Gly Ala Leu Ala Leu Ala Gly Ala Glu Ala Tyr Cys Lys 165 170 175 Tyr Tyr Asn Leu Lys Gly Glu Thr Val Val Ala Ile Thr Ser Gly Ala 180 185 190

Asn Met Asn Phe Asp Arg Leu Arg Leu Val Thr Glu Leu Ala Asp Val 195 200 205 Gly Arg Lys Arg Glu Ala Val Leu Ala Thr Phe Leu Pro Glu Arg Gln 210 215 220

Gly Ser Phe Lys Lys Phe Thr Glu Leu Val Gly Arg Met Asn He Thr 225 230 235 240 Glu Phe Lys Tyr Arg Tyr Asp Ser Asn Ala Lys Asp Ala Leu Val Leu 245 250 255 Tyr Ser Val Gly He Tyr Thr Asp Asn Glu Leu Gly Wing Met Met Asp 260 265 270 Arg Met Glu Be Wing Lys Leu Arg Thr Val Asn Leu Thr Asp Asn Asp 275 280 285 Leu Wing Lys Asp His Leu Arg Tyr Phe He Gly Gly Arg Ser Glu He 290 295 300 Lys Asp Glu Leu Val Tyr Arg Phe He Phe Pro Glu Arg Pro Gly Ala 305 310 315 320 Leu Met Lye Phe Leu Asp Thr Phe Ser Pro Arg Trp Asn He Ser Leu 325 330 335 Phe His Tyr Arg Wing Gln Gly Glu Wing Gly Wing Asn Val Leu Val Gly 340 345 350 He Gln Val Pro Pro Wing Glu Phe Asp Glu Phe Lys Ser His Wing Asn 355 360 365 Asn Leu Gly Tyr Glu Tyr Met Ser Glu His Asn Asn Glu He Tyr Arg 370 375 380 Leu Leu Leu Arg Asp Pro Lys Val 385 390 (2) INFORMATION FOR SEQ ID NO: 30: (i) CHARACTERISTICS OF THE SEQUENCE: (A) LENGTH: 728 base pairs (B) TYPE: nucleic acid (C) CHAIN: simple (D) TOPOLOGY: linear (ii) TYPE OF MOLECULE: cDNA (vii) IMMEDIATE SOURCE: (B) CLON: sfll.pk0055.h7 (xi) DESCRIPTION OF THE SEQUENCE: SEQ ID NO: 30: AAAATATTGT AGCAATAACC AGTGGAGCAA ACATGAATTT TGATAAACTT CGGGTTGTAA 60 CTGAACTTGC TAATGTTGGT CGTAAACAAG AGGCTGTGCT GGCAACTGTT ATGGCAGAGG 120 AGCCTGGCAG TTTCAAACAA TTTTGTGAAT TGGTGGGGCA GATGAACATA ACAGAATTCA 180. 
AATACAGATA TAACTCAAAT GAGAAGGCAG TTGTCCTTTA CAGTGTTGGG GTTCACACAA 240 TCTCCGAACT AAGAGCAATG CAGGAGAGGA TGGAATCTTC TCAGCTCAAA ACTTACAATC 300 TCACAGAAAG TGACTTGGTG AAAGACCACT TGCGTTACTT GATGGGAGGC CGATCAAACG 360 TTCAGAATGA GGTCTTTGTC GTCTCACCTT TCCAAGAAAG ACTGGTGCTT TGATGAAATT 420 TTTGGACCCT TCAGTCCACG TTGGGATATT AGTTTATCCA TTACCGAGGG GAGGTGAAAC 480 TGGAGCAAAC TGCTAGTTGG NTACAGGTAC CAAAATGAGA TAGATGAGTC CATGATCGTG 540 CTAACAAACT GGATATGATT ATAAGTGGNA ATATGTGATG NCTCAGCTCA ATCNCGATGG 600 GGNTTAAGCA CTGCATATGG GNATTAGGGG NAGNTACANT TAAATTCACG GCCTCAAGNT 660 AAGCATANTN TAGGAACTAG CTTTACAGGG GGCTACNANT TAACCGNGTA TTTTTTTTGA GATGANNG 720 728 (2) INFORMATION FOR SEQ ID NO: 31: (i) CHARACTERISTICS OF THE SEQUENCE: (A) LENGTH: 152 amino acids (B) TYPE: amino acid (C) CHAIN: not relevant (D) TOPOLOGY: linear (ii) TYPE OF MOLECULE: peptide (vii) IMMEDIATE SOURCE: (B) CLON: sfll.pk0055.h7 (xi) DESCRIPTION OF THE SEQUENCE: SEQ ID NO: 31 Asn He Val Wing He Thr Ser Gly Wing Asn Met Asn Phe Asp Lys Leu 1 5 10 15 Arg Val Val Thr Glu Leu Ala Asn Val Gly Arg Lys Gln Glu Val Wing 20 25 30 Leu Wing Thr Val Met Wing Glu Glu Pro Gly Ser Phe Lys Gln Phe Cys 35 40 45 Glu Leu Val Gly Gln Met Asn He Thr Glu Phe Lys Tyr Arg Tyr Asn 50 55 60 Ser Asn Glu Lys Ala Val Val Leu Tyr Ser Val Gly Val HAS Thr He 65 70 75 80 Ser Glu Leu Arg Ala Met Gln Glu Arg Met Glu Be Ser Gln Leu Lys 85 90 95 Thr Tyr Asn Leu Thr Glu Be Asp Leu Val Lys Asp HAS Leu Arg Tyr 100 105 110 Leu Met Gly Gly Arg Ser Asn Val Gln Asn Glu Val Phe Val Val Ser 115 120 125 Pro Xaa Pro Arg Lys Thr Gly Ala Leu Met Lys Phe Leu Asp Xaa Phe 130 135 140 Ser Pro Arg Trp Asp He Ser Leu 145 150 (2) INFORMATION FOR SEQ ID NO: 32: (i) CHARACTERISTICS OF THE SEQUENCE: (A) LENGTH: 572 base pairs (B) TYPE: nucleic acid (C) CHAIN: simple (D) TOPOLOGY: linear (ii) TYPE OF MOLECULE: cDNA (vii) IMMEDIATE SOURCE: (B) CLON: sre. pk0044. 
f3 (xi) DESCRIPTION OF THE SEQUENCE: SEQ ID NO: 32: AAAGACCTGG TGCTTTGATG AAATTTTTGG ACCCCTTCAG TCCACGTTGG AATATCAGTT 60 TATTCCATTA CCGAGGGGAG GGTGAAACTG GAGCAAATGT GCTAGTTGGA ATACAGGTAC 120 CCAAAAGTGA GATGGATGAG TTCCACGATC GTGCCAACAA ACTTGGATAT GATTATAAAG 180 TGGTGAATAA TGATGATGAC -TTCCAGCTTC TAATGCACTG ATGATGGTTT TAGGCACTTG 240 CCATTATTGT GTATTTTAGT CAACAAGTTT GCCATATTTA ATATTTCCAC GGTCGTTTCT 300 AAAAGTTGGA TGGGGAAAAA AGGTGGAAAG GAAGTGGCCT TCAGACATGT CATTAGTTGA 360 TTAGAGGAAC AACTAGTTCT TTTTACCTAA TGCGGCGTCT TATTACATTT TTTATAATCT 420 GTAATTTATG TTTTTTTGTT GTTGTTAACA TTGGAATCTT -ATAATGTTGT TGCCTGGTCT 480 TTTGTGTCTG TAATATAAGT GTCTTCAAAA GGTTGTTTGC TAAATTTCAG CAGCCTAAAA 540 AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA AA '572 (2) INFORMATION FOR SEQ ID NO: 33: (i) CHARACTERISTICS OF THE SEQUENCE: (A) LENGTH: 72 amino acids (B) TYPE: amino acid (C) CHAIN: not relevant (D) TOPOLOGY: linear (ii) TYPE OF MOLECULE: peptide (vii) IMMEDIATE SOURCE: (B) CLON: sre.pk0044.f3 (xi) DESCRIPTION OF SEQUENCE: SEQ ID NO: 33: Arg Pro Gly Ala Leu Met Lys Phe Leu Asp Pro Phe Ser Pro Arg Trp 1 5 10 15 Asn He Ser Leu Phe His Tyr Arg Gly Glu Glu Glu Thr Gly Wing Asn 20 2S 30 Val Leu Val Gly He Gln Val Pro Lys Ser Glu Met Asp Glu Phe His .35 40 45 Asp Arg Wing Asn Lys Leu Gly Tyr Asp Tyr Lys Val Val Asn Asn Asp 50 55 60 Asp Asp Phe Gln Leu Met His 65 65 (2) INFORMATION FOR SEQ ID NO: 34: (i) CHARACTERISTICS OF THE SEQUENCE: (A) LENGTH: 507 amino acids (B) TYPE: amino acid (C) CHAIN: not relevant (D) TOPOLOGY: linear (ii) TYPE OF MOLECULE: peptide (vii) ORIGINAL SOURCE: (A) ORGANISM: Burkholderia capacia (xi) DESCRIPTION OF THE SEQUENCE: SEQ ID NO: 34; Met Wing His Asp Tyr Leu Lys Lys He Leu Thr Wing Arg Val Tyr 1 5 10 15 Asp Val Wing Phe Glu Thr Glu Leu Glu Pro Wing Arg Asn Leu Wing 20 25 30 Arg Leu Arg Asn Pro Val Tyr Leu Lys Arg Glu Asp Asn Gln Pro Val 35 40 45 Phe Ser Phe Lys Leu Arg Gly Wing Tyr Asn Lys Met Wing His He Pro 50 55 60 Wing Asp Wing Leu Wing Arg Gly Val He Thr Wing Being Wing Gly 
Asn His € 5 70 75 80 Wing Gln Gly Val Wing Phe Wing Wing Arg Met Gly Val Lys Wing Val 85 90 95 He Val Val Val Pro Val Thr Pro Gln Val Val Val Val Val W Val Val 100 Valve No Ala Wing Gly Gly Pro Gly Val Glu Val He Gln Wing Gly Val Ser Tyr 115 120 125 Ser Asp Wing Tyr Wing His Wing Leu Lys Val Gln Glu Glu Arg Gly Leu 130 135 i4th Thr Phe Val HAS Pro Phe Asp Asp Pro Tyr Val He Wing Gly Gln Glv 145 iso i ** 1, 60 Thr He Wing Met Glu He Leu Arg Gln HAS Gln Gly Pro He His Wing 165 170 175 He Phe Val Pro He Gly Gly Gly Gly Leu Wing Ala Gly Val Wing Wing 180 185 190 Tyr Val Lys Wing Val Arg Pro Glu He Lys Val He Gly Val Gln Ala 195 200 205 Glu Asp S er Cys Wing Met Wing Gln Ser Leu Gln Wing Gly Lys Arg Val 210 215 220 Glu Leu Wing Glu Val Gly Leu Phe Wing Asp Gly Thr Wing Val Lys Leu 225 230 235 240 Val Gly Glu Glu Thr Phe Arg Leu Cys Lys Glu Tyr Leu Asp Gly Val 245 250 255 Val Thr Val Asp Thr Asp Wing Leu Cys Wing Wing He Lys Asp Val Phe 260 265 270 Gln Asp Thr Arg Ser Val Leu Glu Pro Ser Gly Wing Leu Wing Val Wing 275 280 285 Gly Wing Lys Leu Tyr Wing Glu Arg Glu Gly He Glu Asn Gln Thr Leu 290 295 300 Val Wing Val Thr Ser Gly Wing Asn Met Asn Phe Asp Arg Met Arg Phe 305 310 3 5 ..- .. 
-320 Val Ala Glu Arg Ala Glu Val Gly Glu Ala Arg Glu Ala Val Phe Ala 325 330 335 Val Thr He Pro Glu Glu Arg Gly Ser Phe Lys Arg Phe Cys Ser Leu 340 345 350 Val Gly Asp Arg Asn Val Thr Glu Phe Asn Tyr Arg He Wing Asp Wing 355 360 365 Gln Ser Wing His He Phe Val Gly Val Gln He Arg Arg Arg Gly Glu 370 375 380 Be Ala Asp He Ala Ala Asn Phe Glu Ser His Gly Phe Lys Thr Ala 385 39C 395 400 Asp Leu Thr His Asp Glu Leu Ser Lys Glu His He Arg Tyr Met Val 405 410 415 Gly Gly Arg Ser Pro Leu Ala Leu Asp Glu Arg Leu Phe Arg Phe Glu 420 425 430 Phe Pro Glu Arg Pro Gly Ala Leu Met Lys Phe Leu Ser Ser Met Ala 435 440 445 Pro Asp Trp Asn He Ser Leu Phe His Tyr Arg Asn Gln Gly Wing Asp 450 455 460 Tyr Ser Ser He Leu Val Gly Leu Gln Val Pro Gln Wing Asp His Wing 465 470 475 480 Glu Phe Glu Arg Phe Leu Ala Ala Leu Gly Tyr Pro Tyr Val Glu Glu 485 490 495 Be Ala Asn Pro Ala. -r Ar? Leu Phe Leu Ser 500 505 (2) INFORMATION FOR SEQ ID NO: 35: (i) CHARACTERISTICS OF THE SEQUENCE: (A) LENGTH: 1582 base pairs (B) TYPE: nucleic acid (C) CHAIN: simple (D) TOPOLOGY: linear (ii) TYPE OF MOLECULE: cDNA (vii) IMMEDIATE SOURCE: (B) CLON: cc3.mn0002d2 (xii) DESCRIPTION OF THE SEQUENCE: SEQ ID NO: 35: ACGAGACGAG TCCCCTCCCC CCACCTCGCC TCACCCAACC GGAACGAACA AGTTACCATC 60 TCATCCCAAC CCCGCCTCGA CCGGATCTCG TCGGACTCGG ATCCGCCCGA CCACCCCGCG 120 CCGCCGCAGA TCAAAGAAGA TGGCAGCTCT CGACACCTTC CTCTTCACCT CGGAGTCTGT 180 GAACGAGGGA CACCCTGACA AGCTCTGCGA CCAGGTCTCA GATGCCGTTC TTGACGCTTG 240 CCTTGCTGAG GACCCTGACA GCAAGGTTGC TTGTGAGACC TGCACCAAGA CCAACATGGT 300 CATGGTCTTT GGTGAGATCA CCACCAAGGC CAATGTCGAC TACGAGAAGA TTGTCAGGGA 360 GACCTGCCGC AACATTGG7T TTGTGTCAAA CGATGTCGGG CTTGACGCTG ACCACTGCAA 420 GGTGCTCGTG AACATTGAGC AGCAGTCCCC TGATATTGCT CAGGGTGTGC ATGGCCACTT 480 CACCAAGCGC CCCGAGGAGA TTGGAGCTGG TGACCAGGGA CACATGTTCG GGTATGCGAC 540 CGATGAGACC CCTGAGTTGA TGCCCCTCAG CCATGTCCTT GCCACCAAGC TAGGTGCTCG 600 TCTCACCGAG GTCCGCAAGA ACGGAACCTG CCCCTGGCTC AGGCCTGATG GGAAGACCCA 660 GGTGACAGTC GAGTACCGCA ATGAGGGTGG TGCCATGGTC 
CCCATCCGTG TCCACACCGT 720 CCTCATCTCC ACCCAGCACG ACGAGACAGT GACCAATGAT GAGATCGCTG CTGACCTGAA 780 GGAGCATGTC ATCAAGCCTA TCATCCCTGA GCAGTACCTT GACGAGAAGA CCATCTTCCA 840 CCTTAACCCA TCCGGCCGCT TTGTCATTGG TGGACCTCAC GGCGATGCTG GCCTCACTGG 900 CCGCAAGATC ATCATTGACA CCTACGGTGG CTGGGGAGCC CATGGCGGTG GCGCTTTCTC 960 CGGCAAGGAC CCAACCAAGG TTGACCGCAG CGGAGCCTAT GTCGCGAGGC AGGCTGCCAA 1020 GAGCATCGTC GCCAGCGGCC TTGCTCGCCG CGCCATCGTC CAGGTGTCCT ACGCCATCGG 1080 CGTGCCCGAG CCTCTCTCCG TGTTTGTCGA CACGTACGGC ACCGGCGCGA TCCCCGACAA 1140 GGAGATCCTC AAGATTGTCA AGGAGAACTT CGATTTCAGG CCTGGCATGA TTATCATCAA 1200 CCTTGACCTC AAGAAAGGCG GCAACGGGCG CTACCTCAAG ACGGCAGCCT ACGGCCACTT 1260 CGGAAGGGAC GACCCTGACT TCACCTGGGA GGTGGTGAAG CCACTCAAGT CGGAGAAACC 1320 TTCTGCCTAA GGCGGCCTTT TTTTCAGTAA GAAGCTTTTG GTGGTCTGCT GTGCTTAATC 1380 ATGCTTTTAT ATGGCTTCTA CATGTTGTGG TTCTTTCTTG ATCTGCACCG CGCTTATCGT 14 0 TTGTGTTGTA CTGCCCTAAT AAGTGGTGCT TATGAGGACT GTTTCTGGTT TTGCTGCTTA 1500 TGTTGTAATG CTTTGAAACA ATGAAAGAAG CTACAGGCCA CAGCTATTTT GAGAAGTAAT 1560 GGAACCTCGT GCCGTTTTGA TT 1582 (2) INFORMATION FOR SEQ ID NO: 36: (i) CHARACTERISTICS OF THE SEQUENCE: (A) LENGTH: 396 amino acids (B) TYPE: amino acid (C) CHAIN: not relevant (D) TOPOLOGY: linear (ii) TYPE OF MOLECULE: peptide (vii) IMMEDIATE SOURCE: (B) CLON: CLON: cc3.mn0002.d2 (i) DESCRIPTION OF THE SEQUENCE: SEQ ID NO: 36: Met Ala Ala Leu Asp Thr Phe Leu Phe Thr Ser Glu Ser Val Asn Glu 1 5 10 15 Gly His Pro Asp Lys Leu Cys Asp Gln Val Ser Asp Ala Val Leu Asp 20 25 30 Wing Cys Leu Wing Glu Asp Pro Asp Ser Lys Val Ala Cys Glu Thr Cys 35 40 45 Thr Lys Thr Asn Met Val Val Met Val Phe Gly Glu He Thr Thx Lys Ala 50 55 60 Asn Val Asp Tyr Glu Lys He Val Arg Glu Thr Cys Arg Asn He Gly 65 70 75 80 Phe Val Ser Asn Asp Val Gly Leu Asp Wing Asp His Cys Lys Val Leu 85 90 95 Val Asn He Glu Gln Gln Ser Pro Asp He Wing Gln Gly Val His Gly 100 105 110 His Phe Thr Lys Arg Prc Glu Glu He Gly Wing Gly Asp Gln Gly His 115 120 125 Met Phe 'Gly Tyr Ala Thr Asp Glu Thr Pro Glu Leu Met Pro 
Leu Ser 130 135 140 Kis Val Leu Wing Thr Lys Leu Gly Wing Arg Leu Thr Glu Val Arg Lys 145 150 155 160 Asn Gly Thr Cys Pro Trp Leu Arg Pro Asp Gly Lys Thr Gln Val Thr 165 170 175 Val Glu Tyr Arg Asn Glu Gly Gly Wing Met Val Pro He Arg Val His 180 185 190 Thr Val Leu He Ser Thr Gln His Asp Glu Thr Val Thr Asn Asp Glu 1S5 200 205 He Al2 Wing Asp Leu Lys Giu His Val He Lys Pro He He Pro Glu 210 215 220 Gin Tyr Leu Asp Glu Lys Thr lie Phe His Leu Asn Pro Ser Gly Arg 225 230 235 240 Phe Val He Gly Glv Pro His Gly Asp Wing Gly Leu Thr Gly Arg Lys 245 250- 255 He He lie Aso Thr Tyr Gly Gly Trp Gly Wing Gly Gly Wing 260 265 270 Phe Ser Gly Lys Asp Pro Thr Lys Val Asp Arg Ser Gly Wing Tyr Val 275 280 285 Wing Arg Gln Wing Wing Lys Ser He Val Wing Wing Gly Leu Ala Arg Arg 290 255 300 Ala He Val Gln Val Ser Tyr Ala He Gly Val Pro Glu Pro Leu Ser 305 310 315 320 Val Phe Val Asp Thr Tyr Gly Thr Gly Ala He Pro Asp Lys Glu He 325 330 335 Leu Lys He Val Lys Glu Asn Phe A = p Phe Arg Pro Gly Met He He 340 345 350 He Asn Leu Asp Leu Lys Lys Gly Gly Asn Gly Arg Tyr Leu Lys Thr 355 360 365 Wing Wing Tyr Gly His Phe Gly Arg Asp Asp Pro Asp Phe Thr Trp Glu 370 375 380 Val Val Lys Pro Leu Lys Ser Glu Lys Pro Ser Wing 385 390 395 • (2 INFORMATION FOR: SEQ ID NO: 37: (i) CHARACTERISTICS OF THE SEQUENCE: (A) LENGTH: 2183 base pairs (B) TYPE: nucleic acid (C) CHAIN: simple (D) TOPOLOGY: linear (ii) TYPE OF MOLECULE: cDNA (vii) ORIGINAL SOURCE: (A) ORGANISM: Oryza sativa (xi) DESCRIPTION OF THE SEQUENCE: SEQ ID NO: 37:AAATGAACGG AAAATGGAAA AAAAAATTGA TTGGTGCCAC TTCAAAGTTA 60 AATATGCCAA GACGAATTGA TATGTTTCTG CTGTTGTTTT ATGCTCTTGA TTAGTTGATG 120 CGCATGTTCA ATGATTTATG ATGTTTGTCT TTGTGGAAAG ATTACATGTA AAGAGTATAG_180_TAGAACCCCT AAAAGCTAGC CAGCGATTTC GCTCTTTTTT TCCAGGTCTC CATGATATGT 240 TTACCCCTAA AAGTGGTATA TTTATGTGAT AGTTACAATA CATAGTGGAC CACGATTGAT 300 TATGCGTTTA TGCTGATTCC GGCAGAAAAT TGTTAGATTC CTTGTGCTCT ATACCTGCTT 360 GTTGCGCTTG TAGAGAATAT TACAAATACC TAACACTTGC CCAAGGAACT TAGGAACTTA 420 
GTCAACTCTT TGTAGGGACA ACTATTTTAG CCCAAAATTG TGGTCTTGTC AGGTGCCAAC 480 AAAACAGCAT CTTGGCGTAC ATAAGCTATA TAGAGGATTA AAAGGAATGT TTTGTTCCTT 540 GCTACTGTTT TTTTAACCTG TTTACTCAGG ACAAATTTTG TTGCATAAAC CATTTGTTCT 600 AGGGATCAGT ATTGTCCTCT CAGTGTGTTA TGTAAGCATT TCCAGAAATC AATTGTCGCT 660 ATCAGCTTCC CTCACATTAG CTATCACTTA TACCCCTTTT TTTCTCATAG GCTCACCATG 720 TCCATTTTAT TCATGATATT TCTTTGTCTA AAGTATGTGA AATACCATTT TATGCAGATA 780 GGAGAAGATG GCCGCACTTG ATACCTTCCT CTTTACCTCG GAGTCTGTGA ACGAGGGCCA 840 CCCTGACAAG CTCTGCGACC AAGTCTCAGA TGCTGTGCTT GATGCCTGCC TCGCCGAGGA 900 CCCTGACAGC AAGGTCGCTT GTGAGACCTG CACCAAGACA AACATGGTCA TGGTCTTTGG 960 TGAGATCACC ACCAAGGCTA ACGTTGACTA TGAGAAGATT GTCAGGGAGA CATGCCGTAA 1020 CATCGGTTTT GTGTCAGCTG ATGTCGGTCT CGATGCTGAC CACTGCAAGG TGCTTGTGAA 1080 CATCGAGCAG CAGTCCCCTG ACATTGCACA GGGTGTGCAC GGGCACTTCA CCAAGCGCCC 1140 TGAGGAGATT GGTGCTGGTG ACCAGGGACA CATGTTTGGA TATGCAACTG ATGAGACCCC 1200 TGAGTTGATG CCCCTCAGCC ATGTCCTTGC TACCAAGCTT GGCGCTCGTC TTACGGAGGT 1260 TCGCAAGAAT GGGACCTGCG CATGGCTCAG GCCTGACGGG AAGACCCAAG TGACTGTTGA 1320 GTACCGCAAT GAGAGCGGTG CCAGGGTCCC TGTCCGTGTC CACACCGTCC TCATCTCTAC 1380 CCAGCATGAT GAGACAGTCA CCAACGATGA GATTGCTGCT GACCTGAAGG AGCATGTCAT 1440 CAAGCCTGTC ATTCCCGAGC AGTACCTTGA TGAGAAGACA ATCTTCCATC TTAACCCATC 1500 TGGTCGCTTC GTCATTGGCG GACCTCATGG TGATGCTGGT CTCACTGGCC GGAAGATCAT 1560 CATTGACACT TATGGTGGCT GGGGAGCTCA CGGTGGTGGT GCCTTCTCTG GCAAGGACCC 1620 - • - • '' •: '- AACCAAGGTT GACCGCAGTG GAGCATACGT CGCAAGGCAA GCTGCCAAGA GCATTGTTGC' 1680 TAGTGGCCTT GCTCGCCGCT GCATTGTCCA AGTATCATAC GCCATCGGTG TCCCAGAGCC 1740 ACTGTCCGTA TTCGTCGACA CATACGGCAC TGGCAGGATC CCTGACAAGG AGATCCTCAA 1800 GATTGTGAAG GAGAACTTCG ACTTCAGGCC TGGCATGATC ATCATCAACC TTGACCTCAA 1860 GAAAGGCGGC AACGGACGCT ACCTCAAGAC GGCGGCTTAC GGTCACTTCG GAAGGGACGA 1920 CCCAGACTTC ACCTGGGAGG -TGGTGAAGCC CCTCAAGTGG GAGAAGCCTT CTGCCTAAAA 1980 GCTCCCTTTC GGAGGCTTTT GCTCTGTCCC ATTATGGTGT TTTGTTTCCT CGCTGCTCAG 2040 CATTGTGATT CTTAACCTGC CCCCCGCTGC CATTTATGCC CATGCACGCT ACTTTCCTAA 2100 
TAATAAGTAC TTATAAGGGT ATTGTGTTTG AATATTTTAC CTAGAGGAGG AGGAGGATTT 2160 GTTATCTGTT ATTGCTTAAG CTT 2183 (2) INFORMATION FOR SEQ ID NO: 38: (i) CHARACTERISTICS OF THE SEQUENCE: (A) LENGTH: 1484 base pairs (B) TYPE: nucleic acid (C) CHAIN: simple (D) TOPOLOGY: linear (ii) TYPE OF MOLECULE: cDNA (vii) IMMEDIATE SOURCE: (B) CLON: s2.12b06 (xi) DESCRIPTION OF THE SEQUENCE: SEQ ID NO: 38; AGCGAAGCCC CACTCAACCA CCACACCACT CTCTCTGCTC TTCTTCTACC TTTCAAGTTT 60 TTAAAGTATT AAGATGGCAG AGACATTCCT ATTTACCTCA GAGTCAGTGA ACGAGGGACA 120 CCCTGACAAG CTCTGCGACC AAATCTCCGA TGCTGTCCTC GACGCTTGCC TTGAACAGGA 180 CCCAGACAGC AAGGTTGCCT GCGAAACATG CACCAAGACC AACTTGGTCA TGGTCTTCGG 240 AGAGATCACC ACCAAGGCCA ACGTTGACTA CGAGAAGATC GTGCGTGACA CCTGCAGGAA 30C CATCGGCTTC GTCTCAAACG ATGTGGGACT TGATGCTGAC AACTGCAAGG TCCTTGTAAA 360 CATTGAGCAG CAGAGCCCTG ATATTGCCCA GGGTGTGCAC GGCCACCTTA CCAAAAGACC 420 CGAGGAAATC GGTGCTGGAG ACCAGGGTCA CATGTTTGGC TATGCCACGG ACGAAACCCC 480 AGAATTGATG CCATTGAGTC ATGTTCTTGC AÁCTAAACTC GGTGCTCGTC TCACCGAGGT 540 TCGCAAGAAC GGAACCTGCC CATGGTTGAG GCCTGATGGG AAAACCCAAG TGACTGTTGA 600 GTATTACAAT GACAACGGTG CCATGGTTCC AGTTCGTGTC CACACTGTGC TTATCTCCAC 660 CCAACATGAT GAGACTGTGA CCAACGACGA AATTGCAGCT GACCTCAAGG AGCATGTGAT 720 CAAGCCGGTG ATCCCGGAGA AGTACCTTGA TGAGAAGACC ATTTTCCACT TGAACCCCTC 780 TGGCCGTTTT GTCATTGGAG GTCCTCACGG TGATGCTGGT CTCACCGGCC GCAAGATCAT 840 CATCGATACT TACGGAGGAT GGGGTGCTCA TGGTGGTGGT GCTTTCTCCG GGAAGGATCC 900 CACCAAGGTT GATAGGAGTG GTGCTTACAT TGTGAGACAG GCTGCTAAGA GCATTGTGGC 960 AAGTGGACTA GCCAGAAGGT GCATTGTGCA AGTGTCTTAT GCCATTGGTG TGCCCGAGCC 1020 TTTGTCTGTC TTTGTTGACA CCTATGGCAC CGGGAAGATC CATGATAAGG AGATTCTCAA 1080 CATTGTGAAG GAGAACTTTG ATTTCAGGCC CGGTATGATC TCCATCAACC TTGATCTCAA 1140 GAGGGGTGGG AATAACAGGT TCTTGAAGAC TGCTGCATAT GGACACTTCG GCAGAGAGGA 1200 CCCTGACTTC ACATGGGAAG TGGTCAAGCC CCTCAAGTGG GAGAAGGCCT AAGGCCATTC 1260 ATTCCACTGC AATGTGCTGG GAGTTTTTTA GCGTTGCCCT TATAATGTCT ATTATCCATA 1320 ACTTTCCACG TCCCTTGCTC TGTGTTTTTC TCTCGTCGTC CTCCTCCTAT TTTGTTTCTC 1380 
CTGCCTTTCA TTTGTAATTT TTTACATGAT CAACTAAAAA ATGTACTCTC TGTTTTCCGA 1440 CCATTGTGTC TCTTAATATC AGTATCAAAA AGAATGTTCC AAGTT 1485 (2) INFORMATION FOR SEQ ID NO: 39: (i) CHARACTERISTICS OF THE SEQUENCE: (A) LENGTH: 392 amino acids (B) TYPE: amino acid (C) CHAIN: not relevant (D) TOPOLOGY: linear (ii) TYPE OF MOLECULE: peptide (vii) IMMEDIATE SOURCE: (B) CLON: CLON: s2.12b06 (xi) DESCRIPTION OF SEQUENCE: SEQ ID NO: 39; Met Ala Glu Thr Phe Leu Phe Thr Ser Glu Ser Val Asn Glu Gly His 1 5 10 15 Pro Asp Lys Leu Cys Asp Gln He Ser Asp Ala Val Leu Asp Ala Cys 20 25 30. Leu Glu Gln Aso Pro Asp Ser Lys Val Wing Cys Glu Thr Cys Thr Lys 35"40 45 Thr Asn Leu Val Met Val Phe Glv Glu He Thr Thr Lys Wing Asn Val 50 55" 60 Asp Tyr Glu Lys He Val Arg Asp Thr Cys Arg Asn He Gly Phe Val 65 70 75 80 Ser Asn Asp Val Gly Leu Asp Wing Asp Asn Cys Lys Val Leu Val Asn 85 90 95 He Glu Gln Gln Ser Pro Asp He Wing Gln Gly Val His Gly His Leu 100 105 110 Thr Lys Arg Pro Glu Glu He Gly Wing Gly Asp Gln Gly His Met Phe 115 120 125 Gly Tyr Ala Thr Asp Glu Thr Pro Glu Leu Met Pro Leu Ser His Val 130 135. ... 140 ... ...

Leu Ala Thr Lys Leu Gly Ala Arg Leu Thr Glu Val Arg Lys Asn Gly 145 150 155 160 Thr Cys Pro Trp Leu Arg Pro Asp Gly Lys Thr Gln Val Thr "Val Glu -165 - .170 175 Tyr Tyr Asn Asp Asn Gly Wing Met Val Pro Val Arg Val His Thr Val 180 185 190 Leu He Ser Thr Gln His ASD Glu Thr Val Thr Asn Asp Glu He Ala 195 * 200 205 Wing Asp Leu Lys Glu His Val He Lys Pro Val He Pro Glu Lys Tyr 210 215 220 Leu ASD Glu Lvs Thr He Phe His Leu Asn Pro Ser Gly Arg Phe Val 225"* 230 235 240 He Gly Giy Pro Kis Gly Asp Wing Gly Leu Thr Gly Arg Lys He He 245 250 • 255 He Asp Thr Tyr Gly Gly or Gly Wing Gly Gly Ala Ghe Wing Phe Ser 260 265 270 Gly Lys ASD Pro Thr Lys Val ASD Arg Ser Gly Ala Tyr He Val Arg 275 280 285 Gln Ala Ala Lvs Ser He Val Ala Ser Gly Leu Ala Arg Arg Cys He 29C 295 300 Val Gln Val Ser Tyr Ala I;. Glv Val Pro Glu Pro Leu Ser Val Phe 305 310 315 320 Val Asp Thr Tyr Gly Thr Gly Lys lie His Asp Lys Glu He Leu Asn 325 330 335 He Val Lys Glu Asn Phe ASD Phe Arg Pro Gly Met He Ser He As As 340 345 350 Leu ASD Leu Ly = Ara Giv Glv Asn Asn Arg Phe Leu Lys Thr Ala Wing 35 = "" '360 365 Tvr Giy HAS Phe Gly Arg Glu Asp Pro Aso Phe Thr Trp Glu Val Val 370 375 380 Lys Pro Leu Lys Tro Glu Lys Wing 385 '390 (2) INFORMATION FOR SEQ ID NO: 40: (i) CHARACTERISTICS OF THE SEQUENCE: (A) LENGTH: 1479 base pairs (B) TYPE: nucleic acid (C) CHAIN: simple (D) TOPOLOGY: linear (ii) TYPE OF MOLECULE: cDNA (vii) ORIGINAL SOURCE: (A) ORGANISM: Lycopersicon esculentum (xi) DESCRIPTION OF THE SEQUENCE: SEQ ID NO: 40: GAATTCCTAC AAAGAGGTTA TTTCTCTCAA GGGGTAAAAA GATTGCCCCT TTTCGACATT 60 TATAATCCTC TTTTTCTCTT TGTTCGCCGT TGGGTTCTTC ACTTTCCTGT TTCTTGAGAA 120 TGGAAACTTT CTTATTCACC TCCGAGTCTG TGAACGAGGG TCACCCAGAC AAGCTCTGTG 180 ATCAGATCTC TGATGCAGTT CTTGATGCCT GCCTTGAGCA AGATCCCGAG AGCAAAGTTG 240 CATGTGAAAC TTGCACCAAG ACCAACTTGG TCATGGTCTT TGGTGAGATC ACAACCAAGG 300 CTATTGTAGA CTATGAGAAG ATTGTGCGTG ACACATGCCG TAATATTGGA TTTGTTTCTG 360 ATGATGTTGG TCTTGATGCT GACAACTGCA AGGTCCTTGT TTACATTGAG CAGCAAAGTC 420 
CTGATATTGC TCAAGGTGTC CACGGCCATC TGACCAAACG CCCCGAGGAG ATTGGTGCTG 480 GTGACCAGGG CCACATGTTT GGCTATGCAA CAGATGAGAC CCCTGAATTA ATGCCTCTCA 5 0 GTCACGTGCT TGCAACTAAA CTTGGTGCCC GTCTTACAGA AGTCCGCAAG AATGGCACCT 600 GCGCCTGGTT GAGGCCTGAT GGCAAGACCC AAGTTACTGT TGAGTATAGC AATGACAATG 660 GTGCCATGGT TCCAATTAGG GTACACACTG TTCTTATCTC CACCCAACAC GATGAGACCG 720 TTACCAATGA TGAGATTGCC CGCGACCTTA AG6AGCATGT CATCAAACCA GTCATCCCAG 780 AGAAGTACCT TGATGAGAAT ACTATTTTCC ACCTTAACCC ATCTGGCCGA TTCGTTATTG 840 GTGGACCTCA TGGTGATGCT GGTCTCACTG GTCGTAAAAT CATCATCGAC ACTTATGGTG 900 GTTGGGGTGC TCATGGTGGT GGTGCTTTCT CGGGCAAAGA CCCAACCAAG GTCGACAGGA 960 GTGGTGCATA CATTGTAAGG CAGGCTGCAA AGAGTATCGT AGCTAGTGGA CTTGCTCGTA 1020 GATGCATCGT GCAGGTATCT TATGCCATCG GTGTGCCTGA GCCATTGTCT GTATTCGTTG 1080 ACACCTATGG CACTGGAAAG ATCCCTGACA GGGAAATTTT GAAGATCGTT AAGGAGAACT 1140 TTGACTTCAG ACCTGGAATG ATGTCCATTA ACTTGGATTT GAAGAGGGGT GGCAATAGAA 1200 GATTCTTGAA AACTGCTGCC TATGGTCACT TTGGACGTGA TGACCCCGAT TCACATGGG 1260 AAGTTGTCAA GCCCCTCAAG TGGGAAAAGC CCCAAGACTA ATAAGTGCTT GCCTATGTTT 1320 TTGTTCTTTG TTGTTTGCTT GTGGCTTTAG AATCTCCCCC GTGTTTGCTT GTTTGTCTTT 1380 GTATTTTCTC TTTTGACCCT TTATTTTGTT ATTGTCCTGT TTCCATTGTG TTGGATGGAT 1440 ATCTTAGGCC TTGGAATATT AAGGAAAGAA AAGGAATTC 1479 (2) INFORMATION FOR SEQ ID NO: 1: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 1380 base pairs (B) TYPE: nucleic acid (C) CHAIN: simple (D) TOPOLOGY: linear (ü) TYPE OF MOLECULE: cDNA (xi) DESCRIPTION OF THE SEQUENCE: SEQ ID NO: 41 CCCTCCCTTC GGTTCATCGG CCTCCCGATC GAGCAGTAGA AGCAGCGCAA GGGCATCGCT 60 AGCACTAAAG AAATGGCAGC CGAGACGTTC CTCTTCACGT CCGAGTCTGT GAACGAGGGC 120 CATCCCGACA AGCTCTGTGA CCAAGTCTCC GACGCCGTCT TGGATGCCTG CTTGGCCCAG 180 GATGCCGACA GCAAGGTCGC CTGCGAGACC GTCACCAAGA CCAACATGGT CATGGTCTTG 240 GGCGAGATCA CCACCAAGGC CACCGTCGAC TATGAGAAGA TCGTG GTGA CACCTGCCGC 300 AACATCGGTT TCATCTCTGA TGACGTTGGT CTCGACGCCG ACCGTTGCAA RGTGCTCGTC 360 AACATCGAGC AGCAGTCCCC TGACATTGCC CAGGGTGTTC ATGGACACTT CACCAAGCGT 420 CCCGAAGAAG TCGGCGCCGG TGACCAGGGC 
ATCATGTTCG GCTATGCCAC CGATGAGACC 480 CCTGAGCTGA TGCCCCTCAA GCACGTGCTT GCCACCAAGC TYGGAGCTCG CCTCACSGAG 540 GTCCGCAAGA ATGGCACCTG CGCCTGGGTC AGGCCTGACG GAAAGACCCA -GGTCACAGTC 600 GAGTACCTAA ACGAGGATGG TGCCATGGTA CCTGTTCGTG TGCACACCGT CCTCATCTCC 660 ACCCAGCACG ACGAGACCGT CACCAACGAC GAGATTGCTG CGGACCTCAA GGAGCATGTC 720 ATCAAGCCGG TGATCCCCGC AAAGTACCTC GATGAGAACA CCATCTTCCA CCTGAACCCG 780 TCTGGCCGCT TCGTCATCGG CGGCCCCCAC GGTGACGCCG GTCTCACCGG CCGCAAGATC 840 ATCATCGACA CCTATGGTG G CTGGGGAGCC CACGGCGGCG GTGCCTTCTC TGGCAAGGAC 900 CCAACCAAGG TCGACCGYAG TGGCGCCTAC ATTGCCAGGC ARGCCGCCAA GAGCATCATC 960 GCCAGCGGCC TCGCACGCCG CTGCATTGTG CAGATCTCAT ACGCCATCGG TGTGCCTGAG 1020 CCTTTGTCTG TGTTCGTCGA CTCCTACGGC ACCGGCAAGA TCCCCGACAG GGAGATCCTC 1080 AAGCTCGTGA AGGAGAACTT TGACTTCAGG CCCGGGATGA TCAGCATCAA CCTGGACTTG 1140 AAGAAAGGTG GAAACAGGTT CATCAAGACC GCTGCTTACG GTCACTTTGG CCGTGATGAT 1200 GCCGACTTCA CCTGGGAGGT GGTGAAGCCC CTCAAGTTCG ACAAGGCATC TGCCTAAGAG 1260 CATGGCATTC TCTTGGTCTG CCGCCTCTCA AGTTCGTCAA GACGGGATCA TGTTGCTCCT 1320 GGGAAGTGGG AAGAAGCATT AGACATTGAA GCGACGCTCT ACACTGGTCT TGTTGTATGG 1380 (2) INFORMATION FOR SEQ ID NO: 42: (i) CHARACTERISTICS OF THE SEQUENCE: (A) LENGTH: 394 amino acids (B) TYPE: amino acid (C) CHAIN: not relevant (D) TOPOLOGY: linear (ii) TYPE OF MOLECULE: peptide (xi) DESCRIPTION OF SEQUENCE: SEQ ID NO: 42; Met Ala Ala Glu Thr Phe Leu Phe Thr Ser Glu Ser Val Asn Glu Gly 5 10 15 HAS Pro Asp Lys Leu Cys Asp Gln Val Ser Asp Ala Val Leu Asp Ala 20 25 30 Cys Leu Ala Gln Asp Ala Asp Ser Lys Val Ala Cys Glu Thr Val Thr 35 40 45 Lys Thr Asn Met Val Met Val Leu Gly Glu He Thr Thr Lys Wing Thr 50 55 60 Val Asp Tyr Glu Lys He Val Arg Asp Thr Cys Arg Asn He Gly Phe 65 70 75 80 He Ser Asp Asp Val Gly Leu Asp Wing Asp Arg Cys Lys Val Leu Val 85 90 95 Asn He Glu Gln Gln Ser Pro Asp He Wing Gln Gly Val HAS Gly His 100 105 110 Phe Thr Lys Arg Pro Glu Glu Val Gly Wing Gly Asp Gln Gly He Met 115 120 125 Phe Gly Tyr Ala Thr Asp Glu Thr Pro Glu Leu Met Pro Leu Lys His 130 135 
140 Val Leu Wing Thr Lys Leu Gly Wing Arg Leu Thr Glu Val Arg Lys Asn 145 150 155 160 Gly Thr Cys Wing Trp Val Arg Pro Asp Gly Lys Thr Gln Val Thr Val 165 170 175 Glu Tyr Leu Asn Glu Asp Gly Wing Met Val Pro Val Arg Val His Thr 180 185 190 Val Leu He Ser Thr Gln His Asp Glu Thr Val Thr Asn Asp Glu He 195 200 205 Wing Wing Asp -Leu Lys Glu His Val He Lys Pro Val He Pro Wing Lys 210 215 220 Tyr Leu Asp Glu Asn Thr He Phe His Leu Asn Pro Ser Gly Arg Phe 225 230 235 240 Val He Gly Gly Pro HAS Gly Asp Wing Gly Leu Thr Gly Arg Lys He 245 250 255 He He As Asp Thr Tyr Gly Gly Trp Gly Wing His Gly Gly Gly Wing Phe 260 265 270 Ser Gly Lvs Asp Pro Thr Lys Val Asp Arg Ser Gly Wing Tyr He Wing 275 280 285 Arg Gln Wing Wing Ly = Ser He He Wing Ser Gly Leu Ala Arg Arg Cys 290 295 300 He Val Gln He Ser Tyr Ala He Gly Val Pro Glu Pro Leu Ser Val 305 310 315 320 he Val Asp Ser Tyr Gly Thr Gly Lys He Pro Asp Arg Glu He Leu 325 330 335 Lys Leu Val Lys Glu Asn Phe Asp Phe Arg Pro Gly Met He Ser He 340 345 350 Asn Leu Asp Leu Lys Lys Gly Gly Asn Arg Phe He Lys Thr Wing Wing 355 360 365 Tyr Gly His Phe Gly Arg Asp Asp Wing Asp Phe Thr Trp Glu Val Val 370 375 380 Ly = Pro Leu Lys Phe Asp Lys Wing Ser Wing 385 390 (2) INFORMATION FOR SEQ ID NO: 43: (i) CHARACTERISTICS OF THE SEQUENCE: (A) LENGTH: 1353 base pairs (B) TYPE: nucleic acid (C) CHAIN: simple (D) TOPOLOGY: linear (ií) TYPE OF MOLECULE: cDNA (vii) ORIGINAL SOURCE: (A) ORGANISM: Hordeum vulgare (xi) DESCRIPTION OF THE SEQUENCE: SEQ ID NO: 43: GAATTCCGGA TAGCATCAGC ACAACTGCAC GAGAGCATCT CTACCACCAA AGAAATGGCG 60 GCCGAGACGT TCCTCTTCAC GTCCGAGTCC GTGAACGAGG GCCATCCCGA CAAGCTGTGC 120 GACCAGGTCT CTGACGCCGT CTTGGACGCC TGCTTGGCCC AGGATCCTGA CAGCAAGGTT 180 GCTTGCGAGA CCTGCACCAA GACCAACATG GTCATGGTCT TCGGCGAGAT CACCACCAAG 240 GCCACCGTTG ACTATGAGAA GATTGTGCGC GACACCTGCC GTGACATCGG CTTCATCTCT 300 GACGACGTCG GTCTCGATGC CGACCATTGC AAGGTGCTCG TCAACATCGA GCAGCAATCC 360 CCTGACATTG CCCAGGGTGT TCACGGACAC TTCACCAAGC GTCCAGAAGA GGTCGGCGCC 420 
GGTGACCAGG GCATCATGTT TGGCTACGCC ACTGATGAGA CCCCTGAGCT GATGCCCCTC 480 ACCCACATGC TTGCCACCAA GCTCGGAGCT CGCCTCACCG AGGTCCGCAA GAATGGCACC 540 TGCGCCTGGC TCAGGCCTGA TGGAAAGACC CAGGTCACCA TTGAGTACCT AAACGAGGGT 600 GGTGCCATGG TGCCCGTTCG TGTGCACACC GTCCTCATCT CCACCCAGCA TGATGAGACC 660 GTCACCAACG ATGAGATCGC TGCAGACCTC AAGGAGCATG TCATCAAGCC GGTGATTCCC 720 GGGAAGTACC TCGATGAGAA CACCATCTTC CACCTGAACC CATCGGGCCCC CTTTGTCATC 780 GGTGGCCCTC ACGGCGATGC CGGTCTCACC GCCCGCAAGA TCATCATCGA CACCTATGGT 840 GGCTGGGGAG CCCACGGCGG CGGTGCCTTC TCTGGCAAGG ACCCTACCAA GGTCGACCGC 900 AGTGGCGCCT ACATTGCCAG GCAGGCTGCC AAGAGCATCA TCGCCAGCGG CCTCGCACGC 960 CGGTGCATTG TGCAGATCTC ATATGCCATC GGTGTACCTG AGCCTTTGTC TGTGTTCGTC 1020 GACTCCTACG GCACTGGCAA GATCCCTGAC AGGGAGATCC TCAAGCTCGT GAAGGAGAAC 1080 TTTGACTTCA GACCCGGGAT GATCACGATC AACCTCGACT TGAAGAAAGG TGGAAACAGG 1140 TTCATCAAGA CAGCTGCTTA CGGTCACTTT GGCCGCGATG ATGCTGACTT CACCTGGGAG 1200 GTGGTGAAGC CCCTCAAGTT CGACAAGGCA TCTGCTTAAG AAGAAGACAT CACATTGAGG 1260 GTTCTTCTTG GTCTGATGCC TCTCAAGTTC GGCAAGGCGG GATCCTTTTG CTCCTCGGAA 1320 GTAAGAAGAA GCATTCAACA TCGCCCGGAA TTC 1353

It is noted that, as of this date, the best method known to the applicant for carrying out the aforementioned invention is that which is clear from the present description of the invention. Having described the invention as above, the content of the following claims is claimed as property:

Claims

1. An isolated nucleic acid fragment encoding all or a substantial portion of a plant dihydrodipicolinate reductase, characterized in that it comprises a member selected from the group consisting of: a) an isolated nucleic acid fragment encoding all or a substantial portion of the amino acid sequence set forth in a member selected from the group consisting of SEQ ID NO: 2 and 4; b) an isolated nucleic acid fragment that is substantially similar to an isolated nucleic acid fragment encoding all or a substantial portion of the amino acid sequence set forth in a member selected from the group consisting of SEQ ID NO: 2 and 4; and c) an isolated nucleic acid fragment that is complementary to (a) or (b).
2. The isolated nucleic acid fragment of Claim 1, characterized in that the nucleotide sequence of the fragment comprises all or a portion of the sequence set forth in a member selected from the group consisting of SEQ ID NO: 1 and 3.
3. A chimeric gene characterized in that it comprises the nucleic acid fragment of Claim 1 operatively linked to suitable regulatory sequences.
4. A transformed host cell characterized in that it comprises a chimeric gene of Claim 3.
5. A dihydrodipicolinate reductase polypeptide characterized in that it comprises all or a substantial portion of the amino acid sequence set forth in a member selected from the group consisting of SEQ ID NO: 2 and 4.
6. An isolated nucleic acid fragment encoding all or a substantial portion of a plant diaminopimelate epimerase, characterized in that it comprises a member selected from the group consisting of: a) an isolated nucleic acid fragment encoding all or a substantial portion of the amino acid sequence set forth in a member selected from the group consisting of SEQ ID NO: 7, 9, 11 and 13; b) an isolated nucleic acid fragment that is substantially similar to an isolated nucleic acid fragment encoding all or a substantial portion of the amino acid sequence set forth in a member selected from the group consisting of SEQ ID NO: 7, 9, 11 and 13; and c) an isolated nucleic acid fragment that is complementary to (a) or (b).
7. The isolated nucleic acid fragment of Claim 6, characterized in that the nucleotide sequence of the fragment comprises all or a portion of the sequence set forth in a member selected from the group consisting of SEQ ID NO: 6, 8, 10 and 12.
8. A chimeric gene characterized in that it comprises the nucleic acid fragment of Claim 6 operatively linked to suitable regulatory sequences.
9. A transformed host cell characterized in that it comprises the chimeric gene of Claim 8.
10. A diaminopimelate epimerase polypeptide characterized in that it comprises all or a substantial portion of the amino acid sequence set forth in a member selected from the group consisting of SEQ ID NO: 7, 9, 11 and 13.
11. An isolated nucleic acid fragment encoding all or a substantial portion of a plant threonine synthase, characterized in that it comprises a member selected from the group consisting of: a) an isolated nucleic acid fragment encoding all or a substantial portion of the amino acid sequence set forth in a member selected from the group consisting of SEQ ID NO: 16 and 18; b) an isolated nucleic acid fragment that is substantially similar to an isolated nucleic acid fragment encoding all or a substantial portion of the amino acid sequence set forth in a member selected from the group consisting of SEQ ID NO: 16 and 18; and c) an isolated nucleic acid fragment that is complementary to (a) or (b).
12. The isolated nucleic acid fragment of Claim 11, characterized in that the nucleotide sequence of the fragment comprises all or a portion of the sequence set forth in a member selected from the group consisting of SEQ ID NO: 15 and 17.
13. A chimeric gene characterized in that it comprises the nucleic acid fragment of Claim 11 operatively linked to suitable regulatory sequences.
14. A transformed host cell characterized in that it comprises the chimeric gene of Claim 13.
15. A threonine synthase polypeptide characterized in that it comprises all or a substantial portion of the amino acid sequence set forth in a member selected from the group consisting of SEQ ID NO: 16 and 18.
16. An isolated nucleic acid fragment encoding all or a substantial portion of a plant threonine synthase, characterized in that it comprises a member selected from the group consisting of: a) an isolated nucleic acid fragment encoding all or a substantial portion of the amino acid sequence set forth in SEQ ID NO: 20; b) an isolated nucleic acid fragment that is substantially similar to an isolated nucleic acid fragment encoding all or a substantial portion of the amino acid sequence set forth in SEQ ID NO: 20; and c) an isolated nucleic acid fragment that is complementary to (a) or (b).
17. The isolated nucleic acid fragment of Claim 16, characterized in that the nucleotide sequence of the fragment comprises all or a portion of the sequence set forth in SEQ ID NO: 19.
18. A chimeric gene characterized in that it comprises the nucleic acid fragment of Claim 16 operatively linked to suitable regulatory sequences.
19. A transformed host cell characterized in that it comprises the chimeric gene of Claim 18.
20. A threonine synthase polypeptide characterized in that it comprises all or a substantial portion of the amino acid sequence set forth in SEQ ID NO: 20.
21. An isolated nucleic acid fragment encoding all or a substantial portion of a plant threonine synthase, characterized in that it comprises a member selected from the group consisting of: a) an isolated nucleic acid fragment encoding all or a substantial portion of the amino acid sequence set forth in a member selected from the group consisting of SEQ ID NO: 22 and 24; b) an isolated nucleic acid fragment that is substantially similar to an isolated nucleic acid fragment encoding all or a substantial portion of the amino acid sequence set forth in a member selected from the group consisting of SEQ ID NO: 22 and 24; and c) an isolated nucleic acid fragment that is complementary to (a) or (b).
22. The isolated nucleic acid fragment of Claim 21, characterized in that the nucleotide sequence of the fragment comprises all or a portion of the sequence set forth in a member selected from the group consisting of SEQ ID NO: 21 and 23.
23. A chimeric gene characterized in that it comprises the nucleic acid fragment of Claim 21 operably linked to suitable regulatory sequences.
24. A transformed host cell characterized in that it comprises the chimeric gene of Claim 23.
25. A threonine synthase polypeptide characterized in that it comprises all or a substantial portion of the amino acid sequence set forth in a member selected from the group consisting of SEQ ID NO: 22 and 24.
26. An isolated nucleic acid fragment encoding all or a substantial portion of a plant threonine synthase, characterized in that it comprises a member selected from the group consisting of: a) an isolated nucleic acid fragment encoding all or a substantial portion of the amino acid sequence set forth in SEQ ID NO: 26; b) an isolated nucleic acid fragment that is substantially similar to an isolated nucleic acid fragment encoding all or a substantial portion of the amino acid sequence set forth in SEQ ID NO: 26; and c) an isolated nucleic acid fragment that is complementary to (a) or (b).
27. The isolated nucleic acid fragment of Claim 26, characterized in that the nucleotide sequence of the fragment comprises all or a portion of the sequence set forth in SEQ ID NO: 25.
28. A chimeric gene characterized in that it comprises the nucleic acid fragment of Claim 26 operably linked to suitable regulatory sequences.
29. A transformed host cell characterized in that it comprises the chimeric gene of Claim 28.
30. A threonine synthase polypeptide characterized in that it comprises all or a substantial portion of the amino acid sequence set forth in SEQ ID NO: 26.
31. An isolated nucleic acid fragment encoding all or a substantial portion of a plant threonine deaminase, characterized in that it comprises a member selected from the group consisting of: a) an isolated nucleic acid fragment encoding all or a substantial portion of the amino acid sequence set forth in SEQ ID NO: 29; b) an isolated nucleic acid fragment that is substantially similar to an isolated nucleic acid fragment encoding all or a substantial portion of the amino acid sequence set forth in SEQ ID NO: 29; and c) an isolated nucleic acid fragment that is complementary to (a) or (b).
32. The isolated nucleic acid fragment of Claim 31, characterized in that the nucleotide sequence of the fragment comprises all or a portion of the sequence set forth in SEQ ID NO: 28.
33. A chimeric gene characterized in that it comprises the nucleic acid fragment of Claim 31 operably linked to suitable regulatory sequences.
34. A transformed host cell characterized in that it comprises the chimeric gene of Claim 33.
35. A threonine deaminase polypeptide characterized in that it comprises all or a substantial portion of the amino acid sequence set forth in SEQ ID NO: 29.
36. An isolated nucleic acid fragment encoding all or a substantial portion of a plant threonine deaminase, characterized in that it comprises a member selected from the group consisting of: a) an isolated nucleic acid fragment encoding all or a substantial portion of the amino acid sequence set forth in a member selected from the group consisting of SEQ ID NO: 31 and 33; b) an isolated nucleic acid fragment that is substantially similar to an isolated nucleic acid fragment encoding all or a substantial portion of the amino acid sequence set forth in a member selected from the group consisting of SEQ ID NO: 31 and 33; and c) an isolated nucleic acid fragment that is complementary to (a) or (b).
37. The isolated nucleic acid fragment of Claim 36, characterized in that the nucleotide sequence of the fragment comprises all or a portion of the sequence set forth in a member selected from the group consisting of SEQ ID NO: 30 and 32.
38. A chimeric gene characterized in that it comprises the nucleic acid fragment of Claim 36 operably linked to suitable regulatory sequences.
39. A transformed host cell characterized in that it comprises the chimeric gene of Claim 38.
40. A threonine deaminase polypeptide characterized in that it comprises all or a substantial portion of the amino acid sequence set forth in a member selected from the group consisting of SEQ ID NO: 31 and 33.
41. An isolated nucleic acid fragment encoding all or a substantial portion of a plant S-adenosylmethionine synthetase, characterized in that the nucleotide sequence of the fragment comprises all or a portion of the sequence set forth in SEQ ID NO: 35.
42. A chimeric gene characterized in that it comprises the nucleic acid fragment of Claim 41 operably linked to suitable regulatory sequences.
43. A transformed host cell characterized in that it comprises the chimeric gene of Claim 42.
44. An isolated nucleic acid fragment encoding all or a substantial portion of a plant S-adenosylmethionine synthetase, characterized in that the nucleotide sequence of the fragment comprises all or a portion of the sequence set forth in SEQ ID NO: 38.
45. A chimeric gene characterized in that it comprises the nucleic acid fragment of Claim 44 operably linked to suitable regulatory sequences.
46. A transformed host cell characterized in that it comprises the chimeric gene of Claim 45.
47. An isolated nucleic acid fragment encoding all or a substantial portion of a plant S-adenosylmethionine synthetase, characterized in that the nucleotide sequence of the fragment comprises all or a portion of the sequence set forth in SEQ ID NO: 41.
48. A chimeric gene characterized in that it comprises the nucleic acid fragment of Claim 47 operably linked to suitable regulatory sequences.
49. A transformed host cell characterized in that it comprises the chimeric gene of Claim 48.
50. A method of altering the level of expression of a plant amino acid biosynthetic enzyme in a host cell, characterized in that it comprises: a) transforming a host cell with the chimeric gene of any of Claims 3, 8, 13, 18, 23, 28, 33, 38, 42, 45, and 48; and b) growing the transformed host cell produced in step (a) under conditions that are suitable for expression of the chimeric gene, wherein expression of the chimeric gene results in production of altered levels of a plant amino acid biosynthetic enzyme in the transformed host cell.
51. A method of obtaining a nucleic acid fragment encoding all or substantially all of the amino acid sequence encoding a plant amino acid biosynthetic enzyme, characterized in that it comprises: a) probing a cDNA or genomic library with the nucleic acid fragment of any of Claims 1, 6, 11, 16, 21, 26, 31, 36, 41, 44, and 47; b) identifying a DNA clone that hybridizes with the nucleic acid fragment of any of Claims 1, 6, 11, 16, 21, 26, 31, 36, 41, 44, and 47; c) isolating the DNA clone identified in step (b); and d) sequencing the cDNA or genomic fragment that comprises the clone isolated in step (c), wherein the sequenced nucleic acid fragment encodes all or substantially all of the amino acid sequence encoding a plant amino acid biosynthetic enzyme.
52. A method of obtaining a nucleic acid fragment encoding a portion of an amino acid sequence encoding a plant amino acid biosynthetic enzyme, characterized in that it comprises: a) synthesizing an oligonucleotide primer corresponding to a portion of the sequence set forth in any of SEQ ID NOs: 1, 3, 6, 8, 10, 12, 15, 17, 19, 21, 23, 25, 28, 30, 32, 35, 38, and 41; and b) amplifying a cDNA insert present in a cloning vector using the oligonucleotide primer of step (a) and a primer representing sequences of the cloning vector, wherein the amplified nucleic acid fragment encodes a portion of an amino acid sequence encoding a plant amino acid biosynthetic enzyme.
53. A product, characterized in that it is produced by the method of Claim 51.
54. A product, characterized in that it is produced by the method of Claim 52.
55. A method for evaluating at least one compound for its ability to inhibit the activity of a plant biosynthetic enzyme selected from the group consisting of dihydrodipicolinate reductase, diaminopimelate epimerase, threonine synthase, threonine deaminase and S-adenosylmethionine synthetase, characterized in that it comprises the steps of: a) transforming a host cell with a chimeric gene comprising a nucleic acid fragment encoding a plant biosynthetic enzyme selected from the group consisting of dihydrodipicolinate reductase, diaminopimelate epimerase, threonine synthase, threonine deaminase and S-adenosylmethionine synthetase, operably linked to suitable regulatory sequences; b) growing the transformed host cell under conditions that are suitable for expression of the chimeric gene, wherein expression of the chimeric gene results in production of the biosynthetic enzyme encoded by the operably linked nucleic acid fragment in the transformed host cell; c) optionally purifying the plant biosynthetic enzyme expressed by the transformed host cell; d) treating the biosynthetic enzyme with a compound to be tested; and e) comparing the activity of the biosynthetic enzyme that has been treated with the test compound to the activity of an untreated plant biosynthetic enzyme, thereby selecting compounds with potential for inhibitory activity.
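The comparison of treated and untreated enzyme activity in the final step of the evaluation method above can be illustrated with a short calculation. The following is a minimal sketch and not part of the claims: it assumes that activity has already been measured in the same arbitrary units for the untreated control and for each treated sample, and the function names, the compound labels and the 50% cutoff are hypothetical choices made only for illustration.

```python
def percent_inhibition(untreated_activity: float, treated_activity: float) -> float:
    """Return the loss of enzyme activity as a percentage of the untreated control."""
    if untreated_activity <= 0:
        raise ValueError("untreated activity must be positive")
    return 100.0 * (untreated_activity - treated_activity) / untreated_activity


def select_candidates(untreated_activity: float,
                      treated_activities: dict,
                      cutoff: float = 50.0) -> list:
    """Return the names of compounds whose inhibition meets the cutoff.

    The cutoff is an assumed, illustrative threshold; the source does not
    specify how much inhibition qualifies a compound for selection.
    """
    return [
        name
        for name, activity in treated_activities.items()
        if percent_inhibition(untreated_activity, activity) >= cutoff
    ]


if __name__ == "__main__":
    # Hypothetical measurements: control activity and activity after treatment.
    control = 100.0
    treated = {"compound_A": 90.0, "compound_B": 20.0}
    print(select_candidates(control, treated))
```

With these hypothetical numbers, only the compound that reduces activity by at least half of the control value is retained as a candidate inhibitor.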