FATTY ACID DESATURASES AND MUTANT SEQUENCES THEREOF
Technical Field This invention relates to fatty acid desaturases and nucleic acids encoding desaturase proteins. More particularly, the invention relates to nucleic acids encoding delta-12 and delta-15 fatty acid desaturase proteins that affect fatty acid composition in plants, polypeptides produced from such nucleic acids, and plants expressing such nucleic acids.
Background of the Invention Many breeding studies have been conducted to improve the fatty acid profile of Brassica varieties. Pleines and Freidt, Fat Sci. Technol., 90(5), 167-171 (1988) describe plant lines with reduced C18:3 levels (2.5-5.8%) combined with high oleic content (73-79%). Rakow and McGregor, J. Amer. Oil Chem. Soc., 50, 400-403 (Oct. 1973) discuss problems associated with selecting mutants for linoleic and linolenic acids. In Can. J. Plant Sci., 68, 509-511 (Apr. 1988), Stellar summer rape producing seed oil with 3% linolenic acid and 28% linoleic acid is disclosed. Roy and Tarr, Z. Pflanzenzuchtg, 95(3), 201-209 (1985) teach transfer of genes through an interspecific cross from Brassica juncea into Brassica napus resulting in a reconstituted line combining high linoleic with low linolenic acid content. Roy and Tarr, Plant Breeding, 98, 89-96 (1987) discuss prospects for development of B. napus L. having improved linoleic and linolenic acid content. European Patent application 323,753 published July 12, 1989 discloses seeds and oils having greater than 79% oleic acid combined with less than 3.5% linolenic acid. Canvin, Can. J. Botany, 43, 63-69 (1965) discusses the effect of temperature on the fatty acid composition of oils from several seed crops including rapeseed.
Mutations typically are induced with extremely high doses of radiation and/or chemical mutagens (Gaul, H., Radiation Botany (1964) 4:155-232). High dose levels which exceed LD50, and typically reach LD90, lead to maximum achievable mutation rates. In mutation breeding of Brassica varieties, high levels of chemical mutagens alone or combined with radiation have induced a limited number of fatty acid mutations (Rakow, G., Z. Pflanzenzuchtg (1973) 69:62-82). The low α-linolenic acid mutation derived from the Rakow mutation breeding program did not have direct commercial application because of low seed yield. The first commercial cultivar using the low α-linolenic acid mutation derived in 1973 was released in 1988 as the variety Stellar (Scarth, R. et al., Can. J. Plant Sci. (1988) 68:509-511). Stellar was 20% lower yielding than commercial cultivars at the time of its release.
Alterations in the fatty acid composition of vegetable oils are desirable for meeting specific food and industrial uses. For example, Brassica varieties with increased monounsaturate levels (oleic acid) in the seed oil, and products derived from such oil, would improve lipid nutrition. Canola lines which are low in polyunsaturated fatty acids and high in oleic acid tend to have higher oxidative stability, which is a useful trait for the retail food industry. Delta-12 fatty acid desaturase (also known as oleic desaturase) is involved in the enzymatic conversion of oleic acid to linoleic acid. Delta-15 fatty acid desaturase (also known as linoleic acid desaturase) is involved in the enzymatic conversion of linoleic acid to α-linolenic acid. A microsomal delta-12 desaturase has been cloned and characterized using T-DNA tagging. Okuley, et al., Plant Cell 6:147-158 (1994). The nucleotide sequences of higher plant genes encoding microsomal delta-12 fatty acid desaturase are described in Lightner et al., WO 94/11516. Sequences of higher plant genes encoding microsomal and plastid delta-15 fatty acid desaturases are disclosed in Yadav, N., et al., Plant Physiol., 103:467-476 (1993), WO 93/11245 and Arondel, V. et al., Science, 258:1353-1355 (1992). However, there are no teachings that disclose mutations in delta-12 or delta-15 fatty acid desaturase coding sequences from plants. There is a need in the art for more efficient methods to develop plant lines that contain delta-12 or delta-15 fatty acid desaturase gene sequence mutations effective for altering the fatty acid composition of seeds.
Summary of the Invention The invention comprises Brassicaceae or Helianthus seeds, plants and plant lines having at least one mutation that controls the levels of unsaturated fatty acids in plants. One embodiment of the invention is an isolated nucleic acid fragment comprising a nucleotide sequence encoding a mutation from a mutant delta-12 fatty acid desaturase conferring altered fatty acid composition in seeds when the fragment is present in a plant. A preferred sequence comprises a mutant sequence as shown in Fig. 2. Another embodiment of the invention is an isolated nucleic acid fragment comprising a nucleotide sequence encoding a mutation from a mutant delta-15 fatty acid desaturase. A plant in this embodiment may be soybean, oilseed Brassica species, sunflower, castor bean or corn. The mutant sequence may be derived from, for example, a Brassica napus, Brassica rapa, Brassica juncea or Helianthus delta-12 or delta-15 desaturase gene.
Another embodiment of the invention involves a method of producing a Brassicaceae or Helianthus plant line comprising the steps of: (a) inducing mutagenesis in cells of a starting variety of a Brassicaceae or Helianthus species; (b) obtaining progeny plants from the mutagenized cells; (c) identifying progeny plants that contain a mutation in a delta-12 or delta-15 fatty acid desaturase gene; and (d) producing a plant line by selfing or crossing. The resulting plant line may be subjected to mutagenesis in order to obtain a line having both a delta-12 desaturase mutation and a delta-15 desaturase mutation.
Yet another embodiment of the invention involves a method of producing plant lines containing altered fatty acid composition comprising: (a) crossing a first plant with a second plant having a mutant delta-12 or delta-15 fatty acid desaturase; (b) obtaining seeds from the cross of step (a); (c) growing fertile plants from such seeds; (d) obtaining progeny seed from the plants of step (c); and (e) identifying those seeds among the progeny that have altered fatty acid composition. Suitable plants are soybean, rapeseed, sunflower, safflower, castor bean and corn. Preferred plants are rapeseed and sunflower.
The invention is also embodied in vegetable oil obtained from plants disclosed herein, which vegetable oil has an altered fatty acid composition.
Brief Description of the Sequence Listing SEQ ID NO: 1 shows a hypothetical DNA sequence of a Brassica Fad2 gene. SEQ ID NO: 2 is the deduced amino acid sequence of SEQ ID NO: 1.
SEQ ID NO: 3 shows a hypothetical DNA sequence of a Brassica Fad2 gene having a mutation at nucleotide 316. SEQ ID NO: 4 is the deduced amino acid sequence of SEQ ID NO: 3.
SEQ ID NO: 5 shows a hypothetical DNA sequence of a Brassica Fad2 gene. SEQ ID NO: 6 is the deduced amino acid sequence of SEQ ID NO: 5.
SEQ ID NO: 7 shows a hypothetical DNA sequence of a Brassica Fad2 gene having a mutation at nucleotide 515. SEQ ID NO: 8 is the deduced amino acid sequence of SEQ ID NO: 7.
SEQ ID NO: 9 shows the DNA sequence for the coding region of a wild type Brassica Fad2-D gene. SEQ ID NO: 10 is the deduced amino acid sequence for SEQ ID NO: 9.
SEQ ID NO: 11 shows the DNA sequence for the coding region of the IMC 129 mutant Brassica Fad2-D gene. SEQ ID NO: 12 is the deduced amino acid sequence for SEQ ID NO: 11.
SEQ ID NO: 13 shows the DNA sequence for the coding region of a wild type Brassica Fad2-F gene. SEQ ID NO: 14 is the deduced amino acid sequence for SEQ ID NO: 13.
SEQ ID NO: 15 shows the DNA sequence for the coding region of the Q508 mutant Brassica Fad2-F gene. SEQ ID NO: 16 is the deduced amino acid sequence for SEQ ID NO: 15.
SEQ ID NO: 17 shows the DNA sequence for the coding region of the Q4275 mutant Brassica Fad2-F gene. SEQ ID NO: 18 is the deduced amino acid sequence for SEQ ID NO: 17.
SEQ ID NOS: 19-27 show oligonucleotide sequences.
SEQ ID NO: 28 shows the genomic DNA sequence for the Fad2-U gene from Brassica.
SEQ ID NOS: 30-31 show genomic sequences located upstream from the start codon of Brassica Fad2-D genes.
Brief Description of the Figures Figure 1 is a histogram showing the frequency distribution of seed oil oleic acid (C18:1) content in a segregating population of a Q508 X Westar cross. The bar labeled WSGA 1A represents the C18:1 content of the Westar parent. The bar labeled Q508 represents the C18:1 content of the Q508 parent.
Figure 2 shows the nucleotide sequences for a Brassica Fad2-D wild type gene (Fad2-D wt), IMC129 mutant gene (Fad2-D GA316 IMC129), Fad2-F wild type gene (Fad2-F wt), Q508 mutant gene (Fad2-F TA515 Q508) and Q4275 mutant gene (Fad2-F GA908 Q4275).
Figure 3 shows the deduced amino acid sequences for the polynucleotides of Figure 2.
Description of the Preferred Embodiments
All percent fatty acids herein are percent by weight of the oil of which the fatty acid is a component. As used herein, a "line" is a group of plants that display little or no genetic variation between individuals for at least one trait. Such lines may be created by several generations of self-pollination and selection, or vegetative propagation from a single parent using tissue or cell culture techniques. As used herein, the term "variety" refers to a line which is used for commercial production.
The term "mutagenesis" refers to the use of a mutagenic agent to induce random genetic mutations within a population of individuals. The treated population, or a subsequent generation of that population, is then screened for usable trait(s) that result from the mutations. A "population" is any group of individuals that share a common gene pool. As used herein, "M0" is untreated seed. As used herein, "M1" is the seed (and resulting plants) exposed to a mutagenic agent, while "M2" is the progeny (seeds and plants) of self-pollinated M1 plants, "M3" is the progeny of self-pollinated M2 plants, and "M4" is the progeny of self-pollinated M3 plants. "M5" is the progeny of self-pollinated M4 plants. "M6", "M7", etc. are each the progeny of self-pollinated plants of the previous generation. The term "selfed" as used herein means self-pollinated.
"Stability" or "stable" as used herein means that with respect to a given fatty acid component, the component is maintained from generation to generation for at least two generations and preferably at least three generations at substantially the same level, e.g., preferably ±5%. The method of the invention is capable of creating lines with improved fatty acid compositions stable up to ±5% from generation to generation. The above stability may be affected by temperature, location, stress and time of planting. Thus, comparison of fatty acid profiles should be made from seeds produced under similar growing conditions. Stability may be measured based on knowledge of the prior generation.
Intensive breeding has produced Brassica plants whose seed oil contains less than 2% erucic acid. The same varieties have also been bred so that the defatted meal contains less than 30 μmol glucosinolates/gram. "Canola" as used herein refers to plant variety seed or oil which contains less than 2% erucic acid (C22:1) , and meal with less than 30 μmol glucosinolates/gram.
Applicants have discovered plants with mutations in a delta-12 fatty acid desaturase gene. Such plants have useful alterations in the fatty acid compositions of the seed oil. Such mutations confer, for example, an elevated oleic acid content, a decreased, stabilized linoleic acid content, or both elevated oleic acid and decreased, stabilized linoleic acid content. Applicants have further discovered plants with mutations in a delta-15 fatty acid desaturase gene. Such plants have useful alterations in the fatty acid composition of the seed oil, e.g., a decreased, stabilized level of α-linolenic acid.
Applicants have further discovered isolated nucleic acid fragments (polynucleotides) comprising sequences that carry mutations within the coding sequence of delta-12 or delta-15 fatty acid desaturases. The mutations confer desirable alterations in fatty acid levels in the seed oil of plants carrying such mutations. Delta-12 fatty acid desaturase is also known as omega-6 fatty acid desaturase and is sometimes referred to herein as Fad2 or 12-DES. Delta-15 fatty acid desaturase is also known as omega-3 fatty acid desaturase and is sometimes referred to herein as Fad3 or 15-DES.
A nucleic acid fragment of the invention may be in the form of RNA or in the form of DNA, including cDNA, synthetic DNA or genomic DNA. The DNA may be double-stranded or single-stranded, and if single-stranded, can be either the coding strand or non-coding strand. An RNA analog may be, for example, mRNA or a combination of ribo- and deoxyribonucleotides. Illustrative examples of a nucleic acid fragment of the invention are the mutant sequences shown in Fig. 3.
A nucleic acid fragment of the invention contains a mutation in a microsomal delta-12 fatty acid desaturase coding sequence or a mutation in a microsomal delta-15 fatty acid desaturase coding sequence. Such a mutation renders the resulting desaturase gene product non-functional in plants, relative to the function of the gene product encoded by the wild-type sequence. The non-functionality of the delta-12 desaturase gene product can be inferred from the decreased level of reaction product (linoleic acid) and increased level of substrate (oleic acid) in plant tissues expressing the mutant sequence, compared to the corresponding levels in plant tissues expressing the wild-type sequence. The non-functionality of the delta-15 desaturase gene product can be inferred from the decreased level of reaction product (α-linolenic acid) and the increased level of substrate (linoleic acid) in plant tissues expressing the mutant sequence, compared to the corresponding levels in plant tissues expressing the wild-type sequence.
A nucleic acid fragment of the invention may comprise a portion of the coding sequence, e.g., at least about 10 nucleotides, provided that the fragment contains at least one mutation in the coding sequence. The length of a desired fragment depends upon the purpose for which the fragment will be used, e.g., PCR primer, site-directed mutagenesis and the like. In one embodiment, a nucleic acid fragment of the invention comprises the full length coding sequence of a mutant delta-12 or mutant delta-15 fatty acid desaturase, e.g., the mutant sequences of Fig. 3. In other embodiments, a nucleic acid fragment is about 20 to about 50 nucleotides (or base pairs, bp), or about 50 to about 500 nucleotides, or about 500 to about 1200 nucleotides in length.
In another embodiment, the invention relates to an isolated nucleic acid fragment of at least 50 nucleotides in length that has at least 70% sequence identity to the nucleotide sequences of SEQ ID NO: 30 or SEQ ID NO: 31. In some embodiments, such nucleic acid fragments have at least 80% or 90% sequence identity to SEQ ID NO: 30 or SEQ ID NO: 31. Sequence identity for these and other nucleic acids disclosed herein can be determined, for example, using BLAST 2.0.4 (Feb. 24, 1998) to search the nr database (non-redundant GenBank, EMBL, DDBJ and PDB). BLAST 2.0.4 is provided by the National Center for Biotechnology Information (http://www.ncbi.nlm.nih.gov). Altschul, S.F. et al., Nucleic Acids Res., 25:3389-3402 (1997). Alternatively, MEGALIGN® (DNASTAR, Madison, WI) sequence alignment software can be used to determine sequence identity by the Clustal algorithm. In this method, sequences are grouped into clusters by examining the distance between all pairs. Clusters are aligned pairwise, then as groups. The Jotun Hein algorithm is also available in MEGALIGN®. The nucleotide sequences of SEQ ID NO: 30 and SEQ ID NO: 31 are about 85% identical using the Clustal algorithm with default parameters.
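For illustration only, the percent-identity thresholds above can be expressed as a small calculation on two sequences that have already been aligned. This is a minimal sketch and not the BLAST or Clustal procedure: it assumes identity is counted as matching columns divided by compared columns, with gap columns counted as mismatches, which may differ from how a given alignment program reports identity. The function names and the toy sequences are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity between two sequences that are already aligned.

    Assumption: a column counts as a match only when both sequences carry
    the same non-gap character; columns where both are gaps are skipped.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    matches = 0
    compared = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == gap and b == gap:
            continue  # column carries no information for this pair
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * matches / compared


def meets_identity_claim(aligned_fragment: str, aligned_reference: str,
                         min_length: int = 50, min_identity: float = 70.0,
                         gap: str = "-") -> bool:
    """Check the 'at least 50 nucleotides, at least 70% identity' criterion."""
    fragment_length = len(aligned_fragment.replace(gap, ""))
    identity = percent_identity(aligned_fragment, aligned_reference, gap)
    return fragment_length >= min_length and identity >= min_identity


# Toy alignment (hypothetical sequences, not taken from the sequence listing).
print(round(percent_identity("ATGGTTGCA-CT", "ATGCTTGCAACT"), 1))
```

Raising `min_identity` to 80.0 or 90.0 gives the narrower embodiments mentioned above.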
The nucleotide sequences of SEQ ID NO: 30 and SEQ ID NO: 31 are located upstream of the ATG start codon for the fad2-D gene and can be isolated from Bridger and Westar canola plants, respectively. These upstream elements contain intron-like features.
The invention also relates to an isolated nucleic acid fragment that includes a sequence of at least 200 nucleotides. The fragment has at least 70% identity to nucleotides 1 to about 1012 of SEQ ID NO: 28. In some embodiments, the fragment has at least 80% or at least 90% sequence identity to nucleotides 1 to about 1012 of SEQ ID NO: 28. This portion of SEQ ID NO: 28 is located upstream of the ATG start codon and has intron-like features.
A mutation in a nucleic acid fragment of the invention may be in any portion of the coding sequence that renders the resulting gene product non-functional. Suitable types of mutations include, without limitation, insertions of nucleotides, deletions of nucleotides, or transitions and transversions in the wild-type coding sequence. Such mutations result in insertions of one or more amino acids, deletions of one or more amino acids, and non-conservative amino acid substitutions in the corresponding gene product. In some embodiments, the sequence of a nucleic acid fragment may comprise more than one mutation or more than one type of mutation.
Insertion or deletion of amino acids in a coding sequence may, for example, disrupt the conformation of essential alpha-helical or beta-pleated sheet regions of the resulting gene product. Amino acid insertions or deletions may also disrupt binding or catalytic sites important for gene product activity. It is known in the art that the insertion or deletion of a larger number of contiguous amino acids is more likely to render the gene product non-functional, compared to a smaller number of inserted or deleted amino acids.
Non-conservative amino acid substitutions may replace an amino acid of one class with an amino acid of a different class. Non-conservative substitutions may make a substantial change in the charge or hydrophobicity of the gene product. Non-conservative amino acid substitutions may also make a substantial change in the bulk of the residue side chain, e.g., substituting an alanyl residue for an isoleucyl residue. Examples of non-conservative substitutions include the substitution of a basic amino acid for a non-polar amino acid, or a polar amino acid for an acidic amino acid. Because there are only 20 amino acids encoded in a gene, substitutions that result in a non-functional gene product may be determined by routine experimentation, incorporating amino acids of a different class in the region of the gene product targeted for mutation.
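The class-based test described above can be sketched as a lookup. The grouping used here (non-polar, polar uncharged, acidic, basic) is an assumption based on a common textbook classification and is not a definition given in this specification. The sketch flags only changes of class, so it does not capture differences in side-chain bulk within a class, such as alanine versus isoleucine.

```python
# Assumed four-class grouping of the 20 encoded amino acids (one-letter codes).
AMINO_ACID_CLASS = {}
for residues, label in (
    ("GAVLIMFWP", "non-polar"),
    ("STCYNQ", "polar"),
    ("DE", "acidic"),
    ("KRH", "basic"),
):
    for residue in residues:
        AMINO_ACID_CLASS[residue] = label


def is_non_conservative(wild_type: str, mutant: str) -> bool:
    """True when the substitution moves the residue to a different class."""
    return AMINO_ACID_CLASS[wild_type.upper()] != AMINO_ACID_CLASS[mutant.upper()]


# Substitutions of the kinds discussed in this specification.
for wt, mut in (("E", "K"), ("L", "H"), ("G", "E"), ("H", "G"), ("L", "I")):
    print(wt, "to", mut, "non-conservative:", is_non_conservative(wt, mut))
```

Under this grouping, Glu to Lys (acidic to basic), Leu to His (non-polar to basic), Gly to Glu (non-polar to acidic) and His to Gly (basic to non-polar) are all flagged, while Leu to Ile is not.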
Preferred mutations are in a region of the nucleic acid encoding an amino acid sequence motif that is conserved among delta-12 fatty acid desaturases or delta-15 fatty acid desaturases, such as a His-Xaa-Xaa-Xaa-His motif (Tables 1-3). An example of a suitable region has a conserved HECGH motif that is found, for example, in nucleotides corresponding to amino acids 105 to 109 of the Arabidopsis and Brassica delta-12 desaturase sequences, in nucleotides corresponding to amino acids 101 to 105 of the soybean delta-12 desaturase sequence and in nucleotides corresponding to amino acids 111 to 115 of the maize delta-12 desaturase sequence. See, e.g., WO 94/11516; Okuley et al., Plant Cell 6:147-158 (1994).
The one letter amino acid designations used herein are described in Alberts, B. et al., Molecular Biology of the Cell, 3rd edition, Garland Publishing, New York, 1994. Amino acids flanking this motif are also highly conserved among delta-12 and delta-15 desaturases and are also suitable candidates for mutations in fragments of the invention.
An illustrative embodiment of a mutation in a nucleic acid fragment of the invention is a Glu to Lys substitution in the HECGH motif of a Brassica microsomal delta-12 desaturase sequence, either the D form or the F form. This mutation results in the sequence HECGH being changed to HKCGH as seen by comparing SEQ ID NO: 10 (wild-type D form) to SEQ ID NO: 12 (mutant D form). A similar mutation in other Fad-2 sequences is contemplated to result in a non-functional gene product. (Compare SEQ ID NO: 2 to SEQ ID NO: 4).
A similar motif may be found at amino acids 101 to 105 of the Arabidopsis microsomal delta-15 fatty acid desaturase, as well as in the corresponding rape and soybean desaturases (Table 5). See, e.g., WO 93/11245; Arondel, V. et al., Science, 258:1353-1355 (1992); Yadav, N. et al., Plant Physiol., 103:467-476 (1993). Plastid delta-15 fatty acid desaturases have a similar motif (Table 5). Among the types of mutations in an HECGH motif that render the resulting gene product non-functional are non-conservative substitutions. An illustrative example of a non-conservative substitution is substitution of a glycine residue for either the first or second histidine. Such a substitution replaces a charged residue (histidine) with a non-polar residue (glycine). Another type of mutation that renders the resulting gene product non-functional is an insertion mutation, e.g., insertion of a glycine between the cysteine and glutamic acid residues in the HECGH motif.
Other regions having suitable conserved amino acid motifs include the HRRHH motif shown in Table 2, the HRTHH motif shown in Table 6 and the HVAHH motif shown in Table 3. See, e.g., WO 94/11516; Hitz, W. et al., Plant Physiol., 105:635-641 (1994); Okuley, J., et al., supra; and Yadav, N. et al., supra. An illustrative example of a mutation in the region shown in Table 3 is a mutation at nucleotides corresponding to the codon for glycine (amino acid 303 of B. napus). A non-conservative Gly to Glu substitution results in the amino acid sequence DRDYGILNKV being changed to sequence DRDYEILNKV (compare wild-type F form SEQ ID NO: 14 to mutant Q4275 SEQ ID NO: 18, Fig. 3).
Another region suitable for a mutation in a delta-12 desaturase sequence contains the motif KYLNNP at nucleotides corresponding to amino acids 171 to 175 of the Brassica desaturase sequence. An illustrative example of a mutation in this region is a Leu to His substitution, resulting in the amino acid sequence (Table 4) KYHNN (compare wild-type Fad2-F SEQ ID NO: 14 to mutant SEQ ID NO: 16). A similar mutation in other Fad-2 amino acid sequences is contemplated to result in a non-functional gene product. (Compare SEQ ID NO: 6 to SEQ ID NO: 8).
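As a consistency check on the three illustrative substitutions, a nucleotide position in a coding sequence can be mapped to its codon number and to its position within that codon. The sketch below assumes that nucleotide 1 is the A of the ATG start codon and that the labels GA316, TA515 and GA908 in Figure 2 denote G to A, T to A and G to A changes at those positions. Both readings are assumptions. The actual codons of the listed sequences are not reproduced here, so the code enumerates every standard-code codon that would be consistent with each reported amino acid change.

```python
from itertools import product

# Standard genetic code, built in TCAG order ("*" marks a stop codon).
BASES = "TCAG"
AMINO = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
         "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def locate(nt_position: int):
    """Map a 1-based nucleotide position to (codon number, position in codon)."""
    return (nt_position - 1) // 3 + 1, (nt_position - 1) % 3 + 1


def consistent_codons(wild_aa, mutant_aa, pos_in_codon, old_base, new_base):
    """List (wild-type codon, mutant codon) pairs that explain a substitution."""
    hits = []
    for codon, aa in CODON_TABLE.items():
        if aa != wild_aa or codon[pos_in_codon - 1] != old_base:
            continue
        mutated = codon[:pos_in_codon - 1] + new_base + codon[pos_in_codon:]
        if CODON_TABLE[mutated] == mutant_aa:
            hits.append((codon, mutated))
    return hits


# (nucleotide, old base, new base, wild-type residue, mutant residue)
for nt, old, new, wt, mut in ((316, "G", "A", "E", "K"),
                              (515, "T", "A", "L", "H"),
                              (908, "G", "A", "G", "E")):
    codon_number, pos = locate(nt)
    print(nt, codon_number, pos, consistent_codons(wt, mut, pos, old, new))
```

Under these assumptions, nucleotide 316 falls at the first position of codon 106 (the Glu of HECGH at amino acids 105 to 109), nucleotide 515 at the second position of codon 172 (the Leu at position 172 in Table 4), and nucleotide 908 at the second position of codon 303 (the Gly at amino acid 303).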
TABLE 1
Alignment of Amino Acid Sequences from Microsomal Delta-12 Fatty Acid Desaturases
Species                 Position   Amino Acid Sequence
Arabidopsis thaliana    100-129    I VIAHECGH HAFSDYQ LD DTVG IFHSF
Glycine max             96-125     VWVIAHECGH HAFSKYQWVD DWGLTLHST
Zea mays                106-135    V VIAHECGH HAFSDYS LD DWGLVLHSS
Ricinus communis*       1-29       VMAHDCGH HAFSDYQLLD DWGLILHSC
Brassica napus D        100-128    VWVIAHECGH HAFSDYQWLD DTVGLIFHS
Brassica napus F        100-128    VWVIAHECGH HAFSDYQWLD DTVGLIFHS
* from plasmid pRF2-1C
TABLE 2
Alignment of Amino Acid Sequences from Microsomal Delta-12 Fatty Acid Desaturases
Species                 Position   Amino Acid Sequence
Arabidopsis thaliana    130-158    LLVPYFSWKY SHRRHHSNTG SLERDEVFV
Glycine max             126-154    LLVPYFSWKI SHRRHHSNTG SLDRDEVFV
Zea mays                136-164    LMVPYFSWKY SHRRHHSNTG SLERDEVFV
Ricinus communis*       30-58      LLVPYFSWKH SHRRHHSNTG SLERDEVFV
Brassica napus D        130-158    LLVPYFSWKY SHRRHHSNTG SLERDEVFV
Brassica napus F        130-158    LLVPYFSWKY SHRRHHSNTG SLERDEVFV
* from plasmid pRF2-1C
TABLE 3
Alignment of Amino Acid Sequences from Microsomal Delta-12 Fatty Acid Desaturases
Species                 Position   Amino Acid Sequence
Arabidopsis thaliana    298-333    DRDYGILNKV FHNITDTHVA HHLFSTMPHY NAMEAT
Glycine max             294-329    DRDYGILNKV FHHITDTHVA HHLFSTMPHY HAMEAT
Zea mays                305-340    DRDYGILNRV FHNITDTHVA HHLFSTMPHY HAMEAT
Ricinus communis*       198-224    DRDYGILNKV FHNITDTQVA HHLF TMP
Brassica napus D        299-334    DRDYGILNKV FHNITDTHVA HHLFSTMPHY HAMEAT
Brassica napus F        299-334    DRDYGILNKV FHNITDTHVA HHLFSTMPHY HAMEAT
* from plasmid pRF2-1C
TABLE 4
Alignment of Conserved Amino Acids from Microsomal Delta-12 Fatty Acid Desaturases
Species                 Position   Amino Acid Sequence
Arabidopsis thaliana    165-180    IKWYGKYLNN PLGRIM
Glycine max             161-176    VAWFSLYLNN PLGRAV
Zea mays                172-187    PWYTPYVYNN PVGRW
Ricinus communis*       65-80      IRWYSKYLNN PPGRIM
Brassica napus D        165-180    IKWYGKYLNN PLGRTV
Brassica napus F        165-180    IKWYGKYLNN PLGRTV
* from plasmid pRF2-1C
TABLE 5
Alignment of Conserved Amino Acids from Plastid and Microsomal Delta-15 Fatty Acid Desaturases
Species                 Position   Amino Acid Sequence
Arabidopsis thaliana^a  156-177    WALFVLGHD CGHGSFSNDP KLN
Brassica napus^a        114-135    WALFVLGHD CGHGSFSNDP RLN
Glycine max             164-185    WALFVLGHD CGHGSFSNNS KLN
Arabidopsis thaliana    94-115     WAIFVLGHD CGHGSFSDIP LLN
Brassica napus          87-109     WALFVLGHD CGHGSFSNDP RLN
Glycine max             93-114     WALFVLGHD CGHGSFSDSP PLN
^a Plastid sequences
TABLE 6
Alignment of Conserved Amino Acids from Plastid and Microsomal Delta-15 Fatty Acid Desaturases
Species                 Position   Amino Acid Sequence
A. thaliana             188-216    ILVPYHGWRI SHRTHHQNHG HVENDESWH
B. napus^a              146-174    ILVPYHGWRI SHRTHHQNHG HVENDESWH
Glycine max             196-224    ILVPYHGWRI SHRTHHQHHG HAENDESWH
A. thaliana             126-154    ILVPYHGWRI SHRTHHQNHG HVENDESWV
Brassica napus          117-145    ILVPYHGWRI SHRTHHQNHG HVENDESWV
Glycine max             125-153    ILVPYHGWRI SHRTHHQNHG HIEKDESWV
^a Plastid sequences
The conservation of amino acid motifs and their relative positions indicates that regions of a delta-12 or delta-15 fatty acid desaturase that can be mutated in one species to generate a non-functional desaturase can be mutated in the corresponding region from other species to generate a non-functional delta-12 desaturase or delta-15 desaturase gene product in that species.
Mutations in any of the regions of Tables 1-6 are specifically included within the scope of the invention and are substantially identical to those mutations exemplified herein, provided that such mutation (or mutations) renders the resulting desaturase gene product non-functional, as discussed hereinabove.
A nucleic acid fragment containing a mutant sequence can be generated by techniques known to the skilled artisan. Such techniques include, without limitation, site-directed mutagenesis of wild-type sequences and direct synthesis using automated DNA synthesizers.
A nucleic acid fragment containing a mutant sequence can also be generated by mutagenesis of plant seeds or regenerable plant tissue by, e.g., ethyl methane sulfonate, X-rays or other mutagens. With mutagenesis, mutant plants having the desired fatty acid phenotype in seeds are identified by known techniques and a nucleic acid fragment containing the desired mutation is isolated from genomic DNA or RNA of the mutant line. The site of the specific mutation is then determined by sequencing the coding region of the delta-12 desaturase or delta-15 desaturase gene. Alternatively, labeled nucleic acid probes that are specific for desired mutational events can be used to rapidly screen a mutagenized population.
The disclosed method may be applied to all oilseed Brassica species, and to both Spring and Winter maturing types within each species. Physical mutagens, including but not limited to X-rays, UV rays, and other physical treatments which cause chromosome damage, and other chemical mutagens, including but not limited to ethidium bromide, nitrosoguanidine, diepoxybutane etc. may also be used to induce mutations. The mutagenesis treatment may also be applied to other stages of plant development, including but not limited to cell cultures, embryos, microspores and shoot apices.
"Stable mutations" as used herein are defined as M5 or more advanced lines which maintain a selected altered fatty acid profile for a minimum of three generations, including a minimum of two generations under field conditions, and exceeding established statistical thresholds for a minimum of two generations, as determined by gas chromatographic analysis of a minimum of 10 randomly selected seeds bulked together. Alternatively, stability may be measured in the same way by comparing to subsequent generations. In subsequent generations, stability is defined as having similar fatty acid profiles in the seed as that of the prior or subsequent generation when grown under substantially similar conditions.
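The generation-to-generation comparison described above can be sketched as a simple check. This is an illustration under stated assumptions, not the statistical procedure used by Applicants: it takes the 5% tolerance from the earlier definition of "stability" and reads it as a relative change between consecutive generations by default, with an absolute reading in percentage points available through a flag. The function name, the parameters and the sample values are hypothetical.

```python
from typing import Sequence


def is_stable(levels: Sequence[float], tolerance: float = 5.0,
              relative: bool = True, min_generations: int = 3) -> bool:
    """Return True if a fatty acid level is maintained across generations.

    levels: percent by weight of one fatty acid in the seed oil, one value
        per consecutive generation grown under similar conditions.
    tolerance: allowed change between consecutive generations, in percent
        of the previous value (relative=True) or in percentage points.
    min_generations: number of generations required before a line is judged.
    """
    if min_generations > len(levels):
        return False
    for previous, current in zip(levels, levels[1:]):
        change = abs(current - previous)
        if relative:
            if previous == 0:
                change = 0.0 if current == 0 else float("inf")
            else:
                change = 100.0 * change / previous
        if change > tolerance:
            return False
    return True


# Hypothetical oleic acid levels (% of oil) in three successive generations.
print(is_stable([77.0, 78.5, 76.9]))
print(is_stable([77.0, 70.0, 76.9]))
```

The first call prints True because each step changes by about 2% of the previous value, and the second prints False because the first step is a drop of about 9%.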
Mutation breeding has traditionally produced plants carrying, in addition to the trait of interest, multiple, deleterious traits, e.g., reduced plant vigor and reduced fertility. Such traits may indirectly affect fatty acid composition, producing an unstable mutation; and/or reduce yield, thereby reducing the commercial utility of the invention. To eliminate the occurrence of deleterious mutations and reduce the load of mutations carried by the plant, a low mutagen dose is used in the seed treatments to create an LD30 population. This allows for the rapid selection of single gene mutations for fatty acid traits in agronomic backgrounds which produce acceptable yields.
The seeds of several different plant lines have been deposited with the American Type Culture Collection and have the following accession numbers.
Line Accession No. Deposit Date
A129.5 40811 May 25, 1990
A133.1 40812 May 25, 1990
M3032.1 75021 June 7, 1991
M3062.8 75025 June 7, 1991
M3028.10 75026 June 7, 1991
IMC130 75446 April 16, 1993
Q4275 97569 May 10, 1996
In some plant species or varieties more than one form of endogenous microsomal delta-12 desaturase may be found. In amphidiploids, each form may be derived from one of the parent genomes making up the species under consideration. Plants with mutations in both forms have a fatty acid profile that differs from plants with a mutation in only one form. An example of such a plant is Brassica napus line Q508, a doubly-mutagenized line containing a mutant D-form of delta-12 desaturase (SEQ ID NO: 11) and a mutant F-form of delta-12 desaturase (SEQ ID NO: 15). Another example is line Q4275, which contains a mutant D-form of delta-12 desaturase (SEQ ID NO: 11) and a mutant F-form of delta-12 desaturase (SEQ ID NO: 17). See Figs. 2-3.
Preferred host or recipient organisms for introduction of a nucleic acid fragment of the invention are the oil-producing species, such as soybean (Glycine max), rapeseed (e.g., Brassica napus, B. rapa and B. juncea), sunflower (Helianthus annuus), castor bean (Ricinus communis), corn (Zea mays), and safflower (Carthamus tinctorius).
A nucleic acid fragment of the invention may further comprise additional nucleic acids. For example, a nucleic acid encoding a secretory or leader amino acid sequence can be linked to a mutant desaturase nucleic acid fragment such that the secretory or leader sequence is fused in-frame to the amino terminal end of a mutant delta-12 or delta-15 desaturase polypeptide. Other nucleic acid fragments are known in the art that encode amino acid sequences useful for fusing in-frame to the mutant desaturase polypeptides disclosed herein. See, e.g., U.S. 5,629,193 incorporated herein by reference. A nucleic acid fragment may also have one or more regulatory elements operably linked thereto.
The present invention also comprises nucleic acid fragments that selectively hybridize to mutant desaturase sequences. Such a nucleic acid fragment typically is at least 15 nucleotides in length. Hybridization typically involves Southern analysis (Southern blotting), a method by which the presence of DNA sequences in a target nucleic acid mixture is identified by hybridization to a labeled oligonucleotide or DNA fragment probe. Southern analysis typically involves electrophoretic separation of DNA digests on agarose gels, denaturation of the DNA after electrophoretic separation, and transfer of the DNA to nitrocellulose, nylon, or another suitable membrane support for analysis with a radiolabeled, biotinylated, or enzyme-labeled probe as described in sections 9.37-9.52 of Sambrook et al., (1989) Molecular Cloning, second edition, Cold Spring Harbor Laboratory, Plainview, NY.
A nucleic acid fragment can hybridize under moderate stringency conditions or, preferably, under high stringency conditions to a mutant desaturase sequence. High stringency conditions are used to identify nucleic acids that have a high degree of homology to the probe. High stringency conditions can include the use of low ionic strength and high temperature for washing, for example, 0.015 M NaCl/0.0015 M sodium citrate (0.1X SSC); 0.1% sodium lauryl sulfate (SDS) at 50-65°C. Alternatively, a denaturing agent such as formamide can be employed during hybridization, e.g., 50% formamide with 0.1% bovine serum albumin/0.1% Ficoll/0.1% polyvinylpyrrolidone/50 mM sodium phosphate buffer at pH 6.5 with 750 mM NaCl, 75 mM sodium citrate at 42°C. Another example is the use of 50% formamide, 5X SSC (0.75 M NaCl, 0.075 M sodium citrate), 50 mM sodium phosphate (pH 6.8), 0.1% sodium pyrophosphate, 5X Denhardt's solution, sonicated salmon sperm DNA (50 μg/ml), 0.1% SDS, and 10% dextran sulfate at 42°C, with washes at 42°C in 0.2X SSC and 0.1% SDS.
Moderate stringency conditions refers to hybridization conditions used to identify nucleic acids that have a lower degree of identity to the probe than do nucleic acids identified under high stringency conditions. Moderate stringency conditions can include the use of higher ionic strength and/or lower temperatures for washing of the hybridization membrane, compared to the ionic strength and temperatures used for high stringency hybridization. For example, a wash solution comprising 0.60 M NaCl/0.060 M sodium citrate (4X SSC) and 0.1% sodium lauryl sulfate (SDS) can be used at 50°C, with a last wash in 1X SSC at 65°C. Alternatively, a hybridization wash in 1X SSC at 37°C can be used.
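The SSC strengths quoted for the wash and hybridization solutions all scale linearly from one stock composition. The following is a minimal sketch, assuming 1X SSC is 0.15 M NaCl and 0.015 M sodium citrate (the value implied by the 5X SSC and 0.1X SSC figures given above); the function name is illustrative only.

```python
# Assumed 1X SSC composition, inferred from the 5X and 0.1X figures in the text.
NACL_1X_M = 0.15
CITRATE_1X_M = 0.015


def ssc_molarity(fold):
    """Return (NaCl M, sodium citrate M) for a given fold-strength of SSC."""
    return (round(NACL_1X_M * fold, 6), round(CITRATE_1X_M * fold, 6))


# Strengths mentioned for high and moderate stringency conditions.
for fold in (0.1, 0.2, 1, 4, 5):
    nacl, citrate = ssc_molarity(fold)
    print(f"{fold}X SSC: {nacl} M NaCl, {citrate} M sodium citrate")
```

Lower fold values correspond to lower ionic strength and therefore to more stringent washes at a given temperature.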
Hybridization can also be done by Northern analysis (Northern blotting), a method used to identify RNAs that hybridize to a known probe such as an oligonucleotide, DNA fragment, cDNA or fragment thereof, or RNA fragment. The probe is labeled with a radioisotope such as 32P, by biotinylation or with an enzyme. The RNA to be analyzed is usually electrophoretically separated on an agarose or polyacrylamide gel, transferred to nitrocellulose, nylon, or other suitable membrane, and hybridized with the probe, using standard techniques well known in the art such as those described in sections 7.39-7.52 of Sambrook et al., supra.
A polypeptide of the invention comprises an isolated polypeptide having a mutant amino acid sequence, as well as derivatives and analogs thereof. See, e.g., the mutant amino acid sequences of Fig. 3. By "isolated" is meant a polypeptide that is expressed and produced in an environment other than the environment in which the polypeptide is naturally expressed and produced. For example, a plant polypeptide is isolated when expressed and produced in bacteria or fungi. A polypeptide of the invention also comprises variants of the mutant desaturase polypeptides disclosed herein, as discussed above .
In one embodiment of the claimed invention, a plant contains both a delta-12 desaturase mutation and a delta-15 desaturase mutation. Such plants can have a fatty acid composition comprising very high oleic acid and very low alpha-linolenic acid levels. Mutations in delta-12 desaturase and delta-15 desaturase may be combined in a plant by making a genetic cross between delta-12 desaturase and delta-15 desaturase single mutant lines. A plant having a mutation in delta-12 fatty acid desaturase is crossed or mated with a second plant having a mutation in delta-15 fatty acid desaturase. Seeds produced from the cross are planted and the resulting plants are selfed in order to obtain progeny seeds. These progeny seeds are then screened in order to identify those seeds carrying both mutant genes.
Alternatively, a line possessing either a delta-12 desaturase or a delta-15 desaturase mutation can be subjected to mutagenesis to generate a plant or plant line having mutations in both delta-12 desaturase and delta-15 desaturase. For example, the IMC 129 line has a mutation in the coding region (Glu106 to Lys106) of the D form of the microsomal delta-12 desaturase structural gene. Cells (e.g., seeds) of this line can be mutagenized to induce a mutation in a delta-15 desaturase gene, resulting in a plant or plant line carrying a mutation in a delta-12 fatty acid desaturase gene and a mutation in a delta-15 fatty acid desaturase gene.
Progeny includes descendants of a particular plant or plant line, e.g., seeds developed on an instant plant are descendants. Progeny of an instant plant include seeds formed on F1, F2, F3, and subsequent generation plants, or seeds formed on BC1, BC2, BC3 and subsequent generation plants.
Plants according to the invention preferably contain an altered fatty acid composition. For example, oil obtained from seeds of such plants may have from about 69 to about 90% oleic acid, based on the total fatty acid composition of the seed. Such oil preferably has from about 74 to about 90% oleic acid, more preferably from about 80 to about 90% oleic acid. In some embodiments, oil obtained from seeds produced by plants of the invention may have from about 2.0% to about 5.0% saturated fatty acids, based on total fatty acid composition of the seeds. In some embodiments, oil obtained from seeds of the invention may have from about 1.0% to about 14.0% linoleic acid, or from about 0.5% to about 10.0% α-linolenic acid. Oil composition typically is analyzed by crushing and extracting fatty acids from bulk seed samples (e.g., 10 seeds). Fatty acid triglycerides in the seed are hydrolyzed and converted to fatty acid methyl esters. Those seeds having an altered fatty acid composition may be identified by techniques known to the skilled artisan, e.g., gas-liquid chromatography (GLC) analysis of a bulked seed sample or of a single half-seed. Half-seed analysis is well known in the art to be useful because the viability of the embryo is maintained and thus those seeds having a desired fatty acid profile may be planted to form the next generation. However, half-seed analysis is also known to be an inaccurate representation of the genotype of the seed being analyzed. Bulk seed analysis typically yields a more accurate representation of the fatty acid profile of a given genotype. Fatty acid composition can also be determined on larger samples, e.g., oil obtained by pilot plant or commercial scale refining, bleaching and deodorizing of endogenous oil in the seeds.
The nucleic acid fragments of the invention can be used as markers in plant genetic mapping and plant breeding programs. Such markers may include restriction fragment length polymorphism (RFLP), random amplification polymorphism detection (RAPD), polymerase chain reaction (PCR) or self-sustained sequence replication (3SR) markers, for example. Marker-assisted breeding techniques may be used to identify and follow a desired fatty acid composition during the breeding process. Marker-assisted breeding techniques may be used in addition to, or as an alternative to, other sorts of identification techniques. An example of marker-assisted breeding is the use of PCR primers that specifically amplify a sequence containing a desired mutation in delta-12 desaturase or delta-15 desaturase.
Methods according to the invention are useful in that the resulting plants and plant lines have desirable seed fatty acid compositions as well as superior agronomic properties compared to known lines having altered seed fatty acid composition. Superior agronomic characteristics include, for example, increased seed germination percentage, increased seedling vigor, increased resistance to seedling fungal diseases (damping off, root rot and the like), increased yield, and improved standability.
While the invention is susceptible to various modifications and alternative forms, certain specific embodiments thereof are described in the general methods and examples set forth below. For example, the invention may be applied to all Brassica species, including B. rapa, B. juncea, and B. hirta, to produce substantially similar results. It should be understood, however, that these examples are not intended to limit the invention to the particular forms disclosed but, instead, the invention is to cover all modifications, equivalents and alternatives falling within the scope of the invention. This includes the use of somaclonal variation; physical or chemical mutagenesis of plant parts; anther, microspore or ovary culture followed by chromosome doubling; or self- or cross-pollination to transmit the fatty acid trait, alone or in combination with other traits, to develop new Brassica lines.
EXAMPLE 1
Mutagenesis
Seeds of Westar, a Canadian (Brassica napus) spring canola variety, were subjected to chemical mutagenesis. Westar is a registered Canadian spring variety with canola quality. The fatty acid composition of field-grown Westar, 3.9% C16:0, 1.9% C18:0, 67.5% C18:1, 17.6% C18:2, 7.4% C18:3, <2% C20:1 + C22:1, has remained stable under commercial production, with < ±10% deviation, since 1982.
Prior to mutagenesis, 30,000 seeds of B. napus cv. Westar were preimbibed in 300-seed lots for two hours on wet filter paper to soften the seed coat. The preimbibed seeds were placed in 80 mM ethylmethanesulfonate (EMS) for four hours. Following mutagenesis, the seeds were rinsed three times in distilled water. The seeds were sown in 48-well flats containing Pro-Mix. Sixty-eight percent of the mutagenized seed germinated. The plants were maintained at 25°C/15°C, 14/10 hr day/night conditions in the greenhouse. At flowering, each plant was individually self-pollinated.
M2 seed from individual plants was individually catalogued and stored; approximately 15,000 M2 lines were planted in a summer nursery in Carman, Manitoba. The seed from each selfed plant was planted in 3-meter rows with 6-inch row spacing. Westar was planted as the check variety. Selected lines in the field were selfed by bagging the main raceme of each plant. At maturity, the selfed plants were individually harvested and seeds were catalogued and stored to ensure that the source of the seed was known.
Self-pollinated M3 seed and Westar controls were analyzed in 10-seed bulk samples for fatty acid composition via gas chromatography. Statistical thresholds for each fatty acid component were established using a Z-distribution with a stringency level of 1 in 10,000. Mean and standard deviation values were determined from the non-mutagenized Westar control population in the field. The upper and lower statistical thresholds for each fatty acid were determined from the mean value of the population ± the standard deviation multiplied by the Z-distribution value. Based on a population size of 10,000, the confidence interval is 99.99%.
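The threshold calculation described above (mean ± Z × standard deviation) can be sketched as follows. This is an illustration only: the control values are hypothetical, and treating the 1 in 10,000 rejection rate as a one-tailed probability is an assumption, since the text does not say whether the rate is split between the two tails.

```python
from statistics import NormalDist, mean, stdev


def z_thresholds(control_values, rejection_rate, two_sided=False):
    """Return (lower, upper, z) thresholds as mean -/+ z * standard deviation."""
    # Assumption: one-tailed unless two_sided=True, in which case the
    # rejection rate is divided equally between the two tails.
    tail = rejection_rate / 2 if two_sided else rejection_rate
    z = NormalDist().inv_cdf(1 - tail)
    m = mean(control_values)
    s = stdev(control_values)
    return m - z * s, m + z * s, z


# Hypothetical C18:1 percentages for non-mutagenized control samples.
control_c18_1 = [66.9, 67.8, 67.2, 68.1, 67.5, 66.7, 67.9, 67.4]
lower, upper, z = z_thresholds(control_c18_1, 1 / 10_000)
print(f"z = {z:.3f}, lower = {lower:.1f}%, upper = {upper:.1f}%")
```

A line would be selected when its measured value falls outside the interval, for example above the upper threshold for oleic acid.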
The selected M3 seeds were planted in the greenhouse along with Westar controls. The seed was sown in 4-inch pots containing Pro-Mix soil and the plants were maintained at 25°C/15°C, 14/10 hr day/night cycle in the greenhouse. At flowering, the terminal raceme was self-pollinated by bagging. At maturity, selfed M4 seed was individually harvested from each plant, labelled, and stored to ensure that the source of the seed was known.
The M4 seed was analyzed in 10-seed bulk samples. Statistical thresholds for each fatty acid component were established from 259 control samples using a Z-distribution of 1 in 800. Selected M4 lines were planted in a field trial in Carman, Manitoba in 3-meter rows with 6-inch spacing. Ten M4 plants in each row were bagged for self-pollination. At maturity, the selfed plants were individually harvested and the open pollinated plants in the row were bulk harvested. The M5 seed from single plant selections was analyzed in 10-seed bulk samples and the bulk row harvest in 50-seed bulk samples.
Selected M5 lines were planted in the greenhouse along with Westar controls. The seed was grown as previously described. At flowering the terminal raceme was self-pollinated by bagging. At maturity, selfed M6 seed was individually harvested from each plant and analyzed in 10-seed bulk samples for fatty acid composition.
Selected M6 lines were entered into field trials in Eastern Idaho. The four trial locations were selected for the wide variability in growing conditions. The locations included Burley, Tetonia, Lamont and Shelley (Table 7). The lines were planted in four 3-meter rows with an 8-inch spacing; each plot was replicated four times. The planting design was a Randomized Complete Block Design. The commercial cultivar Westar was used as a check cultivar. At maturity the plots were harvested to determine yield. Yield of the entries in the trial was determined by taking the statistical average of the four replications. The Least Significant Difference Test was used to rank the entries in the randomized complete block design.
TABLE 7
Trial Locations for Selected Fatty Acid Mutants
LOCATION   SITE CHARACTERIZATIONS
BURLEY     Irrigated. Long season. High temperatures during flowering.
TETONIA    Dryland. Short season. Cool temperatures.
LAMONT     Dryland. Short season. Cool temperatures.
SHELLEY    Irrigated. Medium season. High temperatures during flowering.
To determine the fatty acid profile of entries, plants in each plot were bagged for self-pollination. The M7 seed from single plants was analyzed for fatty acids in ten-seed bulk samples.
To determine the genetic relationships of the selected fatty acid mutants, crosses were made. Flowers of M6 or later generation mutations were used in crossing. F1 seed was harvested and analyzed for fatty acid composition to determine the mode of gene action. The F1 progeny were planted in the greenhouse. The resulting plants were self-pollinated, and the F2 seed was harvested and analyzed for fatty acid composition for allelism studies. The F2 seed and parent line seed were planted in the greenhouse, and individual plants were self-pollinated. The F3 seed of individual plants was tested for fatty acid composition using 10-seed bulk samples as described previously.
In the analysis of some genetic relationships, dihaploid populations were made from the microspores of the F1 hybrids. Self-pollinated seed from dihaploid plants was analyzed for fatty acid composition using methods described previously.
For chemical analysis, 10-seed bulk samples were hand ground with a glass rod in a 15-mL polypropylene tube and extracted in 1.2 mL 0.25 N KOH in 1:1 ether/methanol. The sample was vortexed for 30 sec. and heated for 60 sec. in a 60°C water bath. Four mL of saturated NaCl and 2.4 mL of iso-octane were added, and the mixture was vortexed again. After phase separation, 600 μL of the upper organic phase were pipetted into individual vials and stored under nitrogen at -5°C. One μL samples were injected into a Supelco SP-2330 fused silica capillary column (0.25 mm ID, 30 m length, 0.20 μm df).
The gas chromatograph was set at 180°C for 5.5 minutes, then programmed for a 2°C/minute increase to 212°C, and held at this temperature for 1.5 minutes. Total run time was 23 minutes. Chromatography settings were: Column head pressure - 15 psi, Column flow (He) - 0.7 mL/min., Auxiliary and Column flow - 33 mL/min., Hydrogen flow - 33 mL/min., Air flow - 400 mL/min., Injector temperature - 250°C, Detector temperature - 300°C, Split vent - 1/15.
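As a quick consistency check on the oven program, the stated total run time follows from the initial hold, the ramp duration and the final hold. The sketch below uses only the figures given above; the function name is illustrative.

```python
def run_time_minutes(initial_hold, start_temp, ramp_rate, final_temp, final_hold):
    """Total oven program time: initial hold + linear ramp + final hold."""
    ramp = (final_temp - start_temp) / ramp_rate  # minutes spent ramping
    return initial_hold + ramp + final_hold


# 180 C for 5.5 min, 2 C/min to 212 C, hold 1.5 min.
total = run_time_minutes(5.5, 180, 2, 212, 1.5)
print(total)  # 23.0
```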
Table 8 describes the upper and lower statistical thresholds for each fatty acid of interest.
TABLE 8
Statistical Thresholds for Specific Fatty Acids Derived from Control Westar Plantings
                     Percent Fatty Acids
Genotype   C16:0   C18:0   C18:1   C18:2   C18:3   Sats*
M3 Generation (1 in 10,000 rejection rate)
Lower      3.3     1.4     --      13.2    5.3     6.0
Upper      4.3     2.5     71.0    21.6    9.9     8.3
M4 Generation (1 in 800 rejection rate)
Lower      3.6     0.8     --      12.2    3.2     5.3
Upper      6.3     3.1     76.0    32.4    9.9     11.2
M5 Generation (1 in 755 rejection rate)
Lower      2.7     0.9     --      9.6     2.6     4.5
Upper      5.7     2.7     80.3    26.7    9.6     10.0
*Sats=Total Saturate Content
EXAMPLE 2
High Oleic Acid Canola Lines
In the studies of Example 1, at the M3 generation, 31 lines exceeded the upper statistical threshold for oleic acid (> 71.0%). Line W7608.3 had 71.2% oleic acid. At the M4 generation, its selfed progeny (W7608.3.5, since designated A129.5) continued to exceed the upper statistical threshold for C18:1 with 78.8% oleic acid. M5 seed of five self-pollinated plants of line A129.5 (ATCC 40811) averaged 75.0% oleic acid. A single plant selection, A129.5.3, had 75.6% oleic acid. The fatty acid composition of this high oleic acid mutant, which was stable under both field and greenhouse conditions to the M7 generation, is summarized in Table 9. This line also stably maintained its mutant fatty acid composition to the M7 generation in field trials in multiple locations.
Over all locations the self-pollinated plants (A129) averaged 78.3% oleic acid. The fatty acid compositions of A129 for each Idaho trial location are summarized in Table 10. In multiple location replicated yield trials, A129 was not significantly different in yield from the parent cultivar Westar.
The canola oil of A129, after commercial processing, was found to have superior oxidative stability compared to Westar when measured by the Active Oxygen Method (AOM), American Oil Chemists' Society Official Method Cd 12-57 for fat stability (revised 1989). The AOM of Westar was 18 AOM hours and for A129 was 30 AOM hours.
TABLE 9
Fatty Acid Composition of a High Oleic Acid Canola Line Produced by Seed Mutagenesis
                          Percent Fatty Acids
Genotype          C16:0   C18:0   C18:1   C18:2   C18:3   Sats
Westar            3.9     1.9     67.5    17.6    7.4     7.0
W7608.3 (M3)      3.9     2.4     71.2    12.7    6.1     7.6
W7608.3.5 (M4)    3.9     2.0     78.8    7.7     3.9     7.3
A129.5.3 (M5)     3.8     2.3     75.6    9.5     4.9     7.6
Sats=Total Saturate Content
TABLE 10
Fatty Acid Composition of a Mutant High Oleic Acid Line at Different Field Locations in Idaho
                   Percent Fatty Acids
Location   C16:0   C18:0   C18:1   C18:2   C18:3   Sats
Burley     3.3     2.1     77.5    8.1     6.0     6.5
Tetonia    3.5     3.4     77.8    6.5     4.7     8.5
Lamont     3.4     1.9     77.8    7.4     6.5     6.3
Shelley    3.3     2.6     80.0    5.7     4.5     7.7
Sats=Total Saturate Content
The genetic relationship of the high oleic acid mutation A129 to other oleic desaturases was demonstrated in crosses made to commercial canola cultivars and a low linolenic acid mutation. A129 was crossed to the commercial cultivar Global (C16:0 - 4.5%, C18:0 - 1.5%, C18:1 - 62.9%, C18:2 - 20.0%, C18:3 - 7.3%). Approximately 200 F2 individuals were analyzed for fatty acid composition. The results are summarized in Table 11. The segregation fit a 1:2:1 ratio, suggesting that a single co-dominant gene controlled the inheritance of the high oleic acid phenotype.
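A goodness-of-fit test of the kind implied by the 1:2:1 segregation can be sketched as below, using the observed F2 counts reported in Table 11 (43, 106 and 49). Two points are assumptions rather than details from the text: the expected counts are derived here from the observed total, so they differ slightly from the Expected column of Table 11, and the 5% critical value for two degrees of freedom (5.991) is used as the decision threshold.

```python
def chi_square(observed, ratio):
    """Chi-square statistic for observed counts against an expected ratio."""
    total = sum(observed)
    parts = sum(ratio)
    expected = [total * r / parts for r in ratio]
    stat = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return stat, expected


# Observed counts for od-od-, od-od+ and od+od+ from Table 11.
observed = [43, 106, 49]
stat, expected = chi_square(observed, [1, 2, 1])

CRITICAL_DF2_P05 = 5.991  # standard chi-square critical value, df = 2, p = 0.05
print(f"chi-square = {stat:.3f}, expected = {expected}")
print("consistent with 1:2:1" if stat < CRITICAL_DF2_P05 else "deviates from 1:2:1")
```

A statistic well below the critical value means the data give no reason to reject single-gene, co-dominant inheritance.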
TABLE 11
Genetic Studies of A129 X Global
Genotype   Content (%)   Observed   Expected
od-od-     77.3          43         47
od-od+     71.7          106        94
od+od+     66.1          49         47
A cross between A129 and IMC 01, a low linolenic acid variety (C16:0 - 4.1%, C18:0 - 1.9%, C18:1 - 66.4%, C18:2 - 18.1%, C18:3 - 5.7%), was made to determine the inheritance of the oleic acid desaturase and linoleic acid desaturase. In the F1 hybrids both the oleic acid and linoleic acid desaturase genes approached the mid-parent values, indicating co-dominant gene action. Fatty acid analysis of the F2 individuals confirmed a 1:2:1:2:4:2:1:2:1 segregation of two independent, co-dominant genes (Table 12). A line was selected from the cross of A129 and IMC 01 and designated as IMC 130 (ATCC deposit no. 75446) as described in U.S. Patent Application No. 08/425,108, incorporated herein by reference.
TABLE 12
Genetic Studies of A129 X IMC 01
                          Frequency
Genotype        Ratio   Observed   Expected
od-od-ld-ld-    1       11         12
od-od-ld-ld+    2       30         24
od-od-ld+ld+    1       10         12
od-od+ld-ld-    2       25         24
od-od+ld-ld+    4       54         47
od-od+ld+ld+    2       18         24
od+od+ld-ld-    1       7          12
od+od+ld-ld+    2       25         24
od+od+ld+ld+    1       12
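The nine-class 1:2:1:2:4:2:1:2:1 ratio follows directly from two independently segregating co-dominant genes. The sketch below enumerates the sixteen gamete combinations of a double heterozygote and counts each genotype class. The population size of 188 in the usage lines is an assumption: it is not stated in the text and was chosen only because it reproduces the expected count of 47 for the double heterozygote in Table 12.

```python
from collections import Counter
from itertools import product


def two_gene_f2_classes():
    """Count F2 genotype classes as (mutant od alleles, mutant ld alleles)."""
    gametes = list(product("-+", repeat=2))  # (od allele, ld allele)
    counts = Counter()
    for egg, pollen in product(gametes, repeat=2):
        od = (egg[0] + pollen[0]).count("-")
        ld = (egg[1] + pollen[1]).count("-")
        counts[(od, ld)] += 1
    return counts


def expected_counts(n_plants):
    """Expected number of plants per genotype class for n_plants F2 individuals."""
    return {k: n_plants * v / 16 for k, v in two_gene_f2_classes().items()}


classes = two_gene_f2_classes()
print(sorted(classes.items(), reverse=True))
print(expected_counts(188)[(1, 1)])  # double heterozygote, assumed N = 188
```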
An additional high oleic acid line, designated A128.3, was also produced by the disclosed method. A 50-seed bulk analysis of this line showed the following fatty acid composition: C16:0 - 3.5%, C18:0 - 1.8%, C18:1 - 77.3%, C18:2 - 9.0%, C18:3 - 5.6%, FDA Sats - 5.3%, Total Sats - 6.4%. This line also stably maintained its mutant fatty acid composition to the M7 generation. In multiple location replicated yield trials, A128 was not significantly different in yield from the parent cultivar Westar.
A129 was crossed to A128.3 for allelism studies. Fatty acid composition of the F2 seed showed the two lines to be allelic. The mutational events in A129 and A128.3 although different in origin were in the same gene.
An additional high oleic acid line, designated M3028.10 (ATCC 75026), was also produced by the disclosed method in Example 1. A 10-seed bulk analysis of this line showed the following fatty acid composition: C16:0 - 3.5%, C18:0 - 1.8%, C18:1 - 77.3%, C18:2 - 9.0%, C18:3 - 5.6%, FDA Saturates - 5.3%, Total Saturates - 6.4%. In a single location replicated yield trial M3028.10 was not significantly different in yield from the parent cultivar Westar.
EXAMPLE 3
Low Linoleic Acid Canola
In the studies of Example 1, at the M3 generation, 80 lines exceeded the lower statistical threshold for linoleic acid (≤ 13.2%). Line W12638.8 had 9.4% linoleic acid. At the M4 and M5 generations, its selfed progenies [W12638.8, since designated A133.1 (ATCC 40812)] continued to exceed the statistical threshold for low C18:2 with linoleic acid levels of 10.2% and 8.4%, respectively. The fatty acid composition of this low linoleic acid mutant, which was stable to the M7 generation under both field and greenhouse conditions, is summarized in Table 13. In multiple location replicated yield trials, A133 was not significantly different in yield from the parent cultivar Westar. An additional low linoleic acid line, designated M3062.8 (ATCC 75025), was also produced by the disclosed method. A 10-seed bulk analysis of this line showed the following fatty acid composition: C16:0 - 3.8%, C18:0 - 2.3%, C18:1 - 77.1%, C18:2 - 8.9%, C18:3 - 4.3%, FDA Sats - 6.1%. This line has also stably maintained its mutant fatty acid composition in the field and greenhouse.
TABLE 13
Fatty Acid Composition of a Low Linoleic Acid Canola Line Produced by Seed Mutagenesis
                           Percent Fatty Acids
Genotype(a)        C16:0   C18:0   C18:1   C18:2   C18:3   Sats(b)
Westar             3.9     1.9     67.5    17.6    7.4     7.0
W12638.8 (M3)      3.9     2.3     75.0    9.4     6.1     7.5
W12638.8.1 (M4)    4.1     1.7     74.6    10.2    5.9     7.1
A133.1.8 (M5)      3.8     2.0     77.7    8.4     5.0     7.0
(a) Letter and numbers up to second decimal point indicate the plant line. Number after second decimal point indicates an individual plant.
(b) Sats=Total Saturate Content
EXAMPLE 4
Low Linolenic and Linoleic Acid Canola
In the studies of Example 1, at the M3 generation, 57 lines exceeded the lower statistical threshold for linolenic acid (< 5.3%). Line W14749.8 had 5.3% linolenic acid and 15.0% linoleic acid. At the M4 and M5 generations, its selfed progenies [W14749.8, since designated M3032 (ATCC 75021)] continued to exceed the statistical threshold for low C18:3 with linolenic acid levels of 2.7% and 2.3%, respectively, and for a low sum of linolenic and linoleic acids with totals of 11.8% and 12.5%, respectively. The fatty acid composition of this low linolenic acid plus linoleic acid mutant, which was stable to the M5 generation under both field and greenhouse conditions, is summarized in Table 14. In a single location replicated yield trial M3032 was not significantly different in yield from the parent cultivar (Westar).
TABLE 14
Fatty Acid Composition of a Low Linolenic Acid Canola Line Produced by Seed Mutagenesis
                         Percent Fatty Acids
Genotype         C16:0   C18:0   C18:1   C18:2   C18:3   Sats
Westar           3.9     1.9     67.5    17.6    7.4     7.0
W14749.8 (M3)    4.0     2.5     69.4    15.0    5.3     6.5
M3032.8 (M4)     3.9     2.4     77.9    9.1     2.7     6.4
M3032.1 (M5)     3.5     2.8     80.0    10.2    2.3     6.5
Sats=Total Saturate Content
EXAMPLE 5 Canola Lines Q508 and Q4275
Seeds of the B. napus line IMC-129 were mutagenized with methyl N-nitrosoguanidine (MNNG). The MNNG treatment consisted of three parts: pre-soak, mutagen application, and wash. A 0.05 M Sorenson's phosphate buffer was used to maintain pre-soak and mutagen treatment pH at 6.1. Two hundred seeds were treated at one time on filter paper (Whatman #3M) in a petri dish (100 mm x 15 mm). The seeds were pre-soaked in 15 ml of 0.05 M Sorenson's buffer, pH 6.1, under continued agitation for two hours. At the end of the pre-soak period, the buffer was removed from the plate.
A 10 mM concentration of MNNG in 0.05 M Sorenson's buffer, pH 6.1, was prepared prior to use. Fifteen ml of 10 mM MNNG was added to the seeds in each plate. The seeds were incubated at 22°C ± 3°C in the dark under constant agitation for four (4) hours. At the end of the incubation period, the mutagen solution was removed.
The seeds were washed with three changes of distilled water at 10 minute intervals. The fourth wash was for thirty minutes. This treatment regime produced an LD60 population.
Treated seeds were planted in standard greenhouse potting soil and placed into an environmentally controlled greenhouse. The plants were grown under sixteen hours of light. At flowering, the racemes were bagged to produce selfed seed. At maturity, the M2 seed was harvested. Each M2 line was given an identifying number. The entire MNNG-treated seed population was designated as the Q series.
Harvested M2 seeds were planted in the greenhouse. The growth conditions were maintained as previously described. The racemes were bagged at flowering for selfing. At maturity, the selfed M3 seed was harvested and analyzed for fatty acid composition. For each M3 seed line, approximately 10-15 seeds were analyzed in bulk as described in Example 1.
High oleic-low linoleic M3 lines were selected from the M3 population using a cutoff of >82% oleic acid and <5.0% linoleic. From the first 1600 M3 lines screened for fatty acid composition, Q508 was identified. The Q508 M3 generation was advanced to the M4 generation in the greenhouse. Table 15 shows the fatty acid composition of Q508 and IMC 129. The M4 selfed seed maintained the selected high oleic-low linoleic acid phenotype (Table 16) .
TABLE 15
Fatty Acid Composition of A129 and High Oleic Acid M3 Mutant Q508
Line #   16:0   18:0   18:1   18:2   18:3
A129*    4.0    2.4    77.7   7.8    4.2
Q508     3.9    2.1    84.9   2.4    2.9
*Fatty acid composition of A129 is the average of 50 self-pollinated plants grown with the M3 population
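The selection rule used to pick high oleic-low linoleic M3 lines (>82% oleic acid and <5.0% linoleic acid) can be written as a simple filter. The sketch below applies it to the two profiles in Table 15; the dictionary layout and function name are illustrative assumptions.

```python
# Oleic (C18:1) and linoleic (C18:2) percentages taken from Table 15.
lines = {
    "A129": {"C18:1": 77.7, "C18:2": 7.8},
    "Q508": {"C18:1": 84.9, "C18:2": 2.4},
}


def passes_cutoff(profile, min_oleic=82.0, max_linoleic=5.0):
    """True when a line exceeds the oleic cutoff and is below the linoleic cutoff."""
    return profile["C18:1"] > min_oleic and profile["C18:2"] < max_linoleic


selected = [name for name, profile in lines.items() if passes_cutoff(profile)]
print(selected)  # ['Q508']
```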
M4 generation Q508 plants had poor agronomic qualities in the field compared to Westar. Typical plants were slow growing relative to Westar, lacked early vegetative vigor, were short in stature, tended to be chlorotic and had short pods. The yield of Q508 was very low compared to Westar.
The M4 generation Q508 plants in the greenhouse tended to be reduced in vigor compared to Westar. However, Q508 yields in the greenhouse were greater than Q508 yields in the field.
TABLE 16
Fatty Acid Composition of Seed Oil from Greenhouse-Grown Q508, IMC 129 and Westar
Data from Example 1
cAverage of 50 self-pollinated plants
Nine other M4 high-oleic low-linoleic lines were also identified: Q3603, Q3733, Q4249, Q6284, Q6601, Q6761, Q7415, Q4275, and Q6676. Some of these lines had good agronomic characteristics and an elevated oleic acid level in seeds of about 80% to about 84%.
Q4275 was crossed to the variety Cyclone. After selfing for seven generations, mature seed was harvested from 93GS34-179, a progeny line of the Q4275 X Cyclone cross. Referring to Table 17, fatty acid composition of a bulk seed sample shows that 93GS34 retained the seed fatty acid composition of Q4275. 93GS34-179 also maintained agronomically desirable characteristics.
After more than seven generations of selfing of Q4275, plants of Q4275, IMC 129 and 93GS34 were field grown during the summer season. The selections were tested in 4 replicated plots (5 feet X 20 feet) in a randomized block design. Plants were open pollinated. No selfed seed was produced. Each plot was harvested at maturity, and a sample of the bulk harvested seed from each line was analyzed for fatty acid composition as described above. The fatty acid compositions of the selected lines are shown in Table 17.
TABLE 17
Fatty Acid Composition of Field Grown IMC 129, Q4275 and 93GS34 Seeds
The results shown in Table 17 show that Q4275 maintained the selected high oleic-low linoleic acid phenotype under field conditions. The agronomic characteristics of Q4275 plants were superior to those of Q508.
M4 generation Q508 plants were crossed to a dihaploid selection of Westar, with Westar serving as the female parent. The resulting F1 seed was termed the 92EF population. About 126 F1 individuals that appeared to have better agronomic characteristics than the Q508 parent were selected for selfing. A portion of the F2 seed from such individuals was replanted in the field.
Each F2 plant was selfed and a portion of the resulting F3 seed was analyzed for fatty acid composition. The content of oleic acid in F3 seed ranged from 59 to 79%. No high oleic (>80%) individuals were recovered with good agronomic type.
A portion of the F2 seed of the 92EF population was planted in the greenhouse to analyze the genetics of the Q508 line. F3 seed was analyzed from 380 F2 individuals. The C18:1 levels of F3 seed from the greenhouse experiment are depicted in Figure 1. The data were tested against the hypothesis that Q508 contains two mutant genes that are semi-dominant and additive: the original IMC 129 mutation as well as one additional mutation. The hypothesis also assumes that homozygous Q508 has greater than 85% oleic acid and homozygous Westar has 62-67% oleic acid. The possible genotypes at each gene in a cross of Q508 by Westar may be designated as:
AA = Westar Fad2a
BB = Westar Fad2b
aa = Q508 Fad2a-
bb = Q508 Fad2b-
Assuming independent segregation, a 1:4:6:4:1 ratio of phenotypes is expected. The phenotypes of heterozygous plants are assumed to be indistinguishable and, thus, the data were tested for fit to a 1:14:1 ratio of homozygous Westar : heterozygous plants : homozygous Q508.
Phenotypic   # of
Ratio        Westar Alleles   Genotype
1            4                AABB (Westar)
4            3                AABb, AaBB, AABb, AaBB
6            2                AaBb, AAbb, AaBb, AaBb, aaBB, AaBb
4            1                Aabb, aaBb, Aabb, aaBb
1            0                aabb (Q508)
Using Chi-square analysis, the oleic acid data fit a 1:14:1 ratio. It was concluded that Q508 differs from Westar by two major genes that are semi-dominant and additive and that segregate independently. By comparison, the genotype of IMC 129 is aaBB.
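The 1:4:6:4:1 and 1:14:1 ratios can be derived by enumerating the F2 offspring of an AaBb double heterozygote and counting Westar (upper-case) alleles in each. The sketch below does this and then scales the collapsed ratio to the 380 F2 individuals analyzed; the function names are illustrative only.

```python
from collections import Counter
from itertools import product


def westar_allele_distribution():
    """Count F2 offspring of AaBb x AaBb by number of Westar (upper-case) alleles."""
    gametes = list(product("Aa", "Bb"))  # AB, Ab, aB, ab
    dist = Counter()
    for g1, g2 in product(gametes, repeat=2):
        alleles = "".join(g1 + g2)
        dist[sum(ch.isupper() for ch in alleles)] += 1
    return dist


dist = westar_allele_distribution()
five_class = [dist[n] for n in (4, 3, 2, 1, 0)]
# Pool every class carrying 1 to 3 Westar alleles into one heterozygous group.
three_class = [dist[4], sum(dist[n] for n in (3, 2, 1)), dist[0]]


def expected_1_14_1(n_plants):
    """Expected plant counts per pooled class for n_plants F2 individuals."""
    return [n_plants * r / 16 for r in three_class]


print(five_class, three_class, expected_1_14_1(380))
```

With 380 individuals, only about 24 plants are expected in each homozygous class, which is why the intermediate classes are pooled for the test.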
The fatty acid composition of representative F3 individuals having greater than 85% oleic acid in seed oil is shown in Table 18. The levels of saturated fatty acids are seen to be decreased in such plants, compared to Westar.
TABLE 18
92EF F3 Individuals with >85% C18:1 in Seed Oil
EXAMPLE 6
Leaf and Root Fatty Acid Profiles of Canola Lines IMC-129, Q508, and Westar
Plants of Q508, IMC 129 and Westar were grown in the greenhouse. Mature leaves, primary expanding leaves, petioles and roots were harvested at the 6-8 leaf stage, frozen in liquid nitrogen and stored at -70°C. Lipid extracts were analyzed by GLC as described in Example 1. The fatty acid profile data are shown in Table 19.
The data in Table 19 indicate that total leaf lipids in Q508 are higher in C18:1 content than the C18:2 plus C18:3 content. The reverse is true for Westar and IMC 129. The difference in total leaf lipids between Q508 and IMC 129 is consistent with the hypothesis that a second Fad2 gene is mutated in Q508.
The C16:3 content in the total lipid fraction was about the same for all three lines, suggesting that the plastid FadC gene product was not affected by the Q508 mutations. To confirm that the FadC gene was not mutated, chloroplast lipids were separated and analyzed. No changes in chloroplast C16:1, C16:2 or C16:3 fatty acids were detected in the three lines. The similarity in plastid leaf lipids among Q508, Westar and IMC 129 is consistent with the hypothesis that the second mutation in Q508 affects a microsomal Fad2 gene and not a plastid FadC gene.
TABLE 19
EXAMPLE 7
Sequences of Mutant and Wild-Type Delta-12 Fatty Acid Desaturases from B. napus
Primers specific for the FAD2 structural gene were used to clone the entire open reading frame (ORF) of the D and F delta-12 desaturase genes by reverse transcriptase polymerase chain reaction (RT-PCR). RNA from seeds of IMC 129, Q508 and Westar plants was isolated by standard methods and was used as template. The RT-amplified fragments were used for nucleotide sequence determination. The DNA sequence of each gene from each line was determined from both strands by standard dideoxy sequencing methods.
Sequence analysis revealed a G to A transition at nucleotide 316 (from the translation initiation codon) of the D gene in both IMC 129 and Q508, compared to the sequence of Westar. The transition changes the codon at this position from GAG to AAG and results in a non-conservative substitution of lysine, a basic residue, for glutamic acid, an acidic residue. The presence of the same mutation in both lines was expected since the Q508 line was derived from IMC 129. The same base change was also detected in Q508 and IMC 129 when RNA from leaf tissue was used as template.
The G to A mutation at nucleotide 316 was confirmed by sequencing several independent clones containing fragments amplified directly from genomic DNA of IMC 129 and Westar. These results eliminated the possibility of a rare mutation introduced during reverse transcription and PCR in the RT-PCR protocol. It was concluded that the IMC 129 mutant is due to a single base transition at nucleotide 316 in the coding region of the D gene of rapeseed microsomal delta-12 desaturase.
A single base transversion from T to A at nucleotide 515 of the F gene was detected in Q508 compared to the Westar sequence. The mutation changes the codon at this position from CTC to CAC, resulting in the non-conservative substitution of a polar residue, histidine, for a non-polar residue, leucine, in the resulting gene product. No mutations were found in the F gene sequence of IMC 129 compared to the F gene sequence of Westar.
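The relationship between a nucleotide position, the codon it falls in and the resulting amino acid change can be checked with a short calculation. The Python sketch below is illustrative only. It uses the codons and positions given in this Example, numbers nucleotides from the A of the initiation codon, and applies the standard convention that purine-to-purine and pyrimidine-to-pyrimidine changes are transitions while all other changes are transversions. The function name describe and the abbreviated codon table are assumptions made for the illustration.

```python
# Codon table restricted to the codons named in this Example.
CODON_TO_AA = {"GAG": "Glu", "AAG": "Lys", "CTC": "Leu", "CAC": "His"}
PURINES = {"A", "G"}


def describe(nt_pos, wt_codon, mut_codon):
    """Summarize a single base change at 1-based coding position nt_pos."""
    codon_number = (nt_pos - 1) // 3 + 1   # 1-based codon index
    offset = (nt_pos - 1) % 3              # position within the codon
    # The two codons must differ only at the stated position.
    differing = [i for i in range(3) if wt_codon[i] != mut_codon[i]]
    if differing != [offset]:
        raise ValueError("codons do not differ only at the stated position")
    ref, alt = wt_codon[offset], mut_codon[offset]
    same_class = (ref in PURINES) == (alt in PURINES)
    kind = "transition" if same_class else "transversion"
    change = f"{CODON_TO_AA[wt_codon]}{codon_number}{CODON_TO_AA[mut_codon]}"
    return codon_number, f"{ref}>{alt}", kind, change


print(describe(316, "GAG", "AAG"))  # D gene mutation of IMC 129 and Q508
print(describe(515, "CTC", "CAC"))  # F gene mutation of Q508
```

Under these conventions nucleotide 316 is the first base of codon 106 and nucleotide 515 is the second base of codon 172.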
These data support the conclusion that a mutation in a delta-12 desaturase gene sequence results in alterations in the fatty acid profile of plants containing such a mutated gene. Moreover, the data show that when a plant line or species contains two delta-12 desaturase loci, the fatty acid profile of an individual having two mutated loci differs from the fatty acid profile of an individual having one mutated locus.
The mutation in the D gene of IMC 129 and Q508 mapped to a region having a conserved amino acid motif (His-Xaa-Xaa-Xaa-His) found in cloned delta-12 and delta-15 membrane-bound desaturases (Table 20).
TABLE 20
Alignment of Amino Acid Sequences of Cloned Canola Membrane-Bound Desaturases
(FadD = Plastid delta-15, Fad3 = Microsomal delta-15), (FadC = Plastid delta-12, Fad2 = Microsomal delta-12)
a One-letter amino acid code; conservative substitutions are underlined; non-conservative substitutions are in bold.
EXAMPLE 8
Transcription and Translation of Microsomal Delta-12 Fatty Acid Desaturases
Transcription in vivo was analyzed by RT-PCR analysis of stage II and stage III developing seeds and leaf tissue. The primers used to specifically amplify delta-12 desaturase F gene RNA from the indicated tissues were sense primer 5'-GGATATGATGATGGTGAAAGA-3' and antisense primer 5'-TCTTTCACCATCATCATATCC-3'. The primers used to specifically amplify delta-12 desaturase D gene RNA from the indicated tissues were sense primer 5'-GTTATGAAGCAAAGAAGAAAC-3' and antisense primer 5'-GTTTCTTCTTTGCTTCATAAC-3'. The results indicated that RNA of both the D and F genes was expressed in seed and leaf tissues of IMC 129, Q508 and wild type Westar plants. In vitro transcription and translation analysis showed that a peptide of about 46 kD was made. This is the expected size of both the D gene product and the F gene product, based on the sum of the deduced amino acid sequence of each gene and the cotranslational addition of a microsomal membrane peptide.
These results rule out the possibility that nonsense or frameshift mutations, resulting in a truncated polypeptide gene product, are present in either the mutant D gene or the mutant F gene. The data, in conjunction with the data of Example 7, support the conclusion that the mutations in Q508 and IMC 129 are in delta-12 fatty acid desaturase structural genes encoding desaturase enzymes, rather than in regulatory genes.
EXAMPLE 9
Development of Gene-Specific PCR Markers
Based on the single base change in the mutant D gene of IMC 129 described above, two 5' PCR primers were designed. The nucleotide sequence of the primers differed only in the base (G for Westar and A for IMC 129) at the 3' end. The primers allow one to distinguish between mutant fad2-D and wild-type Fad2-D alleles in a DNA-based PCR assay. Since there is only a single base difference in the 5' PCR primers, the PCR assay is very sensitive to PCR conditions such as annealing temperature, cycle number, and the amount and purity of the DNA templates used. Assay conditions have been established that distinguish between the mutant gene and the wild type gene using genomic DNA from IMC 129 and wild type plants as templates. Conditions may be further optimized by varying PCR parameters, particularly with variable crude DNA samples. A PCR assay distinguishing the single base mutation in IMC 129 from the wild type gene, along with fatty acid composition analysis, provides a means to simplify segregation and selection analysis of genetic crosses involving plants having a delta-12 fatty acid desaturase mutation.
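The design principle of the two 5' primers can be illustrated as follows. This Python sketch is a hypothetical example and does not reproduce the primers actually used, whose sequences are not given here. It assumes a primer length of 20 nucleotides and takes the bases immediately 5' of position 316 from the D gene coding sequence in the Sequence Listing (nucleotides 289-315 of SEQ ID NO: 1 and SEQ ID NO: 3). The function name allele_specific_primer is likewise an assumption.

```python
# Nucleotides 289-315 of the D gene coding sequence (SEQ ID NO: 1 and 3),
# i.e. the 27 bases immediately 5' of the polymorphic position 316.
UPSTREAM_289_315 = "CTAACCGGCGTCTGGGTCATAGCCCAC"


def allele_specific_primer(upstream, allele_base, length=20):
    """Return a 5' primer of `length` nt whose 3' terminal base is allele_base."""
    if len(upstream) < length - 1:
        raise ValueError("upstream sequence too short for requested length")
    return upstream[-(length - 1):] + allele_base


# Position 316 is G in Westar and A in IMC 129.
wild_type_primer = allele_specific_primer(UPSTREAM_289_315, "G")
mutant_primer = allele_specific_primer(UPSTREAM_289_315, "A")

print(wild_type_primer)
print(mutant_primer)
```

Because the two primers are identical except at the 3' terminal base, discrimination between alleles depends on inefficient extension from a mismatched 3' end, which is consistent with the sensitivity to annealing temperature and template quality noted above.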
EXAMPLE 10
Transformation with Mutant and Wild Type Fad3 Genes
B. napus cultivar Westar was transformed with mutant and wild type Fad3 genes to demonstrate that the mutant Fad3 gene for canola cytoplasmic linoleic desaturase (delta-15 desaturase) is nonfunctional. Transformation and regeneration were performed using disarmed Agrobacterium tumefaciens essentially following the procedure described in WO 94/11516.
Two disarmed Agrobacterium strains were engineered, each containing a Ti plasmid having the appropriate gene linked to a seed-specific promoter and a corresponding termination sequence. The first plasmid, pIMC110, was prepared by inserting into a disarmed Ti vector the full length wild type Fad3 gene in sense orientation (nucleotides 208 to 1336 of SEQ ID 6 in WO 93/11245), flanked by a napin promoter sequence positioned 5' to the Fad3 gene and a napin termination sequence positioned 3' to the Fad3 gene. The rapeseed napin promoter is described in EP 0255378.
The second plasmid, pIMC205, was prepared by inserting a mutated Fad3 gene in sense orientation into a disarmed Ti vector. The mutant sequence contained mutations at nucleotides 411 and 413 of the microsomal Fad3 gene described in WO 93/11245, thus changing the sequence for codon 96 from GAC to AAG. The amino acid at codon 96 of the gene product was thereby changed from aspartic acid to lysine. See Table 20. A bean (Phaseolus vulgaris) phaseolin (7S seed storage protein) promoter fragment of 495 base pairs, starting with 5'-TGGTCTTTTGGT-3', was placed 5' to the mutant Fad3 gene and a phaseolin termination sequence was placed 3' to the mutant Fad3 gene. The phaseolin sequence is described in Doyle et al. (1986) J. Biol. Chem. 261:9228-9238 and Slightom et al. (1983) Proc. Natl. Acad. Sci. USA 80:1897-1901.
The appropriate plasmids were engineered and transferred separately to Agrobacterium strain LBA4404. Each engineered strain was used to infect 5 mm segments of hypocotyl explants from Westar seeds by cocultivation. Infected hypocotyls were transferred to callus medium and, subsequently, to regeneration medium. Once discernable stems formed from the callus, shoots were excised and transferred to elongation medium. The elongated shoots were cut, dipped in Rootone™, rooted on an agar medium and transplanted to potting soil to obtain fertile T1 plants. T2 seeds were obtained by selfing the resulting T1 plants.
Fatty acid analysis of T2 seeds was carried out as described above. The results are summarized in Table 21. Of the 40 transformants obtained using the pIMC110 plasmid, 17 plants demonstrated wild type fatty acid profiles and 16 demonstrated overexpression. A proportion of the transformants are expected to display an overexpression phenotype when a functioning gene is transformed in sense orientation into plants.
Of the 307 transformed plants having the pIMC205 gene, none exhibited a fatty acid composition indicative of overexpression. This result indicates that the mutant fad3 gene product is non-functional, since some of the transformants would have exhibited an overexpression phenotype if the gene product were functional.
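The strength of this inference can be illustrated with a simple calculation. The Python sketch below is not part of the original analysis. It assumes that transformants are independent events and that a functional mutant gene would produce overexpression at the same frequency observed with the wild type construct (16 of 40 transformants).

```python
# Counts reported in the text.
overexpressing_wild_type = 16   # wild type construct, overexpression phenotype
total_wild_type = 40            # all transformants with the wild type construct
total_mutant = 307              # transformants with the mutant construct

# Assumed per-transformant probability of overexpression if the mutant
# gene product were as functional as the wild type product.
p_over = overexpressing_wild_type / total_wild_type

# Probability that none of the 307 independent transformants would show
# overexpression under that assumption.
p_none = (1 - p_over) ** total_mutant

print(f"assumed overexpression frequency: {p_over:.2f}")
print(f"probability of zero events in {total_mutant} plants: {p_none:.1e}")
```

Under these assumptions the chance of observing no overexpression events among 307 plants is vanishingly small, in keeping with the conclusion that the mutant gene product is non-functional.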
TABLE 21
Overexpression and Co-suppression Events in Westar Populations Transformed with pIMC205 or pIMC110
Fatty acid compositions of representative transformed plants are presented in Table 22. Lines 652-09 and 663-40 are representative of plants containing pIMC110 and exhibiting an overexpression and a co-suppression phenotype, respectively. Line 205-284 is representative of plants containing pIMC205 and having the mutant fad3 gene.
TABLE 22
Fatty Acid Composition of T2 Seed From Westar Transformed With pIMC205 or pIMC110
EXAMPLE 11
Sequences of Wild Type and Mutant Fad2-D and Fad2-F
High molecular weight genomic DNA was isolated from leaves of Q4275 plants (Example 5) and from Westar and Bridger canola plants. This DNA was used as template for amplification of Fad2-D and Fad2-F genes by polymerase chain reaction (PCR). PCR amplifications were carried out in a total volume of 100 μl and contained 0.3 μg genomic DNA, 200 μM deoxyribonucleoside triphosphates, 3 mM MgSO4, 1-2 Units DNA polymerase and 1X Buffer (supplied by the DNA polymerase manufacturer). Cycle conditions were: 1 cycle for 1 min at 95°C, followed by 30 cycles of 1 min at 94°C, 2 min at 55°C and 3 min at 73°C. The Fad2-D gene was amplified once using Elongase® (Gibco-BRL). PCR primers were: CAUCAUCAUCAUCTTCTTCGTAGGGTTCATCG (SEQ ID NO: 23) and CUACUACUACUATCATAGAAGAGAAAGGTTCAG (SEQ ID NO: 24) for the 5' and 3' ends of the gene, respectively. The Fad2-F gene was independently amplified 4 times, twice with Elongase® and twice with Taq polymerase (Boehringer Mannheim). The PCR primers used were: 5'CAUCAUCAUCAUCATGGGTGCACGTGGAAGAA3' (SEQ ID NO: 25) and 5'CUACUACUACUATCTTTCACCATCATCATATCC3' (SEQ ID NO: 26) for the 5' and 3' ends of the gene, respectively.
Amplified DNA products were resolved on an agarose gel, purified by JetSorb® and then annealed into pAMP1 (Gibco-BRL) via the (CAU)4 and (CUA)4 sequences at the ends of the primers, and transformed into E. coli DH5α. The Fad2-D and Fad2-F inserts were sequenced on both strands with an ABI PRISM 310 automated sequencer (Perkin-Elmer) following the manufacturer's directions, using synthetic primers, AmpliTaq® DNA polymerase and dye terminator. The Fad2-D gene was found to have intron-like sequences upstream of the ATG start codon (SEQ ID NO: 30 and SEQ ID NO: 31). As expected, the coding sequence of the gene derived from IMC 129 contained a G to A mutation at nucleotide 316 (Fig. 2). A single base transition from G to A at nucleotide 908 was detected in the F gene sequence of the Q4275 amplified products, compared to the wild type F gene sequence (Fig. 2). This mutation changes the codon at amino acid 303 from GGA to GAA, resulting in the non-conservative substitution of a glutamic acid residue for a glycine residue (Table 3 and Fig. 3). Expression of the mutant Q4275 Fad2-F delta-12 desaturase gene in plants alters the fatty acid composition, as described hereinabove.
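Each primer in this Example consists of a (CAU)4 or (CUA)4 tail followed by a gene-specific portion. The Python sketch below is offered only as an illustration of that layout. It separates the two parts for the primers of SEQ ID NOS: 23-26 as given in the text. The function name split_primer is an assumption.

```python
# Primer sequences as given in the text, written 5' to 3'.
PRIMERS = {
    "SEQ ID NO: 23": "CAUCAUCAUCAUCTTCTTCGTAGGGTTCATCG",
    "SEQ ID NO: 24": "CUACUACUACUATCATAGAAGAGAAAGGTTCAG",
    "SEQ ID NO: 25": "CAUCAUCAUCAUCATGGGTGCACGTGGAAGAA",
    "SEQ ID NO: 26": "CUACUACUACUATCTTTCACCATCATCATATCC",
}

# The two 12-nucleotide tails used for annealing into the vector.
TAILS = ("CAU" * 4, "CUA" * 4)


def split_primer(primer):
    """Return (tail, gene_specific_portion) for a tailed primer."""
    for tail in TAILS:
        if primer.startswith(tail):
            return tail, primer[len(tail):]
    raise ValueError("no (CAU)4 or (CUA)4 tail found")


for name, sequence in PRIMERS.items():
    tail, gene_specific = split_primer(sequence)
    print(name, tail, gene_specific, len(gene_specific))
```

The 5' end primers carry the (CAU)4 tail and the 3' end primers carry the (CUA)4 tail, so each amplified product has a different tail at each end.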
EXAMPLE 12
Sequence of Wild Type Fad2-U
High molecular weight genomic DNA was isolated from the leaves of Bridger and Westar Brassica plants by standard methods. The Fad2-U gene was amplified in a 100 μl total reaction containing 1 μM of each primer, 0.3 μg genomic DNA, 200 μM dNTP, 3 mM MgSO4, 1X Buffer (supplied by the manufacturer of the DNA polymerase), and 1-2 units of Elongase DNA polymerase (BRL). The amplification conditions included one cycle for 1 min at 95°C, 30 cycles of denaturation at 94°C for 1 min, annealing at 55°C for 2 min, and elongation at 72°C for 3 min. Subsequently, the reaction was incubated at 72°C for an additional 10 min. The Fad2-U gene was amplified twice from Westar and twice from Bridger genomic DNAs using the following primers:
5' end primer 5'(CAU)4CTTCTTCGTAGGGTTCATCG3' (SEQ ID NO: 23)
3' end primer 5'(CUA)4CATAACTTATTGTTGTACCAG3' (SEQ ID NO: 27)
Amplified DNA products were purified and sequenced as described in Example 11. The Fad2-U sequence contains an intron-like sequence upstream of the ATG start codon (SEQ ID NO: 28).
To the extent not already indicated, it will be understood by those of ordinary skill in the art that any one of the various specific embodiments herein described and illustrated may be further modified to incorporate features shown in other of the specific embodiments.
The foregoing detailed description has been provided for a better understanding of the invention only and no unnecessary limitation should be understood therefrom as some modifications will be apparent to those skilled in the art without deviating from the spirit and scope of the appended claims.
SEQUENCE LISTING
(1) GENERAL INFORMATION
(i) APPLICANT: CARGILL, INCORPORATED
(ii) TITLE OF THE INVENTION: FATTY ACID DESATURASES AND MUTANT SEQUENCES THEREOF
(iii) NUMBER OF SEQUENCES: 31
(iv) CORRESPONDENCE ADDRESS:
(A) ADDRESSEE: Fish & Richardson P.C., P.A.
(B) STREET: 60 South Sixth Street, Suite 3300
(C) CITY: Minneapolis
(D) STATE: MN
(E) COUNTRY: USA
(F) ZIP: 55402
(v) COMPUTER READABLE FORM:
(A) MEDIUM TYPE: Diskette
(B) COMPUTER: IBM Compatible
(C) OPERATING SYSTEM: DOS
(D) SOFTWARE: FastSEQ for Windows Version 2.0
(vi) CURRENT APPLICATION DATA:
(A) APPLICATION NUMBER:
(B) FILING DATE: 11-JUN-97
(C) CLASSIFICATION:
(viii) ATTORNEY/AGENT INFORMATION:
(A) NAME: Lundquist, Ronald C
(B) REGISTRATION NUMBER: 37,875
(C) REFERENCE/DOCKET NUMBER: 07148/067WO1
(ix) TELECOMMUNICATION INFORMATION:
(A) TELEPHONE: 612-335-5070
(B) TELEFAX: 612-288-9696
(C) TELEX:
(2) INFORMATION FOR SEQ ID NO: 1:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1155 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA
(iii) HYPOTHETICAL: YES
(iv) ANTI-SENSE: NO
(vi) ORIGINAL SOURCE:
(A) ORGANISM: Brassica napus
(ix) FEATURE:
(D) OTHER INFORMATION: Wild type Fad2.
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1:
ATG GGT GCA GGT GGA AGA ATG CAA GTG TCT CCT CCC TCC AAG AAG TCT 48 Met Gly Ala Gly Gly Arg Met Gln Val Ser Pro Pro Ser Lys Lys Ser 1 5 10 15
GAA ACC GAC ACC ATC AAG CGC GTA CCC TGC GAG ACA CCG CCC TTC ACT 96 Glu Thr Asp Thr Ile Lys Arg Val Pro Cys Glu Thr Pro Pro Phe Thr 20 25 30
GTC GGA GAA CTC AAG AAA GCA ATC CCA CCG CAC TGT TTC AAA CGC TCG 144 Val Gly Glu Leu Lys Lys Ala Ile Pro Pro His Cys Phe Lys Arg Ser 35 40 45
ATC CCT CGC TCT TTC TCC TAC CTC ATC TGG GAC ATC ATC ATA GCC TCC 192 Ile Pro Arg Ser Phe Ser Tyr Leu Ile Trp Asp Ile Ile Ile Ala Ser 50 55 60
TGC TTC TAC TAC NTC GCC ACC ACT TAC TTC CCT CTC CTC CCT CAC CCT 240 Cys Phe Tyr Tyr Xaa Ala Thr Thr Tyr Phe Pro Leu Leu Pro His Pro 65 70 75 80
CTC TCC TAC TTC GCC TGG CCT CTC TAC TGG GCC TGC CAA GGG TGC GTC 288 Leu Ser Tyr Phe Ala Trp Pro Leu Tyr Trp Ala Cys Gln Gly Cys Val 85 90 95
CTA ACC GGC GTC TGG GTC ATA GCC CAC GAA TGC GGC CAC CAC GCC TTC 336 Leu Thr Gly Val Trp Val Ile Ala His Glu Cys Gly His His Ala Phe 100 105 110
AGC GAC TAC CAG TGG CTT GAC GAC ACC GTC GGT CTC ATC TTC CAC TCC 384 Ser Asp Tyr Gln Trp Leu Asp Asp Thr Val Gly Leu Ile Phe His Ser 115 120 125
TTC CTC CTC GTC CCT TAC TTC TCC TGG AAG TAC AGT CAT CGC AGC CAC 432 Phe Leu Leu Val Pro Tyr Phe Ser Trp Lys Tyr Ser His Arg Ser His 130 135 140
CAT TCC AAC ACT GGC TCC CTC GAG AGA GAC GAA GTG TTT GTC CCC AAG 480 His Ser Asn Thr Gly Ser Leu Glu Arg Asp Glu Val Phe Val Pro Lys 145 150 155 160
AAG AAG TCA GAC ATC AAG TGG TAC GGC AAG TAC CTC AAC AAC CCT TTG 528 Lys Lys Ser Asp Ile Lys Trp Tyr Gly Lys Tyr Leu Asn Asn Pro Leu 165 170 175
GGA CGC ACC GTG ATG TTA ACG GTT CAG TTC ACT CTC GGC TGG CCG TTG 576 Gly Arg Thr Val Met Leu Thr Val Gln Phe Thr Leu Gly Trp Pro Leu 180 185 190
TAC TTA GCC TTC AAC GTC TCG GGA AGA CCT TAC GAC GGC GGC TTC CGT 624 Tyr Leu Ala Phe Asn Val Ser Gly Arg Pro Tyr Asp Gly Gly Phe Arg 195 200 205
TGC CAT TTC CAC CCC AAC GCT CCC ATC TAC AAC GAC CGC GAG CGT CTC 672 Cys His Phe His Pro Asn Ala Pro Ile Tyr Asn Asp Arg Glu Arg Leu 210 215 220
CAG ATA TAC ATC TCC GAC GCT GGC ATC CTC GCC GTC TGC TAC GGT CTC 720 Gln Ile Tyr Ile Ser Asp Ala Gly Ile Leu Ala Val Cys Tyr Gly Leu 225 230 235 240
TTC CGT TAC GCC GCC GGC CAG GGA GTG GCC TCG ATG GTC TGC TTC TAC 768 Phe Arg Tyr Ala Ala Gly Gln Gly Val Ala Ser Met Val Cys Phe Tyr 245 250 255
GGA GTC CCG CTT CTG ATT GTC AAT GGT TTC CTC GTG TTG ATC ACT TAC 816 Gly Val Pro Leu Leu Ile Val Asn Gly Phe Leu Val Leu Ile Thr Tyr 260 265 270
TTG CAG CAC ACG CAT CCT TCC CTG CCT CAC TAC GAT TCG TCC GAG TGG 864 Leu Gln His Thr His Pro Ser Leu Pro His Tyr Asp Ser Ser Glu Trp 275 280 285
GAT TGG TTC AGG GGA GCT TTG GCT ACC GTT GAC AGA GAC TAC GGA ATC 912 Asp Trp Phe Arg Gly Ala Leu Ala Thr Val Asp Arg Asp Tyr Gly Ile 290 295 300
TTG AAC AAG GTC TTC CAC AAT ATT ACC GAC ACG CAC GTG GCC CAT CAT 960 Leu Asn Lys Val Phe His Asn Ile Thr Asp Thr His Val Ala His His 305 310 315 320
CCG TTC TCC ACG ATG CCG CAT TAT CAC GCG ATG GAA GCT ACC AAG GCG 1008 Pro Phe Ser Thr Met Pro His Tyr His Ala Met Glu Ala Thr Lys Ala 325 330 335
ATA AAG CCG ATA CTG GGA GAG TAT TAT CAG TTC GAT GGG ACG CCG GTG 1056 Ile Lys Pro Ile Leu Gly Glu Tyr Tyr Gln Phe Asp Gly Thr Pro Val 340 345 350
GTT AAG GCG ATG TGG AGG GAG GCG AAG GAG TGT ATC TAT GTG GAA CCG 1104 Val Lys Ala Met Trp Arg Glu Ala Lys Glu Cys Ile Tyr Val Glu Pro 355 360 365
GAC AGG CAA GGT GAG AAG AAA GGT GTG TTC TGG TAC AAC AAT AAG TTA T 1153 Asp Arg Gln Gly Glu Lys Lys Gly Val Phe Trp Tyr Asn Asn Lys Leu 370 375 380
GA 1155
(2) INFORMATION FOR SEQ ID NO: 2:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 384 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2:
Met Gly Ala Gly Gly Arg Met Gln Val Ser Pro Pro Ser Lys Lys Ser 1 5 10 15
Glu Thr Asp Thr Ile Lys Arg Val Pro Cys Glu Thr Pro Pro Phe Thr 20 25 30
Val Gly Glu Leu Lys Lys Ala Ile Pro Pro His Cys Phe Lys Arg Ser 35 40 45
Ile Pro Arg Ser Phe Ser Tyr Leu Ile Trp Asp Ile Ile Ile Ala Ser 50 55 60
Cys Phe Tyr Tyr Xaa Ala Thr Thr Tyr Phe Pro Leu Leu Pro His Pro 65 70 75 80
Leu Ser Tyr Phe Ala Trp Pro Leu Tyr Trp Ala Cys Gln Gly Cys Val 85 90 95
Leu Thr Gly Val Trp Val Ile Ala His Glu Cys Gly His His Ala Phe 100 105 110
Ser Asp Tyr Gln Trp Leu Asp Asp Thr Val Gly Leu Ile Phe His Ser 115 120 125
Phe Leu Leu Val Pro Tyr Phe Ser Trp Lys Tyr Ser His Arg Ser His 130 135 140
His Ser Asn Thr Gly Ser Leu Glu Arg Asp Glu Val Phe Val Pro Lys 145 150 155 160
Lys Lys Ser Asp Ile Lys Trp Tyr Gly Lys Tyr Leu Asn Asn Pro Leu 165 170 175
Gly Arg Thr Val Met Leu Thr Val Gln Phe Thr Leu Gly Trp Pro Leu 180 185 190
Tyr Leu Ala Phe Asn Val Ser Gly Arg Pro Tyr Asp Gly Gly Phe Arg 195 200 205
Cys His Phe His Pro Asn Ala Pro Ile Tyr Asn Asp Arg Glu Arg Leu 210 215 220
Gln Ile Tyr Ile Ser Asp Ala Gly Ile Leu Ala Val Cys Tyr Gly Leu 225 230 235 240
Phe Arg Tyr Ala Ala Gly Gln Gly Val Ala Ser Met Val Cys Phe Tyr 245 250 255
Gly Val Pro Leu Leu Ile Val Asn Gly Phe Leu Val Leu Ile Thr Tyr 260 265 270
Leu Gln His Thr His Pro Ser Leu Pro His Tyr Asp Ser Ser Glu Trp 275 280 285
Asp Trp Phe Arg Gly Ala Leu Ala Thr Val Asp Arg Asp Tyr Gly Ile 290 295 300
Leu Asn Lys Val Phe His Asn Ile Thr Asp Thr His Val Ala His His 305 310 315 320
Pro Phe Ser Thr Met Pro His Tyr His Ala Met Glu Ala Thr Lys Ala 325 330 335
Ile Lys Pro Ile Leu Gly Glu Tyr Tyr Gln Phe Asp Gly Thr Pro Val 340 345 350
Val Lys Ala Met Trp Arg Glu Ala Lys Glu Cys Ile Tyr Val Glu Pro 355 360 365
Asp Arg Gln Gly Glu Lys Lys Gly Val Phe Trp Tyr Asn Asn Lys Leu 370 375 380
(2) INFORMATION FOR SEQ ID NO: 3:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1155 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA
(iii) HYPOTHETICAL: YES
(iv) ANTI-SENSE: NO
(vi) ORIGINAL SOURCE:
(A) ORGANISM: Brassica napus
(ix) FEATURE:
(D) OTHER INFORMATION: G to A transition mutation at nucleotide 316.
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3:
ATG GGT GCA GGT GGA AGA ATG CAA GTG TCT CCT CCC TCC AAG AAG TCT 48 Met Gly Ala Gly Gly Arg Met Gln Val Ser Pro Pro Ser Lys Lys Ser 1 5 10 15
GAA ACC GAC ACC ATC AAG CGC GTA CCC TGC GAG ACA CCG CCC TTC ACT 96 Glu Thr Asp Thr Ile Lys Arg Val Pro Cys Glu Thr Pro Pro Phe Thr 20 25 30
GTC GGA GAA CTC AAG AAA GCA ATC CCA CCG CAC TGT TTC AAA CGC TCG 144 Val Gly Glu Leu Lys Lys Ala Ile Pro Pro His Cys Phe Lys Arg Ser 35 40 45
ATC CCT CGC TCT TTC TCC TAC CTC ATC TGG GAC ATC ATC ATA GCC TCC 192 Ile Pro Arg Ser Phe Ser Tyr Leu Ile Trp Asp Ile Ile Ile Ala Ser 50 55 60
TGC TTC TAC TAC NTC GCC ACC ACT TAC TTC CCT CTC CTC CCT CAC CCT 240 Cys Phe Tyr Tyr Xaa Ala Thr Thr Tyr Phe Pro Leu Leu Pro His Pro 65 70 75 80
CTC TCC TAC TTC GCC TGG CCT CTC TAC TGG GCC TGC CAA GGG TGC GTC 288 Leu Ser Tyr Phe Ala Trp Pro Leu Tyr Trp Ala Cys Gln Gly Cys Val 85 90 95
CTA ACC GGC GTC TGG GTC ATA GCC CAC AAG TGC GGC CAC CAC GCC TTC 336 Leu Thr Gly Val Trp Val Ile Ala His Lys Cys Gly His His Ala Phe 100 105 110
AGC GAC TAC CAG TGG CTT GAC GAC ACC GTC GGT CTC ATC TTC CAC TCC 384 Ser Asp Tyr Gln Trp Leu Asp Asp Thr Val Gly Leu Ile Phe His Ser 115 120 125
TTC CTC CTC GTC CCT TAC TTC TCC TGG AAG TAC AGT CAT CGC AGC CAC 432 Phe Leu Leu Val Pro Tyr Phe Ser Trp Lys Tyr Ser His Arg Ser His 130 135 140
CAT TCC AAC ACT GGC TCC CTC GAG AGA GAC GAA GTG TTT GTC CCC AAG 480 His Ser Asn Thr Gly Ser Leu Glu Arg Asp Glu Val Phe Val Pro Lys 145 150 155 160
AAG AAG TCA GAC ATC AAG TGG TAC GGC AAG TAC CTC AAC AAC CCT TTG 528 Lys Lys Ser Asp Ile Lys Trp Tyr Gly Lys Tyr Leu Asn Asn Pro Leu 165 170 175
GGA CGC ACC GTG ATG TTA ACG GTT CAG TTC ACT CTC GGC TGG CCG TTG 576 Gly Arg Thr Val Met Leu Thr Val Gln Phe Thr Leu Gly Trp Pro Leu 180 185 190
TAC TTA GCC TTC AAC GTC TCG GGA AGA CCT TAC GAC GGC GGC TTC CGT 624 Tyr Leu Ala Phe Asn Val Ser Gly Arg Pro Tyr Asp Gly Gly Phe Arg 195 200 205
TGC CAT TTC CAC CCC AAC GCT CCC ATC TAC AAC GAC CGC GAG CGT CTC 672 Cys His Phe His Pro Asn Ala Pro Ile Tyr Asn Asp Arg Glu Arg Leu 210 215 220
CAG ATA TAC ATC TCC GAC GCT GGC ATC CTC GCC GTC TGC TAC GGT CTC 720 Gln Ile Tyr Ile Ser Asp Ala Gly Ile Leu Ala Val Cys Tyr Gly Leu 225 230 235 240
TTC CGT TAC GCC GCC GGC CAG GGA GTG GCC TCG ATG GTC TGC TTC TAC 768 Phe Arg Tyr Ala Ala Gly Gln Gly Val Ala Ser Met Val Cys Phe Tyr 245 250 255
GGA GTC CCG CTT CTG ATT GTC AAT GGT TTC CTC GTG TTG ATC ACT TAC 816 Gly Val Pro Leu Leu Ile Val Asn Gly Phe Leu Val Leu Ile Thr Tyr 260 265 270
TTG CAG CAC ACG CAT CCT TCC CTG CCT CAC TAC GAT TCG TCC GAG TGG 864 Leu Gln His Thr His Pro Ser Leu Pro His Tyr Asp Ser Ser Glu Trp 275 280 285
GAT TGG TTC AGG GGA GCT TTG GCT ACC GTT GAC AGA GAC TAC GGA ATC 912 Asp Trp Phe Arg Gly Ala Leu Ala Thr Val Asp Arg Asp Tyr Gly Ile 290 295 300
TTG AAC AAG GTC TTC CAC AAT ATT ACC GAC ACG CAC GTG GCC CAT CAT 960 Leu Asn Lys Val Phe His Asn Ile Thr Asp Thr His Val Ala His His 305 310 315 320
CCG TTC TCC ACG ATG CCG CAT TAT CAC GCG ATG GAA GCT ACC AAG GCG 1008 Pro Phe Ser Thr Met Pro His Tyr His Ala Met Glu Ala Thr Lys Ala 325 330 335
ATA AAG CCG ATA CTG GGA GAG TAT TAT CAG TTC GAT GGG ACG CCG GTG 1056 Ile Lys Pro Ile Leu Gly Glu Tyr Tyr Gln Phe Asp Gly Thr Pro Val 340 345 350
GTT AAG GCG ATG TGG AGG GAG GCG AAG GAG TGT ATC TAT GTG GAA CCG 1104 Val Lys Ala Met Trp Arg Glu Ala Lys Glu Cys Ile Tyr Val Glu Pro 355 360 365
GAC AGG CAA GGT GAG AAG AAA GGT GTG TTC TGG TAC AAC AAT AAG TTA T 1153 Asp Arg Gln Gly Glu Lys Lys Gly Val Phe Trp Tyr Asn Asn Lys Leu 370 375 380
GA 1155
(2) INFORMATION FOR SEQ ID NO: 4:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 384 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4:
Met Gly Ala Gly Gly Arg Met Gln Val Ser Pro Pro Ser Lys Lys Ser 1 5 10 15
Glu Thr Asp Thr Ile Lys Arg Val Pro Cys Glu Thr Pro Pro Phe Thr 20 25 30
Val Gly Glu Leu Lys Lys Ala Ile Pro Pro His Cys Phe Lys Arg Ser 35 40 45
Ile Pro Arg Ser Phe Ser Tyr Leu Ile Trp Asp Ile Ile Ile Ala Ser 50 55 60
Cys Phe Tyr Tyr Xaa Ala Thr Thr Tyr Phe Pro Leu Leu Pro His Pro 65 70 75 80
Leu Ser Tyr Phe Ala Trp Pro Leu Tyr Trp Ala Cys Gln Gly Cys Val 85 90 95
Leu Thr Gly Val Trp Val Ile Ala His Lys Cys Gly His His Ala Phe 100 105 110
Ser Asp Tyr Gln Trp Leu Asp Asp Thr Val Gly Leu Ile Phe His Ser 115 120 125
Phe Leu Leu Val Pro Tyr Phe Ser Trp Lys Tyr Ser His Arg Ser His 130 135 140
His Ser Asn Thr Gly Ser Leu Glu Arg Asp Glu Val Phe Val Pro Lys 145 150 155 160
Lys Lys Ser Asp Ile Lys Trp Tyr Gly Lys Tyr Leu Asn Asn Pro Leu 165 170 175
Gly Arg Thr Val Met Leu Thr Val Gln Phe Thr Leu Gly Trp Pro Leu 180 185 190
Tyr Leu Ala Phe Asn Val Ser Gly Arg Pro Tyr Asp Gly Gly Phe Arg 195 200 205
Cys His Phe His Pro Asn Ala Pro Ile Tyr Asn Asp Arg Glu Arg Leu 210 215 220
Gln Ile Tyr Ile Ser Asp Ala Gly Ile Leu Ala Val Cys Tyr Gly Leu 225 230 235 240
Phe Arg Tyr Ala Ala Gly Gln Gly Val Ala Ser Met Val Cys Phe Tyr 245 250 255
Gly Val Pro Leu Leu Ile Val Asn Gly Phe Leu Val Leu Ile Thr Tyr 260 265 270
Leu Gln His Thr His Pro Ser Leu Pro His Tyr Asp Ser Ser Glu Trp 275 280 285
Asp Trp Phe Arg Gly Ala Leu Ala Thr Val Asp Arg Asp Tyr Gly Ile 290 295 300
Leu Asn Lys Val Phe His Asn Ile Thr Asp Thr His Val Ala His His 305 310 315 320
Pro Phe Ser Thr Met Pro His Tyr His Ala Met Glu Ala Thr Lys Ala 325 330 335
Ile Lys Pro Ile Leu Gly Glu Tyr Tyr Gln Phe Asp Gly Thr Pro Val 340 345 350
Val Lys Ala Met Trp Arg Glu Ala Lys Glu Cys Ile Tyr Val Glu Pro 355 360 365
Asp Arg Gln Gly Glu Lys Lys Gly Val Phe Trp Tyr Asn Asn Lys Leu 370 375 380
(2) INFORMATION FOR SEQ ID NO: 5:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1155 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA
(iii) HYPOTHETICAL: YES
(iv) ANTI-SENSE: NO
(vi) ORIGINAL SOURCE:
(A) ORGANISM: Brassica napus
(ix) FEATURE:
(D) OTHER INFORMATION: Wild type Fad2.
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5:
ATG GGT GCA GGT GGA AGA ATG CAA GTG TCT CCT CCC TCC AAA AAG TCT 48 Met Gly Ala Gly Gly Arg Met Gln Val Ser Pro Pro Ser Lys Lys Ser 1 5 10 15
GAA ACC GAC AAC ATC AAG CGC GTA CCC TGC GAG ACA CCG CCC TTC ACT 96 Glu Thr Asp Asn Ile Lys Arg Val Pro Cys Glu Thr Pro Pro Phe Thr 20 25 30
GTC GGA GAA CTC AAG AAA GCA ATC CCA CCG CAC TGT TTC AAA CGC TCG 144 Val Gly Glu Leu Lys Lys Ala Ile Pro Pro His Cys Phe Lys Arg Ser 35 40 45
ATC CCT CGC TCT TTC TCC TAC CTC ATC TGG GAC ATC ATC ATA GCC TCC 192 Ile Pro Arg Ser Phe Ser Tyr Leu Ile Trp Asp Ile Ile Ile Ala Ser 50 55 60
TGC TTC TAC TAC GTC GCC ACC ACT TAC TTC CCT CTC CTC CCT CAC CCT 240 Cys Phe Tyr Tyr Val Ala Thr Thr Tyr Phe Pro Leu Leu Pro His Pro 65 70 75 80
CTC TCC TAC TTC GCC TGG CCT CTC TAC TGG GCC TGC CAG GGC TGC GTC 288 Leu Ser Tyr Phe Ala Trp Pro Leu Tyr Trp Ala Cys Gln Gly Cys Val 85 90 95
CTA ACC GGC GTC TGG GTC ATA GCC CAC GAG TGC GGC CAC CAC GCC TTC 336 Leu Thr Gly Val Trp Val Ile Ala His Glu Cys Gly His His Ala Phe 100 105 110
AGC GAC TAC CAG TGG CTG GAC GAC ACC GTC GGC CTC ATC TTC CAC TCC 384 Ser Asp Tyr Gln Trp Leu Asp Asp Thr Val Gly Leu Ile Phe His Ser 115 120 125
TTC CTC CTC GTC CCT TAC TTC TCC TGG AAG TAC AGT CAT CGA CGC CAC 432 Phe Leu Leu Val Pro Tyr Phe Ser Trp Lys Tyr Ser His Arg Arg His 130 135 140
CAT TCC AAC ACT GGC TCC CTC GAG AGA GAC GAA GTG TTT GTC CCC AAG 480 His Ser Asn Thr Gly Ser Leu Glu Arg Asp Glu Val Phe Val Pro Lys 145 150 155 160
AAG AAG TCA GAC ATC AAG TGG TAC GGC AAG TAC CTC AAC AAC CCT TTG 528 Lys Lys Ser Asp Ile Lys Trp Tyr Gly Lys Tyr Leu Asn Asn Pro Leu 165 170 175
GGA CGC ACC GTG ATG TTA ACG GTT CAG TTC ACT CTC GGC TGG CCT TTG 576 Gly Arg Thr Val Met Leu Thr Val Gln Phe Thr Leu Gly Trp Pro Leu 180 185 190
TAC TTA GCC TTC AAC GTC TCG GGG AGA CCT TAC GAC GGC GGC TTC GCT 624 Tyr Leu Ala Phe Asn Val Ser Gly Arg Pro Tyr Asp Gly Gly Phe Ala 195 200 205
TGC CAT TTC CAC CCC AAC GCT CCC ATC TAC AAC GAC CGC GAG CGT CTC 672 Cys His Phe His Pro Asn Ala Pro Ile Tyr Asn Asp Arg Glu Arg Leu 210 215 220
CAG ATA TAC ATC TCC GAC GCT GGC ATC CTC GCC GTC TGC TAC GGT CTC 720 Gln Ile Tyr Ile Ser Asp Ala Gly Ile Leu Ala Val Cys Tyr Gly Leu 225 230 235 240
TAC CGC TAC GCT GCT GTC CAA GGA GTT GCC TCG ATG GTC TGC TTC TAC 768 Tyr Arg Tyr Ala Ala Val Gln Gly Val Ala Ser Met Val Cys Phe Tyr 245 250 255
GGA GTT CCG CTT CTG ATT GTC AAT GGG TTC TTA GTT TTG ATC ACT TAC 816 Gly Val Pro Leu Leu Ile Val Asn Gly Phe Leu Val Leu Ile Thr Tyr 260 265 270
TTG CAG CAC ACG CAT CCT TCC CTG CCT CAC TAT GAC TCG TCT GAG TGG 864 Leu Gln His Thr His Pro Ser Leu Pro His Tyr Asp Ser Ser Glu Trp 275 280 285
GAT TGG TTG AGG GGA GCT TTG GCC ACC GTT GAC AGA GAC TAC GGA ATC 912 Asp Trp Leu Arg Gly Ala Leu Ala Thr Val Asp Arg Asp Tyr Gly Ile 290 295 300
TTG AAC AAG GTC TTC CAC AAT ATC ACG GAC ACG CAC GTG GCG CAT CAC 960 Leu Asn Lys Val Phe His Asn Ile Thr Asp Thr His Val Ala His His 305 310 315 320
CTG TTC TCG ACC ATG CCG CAT TAT CAT GCG ATG GAA GCT ACG AAG GCG 1008 Leu Phe Ser Thr Met Pro His Tyr His Ala Met Glu Ala Thr Lys Ala 325 330 335
ATA AAG CCG ATA CTG GGA GAG TAT TAT CAG TTG CAT GGG ACG CCG GTG 1056 Ile Lys Pro Ile Leu Gly Glu Tyr Tyr Gln Leu His Gly Thr Pro Val 340 345 350
GTT AAG GCG ATG TGG AGG GAG GCG AAG GAG TGT ATC TAT GTG GAA CCG 1104 Val Lys Ala Met Trp Arg Glu Ala Lys Glu Cys Ile Tyr Val Glu Pro 355 360 365
GAC AGG CAA GGT GAG AAG AAA GGT GTG TTC TGG TAC AAC AAT AAG TTA T 1153 Asp Arg Gln Gly Glu Lys Lys Gly Val Phe Trp Tyr Asn Asn Lys Leu 370 375 380
GA 1155
(2) INFORMATION FOR SEQ ID NO: 6:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 384 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6:
Met Gly Ala Gly Gly Arg Met Gln Val Ser Pro Pro Ser Lys Lys Ser 1 5 10 15
Glu Thr Asp Asn Ile Lys Arg Val Pro Cys Glu Thr Pro Pro Phe Thr 20 25 30
Val Gly Glu Leu Lys Lys Ala Ile Pro Pro His Cys Phe Lys Arg Ser 35 40 45
Ile Pro Arg Ser Phe Ser Tyr Leu Ile Trp Asp Ile Ile Ile Ala Ser 50 55 60
Cys Phe Tyr Tyr Val Ala Thr Thr Tyr Phe Pro Leu Leu Pro His Pro 65 70 75 80
Leu Ser Tyr Phe Ala Trp Pro Leu Tyr Trp Ala Cys Gln Gly Cys Val 85 90 95
Leu Thr Gly Val Trp Val Ile Ala His Glu Cys Gly His His Ala Phe 100 105 110
Ser Asp Tyr Gln Trp Leu Asp Asp Thr Val Gly Leu Ile Phe His Ser 115 120 125
Phe Leu Leu Val Pro Tyr Phe Ser Trp Lys Tyr Ser His Arg Arg His 130 135 140
His Ser Asn Thr Gly Ser Leu Glu Arg Asp Glu Val Phe Val Pro Lys 145 150 155 160
Lys Lys Ser Asp Ile Lys Trp Tyr Gly Lys Tyr Leu Asn Asn Pro Leu 165 170 175
Gly Arg Thr Val Met Leu Thr Val Gln Phe Thr Leu Gly Trp Pro Leu 180 185 190
Tyr Leu Ala Phe Asn Val Ser Gly Arg Pro Tyr Asp Gly Gly Phe Ala 195 200 205
Cys His Phe His Pro Asn Ala Pro Ile Tyr Asn Asp Arg Glu Arg Leu 210 215 220
Gln Ile Tyr Ile Ser Asp Ala Gly Ile Leu Ala Val Cys Tyr Gly Leu 225 230 235 240
Tyr Arg Tyr Ala Ala Val Gln Gly Val Ala Ser Met Val Cys Phe Tyr 245 250 255
Gly Val Pro Leu Leu Ile Val Asn Gly Phe Leu Val Leu Ile Thr Tyr 260 265 270
Leu Gln His Thr His Pro Ser Leu Pro His Tyr Asp Ser Ser Glu Trp 275 280 285
Asp Trp Leu Arg Gly Ala Leu Ala Thr Val Asp Arg Asp Tyr Gly Ile 290 295 300
Leu Asn Lys Val Phe His Asn Ile Thr Asp Thr His Val Ala His His 305 310 315 320
Leu Phe Ser Thr Met Pro His Tyr His Ala Met Glu Ala Thr Lys Ala 325 330 335
Ile Lys Pro Ile Leu Gly Glu Tyr Tyr Gln Leu His Gly Thr Pro Val 340 345 350
Val Lys Ala Met Trp Arg Glu Ala Lys Glu Cys Ile Tyr Val Glu Pro 355 360 365
Asp Arg Gln Gly Glu Lys Lys Gly Val Phe Trp Tyr Asn Asn Lys Leu 370 375 380
(2) INFORMATION FOR SEQ ID NO: 7:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1155 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA
(iii) HYPOTHETICAL: YES
(iv) ANTI -SENSE: NO
(vi) ORIGINAL SOURCE:
(A) ORGANISM: Brassica napus
( ix) FEATURE :
(D) OTHER INFORMATION: T to A transversion mutation at nucleotide 515.
(xi) SEQUENCE DESCRIPTION: SEQ ID NO : 7 :
ATG GGT GCA GGT GGA AGA ATG CAA GTG TCT CCT CCC TCC AAA AAG TCT 48 Met Gly Ala Gly Gly Arg Met Gin Val Ser Pro Pro Ser Lys Lys Ser 1 5 10 15
GAA ACC GAC AAC ATC AAG CGC GTA CCC TGC GAG ACA CCG CCC TTC ACT 96 Glu Thr Asp Asn He Lys Arg Val Pro Cys Glu Thr Pro Pro Phe Thr 20 25 30
GTC GGA GAA CTC AAG AAA GCA ATC CCA CCG CAC TGT TTC AAA CGC TCG 144 Val Gly Glu Leu Lys Lys Ala He Pro Pro His Cys Phe Lys Arg Ser 35 40 45
ATC CCT CGC TCT TTC TCC TAC CTC ATC TGG GAC ATC ATC ATA GCC TCC 192 He Pro Arg Ser Phe Ser Tyr Leu He Trp Asp He He He Ala Ser 50 55 60
TGC TTC TAC TAC GTC GCC ACC ACT TAC TTC CCT CTC CTC CCT CAC CCT 240 Cys Phe Tyr Tyr Val Ala Thr Thr Tyr Phe Pro Leu Leu Pro His Pro 65 70 75 80
CTC TCC TAC TTC GCC TGG CCT CTC TAC TGG GCC TGC CAG GGC TGC GTC 288 Leu Ser Tyr Phe Ala Trp Pro Leu Tyr Trp Ala Cys Gin Gly Cys Val 85 90 95
CTA ACC GGC GTC TGG GTC ATA GCC CAC GAG TGC GGC CAC CAC GCC TTC 336 Leu Thr Gly Val Trp Val He Ala His Glu Cys Gly His His Ala Phe 100 105 110
AGC GAC TAC CAG TGG CTG GAC GAC ACC GTC GGC CTC ATC TTC CAC TCC 384 Ser Asp Tyr Gin Trp Leu Asp Asp Thr Val Gly Leu He Phe His Ser 115 120 125
TTC CTC CTC GTC CCT TAC TTC TCC TGG AAG TAC AGT CAT CGA CGC CAC 432 Phe Leu Leu Val Pro Tyr Phe Ser Trp Lys Tyr Ser His Arg Arg His 130 135 140
CAT TCC AAC ACT GGC TCC CTC GAG AGA GAC GAA GTG TTT GTC CCC AAG 480 His Ser Asn Thr Gly Ser Leu Glu Arg Asp Glu Val Phe Val Pro Lys 145 150 155 160
AAG AAG TCA GAC ATC AAG TGG TAC GGC AAG TAC CAC AAC AAC CCT TTG 528 Lys Lys Ser Asp He Lys Trp Tyr Gly Lys Tyr His Asn Asn Pro Leu 165 170 175
GGA CGC ACC GTG ATG TTA ACG GTT CAG TTC ACT CTC GGC TGG CCT TTG 576 Gly Arg Thr Val Met Leu Thr Val Gin Phe Thr Leu Gly Trp Pro Leu 180 185 190
TAC TTA GCC TTC AAC GTC TCG GGG AGA CCT TAC GAC GGC GGC TTC GCT 624 Tyr Leu Ala Phe Asn Val Ser Gly Arg Pro Tyr Asp Gly Gly Phe Ala 195 200 205
TGC CAT TTC CAC CCC AAC GCT CCC ATC TAC AAC GAC CGC GAG CGT CTC 672 Cys His Phe His Pro Asn Ala Pro He Tyr Asn Asp Arg Glu Arg Leu 210 215 220
CAG ATA TAC ATC TCC GAC GCT GGC ATC CTC GCC GTC TGC TAC GGT CTC 720 Gin He Tyr He Ser Asp Ala Gly He Leu Ala Val Cys Tyr Gly Leu 225 230 235 240
TAC CGC TAC GCT GCT GTC CAA GGA GTT GCC TCG ATG GTC TGC TTC TAC 768 Tyr Arg Tyr Ala Ala Val Gin Gly Val Ala Ser Met Val Cys Phe Tyr 245 250 255
GGA GTT CCG CTT CTG ATT GTC AAT GGG TTC TTA GTT TTG ATC ACT TAC 816 Gly Val Pro Leu Leu He Val Asn Gly Phe Leu Val Leu He Thr Tyr 260 265 270
TTG CAG CAC ACG CAT CCT TCC CTG CCT CAC TAT GAC TCG TCT GAG TGG 864 Leu Gin His Thr His Pro Ser Leu Pro His Tyr Asp Ser Ser Glu Trp 275 280 285
GAT TGG TTG AGG GGA GCT TTG GCC ACC GTT GAC AGA GAC TAC GGA ATC 912 Asp Trp Leu Arg Gly Ala Leu Ala Thr Val Asp Arg Asp Tyr Gly He 290 295 300
TTG AAC AAG GTC TTC CAC AAT ATC ACG GAC ACG CAC GTG GCG CAT CAC 960 Leu Asn Lys Val Phe His Asn He Thr Asp Thr His Val Ala His His 305 310 315 320
CTG TTC TCG ACC ATG CCG CAT TAT CAT GCG ATG GAA GCT ACG AAG GCG 1008 Leu Phe Ser Thr Met Pro His Tyr His Ala Met Glu Ala Thr Lys Ala 325 330 335
ATA AAG CCG ATA CTG GGA GAG TAT TAT CAG TTG CAT GGG ACG CCG GTG 1056 He Lys Pro He Leu Gly Glu Tyr Tyr Gin Leu His Gly Thr Pro Val 340 345 350
GTT AAG GCG ATG TGG AGG GAG GCG AAG GAG TGT ATC TAT GTG GAA CCG 1104 Val Lys Ala Met Trp Arg Glu Ala Lys Glu Cys He Tyr Val Glu Pro 355 360 365
GAC AGG CAA GGT GAG AAG AAA GGT GTG TTC TGG TAC AAC AAT AAG TTA T 1153 Asp Arg Gin Gly Glu Lys Lys Gly Val Phe Trp Tyr Asn Asn Lys Leu 370 375 380
GA 1155
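The feature entry for SEQ ID NO: 7 places the T to A transversion at nucleotide 515. As a cross-check, the following minimal sketch maps that position to its codon, using the row of the listing that covers nucleotides 481 to 528. The helper `codon_at`, the variable names, and the one-letter codon table are illustrative assumptions and are not part of the listing; the pre-mutation row is reconstructed simply by reverting position 515 to T.

```python
# Standard genetic code, built in TCAG order (one-letter amino acid codes).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

# Row of SEQ ID NO: 7 covering nucleotides 481-528, copied from the listing.
ROW_START = 481
MUTANT_ROW = (
    "AAG AAG TCA GAC ATC AAG TGG TAC GGC AAG TAC CAC AAC AAC CCT TTG"
).replace(" ", "")

MUTATION_POS = 515  # 1-based nucleotide position from the FEATURE entry


def codon_at(row, row_start, nt_pos):
    """Return (codon_number, codon) for a 1-based nucleotide position.

    Hypothetical helper; assumes row_start is the first base of a codon.
    """
    codon_number = (nt_pos - 1) // 3 + 1
    offset = (codon_number - 1) * 3 + 1 - row_start
    return codon_number, row[offset:offset + 3]


# Revert the annotated A back to T to reconstruct the pre-mutation row.
idx = MUTATION_POS - ROW_START
WILD_ROW = MUTANT_ROW[:idx] + "T" + MUTANT_ROW[idx + 1:]

codon_number, mutant_codon = codon_at(MUTANT_ROW, ROW_START, MUTATION_POS)
_, wild_codon = codon_at(WILD_ROW, ROW_START, MUTATION_POS)

print(codon_number, wild_codon, CODON_TABLE[wild_codon],
      "->", mutant_codon, CODON_TABLE[mutant_codon])
```

Under these assumptions the change falls in codon 172 (CTC to CAC, Leu to His), which agrees with the His shown at residue 172 of the translation.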
(2) INFORMATION FOR SEQ ID NO: 8:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 384 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8:
Met Gly Ala Gly Gly Arg Met Gln Val Ser Pro Pro Ser Lys Lys Ser 1 5 10 15
Glu Thr Asp Asn Ile Lys Arg Val Pro Cys Glu Thr Pro Pro Phe Thr 20 25 30
Val Gly Glu Leu Lys Lys Ala Ile Pro Pro His Cys Phe Lys Arg Ser 35 40 45
Ile Pro Arg Ser Phe Ser Tyr Leu Ile Trp Asp Ile Ile Ile Ala Ser 50 55 60
Cys Phe Tyr Tyr Val Ala Thr Thr Tyr Phe Pro Leu Leu Pro His Pro 65 70 75 80
Leu Ser Tyr Phe Ala Trp Pro Leu Tyr Trp Ala Cys Gln Gly Cys Val 85 90 95
Leu Thr Gly Val Trp Val Ile Ala His Glu Cys Gly His His Ala Phe 100 105 110
Ser Asp Tyr Gln Trp Leu Asp Asp Thr Val Gly Leu Ile Phe His Ser 115 120 125
Phe Leu Leu Val Pro Tyr Phe Ser Trp Lys Tyr Ser His Arg Arg His 130 135 140
His Ser Asn Thr Gly Ser Leu Glu Arg Asp Glu Val Phe Val Pro Lys 145 150 155 160
Lys Lys Ser Asp Ile Lys Trp Tyr Gly Lys Tyr His Asn Asn Pro Leu 165 170 175
Gly Arg Thr Val Met Leu Thr Val Gln Phe Thr Leu Gly Trp Pro Leu 180 185 190
Tyr Leu Ala Phe Asn Val Ser Gly Arg Pro Tyr Asp Gly Gly Phe Ala 195 200 205
Cys His Phe His Pro Asn Ala Pro Ile Tyr Asn Asp Arg Glu Arg Leu 210 215 220
Gln Ile Tyr Ile Ser Asp Ala Gly Ile Leu Ala Val Cys Tyr Gly Leu 225 230 235 240
Tyr Arg Tyr Ala Ala Val Gln Gly Val Ala Ser Met Val Cys Phe Tyr 245 250 255
Gly Val Pro Leu Leu Ile Val Asn Gly Phe Leu Val Leu Ile Thr Tyr 260 265 270
Leu Gln His Thr His Pro Ser Leu Pro His Tyr Asp Ser Ser Glu Trp 275 280 285
Asp Trp Leu Arg Gly Ala Leu Ala Thr Val Asp Arg Asp Tyr Gly Ile 290 295 300
Leu Asn Lys Val Phe His Asn Ile Thr Asp Thr His Val Ala His His 305 310 315 320
Leu Phe Ser Thr Met Pro His Tyr His Ala Met Glu Ala Thr Lys Ala 325 330 335
Ile Lys Pro Ile Leu Gly Glu Tyr Tyr Gln Leu His Gly Thr Pro Val 340 345 350
Val Lys Ala Met Trp Arg Glu Ala Lys Glu Cys Ile Tyr Val Glu Pro 355 360 365
Asp Arg Gln Gly Glu Lys Lys Gly Val Phe Trp Tyr Asn Asn Lys Leu 370 375 380
(2) INFORMATION FOR SEQ ID NO: 9:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1155 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: Genomic DNA
(ix) FEATURE:
(A) NAME/KEY: Coding Sequence
(B) LOCATION: 1...1152
(D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9:
ATG GGT GCA GGT GGA AGA ATG CAA GTG TCT CCT CCC TCC AAA AAG TCT 48 Met Gly Ala Gly Gly Arg Met Gln Val Ser Pro Pro Ser Lys Lys Ser 1 5 10 15
GAA ACC GAC AAC ATC AAG CGC GTA CCC TGC GAG ACA CCG CCC TTC ACT 96 Glu Thr Asp Asn Ile Lys Arg Val Pro Cys Glu Thr Pro Pro Phe Thr 20 25 30
GTC GGA GAA CTC AAG AAA GCA ATC CCA CCG CAC TGT TTC AAA CGC TCG 144 Val Gly Glu Leu Lys Lys Ala Ile Pro Pro His Cys Phe Lys Arg Ser 35 40 45
ATC CCT CGC TCT TTC TCC TAC CTC ATC TGG GAC ATC ATC ATA GCC TCC 192 Ile Pro Arg Ser Phe Ser Tyr Leu Ile Trp Asp Ile Ile Ile Ala Ser 50 55 60
TGC TTC TAC TAC GTC GCC ACC ACT TAC TTC CCT CTC CTC CCT CAC CCT 240 Cys Phe Tyr Tyr Val Ala Thr Thr Tyr Phe Pro Leu Leu Pro His Pro 65 70 75 80
CTC TCC TAC TTC GCC TGG CCT CTC TAC TGG GCC TGC CAG GGC TGC GTC 288 Leu Ser Tyr Phe Ala Trp Pro Leu Tyr Trp Ala Cys Gln Gly Cys Val 85 90 95
CTA ACC GGC GTC TGG GTC ATA GCC CAC GAG TGC GGC CAC CAC GCC TTC 336 Leu Thr Gly Val Trp Val Ile Ala His Glu Cys Gly His His Ala Phe 100 105 110
AGC GAC TAC CAG TGG CTG GAC GAC ACC GTC GGC CTC ATC TTC CAC TCC 384 Ser Asp Tyr Gln Trp Leu Asp Asp Thr Val Gly Leu Ile Phe His Ser 115 120 125
TTC CTC CTC GTC CYT TAC TTC TCC TGG AAG TAC AGT CAT CGA CGC CAC 432 Phe Leu Leu Val Xaa Tyr Phe Ser Trp Lys Tyr Ser His Arg Arg His 130 135 140
CAT TCC AAC ACT GGC TCC CTC GAG AGA GAC GAA GTG TTT GTC CCC AAG 480 His Ser Asn Thr Gly Ser Leu Glu Arg Asp Glu Val Phe Val Pro Lys 145 150 155 160
AAG AAG TCA GAC ATC AAG TGG TAC GGC AAG TAC CTC AAC AAC CCT TTG 528 Lys Lys Ser Asp Ile Lys Trp Tyr Gly Lys Tyr Leu Asn Asn Pro Leu 165 170 175
GGA CGC ACC GTG ATG TTA ACG GTT CAG TTC ACT CTC GGC TGG CCT TTG 576 Gly Arg Thr Val Met Leu Thr Val Gln Phe Thr Leu Gly Trp Pro Leu 180 185 190
TAC TTR GCC TTC AAC GTC TCG GGG AGA CCT TAC GAC GGC GGC TTC GCT 624 Tyr Leu Ala Phe Asn Val Ser Gly Arg Pro Tyr Asp Gly Gly Phe Ala 195 200 205
TGC CAT TTC CAC CCC AAC GCT CCC ATC TAC AAC GAC CGT GAG CGT CTC 672 Cys His Phe His Pro Asn Ala Pro Ile Tyr Asn Asp Arg Glu Arg Leu 210 215 220
CAG ATA TAC ATC TCC GAC GCT GGC ATC CTC GCC GTC TGC TAC GGT CTC 720 Gln Ile Tyr Ile Ser Asp Ala Gly Ile Leu Ala Val Cys Tyr Gly Leu 225 230 235 240
TAC CGC TAC GCT GCT RTC CAA GGA GTT GCC TCG ATG GTC TGC TTC TAC 768 Tyr Arg Tyr Ala Ala Xaa Gln Gly Val Ala Ser Met Val Cys Phe Tyr 245 250 255
GGA GTT CCT CTT CTG RTT GTC AAC GGG TTC TTA GTT TTG ATC ACT TAC 816 Gly Val Pro Leu Leu Xaa Val Asn Gly Phe Leu Val Leu Ile Thr Tyr 260 265 270
TTG CAG CAC ACG CAT CCT TCC CTG CCT CAC TAT GAC TCG TCT GAG TGG 864 Leu Gln His Thr His Pro Ser Leu Pro His Tyr Asp Ser Ser Glu Trp 275 280 285
GAT TGG TTG AGG GGA GCT TTG GCC ACC GTT GAC AGA GAC TAC GGA ATC 912 Asp Trp Leu Arg Gly Ala Leu Ala Thr Val Asp Arg Asp Tyr Gly Ile 290 295 300
TTG AAC AAG GTC TTC CAC AAT ATC ACG GAC ACG CAC GTG GCG CAT CAC 960 Leu Asn Lys Val Phe His Asn Ile Thr Asp Thr His Val Ala His His 305 310 315 320
CTG TTC TCG ACC ATG CCG CAT TAT CAT GCG ATG GAA GCT ACG AAG GCG 1008 Leu Phe Ser Thr Met Pro His Tyr His Ala Met Glu Ala Thr Lys Ala 325 330 335
ATA AAG CCG ATA CTG GGA GAG TAT TAY CAG TTC GAT GGG ACG CCG GTG 1056 Ile Lys Pro Ile Leu Gly Glu Tyr Tyr Gln Phe Asp Gly Thr Pro Val 340 345 350
GTT AAG GCG ATG TGG AGG GAG GCG AAG GAG TGT ATC TAT GTG GAA CCG 1104 Val Lys Ala Met Trp Arg Glu Ala Lys Glu Cys Ile Tyr Val Glu Pro 355 360 365
GAC AGG CAA GGT GAG AAG AAA GGT GTG TTC TGG TAC AAC AAT AAG TTA T 1153 Asp Arg Gln Gly Glu Lys Lys Gly Val Phe Trp Tyr Asn Asn Lys Leu 370 375 380
GA 1155
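SEQ ID NO: 9 contains the ambiguity symbols R and Y, and some of the affected codons are translated as Xaa while others keep a definite residue. The sketch below illustrates one way to see why, assuming the usual IUPAC meanings (R = A or G, Y = C or T): each ambiguous codon is expanded to its concrete codons, and a residue is reported only when every expansion agrees. The function name `translate_ambiguous` and the one-letter output (with X standing for Xaa) are assumptions made for illustration.

```python
from itertools import product

# Standard genetic code, built in TCAG order (one-letter amino acid codes).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

# Only the IUPAC symbols that occur in SEQ ID NO: 9 are covered here.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "R": "AG", "Y": "CT"}


def translate_ambiguous(codon):
    """Return the residue if all expansions agree, otherwise 'X' (Xaa)."""
    expansions = ("".join(p) for p in product(*(IUPAC[b] for b in codon)))
    residues = {CODON_TABLE[c] for c in expansions}
    return residues.pop() if len(residues) == 1 else "X"


# Ambiguous codons as they appear in the listing for SEQ ID NO: 9.
RESULTS = {c: translate_ambiguous(c) for c in ("CYT", "TTR", "RTC", "RTT", "TAY")}
print(RESULTS)
```

With these assumptions, TTR and TAY resolve to Leu and Tyr because both expansions encode the same residue, whereas CYT (Pro or Leu) and RTC or RTT (Ile or Val) come out as Xaa, matching the translation given in the listing.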
(2) INFORMATION FOR SEQ ID NO: 10:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 384 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(v) FRAGMENT TYPE: internal
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10:
Met Gly Ala Gly Gly Arg Met Gln Val Ser Pro Pro Ser Lys Lys Ser 1 5 10 15
Glu Thr Asp Asn Ile Lys Arg Val Pro Cys Glu Thr Pro Pro Phe Thr 20 25 30
Val Gly Glu Leu Lys Lys Ala Ile Pro Pro His Cys Phe Lys Arg Ser 35 40 45
Ile Pro Arg Ser Phe Ser Tyr Leu Ile Trp Asp Ile Ile Ile Ala Ser 50 55 60
Cys Phe Tyr Tyr Val Ala Thr Thr Tyr Phe Pro Leu Leu Pro His Pro 65 70 75 80
Leu Ser Tyr Phe Ala Trp Pro Leu Tyr Trp Ala Cys Gln Gly Cys Val 85 90 95
Leu Thr Gly Val Trp Val Ile Ala His Glu Cys Gly His His Ala Phe 100 105 110
Ser Asp Tyr Gln Trp Leu Asp Asp Thr Val Gly Leu Ile Phe His Ser 115 120 125
Phe Leu Leu Val Xaa Tyr Phe Ser Trp Lys Tyr Ser His Arg Arg His 130 135 140
His Ser Asn Thr Gly Ser Leu Glu Arg Asp Glu Val Phe Val Pro Lys 145 150 155 160
Lys Lys Ser Asp Ile Lys Trp Tyr Gly Lys Tyr Leu Asn Asn Pro Leu 165 170 175
Gly Arg Thr Val Met Leu Thr Val Gln Phe Thr Leu Gly Trp Pro Leu 180 185 190
Tyr Leu Ala Phe Asn Val Ser Gly Arg Pro Tyr Asp Gly Gly Phe Ala 195 200 205
Cys His Phe His Pro Asn Ala Pro Ile Tyr Asn Asp Arg Glu Arg Leu 210 215 220
Gln Ile Tyr Ile Ser Asp Ala Gly Ile Leu Ala Val Cys Tyr Gly Leu 225 230 235 240
Tyr Arg Tyr Ala Ala Xaa Gln Gly Val Ala Ser Met Val Cys Phe Tyr 245 250 255
Gly Val Pro Leu Leu Xaa Val Asn Gly Phe Leu Val Leu Ile Thr Tyr 260 265 270
Leu Gln His Thr His Pro Ser Leu Pro His Tyr Asp Ser Ser Glu Trp 275 280 285
Asp Trp Leu Arg Gly Ala Leu Ala Thr Val Asp Arg Asp Tyr Gly Ile 290 295 300
Leu Asn Lys Val Phe His Asn Ile Thr Asp Thr His Val Ala His His 305 310 315 320
Leu Phe Ser Thr Met Pro His Tyr His Ala Met Glu Ala Thr Lys Ala 325 330 335
Ile Lys Pro Ile Leu Gly Glu Tyr Tyr Gln Phe Asp Gly Thr Pro Val 340 345 350
Val Lys Ala Met Trp Arg Glu Ala Lys Glu Cys Ile Tyr Val Glu Pro 355 360 365
Asp Arg Gln Gly Glu Lys Lys Gly Val Phe Trp Tyr Asn Asn Lys Leu 370 375 380
(2) INFORMATION FOR SEQ ID NO: 11:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1155 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: Genomic DNA
(ix) FEATURE:
(A) NAME/KEY: Coding Sequence
(B) LOCATION: 1...1152
(D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11:
ATG GGT GCA GGT GGA AGA ATG CAA GTG TCT CCT CCC TCC AAA AAG TCT 48 Met Gly Ala Gly Gly Arg Met Gln Val Ser Pro Pro Ser Lys Lys Ser 1 5 10 15
GAA ACC GAC AAC ATC AAG CGC GTA CCC TGC GAG ACA CCG CCC TTC ACT 96 Glu Thr Asp Asn Ile Lys Arg Val Pro Cys Glu Thr Pro Pro Phe Thr 20 25 30
GTC GGA GAA CTC AAG AAA GCA ATC CCA CCG CAC TGT TTC AAA CGC TCG 144 Val Gly Glu Leu Lys Lys Ala Ile Pro Pro His Cys Phe Lys Arg Ser 35 40 45
ATC CCT CGC TCT TTC TCC TAC CTC ATC TGG GAC ATC ATC ATA GCC TCC 192 Ile Pro Arg Ser Phe Ser Tyr Leu Ile Trp Asp Ile Ile Ile Ala Ser 50 55 60
TGC TTC TAC TAC GTC GCC ACC ACT TAC TTC CCT CTC CTC CCT CAC CCT 240 Cys Phe Tyr Tyr Val Ala Thr Thr Tyr Phe Pro Leu Leu Pro His Pro 65 70 75 80
CTC TCC TAC TTC GCC TGG CCT CTC TAC TGG GCC TGC CAG GGC TGC GTC 288 Leu Ser Tyr Phe Ala Trp Pro Leu Tyr Trp Ala Cys Gln Gly Cys Val 85 90 95
CTA ACC GGC GTC TGG GTC ATA GCC CAC AAG TGC GGC CAC CAC GCC TTC 336 Leu Thr Gly Val Trp Val Ile Ala His Lys Cys Gly His His Ala Phe 100 105 110
AGC GAC TAC CAG TGG CTG GAC GAC ACC GTC GGC CTC ATC TTC CAC TCC 384 Ser Asp Tyr Gln Trp Leu Asp Asp Thr Val Gly Leu Ile Phe His Ser 115 120 125
TTC CTC CTC GTC CYT TAC TTC TCC TGG AAG TAC AGT CAT CGA CGC CAC 432 Phe Leu Leu Val Xaa Tyr Phe Ser Trp Lys Tyr Ser His Arg Arg His 130 135 140
CAT TCC AAC ACT GGC TCC CTC GAG AGA GAC GAA GTG TTT GTC CCC AAG 480 His Ser Asn Thr Gly Ser Leu Glu Arg Asp Glu Val Phe Val Pro Lys 145 150 155 160
AAG AAG TCA GAC ATC AAG TGG TAC GGC AAG TAC CTC AAC AAC CCT TTG 528 Lys Lys Ser Asp Ile Lys Trp Tyr Gly Lys Tyr Leu Asn Asn Pro Leu 165 170 175
GGA CGC ACC GTG ATG TTA ACG GTT CAG TTC ACT CTC GGC TGG CCT TTG 576 Gly Arg Thr Val Met Leu Thr Val Gln Phe Thr Leu Gly Trp Pro Leu 180 185 190
TAC TTR GCC TTC AAC GTC TCG GGG AGA CCT TAC GAC GGC GGC TTC GCT 624 Tyr Leu Ala Phe Asn Val Ser Gly Arg Pro Tyr Asp Gly Gly Phe Ala 195 200 205
TGC CAT TTC CAC CCC AAC GCT CCC ATC TAC AAC GAC CGT GAG CGT CTC 672 Cys His Phe His Pro Asn Ala Pro Ile Tyr Asn Asp Arg Glu Arg Leu 210 215 220
CAG ATA TAC ATC TCC GAC GCT GGC ATC CTC GCC GTC TGC TAC GGT CTC 720 Gln Ile Tyr Ile Ser Asp Ala Gly Ile Leu Ala Val Cys Tyr Gly Leu 225 230 235 240
TAC CGC TAC GCT GCT RTC CAA GGA GTT GCC TCG ATG GTC TGC TTC TAC 768 Tyr Arg Tyr Ala Ala Xaa Gln Gly Val Ala Ser Met Val Cys Phe Tyr 245 250 255
GGA GTT CCT CTT CTG RTT GTC AAC GGG TTC TTA GTT TTG ATC ACT TAC 816 Gly Val Pro Leu Leu Xaa Val Asn Gly Phe Leu Val Leu Ile Thr Tyr 260 265 270
TTG CAG CAC ACG CAT CCT TCC CTG CCT CAC TAT GAC TCG TCT GAG TGG 864 Leu Gln His Thr His Pro Ser Leu Pro His Tyr Asp Ser Ser Glu Trp 275 280 285
GAT TGG TTG AGG GGA GCT TTG GCC ACC GTT GAC AGA GAC TAC GGA ATC 912 Asp Trp Leu Arg Gly Ala Leu Ala Thr Val Asp Arg Asp Tyr Gly Ile 290 295 300
TTG AAC AAG GTC TTC CAC AAT ATC ACG GAC ACG CAC GTG GCG CAT CAC 960 Leu Asn Lys Val Phe His Asn Ile Thr Asp Thr His Val Ala His His 305 310 315 320
CTG TTC TCG ACC ATG CCG CAT TAT CAT GCG ATG GAA GCT ACG AAG GCG 1008 Leu Phe Ser Thr Met Pro His Tyr His Ala Met Glu Ala Thr Lys Ala 325 330 335
ATA AAG CCG ATA CTG GGA GAG TAT TAY CAG TTC GAT GGG ACG CCG GTG 1056 Ile Lys Pro Ile Leu Gly Glu Tyr Tyr Gln Phe Asp Gly Thr Pro Val 340 345 350
GTT AAG GCG ATG TGG AGG GAG GCG AAG GAG TGT ATC TAT GTG GAA CCG 1104 Val Lys Ala Met Trp Arg Glu Ala Lys Glu Cys Ile Tyr Val Glu Pro 355 360 365
GAC AGG CAA GGT GAG AAG AAA GGT GTG TTC TGG TAC AAC AAT AAG TTA T 1153 Asp Arg Gln Gly Glu Lys Lys Gly Val Phe Trp Tyr Asn Asn Lys Leu 370 375 380
GA 1155
(2) INFORMATION FOR SEQ ID NO: 12:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 384 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(v) FRAGMENT TYPE: internal
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12:
Met Gly Ala Gly Gly Arg Met Gln Val Ser Pro Pro Ser Lys Lys Ser 1 5 10 15
Glu Thr Asp Asn Ile Lys Arg Val Pro Cys Glu Thr Pro Pro Phe Thr 20 25 30
Val Gly Glu Leu Lys Lys Ala Ile Pro Pro His Cys Phe Lys Arg Ser 35 40 45
Ile Pro Arg Ser Phe Ser Tyr Leu Ile Trp Asp Ile Ile Ile Ala Ser 50 55 60
Cys Phe Tyr Tyr Val Ala Thr Thr Tyr Phe Pro Leu Leu Pro His Pro 65 70 75 80
Leu Ser Tyr Phe Ala Trp Pro Leu Tyr Trp Ala Cys Gln Gly Cys Val 85 90 95
Leu Thr Gly Val Trp Val Ile Ala His Lys Cys Gly His His Ala Phe 100 105 110
Ser Asp Tyr Gln Trp Leu Asp Asp Thr Val Gly Leu Ile Phe His Ser 115 120 125
Phe Leu Leu Val Xaa Tyr Phe Ser Trp Lys Tyr Ser His Arg Arg His 130 135 140
His Ser Asn Thr Gly Ser Leu Glu Arg Asp Glu Val Phe Val Pro Lys 145 150 155 160
Lys Lys Ser Asp Ile Lys Trp Tyr Gly Lys Tyr Leu Asn Asn Pro Leu 165 170 175
Gly Arg Thr Val Met Leu Thr Val Gln Phe Thr Leu Gly Trp Pro Leu 180 185 190
Tyr Leu Ala Phe Asn Val Ser Gly Arg Pro Tyr Asp Gly Gly Phe Ala 195 200 205
Cys His Phe His Pro Asn Ala Pro Ile Tyr Asn Asp Arg Glu Arg Leu 210 215 220
Gln Ile Tyr Ile Ser Asp Ala Gly Ile Leu Ala Val Cys Tyr Gly Leu 225 230 235 240
Tyr Arg Tyr Ala Ala Xaa Gln Gly Val Ala Ser Met Val Cys Phe Tyr 245 250 255
Gly Val Pro Leu Leu Xaa Val Asn Gly Phe Leu Val Leu Ile Thr Tyr 260 265 270
Leu Gln His Thr His Pro Ser Leu Pro His Tyr Asp Ser Ser Glu Trp 275 280 285
Asp Trp Leu Arg Gly Ala Leu Ala Thr Val Asp Arg Asp Tyr Gly Ile 290 295 300
Leu Asn Lys Val Phe His Asn Ile Thr Asp Thr His Val Ala His His 305 310 315 320
Leu Phe Ser Thr Met Pro His Tyr His Ala Met Glu Ala Thr Lys Ala 325 330 335
Ile Lys Pro Ile Leu Gly Glu Tyr Tyr Gln Phe Asp Gly Thr Pro Val 340 345 350
Val Lys Ala Met Trp Arg Glu Ala Lys Glu Cys Ile Tyr Val Glu Pro 355 360 365
Asp Arg Gln Gly Glu Lys Lys Gly Val Phe Trp Tyr Asn Asn Lys Leu 370 375 380
(2) INFORMATION FOR SEQ ID NO: 13:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1155 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: Genomic DNA
(ix) FEATURE:
(A) NAME/KEY: Coding Sequence
(B) LOCATION: 1...1152
(D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13:
ATG GGT GCA GGT GGA AGA ATG CAA GTG TCT CCT CCC TCC AAG AAG TCT 48 Met Gly Ala Gly Gly Arg Met Gln Val Ser Pro Pro Ser Lys Lys Ser 1 5 10 15
GAA ACC GAC ACC ATC AAG CGC GTA CCC TGC GAG ACA CCG CCC TTC ACT 96 Glu Thr Asp Thr Ile Lys Arg Val Pro Cys Glu Thr Pro Pro Phe Thr 20 25 30
GTC GGA GAA CTC AAG AAA GCA ATC CCA CCG CAC TGT TTC AAA CGC TCG 144 Val Gly Glu Leu Lys Lys Ala Ile Pro Pro His Cys Phe Lys Arg Ser 35 40 45
ATC CCT CGC TCT TTC TCC TAC CTC ATC TGG GAC ATC ATC ATA GCC TCC 192 Ile Pro Arg Ser Phe Ser Tyr Leu Ile Trp Asp Ile Ile Ile Ala Ser 50 55 60
TGC TTC TAC TAC GTC GCC ACC ACT TAC TTC CCT CTC CTC CCT CAC CCT 240 Cys Phe Tyr Tyr Val Ala Thr Thr Tyr Phe Pro Leu Leu Pro His Pro 65 70 75 80
CTC TCC TAC TTC GCC TGG CCT CTC TAC TGG GCC TGC CAA GGG TGC GTC 288 Leu Ser Tyr Phe Ala Trp Pro Leu Tyr Trp Ala Cys Gln Gly Cys Val 85 90 95
CTA ACC GGC GTC TGG GTC ATA GCC CAC GAG TGC GGC CAC CAC GCC TTC 336 Leu Thr Gly Val Trp Val Ile Ala His Glu Cys Gly His His Ala Phe 100 105 110
AGC GAC TAC CAG TGG CTT GAC GAC ACC GTC GGT CTC ATC TTC CAC TCC 384 Ser Asp Tyr Gln Trp Leu Asp Asp Thr Val Gly Leu Ile Phe His Ser 115 120 125
TTC CTC CTC GTC CCT TAC TTC TCC TGG AAG TAC AGT CAT CGA CGC CAC 432 Phe Leu Leu Val Pro Tyr Phe Ser Trp Lys Tyr Ser His Arg Arg His 130 135 140
CAT TCC AAC ACT GGC TCC CTC GAG AGA GAC GAA GTG TTT GTC CCC AAG 480 His Ser Asn Thr Gly Ser Leu Glu Arg Asp Glu Val Phe Val Pro Lys 145 150 155 160
AAG AAG TCA GAC ATC AAG TGG TAC GGC AAG TAC CTC AAC AAC CCT TTG 528 Lys Lys Ser Asp Ile Lys Trp Tyr Gly Lys Tyr Leu Asn Asn Pro Leu 165 170 175
GGA CGC ACC GTG ATG TTA ACG GTT CAG TTC ACT CTC GGC TGG CCG TTG 576 Gly Arg Thr Val Met Leu Thr Val Gln Phe Thr Leu Gly Trp Pro Leu 180 185 190
TAC TTA GCC TTC AAC GTC TCG GGA AGA CCT TAC GAC GGC GGC TTC GCT 624 Tyr Leu Ala Phe Asn Val Ser Gly Arg Pro Tyr Asp Gly Gly Phe Ala 195 200 205
TGC CAT TTC CAC CCC AAC GCT CCC ATC TAC AAC GAC CGC GAG CGT CTC 672 Cys His Phe His Pro Asn Ala Pro Ile Tyr Asn Asp Arg Glu Arg Leu 210 215 220
CAG ATA TAC ATC TCC GAC GCT GGC ATC CTC GCC GTC TGC TAC GGT CTC 720 Gln Ile Tyr Ile Ser Asp Ala Gly Ile Leu Ala Val Cys Tyr Gly Leu 225 230 235 240
TTC CGT TAC GCC GCC GCG CAG GGA GTG GCC TCG ATG GTC TGC TTC TAC 768 Phe Arg Tyr Ala Ala Ala Gln Gly Val Ala Ser Met Val Cys Phe Tyr 245 250 255
GGA GTC CCG CTT CTG ATT GTC AAT GGT TTC CTC GTG TTG ATC ACT TAC 816 Gly Val Pro Leu Leu Ile Val Asn Gly Phe Leu Val Leu Ile Thr Tyr 260 265 270
TTG CAG CAC ACG CAT CCT TCC CTG CCT CAC TAC GAT TCG TCC GAG TGG 864 Leu Gln His Thr His Pro Ser Leu Pro His Tyr Asp Ser Ser Glu Trp 275 280 285
GAT TGG TTG AGG GGA GCT TTG GCT ACC GTT GAC AGA GAC TAC GGA ATC 912 Asp Trp Leu Arg Gly Ala Leu Ala Thr Val Asp Arg Asp Tyr Gly Ile 290 295 300
TTG AAC AAG GTC TTC CAC AAT ATT ACC GAC ACG CAC GTG GCG CAT CAT 960 Leu Asn Lys Val Phe His Asn Ile Thr Asp Thr His Val Ala His His 305 310 315 320
CTG TTC TCC ACG ATG CCG CAT TAT CAC GCG ATG GAA GCT ACC AAG GCG 1008 Leu Phe Ser Thr Met Pro His Tyr His Ala Met Glu Ala Thr Lys Ala 325 330 335
ATA AAG CCG ATA CTG GGA GAG TAT TAT CAG TTC GAT GGG ACG CCG GTG 1056 Ile Lys Pro Ile Leu Gly Glu Tyr Tyr Gln Phe Asp Gly Thr Pro Val 340 345 350
GTT AAG GCG ATG TGG AGG GAG GCG AAG GAG TGT ATC TAT GTG GAA CCG 1104 Val Lys Ala Met Trp Arg Glu Ala Lys Glu Cys Ile Tyr Val Glu Pro 355 360 365
GAC AGG CAA GGT GAG AAG AAA GGT GTG TTC TGG TAC AAC AAT AAG TTA T 1153 Asp Arg Gln Gly Glu Lys Lys Gly Val Phe Trp Tyr Asn Asn Lys Leu 370 375 380
GA 1155
(2) INFORMATION FOR SEQ ID NO: 14:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 384 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(v) FRAGMENT TYPE: internal
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14:
Met Gly Ala Gly Gly Arg Met Gln Val Ser Pro Pro Ser Lys Lys Ser 1 5 10 15
Glu Thr Asp Thr Ile Lys Arg Val Pro Cys Glu Thr Pro Pro Phe Thr 20 25 30
Val Gly Glu Leu Lys Lys Ala Ile Pro Pro His Cys Phe Lys Arg Ser 35 40 45
Ile Pro Arg Ser Phe Ser Tyr Leu Ile Trp Asp Ile Ile Ile Ala Ser 50 55 60
Cys Phe Tyr Tyr Val Ala Thr Thr Tyr Phe Pro Leu Leu Pro His Pro 65 70 75 80
Leu Ser Tyr Phe Ala Trp Pro Leu Tyr Trp Ala Cys Gln Gly Cys Val 85 90 95
Leu Thr Gly Val Trp Val Ile Ala His Glu Cys Gly His His Ala Phe 100 105 110
Ser Asp Tyr Gln Trp Leu Asp Asp Thr Val Gly Leu Ile Phe His Ser 115 120 125
Phe Leu Leu Val Pro Tyr Phe Ser Trp Lys Tyr Ser His Arg Arg His 130 135 140
His Ser Asn Thr Gly Ser Leu Glu Arg Asp Glu Val Phe Val Pro Lys 145 150 155 160
Lys Lys Ser Asp Ile Lys Trp Tyr Gly Lys Tyr Leu Asn Asn Pro Leu 165 170 175
Gly Arg Thr Val Met Leu Thr Val Gln Phe Thr Leu Gly Trp Pro Leu 180 185 190
Tyr Leu Ala Phe Asn Val Ser Gly Arg Pro Tyr Asp Gly Gly Phe Ala 195 200 205
Cys His Phe His Pro Asn Ala Pro Ile Tyr Asn Asp Arg Glu Arg Leu 210 215 220
Gln Ile Tyr Ile Ser Asp Ala Gly Ile Leu Ala Val Cys Tyr Gly Leu 225 230 235 240
Phe Arg Tyr Ala Ala Ala Gln Gly Val Ala Ser Met Val Cys Phe Tyr 245 250 255
Gly Val Pro Leu Leu Ile Val Asn Gly Phe Leu Val Leu Ile Thr Tyr 260 265 270
Leu Gln His Thr His Pro Ser Leu Pro His Tyr Asp Ser Ser Glu Trp 275 280 285
Asp Trp Leu Arg Gly Ala Leu Ala Thr Val Asp Arg Asp Tyr Gly Ile 290 295 300
Leu Asn Lys Val Phe His Asn Ile Thr Asp Thr His Val Ala His His 305 310 315 320
Leu Phe Ser Thr Met Pro His Tyr His Ala Met Glu Ala Thr Lys Ala 325 330 335
Ile Lys Pro Ile Leu Gly Glu Tyr Tyr Gln Phe Asp Gly Thr Pro Val 340 345 350
Val Lys Ala Met Trp Arg Glu Ala Lys Glu Cys Ile Tyr Val Glu Pro 355 360 365
Asp Arg Gln Gly Glu Lys Lys Gly Val Phe Trp Tyr Asn Asn Lys Leu 370 375 380
(2) INFORMATION FOR SEQ ID NO: 15:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1155 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: Genomic DNA
(ix) FEATURE:
(A) NAME/KEY: Coding Sequence
(B) LOCATION: 1...1152
(D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15:
ATG GGT GCA GGT GGA AGA ATG CAA GTG TCT CCT CCC TCC AAG AAG TCT 48 Met Gly Ala Gly Gly Arg Met Gln Val Ser Pro Pro Ser Lys Lys Ser 1 5 10 15
GAA ACC GAC ACC ATC AAG CGC GTA CCC TGC GAG ACA CCG CCC TTC ACT 96 Glu Thr Asp Thr Ile Lys Arg Val Pro Cys Glu Thr Pro Pro Phe Thr 20 25 30
GTC GGA GAA CTC AAG AAA GCA ATC CCA CCG CAC TGT TTC AAA CGC TCG 144 Val Gly Glu Leu Lys Lys Ala Ile Pro Pro His Cys Phe Lys Arg Ser 35 40 45
ATC CCT CGC TCT TTC TCC TAC CTC ATC TGG GAC ATC ATC ATA GCC TCC 192 Ile Pro Arg Ser Phe Ser Tyr Leu Ile Trp Asp Ile Ile Ile Ala Ser 50 55 60
TGC TTC TAC TAC GTC GCC ACC ACT TAC TTC CCT CTC CTC CCT CAC CCT 240 Cys Phe Tyr Tyr Val Ala Thr Thr Tyr Phe Pro Leu Leu Pro His Pro 65 70 75 80
CTC TCC TAC TTC GCC TGG CCT CTC TAC TGG GCC TGC CAA GGG TGC GTC 288 Leu Ser Tyr Phe Ala Trp Pro Leu Tyr Trp Ala Cys Gln Gly Cys Val 85 90 95
CTA ACC GGC GTC TGG GTC ATA GCC CAC GAG TGC GGC CAC CAC GCC TTC 336 Leu Thr Gly Val Trp Val Ile Ala His Glu Cys Gly His His Ala Phe 100 105 110
AGC GAC TAC CAG TGG CTT GAC GAC ACC GTC GGT CTC ATC TTC CAC TCC 384 Ser Asp Tyr Gln Trp Leu Asp Asp Thr Val Gly Leu Ile Phe His Ser 115 120 125
TTC CTC CTC GTC CCT TAC TTC TCC TGG AAG TAC AGT CAT CGA CGC CAC 432 Phe Leu Leu Val Pro Tyr Phe Ser Trp Lys Tyr Ser His Arg Arg His 130 135 140
CAT TCC AAC ACT GGC TCC CTC GAG AGA GAC GAA GTG TTT GTC CCC AAG 480 His Ser Asn Thr Gly Ser Leu Glu Arg Asp Glu Val Phe Val Pro Lys 145 150 155 160
AAG AAG TCA GAC ATC AAG TGG TAC GGC AAG TAC CAC AAC AAC CCT TTG 528 Lys Lys Ser Asp Ile Lys Trp Tyr Gly Lys Tyr His Asn Asn Pro Leu 165 170 175
GGA CGC ACC GTG ATG TTA ACG GTT CAG TTC ACT CTC GGC TGG CCG TTG 576 Gly Arg Thr Val Met Leu Thr Val Gln Phe Thr Leu Gly Trp Pro Leu 180 185 190
TAC TTA GCC TTC AAC GTC TCG GGA AGA CCT TAC GAC GGC GGC TTC GCT 624 Tyr Leu Ala Phe Asn Val Ser Gly Arg Pro Tyr Asp Gly Gly Phe Ala 195 200 205
TGC CAT TTC CAC CCC AAC GCT CCC ATC TAC AAC GAC CGC GAG CGT CTC 672 Cys His Phe His Pro Asn Ala Pro Ile Tyr Asn Asp Arg Glu Arg Leu 210 215 220
CAG ATA TAC ATC TCC GAC GCT GGC ATC CTC GCC GTC TGC TAC GGT CTC 720 Gln Ile Tyr Ile Ser Asp Ala Gly Ile Leu Ala Val Cys Tyr Gly Leu 225 230 235 240
TTC CGT TAC GCC GCC GCG CAG GGA GTG GCC TCG ATG GTC TGC TTC TAC 768 Phe Arg Tyr Ala Ala Ala Gln Gly Val Ala Ser Met Val Cys Phe Tyr 245 250 255
GGA GTC CCG CTT CTG ATT GTC AAT GGT TTC CTC GTG TTG ATC ACT TAC 816 Gly Val Pro Leu Leu Ile Val Asn Gly Phe Leu Val Leu Ile Thr Tyr 260 265 270
TTG CAG CAC ACG CAT CCT TCC CTG CCT CAC TAC GAT TCG TCC GAG TGG 864 Leu Gln His Thr His Pro Ser Leu Pro His Tyr Asp Ser Ser Glu Trp 275 280 285
GAT TGG TTG AGG GGA GCT TTG GCT ACC GTT GAC AGA GAC TAC GGA ATC 912 Asp Trp Leu Arg Gly Ala Leu Ala Thr Val Asp Arg Asp Tyr Gly Ile 290 295 300
TTG AAC AAG GTC TTC CAC AAT ATT ACC GAC ACG CAC GTG GCG CAT CAT 960 Leu Asn Lys Val Phe His Asn Ile Thr Asp Thr His Val Ala His His 305 310 315 320
CTG TTC TCC ACG ATG CCG CAT TAT CAC GCG ATG GAA GCT ACC AAG GCG 1008 Leu Phe Ser Thr Met Pro His Tyr His Ala Met Glu Ala Thr Lys Ala 325 330 335
ATA AAG CCG ATA CTG GGA GAG TAT TAT CAG TTC GAT GGG ACG CCG GTG 1056 Ile Lys Pro Ile Leu Gly Glu Tyr Tyr Gln Phe Asp Gly Thr Pro Val 340 345 350
GTT AAG GCG ATG TGG AGG GAG GCG AAG GAG TGT ATC TAT GTG GAA CCG 1104 Val Lys Ala Met Trp Arg Glu Ala Lys Glu Cys Ile Tyr Val Glu Pro 355 360 365
GAC AGG CAA GGT GAG AAG AAA GGT GTG TTC TGG TAC AAC AAT AAG TTA T 1153 Asp Arg Gln Gly Glu Lys Lys Gly Val Phe Trp Tyr Asn Asn Lys Leu 370 375 380
GA 1155
(2) INFORMATION FOR SEQ ID NO: 16:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 384 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(v) FRAGMENT TYPE: internal
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16:
Met Gly Ala Gly Gly Arg Met Gln Val Ser Pro Pro Ser Lys Lys Ser 1 5 10 15
Glu Thr Asp Thr Ile Lys Arg Val Pro Cys Glu Thr Pro Pro Phe Thr 20 25 30
Val Gly Glu Leu Lys Lys Ala Ile Pro Pro His Cys Phe Lys Arg Ser 35 40 45
Ile Pro Arg Ser Phe Ser Tyr Leu Ile Trp Asp Ile Ile Ile Ala Ser 50 55 60
Cys Phe Tyr Tyr Val Ala Thr Thr Tyr Phe Pro Leu Leu Pro His Pro 65 70 75 80
Leu Ser Tyr Phe Ala Trp Pro Leu Tyr Trp Ala Cys Gln Gly Cys Val 85 90 95
Leu Thr Gly Val Trp Val Ile Ala His Glu Cys Gly His His Ala Phe 100 105 110
Ser Asp Tyr Gln Trp Leu Asp Asp Thr Val Gly Leu Ile Phe His Ser 115 120 125
Phe Leu Leu Val Pro Tyr Phe Ser Trp Lys Tyr Ser His Arg Arg His 130 135 140
His Ser Asn Thr Gly Ser Leu Glu Arg Asp Glu Val Phe Val Pro Lys 145 150 155 160
Lys Lys Ser Asp Ile Lys Trp Tyr Gly Lys Tyr His Asn Asn Pro Leu 165 170 175
Gly Arg Thr Val Met Leu Thr Val Gln Phe Thr Leu Gly Trp Pro Leu 180 185 190
Tyr Leu Ala Phe Asn Val Ser Gly Arg Pro Tyr Asp Gly Gly Phe Ala 195 200 205
Cys His Phe His Pro Asn Ala Pro Ile Tyr Asn Asp Arg Glu Arg Leu 210 215 220
Gln Ile Tyr Ile Ser Asp Ala Gly Ile Leu Ala Val Cys Tyr Gly Leu 225 230 235 240
Phe Arg Tyr Ala Ala Ala Gln Gly Val Ala Ser Met Val Cys Phe Tyr 245 250 255
Gly Val Pro Leu Leu Ile Val Asn Gly Phe Leu Val Leu Ile Thr Tyr 260 265 270
Leu Gln His Thr His Pro Ser Leu Pro His Tyr Asp Ser Ser Glu Trp 275 280 285
Asp Trp Leu Arg Gly Ala Leu Ala Thr Val Asp Arg Asp Tyr Gly Ile 290 295 300
Leu Asn Lys Val Phe His Asn Ile Thr Asp Thr His Val Ala His His 305 310 315 320
Leu Phe Ser Thr Met Pro His Tyr His Ala Met Glu Ala Thr Lys Ala 325 330 335
Ile Lys Pro Ile Leu Gly Glu Tyr Tyr Gln Phe Asp Gly Thr Pro Val 340 345 350
Val Lys Ala Met Trp Arg Glu Ala Lys Glu Cys Ile Tyr Val Glu Pro 355 360 365
Asp Arg Gln Gly Glu Lys Lys Gly Val Phe Trp Tyr Asn Asn Lys Leu 370 375 380
(2) INFORMATION FOR SEQ ID NO: 17:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1155 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: Genomic DNA (ix) FEATURE:
(A) NAME/KEY: Coding Sequence
(B) LOCATION: 1...1152 (D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17:
ATG GGT GCA GGT GGA AGA ATG CAA GTG TCT CCT CCC TCC AAG AAG TCT 48 Met Gly Ala Gly Gly Arg Met Gin Val Ser Pro Pro Ser Lys Lys Ser 1 5 10 15
GAA ACC GAC ACC ATC AAG CGC GTA CCC TGC GAG ACA CCG CCC TTC ACT 96 Glu Thr Asp Thr He Lys Arg Val Pro Cys Glu Thr Pro Pro Phe Thr 20 25 30
GTC GGA GAA CTC AAG AAA GCA ATC CCA CCG CAC TGT TTC AAA CGC TCG 144 Val Gly Glu Leu Lys Lys Ala He Pro Pro His Cys Phe Lys Arg Ser 35 40 45
ATC CCT CGC TCT TTC TCC TAC CTC ATC TGG GAC ATC ATC ATA GCC TCC 192 He Pro Arg Ser Phe Ser Tyr Leu He Trp Asp He He He Ala Ser 50 55 60
TGC TTC TAC TAC GTC GCC ACC ACT TAC TTC CCT CTC CTC CCT CAC CCT 240 Cys Phe Tyr Tyr Val Ala Thr Thr Tyr Phe Pro Leu Leu Pro His Pro 65 70 75 80
CTC TCC TAC TTC GCC TGG CCT CTC TAC TGG GCC TGC CAA GGG TGC GTC 288 Leu Ser Tyr Phe Ala Trp Pro Leu Tyr Trp Ala Cys Gin Gly Cys Val 85 90 95
CTA ACC GGC GTC TGG GTC ATA GCC CAC GAG TGC GGC CAC CAC GCC TTC 336 Leu Thr Gly Val Trp Val He Ala His Glu Cys Gly His His Ala Phe 100 105 110
AGC GAC TAC CAG TGG CTT GAC GAC ACC GTC GGT CTC ATC TTC CAC TCC 384 Ser Asp Tyr Gin Trp Leu Asp Asp Thr Val Gly Leu He Phe His Ser 115 120 125
TTC CTC CTC GTC CCT TAC TTC TCC TGG AAG TAC AGT CAT CGA CGC CAC 432 Phe Leu Leu Val Pro Tyr Phe Ser Trp Lys Tyr Ser His Arg Arg His 130 135 140
CAT TCC AAC ACT GGC TCC CTC GAG AGA GAC GAA GTG TTT GTC CCC AAG 480 His Ser Asn Thr Gly Ser Leu Glu Arg Asp Glu Val Phe Val Pro Lys 145 150 155 160
AAG AAG TCA GAC ATC AAG TGG TAC GGC AAG TAC CTC AAC AAC CCT TTG 528 Lys Lys Ser Asp He Lys Trp Tyr Gly Lys Tyr Leu Asn Asn Pro Leu 165 170 175
GGA CGC ACC GTG ATG TTA ACG GTT CAG TTC ACT CTC GGC TGG CCG TTG 576 Gly Arg Thr Val Met Leu Thr Val Gin Phe Thr Leu Gly Trp Pro Leu 180 185 190
TAC TTA GCC TTC AAC GTC TCG GGA AGA CCT TAC GAC GGC GGC TTC GCT 624 Tyr Leu Ala Phe Asn Val Ser Gly Arg Pro Tyr Asp Gly Gly Phe Ala 195 200 205
TGC CAT TTC CAC CCC AAC GCT CCC ATC TAC AAC GAC CGC GAG CGT CTC 672 Cys His Phe His Pro Asn Ala Pro Ile Tyr Asn Asp Arg Glu Arg Leu 210 215 220
CAG ATA TAC ATC TCC GAC GCT GGC ATC CTC GCC GTC TGC TAC GGT CTC 720 Gln Ile Tyr Ile Ser Asp Ala Gly Ile Leu Ala Val Cys Tyr Gly Leu 225 230 235 240
TTC CGT TAC GCC GCC GCG CAG GGA GTG GCC TCG ATG GTC TGC TTC TAC 768 Phe Arg Tyr Ala Ala Ala Gln Gly Val Ala Ser Met Val Cys Phe Tyr 245 250 255
GGA GTC CCG CTT CTG ATT GTC AAT GGT TTC CTC GTG TTG ATC ACT TAC 816 Gly Val Pro Leu Leu Ile Val Asn Gly Phe Leu Val Leu Ile Thr Tyr 260 265 270
TTG CAG CAC ACG CAT CCT TCC CTG CCT CAC TAC GAT TCG TCC GAG TGG 864 Leu Gln His Thr His Pro Ser Leu Pro His Tyr Asp Ser Ser Glu Trp 275 280 285
GAT TGG TTG AGG GGA GCT TTG GCT ACC GTT GAC AGA GAC TAC GAA ATC 912 Asp Trp Leu Arg Gly Ala Leu Ala Thr Val Asp Arg Asp Tyr Glu Ile 290 295 300
TTG AAC AAG GTC TTC CAC AAT ATT ACC GAC ACG CAC GTG GCG CAT CAT 960 Leu Asn Lys Val Phe His Asn Ile Thr Asp Thr His Val Ala His His 305 310 315 320
CTG TTC TCC ACG ATG CCG CAT TAT CAC GCG ATG GAA GCT ACC AAG GCG 1008 Leu Phe Ser Thr Met Pro His Tyr His Ala Met Glu Ala Thr Lys Ala 325 330 335
ATA AAG CCG ATA CTG GGA GAG TAT TAT CAG TTC GAT GGG ACG CCG GTG 1056 Ile Lys Pro Ile Leu Gly Glu Tyr Tyr Gln Phe Asp Gly Thr Pro Val 340 345 350
GTT AAG GCG ATG TGG AGG GAG GCG AAG GAG TGT ATC TAT GTG GAA CCG 1104 Val Lys Ala Met Trp Arg Glu Ala Lys Glu Cys Ile Tyr Val Glu Pro 355 360 365
GAC AGG CAA GGT GAG AAG AAA GGT GTG TTC TGG TAC AAC AAT AAG TTA T 1153 Asp Arg Gln Gly Glu Lys Lys Gly Val Phe Trp Tyr Asn Asn Lys Leu 370 375 380
GA 1155
(2) INFORMATION FOR SEQ ID NO: 18:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 384 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(v) FRAGMENT TYPE: internal
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:18:
Met Gly Ala Gly Gly Arg Met Gln Val Ser Pro Pro Ser Lys Lys Ser 1 5 10 15
Glu Thr Asp Thr Ile Lys Arg Val Pro Cys Glu Thr Pro Pro Phe Thr 20 25 30
Val Gly Glu Leu Lys Lys Ala Ile Pro Pro His Cys Phe Lys Arg Ser 35 40 45
Ile Pro Arg Ser Phe Ser Tyr Leu Ile Trp Asp Ile Ile Ile Ala Ser 50 55 60
Cys Phe Tyr Tyr Val Ala Thr Thr Tyr Phe Pro Leu Leu Pro His Pro 65 70 75 80
Leu Ser Tyr Phe Ala Trp Pro Leu Tyr Trp Ala Cys Gln Gly Cys Val 85 90 95
Leu Thr Gly Val Trp Val Ile Ala His Glu Cys Gly His His Ala Phe 100 105 110
Ser Asp Tyr Gln Trp Leu Asp Asp Thr Val Gly Leu Ile Phe His Ser 115 120 125
Phe Leu Leu Val Pro Tyr Phe Ser Trp Lys Tyr Ser His Arg Arg His 130 135 140
His Ser Asn Thr Gly Ser Leu Glu Arg Asp Glu Val Phe Val Pro Lys 145 150 155 160
Lys Lys Ser Asp Ile Lys Trp Tyr Gly Lys Tyr Leu Asn Asn Pro Leu 165 170 175
Gly Arg Thr Val Met Leu Thr Val Gln Phe Thr Leu Gly Trp Pro Leu 180 185 190
Tyr Leu Ala Phe Asn Val Ser Gly Arg Pro Tyr Asp Gly Gly Phe Ala 195 200 205
Cys His Phe His Pro Asn Ala Pro Ile Tyr Asn Asp Arg Glu Arg Leu 210 215 220
Gln Ile Tyr Ile Ser Asp Ala Gly Ile Leu Ala Val Cys Tyr Gly Leu 225 230 235 240
Phe Arg Tyr Ala Ala Ala Gln Gly Val Ala Ser Met Val Cys Phe Tyr 245 250 255
Gly Val Pro Leu Leu Ile Val Asn Gly Phe Leu Val Leu Ile Thr Tyr 260 265 270
Leu Gln His Thr His Pro Ser Leu Pro His Tyr Asp Ser Ser Glu Trp 275 280 285
Asp Trp Leu Arg Gly Ala Leu Ala Thr Val Asp Arg Asp Tyr Glu Ile 290 295 300
Leu Asn Lys Val Phe His Asn Ile Thr Asp Thr His Val Ala His His 305 310 315 320
Leu Phe Ser Thr Met Pro His Tyr His Ala Met Glu Ala Thr Lys Ala 325 330 335
Ile Lys Pro Ile Leu Gly Glu Tyr Tyr Gln Phe Asp Gly Thr Pro Val 340 345 350
Val Lys Ala Met Trp Arg Glu Ala Lys Glu Cys Ile Tyr Val Glu Pro 355 360 365
Asp Arg Gln Gly Glu Lys Lys Gly Val Phe Trp Tyr Asn Asn Lys Leu 370 375 380
(2) INFORMATION FOR SEQ ID NO: 19:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 21 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: Other Nucleic Acid
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 19:
GGATATGATG ATGGTGAAAG A 21
(2) INFORMATION FOR SEQ ID NO: 20:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 21 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: Other Nucleic Acid
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 20:
TCTTTCACCA TCATCATATC C 21
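The 21-base sequences of SEQ ID NO: 19 and SEQ ID NO: 20 read as reverse complements of one another. A minimal Python sketch for checking this relationship is given below. The function name `reverse_complement` and the variable names are illustrative assumptions and are not part of the listing.

```python
# Illustrative check only; all names here are assumptions, not part of the listing.
# Maps each unambiguous DNA base to its Watson-Crick partner.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an unambiguous DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


# Sequences copied from the listing with the grouping spaces removed.
SEQ_ID_NO_19 = "GGATATGATGATGGTGAAAGA"
SEQ_ID_NO_20 = "TCTTTCACCATCATCATATCC"

if __name__ == "__main__":
    print(reverse_complement(SEQ_ID_NO_19) == SEQ_ID_NO_20)
```

The sketch handles only A, C, G and T; ambiguity codes such as R or Y would need an extended translation table.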
(2) INFORMATION FOR SEQ ID NO: 21:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 21 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: Other Nucleic Acid
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 21:
GTTATGAAGC AAAGAAGAAA C 21
(2) INFORMATION FOR SEQ ID NO: 22:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 26 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: Other Nucleic Acid
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 22:
GTTTCTTCTT TGCTTTGCTT CATAAC 26
(2) INFORMATION FOR SEQ ID NO: 23:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 32 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: Other Nucleic Acid
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 23:
CAUCAUCAUC AUCTTCTTCG TAGGGTTCAT CG 32
(2) INFORMATION FOR SEQ ID NO: 24:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 33 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: Other Nucleic Acid
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 24:
CUACUACUAC UATCATAGAA GAGAAAGGTT CAG 33
(2) INFORMATION FOR SEQ ID NO: 25:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 32 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: Other
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 25:
CAUCAUCAUC AUCATGGGTG CACGTGGAAG AA 32
(2) INFORMATION FOR SEQ ID NO: 26:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 33 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: Other
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 26:
CUACUACUAC UATCTTTCAC CATCATCATA TCC 33
(2) INFORMATION FOR SEQ ID NO: 27:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 33 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: Other
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 27:
CUACUACUAC UACATAACTT ATTGTTGTAC CAG 33
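SEQ ID NOS: 23 through 27 each begin with a 12-base uracil-containing repeat, either CAUCAUCAUCAU or CUACUACUACUA, followed by a target-specific portion. The Python sketch below splits such a sequence into these two parts and compares the target-specific portion with other entries in the listing. The 12-base tail length is inferred from the repeats as written, and the helper name `split_tail` is a hypothetical name introduced only for illustration.

```python
# Illustrative sketch; the helper name and the fixed tail length are assumptions
# inferred from the repeats visible in SEQ ID NOS: 23-27.
TAIL_LENGTH = 12
KNOWN_TAILS = ("CAU" * 4, "CUA" * 4)


def split_tail(primer: str) -> tuple[str, str]:
    """Split a primer into its 5' repeat tail and its target-specific portion."""
    tail, core = primer[:TAIL_LENGTH], primer[TAIL_LENGTH:]
    if tail not in KNOWN_TAILS:
        raise ValueError(f"unexpected 5' tail: {tail}")
    return tail, core


# Sequences copied from the listing with the grouping spaces removed.
SEQ_ID_NO_20 = "TCTTTCACCATCATCATATCC"
SEQ_ID_NO_23 = "CAUCAUCAUCAUCTTCTTCGTAGGGTTCATCG"
SEQ_ID_NO_26 = "CUACUACUACUATCTTTCACCATCATCATATCC"
# Bases 1-20 of SEQ ID NO: 28 as given in the listing.
SEQ_ID_NO_28_START = "CTTCTTCGTAGGGTTCATCG"

if __name__ == "__main__":
    _, core_26 = split_tail(SEQ_ID_NO_26)
    _, core_23 = split_tail(SEQ_ID_NO_23)
    print(core_26 == SEQ_ID_NO_20)        # tail-free SEQ ID NO: 26 vs SEQ ID NO: 20
    print(core_23 == SEQ_ID_NO_28_START)  # tail-free SEQ ID NO: 23 vs start of SEQ ID NO: 28
```

Under these assumptions, the target-specific portion of SEQ ID NO: 26 is identical to SEQ ID NO: 20, and that of SEQ ID NO: 23 is identical to the first 20 bases of SEQ ID NO: 28.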
(2) INFORMATION FOR SEQ ID NO: 28:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 2168 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: Genomic DNA
(ix) FEATURE:
(A) NAME/KEY: Coding Sequence
(B) LOCATION: 1014...2165
(D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 28:
CTTCTTCGTA GGGTTCATCG TTATTAACGT AAAATCTCTC TCCCCCCACC CTACGTCAGC 60
AGCTTTCAGG GTCCCCTTCT TCTTCTTCTT CTTCTCATTT TCCTCTTATT TTTATGAATT 120
CCTGGTCTGT GTTCACCTCG TCCATCTCTC TAGCAGTCTA GCATTTGGCA TTTAAATCGA 180
TAGRTCTGCC AGTCTTTATT GCATTCAACT AAAGATCTGT TCCTCTGTTT CCATTTGACA 240
AATCTTGTGT CATGTTTCTT TCATCTCACC GTTAAATAAT GATTACTGTC TATGGTCTAG 300
CATATGAAAT GTTGCAACTT TCTATCTATT CAGAAATCTT TTTATTCAAT AGGTTGGTGA 360
AATAGAAAAG GTCAAATCTC CAAAATAGCA ACTTTCTAAG TTTATATCAC AAAAATAGCA 420
CTCAAAAATT AAAATGACCA AAATATTATT TTATCTTTTG AAAATTTTAA TTTTTTTATT 480
TTTCAAAATT TGAAATCTTA TCCCCAAAAC CTCATTTCTC AACTCTAAAC CCTAAACTCT 540
GAACCATAAR CCCTAAACCC TAAACTCTAA ACCCTAAACC CTAAACCCTA AACCCCACCC 600
TTTAACTYTA AACCATAAGT TTGTGACTTT TGATAAAATA TTAAGTGATA TTTTTGTGAC 660
TTTTGACCTT GAGTGCTAGT TTGGGAACAA AAACTTGGTT TAGTGCTATT TTTGTTTTTT 720
TTCAATATAA AAATCACTTA TTGTTGAACC TTTGATAGAT TTGACCGATT CCTACTGGTT 780
CTTGCTACTG TTATTTCTTA ATAAATGGAA GAACGTTTCA TTGACTTATA AGCTCATCAA 840
CTTTGTACAA ATAAAACGGA TGATTTAAAG TAGGTAGGTA CTTCAGGGTT TAGATGTTCT 900
TTTATAGATT CAAATGCATG AAGAGTTGCA TATACAACTT TGATTAAAGG ATAAAAAGTC 960
TCCGTCCTCC ATAACATTAT TATTATTTTT TGGTTTTCTC TACAGAAACA AAC ATG 1016
Met 1
GGC GCA GRT GGA AGA ATG CAA ATC TCT CCT CCC TCC AGC TCC CCC GAA 1064 Gly Ala Xaa Gly Arg Met Gln Ile Ser Pro Pro Ser Ser Ser Pro Glu 5 10 15
ACC AAA ACC CTC AAA CGC GTC CCC TGC GAG ACA CCA CCC TTC ACT CTC 1112 Thr Lys Thr Leu Lys Arg Val Pro Cys Glu Thr Pro Pro Phe Thr Leu 20 25 30
GGA GAC CTC GAG AAA GCA ATC CCA CCT CAC TGC TTC AAA CGC TCC ATC 1160 Gly Asp Leu Glu Lys Ala Ile Pro Pro His Cys Phe Lys Arg Ser Ile 35 40 45
CCT CGC TCC TTC TCC TAC CTC CTC TTC GAC ATC CTC GTC TCC TCC TCC 1208 Pro Arg Ser Phe Ser Tyr Leu Leu Phe Asp Ile Leu Val Ser Ser Ser 50 55 60 65
CTC TAC CAC CTC TCC ACA GCC TAC TTC CCT CTC CTC CCC CAC CCT CTC 1256 Leu Tyr His Leu Ser Thr Ala Tyr Phe Pro Leu Leu Pro His Pro Leu 70 75 80
CCT TAC CTC GCC TGG CCC CTC TAC TGG GCC TGC CAA GGC TGC GTC CTA 1304 Pro Tyr Leu Ala Trp Pro Leu Tyr Trp Ala Cys Gln Gly Cys Val Leu 85 90 95
ACG GGC CTC TGG GTC ATC GCC CAC GAA TGC GGC CAC CAC GCC TTC AGC 1352 Thr Gly Leu Trp Val Ile Ala His Glu Cys Gly His His Ala Phe Ser 100 105 110
GAC CAC CAG TGG CTG GAC GAC GCC GTG GGC CTC GTC TTC CAC TCC TTC 1400 Asp His Gln Trp Leu Asp Asp Ala Val Gly Leu Val Phe His Ser Phe 115 120 125
CTC CTC GTC CCT TAC TTC TCC TGG AAG TAC AGC CAT CGA CGC CAC CAT 1448 Leu Leu Val Pro Tyr Phe Ser Trp Lys Tyr Ser His Arg Arg His His 130 135 140 145
TCC AAC ACC GGA TCC CTC GAG AGG GAT GAA GTG TTC GTC CCC AAG AAG 1496 Ser Asn Thr Gly Ser Leu Glu Arg Asp Glu Val Phe Val Pro Lys Lys 150 155 160
AAA TCC GAC ATC AAG TGG TAC GGA AAG TAC CTC AAC AAC CCG CTA GGA 1544 Lys Ser Asp Ile Lys Trp Tyr Gly Lys Tyr Leu Asn Asn Pro Leu Gly 165 170 175
CGC ACG GTG ATG CTA ACC GTC CAG TTC ACG CTC GGC TGG CCG TTG TAC 1592 Arg Thr Val Met Leu Thr Val Gln Phe Thr Leu Gly Trp Pro Leu Tyr 180 185 190
TTA GCC TTC AAC GTC TCT GGA AGA CCT TAC AGC GAC GGT TTC GCT TGC 1640 Leu Ala Phe Asn Val Ser Gly Arg Pro Tyr Ser Asp Gly Phe Ala Cys 195 200 205
CAT TTC CAC CCG AAC GCT CCC ATC TAC AAC GAC CGC GAG CGT CTC CAG 1688 His Phe His Pro Asn Ala Pro Ile Tyr Asn Asp Arg Glu Arg Leu Gln 210 215 220 225
ATA TAC ATC TCT GAC GCT GGC GTC CTC TCC GTA TGT TAC GGT CTC TAC 1736 Ile Tyr Ile Ser Asp Ala Gly Val Leu Ser Val Cys Tyr Gly Leu Tyr 230 235 240
CGC TAC GCT GGT TCG CGA GGA GTG GCC TCG ATG GTC TGT GTC TAC GGA 1784 Arg Tyr Ala Gly Ser Arg Gly Val Ala Ser Met Val Cys Val Tyr Gly 245 250 255
GTT CCG CTT ATG ATT GTC AAC TGT TTC CTC GTC TTG ATC ACT TAC TTG 1832 Val Pro Leu Met Ile Val Asn Cys Phe Leu Val Leu Ile Thr Tyr Leu 260 265 270
CAG CAC ACG CAC CCT TCG CTG CCT CAC TAT GAT TCT TCG GAG TGG GAT 1880 Gln His Thr His Pro Ser Leu Pro His Tyr Asp Ser Ser Glu Trp Asp 275 280 285
TGG TTG AGA GGA GCT TTG GCT ACT GTG GAT AGA GAC TAT GGA ATC TTG 1928 Trp Leu Arg Gly Ala Leu Ala Thr Val Asp Arg Asp Tyr Gly Ile Leu 290 295 300 305
AAC AAG GTG TTT CAT AAC ATC ACG GAC ACG CAC GTG GCG CAT CAT CTG 1976 Asn Lys Val Phe His Asn Ile Thr Asp Thr His Val Ala His His Leu 310 315 320
TTC TCG ACG ATG CCG CAT TAT AAC GCG ATG GAA GCG ACC AAG GCG ATA 2024 Phe Ser Thr Met Pro His Tyr Asn Ala Met Glu Ala Thr Lys Ala Ile 325 330 335
AAG CCG ATA CTT GGA GAG TAT TAC CAG TTT GAT GGA ACG CCG GTG GTT 2072 Lys Pro Ile Leu Gly Glu Tyr Tyr Gln Phe Asp Gly Thr Pro Val Val 340 345 350
AAG GCG ATG TGG AGG GAG GCG AAG GAG TGT ATC TAT GTT GAA CCG GAT 2120 Lys Ala Met Trp Arg Glu Ala Lys Glu Cys Ile Tyr Val Glu Pro Asp 355 360 365
AGG CAA GGT GAG AAG AAA GGT GTG TTC TGG TAC AAC AAT AAG TTA TGA 2168 Arg Gln Gly Glu Lys Lys Gly Val Phe Trp Tyr Asn Asn Lys Leu 370 375 380
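The coding sequence feature of SEQ ID NO: 28 spans positions 1014 through 2165, and the codon GRT at the fourth position is translated as Xaa. The Python sketch below illustrates both points: the inclusive span corresponds to 384 codons, and a codon containing the ambiguity code R (A or G) is reported as Xaa when its possible readings encode different residues. The function names and the partial ambiguity table, limited to the codes appearing in SEQ ID NO: 28, are assumptions made for illustration.

```python
# Illustrative sketch; function names and the partial IUPAC table are assumptions.
from itertools import product

BASES = "TCAG"
# Standard genetic code, ordered with T, C, A, G at each codon position.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}
# Only the ambiguity codes that occur in SEQ ID NO: 28 (R and Y) are included.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "R": "AG", "Y": "CT"}
THREE_LETTER = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys", "Q": "Gln",
    "E": "Glu", "G": "Gly", "H": "His", "I": "Ile", "L": "Leu", "K": "Lys",
    "M": "Met", "F": "Phe", "P": "Pro", "S": "Ser", "T": "Thr", "W": "Trp",
    "Y": "Tyr", "V": "Val", "*": "Stop", "X": "Xaa",
}


def translate_codon(codon: str) -> str:
    """Translate one codon; return 'X' if ambiguity allows more than one residue."""
    readings = {
        CODON_TABLE["".join(bases)]
        for bases in product(*(IUPAC[b] for b in codon))
    }
    return readings.pop() if len(readings) == 1 else "X"


def translate(seq: str) -> list[str]:
    """Translate complete codons of seq into three-letter residue names."""
    usable = len(seq) - len(seq) % 3
    return [THREE_LETTER[translate_codon(seq[i:i + 3])] for i in range(0, usable, 3)]


def codon_count(start: int, end: int) -> int:
    """Number of codons in an inclusive, 1-based coding span."""
    return (end - start + 1) // 3


if __name__ == "__main__":
    print(codon_count(1014, 2165))
    # First nine codons of the coding sequence of SEQ ID NO: 28.
    print(translate("ATGGGCGCAGRTGGAAGAATGCAAATC"))
```

Here GRT expands to GAT (Asp) and GGT (Gly), so the residue cannot be assigned uniquely and is reported as Xaa, in agreement with the listing. The 384 codons agree with the 384 amino acid length stated for SEQ ID NO: 29.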
(2) INFORMATION FOR SEQ ID NO: 29:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 384 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(v) FRAGMENT TYPE: internal
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 29:
Met Gly Ala Xaa Gly Arg Met Gln Ile Ser Pro Pro Ser Ser Ser Pro 1 5 10 15
Glu Thr Lys Thr Leu Lys Arg Val Pro Cys Glu Thr Pro Pro Phe Thr 20 25 30
Leu Gly Asp Leu Glu Lys Ala Ile Pro Pro His Cys Phe Lys Arg Ser 35 40 45
Ile Pro Arg Ser Phe Ser Tyr Leu Leu Phe Asp Ile Leu Val Ser Ser 50 55 60
Ser Leu Tyr His Leu Ser Thr Ala Tyr Phe Pro Leu Leu Pro His Pro 65 70 75 80
Leu Pro Tyr Leu Ala Trp Pro Leu Tyr Trp Ala Cys Gln Gly Cys Val 85 90 95
Leu Thr Gly Leu Trp Val Ile Ala His Glu Cys Gly His His Ala Phe 100 105 110
Ser Asp His Gln Trp Leu Asp Asp Ala Val Gly Leu Val Phe His Ser 115 120 125
Phe Leu Leu Val Pro Tyr Phe Ser Trp Lys Tyr Ser His Arg Arg His 130 135 140
His Ser Asn Thr Gly Ser Leu Glu Arg Asp Glu Val Phe Val Pro Lys 145 150 155 160
Lys Lys Ser Asp Ile Lys Trp Tyr Gly Lys Tyr Leu Asn Asn Pro Leu 165 170 175
Gly Arg Thr Val Met Leu Thr Val Gln Phe Thr Leu Gly Trp Pro Leu 180 185 190
Tyr Leu Ala Phe Asn Val Ser Gly Arg Pro Tyr Ser Asp Gly Phe Ala 195 200 205
Cys His Phe His Pro Asn Ala Pro Ile Tyr Asn Asp Arg Glu Arg Leu 210 215 220
Gln Ile Tyr Ile Ser Asp Ala Gly Val Leu Ser Val Cys Tyr Gly Leu 225 230 235 240
Tyr Arg Tyr Ala Gly Ser Arg Gly Val Ala Ser Met Val Cys Val Tyr 245 250 255
Gly Val Pro Leu Met Ile Val Asn Cys Phe Leu Val Leu Ile Thr Tyr 260 265 270
Leu Gln His Thr His Pro Ser Leu Pro His Tyr Asp Ser Ser Glu Trp 275 280 285
Asp Trp Leu Arg Gly Ala Leu Ala Thr Val Asp Arg Asp Tyr Gly Ile 290 295 300
Leu Asn Lys Val Phe His Asn Ile Thr Asp Thr His Val Ala His His 305 310 315 320
Leu Phe Ser Thr Met Pro His Tyr Asn Ala Met Glu Ala Thr Lys Ala 325 330 335
Ile Lys Pro Ile Leu Gly Glu Tyr Tyr Gln Phe Asp Gly Thr Pro Val 340 345 350
Val Lys Ala Met Trp Arg Glu Ala Lys Glu Cys Ile Tyr Val Glu Pro 355 360 365
Asp Arg Gln Gly Glu Lys Lys Gly Val Phe Trp Tyr Asn Asn Lys Leu 370 375 380
(2) INFORMATION FOR SEQ ID NO: 30:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1132 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: Genomic DNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:30:
GCGTAACCCT TATTAACGTT AAATCTTCAT CCCCCCCTAC GTCAGCCAGC TCAAGGTCCC 60
TTTCTTCTTC CATTTCTTCT CATTTTTACG TTGTTTTCAA TCTTGGTCTG TTCTTTTCTT 120
ATCGCTTTTC TATTCTATCT ATCATTTTTG CATTTCAGTC GATTTAATTC TAGATCTGTT 180
AGTATTTATT GCATTAAACT ATAGATCTGG TCTTGATTCT CTGTTTTCAT GTGTGAAATC 240
TTGATGCTGT CTTTACCATT AATCTGATTA TATTGTCTAT ACCGTGGAGA ATATGAAATG 300
TTGCATTTTC ATTTGTCCGA ATACAAACTG TTTGACTTTC AATCTTTTTT AATGATTTAT 360
TTGATGGGTT GGTGGAGTTG AAAAATCACC ATAGCAGTCT CACGTCCTGG TCTTAGAAAT 420
ATCCTTCCTA TTCAAAGTTA TATATTTGTT TACTTGTCTT AGATCTGGAC CTGAGACATG 480
TAAGTACATA TTTGTTGAAT CTTTGGGTAA AAAATTTATG TCTCTGGGTA AAATTTGCTT 540
GGAGATTTGA CCGATTCCTA TTGGCTCTTG ATTCTGTAGT TACCTAATAC ATGAAAAAGT 600
TTCATTTGGC CTATGCTCAC TTCATGCTTA CAAACTTTTC TTTGCAAATT AATTGGATTA 660
GATGTCCTTC ATAGATTCAG ATGCAATAGA TTTGCATGAA GAAAATAATA GGATTCATGA 720
CAGTAAAAAA GATTGTATTT TTGTTTGTTT GTTTATGTTT AAAAGTCTAT ATGTTGACAA 780
TAGAGTTGCT CTCAACTGTT TCATTTAGCT TTTTGTTTTT GTCAAGTTGC TTATTCTTAG 840
AGACATTGTG ATTATGACTT GTCTTCTCTA ACGTAGTTTA GTAATAAAAG ACGAAAGAAA 900
TTGATATCCA CAAGAAAGAG ATGTAAGCTG TAACGTATCA AATCTCATTA ATAACTAGTA 960
GTATTCTCAA CGCTATCGTT TATTTCTTTC TTTGGTTTGC CACTATATGC CGCTTCTCTG 1020
CTCTTTTGTC CCACGTACTA TCCATTTTTT TGAAACTTTA ATAACGTAAC ACTGAATATT 1080
AATTTGTTGG TTTTTTTAAC TTTGAGTCTT TGCTTTTGGT TTATGCAGAA AC 1132
(2) INFORMATION FOR SEQ ID NO: 31:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1135 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: Genomic DNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 31:
TTATTAACGT TAAATCTTCA TCCCCCCCTA CGTCAGCCAG CTCAAGGTCC CTTTCTTCTT 60
CCATTTCTTC TCATTTTTAC GTTGTTTTCA ATCTTGGTCT GTTCTTTTCT TATCGCTTTT 120
CTGTTCTATC TATCATTTTT GCATTTCAGT CGATTTAATT CTAGATCTGT TAATATTTAT 180
TGCATTAAAC TACAGATCTG GTCTCGATTC TCTGTTTTCA TGTGTGAAAT CTGATGCTGT 240
CTTTACCATT AATCTGCTTA TATTGTATAT ACCGTGGAGA ATATGAAATG TTGCATTTTC 300
ATTTGTCCGA ATACAAACTG TTTGACTTCC AATCGTTTTT AATTATATAT ATTTTTTGAT 360
GGGTTGGTGG AGTTGAAAAA TCACCATAGC AGTCTCACGT CCTGGTTTTA GAAATATCCT 420
TCCTATTCAA AGTTATATAT TTGTTTACTT TTGTTTTAGA TCTGGACCTG AGACATGTAA 480
GTACCTATTT GTTGAATCTT TGGGTAAAAT TTATGTCTCT GGGTAAAATT TGCTGAGAGA 540
TTTGACCGAT TCCTATTGGC TCTGGATTCT GTATACATGA AAAAGTTTCA TTGGCCTATG 600
CTCACGTCAT GCTTACAAAC TTTTCTTTGC AAATTAATTC GATTAGATGC TCCTTCATAG 660
ATTCAGATGC AATAGATTTG CATGAAGAAA ATAATAGGAT TCATGATAGT AAAAAAATTG 720
TACATTTTTT TGTTTGTTTA TGTTTAAAAG TCTATATGTT GACAATAGGG TTGCTATCAA 780
CTGTTTCATT TAGCTTTTTG TTTTTCTCAA GTTGCTTATT CTTAGAGACA TTGTGATTAT 840
GACTTGTCGT CTTTAACGTA GTTTAGTAAT AAAAGACGAA AGAAATTGAT ATCCACAAGA 900
AAGAGATGTG AGCTGTAGCG TATCAAATCT CATTAATAAC TAGTAGTATT CTCAACGCTA 960
TCGTTTATTT CTTTCTTTGG TTTGCCACTA TATGCCGCTT CTCTCCTCTT TATCCCACGT 1020
ACTATCCATT TTTTTTGTGG TAGTCCATTT TTTTGAAGCT TTAATAACGT AACACTGAAT 1080
ATTAATTTGT TGGTTTAATT AACTTTGAGT CTTTGCTTTT GGTTTATGCA GAAAC 1135