AU756359B2 - Materials and methods for the modification of plant lignin content - Google Patents

Materials and methods for the modification of plant lignin content

Info

Publication number
AU756359B2
Authority
AU
Australia
Prior art keywords
seq
plant
sequence
gene
dna construct
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Ceased
Application number
AU57975/01A
Other versions
AU5797501A (en
Inventor
Leonard Nathan Bloksberg
Alistair Wallace Grierson
Ilkka Jaakko Havukkala
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Rubicon Forests Holdings Ltd
ArborGen LLC
Original Assignee
Rubicon Forests Holdings Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Priority claimed from AU44036/97A external-priority patent/AU733388B2/en
Application filed by Rubicon Forests Holdings Ltd filed Critical Rubicon Forests Holdings Ltd
Priority to AU57975/01A priority Critical patent/AU756359B2/en
Publication of AU5797501A publication Critical patent/AU5797501A/en
Assigned to RUBICON FORESTS HOLDINGS LIMITED, GENESIS RESEARCH AND DEVELOPMENT CORPORATION LIMITED reassignment RUBICON FORESTS HOLDINGS LIMITED Alteration of Name(s) of Applicant(s) under S113 Assignors: FLETCHER CHALLENGE FORESTS LIMITED, GENESIS RESEARCH AND DEVELOPMENT CORPORATION LIMITED
Application granted granted Critical
Publication of AU756359B2 publication Critical patent/AU756359B2/en
Priority to AU2003203517A priority patent/AU2003203517B2/en
Assigned to ARBORGEN LLC, RUBICON FORESTS HOLDINGS LIMITED reassignment ARBORGEN LLC Alteration of Name(s) in Register under S187 Assignors: GENESIS RESEARCH AND DEVELOPMENT CORPORATION LIMITED, RUBICON FORESTS HOLDINGS LIMITED
Anticipated expiration legal-status Critical
Ceased legal-status Critical Current

Links

Description

P00011 Regulation 3.2 Revised 2/98
AUSTRALIA
Patents Act, 1990
ORIGINAL
COMPLETE SPECIFICATION
STANDARD PATENT
TO BE COMPLETED BY THE APPLICANT
NAME OF APPLICANT: Genesis Research and Development Corporation Limited and Fletcher Challenge Forests Limited
ACTUAL INVENTORS: Leonard Nathan Bloksberg; Alistair Wallace Grierson and Ilkka Jaakko Havukkala
ADDRESS FOR SERVICE: Peter Maxwell Associates Level 6 Pitt Street SYDNEY NSW 2000
INVENTION TITLE: MATERIALS AND METHODS FOR THE MODIFICATION OF PLANT LIGNIN CONTENT
DETAILS OF ASSOCIATED APPLICATION(S): Divisional of Australian Patent Application No. 44,036/97 (733,388) filed on September 1997
The following statement is a full description of this invention including the best method of performing it known to me:-

MATERIALS AND METHODS FOR THE MODIFICATION OF PLANT LIGNIN CONTENT

Technical Field of the Invention
This invention relates to the field of modification of lignin content and composition in plants. More particularly, this invention relates to enzymes involved in the lignin biosynthetic pathway and nucleotide sequences encoding such enzymes.
Background of the Invention
Lignin is an insoluble polymer which is primarily responsible for the rigidity of plant stems. Specifically, lignin serves as a matrix around the polysaccharide components of some plant cell walls. The higher the lignin content, the more rigid the plant. For example, tree species synthesize large quantities of lignin, with lignin constituting between 20% and 30% of the dry weight of wood. In addition to providing rigidity, lignin aids in water transport within plants by rendering cell walls hydrophobic and water impermeable. Lignin also plays a role in disease resistance of plants by impeding the penetration and propagation of pathogenic agents.
The high concentration of lignin in trees presents a significant problem in the paper industry wherein considerable resources must be employed to separate lignin from the cellulose fiber needed for the production of paper. Methods typically employed for the removal of lignin are highly energy- and chemical-intensive, resulting in increased costs and increased levels of undesirable waste products. In the U.S. alone, about 20 million tons of lignin are removed from wood per year.
Lignin is largely responsible for the digestibility, or lack thereof, of forage crops, with small increases in plant lignin content resulting in relatively high decreases in digestibility. For example, crops with reduced lignin content provide more efficient forage for cattle, with the yield of milk and meat being higher relative to the amount of forage crop consumed. During normal plant growth, the increase in dry matter content is accompanied by a corresponding decrease in digestibility. When deciding on the optimum time to harvest forage crops, farmers must therefore choose between a high yield of less digestible material and a lower yield of more digestible material.
For some applications, an increase in lignin content is desirable since increasing the lignin content of a plant would lead to increased mechanical strength of wood, changes in its color and increased resistance to rot. Mycorrhizal species composition and abundance may also be favorably manipulated by modifying lignin content and structural composition.
As discussed in detail below, lignin is formed by polymerization of at least three different monolignols which are synthesized in a multistep pathway, each step in the pathway being catalyzed by a different enzyme. It has been shown that manipulation of the number of copies of genes encoding certain enzymes, such as cinnamyl alcohol dehydrogenase (CAD) and caffeic acid 3-O-methyltransferase (COMT), results in modification of the amount of lignin produced; see, for example, U.S. Patent No. 5,451,514 and PCT publication no. WO 94/23044. Furthermore, it has been shown that antisense expression of sequences encoding CAD in poplar leads to the production of lignin having a modified composition (Grand, C. et al. Planta (Berl.) 163:232-237 (1985)).
While DNA sequences encoding some of the enzymes involved in the lignin biosynthetic pathway have been isolated for certain species of plants, genes encoding many of the enzymes in a wide range of plant species have not yet been identified.
Thus there remains a need in the art for materials useful in the modification of lignin content and composition in plants and for methods for their use.
Summary of the Invention
Briefly, the present invention provides isolated DNA sequences obtainable from eucalyptus and pine which encode enzymes involved in the lignin biosynthetic pathway, DNA constructs including such sequences, and methods for the use of such constructs. Transgenic plants having altered lignin content and composition are also provided.
In a first aspect, the present invention provides isolated DNA sequences coding for the following enzymes isolated from eucalyptus and pine: cinnamate 4-hydroxylase (C4H), coumarate 3-hydroxylase (C3H), phenolase (PNL), O-methyl transferase (OMT), cinnamyl alcohol dehydrogenase (CAD), cinnamoyl-CoA reductase (CCR), phenylalanine ammonia-lyase (PAL), 4-coumarate:CoA ligase (4CL), coniferol glucosyl transferase (CGT), coniferin beta-glucosidase (CBG), laccase (LAC) and peroxidase (POX), together with ferulate-5-hydroxylase (F5H) from eucalyptus. In one embodiment, the isolated DNA sequences comprise a nucleotide sequence selected from the group consisting of: (a) sequences recited in SEQ ID NO: 3, 13, 16-70, and 72-88; (b) complements of the sequences recited in SEQ ID NO: 3, 13, 16-70, and 72-88; (c) reverse complements of the sequences recited in SEQ ID NO: 3, 13, 16-70, and 72-88; (d) reverse sequences of the sequences recited in SEQ ID NO: 3, 13, 16-70, and 72-88; and (e) sequences having at least about a 99% probability of being the same as a sequence of (a)-(d) as measured by the computer algorithm FASTA.

In a preferred embodiment, the isolated DNA sequences comprise a nucleotide sequence selected from the group consisting of: nucleotides 1-535 of SEQ ID NO: 1; nucleotides 46-671 of SEQ ID NO: 2; nucleotides 1-535 of SEQ ID NO: 3; nucleotides 290-949 of SEQ ID NO: 4; nucleotides 15-959 of SEQ ID NO: 5; nucleotides 15-1026 of SEQ ID NO: 6; nucleotides 15-1454 of SEQ ID NO: 7; nucleotides 15-740 of SEQ ID NO: 8; nucleotides 108-624 of SEQ ID NO: 9; nucleotides 68-274 of SEQ ID NO: 10; nucleotides 1-765 of SEQ ID NO: 11; nucleotides 1-384 of SEQ ID NO: 12; nucleotides 1-278 of SEQ ID NO: 13; nucleotides 14-472 of SEQ ID NO: 16; nucleotides 1-672 of SEQ ID NO: 17; nucleotides 15-469 of SEQ ID NO: 18; nucleotides 15-469 of SEQ ID NO: 19; nucleotides 1-341 of SEQ ID NO: 20; nucleotides 15-387 of SEQ ID NO: 21; nucleotides 1-443 of SEQ ID NO: 22; nucleotides 15-607 of SEQ ID NO: 23; nucleotides 15-421 of SEQ ID NO: 24; nucleotides 1-760 of SEQ ID NO: 25; nucleotides 58-469 of SEQ ID NO: 26; nucleotides 15-495 of SEQ ID NO: 27; nucleotides 15-472 of SEQ ID NO: 28; nucleotides 13-396 of SEQ ID NO: 29; nucleotides 15-592 of SEQ ID NO: 30; nucleotides 15-468 of SEQ ID NO: 31; nucleotides 1-405 of SEQ ID NO: 32; nucleotides 1-380 of SEQ ID NO: 33; nucleotides 1-305 of SEQ ID NO: 34; nucleotides 15-693 of SEQ ID NO: 35; nucleotides 1-418 of SEQ ID NO: 36; nucleotides 15-777 of SEQ ID NO: 37; nucleotides 1-344 of SEQ ID NO: 38; nucleotides 1-341 of SEQ ID NO: 39; nucleotides 15-358 of SEQ ID NO: 40; nucleotides 1-409 of SEQ ID NO: 41; nucleotides 1-515 of SEQ ID NO: 42; nucleotides 15-471 of SEQ ID NO: 43; nucleotides 15-487 of SEQ ID NO: 44; nucleotides 108-664 of SEQ ID NO: 45; nucleotides 15-418 of SEQ ID NO: 46; nucleotides 65-479 of SEQ ID NO: 47; nucleotides 127-1785 of SEQ ID NO: 48; nucleotides 15-475 of SEQ ID NO: 49; nucleotides 288-801 of SEQ ID NO: 50; nucleotides 51-711 of SEQ ID NO: 51; nucleotides 1-426 of SEQ ID NO: 52; nucleotides 92-562 of SEQ ID NO: 53; nucleotides 1-1074 of SEQ ID NO: 54; nucleotides 1-1075 of SEQ ID NO: 55; nucleotides 1-1961 of SEQ ID NO: 56; nucleotides 1-1010 of SEQ ID NO: 57; nucleotides 15-741 of SEQ ID NO: 58; nucleotides 1-643 of SEQ ID NO: 59; nucleotides 15-441 of SEQ ID NO: 60; nucleotides 15-913 of SEQ ID NO: 61; nucleotides 15-680 of SEQ ID NO: 62; nucleotides 15-492 of SEQ ID NO: 63; nucleotides 15-524 of SEQ ID NO: 64; nucleotides 1-417 of SEQ ID NO: 65; nucleotides 1-511 of SEQ ID NO: 66; nucleotides 176-609 of SEQ ID NO: 67; nucleotides 1-474 of SEQ ID NO: 68; nucleotides 1-474 of SEQ ID NO: 69; nucleotides 176-608 of SEQ ID NO: 70; nucleotides 15-1474 of SEQ ID NO: 71; nucleotides 15-1038 of SEQ ID NO: 72; nucleotides 1-372 of SEQ ID NO: 73; nucleotides 18-545 of SEQ ID NO: 74; nucleotides 46-463 of SEQ ID NO: 75; nucleotides 32-435 of SEQ ID NO: 76; nucleotides 15-451 of SEQ ID NO: 77; nucleotides 1-374 of SEQ ID NO: 78; nucleotides 1-457 of SEQ ID NO: 79; nucleotides 1-346 of SEQ ID NO: 80; nucleotides 15-957 of SEQ ID NO: 81; nucleotides 40-452 of SEQ ID NO: 82; nucleotides 15-471 of SEQ ID NO: 83; nucleotides 1-338 of SEQ ID NO: 84; nucleotides 150-1229 of SEQ ID NO: 85; nucleotides 1-1410 of SEQ ID NO: 86; nucleotides 1-687 of SEQ ID NO: 87; and nucleotides 1-688 of SEQ ID NO: 88.
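Ranges such as "nucleotides 46-671" read naturally as 1-based, inclusive positions. The following is a minimal Python sketch of extracting such a range from a sequence string. The function name `nucleotide_range` and the toy sequence are illustrative assumptions, not part of the specification, and the sketch assumes the 1-based inclusive reading.

```python
def nucleotide_range(sequence: str, start: int, end: int) -> str:
    """Return nucleotides start..end of a sequence (1-based, inclusive).

    Hypothetical helper for illustration; the 1-based inclusive reading
    of "nucleotides M-N" is an assumption.
    """
    if not 1 <= start <= end <= len(sequence):
        raise ValueError(
            f"range {start}-{end} is outside a sequence of length {len(sequence)}"
        )
    # Python slices are 0-based and end-exclusive, hence start - 1.
    return sequence[start - 1:end]


# Toy sequence, not one of the SEQ ID NO entries.
toy = "ATGGCTTCAGGACCTTAA"
fragment = nucleotide_range(toy, 4, 9)
print(fragment, len(fragment))
```

Under this reading a range M-N always contains N - M + 1 nucleotides, so "nucleotides 1-535" would denote a 535-nucleotide fragment.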
In another aspect, the invention provides DNA constructs comprising a DNA sequence of the present invention, either alone, in combination with one or more of the inventive sequences or in combination with one or more known DNA sequences; together with transgenic cells comprising such constructs.
In a related aspect, the present invention provides DNA constructs comprising, in the 5'-3' direction, a gene promoter sequence; an open reading frame coding for at least a functional portion of an enzyme encoded by the inventive DNA sequences or variants thereof; and a gene termination sequence. The open reading frame may be orientated in either a sense or antisense direction. DNA constructs comprising a non-coding region of a gene coding for an enzyme encoded by the above DNA sequences or a nucleotide sequence complementary to a non-coding region, together with a gene promoter sequence and a gene termination sequence, are also provided.
Preferably, the gene promoter and termination sequences are functional in a host plant. Most preferably, the gene promoter and termination sequences are those of the original enzyme genes, but others generally used in the art, such as the Cauliflower Mosaic Virus (CaMV) promoter, with or without enhancers, such as the Kozak sequence or Omega enhancer, and the Agrobacterium tumefaciens nopaline synthase terminator, may be usefully employed in the present invention. Tissue-specific promoters may be employed in order to target expression to one or more desired tissues. In a preferred embodiment, the gene promoter sequence provides for transcription in xylem. The DNA construct may further include a marker for the identification of transformed cells.
In a further aspect, transgenic plant cells comprising the DNA constructs of the present invention are provided, together with plants comprising such transgenic cells, and fruits and seeds of such plants.
In yet another aspect, methods for modulating the lignin content and composition of a plant are provided, such methods including stably incorporating into the genome of the plant a DNA construct of the present invention. In a preferred embodiment, the target plant is a woody plant, preferably selected from the group consisting of eucalyptus and pine species, most preferably from the group consisting of Eucalyptus grandis and Pinus radiata. In a related aspect, a method for producing a plant having altered lignin content is provided, the method comprising transforming a plant cell with a DNA construct of the present invention to provide a transgenic cell, and cultivating the transgenic cell under conditions conducive to regeneration and mature plant growth.
In yet a further aspect, the present invention provides methods for modifying the activity of an enzyme in a plant, comprising stably incorporating into the genome of the plant a DNA construct of the present invention. In a preferred embodiment, the target plant is a woody plant, preferably selected from the group consisting of eucalyptus and pine species, most preferably from the group consisting of Eucalyptus grandis and Pinus radiata.
The above-mentioned and additional features of the present invention and the manner of obtaining them will become apparent, and the invention will be best understood by reference to the following more detailed description, read in conjunction with the accompanying drawing.
Brief Description of the Figures
Fig. 1 is a schematic overview of the lignin biosynthetic pathway.
Detailed Description
Lignin is formed by polymerization of at least three different monolignols, primarily para-coumaryl alcohol, coniferyl alcohol and sinapyl alcohol. While these three types of lignin subunits are well known, it is possible that slightly different variants of these subunits may be involved in the lignin biosynthetic pathway in various plants. The relative concentration of these residues in lignin varies between different plant species and within species. In addition, the composition of lignin may also vary between different tissues within a specific plant. The three monolignols are derived from phenylalanine in a multistep process and are believed to be polymerized into lignin by a free radical mechanism.
Fig. 1 shows the different steps in the biosynthetic pathway for coniferyl alcohol together with the enzymes responsible for catalyzing each step. para-Coumaryl alcohol and sinapyl alcohol are synthesized by similar pathways. Phenylalanine is first deaminated by phenylalanine ammonia-lyase (PAL) to give cinnamate, which is then hydroxylated by cinnamate 4-hydroxylase (C4H) to form p-coumarate. p-Coumarate is hydroxylated by coumarate 3-hydroxylase (C3H) to give caffeate. The newly added hydroxyl group is then methylated by O-methyl transferase (OMT) to give ferulate, which is conjugated to coenzyme A by 4-coumarate:CoA ligase (4CL) to form feruloyl-CoA.
Reduction of feruloyl-CoA to coniferaldehyde is catalyzed by cinnamoyl-CoA reductase (CCR). Coniferaldehyde is further reduced by the action of cinnamyl alcohol dehydrogenase (CAD) to give coniferyl alcohol which is then converted into its glucosylated form for export from the cytoplasm to the cell wall by coniferol glucosyl transferase (CGT). Following export, the de-glucosylated form of coniferyl alcohol is obtained by the action of coniferin beta-glucosidase (CBG). Finally, polymerization of the three monolignols to provide lignin is catalyzed by phenolase (PNL), laccase (LAC) and peroxidase (POX).
The formation of sinapyl alcohol involves an additional enzyme, ferulate-5-hydroxylase (F5H). For a more detailed review of the lignin biosynthetic pathway, see: Whetten, R. and Sederoff, R., The Plant Cell, 7:1001-1013 (1995).
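The coniferyl alcohol pathway described above can be written down as an ordered list of (substrate, enzyme, product) steps. The Python sketch below is illustrative only: the data structure and the helper `products_downstream_of` are assumptions made for this example, and "coniferin" is used as the name of the glucosylated form of coniferyl alcohol.

```python
# Ordered steps of the coniferyl alcohol pathway as described in the text.
# Each entry is (substrate, enzyme abbreviation, product).
CONIFERYL_ALCOHOL_PATHWAY = [
    ("phenylalanine", "PAL", "cinnamate"),
    ("cinnamate", "C4H", "p-coumarate"),
    ("p-coumarate", "C3H", "caffeate"),
    ("caffeate", "OMT", "ferulate"),
    ("ferulate", "4CL", "feruloyl-CoA"),
    ("feruloyl-CoA", "CCR", "coniferaldehyde"),
    ("coniferaldehyde", "CAD", "coniferyl alcohol"),
    # Glucosylation for export to the cell wall, then de-glucosylation.
    ("coniferyl alcohol", "CGT", "coniferin"),
    ("coniferin", "CBG", "coniferyl alcohol (cell wall)"),
]


def products_downstream_of(enzyme: str) -> list[str]:
    """List the products formed at and after the step catalyzed by `enzyme`.

    Hypothetical helper: it only walks the linear list above and says
    nothing about branch pathways or regulation.
    """
    enzymes = [step[1] for step in CONIFERYL_ALCOHOL_PATHWAY]
    if enzyme not in enzymes:
        raise KeyError(f"{enzyme} is not a step in this pathway")
    first = enzymes.index(enzyme)
    return [product for _, _, product in CONIFERYL_ALCOHOL_PATHWAY[first:]]


print(products_downstream_of("CCR"))
```

Polymerization by PNL, LAC and POX, and the F5H step towards sinapyl alcohol, are left out because the text does not give their individual substrates and products.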
Quantitative and qualitative modifications in plant lignin content are known to be induced by external factors such as light stimulation, low calcium levels and mechanical stress. Synthesis of new types of lignins, sometimes in tissues not normally lignified, can also be induced by infection with pathogens. In addition to lignin, several other classes of plant products are derived from phenylalanine, including flavonoids, coumarins, stilbenes and benzoic acid derivatives, with the initial steps in the synthesis of all these compounds being the same. Thus modification of the action of PAL, C4H and 4CL may affect the synthesis of other plant products in addition to lignin.
Using the methods and materials of the present invention, the lignin content of a plant can be increased by incorporating additional copies of genes encoding enzymes involved in the lignin biosynthetic pathway into the genome of the target plant. Similarly, a decrease in lignin content can be obtained by transforming the target plant with antisense copies of such genes. In addition, the number of copies of genes encoding for different enzymes in the lignin biosynthetic pathway can be manipulated to modify the relative amount of each monolignol synthesized, thereby leading to the formation of lignin having altered composition. The alteration of lignin composition would be advantageous, for example, in tree processing for paper, and may also be effective in altering the palatability of wood materials to rotting fungi.
In one embodiment, the present invention provides isolated complete or partial DNA sequences encoding, or partially encoding, enzymes involved in the lignin biosynthetic pathway, the DNA sequences being obtainable from eucalyptus and pine. Specifically, the present invention provides isolated DNA sequences encoding the enzymes CAD (SEQ ID NO: 1, in particular nucleotides 1-535; and SEQ ID NO: 30, in particular nucleotides 15-592), PAL (SEQ ID NO: 16, in particular nucleotides 14-472), C4H (SEQ ID NO: 17, in particular nucleotides 1-672), C3H (SEQ ID NO: 18, in particular nucleotides 15-469), F5H (SEQ ID NO: 19, in particular nucleotides 15-469; SEQ ID NO: 20, in particular nucleotides 1-341; and SEQ ID NO: 21, in particular nucleotides 15-387), OMT (SEQ ID NO: 22, in particular nucleotides 1-443; SEQ ID NO: 23, in particular nucleotides 15-607; SEQ ID NO: 24, in particular nucleotides 15-421; and SEQ ID NO: 25, in particular nucleotides 1-760), CCR (SEQ ID NO: 26, in particular nucleotides 58-469; SEQ ID NO: 27, in particular nucleotides 15-495; SEQ ID NO: 28, in particular nucleotides 15-472; and SEQ ID NO: 29, in particular nucleotides 13-396), CGT (SEQ ID NO: 31, in particular nucleotides 15-468; SEQ ID NO: 32, in particular nucleotides 1-405; and SEQ ID NO: 33, in particular nucleotides 1-380), CBG (SEQ ID NO: 34, in particular nucleotides 1-305), PNL (SEQ ID NO: 35, in particular nucleotides 15-693; and SEQ ID NO: 36, in particular nucleotides 1-418), LAC (SEQ ID NO: 37, in particular nucleotides 15-777; SEQ ID NO: 38, in particular nucleotides 1-344; SEQ ID NO: 39, in particular nucleotides 1-341; SEQ ID NO: 40, in particular nucleotides 15-358; and SEQ ID NO: 41, in particular nucleotides 1-409) and POX (SEQ ID NO: 42, in particular nucleotides 1-515; SEQ ID NO: 43, in particular nucleotides 15-571; and SEQ ID NO: 44, in particular nucleotides 15-487) from Eucalyptus grandis; and the enzymes C4H (SEQ ID NO: 2, in particular nucleotides 46-671; SEQ ID NO: 3, in particular nucleotides 1-535; SEQ ID NO: 48, in particular nucleotides 127-1785; and SEQ ID NO: 49, in particular nucleotides 15-475), C3H (SEQ ID NO: 4, in particular nucleotides 290-949; SEQ ID NO: 50, in particular nucleotides 288-801; SEQ ID NO: 51, in particular nucleotides 51-711; and SEQ ID NO: 52, in particular nucleotides 1-426), PNL (SEQ ID NO: 5, in particular nucleotides 15-959; and SEQ ID NO: 81, in particular nucleotides 15-957), OMT (SEQ ID NO: 6, in particular nucleotides 15-1026; SEQ ID NO: 53, in particular nucleotides 92-562; SEQ ID NO: 54, in particular nucleotides 1-1074; and SEQ ID NO: 55, in particular nucleotides 1-1075), CAD (SEQ ID NO: 7, in particular nucleotides 15-1454; and SEQ ID NO: 71, in particular nucleotides 15-1474), CCR (SEQ ID NO: 8, in particular nucleotides 15-740; SEQ ID NO: 58, in particular nucleotides 15-741; SEQ ID NO: 59, in particular nucleotides 1-643; SEQ ID NO: 60, in particular nucleotides 15-441; SEQ ID NO: 61, in particular nucleotides 15-913; SEQ ID NO: 62, in particular nucleotides 15-680; SEQ ID NO: 63, in particular nucleotides 15-492; SEQ ID NO: 64, in particular nucleotides 15-524; SEQ ID NO: 65, in particular nucleotides 1-417; SEQ ID NO: 66, in particular nucleotides 1-511; SEQ ID NO: 67, in particular nucleotides 176-609; SEQ ID NO: 68, in particular nucleotides 1-474; SEQ ID NO: 69, in particular nucleotides 1-474; and SEQ ID NO: 70, in particular nucleotides 176-608), PAL (SEQ ID NO: 9, in particular nucleotides 108-624; SEQ ID NO: 10, in particular nucleotides 68-274; SEQ ID NO: 11, in particular nucleotides 1-765; SEQ ID NO: 45, in particular nucleotides 108-664; SEQ ID NO: 46, in particular nucleotides 15-418; and SEQ ID NO: 47, in particular nucleotides 65-479), 4CL (SEQ ID NO: 12, in particular nucleotides 1-384; SEQ ID NO: 56, in particular nucleotides 1-1961; and SEQ ID NO: 57, in particular nucleotides 1-1010), CGT (SEQ ID NO: 72, in particular nucleotides 15-1038), CBG (SEQ ID NO: 73, in particular nucleotides 1-372; SEQ ID NO: 74, in particular nucleotides 18-545; SEQ ID NO: 75, in particular nucleotides 40-463; SEQ ID NO: 76, in particular nucleotides 32-435; SEQ ID NO: 77, in particular nucleotides 15-451; SEQ ID NO: 78, in particular nucleotides 1-374; SEQ ID NO: 79, in particular nucleotides 1-457; and SEQ ID NO: 80, in particular nucleotides 1-346), LAC (SEQ ID NO: 82, in particular nucleotides 40-452; SEQ ID NO: 83, in particular nucleotides 15-471; and SEQ ID NO: 84, in particular nucleotides 1-338) and POX (SEQ ID NO: 13, in particular nucleotides 1-278; SEQ ID NO: 85, in particular nucleotides 150-1229; SEQ ID NO: 86, in particular nucleotides 1-1410; SEQ ID NO: 87, in particular nucleotides 1-687; and SEQ ID NO: 88, in particular nucleotides 1-688) from Pinus radiata. Complements of such isolated DNA sequences, reverse complements of such isolated DNA sequences and reverse sequences of such isolated DNA sequences, together with variants of such sequences, are also provided. DNA sequences encompassed by the present invention include cDNA, genomic DNA, recombinant DNA and wholly or partially chemically synthesized DNA molecules.
The definition of the terms "complement", "reverse complement" and "reverse sequence", as used herein, is best illustrated by the following example. For the sequence 5' AGGACC 3', the complement, reverse complement and reverse sequence are as follows:
complement 3' TCCTGG 5'
reverse complement 3' GGTCCT 5'
reverse sequence 5' CCAGGA 3'.
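The three definitions above can be reproduced with a few lines of Python. This is a minimal sketch, assuming uppercase A/C/G/T strings and returning each result written left to right as it appears in the example. The function names are illustrative and not taken from the specification.

```python
# Watson-Crick base pairing for uppercase DNA letters.
_PAIRING = str.maketrans("ACGT", "TGCA")


def complement(seq: str) -> str:
    """Pair each base with its partner, keeping the written order."""
    return seq.translate(_PAIRING)


def reverse_sequence(seq: str) -> str:
    """Write the same bases in the opposite order."""
    return seq[::-1]


def reverse_complement(seq: str) -> str:
    """Complement the sequence, then reverse the written order."""
    return complement(seq)[::-1]


example = "AGGACC"
print("complement        ", complement(example))
print("reverse complement", reverse_complement(example))
print("reverse sequence  ", reverse_sequence(example))
```

For the example sequence AGGACC this gives TCCTGG, GGTCCT and CCAGGA, matching the letters listed in the text.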
As used herein, the term "variant" covers any sequence which exhibits at least about 50%, more preferably at least about 70% and, more preferably yet, at least about 90% identity to a sequence of the present invention. Most preferably, a "variant" is any sequence which has at least about a 99% probability of being the same as the inventive sequence. The probability for DNA sequences is measured by the computer algorithm FASTA (version 2.0u4, February 1996; Pearson W. R. et al., Proc. Natl. Acad. Sci., 85:2444-2448, 1988), the probability for translated DNA sequences is measured by the computer algorithm TBLASTX and that for protein sequences is measured by the computer algorithm BLASTP (Altschul, S. F. et al., J. Mol. Biol., 215:403-410, 1990). The term "variants" thus encompasses sequences wherein the probability of finding a match by chance (smallest sum probability) in a database is less than about 1% as measured by any of the above tests.
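To make the identity thresholds concrete, the sketch below computes a naive percent identity between two already aligned, equal-length sequences and maps it to the 50%, 70% and 90% tiers named above. It is an illustration under stated assumptions only: it is not FASTA, TBLASTX or BLASTP, it performs no alignment or probability calculation, and the function names and tier labels are hypothetical.

```python
def percent_identity(a: str, b: str) -> float:
    """Percentage of positions that match in two pre-aligned sequences.

    Assumes both strings are non-empty and of equal length; no gaps or
    alignment are handled here.
    """
    if not a or len(a) != len(b):
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)


def identity_tier(identity: float) -> str:
    """Map a percent identity to the tiers named in the text."""
    if identity >= 90.0:
        return "at least about 90%"
    if identity >= 70.0:
        return "at least about 70%"
    if identity >= 50.0:
        return "at least about 50%"
    return "below about 50%"


# Toy pair differing at one of six positions.
identity = percent_identity("AGGACC", "AGGTCC")
print(round(identity, 1), identity_tier(identity))
```

The hard cut-offs are a simplification, since the text qualifies each threshold with "about".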
Variants of the isolated sequences from other eucalyptus and pine species, as well as from other commercially important species utilized by the lumber industry, are contemplated. These include the following gymnosperms, by way of example: loblolly pine Pinus taeda, slash pine Pinus elliottii, sand pine Pinus clausa, longleaf pine Pinus palustris, shortleaf pine Pinus echinata, ponderosa pine Pinus ponderosa, Jeffrey pine Pinus jeffreyi, red pine Pinus resinosa, pitch pine Pinus rigida, jack pine Pinus banksiana, pond pine Pinus serotina, Eastern white pine Pinus strobus, Western white pine Pinus monticola, sugar pine Pinus lambertiana, Virginia pine Pinus virginiana, lodgepole pine Pinus contorta, Caribbean pine Pinus caribaea, P. pinaster, Calabrian pine P. brutia, Afghan pine P. eldarica, Coulter pine P. coulteri, European pine P. nigra and P. sylvestris; Douglas-fir Pseudotsuga menziesii; the hemlocks, which include Western hemlock Tsuga heterophylla, Eastern hemlock Tsuga canadensis and Mountain hemlock Tsuga mertensiana; the spruces, which include the Norway spruce Picea abies, red spruce Picea rubens, white spruce Picea glauca, black spruce Picea mariana, Sitka spruce Picea sitchensis, Engelmann spruce Picea engelmannii, and blue spruce Picea pungens; redwood Sequoia sempervirens; the true firs, which include the Alpine fir Abies lasiocarpa, silver fir Abies amabilis, grand fir Abies grandis, noble fir Abies procera, white fir Abies concolor, California red fir Abies magnifica, and balsam fir Abies balsamea; the cedars, which include the Western red cedar Thuja plicata, incense cedar Libocedrus decurrens, Northern white cedar Thuja occidentalis, Port Orford cedar Chamaecyparis lawsoniana, Atlantic white cedar Chamaecyparis thyoides, Alaska yellow-cedar Chamaecyparis nootkatensis, and Eastern red cedar Juniperus virginiana; the larches, which include Eastern larch Larix laricina, Western larch Larix occidentalis, European larch Larix decidua, Japanese larch Larix leptolepis, and Siberian larch Larix sibirica; bald cypress Taxodium distichum and Giant sequoia Sequoia gigantea; and the following angiosperms, by way of example: Eucalyptus alba, E. bancroftii, E. botryoides, E. bridgesiana, E. calophylla, E. camaldulensis, E. citriodora, E. cladocalyx, E. coccifera, E. curtisii, E. dalrympleana, E. deglupta, E. delegatensis, E. diversicolor, E. dunnii, E. ficifolia, E. globulus, E. gomphocephala, E. gunnii, E. henryi, E. laevopinea, E. macarthurii, E. macrorhyncha, E. maculata, E. marginata, E. megacarpa, E. melliodora, E. nicholii, E. nitens, E. nova-anglica, E. obliqua, E. obtusiflora, E. oreades, E. pauciflora, E. polybractea, E. regnans, E. resinifera, E. robusta, E. rudis, E. saligna, E. sideroxylon, E. stuartiana, E. tereticornis, E. torelliana, E. urnigera, E. urophylla, E. viminalis, E. viridis, E. wandoo and E. youmanii.
The inventive DNA sequences may be isolated by high throughput sequencing of cDNA libraries such as those prepared from Eucalyptus grandis and Pinus radiata as described below in Examples 1 and 2. Alternatively, oligonucleotide probes based on the sequences provided in SEQ ID NO: 1-13 and 16-88 can be synthesized and used to identify positive clones in either cDNA or genomic DNA libraries from Eucalyptus grandis and Pinus radiata, or from other gymnosperms and angiosperms including those identified above, by means of hybridization or PCR techniques.
Probes can be shorter than the sequences provided herein but should be at least about 10, preferably at least about 15 and most preferably at least about nucleotides in length. Hybridization and PCR techniques suitable for use with such oligonucleotide probes are well known in the art. Positive clones may be analyzed by restriction enzyme digestion, DNA sequencing or the like.
In addition, the DNA sequences of the present invention may be generated by synthetic means using techniques well known in the art. Equipment for automated synthesis of oligonucleotides is commercially available from suppliers such as Perkin Elmer/Applied Biosystems Division (Foster City, CA) and may be operated according to the manufacturer's instructions.
In one embodiment, the DNA constructs of the present invention include an open reading frame coding for at least a functional portion of an enzyme encoded by a nucleotide sequence of the present invention or a variant thereof. As used herein, the "functional portion" of an enzyme is that portion which contains the active site essential for effecting the metabolic step, i.e. the portion of the molecule that is capable of binding one or more reactants or is capable of improving or regulating the rate of reaction. The active site may be made up of separate portions present on one or more polypeptide chains and will generally exhibit high substrate specificity. The term "enzyme encoded by a nucleotide sequence" as used herein, includes enzymes encoded by a nucleotide sequence which includes the partial isolated DNA sequences of the present invention.
For applications where amplification of lignin synthesis is desired, the open reading frame is inserted in the DNA construct in a sense orientation, such that transformation of a target plant with the DNA construct will lead to an increase in the number of copies of the gene and therefore an increase in the amount of enzyme. When down-regulation of lignin synthesis is desired, the open reading frame is inserted in the DNA construct in an antisense orientation, such that the RNA produced by transcription of the DNA sequence is complementary to the endogenous mRNA sequence. This, in turn, will result in a decrease in the number of copies of the gene and therefore a decrease in the amount of enzyme. Alternatively, regulation can be achieved by inserting appropriate sequences or subsequences (e.g. DNA or RNA) in ribozyme constructs.
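The relationship between insert orientation and the resulting transcript can be sketched in Python. The example below is illustrative only and rests on simplifying assumptions: the transcript is modelled as the insert read 5' to 3' downstream of the promoter with T replaced by U, the open reading frame is a toy sequence, and all function names are hypothetical.

```python
_DNA_PAIRING = str.maketrans("ACGT", "TGCA")
_RNA_PAIRING = str.maketrans("ACGU", "UGCA")


def dna_reverse_complement(dna: str) -> str:
    """Reverse complement of an uppercase DNA string."""
    return dna.translate(_DNA_PAIRING)[::-1]


def rna_reverse_complement(rna: str) -> str:
    """Reverse complement of an uppercase RNA string."""
    return rna.translate(_RNA_PAIRING)[::-1]


def oriented_insert(orf: str, orientation: str) -> str:
    """Return the ORF as placed behind the promoter, read 5' to 3'."""
    if orientation == "sense":
        return orf
    if orientation == "antisense":
        return dna_reverse_complement(orf)
    raise ValueError("orientation must be 'sense' or 'antisense'")


def transcribe(insert: str) -> str:
    """Model the transcript as the insert with T replaced by U."""
    return insert.replace("T", "U")


toy_orf = "ATGGCTTCAGGACC"  # toy sequence, not a SEQ ID NO entry
endogenous_mrna = transcribe(oriented_insert(toy_orf, "sense"))
antisense_rna = transcribe(oriented_insert(toy_orf, "antisense"))

# The antisense transcript base-pairs with the mRNA along its full length.
print(rna_reverse_complement(antisense_rna) == endogenous_mrna)
```

In this model the sense insert yields a transcript identical to the endogenous mRNA, while the antisense insert yields its reverse complement, which is what allows the two RNAs to pair.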
In a second embodiment, the inventive DNA constructs comprise a nucleotide sequence including a non-coding region of a gene coding for an enzyme encoded by a DNA sequence of the present invention, or a nucleotide sequence complementary to such a non-coding region. As used herein the term "non-coding region" includes both transcribed sequences which are not translated, and non-transcribed sequences within about 2000 base pairs 5' or 3' of the translated sequences or open reading frames.
Examples of non-coding regions which may be usefully employed in the inventive constructs include introns and 5'-non-coding leader sequences. Transformation of a target plant with such a DNA construct may lead to a reduction in the amount of lignin synthesized by the plant by the process of cosuppression, in a manner similar to that discussed, for example, by Napoli et al. (Plant Cell 2:279-290, 1990) and de Carvalho Niebel et al. (Plant Cell 7:347-358, 1995).
The DNA constructs of the present invention further comprise a gene promoter sequence and a gene termination sequence, operably linked to the DNA sequence to be transcribed, which control expression of the gene. The gene promoter sequence is generally positioned at the 5' end of the DNA sequence to be transcribed, and is employed to initiate transcription of the DNA sequence. Gene promoter sequences are generally found in the 5' non-coding region of a gene but they may exist in introns (Luehrsen, K. Mol. Gen. Genet. 225:81-93, 1991) or in the coding region, as for example in PAL of tomato (Bloksberg, 1991. Studies on the Biology of Phenylalanine Ammonia Lyase and Plant Pathogen Interaction. Ph.D. Thesis. Univ. of California, Davis, University Microfilms International order number 9217564). When the construct includes an open reading frame in a sense orientation, the gene promoter sequence also initiates translation of the open reading frame. For DNA constructs comprising either an open reading frame in an antisense orientation or a non-coding region, the gene promoter sequence consists only of a transcription initiation site having an RNA polymerase binding site.
A variety of gene promoter sequences which may be usefully employed in the DNA constructs of the present invention are well known in the art. The promoter gene sequence, and also the gene termination sequence, may be endogenous to the target plant host or may be exogenous, provided the promoter is functional in the target host.
For example, the promoter and termination sequences may be from other plant species, plant viruses, bacterial plasmids and the like. Preferably, gene promoter and termination sequences are from the inventive sequences themselves.
Factors influencing the choice of promoter include the desired tissue specificity of the construct, and the timing of transcription and translation. For example, constitutive promoters, such as the 35S Cauliflower Mosaic Virus (CaMV) promoter, will affect the activity of the enzyme in all parts of the plant. Use of a tissue specific promoter will result in production of the desired sense or antisense RNA only in the tissue of interest. With DNA constructs employing inducible gene promoter sequences, the rate of RNA polymerase binding and initiation can be modulated by external stimuli, such as light, heat, anaerobic stress, alteration in nutrient conditions and the like. Temporally regulated promoters can be employed to effect modulation of the rate of RNA polymerase binding and initiation at a specific time during development of a transformed cell. Preferably, the original promoters from the enzyme gene in question, or promoters from a specific tissue-targeted gene in the organism to be transformed, such as eucalyptus or pine, are used. Other examples of gene promoters which may be usefully employed in the present invention include mannopine synthase (mas), octopine synthase (ocs) and those reviewed by Chua et al. (Science, 244:174-181, 1989).
The gene termination sequence, which is located 3' to the DNA sequence to be transcribed, may come from the same gene as the gene promoter sequence or may be from a different gene. Many gene termination sequences known in the art may be usefully employed in the present invention, such as the 3' end of the Agrobacterium tumefaciens nopaline synthase gene. However, preferred gene terminator sequences are those from the original enzyme gene or from the target species to be transformed.
The DNA constructs of the present invention may also contain a selection marker that is effective in plant cells, to allow for the detection of transformed cells containing the inventive construct. Such markers, which are well known in the art, typically confer resistance to one or more toxins. One example of such a marker is the NPTII gene whose expression results in resistance to kanamycin or hygromycin, antibiotics which are usually toxic to plant cells at a moderate concentration (Rogers et al. in Methods for Plant Molecular Biology, A. Weissbach and H. Weissbach, eds., Academic Press Inc., San Diego, CA (1988)). Alternatively, the presence of the desired construct in transformed cells can be determined by means of other techniques well known in the art, such as Southern and Western blots.
Techniques for operatively linking the components of the inventive DNA constructs are well known in the art and include the use of synthetic linkers containing one or more restriction endonuclease sites as described, for example, by Maniatis et al., (Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Laboratories, Cold Spring Harbor, NY, 1989). The DNA construct of the present invention may be linked to a vector having at least one replication system, for example, E. coli, whereby after each manipulation, the resulting construct can be cloned and sequenced and the correctness of the manipulation determined.
The DNA constructs of the present invention may be used to transform a variety of plants, both monocotyledonous (e.g. grasses, corn, grains, oat, wheat and barley), dicotyledonous (e.g. Arabidopsis, tobacco, legumes, alfalfa, oaks, eucalyptus, maple), and Gymnosperms (e.g. Scots pine (Aronen, Finnish Forest Res. Papers, vol. 595, 1996), white spruce (Ellis et al., Biotechnology 11:94-92, 1993), larch (Huang et al., In Vitro Cell 27:201-207, 1991)). In a preferred embodiment, the inventive DNA constructs are employed to transform woody plants, herein defined as a tree or shrub whose stem lives for a number of years and increases in diameter each year by the addition of woody tissue. Preferably the target plant is selected from the group consisting of eucalyptus and pine species, most preferably from the group consisting of Eucalyptus grandis and Pinus radiata. As discussed above, transformation of a plant with a DNA construct including an open reading frame coding for an enzyme encoded by an inventive DNA sequence wherein the open reading frame is orientated in a sense direction will lead to an increase in lignin content of the plant or, in some cases, to a decrease by cosuppression. Transformation of a plant with a DNA construct comprising an open reading frame in an antisense orientation or a non-coding (untranslated) region of a gene will lead to a decrease in the lignin content of the transformed plant.
Techniques for stably incorporating DNA constructs into the genome of target plants are well known in the art and include Agrobacterium tumefaciens mediated introduction, electroporation, protoplast fusion, injection into reproductive organs, injection into immature embryos, high velocity projectile introduction and the like. The choice of technique will depend upon the target plant to be transformed. For example, dicotyledonous plants and certain monocots and gymnosperms may be transformed by Agrobacterium Ti plasmid technology, as described, for example by Bevan (Nucl. Acid Res. 12:8711-8721, 1984). Targets for the introduction of the DNA constructs of the present invention include tissues, such as leaf tissue, disseminated cells, protoplasts, seeds, embryos, meristematic regions, cotyledons, hypocotyls, and the like. One preferred method for transforming eucalyptus and pine is a biolistic method using pollen (see, for example, Aronen 1996, Finnish Forest Res. Papers vol. 595, 53 pp) or easily regenerable embryonic tissues. Other transformation techniques which may be usefully employed in the inventive methods include those taught by Ellis et al. (Plant Cell Reports, 8:16-20, 1989), Wilson et al. (Plant Cell Reports 7:704-707, 1989) and Tautorus et al. (Theor. Appl. Genet. 28:531-536, 1989).
Once the cells are transformed, cells having the inventive DNA construct incorporated in their genome may be selected by means of a marker, such as the kanamycin resistance marker discussed above. Transgenic cells may then be cultured in an appropriate medium to regenerate whole plants, using techniques well known in the art. In the case of protoplasts, the cell wall is allowed to reform under appropriate osmotic conditions. In the case of seeds or embryos, an appropriate germination or callus initiation medium is employed. For explants, an appropriate regeneration medium is used. Regeneration of plants is well established for many species. For a review of regeneration of forest trees see Dunstan et al., Somatic embryogenesis in woody plants. In: Thorpe, T.A. ed., 1995: in vitro embryogenesis of plants. Vol. 20 in Current Plant Science and Biotechnology in Agriculture, Chapter 12, pp. 471-540.
Specific protocols for the regeneration of spruce are discussed by Roberts et al., (Somatic Embryogenesis of Spruce. In: Synseed. Applications of synthetic seed to crop improvement. Redenbaugh, ed. CRC Press, Chapter 23, pp. 427-449, 1993). The resulting transformed plants may be reproduced sexually or asexually, using methods well known in the art, to give successive generations of transgenic plants.
As discussed above, the production of RNA in target plant cells can be controlled by choice of the promoter sequence, or by selecting the number of functional copies or the site of integration of the DNA sequences incorporated into the genome of the target plant host. A target plant may be transformed with more than one DNA construct of the present invention, thereby modulating the lignin biosynthetic pathway for the activity of more than one enzyme, affecting enzyme activity in more than one tissue or affecting enzyme activity at more than one expression time. Similarly, a DNA construct may be assembled containing more than one open reading frame coding for an enzyme encoded by a DNA sequence of the present invention or more than one non-coding region of a gene coding for such an enzyme. The DNA sequences of the present invention may also be employed in combination with other known sequences encoding enzymes involved in the lignin biosynthetic pathway. In this manner, it may be possible to add a lignin biosynthetic pathway to a non-woody plant to produce a new woody plant.
The isolated DNA sequences of the present invention may also be employed as probes to isolate DNA sequences encoding enzymes involved in the lignin synthetic pathway from other plant species, using techniques well known to those of skill in the art.
The following examples are offered by way of illustration and not by way of limitation.
Example 1
Isolation and Characterization of cDNA Clones from Eucalyptus grandis
Two Eucalyptus grandis cDNA expression libraries (one from a mixture of various tissues from a single tree and one from leaves of a single tree) were constructed and screened as follows.
mRNA was extracted from the plant tissue using the protocol of Chang et al. (Plant Molecular Biology Reporter 11:113-116 (1993)) with minor modifications. Specifically, samples were dissolved in CPC-RNAXB (100 mM Tris-Cl, pH 8.0; mM EDTA; 2.0 M NaCl; 2% CTAB; 2% PVP and 0.05% Spermidine*3 HCl) and extracted with chloroform:isoamyl alcohol, 24:1. mRNA was precipitated with ethanol and the total RNA preparate was purified using a Poly(A) Quik mRNA Isolation Kit (Stratagene, La Jolla, CA). A cDNA expression library was constructed from the purified mRNA by reverse transcriptase synthesis followed by insertion of the resulting cDNA clones in Lambda ZAP using a ZAP Express cDNA Synthesis Kit (Stratagene), according to the manufacturer's protocol. The resulting cDNAs were packaged using a Gigapack II Packaging Extract (Stratagene) employing 1 µl of sample DNA from the µl ligation mix. Mass excision of the library was done using XL1-Blue MRF' cells and XLOLR cells (Stratagene) with ExAssist helper phage (Stratagene). The excised phagemids were diluted with NZY broth (Gibco BRL, Gaithersburg, MD) and plated out onto LB-kanamycin agar plates containing X-gal and isopropylthio-beta-galactoside (IPTG).
Of the colonies plated and picked for DNA miniprep, 99% contained an insert suitable for sequencing. Positive colonies were cultured in NZY broth with kanamycin and cDNA was purified by means of alkaline lysis and polyethylene glycol (PEG) precipitation. Agarose gel at 1% was used to screen sequencing templates for chromosomal contamination. Dye primer sequences were prepared using a Turbo Catalyst 800 machine (Perkin Elmer/Applied Biosystems, Foster City, CA) according to the manufacturer's protocol.
DNA sequence for positive clones was obtained using an Applied Biosystems Prism 377 sequencer. cDNA clones were sequenced first from the 5' end and, in some cases, also from the 3' end. For some clones, internal sequence was obtained using subcloned fragments. Subcloning was performed using standard procedures of restriction mapping and subcloning to pBluescript II SK+ vector.
The determined cDNA sequence was compared to known sequences in the EMBL database (release 46, March 1996) using the FASTA algorithm of February 1996 (version 2.0u4) (available on the Internet at the ftp site ftp://ftp.virginia.edu/pub/fasta/). Multiple alignments of redundant sequences were used to build up reliable consensus sequences. Based on similarity to known sequences from other plant species, the isolated DNA sequence (SEQ ID NO: 1) was identified as encoding a CAD enzyme.
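The idea of building a consensus from redundant, already aligned reads can be sketched as a simple per-column majority vote. This is a minimal illustration under stated assumptions, not the procedure used in this specification: the reads are assumed to be pre-aligned to equal length with "-" for gaps, ties are reported as N, and the function name and example reads are hypothetical.

```python
from collections import Counter


def consensus(aligned_reads, gap="-"):
    """Majority-vote consensus over pre-aligned reads of equal length.

    Columns containing only gaps are skipped; a tie between the two most
    frequent bases in a column is reported as 'N'.
    """
    if not aligned_reads:
        raise ValueError("at least one read is required")
    length = len(aligned_reads[0])
    if any(len(read) != length for read in aligned_reads):
        raise ValueError("reads must be aligned to the same length")

    result = []
    for column in zip(*aligned_reads):
        counts = Counter(base for base in column if base != gap)
        if not counts:
            continue  # column is all gaps
        ranked = counts.most_common(2)
        top_base, top_count = ranked[0]
        tied = len(ranked) > 1 and ranked[1][1] == top_count
        result.append("N" if tied else top_base)
    return "".join(result)


# Hypothetical aligned reads covering the same region of a clone.
reads = ["ACGT-A", "ACGTTA", "ACCTTA"]
consensus_sequence = consensus(reads)
```

A majority vote of this kind only makes sense once the alignment itself is trusted; the alignment step is outside the scope of the sketch.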
In further studies, using the procedure described above, cDNA sequences encoding the following Eucalyptus grandis enzymes were isolated: PAL (SEQ ID NO: 16); C4H (SEQ ID NO: 17); C3H (SEQ ID NO: 18); F5H (SEQ ID NO: 19-21); OMT (SEQ ID NO: 22-25); CCR (SEQ ID NO: 26-29); CAD (SEQ ID NO: 30); CGT (SEQ ID NO: 31-33); CBG (SEQ ID NO: 34); PNL (SEQ ID NO: 35, 36); LAC (SEQ ID NO: 37-41); and POX (SEQ ID NO: 42-44).
Example 2
Isolation and Characterization of cDNA Clones from Pinus radiata
a) Isolation of cDNA clones by high-throughput screening
A Pinus radiata cDNA expression library was constructed from xylem and screened as described above in Example 1. DNA sequence for positive clones was obtained using forward and reverse primers on an Applied Biosystems Prism 377 sequencer and the determined sequences were compared to known sequences in the database as described above.
Based on similarity to known sequences from other plant species, the isolated DNA sequences were identified as encoding the enzymes C4H (SEQ ID NO: 2 and 3), C3H (SEQ ID NO: ), PNL (SEQ ID NO: ), OMT (SEQ ID NO: ), CAD (SEQ ID NO: ), CCR (SEQ ID NO: ), PAL (SEQ ID NO: 9-11) and 4CL (SEQ ID NO: 12).
In further studies, using the procedure described above, additional cDNA clones encoding the following Pinus radiata enzymes were isolated: PAL (SEQ ID NO: 47); C4H (SEQ ID NO: 48, 49); C3H (SEQ ID NO: 50-52); OMT (SEQ ID NO: 53- ); 4CL (SEQ ID NO: 56, 57); CCR (SEQ ID NO: 58-70); CAD (SEQ ID NO: 71); CGT (SEQ ID NO: 72); CBG (SEQ ID NO: 73-80); PNL (SEQ ID NO: 81); LAC (SEQ ID NO: 82-84); and POX (SEQ ID NO: 85-88).
b) Isolation of cDNA clones by PCR
Two PCR probes, hereinafter referred to as LNB010 and LNB011 (SEQ ID NO: 14 and 15, respectively) were designed based on conserved domains in the following peroxidase sequences previously identified in other species: vanpox, hvupox6, taepox, hvupox1, osapox, ntopox2, ntopox1, lespox, pokpox, luspox, athpox, hrpox, spopox, and tvepox (Genbank accession nos. D11337, M83671, X56011, X58396, X66125, J02979, D11396, X71593, D11102, L07554, M58381, X57564, Z22920, and Z31011, respectively).
RNA was isolated from pine xylem and first strand cDNA was synthesized as described above. This cDNA was subjected to PCR using 4 µM LNB010, 4 µM LNB011, 1 x Kogen's buffer, 0.1 mg/ml BSA, 200 mM dNTP, 2 mM Mg2+ and 0.1 U/µl of Taq polymerase (Gibco BRL). Conditions were 2 cycles of 2 min at 94 °C, 1 min at 55 °C and 1 min at 72 °C; 25 cycles of 1 min at 94 °C, 1 min at 55 °C and 1 min at 72 °C; and 18 cycles of 1 min at 94 °C, 1 min at 55 °C and 3 min at 72 °C in a Stratagene Robocycler. The gene was re-amplified in the same manner. A band of about 200 bp was purified from a TAE agarose gel using a Schleicher & Schuell Elu-Quik DNA purification kit and cloned into a T-tailed pBluescript vector (Marchuk D. et al., Nucleic Acids Res. 19:1154, 1991). Based on similarity to known sequences, the isolated gene (SEQ ID NO: 13) was identified as encoding pine peroxidase (POX).
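The three-stage cycling program described above can be written down as data, which makes the cycle count and total hold time easy to check. The following Python sketch assumes the step temperatures are 94, 55 and 72 degrees C as read from the text; the class and function names are hypothetical, and ramp times between steps are ignored.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stage:
    """One stage of a thermocycler program."""
    cycles: int
    steps: tuple  # pairs of (temperature in degrees C, hold time in minutes)


# Program as read from the text: denature, anneal, extend in each cycle.
PROGRAM = (
    Stage(cycles=2, steps=((94, 2), (55, 1), (72, 1))),
    Stage(cycles=25, steps=((94, 1), (55, 1), (72, 1))),
    Stage(cycles=18, steps=((94, 1), (55, 1), (72, 3))),
)


def total_cycles(program) -> int:
    """Total number of amplification cycles across all stages."""
    return sum(stage.cycles for stage in program)


def total_hold_minutes(program) -> int:
    """Sum of programmed hold times in minutes (ramp times not included)."""
    return sum(
        stage.cycles * sum(minutes for _, minutes in stage.steps)
        for stage in program
    )


cycles = total_cycles(PROGRAM)
hold_minutes = total_hold_minutes(PROGRAM)
```

On this reading the program runs 45 cycles with 173 minutes of programmed hold time; the longer extension in the final stage accounts for most of the difference from a uniform program.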
Example 3
Use of an O-methyltransferase (OMT) Gene to Modify Lignin Biosynthesis
a) Transformation of tobacco plants with a Pinus radiata OMT gene
Sense and anti-sense constructs containing a sequence including the coding region of OMT (SEQ ID NO: 53) from Pinus radiata were inserted into Agrobacterium tumefaciens LBA4301 (provided as a gift by Dr. C. Kado, University of California, Davis, CA) by direct transformation using published methods (see, An G, Ebert PR, Mitra A, Ha SB: Binary Vectors. In: Gelvin SB, Schilperoort RA (eds) Plant Molecular Biology Manual, Kluwer Academic Publishers, Dordrecht (1988)). The presence and integrity of the transgenic constructs were verified by restriction digestion and DNA sequencing.
Tobacco (Nicotiana tabacum cv. Samsun) leaf sections were transformed using the method of Horsch et al. (Science, 227:1229-1231, 1985). Five independent transformed plant lines were established for the sense construct and eight independent transformed plant lines were established for the anti-sense construct for OMT.
Transformed plants containing the appropriate lignin gene construct were verified using Southern blot experiments. A mark in the column labeled "Southern" in Table 1 below indicates that the transformed plant lines were confirmed as independent transformed lines.
b) Expression of Pinus OMT in transformed plants
Total RNA was isolated from each independent transformed plant line created with the OMT sense and anti-sense constructs. The RNA samples were analysed in Northern blot experiments to determine the level of expression of the transgene in each transformed line. The data shown in the column labeled "Northern" in Table 1 shows that the transformed plant lines containing the sense and anti-sense constructs for OMT all exhibited high levels of expression, relative to the background on the Northern blots.
OMT expression in sense plant line number 2 was not measured because the RNA sample showed signs of degradation. There was no detectable hybridisation to RNA samples from empty vector-transformed control plants.
c) Modulation of OMT enzyme activity in transformed plants
The total activity of OMT enzyme, encoded by the Pinus OMT gene and by the endogenous tobacco OMT gene, in transformed tobacco plants was analysed for each transformed plant line created with the OMT sense and anti-sense constructs. Crude protein extracts were prepared from each transformed plant and assayed using the method of Zhang et al. (Plant Physiol., 113:65-74, 1997). The data contained in the column labeled "Enzyme" in Table 1 shows that the transformed plant lines containing the OMT sense construct generally had elevated OMT enzyme activity, with a maximum of 199%, whereas the transformed plant lines containing the OMT anti-sense construct generally had reduced OMT enzyme activity, with a minimum of relative to empty vector-transformed control plants. OMT enzyme activity was not estimated in sense plant line number 3.
d) Effects of Pinus OMT on lignin concentration in transformed plants
The concentration of lignin in the transformed tobacco plants was determined using the well-established procedure of thioglycolic acid extraction (see, Freudenberg et al. in "Constitution and Biosynthesis of Lignin", Springer-Verlag, Berlin, 1968).
Briefly, whole tobacco plants, of an average age of 38 days, were frozen in liquid nitrogen and ground to a fine powder in a mortar and pestle. 100 mg of frozen powder from one empty vector-transformed control plant line, the five independent transformed plant lines containing the sense construct for OMT and the eight independent transformed plant lines containing the anti-sense construct for OMT were extracted individually with methanol, followed by 10% thioglycolic acid and finally dissolved in 1 M NaOH. The final extracts were assayed for absorbance at 280 nm. The data shown in the column labelled "TGA" in Table 1 shows that the transformed plant lines containing the sense and the anti-sense OMT gene constructs all exhibited significantly decreased levels of lignin, relative to the empty vector-transformed control plant lines.
Table 1

plant line   transgene   orientation   Northern
             control     na            blank
1            OMT         sense         2.9E+6
2            OMT         sense         na
3            OMT         sense         4.1E+6
4            OMT         sense         2.3E+6
5            OMT         sense         3.6E+5
1            OMT         anti-sense    1.6E+4
2            OMT         anti-sense    5.7E+3
3            OMT         anti-sense    8.0E+3
4            OMT         anti-sense    1.4E+4
5            OMT         anti-sense    2.5E+4
6            OMT         anti-sense    2.5E+4
7            OMT         anti-sense    2.5E+4
8            OMT         anti-sense    1.1E+4

(The "Southern", "Enzyme" and "TGA" columns of Table 1 are not legible in the source.)

These data clearly indicate that lignin concentration, as measured by the TGA assay, can be directly manipulated by either sense or anti-sense expression of a lignin biosynthetic gene such as OMT.
Example 4
Use of a 4-Coumarate:CoA ligase (4CL) Gene to Modify Lignin Biosynthesis
a) Transformation of tobacco plants with a Pinus radiata 4CL gene
Sense and anti-sense constructs containing a sequence including the coding region of 4CL (SEQ ID NO: 56) from Pinus radiata were inserted into Agrobacterium tumefaciens LBA4301 by direct transformation as described above. The presence and integrity of the transgenic constructs were verified by restriction digestion and DNA sequencing.
Tobacco (Nicotiana tabacum cv. Samsun) leaf sections were transformed as described above. Five independent transformed plant lines were established for the sense construct and eight independent transformed plant lines were established for the anti-sense construct for 4CL. Transformed plants containing the appropriate lignin gene construct were verified using Southern blot experiments. A mark in the column labeled "Southern" in Table 2 indicates that the transformed plant lines listed were confirmed as independent transformed lines.
b) Expression of Pinus 4CL in transformed plants
Total RNA was isolated from each independent transformed plant line created with the 4CL sense and anti-sense constructs. The RNA samples were analysed in Northern blot experiments to determine the level of expression of the transgene in each transformed line. The data shown in the column labelled "Northern" in Table 2 below shows that the transformed plant lines containing the sense and anti-sense constructs for 4CL all exhibit high levels of expression, relative to the background on the Northern blots. 4CL expression in anti-sense plant line number 1 was not measured because the RNA was not available at the time of the experiment. There was no detectable hybridisation to RNA samples from empty vector-transformed control plants.
c) Modulation of 4CL enzyme activity in transformed plants
The total activity of 4CL enzyme, encoded by the Pinus 4CL gene and by the endogenous tobacco 4CL gene, in transformed tobacco plants was analysed for each transformed plant line created with the 4CL sense and anti-sense constructs. Crude protein extracts were prepared from each transformed plant and assayed using the method of Zhang et al. (Plant Physiol., 113:65-74, 1997). The data contained in the column labeled "Enzyme" in Table 2 shows that the transformed plant lines containing the 4CL sense construct had elevated 4CL enzyme activity, with a maximum of 258%, and the transformed plant lines containing the 4CL anti-sense construct had reduced 4CL enzyme activity, with a minimum of 59%, relative to empty vector-transformed control plants.
d) Effects of Pinus 4CL on lignin concentration in transformed plants
The concentration of lignin in samples of transformed plant material was determined as described in Example 3. The data shown in the column labelled "TGA" in Table 2 shows that the transformed plant lines containing the sense and the anti-sense 4CL gene constructs all exhibited significantly decreased levels of lignin, relative to the empty vector-transformed control plant lines. These data clearly indicate that lignin concentration, as measured by the TGA assay, can be directly manipulated by either sense or anti-sense expression of a lignin biosynthetic gene such as 4CL.
Table 2

plant line   transgene   orientation   Northern
             control     na            blank
             control     na            blank
1            4CL         sense         2.3E+4
2            4CL         sense         4.5E+4
3            4CL         sense         3.1E+4
4            4CL         sense         1.7E+4
5            4CL         sense         1.6E+4
1            4CL         anti-sense    na
2            4CL         anti-sense    1.0E+4
3            4CL         anti-sense    9.6E+3
4            4CL         anti-sense    1.2E+4
5            4CL         anti-sense    4.7E+3
6            4CL         anti-sense    3.9E+3
7            4CL         anti-sense    1.8E+3
8            4CL         anti-sense    1.7E+4

(The "Southern", "Enzyme" and "TGA" columns of Table 2 are not legible in the source.)

Example 5
Transformation of Tobacco using the Inventive Lignin Biosynthetic Genes
Sense and anti-sense constructs containing sequences including the coding regions of C3H (SEQ ID NO: 18), F5H (SEQ ID NO: 19), CCR (SEQ ID NO: 25) and CGT (SEQ ID NO: 31) from Eucalyptus grandis, and PAL (SEQ ID NO: 45 and 47), C4H (SEQ ID NO: 48 and 49), PNL (SEQ ID NO: 81) and LAC (SEQ ID NO: 83) from Pinus radiata were inserted into Agrobacterium tumefaciens LBA4301 by direct transformation as described above. The presence and integrity of the transgenic constructs were verified by restriction digestion and DNA sequencing.
Tobacco (Nicotiana tabacum cv. Samsun) leaf sections were transformed as described in Example 3. Up to twelve independent transformed plant lines were established for each sense construct and each anti-sense construct listed in the preceding paragraph. Transformed plants containing the appropriate lignin gene construct were verified using Southern blot experiments. All of the transformed plant lines analysed were confirmed as independent transformed lines.
Example 6
Manipulation of Lignin Content in Transformed Plants
a) Determination of transgene expression by Northern blot experiments
Total RNA was isolated from each independent transformed plant line described in Example 5. The RNA samples were analysed in Northern blot experiments to determine the level of expression of the transgene in each transformed line. The column labelled "Northern" in Table 3 shows the level of transgene expression for all plant lines assayed, relative to the background on the Northern blots. There was no detectable hybridisation to RNA samples from empty vector-transformed control plants.
b) Determination of lignin concentration in transformed plants
The concentration of lignin in empty vector-transformed control plant lines and in up to twelve independent transformed lines for each sense construct and each anti-sense construct described in Example 5 was determined as described in Example 3. The column labelled "TGA" in Table 3 shows the thioglycolic acid extractable lignins for all plant lines assayed, expressed as the average percentage of TGA extractable lignins in transformed plants versus control plants. The range of variation is shown in parentheses.
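The "average percentage with range" form of reporting can be sketched as follows. This is an illustration only: the absorbance readings are hypothetical, and normalising each transformed line to the mean of the control lines is an assumption about how such percentages are computed, not a statement of the exact procedure used here.

```python
from statistics import mean


def percent_of_control(line_values, control_values):
    """Express each transformed line as a percentage of the control mean.

    Returns (average, lowest, highest), each rounded to a whole percent.
    """
    control_mean = mean(control_values)
    percents = [100.0 * value / control_mean for value in line_values]
    return round(mean(percents)), round(min(percents)), round(max(percents))


# Hypothetical A280 readings of final TGA extracts (not data from Table 3).
control_a280 = [0.50, 0.48, 0.52]
antisense_a280 = [0.30, 0.35, 0.40]

average, lowest, highest = percent_of_control(antisense_a280, control_a280)
summary = f"{average} ({lowest}-{highest})"
```

With these made-up readings the summary string is "70 (60-80)", which has the same shape as the TGA entries reported for each construct.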
Table 3

transgene   orientation   no. of lines   Northern   TGA
control     na            3              blank      100 (92-104)
C3H         sense         5              3.7E+4     74 (67-85)
F5H         sense         10             5.8E+4     70 (63-79)
F5H         anti-sense    9              5.8E+4     73 (35-93)
CCR         sense         1              na         74
CCR         anti-sense    2              na         74 (62-86)
PAL         sense         5              1.9E+5     77 (71-86)
PAL         anti-sense    4              1.5E+4     62 (37-77)
C4H         anti-sense    10             5.8E+4     86 (52-113)
PNL         anti-sense    6              1.2E+4     88 (70-114)
LAC         sense         5              1.7E+5     na
LAC         anti-sense    12             1.7E+5     88 (73-114)

Transformed plant lines containing the sense and the anti-sense lignin biosynthetic gene constructs all exhibited significantly decreased levels of lignin, relative to the empty vector-transformed control plant lines. The most dramatic effects on lignin concentration were seen in the F5H anti-sense plants with as little as 35% of the amount of lignin in control plants, and in the PAL anti-sense plants with as little as 37% of the amount of lignin in control plants. These data clearly indicate that lignin concentration, as measured by the TGA assay, can be directly manipulated by conventional anti-sense methodology and also by sense over-expression using the inventive lignin biosynthetic genes.
Example 7
Modulation of Lignin Enzyme Activity in Transformed Plants
The activities and substrate specificities of selected lignin biosynthetic enzymes were assayed in crude extracts from transformed tobacco plants containing sense and anti-sense constructs for PAL (SEQ ID NO: 45), PNL (SEQ ID NO: 81) and LAC (SEQ ID NO: 83) from Pinus radiata, and CGT (SEQ ID NO: 31) from Eucalyptus grandis.
Enzyme assays were performed using published methods for PAL (Southerton, S.G. and Deverall, Plant Path. 39:223-230, 1990), CGT (Vellekoop, P. et al., FEBS, 330:36-40, 1993), PNL (Espin, C.J. et al., Phytochemistry, 44:17-22, 1997) and LAC (Bao, W. et al., Science, 260:672-674, 1993). The data shown in the column labelled "Enzyme" in Table 4 shows the average enzyme activity from replicate measures for all plant lines assayed, expressed as a percent of enzyme activity in empty vector-transformed control plants. The range of variation is shown in parentheses.
Table 4

transgene   orientation   no. of lines   Enzyme
control     na            3              100
PAL         sense         5              87 (60-124)
PAL         anti-sense    3              53 (38-80)
CGT         anti-sense    1              89
PNL         anti-sense    6              144 (41-279)
LAC         sense         5              78 (16-240)
LAC         anti-sense    11             64 (14-106)

All of the transformed plant lines, except the PNL anti-sense transformed plant lines, showed average lignin enzyme activities which were significantly lower than the activities observed in empty vector-transformed control plants. The most dramatic effects on lignin enzyme activities were seen in the PAL anti-sense transformed plant lines in which all of the lines showed reduced PAL activity and in the LAC anti-sense transformed plant lines which showed as little as 14% of the LAC activity in empty vector-transformed control plant lines.
Example 8
Functional Identification of Lignin Biosynthetic Genes
Sense constructs containing sequences including the coding regions for PAL (SEQ ID NO: 47), OMT (SEQ ID NO: 53), 4CL (SEQ ID NO: 56 and 57) and POX (SEQ ID NO: 86) from Pinus radiata, and OMT (SEQ ID NO: 23 and 24), CCR (SEQ ID NO: 26-28), CGT (SEQ ID NO: 31 and 33) and POX (SEQ ID NO: 42 and 44) from Eucalyptus grandis were inserted into the commercially available protein expression vector, pProEX-1 (Gibco BRL). The resultant constructs were transformed into E. coli XL1-Blue (Stratagene), which were then induced to produce recombinant protein by the addition of IPTG. Purified proteins were produced for the Pinus OMT and 4CL constructs and the Eucalyptus OMT and POX constructs using Ni column chromatography (Janknecht, R. et al., Proc. Natl. Acad. Sci., 88:8972-8976, 1991).
Enzyme assays for each of the purified proteins conclusively demonstrated the expected substrate specificity and enzymatic activity for the genes tested.
The data for two representative enzyme assay experiments, demonstrating the verification of the enzymatic activity of a Pinus radiata 4CL gene (SEQ ID NO: 56) and a Pinus radiata OMT gene (SEQ ID NO: 53), are shown in Table 5. For the 4CL enzyme, one unit equals the quantity of protein required to convert the substrate into product at the rate of 0.1 absorbance units per minute. For the OMT enzyme, one unit equals the quantity of protein required to convert 1 pmole of substrate to product per minute.
Table 5

transgene   purification step   total ml extract   total mg protein   total units activity   % yield activity   fold purification
4CL         crude               10 ml              51 mg              4200                   100                1
4CL         Ni column           4 ml               0.84 mg            3680                   88                 53.2
OMT         crude               10 ml              74 mg              4600                   100                1
OMT         Ni column           4 ml               1.2 mg             4487                   98

The data shown in Table 5 indicate that both the purified 4CL enzyme and the purified OMT enzyme show high activity in enzyme assays, confirming the identification of the 4CL and OMT genes described in this application. Crude protein preparations from E. coli transformed with empty vector show no activity in either the 4CL or the OMT enzyme assay.
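The yield and fold-purification figures in a purification table of this kind follow from the total protein and total activity at each step. The sketch below uses the conventional definitions (specific activity as units per mg protein, fold purification as the ratio of specific activities, yield as the fraction of crude activity recovered); the function name is hypothetical, and applying these definitions to Table 5 is an assumption, although it reproduces the 4CL row.

```python
def purification_summary(crude_mg, crude_units, purified_mg, purified_units):
    """Return yield (%) and fold purification for one purification step.

    Specific activity is taken as total units divided by total mg protein.
    """
    crude_specific = crude_units / crude_mg
    purified_specific = purified_units / purified_mg
    return {
        "yield_percent": 100.0 * purified_units / crude_units,
        "fold_purification": purified_specific / crude_specific,
    }


# Values for the 4CL rows of Table 5 (crude extract, then Ni column).
four_cl = purification_summary(
    crude_mg=51, crude_units=4200, purified_mg=0.84, purified_units=3680
)

# Values for the OMT rows of Table 5.
omt = purification_summary(
    crude_mg=74, crude_units=4600, purified_mg=1.2, purified_units=4487
)
```

For 4CL this gives a yield of about 88% and a fold purification of about 53.2, matching the tabulated values; the OMT yield of about 98% is recovered in the same way.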
Although the present invention has been described in some detail by way of illustration and example for purposes of clarity of understanding, changes and modifications can be carried out without departing from the scope of the invention which is intended to be limited only by the scope of the appended claims.
SEQUENCE LISTING

GENERAL INFORMATION

(i) APPLICANT: Genesis Research and Development Corp. Ltd.
(ii) TITLE OF THE INVENTION: MATERIALS AND METHODS FOR THE MODIFICATION OF PLANT LIGNIN CONTENT
(iii) NUMBER OF SEQUENCES: 88
(iv) CORRESPONDENCE ADDRESS:
    ADDRESSEE: Russell McVeagh West-Walker
    STREET: The Todd Building, Cnr Brandon Street Lambton Quay
    CITY: Wellington
    STATE:
    COUNTRY: New Zealand
    ZIP:
(v) COMPUTER READABLE FORM:
    MEDIUM TYPE: Diskette
    COMPUTER: IBM Compatible
    OPERATING SYSTEM: DOS
    SOFTWARE: Wordperfect 5.1
(vi) CURRENT APPLICATION DATA:
    APPLICATION NUMBER:
    FILING DATE:
    CLASSIFICATION:
(vii) PRIOR APPLICATION DATA:
    APPLICATION NUMBER:
    FILING DATE:
(viii) ATTORNEY/AGENT INFORMATION:
    NAME: Bennett, Michael Roy
    REGISTRATION NUMBER:
    REFERENCE/DOCKET NUMBER: 22315\MRB
(ix) TELECOMMUNICATION INFORMATION:
    TELEPHONE: +64 4 495 7740
    TELEFAX: +64 4 499 9306
    TELEX:
INFORMATION FOR SEQ ID NO:1: SEQUENCE CHARACTERISTICS: LENGTH: 535 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO:1: CTTCGCGCTA CCGCATACTC CACCACCGCG TGCAGAAGAT GAGCTCGGAG GGTGGGAAGG AGGATTGCCT CGGTTGGGCT GCCCGGGACC CTTCTGGGTT CCTCTCCCCN TACAAATTCA 120 CCCGCAGGCC GTGGGAAGCG AAGACGTCTC GATTAAGATC ACGCACTGTG GAGTGTGCTA 180 CGCAGATGTG GCTTGGACTA GGAATGTGCA GGGACACTCC AAGTATCCTC TGGTGCCGGG 240 GCACGAGATA GTTGGAATTG TGAAACAGGT TGGCTCCAGT GTCCAACGCT 'CAAAGTTGG 300 CGATCATGTG GGGGTGGGAA CTTATGTCAA TTCATGCAGA GAGTGCGAGT ATTGCAATGA 360 CAGGCTAGAA GTCCAATGTG AAAAGTCGGT TATGACTTTT GATGGAATTG ATGCAGATGG 420 TACAGTGACA AAGGGAGGAT ATTCTAGTCA CATTGTCGTC CATGAAAGGT ATTGCGTCAG 480 GATTCCAGAA AACTACCCGA TGGATCTAGC AGCGCATTGC TCTGrGCTGG ATCAC 535 INFORMATION FOR SEQ 10 NO:2: SEQUENCE CHARACTERI*STICS: LENGTH: 671 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO:2: GCGCCTGCAG GTCGACACTA GTGGATCCAA AGAATTCGGC ACGAGGTTGC AGGTCGGGGA TGATTTGAAT Cl 120 CAAGATGGGC C~ 180 GCACACCCAG GC 240 GGGCAAGGGG C] 300 GATCATGACT GJ 360 AGACGAGATC AC 420 CATTGTCATC CC 480 GACAGGAGAT TC 540 GAGCGAAGTC G1 600 AGGCCCTTCC TC 660 CTTTTCAAGG A
%CAGAAACC
%GAGGAATC
3CGTCGAGT
~GGACATGG
rGCCTTTCT
CCGCGTGG
;TAGCGCCT
GAATCCGA
TTGGCCCA
AGAGGTTA
TCAGCGATTT
TTGTGGTAGT
TTGGGTCTCG
TGTTCACCGT
TTACGAATAA
TCGCGGATGT
CCAGCTCATG
GGACGACCCG
GAGCTTTGAG
TCACAGAATC
TGCCAAGAAA
TTCATCTCCC
AACCCGGAAC
CTATGGAGAT
AGTTGTCCAG
GAAATCCCGC
ATGTATAATA
CTTTTCCTCA
TACAATTATG
TGCAATGAGA
TATGGCAAAA
GATCTCGCCA
GTGGTGTTCG
CACTGGAGAA
CACTACAGAT
GCCGAGTCTT
TTATGTATAG
AG CTCAAGGC
GGGATTTCAT
TTAAAGAGAA
TCTTTCTGCT
AGGAGGTCCT
ATATCTTCAC
AGATGCGCAG
TCGCGTGGGA
CCACCTCGGG
GATGATGTTC
CCTCAACGGA
TCCCAGTCTT
ACGGCTCTCT
C
C
INFORMATION FOR SEQ ID NO:3: SEQUENCE CHARACTERISTICS: LENGTH: 940 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: CTTCAGGACA AGGGAGAGAT CAATGAGGAT GTTGCAGCAA TTGAGACAAC GCTGTGGTCG 120 CACCAGGACA TTCAGAGCAA GGTGCGCGCA 180 CAGATAACGG AACCAGACAC GACAAGGTTG 240 SEQ ID NO:3: AATGTTTTGT ACATCGTTGA GAACATCAAC ATGGAATGGG GAATAGCGGA GCTGGTGAAC GAGCTGGACG CTGTTCTTGG ACCAGGCGTG CCCTACCTTC AGGCGGTTGT GAAGGAAACC
CTTCGTCTCC
300
CTCGGGGGCT
360
AACAACCCCG
420
GAGAAGCACA
480
AGGAGCTGCC
540
TTCAGAACTT
600
GCGGGCAATT
660
CTGCTTAATC
720
CTCCATCTAT
18 0
CTTCAAAAGT
840
AAGTTTGCAT
900
ATTTTACTGC
940 GCATGGCGAT CCCGTTGCTC ACGATATTCC GGCAGAGAGC CCAACTGGAA GAACCCCGAG CCGAAGCCAA TGGCAACGAC CGGGAATCAT TCTGGCGCTG CCACCTTCTG CCGCCGCCCG CAGCCTTCAC ATTCTCAACC CCAACTTGTC AGTGACTGGT CATGACTGTG TGTGCGTGTC TTGCTAGGAT TTCAATAACA AAATTAAATG ATATTTCAAT TAAAAAAAAA AAAAAAAAAA
GTCCCCCACA
AAGATCCTGG
GAGTTCCGCC
TTCAAATTCC
CTCTrCCTCGC
GGCAGAGCAA
ATTCTCTCAT
ATATAAATGC
CACTGTCGAG
GACACCGTCA
ATACTATTTT
AAAAAAAAAA.
TGAATCTCCA CGACGCCAAG TGAACGCCTG GTGGTTGGCC CCGAGCGGTT CTTCGAGGAG rGNCCTTCGG TGTGGGGAGG ACTCTCCATC GGAAGACTTG AGTGGATGTC ACTGAGAAGG CGTCGCCAAG CCCATAGCTT GCGCACCTGA ACAAAAAACA TCTACTAAGA GCTCATAGCA
ATTATGTCAT'GTTTCAATAA
GACTCTCCAC CAATTGGGGA INFORMATION FOR SEQ ID NO:4: SEQUENCE CHARACTERISTICS: LENGTH: 949 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO:4: NNGCTCNACC GACGGTGGAC GGTCCGCTAC TCAGTAACTG AGTGGGATCC CCCGGGCTGA
CAGGCAATTC
120
CTCGTATGTT
180
ATGATTACGC
240
GCGGTGGCGG
300
TTCAGGCCTG
360
CTGCCATTGG
420
AGTCTATGTT
480
CAGAAGACAT
540
AGGCCATTGC
600
CAATTGATCT
660
TCTCCTTTTC
720
TCTGAAGCCC
780
AAATTGAGWA
840
TANTTTAGGG
900
GATTTAGCTC
GTGTGGAATT
CAAGCGCGCA
CCCTCTAGA
AGAGATTTCT
TGCAGGGCGC
GGGACACCTG
AGATCTCACA
TATTCCTCGA
GATAGTAAGT
CATAGTCAAC
AACTTCTAGC
TTTCTCTGTA
NATTTTAATA
ACTCATTAGG
GTGAGCGGAT
ATTAACCCTC
ACTAGTGGAT
TGAGGAAGAT
AGGATCTGCC
CTT CAT CATT
GAGAATCCAG
TTGCCTGATC
TTGAATTTTG
ATGCAGCTTT
AAGCAATAAC
GGGGNNGNTA
GTCCTANGTA
CACCCCAGGC
AACAATTTCA
ACTAAAGGGA
CCAAAGAATT
GTTGATATTA
CTGGTGCACA
TCGTATGGGC
GGCTTGTTAC
ATCTCTACAA
TTTTGATACA
CTTTCTCTGA
TGTATATTTT
ATTGTGCAAT
ANANGNGGNA
TTTACACTTT ATGCTTCCGG CAC:AGGAAAC AGCTATGACC ACAAAAGCTG. GAGCTCCACC CGGCACGAGA CCCAGTGACC AGGGCCATGA TTACAGGCTA ATTGGGTATT AATTTAGTTC ACCTCCTGAG GGAATGAAGG TTTCATGGCC AAGCCTGTGC GCGACAGCCA CTCAATTGAT AAACGAAATA ACGTGCAGTT AGCGCATGCA GCTTTCTTTC AGAACAAATA CCTATTCCTC TTGCAAGNAA TAGTAAAGTT ATGNTAGNGG GCATTNAGAA ANCCCTAATA GNTGTTGGNG GNNGNTAGGN TTTTTNACCA AAAAAAAA 949 INFORMATION FOR SEQ ID SEQUENCE CHARACTERISTICS: LENGTH: 959 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID GAATTCGGCA CGAGAAAGCC CTAGAATTTT
CTTTAACTGC
120
CTGGTCAGGT
180
CTTGCACTCA
240
TGGAAGCTTG
300
TCAAGCAAGT
360
TGACCCTCTC
420
AACTCCATGT
480
ATGGGATGGT
540
ATATAGATAC
600
AGCAGATTTC
660
ACATAGATGA
720
TCATCTTCAA
780
TAGTACTGTG
840
AAAATCTCAA
900
TGACATTTGA
959
AATAACTGTG
TCCAGCATTT
GCCCTACATT
TGTCAACACG
TTTGTCATCT
TCTTCAAGAC
TCTGCAGATG
GAGCTTCAAT
TGCAATTCGG
GTGTGATCAT
TTTTGATACT
GACTCGCTTA
GCTGAGTCCA
ATTTCTCGAT
GCACCTCGAG
GAAGCGTACA
CCAAAATACA'
GATTTAGCAA'
AACACAGAGA
CTTTATAAAC
ATAGCAAGTA
ATTCAAGATG
GAGGATCCTG
AGA.ATCATGG
TCCTACCTGA
GTTCCCCAGA
TATTCATTAC
GAAkAGGATCT
GTCTAGTCTT
TGAACTACAA
ACAACTACAG
AGTTCAAGAA
GGAATATTCA
CGGTACAGTT
GTGAGATTTT
AACAGTACAA
CACTATCAAA
GTAAGGTGGG
AGTTCANAAA
TTTCTATGTG
CTCGGTATTA
GATTTTGATT
AGTTGCATGT
TAGTGGGAAA
TGATAGTAAT
GAGATTGACA
GGAGACTGCT
TGCAACCATA
AACATGTCAG
GA.AGCTCACC
GAGAGAGCGT
TATGTAACAA
AATTGATAGT
TCACTTGACA
ATGAATGCGA
TAAAAAAAAA
TTCAGCATGC TATCACAGCC AAAAGTTTGT CCTAGTTTCT CACCTGCTGT 'TGTCCAAAGA
CCAGCGACAA
CTCATTCAGA
AATTTGAAAT
ATTTCTGTAT
TTGGGGTTAG
CAGACATATC
AAGCAGGCTG
AATCAGAAAG
ATGACTGAAT
ACAGTAGATG
TCAAGATTTG
ATGATGTAAA
CTGTTAACAA
TGCCATCAAA
CTTTTAGTTG
AAAAAAAAA
INFORMATION FOR SEQ ID NO:6: SEQUENCE CHARACTERISTICS: LENGTH: 1026 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: GAATTCGGCA CGAGCTTTGA GGCAACCTAC CAAACAGGTT TAAGGAAATG GCAGGCACAA 120 CAACCCAAGC AGAGGAGCCG GTTAAGGTTG 180 TTTTGCAGAG CGATGCCCTC TATCAGTATA 240 SEQ ID NO:6: ATTCATTGAA TCCCAGGATT TCTTCTTGTC GTGTTGCTGC AGCAGAGGTG AAGGCTCAGA TCCGCCATCA AGAAGTGGGA CACAAAAGTC TATTGGAAAC GAGCGTGTAC CCTCGTGAGC
CCGAGCCAAT
300
CTTCTGCCGA
360
CCATGGAGAT
420
ATGATGGAAA
480
TTATTGAGAA
540
TTCTGGACGA
600
ATGCGGACAA
660
GAGGTCTGAT
120 CTCCCCT GAG '780
TTGCTGTCGA
840
GCAGGCGTG'T
900
CTCTGATTAT
960
TATTGATAAT
1020
AAAAAA
1026
GAAGGAGCTC
TGAGGGTCAA
TGGGGTGTAC
GATTCTAGCC
AGCAGGAGTT
ACTGCTTAAG
AGACAACTAT
TGCATATGAC
GAMTATGTG
TCCCCGCATT
CTATTGAAAA
AAGGAGAACG
AAAGTAGTAC
CGCGAAGTGA
TTTCTGGGCC
ACTGGTTACT
ATGGACATCA
GCCCACAAGA
AATGAGGACA
CTAAACTACC
AACACCCTGT
AGATATTACA
CTGCCAAGCA TCCCTGGAAC
TCCTGCTGAA
CGCTTCTCAG
ACAGAGAGAA
TTGACT TCAG
TGCATGGATC
ACAAGCGTCT
GGAACGGATC
GAGATTTCGT
GCTCATTAAC
CACAGCCCTT
CTATGATATC
AGAGGGCCCT
GTTCGATTTT
GATCGATCTG
TGTGGTGGCT
CATGGAGCTA
CGGTGACGGC
CTATTGCAAG
GCCATTTGTT
CTCATGACTA
GCCAAGAACA
GCATTGCCCG
GGATTGCCTA
GCTCTGCCAG
GTGTTCGTGG
GTGAAGGTTG
CCACCCGATG
AACAAGGCCC
GTCACCCTTT
CAT AAAGGCT
TTGTTTAGTG
GAGATCAGCC AAATCCCAGT' CAATCCTTGT TTCTGCTCGT CTATAATATA TGGGGTTGAA AGCATATGCA AAGTTTGTAT CAAAAARAAA AAAAAAAAAA INFORMATION FOR SEQ ID NO:7: SEQUENCE CHARACTERISTICS: LENGTH: 1454 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (xi)
GAATTCGGCA
TGAAGTGGTT
120
GCTCGGGACT
180
GAGGATGTAA
240
CGTAATGAAA
300
GTAACAGAGA
360
TGCATTGTTG
420
AGCAAGAGGA
480
TTTGCAAGCA
540
CTGGAACAAG
600
TTCGCCATGA
660
ATGGGTGTCA
720
AAAAAGAAAG
780 SEQUENCE DESCRIPTION: SEQ ID NO:7:
CGAGGCCAAC
CTGAAGTGAT
CCAGTGGCCA
TTGTAAAGGT
TGGACATGTC
TTGGCAGCGA
GGTCCTGTCG
TTTGGACCTA
GTATGGTGGT
CGGCCCCTCT
CAGAGCCCGG
AGATTGCCAA
AAGAAGCCAT
TGCAAGCAAT
GGGAAGCTTG
CTTGTCCCCT
CATTTACTGC
TCATTACCCA
GGTGAAGAAA
CAGTTGCGGT
CAATGATGTG
TGATCAGATG
GTTATGTGCA
GAAGAAATGT
AGCCTTTGGA
GGAAGTCCTC
ACAGTACAAG
GAATCTGAAA.
TACACTTACA
GGAATCTGCC
ATGGTCCCTG
TTCAAAGTGG
AATTGCAATC
AACCATGACG
TWTGTGGTTC
GGGGTTACAG
GGGATTTTGG
CTCCACGTGA
GGCGCCGATG
AGCCAGACGA
AAACTGTTAC
ATCTCAGAAA
ACTCTGATTT
GGCATGAAGT
GAGAGCATGT
AGAGCATGGA
GCACACCTAC
GA.ATCCCGGA
TTTTCAGCCC
GTTTAGGAGG
CGGTTATCAG
CTTATCTTGT
TCGAATCCTG
AGGATATGCA
GAAAGGACCT
AGTTCAAATG
GGTGGGGATT
AGGGGTTGGT
ACAATACTGC
TCAGGGCGGA
GAATCTTCCT
AATGAAGCAT
CGTGGGGCAC
TTCGTCTGAT
TAGCAAGGAT
ACTGAAAAGA
840
GCTCATCCTC
900
GGCGTTGTTC
960
ATAGCTGGAA
1020
GAGAAGAAGG
1080
GAAAGGTTGG
1140
TTGGATAATT
1200
CTGGACTAGT
1260
TTTTTGTTAC
1320
GTATATGTAA
1380
TAATATATGT
1454 AAAAAA 1454
TGATGGAAGC
TGGAACCATA
CAGAGTCGTT
GTTTCATTGG
TATCATCGAT
AGAAGAACGA
AGTCTGCAAT
AGCTTAACAT
TTTAGTTTAG
AGATCAATTT
ATTCGTATTT
AGCAGAGAGC CTAGATTACA TCTTGCCCTT CTGAAGACAA GCACTTCGTG ACTCCTCTCT CAGCATGGAG GAAACACAGG GATTGAGGTT GTGGGCCTGG TGTCCGTTAC AGATTTGTGG CAATCAATCA GATCAATGCC GAAAGGGAAA TTAAATTTTT CTTTTGTGAG GTTGAAACAA
CTCGTGACAG"TAAATAATAA
TAATGGACAC
ATGGAAAGCT
TAPLTACTTGG
AAACTCTAGA
ACTACATCAA
TGGATGTTGC
TGCATGCAAG
ATTTAGGAAC
TTCAGATGTT
TCCAATGTCT
CATTCCAGTT
AGTGATGCTG
GAGAAGGAGC
TTTCTGTGCA
CACGGCCATG
TAGAAGCAAG
ATGAATAGAT
TCGATACTGG
TTTTTAACTT
TCTGCCAAAT
TTATATGAAA AAAAAAAAAA AAAA 1440 AAAAAAAAAA AAAA
INFORMATION FOR SEQ ID NO:8: SEQUENCE CHARACTERISTICS: LENGTH: 740 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO:8: GAATTCGGCA CGAGACCATT TCCAGCTAAT ATTGGCATAG CAATTGGTCA TTCTATCTTT
GTCAAAGGAG
120
TACCCAGATG
180
CGAGATTCTC
240
GCATTTGCCA
300
TTTAGTATAG
360
AACAAATACT
420
TCTTTGGAAA
480
CCAAAATCAT
540
CAGTGAATAA
.600-
GTGTTAGTGA
660
AATCTTGATG
740 720 740
ATCAAACAAA
TGAAATATAC
TTCCACATGC
AATTGTGGGT
TATGACGAGC
CACCTGTGGT
CCGCTTAGTG
GGCTGATGTG
TTTTGTTAGA
ACGGAATGAT
TTTTGAAATT
CACTGTCGAT
TTCAGAGATA
TATAATCCTT
TAGGCACTGC
TTGTTTTCTT
TGGAATGCTA
AACTGGTTGT
GTGTTTAGAT
GTCAAATCTT
GGACCTAATG
GAGTACCTCA
CATAACAGTT
CGTAGGTGTT
AGATCCTTCA
TCTTTCTGGA
AGTACTAGTG
TCCAGAGGGT
CCATCTTTAC
GATGGGCTGA
TAAAAAAAAA
GTGTGGAGGC TAGtCAGCTA GCAAATTTGT GTGAAGTATG TCAATCAATG TTTGTCCTAG TGGCAGAACA GAACCTCCTG CACTTTTCTC TTCCATAAGA ACTTTGGTAT GGCAATAATG TCCAGAGTTC TAAGGGAGTT GTTTACAACC AACAGTTGTT ALAGGCTATTG AGTAAGGTTG CTGACTCTCT TGTGATGTCA AAAAAAAAAA AAAAAAAA GATTGTGTCT TTTTCAATGG AAAAAAAAAA AAAAAAAAAA INFORMATION FOR SEQ ID NO:9: SEQUENCE CHARACTERISTICS: LENGTH: 624 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO:9:
GAATTCCTGC
GCGCGCCTGC
120
TTGTTGGACG
180
CCCAAGGAAG
240
GTCTGTGTTG
300
GAGGTGATGC
360
CCAGGGCAGA
420
AAAGAAGCAG
480
GCTCTGCGAA
540
CACTCCATCG
600
GACATGGCTG
624 AGCCCGGGGG ATCCACTAGT TCTAGAGCGG CCGCCACCGC
AGG'ICGACAC
CCATGGAAGC
GACTGGCTCT
ACGCCAACGT
AAGGGAAACC
TCGAAGCCGC
CGCGGCTTCA
CATCGCCACA
AGCGGGAGAT
TCCACGGCGG
TAGTGGATCC AA.AGAATTCG TCTCCGGAAA GCCGGGATTC CGTCAACGGC ACAGCGGTGG GCTGGGCGTG CTGGCTGAGA GGAGTTCGTA GATCCGTTAA GGCCGTCATG -GAGTTCCTCC
CGAGAAAGAC.CCGTTGAGCA
GTGGTTGGGG CCTCCGATCG CAATTCCGTC AACGACAATC
CAAC
GCACGAGGCC
TGGAACCGTT
GATCCGCCGT
TTCTGTCTGC
CCCACCAGTT
'TCGACGGTAG
AACCGAAACA
AAGTCATCCG
CGTTAATCGA
GGTGGAGCTC
CGACGGCCAC
TAAACTGCAG
GGCCGCGTCC
GCTCTTCTGC
GAAGCACCAC
CGACTACGTG
AGACCGCTAC
CGCTGCYACT
TGTCTCCAGG
INFORMATION FOR SEQ ID NO:10: SEQUENCE CHARACTERISTICS: LENGTH: 278 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (xi)
GAATTCCTGC
60
CAGTACCTGG
120
GTCAATTCCC
180
CTGATGTTCG
240
*GAAAACATGC
278 SEQUENCE DESCRIPTION: SEQ ID AGCCCGGGGG ATCCACTAGT TCTAGAGCGG CCGCCACCGC GGTGGAGCTC CCAACCCCGT CACGACTCAC GTCCAGAGCG CCGAACAACA. CAACCAGGAT TCGGCTTGAT CTCCGCCAGA AAGACTGCCG.AGGCCGTTGA
GATTTTAAAG
CTACATATCT GGTGGCCTTA TGCCAGGCGA -TCGATCTCCG GCACCTGGAA GATCCGTTGT GAAGCACGTA GTCTTGCA
INFORMATION FOR SEQ ID NO:11: SEQUENCE CHARACTERISTICS: LENGTH: 765 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO:11: GAGCTCCTGC AAGTCATCGA TCATCAGCCC GTTTTCTCGT ACATCGACGA TCCCACAAAT CCATCATACG CGCTTATGCT CCAACTCAGA GAAGTGCTCG TAGATGAGGC TCTCAAATCA 120
TCTTGCCCAG
180
GCTGCTGGAA
240
AAGGCCCGTT
300
CCAATTGCAA
360
GP.GTTGGGAA
420
AAGGTATTTG
480
GCTTGGGGTG
540
TTCAATGCCT
600
AGAGGTTTCT
660
CCTAAACAGC
720
GGGTTCCAAC
'765
ACGGGAATGA
TATTACCCAA
TAGAGGAAGA
ACAGAATAAA
CCGATTTGCT
AGGGCATTTG
GGTGCGCTGG
CATATTGGGC
GGAGCGCCCA
TTGTTCTTCG
AAAATAGAAG
CGAATCCGAT
TTGGGTGTTT
GGTTCCGAAG
CAAGTGCAGG
AACAGGGCCC
CCAAGGGAAA
ACCATTCACT
ATGGTTTGAT
ACAACAACAA
CAATAACGAA
AAATATTTTC.
CACAATTTGC
AGCAGGATCC
GCGAGGGAAC
ACATATCCCA
AAGTGGAGAA
ATTGGAAACG
CCACGTGCAT
AGCACCAAAT
GTTCTTTGA'r
TCTTTCATCT
GATCCAAAAA
AGCCCGCTGA
CCATATTTCA
GATTCGP.TAA
TTTACAGATT
GCCCCGGCGA
TGATCCTCAA
ATCCTGCGTC1
CACCCTCTGC
TTAACTGACT
TCGTTACTTT
GAGCGCTGGA
AGAGGAGTTG
TGGGGACTTC
CGTGAGATCA
AGATATAGAA
ATGTCTGGAC
TCCTGCAGCG
AACGAGCGGC
CTTAAGCATT
GTAAAAGATG
INFORMATION FOR SEQ ID NO:12: SEQUENCE CHARACTERISTICS: LENGTH: 453 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (xi)
TGATTATGCG
TGAACCTAGC
120
TCCGGAACGC
180
CAAGCCGGCG
240
GAATCCACGG
300
ATTGACGATG
360
GCTTCCAGGT
420
CGGCCGCCAC
453 SEQUENCE DESCRIPTION: GATCCTTGGG CAGGGATACG CTTCGCAAAG AATCCTTTCC TCAAATAAAG ATCCTCGATT AAATCTGCAT CCGCGGACCC CCGCTACAAT CGATGAAGAA ACGAAGAAAT CTTCATAGTC.
GGATCCTGCT AATCGAATTC CGCGGTGGAG CTCCAGCTTT SEQ ID NO:12: GCATGACAGA AGCAGGCCCG CCGCCAAATC TGGCTCCTGC ACAGGAACTG GCGAGTCTCT GAAATAATGA AAGGATATAT GGCTGGCTCC ACACAGGCGA GACAGAGTAA. AGGAGATTAT CTGCAGCCCG GGGGTCCACT
TGT
GTGCTGGCAA
GGAACAGTCG
CCCGCACAAT
TAACGACCCG
CGTCGGGTAC
CAATATAAAG
AGTTCTAGAG
INFORMATION FOR SEQ ID NO:13: SEQUENCE CHARACTERISTICS: LENGTH: 278 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO:13: TCTTCGAATT CTCTTTCACG ACTGCTTCGT TAATGGCTGC GATGGCTCGA TATTGTTAGA TGATAACTCA ACGTTCACCG GAGAAAAGAC TGCAGGCCCA AATGTTAATT CTGCGAGAGG 120 ATTCGACGTA ATAGACACCA TCAAAACTCA Ac-TTGAGGCA GCCTGCAGTG GTGTCGTGTC 180 AGTTGCCGAC ATTCTCGCCA TTGCTGCACG CGATTCAGTC GTCCAACTGG GGGGCCCAAC 240 ATGGACGGTA CTTCTGGGAG AAAAGACGGA TCCGATCA 278
INFORMATION FOR SEQ ID NO:14: SEQUENCE CHARACTERISTICS: LENGTH: 23 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO:14: CTTCGAATTC WYTTYCAYGA YTG 23
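SEQ ID NO:14 contains the IUPAC ambiguity codes W and Y, so it denotes a pool of related sequences and not a single molecule. As an illustrative sketch only (the helper name is hypothetical and not part of the specification), the number of distinct sequences represented by such an entry can be counted from the standard IUPAC nucleotide code table:

```python
# Standard IUPAC nucleotide codes mapped to the bases each one represents.
IUPAC_BASES = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def degeneracy(sequence: str) -> int:
    """Count the distinct sequences encoded by an IUPAC-coded string.

    Whitespace (such as the 10-base block spacing used in the listing)
    is ignored. An unrecognised symbol raises KeyError.
    """
    bases = "".join(sequence.split()).upper()
    count = 1
    for symbol in bases:
        count *= len(IUPAC_BASES[symbol])
    return count


seq_id_14 = "CTTCGAATTC WYTTYCAYGA YTG"
print(len("".join(seq_id_14.split())))  # 23
print(degeneracy(seq_id_14))            # 32
```

The 23 bases agree with the stated length, and the five two-fold ambiguous positions (one W, four Y) give 2^5 = 32 possible sequences.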
INFORMATION FOR SEQ ID NO:15: SEQUENCE CHARACTERISTICS: LENGTH: 22 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID GATCGGATCC RTCYYKYCTY CC
INFORMATION FOR SEQ ID NO:16: SEQUENCE CHARACTERISTICS: LENGTH: 472 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (xi)
AATTCGGCAC
GTTCTAGTGC
120
CTGCGCCGCC
180
CACAGGGAGC
240
GCGCCT-CGGC
300
GGTAGGGGTC
360
CATGGAGAGC
420
CTTCTCAAAC
472 SEQUENCE DESCRIPTION: GAGACGACCT CTTGTATCGG TGAATGGAGA TGGAGAGCAC GGGAGCCACC ATGCCGACCC CACCTCGACG AGGTGAAGCG GGGGAGTCCC TCACGATAGC GAGCTCTCGG AGGCGGCCCG ATGAACAAGG GAACTGACAG CGGAGGCCGA AGCAAGGCGG SEQ ID NO:16: ACCCGGATCC GCTATCGTTA CACCGGCACC GGCAACGGCC ACTGAACTGG GGGGCGGCGG GATGGTCGAG GAGTACCGGA CCAGGTGGCG GCGGTGGCGA TCCCAGGGTC AAGGCCAGCA CTACGGGGTC ACCACCGGGT TCCT7TTCAG AAGGAACTTA
ACGTACACAC
TTCACAGCCT
CAGCAGCCCT
GGCCGGCGGT
GTCAGGAGGG
GCGACTGGGT
TCGGCGGCA.A
TA
INFORMATION FOR SEQ ID NO:17: SEQUENCE CHARACTERISTICS: LENGTH: 622 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (xi)
CCAAAGCTCC
CCCCGAGGCA
120
GACOATCCCA
180
CGACTATTGC
240
GTGGTGAGCT
300
GAATTGGTGG
360
GCCTTGTGCA
420
CGTTGTCTA.A
460
CTGAAAATAG
540
GCCAATGCAG
600
AACATTGTTC
622 SEQUENCE DESCRIPTION: SEQ ID NO:17:
TAGTGCCTCA
CCATGATCCT
CAAATTTTAA
CGTTTGGGAT
TGGTCCTGGC
ACTTGTCCGA
AAGCGCGTGA
TGAATTTACA
GCCAGTGCAG
CT TTAGGC CT
AAAAAAAAAA
TGAGTCTGCT
GGTTAATGCG
ACCGGAGAGG
GGGGAGGAGA
GGCGCTTATT
GGGGACGGGA
ATGCATGATA
TTGGTGATGT
CTTTAGGAAT
TTCTCTTAGG
AA
GAGGATTGCA
TGGGCAA-TC
TACGAGGGAT
AGTTGTCCTG
CAGTGCTTCG
CTCACAATGC
GCTAATGTTC
ATCTCCAATG
GATCGTGAGC
AGAAAAATGA
CAATTGGCGG
AAAGAGACCC
TGGAAGGTGA
GTGCTGGCCT
AATGGGAACG
CAAAGAGAGA
TTGCGCACCT
TTTTTGA.ATA
ATCAATAGCA
TGGTTTATAT
GTTCGACGTG
AAAAGTGTGG
TCATGCCTAC
TGCCA.ATAGA
AGTTGGCGA
GCCATTGGAG
TTAAGAAGGT
ATCAAATAGA
TCCTGAGGAG
AGGTACTCGC
INFORMATION FOR SEQ ID NO:18: SEQUENCE CHARACTERISTICS: LENGTH: 414 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (xi)
CACGCTCGAC
CATTGAACTC
120
ACCCCACCCA
180
CAATGCAGTC
240
CGGTGAACTG
300
TCTCCGGCAA
360
AGGAAGCCAA
414 SEQUENCE DESCRIPTION: GAATTCGGTA CCCCGGGTTC TCTCTCTCTC TCTCTCTCTC CATACAGACA AGTAGATACG AATCGCACTA GCGACGGTTC GGTGTGGCTG AGGCCGAAGA SEQ ID NO:13: GAAATCGATA AGCTTGGATC TCTCTCTCTC. TCCCCCACCC CGCACACAGA AGAAGAAA.AG TGGCCGTCCT AACGACATGG GGCTCGAGAG GCTTCTGAGA GCGACCTCAA GGAGAACCTG CCGATGACAT CAAGCCTCGT
CAAAGCAACA
CCCCTTCCCA
ATGGGGGTTT
GCGTGGAGGG
CAGCAAGGTC
CGGATGCTCA GTCCTACACC
GTCCAAGCCC
TTCCTGGTCG
ATCGCCGTCT CTCT
INFORMATION FOR SEQ ID NO:19: SEQUENCE CHARACTERISTICS: LENGTH: 469 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear
(xi)
GAATTCGGCA
CTCATCTCCT
120
CTTCTCCGA.A
180
TCATAGGCCA
240
TGGCGGACAA
300
TAAGCAGCCG
360
GCCCCAAATC
420
AATACGGGGA
469 SEQUENCE DESCRIPTION: CGAGTGTCTC TCTCTCTCTC AGCAGTTCTA GGGGTTGTGT CAAACCCAAA GGTACTGCCT CATCCACTTG CTGGGCGGCG GCAGGGCCCG. ATGTTTCGGA TGAGGCGGTC CGGGAGTGCT CAAGGCGGGA ATCCACTTGG CTTTTGGCGC GAGATGAGGA SEQ ID N0:19: TCTCTCTGTA AACCACCATG TGCTICCTGCT AATTCTATGG TACCCCCGGA GCTGCCGGGC AGACCCCGCT GGCCAGGACC TCCGTCTCGG AGTCCACCCG TCACCACCCA -CGACAAGGAC GCTACGGGTA TGCCGGTTTT AGATCACCAT GCTCGAGCT
CTCTTCCTCA
AGGGCAAGAT
GCATGGCCGA
CTGGCCGCCA
GCGACCATCA
CTCtCTTCTC
GGCTTCGTAG
INFORMATION FOR SEQ ID SEQUENCE CHARACTERISTICS: LENGTH: 341 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (xi)
CGGGCTCGTG
TCATCGGGAA
120
AGAAGTATGG
180
CCCCCGACGT
240
CCACCATCGC
300
GCCCGTTCTG
341 SEQUENCE DESCRIPTION: GCTCGGCTCC GGCGCAACGC CATGCTCATG ATGGGCGAGC CGGGATCTTC CACCTCCGCA GGCCCGCCAG GTCCTCCAGG GATCAGCTAC CTCACGTATG GCGGCAGATG CGGA.AGCTGT SEQ ID CCTTCCCACC GGGCCCGAGG GGCCTCCCGG TCACCCACCG CGGCCTCGCG AGTCTGGCGA.
TGGGCTTCCT GCACATGGTT GCCGTGTCGT TCCACGACGG GATCTTCTCG AACCGGCCTG ACCGGGCCGA CATGGCCTTC GCGCACTACG GCGTGATGAA A
INFORMATION FOR SEQ ID NO:21: SEQUENCE CHARACTERISTICS: LENGTH: 387 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (xi)
GAATTCGGCA
GAGGGGCCTC
120
CGCGAGTCTG
180
GGTTGCCGTG
240
CTCGAACCGG
300
CTTCGCGCAC
360 SEQUENCE DESCRIPTION: CGAGCGGGCT CGTGGCTCGG CCGGTCATCG GGAACATGCT GCGAAGAAGT ATGGCGGGAT TCGTCCCCCG ACGTGGCCCG CCTGCCACCA TCGCGATCAG TACGGCCCGT TCTGGCGGCA SEQ ID NO:21: CTCCGGCGCA ACGCCCTTCC CATGATGGGC GAGCTCACCC CTTCCACCTC CGCATGGGCT CCAGGTCCTC CAGGTCCACG CTACCTCACG TATGACCGGG GATGCGGAAG CTGTGCGTGA
CACCGGGCCC
ACCGCGGCCT
TCCTGCACAT
ACGGGATCTT
CCGACATGGC
TGAAAGCTCT
TCAGCGGAAG CGGGCTGAGT CGTGGGA 387 INFORMATION FOR SEQ ID NO:22: SEQUENCE CHARACTERISTICS: LENGTH: 443 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear
CACGAGCTCG
CACTCCTTTC
120
AAAGCCCAAA
180
GAACCAAGCA
240
TGCTCTGATA
300
ACTGGCCGAG
360
CGCA.AAGAAC
420
CTCCATCCTC
443 SEQUENCE DESCRI PT ION: TGAGCCTTCC CGGAGACA.AG TCAAGAAACC TAGTCATCCA CTCGTACAGA AGGAGAGAGA ATCACGACGG CCIAGTGAAGA GCACTCCCCT TGGTCTTGA TGCGGGCCTA TGGCTCCACT CCGGAAGCCC CCGTAACCCT TCTTGCACTC TCG SEQ ID NO:22: GCCATCTTAC TTCGCAACAA AGAAGCAGAG CATTGCAACT .GAGAGAGAAT AGAAGCATGA
TGAAGAGTTC
GGCCACCATC
TTCGCCTGCT
TGACCGGATC
TTGTTCGCCA
GAACTGGGGA
CAGATTGCCT
CTCCGGTTTC
ATTGCGTCCG
GCAAACAGCC
GTGCATGCAC
TGGAAATGAA
TCCTCGAAAT
CCCGTCTCTC
TCGCCAGCTA
INFORMATION FOR SEQ ID NO:23: SEQUENCE CHARACTERISTICS: LENGTH: 607 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO:23:
GAATTCGGCA
AACCGGTCCA
120
ACTTAGGTCA
180
AGAGAGAGAG
240
GGAGAGCCAG
300
TGATGCTCTT
360
GAAGGAGCTC
420
CGAAGGGCAG
480
TGGTGTCTTC
540
GATTTTGGCT
600
GCCGGTG
607
CGAGCCAACC
AACCGGACCA
ACTGCAACAT
AGAGAGPIGAG
ACCCAAGCCG
TACCAATATA
AGGGAAATAA
TTCTTGAACA
ACTGGCTACT
ATGGACATTA
CTGGACCAGG
TCACTGTCCT
TTCTTGATCA
AGAGAGAGAG
GGAGGCACCA
TTTTGGAGAC
CAGCAXAACA
TGCTTCTCAA
CTCTCCTCGC
ACAGAGAGAG
TACTTTTGGC
TATATACGTT
CAACATATTA
AGAGTTTGAA
GGAGGTTGGC
CAGCGTGTAC
TCCATGGAAC
GCTCATCAAA
CACCGCTCTT
CTATGAACTT
AGGCGGTCCA
GCATCATGCC
CA.ATATTCCT
TCAATGGCCA
CACAAGTCTC
CCAAGAGAGC
ATAATGACAA
GCCAAGAACA
GCTCTTCCTG
GGCCTGCCGG
TTGCCCTTCA
TGCTCATAGA
AAGCAGAGAG
CCGCCGGAGA
TCCTTCAGAG
CTGAGCCCAT
CATCAGCAGA
CCATGGAGAT
ATGACGGAAA
CAT CCAAAA INFORMATION FOR SEQ ID NO:24: SEQUENCE CHARACTERISTICS: LENGTH: 421 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (xi)
GAATTCGGCA
AGAGAAGAGA
120
CCAAGTCTCG
180
CCCCATGGTC
240
GCCGGGCGCG
300
GGCACCCGTA
360
CACCCTCCGC
420
C
421 SEQUENCE DESCRIPTION: CGAGCCGTTT TATTTCCTCT SEQ ID NO:24: GATTTCCTTT GCTCGAGTCT
GOAGAGGAGA
GACGAGGAGG
CTCAAGGCCG
TTCCTCTCCC
ATGCTCGACC
GACCTCCCCG
GAATGGGTTC
CGAACCTCTT
CCATCGAGCT
GACCGGATCC
CGCCATGCAG
CGACCTCCTC
*GAGACCCAGA
CTGGCGAGCG
GAGATCATGG
CT.CCCGACCC
AGCTACTCCG
TACGGCTTAG
CGCGGAAGAG
TGACCCCGAC
CCTCCGTGCT
CCAAGGCCGG
AGAACCCCGA.
TGCTCACGTG
CGCCGGTGTG
CGGGGGAAGT'CGCGGCCCAG
GGATCTTCCG GCTGCTGGCC ATGGCAAGGT CGAGCGGCTC
INFORMATION FOR SEQ ID SEQUENCE CHARACTERISTICS: LENGTH: 760 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID
GGAAGAAGCC
AAGGAGCTTA
120
CGAAGCATTC
180
TATTGGAGAC
240
CAGCCAAACA
300
TGCTCCTCAA
360
CTCTCCTCGC
420
ATAGGGAGAA
480
TCGATTTCAG
540
ACCATGGAAC
600
TCAATGGTTC
660
TAAGGGTAGT
720
GCTGACCCGG
760
GAGCAAACGA
AGAAGCATCA
GGAAGTCGGC
CAGCGTCTAC
TCCATGGAAC
GCTCATCAAC
AACCGCCCTT
CTTCGAGATC
AGAAGGCCCT
GTACGACTTC
AAAGACAACA
TTCTCATTTC
CGGCACAGGT
ATTGCAGACG
TCAATGGCAG
CACAAGAGCC
CCAAGAGAGC
CTGATGACCA
GCCAAGAACA
GCTCTTCCCG
GGGCTGCCCG
GCCCTGCCGC
TTCTCAATCC
TAAGACAGAA
ATCAATGCTT
GATGCCATCC
CCATTGAAAA
CCAACGCAGA
TCTTGCAGAG
CAGAGCCCAT
CATCGGCGGA
CCATGGAGAT
ATGACGGAAA
TCATCCAGAA
TCCTTGATCA
TTAATCGTTC
GATGGAAAAA
GATTTTGAGA
CCGACGGGAA
AAGACACGAA
GCCTCAGCAG
CGATGCTCTC
:'GAAGGAGCTC
TGAAGGGCAG
CGGCGTCTAC
GATCTTGGCC
GGCCGGCCTT
GCTCGTGCAA
ATTTGAATAC
ATAGAAAGGA
TCTCCTTTCT
AGAGATCAAG
ACCCAACCAG
TACCAGTATA
AGGGAAATAA
TTCCTGAACA
ACCGGCTACT
ATGGCCATCA
GCCCACAAGA
GATGAGAAGA
AAATACATGC
AGGAAAGTAT
GGTGCGATCA
INFORMATION FOR SEQ ID NO:26: SEQUENCE CHARACTERISTICS: LENGTH: 508 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (xi)
GAATTCGGTA
ACTAACCATC
120
ATATCGTGAA
180
AGGGACGGGC
240
GTTCGCGCTG
300
CTTCAAGAAC
360
GAAGGCAATC
420
TCAGACCAAA
480
GGTTCATTTG
508 SEQUENCE DESCRIPTION: CCCGGGTTCG AAATCGATAA
TGCCTTTCTT
AGGAGTCCGT
TACGTCGGCA
GTTAGGCAGA
TTGGGCGTCA
AAGCAAGCCG
GAATCGTCGA
ATCTGGTTTG
CATCTTCTTT
CGACGACAAT
AGTTCATCGT
GCACGGTCTC'
CTCTGCTCAT
ACGTGGTGAT
CGCCATTAAA
GGGGGGTC
SEQ ID NO:26: GCTTGGATCC AAAGAATTCG CTTCTGCTTC. TCCTCCGTTT GGCCGAGAAG AGCAAGGTCC GGAAGCGAGT GCAAAAGCAG CGACCCCGTC .'AAGGGCCAGC
GCACGAGATC
CCTCGTTTCG
TGATCATCGG
GGCATCCCAC
TCCTCGAGAG
AGAGCTTGGT
AAATGGCGGA.
TTTGTTGGTT
CGGTGATCTG
ATCGACAGTG
GGAAGCTGC
TACGATCATG
GGGCACATGC
AACGTTAAGG
INFORMATION FOR SEQ ID NO:27: SEQUENCE CHARACTERISTICS: LENGTH: 495 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear
GAATTCGGCA
CTCCTCCCTT
120
AAGAGGAAGG
180
ATCCTCATCA
240
GP.AGGTCATC
300
GGTGAGTCGG
360
AGAAAGGATT
420
.GACATTAACG
460
ACCAGTCAAC
495 SEQUENCE DESCRI PTION: CGAGGTTAAT GGCAGTGCAG CTTCTTTCTC TGACTTCAAT TGGGGCAGCC TAAAGGGGCA TGGGAGGCAC CCGTTTCATC AGGTCACTTT GTTTACCAGA ACAAGGACTT CGCTGATTTT TTGATTTTGT TAAATCTAGT GCGAGAGGCG GATGAAGTCG
TACTG
SEQ ID NO:2
CCTCAACACC
GGCAGCCGAC
CTGCGGGTCA
GGTGTGTTTT
GGAAAAGCAC
ACCCACCTTC
TCCATGCTTG
CTGCATCAAG
TGTCGAGACT
CCATCACTCA
TCCTGCATTT
AAGGCTTTGA
GGATGCCTGC
CTCCATCTCT
CGTTCAGTAT
CAATAAGAAG
ACTTGTCAAA
ACAATTGCCT
GAAAGGAGAC
CGTTGTTTAT
-CAACCTTG----
TCATCCAAGA
CTTGCTGCAG
CACCAATTTT
INFORMATION FOR SEQ ID NO:28: SEQUENCE CHARACTERISTICS: LENGTH: 472 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (xi) GAAT'rCGGCA
CCTCGTCGTT
120
GGGCCACCCC
180
GACGCTACTG
240
-GAGGAGCCTC
300
CCACTTCCGG
360
TGGAAATGTC
420
TGCAATTGAG
472 SEQUENCE DESCRIPTION: CGAGCATAAG CTCTCCCGTA GGCGGCACTG GCTACCTCGG
ACGTACGTCC
CGCTTCAAGA
GTCGACGCTG
AGCCACAACA
AAGCGGTTTT
CCGGGAAGGG
TCCAGCGTCC
GGCGTGGCGC
TGAGGCGGGT
TCCTGATGCA
TGCCGTCAGA
TCACGTTCGA
SEQ ID NO:28: ATCCTCACAT CACATGGCGA GCGGAGGTTC GTGAGGGCGA GGAGACCGGC CTCGACATTG CCAACTCGTC GAGGCCTCGT CGATGTCGTC GTCTGTGCCA GCTCAAGCTC GTGGAGGCTA GTTCGGAATG GACCCGGCCC TGAGAAATGG. AGGTGAGAAA
;!UAGCAAGGT
GCCTGGACCA
AGAAGCTCCA
TCTCAGACCT
TGTCCGGGGGT
TCAAAGAAGC
TCATGGGTCA
AG
INFORMATION FOR SEQ ID NO:29: SEQUENCE CHARACTERISTICS: LENGTH: 396 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: GAATTCGGCA CGAGGAGGCA CCTCCTCGAA AGACGAAGGC GAGAATGAGC GCGGCGGGCG 120 CGTCCGGTTA CATCGCCTCG TGGCTCGTCA 180 AGGCCACCGT CCGCGATCCG AATGATCCAA 240 GAGCGAAAGA TAGACTTCAA CTGTTCAAAG 300 CTATTGTTGA GGGTTGTGCA GGCGTTTTTC 360 AGGATCCGCA GGCAGAATTA CTTGATCCGG 396 SEQ ID NO:29: ACGAAGAAGA AG GTGCCGGGAA GG AGCTCCTCCT CC AAAAGACTGA AC CAAACCTGCT GG AAACTGCCTC TC
CTGTAA
AAGGACGA
TCGTGTGC
AGCGCGGC
ATTTGCTT
AGGACGAAGG
GTGACCGGGG
TACACCGTCA
GGACTTGATG
AAGAGGGT TCATTTGATC CCTTTTAT .CATGATGTCA INFORMATION FOR SEQ ID SEQUENCE CHARACTERISTICS: LENGTH: 592 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: GAATTCGGCA CGAGGTTGAA CCTCCCGTCC CGCTCCCGCA TACTCCACCA CCGCGTACAG 120 TGCCTCGGTT GGGCTGCCCG GGACCCTTCT 180 AGGGCCC-TGG GAAGCGAAGA CGTCTCGATT 240 SEQ ID TCGGCTCTGC TCGGCTCGTC ACCCTCTTCG AAGATGAGCT CGGAGGGTGG GAAGGAGGAT GGGTTCCTCT CCCCCTACAA ATTCACCCGC AAGATCACGC ACTGTGGAGT GTGCTACGCA
GATGTGGCTT
300
GAGATAGTTG
360
CATGTGGGGG
420
CTAGMAGTCC
480
GTGACAAAGG
540
CCAGAAAACT
592
GGACTAGGAA
GAATTGTGAA
TGGGAACTTA
AATGTGAAAA
GAGGATATTC
TGTGCAGGGA
ACAGGTTGGC
TGrCAATTCA
GTCGGTTATG
TAGTCACATT
CACTC'CAA3T ATCC--,TCTGGT TCCAGTG7CC AACGCTTCAA
TGCAGAGAGT
ACTTTTGATG
GTCGTCCATG
CATTTGCTCT
GCGAGTATTG
GAATTGATGC
AAAGGTATTG
GTGCTGGATC
.§X:CAGGGCAC
.TTGGCGAT
AATGACAGG
A GATGGTACA
C..TCAGGATT
ACCCGATGGA TCTAGCAGCG
INFORMATION FOR SEQ ID NO:31: SEQUENCE CHARACTERISTICS: LENGTH: 468 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (xi)
GAATTCGGCA
AAACAAATGG
120
TACCCTGCAC
180
GGCTTCCACA
240
GGCCCCGAGT
300
CTTCCTCCTT
360
AACTATATGG
420
CCTCCGGTGA
468 SEQUENCE DESCRIPTION: CGAGAACTCA TCTTGAAATG
GTTCCGCCGG
AAAGCCACAT
TCTCCTTCGT
TCACAA.ATGG
CGGACTTGGA
TCAGCCCCAT
CTTGCATCAA
ATTCGAATCG
TGGCGCCATG
CAACACCGAG
AATGCTGAGC
TGCGATCCAA
CAACGATCTT
TCTCGGATGG
TCATTGGAGT
GCCACAAAGC
CTCAAGCTAG
TTCAACCACC
GACTTTCAGT
GACATCAAGA
GTATCGAGCC
TTTCATGACA
CATCATCCTC
CGCACGCCGT
CAAAGCTCCT
GGCGGCTCGC
TCCTGACAAT
TGCTCTGCGA
TGGGCTCGAA.
CTC-GTGAC
AGTGAGAAG
TTGCATTCCC
:ZCATCACAAG
CAGGGCTCGA
CCCCGATGGT
ATCGTCCAGG
CCCGAGCGTC
SEQ ID NO:31:
INFORMATION FOR SEQ ID NO:32: SEQUENCE CHARACTERISTICS: LENGTH: 405 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (xi)
CTTTACTCCG
60
-GTCCATCTTC
120
AGCAAGCCCC
180
CGGAATTTTC
240
TTCGACCCGG
300
AGGCATTACA
360
ATCGGGCTGT
405 SEQUENCE DESCRI PTION: CCAAGAAGAT CCAATCGCAG ATCGGGAAGT CTCTTGGCAG TAACTCAGTG GTCTATGTGA CGAAATAGCT TTAGGTTTAG GTCAGTGAGC GGCTCGGAAC SEQ ID NO.32: TTTTCGCAAT TGGCCCATTA AAGACCGGAG 'TT GCATTTCC GTCTTGGGAG CATCGCCTCT CCGATAGCCA GCAGCCATTC TCTTAGAGAA TTTGCCCGGT AATGGGCGCC TCAACATGAA
CACAAATGCG
'TGGCTGGACA
GTGAACGACT
TTGTGGGTGG
GTGCTGGCTC
GGAGAGGGGG
CGGAGCGTTT
AAGATTGTGA
TGGACTCACA ATGGATGGAA CTCCA
INFORMATION FOR SEQ ID NO:33: SEQUENCE CHARACTERISTICS: LENGTH: 380 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (xi)
GGCAAACACG
TCGAGAGCTT
120
AGGAGCTCAC
1.80
TCCCGACCAT
240
ACGAGGAGCT
300
TCCCCAACGA
360
TCGAGGAGAA
380 SEQUENCE DESCRIPTION: CCCGTTTTCG TTTTACTAAG GTCGAGCAGT GGCATTCAGT AAGCATTGGC GACATCTTCG CGACCTCGAG GACATAGCGT CAGGAAGGCT GCCACCGACT CCTGATTGAG CGTGTAAAGA
GGACAAGCAT
SEQ ID NO:33: AGAAGATGGT GAGCGTTGTG CGATCCCGCA, GGAGTATGTG AGGAGGAGAA GAAGCATGAG CTAA.AGAC CC' CGTGGTGAGG GGGGCGTCAT GCACCTCGTC AGGCTGGCGA GGTGTTCTTC
GCTGGTAGAG
AGGCCGMAGG
GGCCCTCAGG
GAGAGGTGCC
AACCATGGGA
AACCTCCCGA
INFORMATION FOR SEQ ID NO:34: SEQUENCE CHARACTERISTICS: LENGTH: 305 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (xi)
TTGTACCCGA
CGAGGCCGCC
120
CGTCGTGGTC
180
AGGAAGCGGA
240
TCGGAGATGG
300
TGTTT
305 SEQUENCE DESCRIPTION: AGATCTCCGG GACCGTTCGA GCCGGAGGCC GGGGAGAAGC GGCGGCGGCG GCGTGGTGGA CATGGCTGGG GGATCGATCG AGAGATGGAA ATGAAAGAGA SEQ ID NO:34: CGGCGACATC GCCGTCGGCC GGGAACCCGT TGGAGTAGCC GCCGTAGCCG GAGAAGGCGC CCTCATCGCC GTCCATGCTG AAGGCGTCGA ACCGATCCGA' TCGGCCGGAG GATTTCGAGA -GAGAGAGAGA GAGATCCGGT GGACTGGTGG
INFORMATION FOR SEQ ID SEQUENCE CHARACTERISTICS: LENGTH: 693 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID GA.ATTCGGCA .CGAGCTA-AGA GAGGAGAGGA GAGGAGCAAC ATGGCACTAG CAGGAGCTGC ACTGTCAGGA ACCGTGGTGA GCTCCCCCTT TGTGAGGATG CAGCCTGTGA ACAGACTCAG 120
GGCATTCCCC
180
TGCCATGGCC
240
CCCCGACGAT
300
CTGCCGTGCC
360
GAGCGACGGC
420
CGCCTACCCT
480
AAGCTCTCCT
540
TCTCCCCCCT
600
GGGATGATTT
660
TGAGGAAATA
693
AATGTGGGTC
GCTTACAAGG
GTTTACATCT
GGCTCTTGCT
AGCTTCCTGG
AAGTCTGAGG
ATATTTGCTT
TCACTACATG
GATGTTATTC
AAACTCATGC
AGGCCCTGTT
TCACCCTGCT
TGGACTACGC
CCTCCTGCGC
ATGATGATCA
TCACCATTGA
TTGCATAAAT
TTTGTTAGTT
TGAGTCTAAT
TCTAAAAAAA
TGGTGTC.kriC
CACCCC-TGAA
CGAGGAGCr A.
GGGCAAGGTC
GATTGAGGAA
GACCCACAAG
CAGTCTCACT
CCTTTAGTC-T
GTAATGGCTT
AAA
TCT-GGCCGTG
GGCAAAGTCG
GGCATCGACT
GTGGCGGGGA
GGTTGGGTCC
GAAGAGGAGC
CTACGCAACT
CTTCCTTTTT
TTCTTTTTCC
GCAGAGTGAC
AACTCGACGT
TGCCCTACTC
GCGTCGACCA
TCACTTGTGT
TCACTGCTTG
TTCTCCACTC
TACTGTACGA
TATTTCTGTA
INFORMATION FOR SEQ ID NO:36: SEQUENCE CHARACTERISTICS: LENGTH: 418 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (xi)
AGGACTTTAT
TCCAATAATC
120
ATGCCTTAGT
180
TGTGAAAAAC
240
GATATGATGA
300
TCCTCGGCAG
360
ACATCCCGCA
418 SEQUENCE DESCRIPTION: SEQ ID NO:36: TATAAGCATT GTAAAAAGAG TCAAACTPAT ACATCGCAAG TACAAAAAGA AAAAAGTTTG ATGCATTGAG ATGGTAACTG TTGAAAAATT AACCAACTAT TAAAATTAAT GATGATGAAT TATATAGACT TAAAATTGAC TCAGAAGACA TTCTTTTCTT ATTCGGTCTA AACAGGCAAA TGGTGTCAAA CGGGAAGTCG TGACTACCGG GCGGGCGATG.ATGCGGATCC GGGGGCCGGG CGGACCGGTC CACGTTTGGT GCGGTGACAA CAGGCAGCCC
AATTGGGTTA
CTTAATTCAA
ATGGATTATG
CTTATTTTAT
GCAAAACTCT
TCGCTGGAGA
AACCTGGA
INFORMATION FOR SEQ ID NO:37: SEQUENCE CHARACTERISTICS: LENGTH: 777 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO:37: GAATTCGGCA CGAGCATACA ACTACACTGC GACGCCGCCG CAGAACGCGA GCGTGCCGAC CATGAACGGC ACCAAGGTCT ACCGGTTGCC GTATAACGCT ACGGTCCAGC TCGTTTTACA 120 GGACACCGGG ATAATCGCGC CGGAGACCCA CCCCATCCAT CTGCACGGAT TCAACTTCTT 180 CGGTGTGGGC AAAGGAGTGG GGAATTATGA CCCAAAGAAG GATCCCAAGA AGTTCAATCT 240
GGTTGACCCA
300
ATTCACAGCA
360
TTGGGGACTG
420
TCCACCTCCA
480
ACTAATGACA
540.
ATAAGAAAGA
600
CCAAAGAGAC
660
ACTCCGAC?.C
720
AGTGTAATTT
77 7
GTGGAGAGGA
GACAATCCAG
AAGATGGCAT
AGTGATCTTC
CCAAGTTAGT
TGAGGAGAGA
CCTTGAGATC
TGCTACAATA
GTTTTTTGGC
ACACCATTGG
GAGTTTGGTT
TCTTGGTGGA
CAAAATGTTG
GGAATCTTCT
AGCcATAGAA
ACGACATCCC
AATTAAGGAA
AAGCTCATCA
AATCCCATCT
CCTGCACTGC
CAATGGGA.AG
ATCATTTGAT
CTTTGAAAAA
GATTTGACCA
GCAATTGT'TT
GACAAGGAAT
CATGAATCAC
GG TGGATGGA
CATCTGGAAG
GGGCCTAAAG
CATGAGGACG
GAAGAAGAAG
AGAAGAGAGA
CTAGAGTAAT
TT GGTTTTTT ATGGAkAAA
TAGCCATCAG
TGCACACAAC
AGACCCTGCT
ACAAGCGATT
AGCAAGAAGA
GGGCAATAAA
AGAAGGATTT
TCATTGGAGG
AAAAAAA
INFORMATION FOR SEQ ID NO:38: SEQUENCE CHARACTERISTICS: LENGTH: 344 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (xi)
ATATGTTCAG
ATACGTTGAA
120
TCGATGTTCA
180
ACTACTACAT
240
TACACTACAC
300
AAAAACATTG
344 SEQUENCE DESCRIPTION: AATTTCAAAT GTGGGAATGT GCTAGTCGAG GTTGAAGGAT CGTGGGCCAA TCCATGGCTG TGTCGCATCC ACCCGGTTCA CAACTCGCTT ACCCCAGTTT GTCCATGAAG CAAGCAAGAA SEQ ID NO:38: CAACCTCCTT GAACTTCAGA ATTCAGGGCC CTCACACCGT CCAGAACATG TATGATTCAA TCTTAGTGAC CTTAAATCAG CCTCCAAAGG CCAAGACGGT TCTCAATGCA ACTGCAGTGC CCGGGCCACT ACCAGCTGGT CCAACTTACC CAATCAGGTG GAAC INFORMATION FOR SEQ ID NO:39: SEQUENCE CHARACTERISTICS: LENGTH: 341 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (xi) dCGAACTG
GGGAGCTCTC
120
TGATTTTGTC
180
GGTGAACGGG
240
CAATGTCGTC
300
GAGATCTGGT
341 SEQUENCE DESCRIPTION: CAATTCTCTT CGTAAAACAT CTCCTCTTCT CTGTGGCGGT GTTCAAGCGA CCAAGGTGAA CAATTCCCGG GTCCGACTTT AACAAAGCTC GCTACAACGT TGGGCTGATG GGGCGGAATT SEQ ID NO:39: GACGGCTGTC GGCAAAAtCT CTTTCCTCTT GP.CATTGGCA GATGCAAAAG TTTACTACCA GAGGCTGTGC ACGACCCACA ACACCATCAC GGAAGTTAAC GACGGCGACA CCCTCGTTGT CACCATTCAC TGGCACGGCG TCCGGCAGGT TGTGACTCAA TI INFORMATION FOR SEQ.ID NO: SEQUENCE CHARACTERISTICS: LENGTH: 358 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: GAATTCGGCA CGAGATATGT TCAGAATTTC CAGAATTCAG GGCCATACGT TGAAGCTAGT 120 CATGTATGAT TCAATCGATG TTCACGTGGG 180 TCAGCCTCCA AAGGACTACT ACATTGTCGC 240 TGCAACTGCA GTGCTACACT ACACCAACTC 300 TGGTCCAACT TACCAAAAAC ATTGGTCCAT 358 SEQ ID AAATGTGGGA ATGTCAACCT CGAGGTTGAA GGATCTCACA CCAATCCATG GCTGTCTTAG
ATCCACCCGG*'TTCACCAAGA
GCTTACCCCA GTTTCCGGGC GAAGCAAGCA AGAACAATCA
CCTTGAACTT
CCGTCCAGAiA
TGACCTTAAA
CGGTTCTCAA
CACTACCAGC
GGTGGAAC
INFORMATION FOR SEQ ID NO:41: (i) SEQUENCE CHARACTERISTICS: LENGTH: 409 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear
(xi)
ATCAAGAGTT
TGCTGATCGG
120
TTCCTCTTGG
180
TACTACCATG
240
ACCATCACGG
300
CTCGTTGTCA
360
CGGCAGGTGA
409 SEQUENCE DESCRIPTION: TGAGTCTAAA CCTTGTCTAA.
CCGCAGCTGC ATTCTCTTCG GAGCTCTCCT CCTCTTCTCT ATTTTGTCGT TCAAGCGACC TGAACGGGCA ATTCCCGGGT SEQ ID NO:41: TCCTCTCTCG CATAGTCATT TGGAGACGAA TAAAACATGA CGGCTGTCGG CAAAACCTCT GTGGCGCTGA CATTGGCAGA TGCAAAAGTT AAGGTGAAGA GGCTGTGCAC GACCCACAAC CCGACTTTGG AAGTTAACGA CGGCGACACC TACAACGTCA CCATTCACTG GCACGGCGTC GCGGAATTTG TGACTCAAT
ATGTCGTCAA
GATCTGGTTG
CAAAGCTCGC
GGCTGATGGG
INFORMATION FOR SEQ ID NO:42: (i) SEQUENCE CHARACTERISTICS: LENGTH: 515 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO:42: CTCTCTCTCT CTCTCTCTCT GTGTGTTCAT TCTCGTTGAG CTCGTGGTCG CCTCCCGCCA TGGATCCGCA CAAGTACCGT CCATCCAGTG CTTTCAACAC TTCTTTCTGG ACTACGAACT 120
CTGGTGCTCC
180
TTCTTGAGGA
240
AGCGTGTGGT
300
TTTCCCAGCT
360
TCCGTTTCTC
420
GTTTTGCTGT
480
CTGTCTTCTT
515
TGTCTGGAAC
TTATCACCTC
GCATGCCAGA
TACCTGTGCT
CACTGTCATC
GAAGTTCTAC
TGTCCGTAAT
AATAACTCTT
GTGGAGAAAC
GGAGCCAGTG
GATTTCCTTC
CACGAAAGGG
ACAAGAGAGG
GGGATAAATT
CGTrTGACTZ-T TGGALAGCAGA
TTGCCAACTT
CAAD.GGGATT
GGGCACCAGG
GCAGCCCTGA
GTAACTTTGA
CCCCG
TGATAGGGAG
CTTTGAGGTC
AGTTC.AAACA
AACCCTGAGG
TCTGGTGGGA
GG~TCCAATTC
AGGATTCCAG
ACTCATGACA
CCCGTGATTG
GACCCTCGAG
AACAATTTCC
INFORMATION FOR SEQ ID NO:43: SEQUENCE CHARACTERISTICS: LENGTH: 471 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (xi)
GAATTCGGCA
ATTTGCGGCG
120
GTAAGCCAGG
180
GCTGAGAAGA
240
GATGTGAAGA
300
CACGGGGCCA
360
TTCCCCGTCA
420
ACTGGTGGAC
471 SEQUENCE DESCRIPTION: CGAGGCTCCC TCTCGTACTG ATCCATTTCT CGATTCAAGG AGTACAAGAA GGCTGTCGAG
GCTGCGCTCC
CGAAGACCGG
ACAGCGGGCT
TCACTTATGC
CTGAAGTTGC
GCTCATGCTC
AGGCCCGTTC
CGACGTTGCC
TGATTTCTAC
7 TTTCACCCG SEQ ID NO:43: CCATACTCCT GGGACGGGAT GGAAGAATCA TGGGGAAGTC AAATGCAAGA AGAAGTTGAG CGCATCGCGT GGCACTCCGC GGGACCATGA AGCACGCCGC GATCAGGTCT TGCAGCCGAT CAGCTGGCTG GCGTCGTTGC GAAGAGAGGC AAACCACAAC
TCGGATAGGG
CTACCCGACC
AGGCCTCATC
CGGTACCTTC
GGAGCTCAC
CAAGGATCAG
TGTGGAAGTT
C
INFORMATION FOR SEQ ID NO:44: (i) SEQUENCE CHARACTERISTICS: LENGTH: 487 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (xi)
GAATTCGGCA
TCAGTTTCGT
120
CGTCCAGCGC
180
ATGACTCATC
240
TTGAGAAACT
300
GAGCCAGCGC
360 SEQUENCE DESCRIPTION: CGAGCTCCCA CTTCTGTCTC GCTCTCT-TCG TCATCTCTGC TTACGATTCC AGCTTTTGGA GCTGACTGTT GGAACTAGAG TGCCAACTTC GAGAGAGAGA GAAAGGGTTC TTCGAGGTCA SEQ ID NO:44: GCCACCATTA CTAGCTTCAA CTCTTGCCAT GGATCCGTAC CAACCAACTA CGGTGCTCCC GTCCGATTCT CCTGGAGGAC GGATTCCTGA GCGGGTGGTC CCCACGACAT CT CTCACTTG
AGCCCAGATC
AAGTATCGCC
GTCTGGAACA
TACCATCTGA
CATGCACGGG
ACCTGTGCTG
0 ATTTCCTCCG GGCTCCTGGA GTCCAGACGC CCGTAATCGT CCGTTTCTCC ;.CCGTCATCC 420 ACGAGCGCGG CAGCCCGAAC CTCAGGGACC CTCGTGGTTT TGCAGTGAAG ='TACACCA 480
GAGAGGG
487 INFORMATION FOR SEQ ID SEQUENCE CHARACTERISTICS: LENGTH: 684 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID
GAATTCCTGC
GCGCGCCTGC
120
TTGTTGGACG
180
CCCAAGGAAG
240
GTCTGTTTTG
300
GAGGTGATGC
360
CCAGGGCAGA
420
AAAGAAGCAG
480
GCTCTGCGAA
540
CACTCCATCG
600
GACATGGCTC
660
ATGCGAATCT
684 AGCCCGGGGGu
AGGTCGACAC
CCATGGAAGC
GACTGGCTCT
ACGCCAACGT
AAGGGAAACC
TCGAAGCCGC
CGCGGCTTCA
CAT CGC CACA
AGCGGGAGAT
TCCACGGCGG
CTTTGGCAGC
TCTCCGGAAA
CGTCAACGGC
GCTGGGCGTG
GGAGTTCGTA
GGCCGTCATG
CGAGAAAGAC
GTGGTTGGGG
CAATTCCGTC
CAACTTCCAG
CGTC
GCCGGGATTC
ACAGCGGTGG
CTGGCTGAGA
GATCCGTTAA
GAGTTCCTCC
CCGTTGAGCA
CCTCCGATCG
AACGACAATC
ATCCACTAGT TCTAGAGCGG TAGTGGATCC AAAGAATTCG
CCGCCACCGC
GCACGAGGCC
TGGAACCGTT
GATCCGCCGT
TTCTGTCTGC
CCCACCAGTT
TCGACGGTAG
AACCGAAACA
AAGTCATCCG
CGTTAATCGA
GGTGGAGCTC
CGACGGCCAC
TAAACTGCAG
GGCCGCGTCC
GCTCTTCTGC
GAAGCACCAC
CGACTACGTG
AGACCGCTAC
CGCTGCTACT
TGTCTCCAGG
GGAACACCCA TCGGAGTTTC CATGGACAAC INFORMATION FOR SEQ ID NO:46: SEQUENCE CHARACTERISTICS: LENGTH: 418 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (xi)
GAATTCGGCA
GAACTCCTGG
120 TAT GATTTGA 180
TGGTCATAAA
240 CAT TGGATTT 300
AAAGTGGCGG
360 SEQUENCE DESCRIPTION: CGAGGACAAG GTCATAGGCC CCCATTCTGA AATAAATAAT GTCCTCGGAT CTTTTTGTTG GCTTGATTTT GTTTTTCTTT GCCAGAAATA TGTAAGGGTG ATCATTTGGG TAGCATGCAG SEQ ID NO:46: CTCTCTTCAA ATGCTTGGAT CTTCCAAGAT CGCCTTTATA ATGCAGTTGT TTACCGATCT CTTTTGTTTT ATACTGCTGG GCAGATCATT TGGGTGATCT ATCAGTTGGG TGATCGTGTA
GGGTGGAAAG
CAACGACTGC
GGAATTTGAT
ATTTGCATCC
GAAACATGTA
CTGCTTTCAC
.1 TATTACTTAC ATATTTAA-AG ATCGGGAATA AAAACATGAT TTTAATTGAA. AIAAAAAA 416 INFORMATION FOR SEQ ID NO:47: SEQUENCE CHARACTERISTICS: LENGTH: 479 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION:
GATATCCCA
CGAGCAAGGA
120
TTAAAAGCAC
180
TTCGAGCAGC
240
CGTATTTGGG
300
CTGCCGTTGC
360
GGGTCGAGGA
420
GTGTCACTAC
479
CGACCGAAAA
AGAAAATATG
TGGGCTGTGC
CAAGGCCATG
AGCCAAGGAG
TCGAAGATCG
GAGTTCAAAC
TGGTTTCGGA
CCTGTATTTT
GTTGCAGCAG
ACGGACTTCG
GAAGGAAGTC
ATTTCCATTG
CAAGTGAAAG
TGGGTTCTCA
GCCACTTCTC
SEQ ID NO:47: CAGGGCGCCA TGGGGATCCG CAGAAATTAC GCAGGCCAAT GCTCGTCTGG CAGCGATCCA
ACTTTGAAGA AGTGAAAGCG AAGGGAAATC TCTGACAATC TGAAATTGGA TGCTGCGGCT CCCAGATGAC CAAGGGGACG ACAGGAGAAC GAACCAGGGA
GPATTCGGCA
GAAGTTCAAG
'CTGAACTGGG
ATGGTGGATT
TCr-AGACGTTLG
GCCAAATC-TA
GATACCTATG
GCCGAGCTT
INFORMATION FOR SEQ ID NO:48: SEQUENCE CHARACTERISTICS: LENGTH: 1785 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION:
TATCGATAAG
CGCCACCGCG
120
CACGAGGTTG
180
ATATGGCAAA
240
CGATCTCGCC
300
CGTGGTGTTC
360
TCACTGGAGA
420
GCACTACAGA
480
CGCCGAGTCT
540
TATTATGTAT
600
CAAGCTCAAG
660
TGGGGATTTC
720
CTTGATATCG
GTGGAGCTCG
CAGGTCGGGG
ATCTTTCTGC
AAGGAGGTCC
GATATCTTCA
AAGATGCGCA
TTCGCGTGGG
TCCACCTCGG
AGGATGATGT
GCCCTCAACG
ATTCCCATTC
AATTCCTGCA
CGCGCCTGCA
ATGATTTGAA
TCAAGATGGG
TGCACACCCA
CGGGCAAGGG
GGATCATGAC
AAGACGAGAT
GCATTGTCAT
TCGACAGGAG
GAGAGCGAAG
TTAGGCCCTT
SEQ ID NO:48: GCCCGGGGGA T.CCACTAGTT CTAGAGCGGC GGTCGACACT AGTGGATCCA AAGAATTCGG TCACAGAAAC CTCAGCGATT TTGCCAAGAA
CCAGAGGAAT
GGGCGTCGAG
GCAGGACATG
TGTGjCCTTTC
CAGCCGCGTG.
CCGTAGGCGC
ATTCGAATCC
TCGATTGGCC
CCTCAGAGGT
CTTGTGGTAG
TTTGGGTCTC
GTGTTCACCG
T TTACGAATA
GTCGCGGATG
CTCCAGCTCA
GAGGACGACC
CAGAGCTTTG
TATCTCAGAA
TTTCATCTCC
GAACCCGGAA
TCTATGGAGA
AAGTTGTCCA
TGAAATCCCG
TGATGTATAA
CGCTTTTCCT
AGTACAATTA
TCTGCAATGA
GATTAAAGAG
780
CAACAGTACC
840
TAGATGCTCA
900
TCAACGTTGC
960
TGAACCACCA
1020
GCGTGCAGAT
1080
AAACCCTTCG
1140
CCAAGCTCGG
1200
TGGCCAACAA
1260
AGGAGGAGAA
1320
GAGGAGGAGC
1380
ACTTGTTCAG'
1440
GAAGGGCGGG
1500
AGCTTCTGCT
1560
AAACACTCCA
1620 TAG CACT TCA 1680
AATAAAAGTT
1740
GGGGAATTTT
1785
AAACGGCTV-.
AAGACTAGTA
GGACAAGGGA
AGCAATTGAG
GGACATTCAG
AACGGAACCA
TCTCCGCATG
GGGCTACGAT
CCCCGCCAAC
GCACACCGAA
TGCCCGGGAA
AACTTCCACC
CAGTTCAGCC
TAATCCCAAC
TCTATCATGA
AAAGTTTGCT
TGCATAAATT
ACTGCTAAAA
CTCTTTTCAA
CCAACACCGG
GAGATCAATG
ACAACGCTGT
AGCAAGGTGC
GACACGACAA
GCGATCCCGT
ATTCCGGCAG
TGGAAGAACC
GCCAATGGCA
TCATTCTGGC
TTCTGCCGCC
TTCACATTCT
TTGTCAGTGA
CTGTGTrGTGC
AGGATTTCAA
AAATGATATT
AAAAAAAAAA
GGACTACTTO
GGGAGCTCAA
AGGATAATGT
GGTCGATGGA
GCGCAGAGCT
GGTTGCCCTA
TGC'TCG?,CCC
AGAGCAAGAT
CCGAGGAGTT
ACGACTTCAA
GCTGCCTC-TC
GCCCGGGCAG
CAACCATTCT
CTGGTATATA
GTGTCCACTG
TAACAGACAC
TCAATATACT
AAAAAAAAAA
GTGr'GAAGAGC
GTGTGCA.ATG
TTTGTACATC
ATGGGGAATA
GGACGCTGTT
CCTTCAGGCG
CCACATGAAT
CCTGGTGAAC
C!CGCCCCGAG
ATTCCTGCCT
CTCGCACTCT
AGCAAAGTGG
CTCATCGTCG
AATGCGCGCA
TCGAGTCTAC
CGTCAATTAT
ATTTTGACTC
AAAAA
ZCAAGAAGCT
GACCATATTT
GTTGAGAACA
GCGGAGCTGG
CTGGACCAG
C3TTGTGAAGG
CCACGACG
GCCTGGTGGT
,--GTTCTTCG
TCGGTGTGGG
CCATCGGAAG
AGTCACTGA
:-AAGCCCAT
-CTGAACAAA
TAAGAGCTCA
GTCATGTTtC
TCCACCAATT
C
INFORMATION FOR SEQ ID NO:49: SEQUENCE CHARACTERISTICS: LENGTH: 475 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (xi)
GAATTCGGCA
GTCCTCGTCC
120
TAGGAACCTG
180
ATTTTCCAGC
240
ATTCACTGTG
300
CGAGGCGCTC
360
GAAAATCTTC
420 GAG CCTTCGC 475 SEQUENCE DESCRIPTION: CGAGATTTCC ATGGACGATT TCGTTTTCCT TGTTCTTCCT CCGCCAGGAC CCCCGGCATG GGCGCGTTCG AGACCTCAGT TGGCTCGGTT CCCGCCCTCT GTACAGAAGG GCTCCGTCTT AGTAGCAACC AGCACAACAT AGGAATCTGG TTAAAGAAGC SEQ ID NO:49: CCGTTTGGCT TCAATTCGTT CCGACTTTTT CTCTGGAAGC GCCGATCGTA GGGAACGTCC GAAGAAATTC CATGAGAGAT GCTGATGATC ACCGACCGCG CGCTGACCGC CCGCCCGCCC CACTTCGGCT GAATACGGCC CCTGAGACTT CGGCGATGAA
TCCTCTGGCT
TATGGCGTAA
TTCAGATTGG
ACGGTCCAAT
AGCTTGCCCA
TCGGGATGCA
CGCTGTGGCG
GGCTT
INFORMATION FOR SEQ ID SEQUENCE CHARACTERISTICS: LENGTH: 801 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO:
GCTCCACCGA
GGCAATTCGA
120
CGTATGTTGT
180
GATTACGCCA
240
GGTGGCGGCC
300
CAGGCCTGAG
360
GCCATTCGGT
420
GTCTATGTTG
480
AGAAGACATA
540
GGCCATTGCT
600
AATTGATCTG
660
CTCCTTTTCC
720
CTGAAGCCCA
780
AATTGAGTAT
CGGTGGACGG
TTTAGCTCAC
GTGGAATTGT
AGCGCGCAAT
GCTCTAGAAC
AGATTTCTTG
GCAGGGCGCA
GGACACCTGC
GATCTCACAG
ATTCCTCGAT
ATAGTAAGTT
ATAGTCAACA
ACTTCTAGCA
TTCTCTGTAG
TCCGCTACTC
TCATTAGGCA
GAGCGGATAA
AGTAACTGAG
CCCCAGGCTT
CAATTTCACA
TGGGATCCCC
TACACTTTAT
CAGGAAACAG
CG~GGCTGACA
GCTTCCGGCT
CTATGACCAT
TAACCCTCAC TAAAGGGAAC
TAGTGGATCC
AGGAAGATGT
GGATCTGCCC
TTCATCATTT
AGAATCCAGG
.AAAGAATTCG
TGATATTAAG
TGGTGCACAA
CGTATGGGCA
GCTTGTTACT
AAPIAGCTGGA GCTCCACCGC GCACGAGAC.C CAGTGACCTT GGCCATGATT. ACAGGCTACT TTGGGTATTA ATTTAGTTCA CCTCCTGAGG GAATGAAGGC TTCATGGCCA AGCCTGTGCA CGACAGCCAC TCAATTGATC AACGAAATAA CGTGCAGTTT GCGCATGCAG CTTTCTTTCT GAACAAATAC CTATTCCTCA TGCCTGATCA TCTCTACAAG
TGAATTTTGT
TGCAGCTTTC
AGCAATAACT
TTTGATACAA
TTTCTCTGAA
GTATATTTTA
INFORMATION FOR SEQ ID NO:51: (i) SEQUENCE CHARACTERISTICS: LENGTH: 744 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: GGGCCCCCCT TCGAGGTGGA CACTAGTGGA AAGGACGCTG TGCTTGAAGG CTCCCAGCCA 120 GAGTACCCGG CCATCGATCA GAGATT CAAC 180 TCTACCATGT TGATGAACAA GATTTTGGAT 240 TTGGTGGATG TGGGAGGAGG TATTGGGTCG 300 CACATTTCAG GAATCAACTT CGACTTGTCC 360 GCTGTGAAA~C ATGTGGGTGG AGACATGTTT 420 ATGAAGTGGA TTCTGCATGA TTGGAGCGAT 480 SEQ ID NO:51: TCCAAAGAAT TCGGCACGAG TTCACCAAAG CCCATGGAAT AAGATTTTCA ACAGGGCTAT ACTTACGAGG GTTTTAAGGA ACTCTCAATC TCATAGTGTC CATGTGCTGG CCGATGCTCC GATAGTGTAC CAAGTGGCCA.
GATCATTGCA GGAAGCTTTT
GTTTTATCTG
GAATGCGTTC
GTCTGAGAAT
GGTTCAGGAG
TAGGTATCCC
TCACTACCCA
AGCTATTTTT
GAAGAATTGT
CACAAGGCGT TGCCAGAGAA GGGGAAGGTG ATTGCGGTGG ACACCATTCT -CCAGTGGCT 540 GCAGAGACAT CTCCTTATGC TCGTCAGGGA TTTCATACAG ATTTACTGAT GTTGGCATAC 600 AACCCAGGGG GCAAGGAACG CACAGAGCAA GAATTTCAAG A-TTTAGCTAA GGAGACGGGA 660 TTTGCAGGTG GTGTTGAACC TGTATGTTGT GTCAATGGAA TGTGGGTAAT GGAATTCCTG 720 CAGCCCGGGG GATCCACTAG TTCT 744 INFORMATION FOR SEQ ID NO:52: SEQUENCE CHARACTERISTICS: LENGTH: 426 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (xi)
GTGGCCCTGG
GGTTTCTCTC
120
AGCTCAAGAA
180
CATTGCTGGG
240
TTGTCTATCT
300
AGGCGTTTCT
360
CGTACATCGC
420
TGGAGC
426 SEQUENCE DESCRIPTION: AAGTAGTGTG CGCGACATGG TTGGCTTGCT CTCTACATTG GAGGCGCCTC CCGCCGGGCC AGCGATGCCT CACGTTACTC CAAACTGGGG ACGTCCGACA GAAGACTTTG GATATAAACT CTACGATTCT CAGGACATGG SEQ ID NO:52: ATTCCTTGAA TTTGAACGAG GATTTCGTTA TGTTTTGAGA CATCGGGATG GCCAGTGGTG
TCTACAACAT
TGGTTGTGGC
TCTCCAACCG
TGTGGGCAGC
GTATAAGAAA
CTCCACGCCC
GCCGGGAAAT
GTATGGAGGA
TTTATGTTGT
TCGAACTTGA
GGAAGTCTGC
TATGGCCCCG
GCTGCAGCTA
GCAGGAGCCA
CGGTGGAAGA
INFORMATION FOR SEQ ID NO:53: (i) SEQUENCE CHARACTERISTICS: LENGTH: 562 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear
(xi)
CAGTTCGAAA
ACACTAGTGG
120
CAGGATTTCT
180
AGAGGTGAAG
240
AGTGGGACAC
300
CGTGTACCCT
360
CTGGAACCTC
420
CATTAACGCC
SEQUENCE DESCRIPTION: TTAACCTCAC TAAAGGGAAC ATCCAAAGAA TTCGGCACGA TCTTGTCCAA ACAGGTTTAA GCTCAGACAA CCCAAGCAGA AAAAGTCTTT TGCAGAGCGA CGTGAGCCCG AGCCAATGAA ATGACTACTT CTGCCGATGA AAGAACACCA TGGAGATTGG SEQ ID NO:53: AAAAGCTGGA GTTCGCGCGC GCTTTGAGGC AACCTACATT GGAAATGGCA GGCACAAGTG GGAGCCGGTT AAGGTTGTCC TGCCCTCTAT CAGTATATAT GGAGCTCCGC GAAGTGACTG GGGTCAATTT CTGGGCCTCC GGTGTACACT GGTTACTCGC
CTGCAGGTCG
CATTGAATCC
TTGCTGCAGC
GCCATCAAGA
TGGAAACGAG
CCAAGCATCC
TGCTGAAGCT
TTCTCAGCAC
AGCCCTTGCA TTGCCCGATG ATGGAAAGAT TCTAGCCATG GALCATCAACA GAGAGAACTA 540 TGATATCGGA TTGCCTATAA TT 562 INFORMATION FOR SEQ ID NO:54: SEQUENCE CHARACTERISTICS: LENGTH: 1074 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO:54:
TCGTGCCGCT
GCACGCTGAG
120
GAATGGGAAA
180
GCTTCAAAAC
240
TCGAGAGGAC
300
GGAGATGATG
360
TGCTAAAAAG
420
CGCTCTTCCT
480
CTGGCCCTGC
540
GTTGGACTAT
600
CGTGGACGCA
660
CGTGGGGGGC
720
TCCCCACAAC
780
CAACTCCATG
840
TGTCACTGTT
900
TATATGATAA
960
TTTCTGTTTA
1020
GCCGCTTGCA
1074
CGATCCTCAC
AATGGCAACG
CCAGGTTATG
GTTATGGAAG
GAGGCGTCCA
ACATTGCCAG
GCATTGGAGA
TCTGATGGCA
TTCGTTAAGG
TTGGATTCCC
GACAAAGTGA
GTCATAATTT
CTGCTTAAGA
GTAGCCAACG
TGTTACCGCA
TGGCGTCGAT
GCCAGAATGT
AATCCATTTA
AGGAGCTCTG
ATCAGGTGCA
TCGGAGTTTT
AGGTGGTAGC
CTGGAGTTGC
TTATTCAAAA
ACTACGTGAA
ACGACGACAC
ATGATTACAT
ACCCCAACTT
CTGCTTAGTT'
CAAGTACATG
GGAACGAACA
GTTCCTGCGC
CACTGGCTAT
TGTGGATCCA
AGACAAAGTG
GGGGGAGAAG
CTATCATCCA
CCTCTGGTTT
GAGGACTTCT
GGAGGTCGCC
AGCTAGTCCT
AGGCCCTTTT TATTTCCCTG GGGTGGAGGT 'TGTTGATCCA
ACAAGCGTAC"GCTGCCTGCG
GTGAACGATA
ACGGACTTAA
GACTGGAAG T
CTGGAAACAT
TGGAACCTGA
TTGATGGTAA
TCATTGCTCA
GGAGATGACC
GAGATCAAGA
GATTGCTTCG
CGGCTGATGA
GGTCTGGTGG
CTGGAGGGTA
ACAGTCTTTA
CCGTCATTCT
CAATGTTTCT
TTAAAGCCAG
GACTAGTTCT
CGATGGGCTC
CTGACATCGA
TTGGAGTGAA
TCACCCGCCA
CACAGAGAGG
AGATGTCAGG
ATATCGCTCT
CCAAATTTGG
AAACTACAGG
ACTTTGCATT
AGTTAGTGCG
GAGGAAAGGA
TCAAGGCCAT
TGGGATATGG
GCTATGTATG
ATCGTCATGT
AALTAAAATTA
CTTC
TTCTGATATA"GGTGGTTTTT
TTCGATCGTC ATGGTTTCTG GTTCAAAAAA AAAAAAAAAA AAAAACTCGA INFORMATION FOR SEQ ID SEQUENCE CHARACTERISTICS: LENGTH: 1075 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ, ID TCCGAGCTCT CGAATCCTCA CAGGCCCTTT TTATTTCCCT GGTGAACGAT ACGATGGGCT
CGCACGCTGA
120
AAGAATGGGA
180
AGCTTCAAAA
240
ATCGAGAGGA
300
GGGAGATGAT
360
GTGCTAAAA.A
420
TCGCTCTTCC
480
GCTGGCCCTG
540 GGTTGGACTAr 600
TCGTGGACGC
660
GCGTGGCGGGG
720
ATCCCCACAA
780
TCAACTCCAT
840
GTGTCACTGT
900
GTATATGATA
960
TTTTCTGTTT
1020
AGCCGCTTGC
1075
GAATGGCAAC
AACCAGGTTA
CGTTATGGAA
CGAGGCGTCC
GACATTGCCA
GGCATTGGAG.
TTCTGATGGC
CTTCGTTA.AG
TTTGGATTCC
AGACAAAGTG
CGTCATAATT
CCTGCTTAAG
GGTAGCCAAC
TTGTTACCGC
ATGGCGTCGA
AGCCAGAATG
AGTTCAAAAA
GGGGTGGAGG
TGACAAGCGT
GAATCCA'TTT
AAGGAGCTCT
GATCAGGTGC
ATCGGAGTTT
AAGGTGG7AG
GCTGGAGTTG
CTTATTCe'AA
AACTACGTGA
TACGACGACA
AATGATTACA
GACCCCAACT
ACTGCTTAGT
TTTCTGATAT
TTTCGATCGT
AAAAAAAAAA
TTC-TTGATk-C
CGCTGCCTGC
ACAAGTACAT
GGGAACGAAC
AGTTCCTGCG
TCACTGGCTA
CTGTGGATCC
CAGACAAAGT
AGGGGGAGAA
ACTATCATCC
CCCTCTGGTT
TGAGGACTTC
TGGAGGTCGC
TAGCTAGTCC
AGGTGGTTTT
CATGGTTTCT
AAAAAACTCG
AACGGACTTA
GGACTGGAAG
GCTGGAAACA
ATGGAACCTG
CTTGATGGTA
TTCATTGCTC
AGGAGATGAC
GGAGATCAAG
GGATTGCTTC
TGGTCTGGTG
TCTGGAGGGT
CACAGTCTTT
TCCGTCATTC
TCAATGTTTC
GTTAAAGCCA
AGACTAGTTC
ACTGACATCG
TTTGGAGTGA
TTCACCCGCC
ACACAGAGAG
AAGATGTCAG
7AATATCGCTC
CCCAAATTTG
AAAACTACAG
GACTTTGCAT
AAGTTAGTGC
GGAGGAAAGG
ATCAAGGCCA
ATGGGATATG
TGCTATGTAT
TATCGTCATG
GAATAAAATT
TCTTC
INFORMATION FOR SEQ ID NO:56: SEQUENCE CHARACTERISTICS: LENGTH: 1961 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO:56: GTTTTCCGCC ATTTTTCGCC TGTTTCTGCG GAGAATTTGA TCAGGTTCGG ATTGGGATTG
AATCAATTGA
120
GGTCGAGCAT
180
TCATTCGTAT
240
GGCGACAGAC
300
CGGTCTGGCG
360
CATCGAATTT
420
CAATCCTTTC
480
TCATAGTTAC
540
TCGTCATCAC
600
AAGGTTTTTA
CTGTACAGAT
TGCTTTGAGA
AGAACTTATT
A.AGCTCGGGT
GCGTTTGTGT
TACAAGCCGG
CCTGGCAGCT
AATCGATGAT
TTTTCAGTAT
CGAAGCTTCC
GAGTAGCGGA
GCTTTTCAGA
TGCAGCAGGG
TCATGGGGGC
GCGAGATCGC
TATGTGGAGA
GCTCCCAAGG
TTCGATCGCC
CGATATCGAG
ATTCGCAGAC
GGTGGAACTG
GCAGGTTGTC
CTCTGTCCGG
CAAACAGGCC
AACTGGCCGA
AAGGTTGCCA
ATGGCCAACG
ATCTCCGACC
AGACCCTGTC
ATTTCTCGCA
ATGCTTCTCC
GGCGCCATTG
hAGGCCGCGG
TCTGCAGAGC
ACATATTTCC
GAATCAAGAA
ATCTGCCTCT
TGATCGATGG
AGGTCGCTGC
TTCCGAATTG
TGACCACGGC
GCGCGCGCGA
CACGATGTGC
GTTCTGACCG
AAGCCGACGA
660
CCTATTCTTC
720
TGTCCAGCG'T
780
ACGTGATACT
840
GCGCGCTCAG
900
TGGAGCTGAT
960
ACATCACAAA
1020
CCGGCGCTGC
1080
CCATTTTCGG
1140
CCTTCGCAAA
1200
CTCAAATAAA
1260
AAATCTGCAT
1320
CCGCTACAAT
1380
ACGAAGAAAT
1440
TGGCTCCTGC
1500
TCGTTCCTCA
1560
CGGAAATCAG
1620
AAATACACAG
1680
GAAAGGATTT
17
ATTCCTTTGC
1800
GTGCAGAGTA
1860
CAACGCCCTA
1920
TTTTTTATAA
1961
AACCCAATGC
CGGAACCACG
TGCCCAGCAG
CTGTGTCTTG
AGCCGGGGC'T
TCAGAAATAC
GAGCCCCATC
GCCTCTCGGG
GCAGGGCTAC
GAATCCTTTC
GATCCTCGAT
CCGCGGACCC
CGATGAAGAA
CTTCATAGTC
TGAGCTGGAA
AAAGCACGAG
CGAGCAGGAA
AGTTTACTTT
GAGAAGCAGA
CGATAATTAT
AGCGCCCTAT
CACTCTTGCG
CCGGCCGTGA
GGGCTCCCCA
GTCGATGGTG
CCTCTTTTCC
GCGACCCTGA
AAGGTTACCG
GTTTCCCAGT
AAGGAACTCG
GCATGACAG.
CCCGTCAAAT
ACAGAAACTG
GAAATAATGA
GGCTGGCTCC
GACAGAGTAA
GCTTTACTTG
GAGGCGGGCG
ATCAAGGAAT
GTGGATGCGA
CTGGCAGCAA
AGGATTCCTT
AAGGAGAGAG
ATCGCTTTCA
CAATCCACCC
AGGGCGTGAT
AMAATCCCAA
ACATCTATTC
TTATGCAGAA
TTGCCCCAAT
ACGATGTICTC
AAGATGCCCT
AAGCAGGCCC
CTGGCTCCTG
GCGAGTCTCT
AAGGATATAT
ACACAGGCGA
AGGAGATTAT
TTGCTCATCC
AGGTTCCGGT
TCGTGGCAAA
TTCCTAAGTC
AATGAAAATG
TCTGTTCACT
AGAGCTTATC
ATATGCATAT
AAAAAAAAAA
GGACGATGTC GTGGCGTTGC GTTAACGCAC AAAGGCCTGG TCTGTATTTC CATTCCGATG TCTCAATTCG GTTCTCCTCT ATTCAACCTC ACGACCTGTC TGTGCCTCCA ATTGTCCTGG GGCCGTCCGG ATAATCATGT CAGAGAGCGT TTTCCCAAGG GGTGCTGGCA ATGAACCTAG .CGGAACAGTC GTCCGGAACG CCCGCACAAT CAAGCCGGCG TAACGACCCG GAATCCACGG CGTCGGGTAC ATTGACGATG CAAATATAAG GGCTTCCAGG GTCAATCGCT GACGCAGCAG GGCGTTCGTG GTGAAGTCGT tGCAGGTGATT TTCTACAAGA GCCGTCCGGC AAGATTCTGA AATTTCCATA TGATTCTAAG TCTATTTATA TAATAAAGTG AATTGTATCA TATGGATTGT TACTATAAAC GATATATGTT
A
C
ATTTACTGCA CTTCTCGTTC
INFORMATION FOR SEQ ID NO:57: SEQUENCE CHARACTERISTICS: LENGTH: 1010 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: GACAAACTTG GTCGTTTGTT TAGGTTTTC TTGCAGCATT AAGCAAAGAA GATGAGTTCA 120 TTCCAGAGAA TATAAGTCTT TTCCAGTTTG 180 AGGTGGCCCT CGTGGAGGCC TCCACAGGGA 240 SEQ ID NO:57: TGCAGGTGAA CACTAATATG TTTTTCACAG CCCTTTTCCT TTCTGGAAGG TGCTGAGAAA AGGAGTACAA CTATGGTCAG
GAAGGCCAGA
GCAGTACCTG
TACCGTGATA
GTGATTTCGC
TCACAAGGAA
300
TTGTTCTGCT
360
GCGCAGTGTT
420
AGGATTCTGG
480
TGAAACTGCC
540
AAATTTTTGA
600
TGTGTGCACT
660
ACAGAAATCT
720
GAAATTTCAC
780
GTTGCGCCAC
840
ACTTTATCAG
900
TGCTCTCCCT
960
TCCAAAGCTG
1010
TGTTGCAGCT
TCCAAATATG
TTCTGGGGCA
AGCAAAGATT
TGTTATTATT
GAGAAACTAT
CCCTTATTCC
GATTGCAAAT
CACGTTGGGG
TCTTCGCA.AC
TTCTTTGATT
CCGGTTTA.AA
TTCATGACTG
GGGCTCGTGG
GCAGAATACC
AATCCTTCTG
GTTGTGACAG
GCAGATAACG
GAGGCCGCAG
TCTGGCACCA
CTGTGCTCTA
CTGATGCCAT
GGAGGCAAGG
ACTTATGAGG
AATCCTATCG
ACAAAGGCAT TCAAAAGGGC CCATT1ATTGT GCTGGGAATA CACACATCAA TGAAGTTGAA
TTGGGTCTGC
AGCATGTCAT
GGCOTTTTGT
CAGGGGCCTC
GCTTGTTTGA
TCTTTCACAT
TCGTGGTCAT
TCA.ACTTCGC
TTAACGAGTT
TTATGAGAAG
GAACACAATT
ACAAATTTGT
TAAAGGTGTC
TGTCCATGAA
ATATGGCATC
GTCCAGATTC
GCCTA7TGTC
CGATCTCAGC
G~ATGTTGTAT
ATGTTGGCCG
AAACA'TATCC
GTGAGGCAAG
CCATTGCAGG
C.AGGATGATC
ATG'CTCACTC
TCTCTTGTAG
CGGGCATC7
.ATCTCCGAC
CCGCCTATA.A
CGCTTGAAi& CGGCTGCTCC ACTGGCGCCG GATCTACTGC
INFORMATION FOR SEQ ID NO:58: SEQUENCE CHARACTERISTICS: LENGTH: 741 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO:58:
GAATTCGGCA
GTCAAAGGAG
120
TACCCAGATG
180
CGAGATTCTC
240
GCATTTGCCA
300
TTTAGTATAG
360
AACAAATACT
420
TCTTTGGA.AA
480
CCAAAATCAT
540
CAGTGAATAA
600
GTGTTAGTGA
660
AATCTTGATG
720
AAAAAAAAA-A
741
CGAGACCATT
ATCAAACAAA
TGAAATATAC
TTCCACATGC
AATTGTGGGT
TATGACGAGC
CACCTGTGGT
CCGCTTAGTG
GGCTGATGTG
TTTTGTTAGA
ACGGAATGAT
GATTGTGTCT
TCCAGCTAAT
TTTTGAA.ATT
CACTGTCGAT
TTCAGAGATA
TATAATCCTT
TAGGCACTGC
TTGTTTTCTT
TGGAATGCTA
AACTGGTTGT
GTGTTTAGAT
GTCAAATCTT
TTTTCAATGG
ATTGGCATAG
GGACCTAATG
GAGTACCTCA
'CATAACAGTT
CGTAGGTGTT
AGATCCTTCA
TCTTTCTGGA.
AGTACTAGTG
TCCAGAGGGT
CCATCTTTAC
GATGGGCTGA
TAAAAAPAAA
CAATTGGTCA
GTGTGGAGGC
GCAAATTTGT
TCAATCAATG
TGGCAGAACA
CACTTTTCTC
ACTTTGGTAT
TCCAGAGTTC
GTTTACAACC
AAGGCTATTG
CTGACTCTCT
AAAAAAAAAA
TTCTATCTTT
TAGTCAGCTA
GTGAAGTATG
TTTGTCCTAG
GAACCTCCTG
TTCCATAAGA
GGCAATAATG
TAAGGGAGTT
AACAGTTGTT
AGTAAGGTTG
TGTGATGTCA
AAAAAAAAAA
AAAAAAAAAA A INFORMATION FOR SEQ ID NO:59:
(i) SEQUENCE CHARACTERISTICS: LENGTH: 643 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO:59:
CTCATCTCGG
GCAGAtGAAG' 120
ATGGCTTGTC
180
CCCAGCTGAT
240
GCTTGAGCTT
300
TTGCCACGGT
360
ACCAGCAGTC
420
CGTTGTTTTT
480
GATTGTCCAT
540
TGCTATGCAA
600
GAGCTTGCAG
643
AGTTGCAGGC
CAAACGGATC
AAGATGCTCC
GATAGGTGGA
GTGAAAGCTG
GTCTTTCACA
GAAGGGACGA
ACTTCTTCCA
GATGACTGCT
AAACCTTGGC
TGATAAATCC
TGCAGCTTTT
AAACAGTTTG
TCATCAGAGG
AGTATGAGCA
ATATTCTCCA
TGGCTTCAGT
GGAATGTGAT
TCGGCGCAGT
GGAGCGATTT
AGAGAAATCT
AGGCCTGGCC
GGCCCAAAGC
CGTTACTGGA
TTACACTGTC
TCTGCGAGAG
TTACCAGAGC
TCTCAATGAT
GGAGGCCTGC
TTACATGA.AT
GACTACTGCG
GCATGGGATA
TTAGGTCCCT
ATGATATCAG ATCAAACGAC
GCAGCGGGT.T.TCATTGCCTC
AGAGCAGCAG TrTCGGACCAA TTGGAAGGAG CAAAAGAGAG T.TACTCACAG TCATCAGAGG GACCCTGAGC ?.AGTGATAGA GCAGAAACTG GGGTGAAGCG CCTCATAGAG ACCCGCTCGC
TACAAACCAA
TTGCTAAGGG
TGA
GAAT TGG TAT
AAGGAATTTA
INFORMATION FOR SEQ ID (i) SEQUENCE CHARACTERISTICS: LENGTH: 441 base pairs TYPE: nucleic acid (C) STRANDEDNESS: single TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: GAATTCGGCA CGAGAATTTT TCTGTGGTAA ACGATGTCAG CATAACAAAC TCCAAAGGAT 120 TGGCATCTTG GCTTATCAAG CGTCTCCTCC 180 GGGATCCTGG CAATGAGAAA AAGATGGCTC 240 GACTGCAACT AATGAAAGCT GATTTAATGG 300 GCTGCCATGG TGTTTTTCAC ACAGCGTCTC 360 TATGGTATGC TCTGGCCAAG ACTTTAGCAG 420 ACCATCTGGA CATGGTTGCA G 441 SEQ ID NO:6
GCATATCTAT
TGGTATGCGT
AGTGTGGTTA
ATTTATGGAA
ACGAGGGCAG
CAGTCGTGGG
0: GGCTCAAACC AGAGAGAAGG GACAGGAGCG GCTGGTTACT.
CCAAGTGAGA GGAACTGTGC G TTAGATGGG GCGAAAGAGA CTTCGATGAG GTCATCAGAG TGTCAAATCA GATCCCAAGA AAAA AGCAGC ATGGGATTTT GCCCAAGAAA INFORMATION FOR SEQ ID NO:61: (i) SEQUENCE CHARACTERISTICS: LENGTH: 913 base pairs TYPE: nucleic acid
STRANDEDNESS: single TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO:61: GAATTCGGCA CGAGGAAAAC ATCATCCAGG CAT TTTGGAA ATTrTAGCTCG CCGGTTGATT
CAGGATCCTG
120
AATCCTCOGG
180
TGGCTCATCA
240
GGTAATCCTS
300
CTCTGGAAA;G
360
GGTGTTT-TC!-
420
ATTAAGCC"-A
480
GTGAAGCGAG
540
ACACCAGGCA
600
AAAATGACAG
660
TTTGCAGA.GG
720
TTCATTATGC
780
GAACCCCACT
840
TCACATATCT
900
GATGCTACCC
CAATGGCTTT
TCCATCGAGG
TGCGATTGCT
TAAAGACAAA
CAGATTTGGA
ATGTTGCCAC
CAATCAACGG
TTGTTTTCAC
AAGTTTTTGA
GATGGATGTA
AGAACAAGAT
PLGACCATGCC
ACATGATACT
TTGTATATGA
ATT
TGGCGAAGAG
AACAGTGTGC
TGAGCGAGGA
GCATCTGTTG
TGATGAAGGA
TCCCATGGAT
GGTCTTGAAT
GTCATCTGCT
CGAATCATGC
CTTTGTATCG
CGATCTCATT
ACCGAGCATG
GAGACAGGTA
ACATCCTGAA
CAGACTGCCT
GTTACAGGAG
TATAGTGTTA
GATCTGCCGG
AGCTTTrGATG
TT.CGAGTC-CG
GTTATGAGAT
GGGACTC-TGA
TGGACCA.ACG
AAGACAT TAG
ACTGTTATCC
ATCACAGCCT
CAGCTGGTTC
GCAAAGGGCA
TGCCACAAGA
C TGCTGGGTT
GAGCAACTGT
GGGCAAA.TGA
CTGCCATTGA
AGGATCCCGA
CGTGTGCAA.A
ATTTTACAGA
1'GGATCTTTG
CAGAGAAAGC
CCACATTGGT
TGGCACTGTT
ACTTGGATGA
GATACATCTC
AACGCCTTTG
CATAGGGTCA
GCGAGACACT
GAGATTGA.CT
.GGGTGTGAG
GAATGAGIATA
AGCCAAGTC
TGATTTCCAA
CAGAAAA;G -17
TGCTTGGGAT
CGTTGGACCA
AACGCGGAAT
TCTCTGTATG
TTCCACATGT
INFORMATION FOR SEQ ID NO:62: (i) SEQUENCE CHARACTERISTICS: LENGTH: 680 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (xi)
GAATTCGGCA
ATTGATCAC-T
120
AGATGAAGCA
180
CGGCTCTTGG
240
GGACCCAGA
300 GCTAAAGCTZn 360
TTGTCAAGGG
420
GGAGGTTGTT
480 SEQUENCE DESCRIPTION: CGAGATCAAT TTTTGCATAT CACAGAGTCA TGGCCAG'rTG TGCOAAGAGA ACAAGAGAGT CTGGTCATGA GATTACTGGA GACACAGGGA AGGTTGGGCA TTCAAGGCAG AGCTTAACGA GTTTTCCACG TTGCCAAGCC GGTCCTGCGG TGAGGGGAAC SEQ ID NO:62: TATTAAAAAG TAAGTGTATT TGGTTCCGAG AAAGTAAGAG GGTTTGTGTA ACTGGGGCAA ACATGGCTAT TATGTTCATG TTTGCTGCGG CTCCCAGGGG CGA.AATGGCC TTTGATGATG TGTTAATCTG GACTCAAACG AGTAAATCTG CTTCGAGCCT
CGTTCTCTAT
GGTTGAATGG
ATGGGTACAT
GAACTGTTAG
CAAGTGAGAA
CTGTGAGCGG
CTCTTCAGGG
GCGAACGATC
GGGCACTGTG AAACGAGTGA TACATACCTC GTCCG-TTTCA G'CAGTGAGAT 7CACTGGGAA 540 ACCTGACCCC CCTGATACTG TGCTGGATGA ATCTCATTGG ACTTCGGTCG ;GTATTGCAG 600 AAAGACAAAG ATGGTCGGAT GGATG'rACTA CATCGCCAAC ACTTATGCAG A--AGAGGGAGC 660 CCATAAGTTC GGATCAGAGA 680 INFORMATION FOR SEQ ID NO:63: ()SEQUENCE CHARACTERISTICS: (A)*LENGTH: 492 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO:63-.
GCCCAATGGC 'C.TCCCCTACA
GAATTCGGCA
AGATTTCAGA
120
GAGCTGCTGG
180
TTAGAGGAAC
240
CTGGGGCGAA
300
ACGCCGCCAT
360
CCGAGGACCC
420
GATCGTGTGG
480
TGCTTTTTAC
492
CGAGGCTGGT
AGAGCTGCTA
CTTCATAGGA
TGTGCGAGAC
TGAGAGGTTA
TGATGGTTGT
CGAGAACGAG
GAAAACCAAG
GG
TCAAGTGTCA
AATCATGAGA
TCATGGCTCG
ACTGGTAATC
ACT CTCTGGA
GAGGGAGTTT
ATAATTAAAC
TCTATGAAGC
TCCATCAAGG
TCATGCGTTT
CGGTGAAGAC
AAGCAGATTT
TCCATGTTGC
CCGC'TGTCAA
GAGTTGTTTT
A.AGTACAGTA
GCTTGAGCGA
GAAGCATCTA
GGATGATGAA
CACTCCCATG
TGGGATGTTG
CACGTCGTCT
GAGAATCCCC
TOGTGTGACAG
GGATATACTG
TTGGATCTGC
GGAAGCTTTG
GATTTTGAAT
AATGTTTTGA
GCTGGGACTC
C
INFORMATION FOR SEQ ID NO:64: SEQUENCE CHARACTERISTICS: LENGTH: 524 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (xi)
GAATTCGGCA
TCCAAGCTTT
120
GGCTTCATAG
180
ACAGTTCGCA
240
AACGAAAGAC
300
GTAGATGGTG
360
CGCTTGAAGG
420 TGT TCAAGAT 480 SEQUENCE DESCRIPTION: CGAGCTTGTT CAAAGTCACA TCGTCTACCT CCCTGAAAAG CTGCTTATCT CATTCGTAGT ACCCAGATAA TGTGGAGAAG TCAACATCGT GAGAGCAGAT TAGATGGAGT ATTCCATACT AAACCCTAAT AGATCCTTGT CACCTTCAGT AAAGCGGGTG SEQ ID NO:64: TATCTTATTT TCTTTGTGAT. ATCTGCAATT ATGAGCGAGG TATGCGTGAC AGGAGGCACA CTTCTCCAGA A AGGTTACAG AGTTCGCACT TTTAGrTATC TGTGGGATCT GCCTGGTGCA TTGCTAGAGG AAGGCAGTTT TGATGCAGCA GCATCACCTG TCTTAGTCCC ATATA-ACGAG GTGAAGGGCA CTATCAATGT CCTCAGGTCC GTGCTTACAT CCTCCTGCTC P.TCAATACCG ATACGACT:AT AATAGCTTAG AGCGTTCCCT GCGG.AC'TGA GTC.A 524 INFORMATION FOR SEQ ID NO:65: SEQUENCE CHARACTERISTICS: LENGTH: 417 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (xi)
TCCTAATTGT
GTTGTTCATG
120
ATAACGAGA.G
180
TCTGGACCCC
240
AGGTTTGTGG
300
GAGAGACGCT
360
CTATGCAGAT
417 SEQUENCE DESCRIPTION: TCGATCCTCC CTTTTAAAGC CAGTGCTAGC AGGAGGAGCA GACAGAAGTA AGTTTGTGGA TCTGAGGACA ATGGCAAGCT ATTGTTCAGG GCCTTCTTCA GGCGAGGTTG AGTCTCTCAG GTCTTGGATT ATCACACCAT SEQ ID NO:65: CCTT1C0-,TGG CCTTCATTCC GCGTTGC:AAT TGGGGAAA?.T AATAGCAACC ATGCCn.,GT CGTTTGTGTC ATGGATGCGT AGGAGGCOTAT TCAGTGCATG ?.AAATTC-CAT GGGGATCGAT TACTGATGCG CTCA.AGGGCT
AGGTCACAGA
-C:CAAAATnrA:
TTCCTTCTGG
CCAGTTATGT
CCACGGTGCA
7 GCAGATCAT
GTTCTGG
INFORMATION FOR SEQ ID NO:66: SEQUENCE CHARACTERISTICS: LENGTH: 511 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (xi) ATGA CACG;A
CTTCATTCCA
120
TGAAGAAAA.T
180
GAGTTTCCTG
240
CGCAATTCCG
300 GGGGAAGCT'3 360
CAATGGTTGC
420
GGAGTATCCG
480 GAATT7AAGG 511 SEQUENCE DESCRIPTION: TTTGTGCCTC TCTCTGACCA TCATCCAGGA GCTTCTGTTA GGATACGGCG CTTCCA.ATTC GGGATTCATA TCGCAAGAAT GTAACGCCAG AAGAGGCAGG GAGATATGCC AAGCCGATCT TCCGGAGTCT TCCACGTCCC GTATGATTAG TTTAATAGAT TTTTCTTAGA ATTTGGATAC SEQ ID NO:66:
GAGCTTGAAG CTCTGTCTTC .TATCCM'TTTC CTCAAAATGG TCGGAAATTA ATGTGCCTTA GCTGCTCGGC CGGGGTTACT CTCACTTATG GAATCCGAAG CTTGGATTAT CGCAGCGTTT TGCGCCCTGT GATCATCTGG TGACGGGGTA TCCTGTATGA
T
TCTGATATCG
ATGCCTA.CCT
CCGGGGGCTG
CAGTCCGTTT
AAGCATTATC
TCGGCAACAT
ATGGATTACA
ATTAGTTTAT
INFORMATION FOR SEQ ID NO:67: SEQUENCE CHARACTERISTICS: LENGTH: 609 base pairs TYPE: nucleic acid (C) STRANDEDNESS: single TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO:67: CATTGATAGr TGATGGAAGA CCATCAGTAA AGCATGAAAA AGAAP.TTGTT C'CAAGGTGAA
GAAGTCAGTT
120 AATATGTAJA7 CT GAC CTT C2 240
AGTGGCGAAC
300
GAGGAACAGG
360
CATTCCTTC-
420
CCTTCAAGGC
480
TGGAGGCAAT
540
TTCAAACA-G
600
TTTGGCCAA
609
GCTCCAGCAG
CCATAAACTT
CATATTTATT
CAACTTGACA
CTACATTGGT
TGTCAGAGAG
CTCAGGTGCT
CAAGAAAGTT
ATATTTATCC
AACCTTTTTA
ATGCAGGAAG
CCAATTCTAA
GGGTTGGACA-
CGTCATATPA
ACCTCCGCTT
ATTATACTCC
GATGTAGTTA
GCAATTG TTT -1GCCTCGTGC
TATCTCTACT
TGGCCAACAG
CCAAAGCCAG
CTAATCCTGA
ATGGATC TTT
TCTCGGCTGT
TTGTATCCTT
CGAATTCGGC
CGCTGTCTAC
CAGCAAGATT
CCTTGCTCTT
GAAGGCTA-AG
GGAGGACCAT
CA-AGGGACCA
7 CGAACCCAT T TTGCCTTTG
ACGAGAATCA
CTGATTTTrTC
CTGATTATTG
GGTCATCCCA
CTTCTGGAAT
'GCAAGTCTTG
CAGCTGACGG
CAAGAAGGGT AGGGTATTTA AAGGGAGGGT INFORMATION FOR SEQ ID NO:68: SEQUENCE CHARACTERISTICS: LENGTH: 474 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear
(xi)
GCAAGATAGG
GCATAGCAAT7 120
GCTCTATGTA
180
TGCAGCAAAT'-
240 GGTGATGGGn' 300
TCATCCTACC
360
CGTCCAACAA:
420
CCTCTTAGTA
474 SEQUENCE DESCRIPTION: TTTTATTCTT CTGGAGTTGG TAAGCAGTTG CAGCCATGGC GCTGCAGACA TGGTGGAAAA TGTGAGATGG AGAAGCCTCT GCCACAGGTT ACATTGGCCG TATGCTCTTA TACGCCCGTT TTGAAGGATG CCGGGGTCCA AATACATTGA AGGACATGGG SEQ ID NO: 68: GTGAGG CTTG GAAATTTAAG GGTCTGTGGA ACTGAAGTAG CAACACGTCT ATTGTGACCA TCTAAATTCC TCTGCCACCT TTTTGTTGCC CAAGAAGCTG TGCTGCTTGT GACCTGGCCA TATCCTTTAT GGGTCTTTGA CCGTTGTTAT CTCTACCATT
TAAAAAGGGT
CTCATACTGT
CCTCTATGGC
CAAGAATACT
TTGCTGCTGG
AAGCACAGCG
GTGATCACAA
GGAG
INFORMATION FOR SEQ ID NO:69: SEQUENCE CHARACTERISTICS: (A) LENGTH: 474 base pairs TYPE: nucleic acid (C) STRANDEDNESS: single TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO:69:
GCAAGATAGG
GCATAGCAA.T
120
GCTCTATGTA
180
TGCAGCAAAT
240
GGTGATGGGA
300
TCATCCTACC
360
CGTCCAACPA
420
CCTCTTAGTA
474 T TTTATTCTT
TAAGCAGTTG
GCTGCAGACA
TGTGAGATGG
GCCACAGGTT
TATGCTCTTA
TTGA.AGGATG
CTGGAG77:zG GTGAnGGCTTG' CAGCCA7-C GGTCTGTGGA T GG 71G G kAA CAACACGTCT AGAAGCCTCT TCTAAATTCC ACATTGGC!CG TTTTGTTGCC T AC GC C r T TGCTGCTTGT CCGGGGTCCA -TATCCTTTAT AGGACATf-GG CCGTTGTTAT
GA-ATTTAAG
ACTGAAGTAG
ATTGTGACCA
TCTGCCACCT
C.AAGAAGCTG
G; CCTGGCCA GGGTCT TTGA
CTCTACCATT
TA~AAAAGGGT
CTCATACTGT
CC1TCTATGCC
CAAGAATACT
TTGCTGCTGG
AAGCACAGCG
GTGATCAkCAA
GGAG
INFORMATION FOR SEQ ID SEQUENCE CHARACTERISTICS: LENGTH: 608 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID
CATTGATAGT
GAAGTCAGTT
120
AATATGTAAT
180
CTGACCTTCA
240
AGTGGCGAAC
300
GAGGAACAGG
360
CATTCCTTCT
420
CCTTCAAGGC
480
TGGAGGCAAT
540
ATCAAACAGG
600
TTGGCCA.A
GCTCCAGCAG
CCATAAACTT
AATATTTATT
CAACTTGACA
CTACATTGGT
TGTCAGAGAG
CTCAGGTGCT
CAAGAAAGTT
ATATTTATCC
AACCTT-17-A GCAATTGTTT ATGCAGGAAG TGCCTCGTGC CCAATTC77AA TATCTCTACT GGGTTGGACA TGGCCAACAG CGTCIkAT'A CCAAAGCCAG ACCTCCGCTT CTA-ATCCTGA ATTATACTCC ATGGATCTT, GATGTAGITIA TCTCGGCTGT AGGGTAT-TA AAGGGAGGTT
TTTATCCTT
CG,%.TTCGGC
CGUCTGTCTAC
CAGCAAGATT
CCTTGCTCTT
GPAGGCTAAG
GGAGGACCAT
CAAGGGACCA
GGAACCCATC
TTTGCCTTTG
ACGAGAATCA
CTGATTTTTC
CTGATTATTG
GGTCATCCCA
CTTCTGGAAT
GCAAGTCTTG
CAGCTGACGG
AAGAAGGGTT
TGATGGAAGA CCATCAGTAA AGCATGAAAA AGAAATTGTT CCAAGGTGAA INFORMATION FOR SEQ ID NO:71: SEQUENCE CHARACTERISTICS: LENGTH: 1474 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO:71: GAATTCGGCA CGAGAAAACG TCC?-JACZCTT CCTTGCCAAC TGCAAGCAAT ACAGTACAAG AGCCAGACGA TCGAATCCTG TGAAGTCTT CTGAAGTGAT GGGAAGCTTG GAATCTGAAA 120
AAACTGTTAC
IS0
ATCTC'AGAAA
240
ACTCTGATT'.!
300
GGCATGAAGT~
360
GAGAGCATG'T
420
AGAGCATGGA
480
GCACACCTAC-
540 GAAtCCCGGA 600
TTTTCAGCCC
660
GTTTAGGAGG
720
CGGTTATCAG
780 CTTATCTTG7 840
TAATGGACAC
900
ATGGAAAGCT
960
TAATACTTGG
1020
AAACTCTAGA
1080
ACTACATCAA
1140
TGGATGTTGC
1200
TGCATGCAAG
1260
ATTTAGGAAC
1320
TTCAGATGTT
1380
TCCAATGTCT
1440
AAAAAAAAAA
1474 AGGATATGCA GCTCGGGACT
GAAAGGACCT
AGTTCAAATG
GGTGGGGATT
AGGGGTTG2GT
ACAATACTGC
TCAGGGCGGA
GAATCTTCCT
A.ATGAAGCAT
CGTGGGGCA'C
TTCGTCTGAT
TAGCAAGGAT
CATTCCAGTT
AGTGATGCTG
GAGAAGGAGC
TTTCTGTGCA
CACGGCCATG
TAGAAGCAAG
ATGAATAGAT
TCGATACTGG
TTTTTAACTT
TCTGCCAAAT
AAAAAAAAAA
GAGGATGTAA
CGTAATGAAA
GTAACAGAGA
TGCATTGTTG
AGCAAGAGGA
TTTGCAAGCA
CTGGAACAAG
TTCGCCATGA
ATGGGTGTCA
AAAAAGAAAG
ACTGAAAAGA
GCTCATCCTC
GGCGTTGT7C ATAGCT GGAA
GAGAAGAAGG
GAAAGGTTGG
TTGGATAATT
CTGGACTAGT
TTTTTGTTAC
GTATATGTAA.
TPATATATGT
AAAAAAAAAA
CCAGTGGCCA
TTGTAAAGGT
TGGACATGTC
TTGGCAGCGA
GGTCCTGTCG
TTTGGACCTA
GTATGGTG^T.
CGGCCCCTCT
CAGAGCC!CGG
AGATTGCCAA
AAGAAGCCAT
TGATGGAAGC
TGGAACCATA
CAGAGCCGTT
GTTTCATTGG
TATCATCGAT
AGAAGAACGA
AGTCT GCAAT
AGCTTAACAT
TTTAGTTTAG
AGATCAATTT
ATTCGTATTT
AAAA
CTrTGTCCCCT TACACTTACA CATTTACTGC GGAATCTCC TCATTACCCA ATGGTCCCTG GGTGAAGAAA TTCAAAGTGG CAGTTGCGGT AATTGCAATC CP.ATGATGTG PACCATGACG TGATCAGATG TTTGTGGTTC GTTATGTGCA GGGGTTACAG GAAGAAATGT GGGATTTTGG AGCCTTTGGA CTCCACGTGA GGAAGTCCTC GGCGCCGATG AGCAGAGAGC CTAGATTACA TCTTGC%^CTT CTGAAGACAA GCACTTCGTG ACTCCTCTCT CAGCATGGAG GAAACACAGG GATTGAGGTT GTGGGCCTGG TGTCCGTTAC AGATTTGTGG CAATCAATCA GATCAATGCC GAAAGGGAAA TTAAATTTTT CTTTTGTGAG GTTGAAACAA CTCGTGACAG TAAATAATA-A TTATATGAAA. AAAAAAAAAA
INFORMATION FOR SEQ ID NO:72: SEQUENCE CHARACTERISTICS: LENGTH: 1038 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO:72:
GAATTCGGCA CGAGAGAGGG TTATATATCT TGCCAAGCT!C TGGGCCACGG ATTTGGAATC 120 GGCGAATTT-C ACAAGTATT TCACCGATAA 180 CTTTGAGGGA AAAAAACCCT GCTACTTCAA 240 TGATTCTGAC CTCATTGTCG TCGACGACAT TCGTGTCCTC GGGGCACCAG AGTACTGCAA TTTCTGGTGG GATCCCGCAT TATCCAAGAC CACAGGCGTA ATGGTGATCG ATCTTGAAAA
ATGGCGGGCA
300
CCGTATCTAT
360
GCAAGTCGAT
420
CCGAGATCTT
480
GCTACGCCTG
540
TTTATCGATC
600
ATCGAATTAA
660 GtTTTGAATT 720
CAAATCCATC
780
CGCCTGTGAA
840 GAG CCAGCAG 900
AATTTTCGGC
960
CCTGAACCAA
1020
AIAAAAAAAAA
1038
GGGGAATTCA
GAGCTCGGAT
CATCGTTGGA
CACCCTGGAC
GA.ATGCCALAG
AACGTATTAC
kCCTGATTTG
TCAATTCTGG
ATGAGGGACC
GAATGATAT?
AGAGGCAAGC
GACTGTACAG
CA.ACTGTATA
AAAAAAPA
'CAAGAAAGAT
CAT TACCGCC
ATCAGCACGG
CTGTCAGTTT
CGGACTTGCC
CTAAATGGGT
ATAAAATGCC
TAACGAATAG
.%ATCGTTTGA
GTGGACTGAT
AATGCCGCTG
G.:TGTAAATT
TACCTTATA
CGAAATCTIGG
ATTT-1TACTG
TTTAGGCGGA
GTTGCATTGG
CTCTGGATAC
GAGAGAGCCT
AAATAGAACT
AAGAAAACAA
ATTTAG TAT?
CTATTTATAT
CAAGTCATGT
TTTGGAACAT
AATGTATCTG
A:GGACATAC
GTTTrGCTG
GATAATTTGC
AGTGGTAAGG
TTTATGGGCT
CT'CTCCTCGG
TTACGCCTAT
TAGCACAGCC
A-ATAAGGTTG
TTGTACTGCC
AGGGAAGGCG
TAATATCATT
CAACTCCATT
;AAGGAACG
TTGGTTAA
A AGGCCTTTG
GCAAACCTTG
CCTTATGATC
GGTGCTTTTT
GCATCTTtCA-
ACAGGCAGGA
-TCCATATAA
ArGCCATCCT mGAT.AAGTT 7TTGCATAAA
INFORMATION FOR SEQ ID NO:73: SEQUENCE CHARACTERISTICS: LENGTH: 372 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (xi)
CTPLGGGGTCT
CATGCAAGAG
120
GTCCTTTTAG
180
CGGGAGGAGG
240
CCATCGGAAT
300
TGGGCAATTC
360
AAATCTGCCA
372 SEQUENCE DESCRIPTION: TGGGGGGTTC CTGATGCCCA ATCTGTAGTC AGTAGTCTTG GGTAACATCA TTCCAACCAT AGCAAGATAT TCAGCATTGC TCAGCCGAGC TCGCCCCCTC TGGCTCGA.AA TCGCCAAAT!
GT
SEQ ID NO:73: ATTGTTGCTG TGCTTGGCAT TTGGATCTAT AGCTTTTAGA ATCCAGTTCC ACCACCGGCT TTTGGGCACC AGATGGATAG AGTCCAATCG TCGTGAAAAT ATGGGCTACA ACAGGATTAA .AACCCaAA
A-AAGAGTCAC
ACACCTTCAA
GCATTATTT-T
CCCTCAAAAT
AATTGCACAG
INFORMATION FOR SEQ ID NO:74: SEQUENCE CHARACTERISTICS: LENGTH: 545 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO:74:
AAAGAATTCG
GGGAGTTGGC
120
CCAACAAGCC
180
ATCCTGATAA
240
GAAACATAAC
300
CTGAAGTGGT
360 kT.GCCATTGT 420
TGACCATTCC
480
TTGTAATCTT
540
GTTTT
545
GCACGAGGC
GAGAGAAGCT
TTTGCTCCCT
TCTGGGTTAT
CGTAGGAACT
TTATGAGCAA
GGTTGTGGGT
CCTAGGCGGA
GATATCTGGA
AATCCGAGCC
GTTAGGAAAT
TTGGAGAAGA
CAGTGTGGTG
ACAATTCTGG
AA.TCCAGATG
GAGGCACCAT
GGGGACACGA
AGGCCACTTG
TAGCCA.CCA
C 'TT GGT AT T
ATGCTTC-CAA
GATGGACGAT
AAGC TAT CAA
CTAACTATGT
ACGCAGAAAC
TTAAGACGGT
TTATTGAACC
ACTTGGCAGC
GTTGAAAAAT
GGrTCTTGTT
GGAATGGCAA
ACTAGCTGTC
CAAAGGACA-A
GTTTGGAGAC
CTGTGGCTCC
TTATCTTCCA
.AGGAGCACA
'GGAAG TCAG
G:CAGGAA-CCC
:-'GATTA:AGTG
GCCCCTCTA.
GGGTTTTCAT
.ATCTTPUATT
7-GAAATf2CC
TGGTGGATC
INFORMATION FOR SEQ ID NO:75: SEQUENCE CHARACTERISTICS: LENGTH: 463 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (xi)
GCAGGTCGAC
CTAGTGATGA
120
TGTAGAGCCT
180
CGGTGGATTT
240
TCGTGCTTTT
300
ATATTTTTCT
360
CTGCACTGCG
420
ACATAGTACC
463 SEQUENCE DESCRIPTION: SEQ ID NO: ACTAGTGGAT CCAAAGAATT CGGCACGAGA AAAAACAAAT GCTTTACGTA TACCTGGCCT TTTATACATG GATCTGAGTT TTTGTTACTC TGTATCACTG GGACTTGCCA CAAGCTCTGG CGTAGCAAAA AAGTTGTGGA TGAkCTTTGGC ATATTCTCAG GGAGACCGTG TGAAGTACTG GGTAACTGTT. AACGAACCGT TACGATGTGG GGCTTCACGC ACCGGGCC^GC TGTTCGCCTG GGAAATTCAG CGACAGAGCC TTATATTGTA GCCCATAACA GCTGTTAAA.A ATATATAGCA TAAATACCCA GGG
CTTAGCTAGC
TTTATGCAGG
AGGACGAATA
,2.AGAATGCTT
GATCTTCTC
GATTTGGAAA
T GCTTCTTGC
INFORMATION FOR SEQ ID NO:76: SEQUENCE CHARACTERISTICS: LENGTH: 435 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO:76:
ACACTAGTGG ATCCAPAGAA TTCGGCACGA GGCTACCATC TTCCCTCATA ATATTGGGCT TGGAGCTACC AGGGATCCTG ATCTGGCTAG AP.GAATAGGG GCTGCTACGG C-TTGGkAGT 120 TCGAGCTACT GGtATTCA.AT ACACATTTGC TCCATGTGTT GCTGTTTGCA CGAGATCC'-CG 180 ATGGGGCCGCC TGCTATGA-A GCTACAGTGA GG;;TCCA.; A AT'rGTCAAGG ::CATGAC:GA 240 GATTATCGT- GGCCTGCAAG GGAATCCTCC TGC0TAATTCT ACAA-AAGGGG n-CCTTT-IT 300 AGCTGGACAG TCAAATGTTG CAGCTTCTGC TA.GCA"YTTT GTGGGTTATG '-:GGAAC-AC 360 CAAAGGTATC GATGAGAATA ATACTGTTAT CAACTP.7CAA GGGTTATTTC .ACATTCCAA 420 ATTACCCCCA ATTTT 435
INFORMATION FOR SEQ ID NO:77: SEQUENCE CHARACTERISTICS: LENGTH: 451 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO:77:
GAATTCGGCA CGAGCCTA.DA ATTCTATGGT G TACAAAGGA A CAGTCCCAAA TGGTTAAAGG T1 120 CACTGCTTAT TACATGTATG ?.TCCTAAACA AC 180 TGGACTGGA A TACAGGCTTT GCATATGCTC GC 240 ACTCCAATTS GCTTTACATT GTGCCTTGGG G1 300 AACACTATGG AAATCCAACT ATGATTCTCT Cl 360 GACACTTCCA GCAGGACTGC ATGATACCAT C~ 420 AAATTTGATT AATGCACGTG AATGACCGGG G 451 ~CAATAGAC TA CTAAACAA AA AATGGAGT GC ~CTATACAA .GG' L'GAAAATGG AA kGGGGT-AAC TA 'ACA.AGGC 7GCCCAA-Yr
TCTAGGCG
TGTAACAG
CTATTGGA
CCGTCACA
TGGACGAC
CTATAAAA
77A A AC C ATA mTTACCAGAC
CCAAGGGCGA
TACGTAAA AG
!LGGAAACGT
GTATTTGCA
INFORMATION FOR SEQ ID NO:78: SEQUENCE CHARACTERISTICS: LENGTH: 374 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO:78:
CTGCTCTGCA AGCAGTACTA TGCACAGCAA CTTGAGGAA-A CGCTCAAGCA TTGCTGAGGC 120 CTTCAGAAP.-N ATGGCAATGG CACAAGCATT 180 GCTCCGCCG!C AACATTCTGC CGGAGGATAA 240 AGCTCTTAGC CTGCTCTCAT CAAAAGCCTT 300 AGCTGCTArA AATTC.AACAA TTGTGTTGCA 360 GACAGGACAA-: TCTG 374
GGCCTGCTTA
CACCGTTTAT
CAGAGGCCGT
AAGCTTT'GGA
CATCTCTTTC
ATCTCGAAAC
ACTGAAAACA
CTAAATAGCG
GTCTTGCAAG
TCCGCTGCTT
TCTGTTGAAC
TTTTCTGCAA
GAGCGCTGAG
CAACATAGGG
CTGCCCGTTT
C-TCCTAGACG
*,GCATCGGCT
,-AGGTAAAAA
INFORMATION FOR SEQ ID NO:79: SEQUENCE CHARACTERISTICS: LENGTH: 457 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (xi)
GAAGAATGG;%
TtATGtTTtZG 120
AAGGCCCA;AG
180
ACGGTGATCK
240
ATATGGGCGT
300
AAGGAGAGAT
360
AGAATGGAAT
420
GATGA-ATA".G
457 SEQUENCE DESCRIPTION: SEQ ID NO:79: AGAGATTAAT GGTGATAACG CAGTXAGGAG GAGCTGCTTT GATAtCAACT TCTGCTTATC AGTGTGA-.GG AGCTGCCAAC CATCTGGGAC TC'ATTTTC!-AC GAACACCAGG CAAAATTCT AGCAGTGGAT CAGTATCATC GTTATAAGGC AGAGTAA GGCTACCTAC AGATTCTCGA TTTCATGGCC TCGTATATTT CAATGAGGAkA GGAGTAGCCT'ATTACAATAA CCTCATCAAT CCAAGCGTCT GTCAACTrTG TTTCACTGOG ATACTCCCCA GCGGATTTCT GAGGCCAACC ATTGTGA
CCTCCAGGTT
GAAGGTGGAA
GATGGAAGCA
CTGATGAAAG
CCAAAGGGAA
GAACTCCTCC
GTCTCTGGAG
INFORMATION FOR SEQ ID NO:80: SEQUENCE CHARACTERISTICS: LENGTH: 346 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (xi)
GGTGTGATGG
ATGCTGCACA
120
GGTTTCAA;.T
180
GGCAAAGCCC
240
AATATTGGGA
300
AATACATGGG
346 SEQUENCE DESCRIPTION: CAGGAATTCC ?GTCCTAAGG TTGTAGCTGC AGTAGCTTCA TTGGTGCAGG GTCATCTGCT CAAGCATTTG GGATACATTC TGTTGCAGTA GATCAATACC AATGGACGTC TATCGTTTICT SEQ ID NO: CCATTTTGCA TCTGTTTGCT TTCAGTCTAC CCAAGGCTAG GTAGAAGCAG CTTCCCAAGG TATCAGGCGG AAGGAGCTGC TCATGAGGGT TCCC'ACACTC CAGGTAAAAT CGCTGATGGG ACCGTTATAA GGAAGATGTG CAGCTTCTCA CTATCTCCTG GTCACG
INFORMATION FOR SEQ ID NO:81: SEQUENCE CHARACTERISTICS: LENGTH: 957 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO:81:
GAATT CGGCA CGAGAAAGCC CTAGAA77TT TTCAGCATGC TATCACAGCC CCAGCGACAA CTTTAAC---C AATAACTGTG GA.AGCGTACA AAAAGTTTGT CCTAGTTTCT CTCATTCAGA 120
CTGGTCAGG-
180
CTTGCACTCA
240
TGGAAGCTTG
300 TCAAGCAAG7T 360
TGACCCTCTC
420 AACTCCATG7 480
,ATGGGATGGT
540
ATATAGATAC
600
AGCAGATTTC-
660
ACATAGATGA.
720
TCATCTTC'A
780
TAGTACTGT-
840
APAATCTC;A
900
TGACATTTGA
957
TCCAGCATTT
GCCCTACATT
TGTCAACACG
TTTGTCATCT
TCTTCAAGAC
TCTGCAGATG
GAGCTTCAAT
TGCAATTCGG
GTGTGATCAT
TTTTGATACT
GACTCGCTTA
GCTGAGTCCA
ATTTCTCGAT
GCACCTCGAG
CCA-AAATACA
GATTTAGCAA
AACACAGAGA
CTT TATAAAC
ATAGCAAGTA
ATTCAAGATG,
GAGGATCCTG
AGAATCATGG
TCCTA.CCTGA
GTTCCCCAGA
TATTCATTAC
GAAAGGATCT
GTCTAGTC'TT
TGAACTAcAA
CANCCTGCTGT
ACAACTACAG
AGTTCAAGAA
GGAATATTCA
CGGTACAGTT
GTGAGATTT T
AACAGTACAA
CACTATCAAAk GTAAGGTe',G
AGTTCACAAA
TTTCTA"GTG
CTCGGTATTA
GATTTTGATT
AGTTGCATGT
T-T* ICCAAAGA
TAGTGGGAAA
TGATAGTAAT
GAGATTGACA
GGAGACTGCT
TGCAACCATA
AACATGTCAG
GA.AGCTCACC
GAGAGAGCGT
TA.TGTAACAA
AATTGATAGT
MC.CTTGACA
ATGAATGCGA
TA-APAAAAAA
AATTTGA,'-ZT
A T TTC TGT PT
TTGGGGTTAG
CAGACATA:'TC
AAGCAGGCTG
AATCAGAAAG
ATGACTGAAT
ACAGTAGATG
TCAAGATTTG
ATGATGTA.AA
CTGTTAACAA
TGCCATCAA
CTTTTAGTTG
AAAAAAA
INFORMATION FOR SEQ ID NO:82: SEQUENCE CHARACTERISTICS: LENGTH: 489 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO:82:
GCAGGTCGAC0 ATCCTCCA-'t' 120
GGACTAGAGT
180
GCATCTTCAG
240
GCCAAGGTGST
300
CTGTCGAG'AG
360
CAGACAATCC
420
TGAAAATGGC
480
TACCAGTAA
489
ACTAGTGGAT
CCCATTCAAT
AAAAGTCCTT
CACAGACAGC
TGGAAACTAC
AAACACTGTG
AGGGGTTTGG
CCAAAGAATT
TACACTGGTA
CCCTTTAACA
CACCCTGTCC
AATGAATCAA.
GGAGTTCCCA
TTCATGCACT
CGGCACGAGA
CTCCACCCAA
CAACTGTTCA
ATCTCCATGG
CAGATGCACC
AAGGAGGTTG
GTCATTTGGA
TAAGACTAAT
TA-ATACACAG
ATTGATTCTT
TTTCAATTTC
AAATTTTAAC
GGCTGCTATA
GGTTCACACA
CGATTTTCCA
T TTCCAGACA
GCTGTGAATG
CAAGACACCA
TTTGTGGTGG
CTCATTGACC
AGATTTCGTG
TCGTGGGGAC
CCCGGGTGGG
GTGGGTAGTA AAGAACGGAA AAGGGCCCAT
INFORMATION FOR SEQ ID NO:83: SEQUENCE CHARACTERISTICS: LENGTH: 471 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (xi)
GAATTCGGCA
ACAGACATAC
120
TATTCCAACG
180
GAGGGAAGCT
240
AACAGTGCTA
300
TTCGTTCCT'C
360
TGTCCGGGGC
420
ATCATTTGTC
471 SEQUENCE DESCRIPTION: CGAGAAAACC TTTTCAGACG TTCTCACTGC CAATCAGGCT GGCAAGGAGT TCC CTTCGAT CTAAGACTTC AACTCCAGTC CTAGCTTCGC, TAATGGTC-T AGAGTGTGGA GGAGAATCTG AGTCTTGTGG AGGTCCAACG CCGCAACCAC TTCTTCCAAT SEQ ID NO:83: AATGTTrCTGA TGCTCGGCCC ACAGGTAGAT ACTACATGGC AACACCACTA CCACTGCCAT ATGCCTAATC TTCCATTCTA AGAAGCTTGG GCTCACACGA TTCTACACCA TCGGTTTGGG GATCAAGATT TGCAGCA.PGT CCTTCAAGCT CAGCATTTTG
CGGCCAGACA
TGCTCGAGCA
TTTAGAATAC
TAtACGACACC
CCACCCAGTC
GTTGATCAAA
ATGAATACAT
G
INFORMATION FOR SEQ ID NO:84: SEQUENCE CHARACTERISTICS: LENGTH: 338 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (xi)
GTTCGGCACT
ATCTCTTTCA
120
AATCAGTGCG
180
TTGCTGTA.AC
240
CAGGCTATGC
300
CATGTACGTG
338 SEQUENCE DESCRIPTION: GAGAGATCCA TTTCTTTCAA GGAATATATC GTGCTTGCAG TCTATCTTCT GCTCTCCTTG AAACGCAGAT GTCCACAATT AATA.AGCGTA TAATCGCCAC ATGGAGACGT TGTTAATTAT SEQ ID NO:84: TGTTGAGACA GTGAGTAGTA TTAGTTTGAT GATCTTTAGT TTCTGCAACA ATGTCGTTGC TTTTGCTACT AGCATTTGTT GCTTACTTAG ATACCTTCAT TATTAGAAAG AGACAGTTAC CGTCAATGGC AGCTACCAGG CCCAACTATT
CAAAGCTT
INFORMATION FOR SEQ ID NO:85: SEQUENCE CHARACTERISTICS: LENGTH: 1229 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (xi)
AGAGAAATA.P
TGCAGGAGA
120
GTCTTTTCTC
180
CAAAGGAATT.
240
TAAGCGAGGA
300 SEQUENCE DESCRI PTION: TTATATTTGT AAATTTAAGT AAAACAAGCA TGCTGTCTAC GTGCCGAATT CGGCACGAGA TTGTGGGTCA TTTGCAGGTG GTACAAGGCT GCCATTGACA SEQ ID NO: CTACGTTTAT TAAAAAACTA CAACCCTAAA TGAAGCTTAC AAATCAAATC CCTGCGATAT AGATCTTGGT TCGAGTCTCT CAGCTCTCTC AAGACACCAT GGTGAAGGCT TATCCCACCG AATGCAAGAG GAAGCTCCGA GCTCTCATTG *9.
CAGAGAAGA
360
ATGTCAAGAC
420
.ACGGTGCTAA
480
TCCCCATJAAT
540
CCGGGGGACC
600 AAGGCCGCC T 660
TGGGGTTGAA
720
ACAAGGAGAG
780
CTTACTTCAC
840
AGGCACTGCT
900
ACGCTTTCTT
960
ATGCGTAGAT
1020
TCATCTALATC
1080
CTTGAACCTA
1140
ATTTTGGAAT
1200
TAAAAAAAAA
1229
CTGTGCGCCG
CAAGACCGGA
CAGTGGTCTG
CACCTATGCT
TGACATTCCG
TCCTGATGCT
TGATAAGGAA:
ATCTGGTTTT
AGAGCTTGTG
TGCTGAT CCT
TGCTGACTAT
TCATACCTTC
TTTTCGATTA
CATGTTTTTG
CTGGTTGTGT
AATAAAATAA
*.TCATGGTTrC
GGGCCCTTCC
GACATCGCAG
GACCTTTATC
TTCCATCCTG
,ACAAAAGGAC
:4.TGTGGCCT
GAAGGACCAT
A~CTGGAGAGA
AGTTTTGOAG
GCGGAAGCTC
TGCAGAGACA
TATAGTCA.CA.
AAAAGTAT'-CG
TCTATCA.GC
AAAAAAAA
GAATCGCZG
GGACGATGAG
TTAGGCTCC C AGTTGGCTGG3
GAAGAC-AAGA
CTGATCA7 rCT-
TGTCTGGT.GC
GGACCTC'TAA
AGGAAGGCC"-
'.ACCTGAAGC T
ATTCCCT-GCT
TAGAAGTTGG
ATGTTCTTTA
GCATATTTTA
GC.ZCAGCGCT GGGACTTACG ATATGGGGCC GAGCTTGCCC GGAGCCAATC AAGGAACAGT TGTGGTGGCT GQTTGAAGTGA CAAGCCTGAG CCTCCAGAAG GAGGGATGTT TTTGGTCACA
CCACACCTTG*GGGAGATGCC
CCC:CC-T T'TC TT TGACAACT GCT TCAGTTG CCATCTGATA GACTAT GCA. CAGGACGAAG TTCf-TGA.CTT: GGGTTTGCTG AGATAGCTTC toTTTTGTATT TGTT'"'GCGC CATAGTGATA AAATGPACAT TGAATACAAC ATCGAATGCT TCGTTCCTGT a
INFORMATION FOR SEQ ID NO:86: SEQUENCE CHARACTERISTICS: LENGTH: 1410 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO:86:
GAAGATGGGG CTGTGGGTGG TGCTGGCTTT-GGCGCTCAGT GCGCACTATT GCAGTCTCAG
GCTTACAATG
120
ATGAATTACT
180
CTGTTGTACA
240
TGTGCTGTGG
300
GAAAAGGACA
360
GA.AGCCGTGG
420
GCCAGAGATG
480
GATGGACGGA
540
ATCTCCACTG
600
CTGCTGCGGGG
660
TGGTAAGTTC
ATGGGGACTC
AAAGACACAA
AGTCATGTGA
CTGACAGGAG
AGAGGGAGTG
GCGTTGTATC
AGAGCAGAGC
TTCTGTCTCG
CTCACAGCGT
AAGCAATG'CT
TTGCCCTCAG
GAACACTQ'CA
TGCATCGCTT
CTTCGGCCTC
CCCCGGGGTC
GTTGGGAGGA
AGATGTGGTG
CTTCAAAGCC
GGGGAGGACT
ACTGGGAGTT
GCTGAAGAGA
TTCTCATGGC
CTGTTGGACT
CGCAACTTTA
GTTTCCTGTG
CCATACATTC
GAGAATTACC
ATGGGAATCG
CACTGCGTGA
ACAGTGAGAA
TCATTGCTGA
TTAGAAATAT
CAACAAGGAA
GGTATTTGGA
CAGATATACT
CC:CTGA-AGAC
TGCCCGATCA
ACACCCGTGG
AGCTGGTGCA
TGGATTGGTG
ACAAGTACGC
TTTCCATGAC
CAGCATATCA
TACCATCAAG
CGTTCTCTCT
GGGAAGAAGA
CAATGAGAGC
GGTTGTTGCA
CAGGCTGTAC
C!CGGAAGTAG
720
GAC'OCGATLCC
'78 0
AAGCTGGACA
840
CAGCAACTGT
900
GAATACTTCT
960
ACCGGCGCTC
1020
-AGCAAGC'GTT
1080
TGGTG'GGCAT
11.40
GGT.GTGACTA
1200 CTTATTCCC7 1260
ATTCTAGTAT
1320
.AAATATGACA
1380
GACAACTACA
1410
ATCCGACACT
CCAACCCGAA
ACAACTACTA
ATGCAGATTC
TCAAATACTT
GAGGAGAAAT
GAGCGATAGC
TTCATATATa
TGCCCTGCGA
TATGTAAGCA
AATTTTGTCA
.ACTACGTATC
TATATTCTT
GGCAGTGCAG
CGTG3ACCTG
GAGGACCAGG
CTCCCGGGCG
CCcGTC3GCG%^ AT TGCGCr
ATCACATCGA
GC!-TTTTA
TCA;GTTAAAG
TTC-TTGGTC
A.2A.AAArVAAA CACTG:,C 't'-.nTG.!AGC-- TATGTGCOGA ACGACC3GGGG ATGAAC;ACA AC-GGC-.CT CCGTATGTOA AG-A.GATGGC CTCACCaTCC T C ICTG?.GAA TGCTCGCTCA A. X AT T .3GTGGTGGGA. C.TGA-1AGCGT 'AG CG-,T Tr'r T77AGATATC TGALACCMCA-A CCGAACCGTG -TATAAGCPA. ;LkACAA.T TTGCTCATCT GA;TA.-TAACT, ATCTGATAAT 7' L;CCGG,-AAC -AGTGCCC3
.:-.CGCCTA.TG
;X;.TAGTG!2C- AAAGCC aG
.:.-TGCCP'CAG
A.:ATGG7-? nACAGTZGG CCJGTC7'-TT- ':AAPACn2n-A T LA.PAra. T
INFORMATION FOR SEQ ID NO:87: SEQUENCE CHARACTERISTICS: LENGTH: 687 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO:87:
GTAGTTTCG'
ATGACGAAGT
120
ATAAPTTTCTG
180
CTTTCGTGGA
240
CTTGAGCCGG
300
CATGACTGTT
360
CCCAGTGAGC
420
GACGAAATTA
480
CTGGCTTTGG
540
GGCCGCAGAG
600
CCAACTTTA
660
GA!AATGGTTG
687 TTTACAACAA T!VT'AGGTTT
ACGTGATCGT
TCAATGGAT7
CATTTTATAA
CGTTGGACGA
TTGTGC.AGGG
AACAGGCTCA
AAACCGCTGT
CTGCTCGTGA
ATAGCCTAAA
ATTTGACACA
CTCTTCAGGT
TAGCTCCATT
AG77GTCCAT
GGACAGTTGC
AGATATCACT
TTGCGATGGG
GCCAAACTTA
AGAAGCTAGC
CTC'CGTCCGC
GTTTGCCAGT
GCTGATGAAC
GGCACAC
TGAATCTCAG P-ATAGTTGCG GTGTGTTTL'CT TrGTA7 71TGT GAAGATGATC TG-C;,-AGCC CCCGACTT'GG AGGCCATAGT CAGGCC.GCAG GCTTG-T GAG TCCGTGTTG1C TGACPIGGAAC ACACTAAGAG CCCGGGCCTT TGCAGTGGGG TTGTAACTTG TCAGGAGGCC CAAATTTCC CAATCCGTAC TCTrCZCCA.A ATTTTTGGCT CCAAAGGATT
AGGAAGCG
T-CTGCGTGC
4 TGCAT3GG-
GAAATCC-GTA
ArTTCATTTC
AAAAGAA.C
GCAGCTGATC
7GCAGACATT
AGTACCACTT
T ATACCAACT
CAGTTTGGCC
INFORMATION FOR SEQ ID NO:88: SEQUENCE CHARACTERISTICS: LENGTH: 688 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO:88:
GTAGTTTCG- TTTACA-ACA-A TCTACAGGTT TTGAAT CICA GAATAGTTGC '-ThAAGGAAGC
GATGACGAAG
12 0
CATPA.TTTC--
'180
GCTTTCGTGG
240 300
CATGACTGT:-
360
CCCC%'AGTG:
420 -1CGACGAA' 480 TTCTGGC -7 540
TGGCCGCAGA
600
TCCAACTTTA
660
CGAAATGGTT-
688
TACGTGATCG
GTCAATGGAT
ACATTTTATA
TTGTGCAlGGG
GCAACAGGCT
TAAAACCOCT
GGCTGCOrT
GATAGCCTA
AATTTGACAC
GCTCTTCAGG
T'TAGCT COAT .TzAG'TTCCA
AGGACAGTTG
,AAATATCAC
TT'3CGATGGG C AG CCAAAC T
GTAGAAGCTA
GACTCCGTCG
AGTTTGCCAG
AG CTGATGAA
TGGCACAC
T GTA T G TT TGA AGATGAT CO OGACTT-7
TAACACT.?AG
CTOAGGAGG C TOC'An'T CCG -,A
CA.TTTTTGGC
TTT-GTATTTG
CTGT.CAAAQC
GAGGCCATAG
GGTrTGCT GAG
TGACAGGAAC
CGTTGTAACT
CC-AAAATTTC
GT TCTCGCCA TO 'AAAGGAT 7- CT'-'CGTG C7TGCATGG
~TCA-TC
7-;-rLAGA-AAC 77-GCAGCTGA 7 GC aG.C CA G TAC CzC T A .A TA CCA-.AC 7*:AGTTTGGC

Claims (41)

1. An isolated DNA sequence comprising a nucleotide sequence selected from the group consisting of: sequences recited in SEQ ID NO: 6 and 53-55; nucleotides 15-1026 of SEQ ID NO: 6; nucleotides 92-562 of SEQ ID NO: 53; nucleotides 1-1072 of SEQ ID NO: 54; sequences having at least 90% identity to a sequence recited in above; complements of a sequence recited in above; reverse complements of a sequence recited in above; and reverse sequences of a sequence recited in above.
2. A DNA construct comprising a DNA sequence according to claim 1.
3. A transgenic cell comprising a DNA construct according to claim 2.
4. A DNA construct comprising, in the 5'-3' direction: a gene promoter sequence, an open reading frame coding for at least a functional portion of an enzyme encoded by a nucleotide sequence recited in claim 1; and a gene termination sequence.
5. The DNA construct of claim 4 wherein the open reading frame is in a sense orientation.
6. The DNA construct of claim 4 wherein the open reading frame is in an antisense orientation.
7. The DNA construct of claim 4, wherein the gene promoter sequence and gene termination sequences are functional in a plant host.
8. The DNA construct of claim 4, wherein the gene promoter sequence provides for transcription in xylem.
9. The DNA construct of claim 4 further comprising a marker for identification of transformed cells.
10. A DNA construct comprising, in the 5'-3' direction: a gene promoter sequence, a non-coding region of a gene coding for an enzyme encoded by a nucleotide sequence recited in claim 1, and a gene termination sequence.
11. The DNA construct of claim 10 wherein the non-coding region is in a sense orientation.
12. The DNA construct of claim 10 wherein the non-coding region is in an antisense orientation.
13. The DNA construct of claim 10, wherein the gene promoter sequence and gene termination sequences are functional in a plant host.
14. The DNA construct of claim 10, wherein the gene promoter sequence provides for transcription in xylem.
15. A transgenic plant cell comprising a DNA construct, the DNA construct comprising, in the 5'-3' direction: a gene promoter sequence; an open reading frame coding for at least a functional portion of an enzyme encoded by a nucleotide sequence recited in claim 1; and a gene termination sequence.
16. The transgenic plant cell of claim 15 wherein the open reading frame is in a sense orientation.
17. The transgenic plant cell of claim 15 wherein the open reading frame is in an antisense orientation.
18. The transgenic plant cell of claim 15 wherein the DNA construct further comprises a marker for identification of transformed cells.
19. A plant comprising a transgenic plant cell according to claim 15, or fruit or seeds thereof.
20. The plant of claim 19 wherein the plant is a woody plant.
21. The plant of claim 20 wherein the plant is selected from the group consisting of eucalyptus and pine species.
22. A transgenic plant cell comprising a DNA construct, the DNA construct comprising, in the 5'-3' direction: a gene promoter sequence; a non-coding region of a gene coding for an enzyme encoded by a nucleotide sequence recited in claim 1; and a gene termination sequence.
23. The transgenic plant cell of claim 22 wherein the non-coding region is in a sense orientation.
24. The transgenic plant cell of claim 22 wherein the non-coding region is in an antisense orientation.
25. A plant comprising a transgenic plant cell according to claim 22, or fruit or seeds thereof.
26. The plant of claim 25 wherein the plant is a woody plant.
27. The plant of claim 26, wherein the plant is selected from the group consisting of eucalyptus and pine species.
28. A method for modulating the lignin content of a plant comprising stably incorporating into the genome of the plant a DNA construct comprising, in the 5'-3' direction: a gene promoter sequence; an open reading frame coding for at least a functional portion of an enzyme encoded by a nucleotide sequence recited in claim 1; and a gene termination sequence.
29. The method of claim 28 wherein the plant is selected from the group consisting of eucalyptus and pine species.
30. The method of claim 28 wherein the open reading frame is in a sense orientation.
31. The method of claim 28 wherein the open reading frame is in an antisense orientation.
32. A method for modulating the lignin content of a plant comprising stably incorporating into the genome of the plant a DNA construct comprising, in the 5'-3' direction: a gene promoter sequence; a non-coding region of a gene coding for an enzyme encoded by a nucleotide sequence recited in claim 1; and a gene termination sequence.
33. The method of claim 32 wherein the non-coding region is in a sense orientation.
34. The method of claim 32 wherein the non-coding region is in an antisense orientation.
35. The method of claim 32 wherein the plant is a woody plant.
36. The method of claim 35, wherein the plant is selected from the group consisting of eucalyptus and pine species.
37. A method for producing a plant having altered lignin structure comprising: transforming a plant cell with a DNA construct comprising, in the 5'-3' direction, a gene promoter sequence, an open reading frame coding for at least a functional portion of an enzyme encoded by a nucleotide sequence recited in claim 1, and a gene termination sequence to provide a transgenic cell; cultivating the transgenic cell under conditions conducive to regeneration and mature plant growth.
38. The method of claim 37 wherein the open reading frame is in a sense orientation.
39. The method of claim 37 wherein the open reading frame is in an antisense orientation.
40. The method of claim 37 wherein the plant is a woody plant.
41. The method of claim 40 wherein the plant is selected from the group consisting of eucalyptus and pine species.
42. A method for producing a plant having altered lignin structure comprising: transforming a plant cell with a DNA construct comprising, in the 5'-3' direction, a gene promoter sequence, a non-coding region of a gene coding for an enzyme encoded by a nucleotide sequence recited in claim 1, and a gene termination sequence to provide a transgenic cell; cultivating the transgenic cell under conditions conducive to regeneration and mature plant growth.
43. The method of claim 42 wherein the non-coding region is in a sense orientation.
44. The method of claim 42 wherein the non-coding region is in an antisense orientation.
45. The method of claim 42 wherein the plant is a woody plant.
46. The method of claim 45 wherein the plant is selected from the group consisting of eucalyptus and pine species.
47. A method of modifying the activity of an enzyme in a plant comprising stably incorporating into the genome of the plant a DNA construct including: a gene promoter sequence; an open reading frame coding for at least a functional portion of an enzyme encoded by a nucleotide sequence recited in claim 1; and a gene termination sequence.
48. The method of claim 47 wherein the open reading frame is in a sense orientation.
49. The method of claim 47 wherein the open reading frame is in an antisense orientation.
50. A method of modifying the activity of an enzyme in a plant comprising stably incorporating into the genome of the plant a DNA construct including: a gene promoter sequence; a non-coding region of a gene coding for an enzyme encoded by a nucleotide sequence recited in claim 1; and a gene termination sequence.
51. The method of claim 50 wherein the non-coding region is in a sense orientation.
52. The method of claim 50 wherein the non-coding region is in an antisense orientation.
53. The method of claim 50 wherein the plant is a woody plant.
54. The method of claim 53 wherein the plant is selected from the group consisting of eucalyptus and pine species.
Dated this 16th day of October 2002
Genesis Research and Development Corporation Limited and Fletcher Challenge Forests Limited
Patent Attorneys for the Applicants
PETER MAXWELL ASSOCIATE
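Claim 1 refers to nucleotide sub-ranges, complements, reverse complements, reverse sequences and a 90% identity threshold. The following is a minimal Python sketch of those operations and forms no part of the claims. All function names are hypothetical. The identity function assumes two already-aligned sequences of equal length compared position by position; the procedure actually used in the specification to determine identity is not stated in this section and may differ.

```python
# Illustrative helpers only; names are hypothetical and not taken from the specification.

# Translation table mapping each base to its Watson-Crick partner.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def complement(seq: str) -> str:
    """Return the base-by-base complement, read in the same direction."""
    return seq.translate(_COMPLEMENT)


def reverse(seq: str) -> str:
    """Return the sequence read in the opposite direction."""
    return seq[::-1]


def reverse_complement(seq: str) -> str:
    """Return the complement read in the opposite direction."""
    return complement(seq)[::-1]


def nucleotides(seq: str, start: int, end: int) -> str:
    """Return nucleotides start..end using 1-based, inclusive positions."""
    if not 1 <= start <= end <= len(seq):
        raise ValueError("positions must satisfy 1 <= start <= end <= len(seq)")
    return seq[start - 1:end]


def percent_identity(a: str, b: str) -> float:
    """Percentage of matching positions between two equal-length sequences.

    Assumption: the sequences are already aligned without gaps.
    """
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(x == y for x, y in zip(a.upper(), b.upper()))
    return 100.0 * matches / len(a)


if __name__ == "__main__":
    example = "GAATTCGGCA"  # short illustrative fragment
    print(complement(example))
    print(reverse(example))
    print(reverse_complement(example))
    print(nucleotides(example, 2, 5))
    print(percent_identity(example, "GAATTCGGCT") >= 90.0)
```

Under these assumptions, a candidate sequence would meet the 90% threshold when `percent_identity` returns at least 90.0 against a reference sequence of the same length.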
AU57975/01A 1996-09-11 2001-08-10 Materials and methods for the modification of plant lignin content Ceased AU756359B2 (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
AU57975/01A AU756359B2 (en) 1996-09-11 2001-08-10 Materials and methods for the modification of plant lignin content
AU2003203517A AU2003203517B2 (en) 1996-09-11 2003-04-08 Materials and methods for the modification of plant lignin content

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
US08/713000 1996-09-11
AU44036/97A AU733388B2 (en) 1996-09-11 1997-09-10 Materials and methods for the modification of plant lignin content
AU57975/01A AU756359B2 (en) 1996-09-11 2001-08-10 Materials and methods for the modification of plant lignin content

Related Parent Applications (1)

Application Number Title Priority Date Filing Date
AU44036/97A Division AU733388B2 (en) 1996-09-11 1997-09-10 Materials and methods for the modification of plant lignin content

Related Child Applications (1)

Application Number Title Priority Date Filing Date
AU2003203517A Division AU2003203517B2 (en) 1996-09-11 2003-04-08 Materials and methods for the modification of plant lignin content

Publications (2)

Publication Number Publication Date
AU5797501A AU5797501A (en) 2001-10-04
AU756359B2 true AU756359B2 (en) 2003-01-09

Family

ID=3731291

Family Applications (1)

Application Number Title Priority Date Filing Date
AU57975/01A Ceased AU756359B2 (en) 1996-09-11 2001-08-10 Materials and methods for the modification of plant lignin content

Country Status (1)

Country Link
AU (1) AU756359B2 (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7402428B2 (en) 2004-09-22 2008-07-22 Arborgen, Llc Modification of plant lignin content
US7456338B2 (en) 2004-09-22 2008-11-25 Arborgen Llc Modification of plant lignin content
US7799906B1 (en) 2004-09-22 2010-09-21 Arborgen, Llc Compositions and methods for modulating lignin of a plant
US7910326B2 (en) 1996-09-11 2011-03-22 Arborgen, Inc. Materials and methods for the modification of plant lignin content

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7910326B2 (en) 1996-09-11 2011-03-22 Arborgen, Inc. Materials and methods for the modification of plant lignin content
US7402428B2 (en) 2004-09-22 2008-07-22 Arborgen, Llc Modification of plant lignin content
US7456338B2 (en) 2004-09-22 2008-11-25 Arborgen Llc Modification of plant lignin content
US7799906B1 (en) 2004-09-22 2010-09-21 Arborgen, Llc Compositions and methods for modulating lignin of a plant
US7807880B2 (en) 2004-09-22 2010-10-05 Arborgen Llc Modification of plant lignin content
US8030545B2 (en) 2004-09-22 2011-10-04 Arborgen Llc Modification of plant lignin content

Also Published As

Publication number Publication date
AU5797501A (en) 2001-10-04

Similar Documents

Publication Publication Date Title
US5952486A (en) Materials and methods for the modification of plant lignin content
AU777237B2 (en) Materials and methods for the modification of plant lignin content
Nakamura et al. Starch debranching enzyme (R-enzyme or pullulanase) from developing rice endosperm: purification, cDNA and chromosomal localization of the gene
Van Der Meer et al. Cloning of the fructan biosynthesis pathway of Jerusalem artichoke
Hibino et al. Increase of cinnamaldehyde groups in lignin of transgenic tobacco plants carrying an antisense gene for cinnamyl alcohol dehydrogenase
AU724942B2 (en) Transgenic potatoes having reduced levels of alpha glucan L- or H-type tuber phosphorylase activity with reduced cold-sweetening
EP1009751A2 (en) Improvements in or relating to stability of plant starches
US20030131373A1 (en) Materials and methods for the modification of plant lignin content
US6204434B1 (en) Materials and methods for the modification of plant lignin content
AU756359B2 (en) Materials and methods for the modification of plant lignin content
EP1321525A2 (en) Fruit ripening-related genes
US6653528B1 (en) Pinus radiata nucleic acids encoding O-methyl transferase and methods for the modification of plant lignin content therewith
US6410718B1 (en) Materials and methods for the modification of plant lignin content
AU733388B2 (en) Materials and methods for the modification of plant lignin content
CA2211665A1 (en) Coniferin beta-glucosidase cdna for modifying lignin content in plants
AU2003203517B2 (en) Materials and methods for the modification of plant lignin content
BRPI0813206A2 (en) modification of lignin biosynthesis via sense suppression.
MXPA99002262A (en) Materials and methods for the modification of plant lignin content
US7317136B1 (en) Methods for modifying plant cell walls and modified plants produced thereby
GB2294266A (en) Reduction of the level of glucose in an organism
Oliver et al. Inhibition of tobacco NADH-hydroxypyruvate reductase by expression of a heterologous antisense RNA derived from a cucumber cDNA: implications for the mechanism of action of antisense RNAs
MXPA01003475A (en) Materials and methods for the modification of plant lignin content
AU2007201050B2 (en) Materials and methods for the modification of plant lignin content
US7498492B2 (en) Modification of sucrose synthase gene expression in plant tissue and uses therefor
AU2008203173A1 (en) Materials and methods for the modification of plant lignin content

Legal Events

Date Code Title Description
PC1 Assignment before grant (sect. 113)

Owner name: GENESIS RESEARCH AND DEVELOPMENT CORPORATION LIMIT

Free format text: THE FORMER OWNER WAS: GENESIS RESEARCH AND DEVELOPMENT CORPORATION LIMITED, FLETCHER CHALLENGE FORESTS LIMITED

FGA Letters patent sealed or granted (standard patent)