CA2270905A1 - Recombinant pinoresinol/lariciresinol reductase, recombinant dirigent protein, and methods of use - Google Patents

Publication number
CA2270905A1
CA2270905A1 CA002270905A CA2270905A CA2270905A1 CA 2270905 A1 CA2270905 A1 CA 2270905A1 CA 002270905 A CA002270905 A CA 002270905A CA 2270905 A CA2270905 A CA 2270905A CA 2270905 A1 CA2270905 A1 CA 2270905A1
Authority
CA
Canada
Prior art keywords
leu
protein
seq
gly
val
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Abandoned
Application number
CA002270905A
Other languages
French (fr)
Inventor
Norman G. Lewis
Laurence B. Davin
Albena T. Dinkova-Kostova
Masayuki Fujita
David R. Gang
Simo Sarkanen
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
University of Minnesota
Washington State University Research Foundation
Original Assignee
Individual
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date

Classifications

    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N9/00 Enzymes; Proenzymes; Compositions thereof; Processes for preparing, activating, inhibiting, separating or purifying enzymes
    • C12N9/0004 Oxidoreductases (1.)
    • C CHEMISTRY; METALLURGY
    • C07 ORGANIC CHEMISTRY
    • C07K PEPTIDES
    • C07K14/00 Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof
    • C07K14/415 Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from plants
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12P FERMENTATION OR ENZYME-USING PROCESSES TO SYNTHESISE A DESIRED CHEMICAL COMPOUND OR COMPOSITION OR TO SEPARATE OPTICAL ISOMERS FROM A RACEMIC MIXTURE
    • C12P7/00 Preparation of oxygen-containing organic compounds
    • C12P7/02 Preparation of oxygen-containing organic compounds containing a hydroxy group
    • C12P7/22 Preparation of oxygen-containing organic compounds containing a hydroxy group aromatic

Landscapes

  • Chemical & Material Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Organic Chemistry (AREA)
  • Health & Medical Sciences (AREA)
  • Engineering & Computer Science (AREA)
  • Genetics & Genomics (AREA)
  • Wood Science & Technology (AREA)
  • Zoology (AREA)
  • Biochemistry (AREA)
  • General Health & Medical Sciences (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Microbiology (AREA)
  • Medicinal Chemistry (AREA)
  • General Engineering & Computer Science (AREA)
  • Molecular Biology (AREA)
  • Biotechnology (AREA)
  • Biomedical Technology (AREA)
  • Chemical Kinetics & Catalysis (AREA)
  • General Chemical & Material Sciences (AREA)
  • Botany (AREA)
  • Gastroenterology & Hepatology (AREA)
  • Biophysics (AREA)
  • Proteomics, Peptides & Aminoacids (AREA)
  • Preparation Of Compounds By Using Micro-Organisms (AREA)
  • Enzymes And Modification Thereof (AREA)
  • Micro-Organisms Or Cultivation Processes Thereof (AREA)
  • Peptides Or Proteins (AREA)

Abstract

Dirigent proteins and pinoresinol/lariciresinol reductases have been isolated from Forsythia intermedia, Thuja plicata and Tsuga heterophylla, together with cDNAs encoding dirigent proteins and pinoresinol/lariciresinol reductases from these species. Accordingly, isolated DNA sequences are provided which code for the expression of dirigent proteins and pinoresinol/lariciresinol reductases.
In other aspects, replicable recombinant cloning vehicles are provided which code for dirigent proteins or pinoresinol/lariciresinol reductases or for a base sequence sufficiently complementary to at least a portion of dirigent protein or pinoresinol/lariciresinol reductase DNA or RNA to enable hybridization therewith (e.g., antisense dirigent protein or pinoresinol/lariciresinol reductase RNA or fragments of complementary dirigent protein or pinoresinol/lariciresinol reductase DNA which are useful as polymerase chain reaction primers or as probes for genes encoding dirigent proteins or pinoresinol/lariciresinol reductases or related genes). In yet other aspects, modified host cells are provided that have been transformed, transfected, infected and/or injected with a recombinant cloning vehicle and/or DNA sequence encoding dirigent protein or pinoresinol/lariciresinol reductase. Thus, systems and methods are provided for the recombinant expression of dirigent proteins and/or pinoresinol/lariciresinol reductases that may be used to facilitate the production, isolation and purification of significant quantities of recombinant dirigent proteins and/or pinoresinol/lariciresinol reductases for subsequent use, to obtain expression or enhanced expression of dirigent proteins and/or pinoresinol/lariciresinol reductases in plants in order to enhance, or otherwise alter, lignan biosynthesis, or may be otherwise employed for the regulation or expression of dirigent proteins and pinoresinol/lariciresinol reductases.

Description

RECOMBINANT PINORESINOL/LARICIRESINOL REDUCTASE, RECOMBINANT DIRIGENT PROTEIN, AND METHODS OF USE
Field of the Invention
The present invention relates to isolated dirigent proteins and pinoresinol/lariciresinol reductases from Forsythia intermedia, Tsuga heterophylla and Thuja plicata, to nucleic acid sequences which code for dirigent proteins and pinoresinol/lariciresinol reductases from Forsythia intermedia, Tsuga heterophylla and Thuja plicata, and to vectors containing the sequences, host cells containing the sequences and methods of producing recombinant pinoresinol/lariciresinol reductases, recombinant dirigent protein and their mutants.
Background of the Invention
Lignans are a large, structurally diverse class of vascular plant metabolites having a wide range of physiological functions and pharmacologically important properties (Ayres, D.C., and Loike, J.D. in Chemistry and Pharmacology of Natural Products. Lignans. Chemical, Biological and Clinical Properties, Cambridge University Press, Cambridge, England (1990); Lewis et al., in Chemistry of the Amazon, Biodiversity, Natural Products, and Environmental Issues, 588, (P.R. Seidl, O.R. Gottlieb and M.A.C. Kaplan) 135-167, ACS Symposium Series, Washington D.C. (1995)). Because of their pronounced antibiotic properties (Markkanen, T. et al., Drugs Exptl. Clin. Res. 7:711-718 (1981)), antioxidant properties (Faure, M. et al., Phytochemistry 29:3773-3775 (1990); Osawa, T. et al., Agric. Biol. Chem. 49:3351-3352 (1985)) and antifeedant properties (Harmatha, J., and Nawrot, J., Biochem. Syst. Ecol. 12:95-98 (1984)), a major role of lignans in vascular plants is to help confer resistance against various opportunistic biological pathogens and predators. Lignans have also been proposed as cytokinins (Binns, A.N. et al., Proc. Natl. Acad. Sci. USA 84:980-984 (1987)) and as intermediates in lignification (Rahman, M.M.A. et al., Phytochemistry 29:1861- (1990)), suggesting a critical role in plant growth and development. It is widely held that elaboration of biochemical pathways to lignins/lignans and related substances from phenylalanine (tyrosine) was essential for the successful transition of aquatic plants to their vascular dry-land counterparts (Lewis, N.G., and Davin, L.B., in Isoprenoids and Other Natural Products. Evolution and Function, 562 (W.D. Nes, ed) 202-246, ACS Symposium Series: Washington, DC (1994)), some four hundred and eighty million years ago (Graham, L.E., Origin of Land Plants, John Wiley & Sons, Inc., New York, NY (1993)).
WO 98/20113 PCT/US97/20391
Based on existing chemotaxonomic data, lignans are present in "primitive" plants, such as the fern Blechnum orientale (Wada, H. et al., Chem. Pharm. Bull. 40:2099-2101 (1992)) and the hornworts, e.g., Dendroceros japonicus and Megaceros flagellaris (Takeda, R. et al., in Bryophytes. Their Chemistry and Chemical Taxonomy, Vol. 29 (Zinsmeister, H.D. and Mues, R. eds) pp. 201-207, Oxford University Press: New York, NY (1990); Takeda, R. et al., Tetrahedron Lett. 31:4159-4162 (1990)), with the latter recently being classified as originating in the Silurian period (Graham, L.E., J. Plant Res. 109:241-252 (1996)). Interestingly, evolution of both gymnosperms and angiosperms was accompanied by major changes in the structural complexity and oxidative modifications of the lignans (Lewis, N.G., and Davin, L.B., in Isoprenoids and Other Natural Products. Evolution and Function, 562 (W.D. Nes, ed) 202-246, ACS Symposium Series: Washington, DC (1994); Gottlieb, O.R., and Yoshida, M., in Natural Products of Woody Plants. Chemicals Extraneous to the Lignocellulosic Cell Wall (Rowe, J.W. and Kirk, C.H. eds) pp. 439-511, Springer Verlag: Berlin (1989)). Indeed, in some species, such as Western Red Cedar (Thuja plicata), lignans can contribute extensively to heartwood formation/generation by enhancing the resulting heartwood color, quality, fragrance and durability.
In addition to their functions in plants, lignans also have important pharmacological roles. For example, podophyllotoxin, as its etoposide and teniposide derivatives, is an example of a plant compound that has been successfully employed as an anticancer agent (Ayres, D.C., and Loike, J.D. in Chemistry and Pharmacology of Natural Products. Lignans. Chemical, Biological and Clinical Properties, Cambridge University Press, Cambridge, England (1990)). Antiviral properties have also been reported for selected lignans. For example, (-)-arctigenin (Schroder, H.C. et al., Z. Naturforsch. 45c, 1215-1221 (1990)), (-)-trachelogenin (Schroder, H.C. et al., Z. Naturforsch. 45c, 1215-1221 (1990)) and nordihydroguaiaretic acid (Gnabre, J.N. et al., Proc. Natl. Acad. Sci. USA 92:11239-11243 (1995)) are each effective against HIV due to their pronounced reverse transcriptase inhibitory activities. Some lignans, e.g., matairesinol (Nikaido, T. et al., Chem. Pharm. Bull. 29:3586-3592 (1981)), inhibit cAMP-phosphodiesterase, whereas others enhance cardiovascular activity, e.g., syringaresinol β-D-glucoside (Nishibe, S. et al., Chem. Pharm. Bull. 38:1763- (1990)). There is also a high correlation between the presence, in the diet, of the "mammalian" lignans or "phytoestrogens", enterolactone and enterodiol, formed following digestion of high fiber diets, and reduced incidence rates of breast and prostate cancers (so-called chemoprevention) (Axelson, M., and Setchell, K.D.R., FEBS Lett. 123:337-342 (1981); Adlercreutz et al., J. Steroid Biochem. Molec. Biol. 41:3-8 (1992); Adlercreutz et al., J. Steroid Biochem. Molec. Biol. 52:97-103 (1995)). The "mammalian lignans," in turn, are considered to be derived from lignans such as matairesinol and secoisolariciresinol (Boriello et al., J. Applied Bacteriol., 58:37-43 (1985)).
The biosynthetic pathways to the lignans are only now being defined, although there are no prior art reports of the isolation of enzymes or genes involved in the lignan biosynthetic pathway. Based on radiolabeling experiments with crude enzyme extracts from Forsythia intermedia, it was first established that entry into the 8,8'-linked lignans, which represent the most prevalent dilignol linkage known (Davin, L.B., and Lewis, N.G., in Rec. Adv. Phytochemistry, Vol. 26 (Stafford, H.A., and Ibrahim, R.K., eds), pp. 325-375, Plenum Press, New York, NY (1992)), occurs via stereoselective coupling of two achiral coniferyl alcohol molecules, in the form of oxygenated free radicals, to afford the furofuran lignan (+)-pinoresinol (Davin, L.B., Bedgar, D.L., Katayama, T., and Lewis, N.G., Phytochemistry 31:3869-3874 (1992); Pare, P.W. et al., Tetrahedron Lett. 35:4731-4734 (1994)) (FIGURE 1).
Bimolecular phenoxy radical coupling reactions, such as the stereoselective coupling of two achiral coniferyl alcohol molecules to afford the furofuran lignan (+)-pinoresinol, are involved in numerous biological processes. These are presumed to include lignin formation in vascular plants (M. Nose et al., Phytochemistry 39:71 (1995)), lignan formation in vascular plants (N.G. Lewis and L.B. Davin, ACS Symp. Ser. 562:202 (1994); P.W. Pare et al., Tetrahedron Lett. 35:4731 (1994)), suberin formation in vascular plants (M.A. Bernards et al., J. Biol. Chem. 270:7382 (1995)), fruiting body development in fungi (J.D. Bu'Lock et al., J. Chem. Soc. 2085 (1962)), insect cuticle melanization and sclerotization (M. Miessner et al., Helv. Chim. Acta 74:1205 (1991); V.J. Marmaras et al., Arch. Insect Biochem. Physiol. 31:119 (1996)), the formation of aphid pigments (D.W. Cameron and Lord Todd, in Organic Substances of Natural Origin. Oxidative Coupling of Phenols, W.I. Taylor and A.R. Battersby, Eds. (Dekker, New York, 1967), Vol. 1, p. 203), and the formation of algal cell wall polymers (M.A. Ragan, Phytochemistry 23:2029 (1984)).
In contrast to the marked regiochemical and/or stereochemical specificities observed in the biosynthesis of the foregoing lignin and lignan substances in vivo, all previously described chemical (J. Iqbal et al., Chem. Rev. 94:519 (1994)) and enzymatic (K. Freudenberg, Science 148:595 (1965)) bimolecular phenoxy radical coupling reactions in vitro have lacked strict regio- and stereospecific control. That is, if chiral centers are introduced during coupling in vitro, the products are racemic, and different regiochemistries can result if more than one potential coupling site is present. Thus, the ability to generate a particular enantiomeric form or a specific coupling product in vitro is not under explicit control. Consequently, it is inferred that a mechanism exists in vivo to control the regiochemistry and stereochemistry of bimolecular phenoxy radical coupling reactions leading to the formation of, for example, lignans.
In Forsythia intermedia, and presumably other species, (+)-pinoresinol, the product of the stereospecific coupling of two E-coniferyl alcohol molecules, undergoes sequential reduction to generate (+)-lariciresinol and then (-)-secoisolariciresinol (Katayama, T. et al., Phytochemistry 32:581-591 (1993); Chu, A. et al., J. Biol. Chem. 268:27026-27033 (1993)) (FIGURE 1). While it has hitherto been unclear whether more than one reductase is required to catalyze the sequential steps, the reductions proceed via abstraction of the pro-R hydride of NADPH, resulting in an "inversion" of configuration at both the C-7 and C-7' positions of the products, (+)-lariciresinol and (-)-secoisolariciresinol (Chu, A., et al., J. Biol. Chem. 268:27026-27033 (1993)). (-)-Matairesinol is subsequently formed via dehydrogenation of (-)-secoisolariciresinol, further metabolism of which presumably affords lignans such as the antiviral (-)-trachelogenin in Ipomoea cairica and (-)-podophyllotoxin in Podophyllum peltatum.
Thus, the stereospecific formation of (+)-pinoresinol and the subsequent reductive steps giving (+)-lariciresinol and (-)-secoisolariciresinol are pivotal points in lignan metabolism, since they represent entry into the furano, dibenzylbutane, dibenzylbutyrolactone and aryltetrahydronaphthalene lignan subclasses.
Additionally, it should be noted that while lignans are normally optically active, the particular enantiomer present may differ between plant species. For example, (-)-pinoresinol occurs in Xanthoxylum ailanthoides (Ishii et al., Yakugaku Zasshi, 103:279-292 (1983)), and (-)-lariciresinol is present in Daphne tangutica (Lin-Gen, et al., Planta Medica, 45:172-17b (1982)). The optical activity of a particular lignan may have important ramifications regarding biological activity. For example, (-)-trachelogenin inhibits the in vitro replication of HIV-1, whereas its (+)-enantiomer is much less effective (Schroder et al., Naturforsch. 45c:1215-1221 (1990)).
Summary of the Invention
In accordance with the foregoing, in one aspect of the invention it has now been discovered that a 78-kD dirigent protein is involved in conferring stereospecificity in 8,8'-linked lignan formation. This protein has no detectable catalytically active oxidative center and apparently serves only to bind and orient coniferyl alcohol-derived free radicals, which then undergo stereoselective coupling to form (+)-pinoresinol. The formation of free radicals, in the first instance, requires the oxidative capacity of either a nonspecific oxidase or even a non-enzymatic electron oxidant. In another aspect of the invention, it has been discovered that a single enzyme, designated pinoresinol/lariciresinol reductase, catalyzes the conversion of pinoresinol to lariciresinol and then to secoisolariciresinol. Thus, one aspect of the invention relates to isolated dirigent proteins and to isolated pinoresinol/lariciresinol reductases, such as, for example, those from Forsythia intermedia, Thuja plicata and Tsuga heterophylla.
In other aspects of the invention, cDNAs encoding dirigent protein from Forsythia intermedia (SEQ ID Nos:12 and 14), Thuja plicata (SEQ ID Nos:20, 22, 24, 26, 28, 30, 32 and 34) and Tsuga heterophylla (SEQ ID Nos:16 and 18) have been isolated and sequenced, and the corresponding amino acid sequences have been deduced. Also, cDNAs encoding pinoresinol/lariciresinol reductase from Forsythia intermedia (SEQ ID Nos:47, 49, 51, 53, 55 and 57), Thuja plicata (SEQ ID Nos:61, 63, 65 and 67) and Tsuga heterophylla (SEQ ID Nos:69 and 71) have been isolated and sequenced, and the corresponding amino acid sequences have been deduced.
Thus, the present invention relates to isolated proteins and to isolated DNA sequences which code for the expression of dirigent protein or pinoresinol/lariciresinol reductase. In other aspects, the present invention is directed to replicable recombinant cloning vehicles comprising a nucleic acid sequence which codes for a pinoresinol/lariciresinol reductase or for a dirigent protein. The present invention is also directed to a base sequence sufficiently complementary to at least a portion of a pinoresinol/lariciresinol reductase DNA or RNA, or to at least a portion of a dirigent protein DNA or RNA, to enable hybridization therewith. The aforesaid complementary base sequences include, but are not limited to: antisense pinoresinol/lariciresinol reductase RNA; antisense dirigent protein RNA; fragments of DNA that are complementary to a pinoresinol/lariciresinol reductase DNA, or to a dirigent protein DNA, and which are therefore useful as polymerase chain reaction primers, or as probes for pinoresinol/lariciresinol reductase genes, dirigent protein genes, or related genes.
In yet other aspects of the invention, modified host cells are provided that have been transformed, transfected, infected and/or injected with a recombinant cloning vehicle and/or DNA sequence of the invention. Thus, the present invention provides for the recombinant expression of pinoresinol/lariciresinol reductases and dirigent proteins in plants, animals, microbes and in cell cultures. The inventive concepts described herein may be used to facilitate the production, isolation and purification of significant quantities of recombinant pinoresinol/lariciresinol reductase or dirigent protein, or of their enzyme products, in plants, animals, microbes or cell cultures.
Brief Description of the Drawings
The foregoing aspects and many of the attendant advantages of this invention will become more readily appreciated as the same becomes better understood by reference to the following detailed description, when taken in conjunction with the accompanying drawings, wherein:
FIGURE 1 shows the stereospecific conversion of E-coniferyl alcohol to (+)-pinoresinol in Forsythia intermedia. The stereoselectivity of this reaction is controlled by dirigent protein. (+)-Pinoresinol is then sequentially converted to (+)-lariciresinol and (-)-secoisolariciresinol by (+)-pinoresinol/(+)-lariciresinol reductase. (+)-Pinoresinol, (+)-lariciresinol and (-)-secoisolariciresinol are the precursors of the furofuran, furano and dibenzylbutane families of lignans, respectively.
Detailed Description of the Preferred Embodiment
As used herein, the terms "amino acid" and "amino acids" refer to all naturally occurring L-α-amino acids or their residues. The amino acids are identified by either the single-letter or three-letter designations:
Asp  D  aspartic acid      Ile  I  isoleucine
Thr  T  threonine          Leu  L  leucine
Ser  S  serine             Tyr  Y  tyrosine
Glu  E  glutamic acid      Phe  F  phenylalanine
Pro  P  proline            His  H  histidine
Gly  G  glycine            Lys  K  lysine
Ala  A  alanine            Arg  R  arginine
Cys  C  cysteine           Trp  W  tryptophan
Val  V  valine             Gln  Q  glutamine
Met  M  methionine         Asn  N  asparagine
As used herein, the term "nucleotide" means a monomeric unit of DNA or RNA containing a sugar moiety (pentose), a phosphate and a nitrogenous heterocyclic base. The base is linked to the sugar moiety via the glycosidic carbon (1' carbon of pentose) and that combination of base and sugar is called a nucleoside. The base characterizes the nucleotide with the four bases of DNA being adenine ("A"), guanine ("G"), cytosine ("C") and thymine ("T"). Inosine ("I") is a synthetic base that can be used to substitute for any of the four, naturally-occurring bases (A, C, G or T). The four RNA bases are A, G, C and uracil ("U"). The nucleotide sequences described herein comprise a linear array of nucleotides connected by phosphodiester bonds between the 3' and 5' carbons of adjacent pentoses.
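The single-letter and three-letter amino acid designations listed above can be held in a simple lookup table. The following is a minimal sketch in Python; the helper name `to_one_letter` and the hyphen-separated input format are illustrative assumptions, not part of the specification.

```python
# Three-letter to single-letter amino acid designations, as listed in the text.
THREE_TO_ONE = {
    "Asp": "D", "Ile": "I", "Thr": "T", "Leu": "L", "Ser": "S",
    "Tyr": "Y", "Glu": "E", "Phe": "F", "Pro": "P", "His": "H",
    "Gly": "G", "Lys": "K", "Ala": "A", "Arg": "R", "Cys": "C",
    "Trp": "W", "Val": "V", "Gln": "Q", "Met": "M", "Asn": "N",
}


def to_one_letter(three_letter_seq: str, sep: str = "-") -> str:
    """Convert e.g. 'Leu-Gly-Val' to 'LGV' (hypothetical helper).

    Raises KeyError if a residue is not one of the twenty listed designations.
    """
    residues = three_letter_seq.split(sep)
    return "".join(THREE_TO_ONE[res.strip().capitalize()] for res in residues)


print(to_one_letter("Leu-Gly-Val"))
```

The table is keyed on the three-letter form because that is the form used in sequence listings; inverting the dictionary gives the reverse mapping.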
The term "percent identity" (%I) means the percentage of amino acids or nucleotides that occupy the same relative position when two amino acid sequences, or two nucleic acid sequences, are aligned side by side.
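As a rough illustration of the percent identity definition, the sketch below counts matching positions in two sequences that have already been aligned. The function name, the use of "-" as a gap character, and the choice of the full alignment length as the denominator are assumptions made for this example; the text does not fix these details.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percentage of aligned positions holding the same residue in both sequences.

    Assumptions (not from the source): inputs are pre-aligned and of equal
    length, gaps are written as '-', and the denominator is the alignment length.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("sequences must not be empty")
    matches = sum(
        1 for a, b in zip(aligned_a, aligned_b) if a == b and a != gap
    )
    return 100.0 * matches / len(aligned_a)


print(percent_identity("LGVA", "LGVS"))
```

Other conventions divide by the length of the shorter sequence or by the number of ungapped columns, so reported values should state which denominator was used.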
The term "percent similarity" (%S) is a statistical measure of the degree of relatedness of two compared protein sequences. The percent similarity is calculated by a computer program that assigns a numerical value to each compared pair of amino acids based on chemical similarity (e.g., whether the compared amino acids are acidic, basic, hydrophobic, aromatic, etc.) and/or evolutionary distance as measured by the minimum number of base pair changes that would be required to convert a codon encoding one member of a pair of compared amino acids to a codon encoding the other member of the pair. Calculations are made after a best fit alignment of the two sequences has been made empirically by iterative comparison of all possible alignments. (Henikoff, S. and Henikoff, J.G., Proc. Nat'l Acad. Sci. USA 89:10915-10919 (1992)).
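The chemical-similarity part of this definition can be sketched as follows. This is a simplified illustration only: it counts an aligned pair as similar when both residues fall in the same chemical class, and it does not implement the substitution-matrix scoring of the cited Henikoff reference or the codon-distance measure. The class groupings and the function name are assumptions chosen for the example.

```python
# Illustrative chemical classes (an assumption; published groupings vary).
CLASSES = {
    "acidic": "DE",
    "basic": "KRH",
    "hydrophobic": "AVLIMC",
    "aromatic": "FWY",
    "polar": "STNQ",
    "special": "GP",
}
CLASS_OF = {aa: name for name, members in CLASSES.items() for aa in members}


def percent_similarity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percentage of aligned positions whose residues share a chemical class.

    Identical residues count as similar. Gapped or unrecognized positions
    never count. Inputs are assumed to be pre-aligned and of equal length.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("sequences must not be empty")
    similar = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == gap or b == gap:
            continue
        class_a = CLASS_OF.get(a)
        if class_a is not None and class_a == CLASS_OF.get(b):
            similar += 1
    return 100.0 * similar / len(aligned_a)


print(percent_similarity("DKLF", "EKIA"))
```

A production calculation would normally use a published substitution matrix and an alignment program rather than fixed classes.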
"Oligonucleotide" refers to short length single or double stranded sequences of deoxyribonucleotides linked via phosphodiester bonds. The oligonucleotides are chemically synthesized by known methods and purified, for example, on polyacrylamide gels.
The term "pinoresinol/lariciresinol reductase" is used herein to mean an enzyme capable of catalyzing two reduction reactions: the reduction of pinoresinol to lariciresinol, and the reduction of lariciresinol to secoisolariciresinol. The products of these reactions, lariciresinol and secoisolariciresinol, can be either the (+)- or (-)-enantiomers.
The term "dirigent protein" is used herein to mean a protein capable of guiding a bimolecular phenoxy radical coupling reaction thereby determining the stereochemistry and regiochemistry of the product of the reaction and/or its polymeric derivatives.
The terms "alteration", "amino acid sequence alteration", "variant" and "amino acid sequence variant" refer to dirigent protein or pinoresinol/lariciresinol reductase molecules with some differences in their amino acid sequences as compared to the corresponding native dirigent protein or pinoresinol/lariciresinol reductase. Ordinarily, the variants will possess at least about 70% homology with the corresponding, native dirigent protein or pinoresinol/lariciresinol reductase, and preferably they will be at least about 80% homologous with the corresponding, native dirigent protein or pinoresinol/lariciresinol reductase. The amino acid sequence variants of dirigent protein or pinoresinol/lariciresinol reductase falling within this invention possess substitutions, deletions, and/or insertions at certain positions.
Sequence variants of dirigent protein or pinoresinol/lariciresinol reductase may be used to attain desired enhanced or reduced enzymatic activity, modified regiochemistry or stereochemistry, or altered substrate utilization or product distribution.
Substitutional dirigent protein variants or pinoresinol/lariciresinol reductase variants are those that have at least one amino acid residue in the corresponding native dirigent protein sequence or pinoresinol/lariciresinol reductase sequence removed and a different amino acid inserted in its place at the same position. The substitutions may be single, where only one amino acid in the molecule has been substituted, or they may be multiple, where two or more amino acids have been substituted in the same molecule. Substantial changes in the activity of the dirigent protein or pinoresinol/lariciresinol reductase molecule may be obtained by substituting an amino acid with a side chain that is significantly different in charge and/or structure from that of the native amino acid. This type of substitution would be expected to affect the structure of the polypeptide backbone and/or the charge or hydrophobicity of the molecule in the area of the substitution.
Moderate changes in the activity of the dirigent protein or pinoresinol/lariciresinol reductase molecule would be expected by substituting an amino acid with a side chain that is similar in charge and/or structure to that of the native molecule. This type of substitution, referred to as a conservative substitution, would not be expected to substantially alter either the structure of the polypeptide backbone or the charge or hydrophobicity of the molecule in the area of the substitution.
Insertional dirigent protein variants or pinoresinol/lariciresinol reductase variants are those with one or more amino acids inserted immediately adjacent to an amino acid at a particular position in the native dirigent protein or pinoresinol/lariciresinol reductase molecule. Immediately adjacent to an amino acid means connected to either the α-carboxy or α-amino functional group of the amino acid. The insertion may be one or more amino acids. Ordinarily, the insertion will consist of one or two conservative amino acids. Amino acids similar in charge and/or structure to the amino acids adjacent to the site of insertion are defined as conservative. Alternatively, this invention includes insertion of an amino acid with a charge and/or structure that is substantially different from the amino acids adjacent to the site of insertion.
Deletional variants are those where one or more amino acids in the native dirigent protein or pinoresinol/lariciresinol reductase molecule have been removed. Ordinarily, deletional variants will have one or two amino acids deleted in a particular region of the dirigent protein or pinoresinol/lariciresinol reductase molecule.
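The three classes of sequence variant described above (substitutional, insertional and deletional) can be illustrated as simple string operations on a single-letter amino acid sequence. The sketch below is illustrative only: the function names, the 1-based position convention and the example sequence "MLGVA" are assumptions and do not correspond to any sequence of the invention.

```python
def substitute(seq: str, position: int, new_residue: str) -> str:
    """Replace the residue at a 1-based position with a different residue."""
    if not 1 <= position <= len(seq):
        raise IndexError("position outside sequence")
    i = position - 1
    return seq[:i] + new_residue + seq[i + 1:]


def insert_after(seq: str, position: int, residues: str) -> str:
    """Insert residues immediately after a 1-based position (0 = before the first)."""
    if not 0 <= position <= len(seq):
        raise IndexError("position outside sequence")
    return seq[:position] + residues + seq[position:]


def delete(seq: str, position: int, count: int = 1) -> str:
    """Remove `count` residues starting at a 1-based position."""
    if position < 1 or count < 1 or position - 1 + count > len(seq):
        raise IndexError("deletion outside sequence")
    i = position - 1
    return seq[:i] + seq[i + count:]


native = "MLGVA"  # made-up example sequence
print(substitute(native, 2, "I"))     # conservative substitution of Leu by Ile
print(insert_after(native, 2, "AG"))  # two-residue insertion after position 2
print(delete(native, 3, 2))           # two-residue deletion starting at position 3
```

Keeping each operation separate makes it straightforward to compose multiple substitutions, or to combine a substitution with a deletion, in a single variant.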
The term "antisense" or "antisense RNA" or "antisense nucleic acid" is used herein to mean a nucleic acid molecule that is complementary to all or part of a messenger RNA molecule. Antisense nucleic acid molecules are typically used to inhibit the expression, in vivo, of complementary, expressed messenger RNA molecules.
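Complementarity in the sense used above can be sketched as a reverse-complement operation on an RNA sequence. The function name and the example sequence below are hypothetical; the sketch assumes standard Watson-Crick pairing (A with U, G with C) and returns the antisense strand written 5' to 3'.

```python
_RNA_COMPLEMENT = str.maketrans("ACGU", "UGCA")


def antisense_rna(mrna: str) -> str:
    """Return the antisense RNA (reverse complement, 5'->3') of an mRNA segment.

    Any 'T' in the input is treated as 'U' so that a cDNA-style sequence
    can be passed in directly.
    """
    rna = mrna.upper().replace("T", "U")
    invalid = set(rna) - set("ACGU")
    if invalid:
        raise ValueError(f"unexpected bases: {sorted(invalid)}")
    return rna.translate(_RNA_COMPLEMENT)[::-1]


print(antisense_rna("AUGGCC"))
```

The reversal matters because the two strands of a duplex run antiparallel; complementing without reversing would give the antisense strand written 3' to 5'.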


The terms "biological activity", "biologically active", "activity" and "active" when used with reference to a pinoresinol/lariciresinol reductase molecule refer to the ability of the pinoresinol/lariciresinol reductase molecule to reduce pinoresinol and lariciresinol to yield lariciresinol and secoisolariciresinol, respectively, as measured in an enzyme activity assay, such as the assay described in Example 8 below.
The terms "biological activity", "biologically active", "activity" and "active" when used with reference to a dirigent protein refer to the ability of the dirigent protein to guide a bimolecular phenoxy radical coupling reaction thereby determining the stereochemistry and regiochemistry of the product of the reaction and of its polymeric derivatives.
Amino acid sequence variants of dirigent protein or pinoresinol/lariciresinol reductase may have desirable altered biological activity including, for example, altered reaction kinetics, substrate utilization, product distribution or other characteristics such as regiochemistry and stereochemistry.
The terms "DNA sequence encoding", "DNA encoding" and "nucleic acid encoding" refer to the order or sequence of deoxyribonucleotides along a strand of deoxyribonucleic acid. The order of these deoxyribonucleotides determines the order of amino acids along the translated polypeptide chain. The DNA sequence thus codes for the amino acid sequence.
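The relationship between the order of deoxyribonucleotides and the order of amino acids can be sketched with the standard genetic code. The example below is illustrative: the function name and the short input sequence are made up, and the sketch assumes the standard nuclear code read from the first base in a single reading frame.

```python
from itertools import product

# Standard genetic code, codons enumerated with bases in the order T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate a coding-strand DNA sequence until the first stop codon.

    Trailing bases that do not form a complete codon are ignored.
    """
    dna = dna.upper()
    protein = []
    for i in range(0, len(dna) - len(dna) % 3, 3):
        amino_acid = CODON_TABLE[dna[i:i + 3]]
        if amino_acid == "*":  # stop codon
            break
        protein.append(amino_acid)
    return "".join(protein)


print(translate("ATGCTTGGAGTTTAA"))
```

Deducing an amino acid sequence from a cDNA, as described for the clones in this specification, additionally requires locating the correct open reading frame, which this sketch does not attempt.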
The terms "replicable expression vector" and "expression vector" refer to a piece of DNA, usually double-stranded, which may have inserted into it a piece of foreign DNA. Foreign DNA is defined as heterologous DNA, which is DNA not naturally found in the host. The vector is used to transport the foreign or heterologous DNA into a suitable host cell. Once in the host cell, the vector can replicate independently of or coincidentally with the host chromosomal DNA, and several copies of the vector and its inserted (foreign) DNA may be generated. In addition, the vector contains the necessary elements that permit translating the foreign DNA into a polypeptide. Many molecules of the polypeptide encoded by the foreign DNA can thus be rapidly synthesized.
The terms "transformed host cell," "transformed" and "transformation" refer to the introduction of DNA into a cell. The cell is termed a "host cell", and it may be a prokaryotic or a eukaryotic cell. Typical prokaryotic host cells include various strains of E. coli. Typical eukaryotic host cells are plant cells, such as maize cells, yeast cells, insect cells or animal cells. The introduced DNA is usually in the form of a vector containing an inserted piece of DNA. The introduced DNA sequence may be from the same species as the host cell or from a different species from the host cell, or it may be a hybrid DNA sequence, containing some foreign DNA and some DNA derived from the host species.
In accordance with the present invention, cDNAs encoding dirigent protein and pinoresinol/lariciresinol reductase from Forsythia intermedia, Thuja plicata and Tsuga heterophylla were isolated, sequenced and expressed in the following manner.
With respect to the cDNAs encoding dirigent protein from Forsythia intermedia, an empirically-determined purification protocol was developed to isolate the Forsythia dirigent protein. This procedure yielded at least six isoforms of the dirigent protein. Amino acid sequencing of the amino terminus of each of these isoforms revealed that the sequence of each isoform was identical. Sequencing of the N-terminus of a mixture of these isoforms yielded a 28 amino acid sequence (SEQ ID No:1). Tryptic digestion of a mixture of these isoforms yielded six peptide fragments which were purified in sufficient quantity to permit sequencing (SEQ ID Nos:2-7).
A primer designated PSINT1 (SEQ ID No:8) was synthesized based on the sequence of amino acids 9 to 15 of the N-terminal peptide (SEQ ID No:1). A primer designated PSI1R (SEQ ID No:9) was synthesized based on the sequence of amino acids 3 to 9 of the internal peptide sequence set forth in SEQ ID No:2. A primer designated PSI2R (SEQ ID No:10) was synthesized based on the sequence of amino acids 13 to 20 of the internal peptide sequence set forth in SEQ ID No:2. A primer designated PSI7R (SEQ ID No:11) was synthesized based on the sequence of amino acids 6 to 12 of the internal peptide sequence set forth in SEQ ID No:3.
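The degeneracy that a peptide-derived primer of this kind must accommodate can be illustrated with a short sketch. The Python below is illustrative only and is not part of the disclosed procedure: the function names `primer_degeneracy` and `primer_length` and the example peptide are hypothetical, and the codon counts are those of the standard genetic code.

```python
# Number of codons that encode each amino acid in the standard genetic code.
CODON_COUNT = {
    "A": 4, "R": 6, "N": 2, "D": 2, "C": 2, "Q": 2, "E": 2, "G": 4,
    "H": 2, "I": 3, "L": 6, "K": 2, "M": 1, "F": 2, "P": 4, "S": 6,
    "T": 4, "W": 1, "Y": 2, "V": 4,
}


def primer_degeneracy(peptide: str) -> int:
    """Return the number of distinct DNA sequences that can encode the peptide."""
    total = 1
    for residue in peptide.upper():
        total *= CODON_COUNT[residue]
    return total


def primer_length(peptide: str) -> int:
    """Return the length, in nucleotides, of a primer spanning the whole peptide."""
    return 3 * len(peptide)


if __name__ == "__main__":
    # Hypothetical seven-residue stretch; not taken from any SEQ ID No.
    example = "MKWDEFY"
    print(primer_length(example), primer_degeneracy(example))
```

A seven-residue stretch such as those used above corresponds to a 21-nucleotide primer; stretches rich in Met, Trp and two-codon residues keep the number of primer species in the mixture low.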
Forsythia total RNA was isolated by means of a protocol adapted from a method specifically designed for woody tissues which contain a large concentration of polyphenols. Poly A+ RNA was isolated and a cDNA library constructed using standard means. PCR reactions utilizing primer PSINT1 (SEQ ID No:8) and one of PSI7R (SEQ ID No:11), PSI2R (SEQ ID No:10) or PSI1R (SEQ ID No:9), together with an aliquot of Forsythia cDNA as substrate, each yielded a single cDNA band, of 370 bp, 155 bp and 125 bp, respectively. The 370 bp product of the PSINT1 (SEQ ID No:8)-PSI7R (SEQ ID No:11) reaction was amplified by PCR and utilized as a probe to screen approximately 600,000 PFU of a Forsythia intermedia cDNA library. Two distinct cDNAs were identified, called pPSDFi1 (SEQ ID No:12) and pPSDFi2 (SEQ ID No:14). The cDNA insert encoding dirigent protein was excised from plasmid pPSDFi1 and cloned into the baculovirus transfer vector pBlueBac4. The resulting construct was used to transform Spodoptera frugiperda from which functional dirigent protein was purified.
With respect to the cloning of dirigent protein from Thuja plicata and Tsuga heterophylla, the Forsythia cDNAs were used as probes to isolate two dirigent protein clones from Tsuga heterophylla (SEQ ID Nos:16, 18), and eight dirigent protein cDNA clones from Thuja plicata (SEQ ID Nos:20, 22, 24, 26, 28, 30, 32, 34).
With respect to the cDNAs encoding (+)-pinoresinol/(+)-lariciresinol reductase from Forsythia intermedia, an empirically-determined purification protocol, consisting of eight chromatographic steps, was developed to isolate the Forsythia (+)-pinoresinol/(+)-lariciresinol reductase protein. This procedure yielded two isoforms of (+)-pinoresinol/(+)-lariciresinol reductase which were both capable of catalyzing the reduction of (+)-pinoresinol and (+)-lariciresinol.
Sequencing of the N-terminus of each of these isoforms yielded an identical 30 amino acid sequence (SEQ ID No:36). Tryptic digestion of a mixture of both of these isoforms yielded four peptide fragments which were purified in sufficient quantity to permit sequencing (SEQ ID Nos:37-40). Additionally, cyanogen bromide cleavage of a mixture of both of these isoforms yielded three peptide fragments which were purified in sufficient quantity to permit sequencing (SEQ ID Nos:41-43).
A primer designated PLRNS (SEQ ID No:44) was synthesized based on the sequence of amino acids 7 to 13 of the N-terminal peptide (SEQ ID No:36). A primer designated PLR14R (SEQ ID No:45) was synthesized based on the sequence of amino acids 2 to 8 of the internal peptide sequence set forth in SEQ ID No:37. A primer designated PLR15R (SEQ ID No:46) was synthesized based on the sequence of amino acids 9 to 15 of the internal peptide sequence set forth in SEQ ID No:37.
The sequence of amino acids 9 to 15 of the internal peptide sequence set forth in SEQ ID No:37, upon which the sequence of primer PLR15R (SEQ ID No:46) was based, also corresponded to the sequence of amino acids 4 to 10 of the cyanogen bromide-generated, internal fragment set forth in SEQ ID No:41.
Forsythia total RNA was isolated by means of a protocol adapted from a method specifically designed for woody tissues which contain a large concentration of polyphenols. Poly A+ RNA was isolated and a cDNA library constructed using standard means. A PCR reaction utilizing primers PLRNS (SEQ ID No:44) and either PLR14R (SEQ ID No:45) or PLR15R (SEQ ID No:46), together with an aliquot of Forsythia cDNA as substrate, yielded two amplified bands of 380 bp and 400 bp. One 400 bp cDNA insert was utilized as a probe with which to screen the Forsythia cDNA library. The 400 bp probe corresponded to bases 22 to 423 of SEQ ID No:47. Six cDNA clones were isolated and sequenced (SEQ ID Nos:47, 49, 51, 53, 55, 57). The clones shared a common coding region, many had a different 5'-untranslated region and the 3'-untranslated region of each terminated at a different point. One of these cDNAs (SEQ ID No:47), expressed as a β-galactosidase fusion protein in E. coli, catalyzed the same enantiomer-specific reactions as the native plant protein.
With respect to the cloning of (+)-pinoresinol/(+)-lariciresinol reductase and (-)-pinoresinol/(-)-lariciresinol reductase from Thuja plicata, cDNA was synthesized and utilized as a template in a PCR reaction in which the primers were a 3' linker-primer (SEQ ID No:59) and a 5' primer, designated CR6-NT (SEQ ID No:60). At least two bands of the expected length (1.2 kb) were generated and cloned into a plasmid vector. One clone, designated plr-Tp1 (SEQ ID No:61), was completely sequenced and expressed as a β-galactosidase fusion protein in E. coli. plr-Tp1 encodes a (-)-pinoresinol/(-)-lariciresinol reductase.
The cDNA insert of clone plr-Tp1 was used to screen the T. plicata cDNA library and identify an additional, unique clone, designated plr-Tp2 (SEQ ID No:63). plr-Tp2 has high homology to plr-Tp1 but encodes a (+)-pinoresinol/(+)-lariciresinol reductase. The cDNA insert of clone plr-Tp1 was also used to screen the T. plicata cDNA library and identify an additional two pinoresinol/lariciresinol reductase cDNAs (SEQ ID Nos:65, 67).
Two cDNAs encoding pinoresinol/lariciresinol reductases from Tsuga heterophylla (SEQ ID Nos:69, 71) were isolated by screening a Tsuga heterophylla cDNA library with the plr-Tp1 cDNA insert.
The isolation of cDNAs encoding dirigent proteins, (+)-pinoresinol/(+)-lariciresinol reductase and (-)-pinoresinol/(-)-lariciresinol reductase permits the development of an efficient expression system for these functional enzymes; provides useful tools for examining the developmental regulation of lignan biosynthesis; and permits the isolation of other dirigent proteins and pinoresinol/lariciresinol reductases. The isolation of the dirigent protein and pinoresinol/lariciresinol reductase cDNAs also permits the transformation of a wide range of organisms in order to enhance or modify lignan biosynthesis.

The proteins and nucleic acids of the present invention can be utilized to predetermine the stereochemistry, regiochemistry, or both, of the products of bimolecular phenoxy coupling reactions, such as the furofuran, furano and dibenzylbutane lignans. By way of non-limiting examples, the proteins and nucleic acids of the present invention can be utilized to: elevate or otherwise alter the levels of health-protecting lignans, such as podophyllotoxin, in plant species, including but not limited to vegetables, grains and fruits, and in food items incorporating material derived from such genetically altered plants; genetically alter plant species to provide an abundant, natural supply of lignans useful for a variety of purposes, for example as nutraceuticals and dietary supplements; and genetically alter living organisms to produce an abundant supply of optically pure lignans having desirable biological properties, for example (-)-arctigenin, which possesses antiviral properties.
In particular, characterization of the dirigent protein binding site and mechanism of action permits the development of synthetic proteins consisting of an array of dirigent protein binding sites which serve as templates for stereochemically-controlled polymeric assembly.
N-terminal transport sequences well known in the art (see, e.g., von Heijne, G. et al., Eur. J. Biochem. 180:535-545 (1989); Stryer, Biochemistry, W.H. Freeman and Company, New York, NY, p. 769 (1988)) may be employed to direct the dirigent protein or pinoresinol/lariciresinol reductase to a variety of cellular or extracellular locations.
Sequence variants of wild-type dirigent protein clones and pinoresinol/lariciresinol clones that can be produced by deletions, substitutions, mutations and/or insertions are intended to be within the scope of the invention except insofar as limited by the prior art. Dirigent protein or pinoresinol/lariciresinol reductase amino acid sequence variants may be constructed by mutating the DNA sequence that encodes wild-type dirigent protein or wild-type pinoresinol/lariciresinol reductase, such as by using techniques commonly referred to as site-directed mutagenesis.
Various polymerase chain reaction (PCR) methods now well known in the field, such as a two primer system like the Transformer Site-Directed Mutagenesis kit from Clontech, may be employed for this purpose.
Following denaturation of the target plasmid in this system, two primers are simultaneously annealed to the plasmid; one of these primers contains the desired site-directed mutation, the other contains a mutation at another point in the plasmid resulting in elimination of a restriction site. Second strand synthesis is then carried out, tightly linking these two mutations, and the resulting plasmids are transformed into a mutS strain of E. coli. Plasmid DNA is isolated from the transformed bacteria, restricted with the relevant restriction enzyme (thereby linearizing the unmutated plasmids), and then retransformed into E. coli. This system allows for generation of mutations directly in an expression plasmid, without the necessity of subcloning or generation of single-stranded phagemids. The tight linkage of the two mutations and the subsequent linearization of unmutated plasmids results in high mutation efficiency and allows minimal screening. Following synthesis of the initial restriction site primer, this method requires the use of only one new primer type per mutation site. Rather than prepare each positional mutant separately, a set of "designed degenerate" oligonucleotide primers can be synthesized in order to introduce all of the desired mutations at a given site simultaneously.
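By way of illustration only, the set of residues that a single degenerate codon in such a primer can introduce may be enumerated as sketched below. The sketch assumes the standard genetic code and the IUPAC nucleotide ambiguity codes; the function name `expand_degenerate_codon` and the example codons (NNK, GAK) are hypothetical and are not drawn from the primers described herein.

```python
from itertools import product

# Standard genetic code, codons ordered T, C, A, G at each position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

# IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def expand_degenerate_codon(codon: str) -> list:
    """Return the sorted residues ('*' = stop) encoded by a degenerate codon."""
    choices = [IUPAC[symbol] for symbol in codon.upper()]
    return sorted({CODON_TABLE["".join(bases)] for bases in product(*choices)})


if __name__ == "__main__":
    print(expand_degenerate_codon("GAK"))       # Asp and Glu only
    print(len(expand_degenerate_codon("NNK")))  # all residues plus one stop
```

Restricting the degenerate positions in this way lets one primer mixture cover a chosen family of substitutions at a site while excluding unwanted codons.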
Transformants can be screened by sequencing the plasmid DNA through the mutagenized region to identify and sort mutant clones. Each mutant DNA can then be restricted and analyzed by electrophoresis on Mutation Detection Enhancement gel (J.T. Baker) to confirm that no other alterations in the sequence have occurred (by band shift comparison to the unmutagenized control).
The verified mutant duplexes can be cloned into a replicable expression vector, if not already cloned into a vector of this type, and the resulting expression construct used to transform E. coli, such as strain E. coli BL21(DE3)pLysS, for high level production of the mutant protein, and subsequent purification thereof.
The method of FAB-MS mapping can be employed to rapidly check the fidelity of mutant expression. This technique provides for sequencing segments throughout the whole protein and provides the necessary confidence in the sequence assignment. In a mapping experiment of this type, protein is digested with a protease (the choice will depend on the specific region to be modified since this segment is of prime interest and the remaining map should be identical to the map of unmutagenized protein).
The set of cleavage fragments is fractionated by microbore HPLC (reversed phase or ion exchange, again depending on the specific region to be modified) to provide several peptides in each fraction, and the molecular weights of the peptides are determined by FAB-MS. The masses are then compared to the molecular weights of peptides expected from the digestion of the predicted sequence, and the correctness of the sequence quickly ascertained. Since this mutagenesis approach to protein modification is directed, sequencing of the altered peptide should not be necessary if the MS agrees with prediction. If necessary to verify a changed residue, CAD-tandem MS/MS can be employed to sequence the peptides of the mixture in question, or the target peptide can be purified for subtractive Edman degradation or carboxypeptidase Y digestion, depending on the location of the modification.
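The comparison of measured masses with those expected from the predicted sequence can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure actually used: trypsin is assumed as the protease (cleavage after Lys or Arg, except before Pro), masses are neutral monoisotopic masses computed from standard residue masses, and the function names and example sequences are hypothetical.

```python
# Monoisotopic residue masses (Da) and the mass of water added per peptide.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056


def tryptic_peptides(protein: str) -> list:
    """Cleave C-terminal to K or R, except when the next residue is P."""
    seq = protein.upper()
    peptides, start = [], 0
    for i, residue in enumerate(seq):
        next_residue = seq[i + 1] if i + 1 < len(seq) else ""
        if residue in "KR" and next_residue != "P":
            peptides.append(seq[start:i + 1])
            start = i + 1
    if start < len(seq):
        peptides.append(seq[start:])
    return peptides


def peptide_mass(peptide: str) -> float:
    """Neutral monoisotopic mass of a linear peptide."""
    return sum(RESIDUE_MASS[r] for r in peptide.upper()) + WATER


def shifted_peptides(wild_type: str, mutant: str, tol: float = 0.01) -> list:
    """Mutant peptides whose mass matches no wild-type peptide within tol."""
    wt_masses = [peptide_mass(p) for p in tryptic_peptides(wild_type)]
    return [
        p for p in tryptic_peptides(mutant)
        if not any(abs(peptide_mass(p) - m) <= tol for m in wt_masses)
    ]


if __name__ == "__main__":
    # Hypothetical sequences differing by a single Cys-to-Ala substitution.
    print(shifted_peptides("GASKCVLR", "GASKAVLR"))
```

Only the peptide carrying the substitution should appear in the result; every other fragment of the map is expected to coincide with the unmutagenized protein.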
In the design of a particular site directed mutant, it is generally desirable to first make a non-conservative substitution (e.g., Ala for Cys, His or Glu) and determine if activity is greatly impaired as a consequence. The properties of the mutagenized protein are then examined with particular attention to the kinetic parameters of Km and kcat as sensitive indicators of altered function, from which changes in binding and/or catalysis per se may be deduced by comparison to the native enzyme. If the residue is by this means demonstrated to be important by activity impairment, or knockout, then conservative substitutions can be made, such as Asp for Glu to alter side chain length, Ser for Cys, or Arg for His. For hydrophobic segments, it is largely size that will be altered, although aromatics can also be substituted for alkyl side chains. Changes in the normal product distribution can indicate which step(s) of the reaction sequence have been altered by the mutation.
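The comparison of Km and kcat between native and mutagenized enzyme can be sketched numerically. The sketch below assumes simple Michaelis-Menten behavior, which the text above does not itself specify; the function names and all parameter values are hypothetical and serve only to show how a binding defect (raised Km) is distinguished from a catalytic defect (lowered kcat).

```python
def initial_rate(substrate: float, k_cat: float, k_m: float, enzyme: float) -> float:
    """Michaelis-Menten initial rate: v = k_cat * [E] * [S] / (K_m + [S])."""
    return k_cat * enzyme * substrate / (k_m + substrate)


def compare_to_native(native: dict, mutant: dict) -> dict:
    """Return mutant/native ratios for K_m, k_cat and k_cat/K_m.

    Both arguments are dicts with keys 'k_m' and 'k_cat' in consistent units.
    """
    native_eff = native["k_cat"] / native["k_m"]
    mutant_eff = mutant["k_cat"] / mutant["k_m"]
    return {
        "k_m_ratio": mutant["k_m"] / native["k_m"],
        "k_cat_ratio": mutant["k_cat"] / native["k_cat"],
        "efficiency_ratio": mutant_eff / native_eff,
    }


if __name__ == "__main__":
    # Hypothetical values: the mutant binds substrate ten-fold more weakly
    # but turns over bound substrate at the native rate.
    native = {"k_m": 10.0, "k_cat": 2.0}
    mutant = {"k_m": 100.0, "k_cat": 2.0}
    print(compare_to_native(native, mutant))
```

A mutant with an unchanged kcat ratio and a large Km ratio points to altered binding, whereas the reverse pattern points to altered catalysis per se.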
Other site directed mutagenesis techniques may also be employed with the nucleotide sequences of the invention. For example, restriction endonuclease digestion of DNA followed by ligation may be used to generate dirigent protein or pinoresinol/lariciresinol reductase deletion variants, as described in Section 15.3 of Sambrook et al. (Molecular Cloning: A Laboratory Manual, 2nd Ed., Cold Spring Harbor Laboratory Press, New York, NY (1989)). A similar strategy may be used to construct insertion variants, as described in Section 15.3 of Sambrook et al., supra.
Oligonucleotide-directed mutagenesis may also be employed for preparing substitution variants of this invention. It may also be used to conveniently prepare the deletion and insertion variants of this invention. This technique is well known in the art as described by Adelman et al. (DNA 2:183 (1983)). Generally, oligonucleotides of at least 25 nucleotides in length are used to insert, delete or substitute two or more nucleotides in the dirigent protein gene or pinoresinol/lariciresinol reductase gene. An optimal oligonucleotide will have 12 to 15 perfectly matched nucleotides on either side of the nucleotides coding for the mutation.
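The layout of such an oligonucleotide, with perfectly matched flanks on either side of the altered codon, can be sketched as follows. The sketch is illustrative only: the function name `mutagenic_oligo`, its parameters and the example template are hypothetical, the oligonucleotide is written in the sense of the sequence supplied, and any reverse-complementing needed to anneal to a particular single-stranded template is not handled.

```python
def mutagenic_oligo(template: str, codon_start: int, new_codon: str, flank: int = 15) -> str:
    """Build an oligonucleotide that replaces one codon of the template.

    template    -- DNA sequence containing the codon to be changed
    codon_start -- 0-based index of the first base of that codon
    new_codon   -- three-base replacement codon
    flank       -- perfectly matched bases kept on each side (12 to 15 suggested)
    """
    if len(new_codon) != 3:
        raise ValueError("new_codon must be exactly three bases")
    left = codon_start - flank
    right = codon_start + 3 + flank
    if left < 0 or right > len(template):
        raise ValueError("template too short for the requested flanks")
    return template[left:codon_start] + new_codon.upper() + template[codon_start + 3:right]


if __name__ == "__main__":
    # Hypothetical template: a GAA (Glu) codon changed to GAT (Asp).
    template = "ATGGCTTCTAAGCTT" + "GAA" + "CCGGTTCATGGTACC"
    oligo = mutagenic_oligo(template, 15, "GAT")
    print(oligo, len(oligo))
```

With flanks of 12 to 15 bases and a three-base change, the resulting oligonucleotide is 27 to 33 nucleotides long, which satisfies the 25-nucleotide minimum stated above.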
To mutagenize the wild-type dirigent protein or wild-type pinoresinol/lariciresinol reductase, the oligonucleotide is annealed to the single-stranded DNA template molecule under suitable hybridization conditions. A DNA polymerizing enzyme, usually the Klenow fragment of E. coli DNA polymerase I, is then added. This enzyme uses the oligonucleotide as a primer to complete the synthesis of the mutation-bearing strand of DNA. Thus, a heteroduplex molecule is formed such that one strand of DNA encodes the wild-type dirigent protein or pinoresinol/lariciresinol reductase inserted in the vector, and the second strand of DNA encodes the mutated form of dirigent protein or pinoresinol/lariciresinol reductase inserted into the same vector. This heteroduplex molecule is then transformed into a suitable host cell.
Mutants with more than one amino acid substituted may be generated in one of several ways. If the amino acids are located close together in the polypeptide chain, they may be mutated simultaneously using one oligonucleotide that codes for all of the desired amino acid substitutions. If, however, the amino acids are located some distance from each other (separated by more than ten amino acids, for example) it is more difficult to generate a single oligonucleotide that encodes all of the desired changes. Instead, one of two alternative methods may be employed. In the first method, a separate oligonucleotide is generated for each amino acid to be substituted. The oligonucleotides are then annealed to the single-stranded template DNA simultaneously, and the second strand of DNA that is synthesized from the template will encode all of the desired amino acid substitutions.
An alternative method involves two or more rounds of mutagenesis to produce the desired mutant. The first round is as described for the single mutants: wild-type dirigent protein or pinoresinol/lariciresinol reductase DNA is used for the template, an oligonucleotide encoding the first desired amino acid substitution(s) is annealed to this template, and the heteroduplex DNA molecule is then generated. The second round of mutagenesis utilizes the mutated DNA produced in the first round of mutagenesis as the template. Thus, this template already contains one or more mutations. The oligonucleotide encoding the additional desired amino acid substitution(s) is then annealed to this template, and the resulting strand of DNA now encodes mutations from both the first and second rounds of mutagenesis. This resultant DNA can be used as a template in a third round of mutagenesis, and so on.
Eukaryotic expression systems may be utilized for dirigent protein or pinoresinol/lariciresinol reductase production since they are capable of carrying out any required posttranslational modifications and of directing the enzyme to the proper membrane location. A representative eukaryotic expression system for this purpose uses the recombinant baculovirus, Autographa californica nuclear polyhedrosis virus (AcNPV; M.D. Summers and G.E. Smith, A Manual of Methods for Baculovirus Vectors and Insect Cell Culture Procedures (1986); Luckow et al., Bio/Technology 6:47-55 (1987)) for expression of the dirigent protein or pinoresinol/lariciresinol reductases of the invention. Infection of insect cells (such as cells of the species Spodoptera frugiperda) with the recombinant baculoviruses allows for the production of large amounts of the dirigent protein or pinoresinol/lariciresinol reductase protein. In addition, the baculovirus system has other important advantages for the production of recombinant dirigent protein or pinoresinol/lariciresinol reductase. For example, baculoviruses do not infect humans and can therefore be safely handled in large quantities. In the baculovirus system, a DNA construct is prepared including a DNA segment encoding dirigent protein or pinoresinol/lariciresinol reductase and a vector. The vector may comprise the polyhedron gene promoter region of a baculovirus, the baculovirus flanking sequences necessary for proper cross-over during recombination (the flanking sequences comprise about 200-300 base pairs adjacent to the promoter sequence) and a bacterial origin of replication which permits the construct to replicate in bacteria.
The vector is constructed so that (i) the DNA segment is placed adjacent (or operably-linked or "downstream" or "under the control of") to the polyhedron gene promoter and (ii) the promoter/pinoresinol/lariciresinol reductase, or promoter/dirigent protein, combination is flanked on both sides by 200-300 base pairs of baculovirus DNA (the flanking sequences).
To produce a dirigent protein DNA construct, or a pinoresinol/lariciresinol reductase DNA construct, a cDNA clone encoding a full length dirigent protein or pinoresinol/lariciresinol reductase is obtained using methods such as those described herein. The DNA construct is contacted in a host cell with baculovirus DNA of an appropriate baculovirus (that is, of the same species of baculovirus as the promoter encoded in the construct) under conditions such that recombination is effected. The resulting recombinant baculoviruses encode the full dirigent protein or pinoresinol/lariciresinol reductase. For example, an insect host cell can be cotransfected or transfected separately with the DNA construct and a functional baculovirus. Resulting recombinant baculoviruses can then be isolated and used to infect cells to effect production of dirigent protein or pinoresinol/lariciresinol reductase. Host insect cells include, for example, Spodoptera frugiperda cells.
Insect host cells infected with a recombinant baculovirus of the present invention are then cultured under conditions allowing expression of the baculovirus-encoded dirigent protein or pinoresinol/lariciresinol reductase. Recombinant protein thus produced is then extracted from the cells using methods known in the art.

Other eukaryotic microbes such as yeasts may also be used to practice this invention. The baker's yeast Saccharomyces cerevisiae is a commonly used yeast, although several other strains are available. The plasmid YRp7 (Stinchcomb et al., Nature 282:39 (1979); Kingsman et al., Gene 7:141 (1979); Tschemper et al., Gene 10:157 (1980)) is commonly used as an expression vector in Saccharomyces. This plasmid contains the trp1 gene that provides a selection marker for a mutant strain of yeast lacking the ability to grow in the absence of tryptophan, such as strains ATCC No. 44,076 and PEP4-1 (Jones, Genetics 85:12 (1977)). The presence of the trp1 lesion as a characteristic of the yeast host cell genome then provides an effective environment for detecting transformation by growth in the absence of tryptophan. Yeast host cells are generally transformed using the polyethylene glycol method, as described by Hinnen (Proc. Natl. Acad. Sci. USA 75:1929 (1978)). Additional yeast transformation protocols are set forth in Gietz et al., N.A.R. 20(17):1425 (1992); Reeves et al., FEMS 99:193-197 (1992).
Suitable promoting sequences in yeast vectors include the promoters for 3-phosphoglycerate kinase (Hitzeman et al., J. Biol. Chem. 255:2073 (1980)) or other glycolytic enzymes (Hess et al., J. Adv. Enzyme Reg. 7:149 (1968); Holland et al., Biochemistry 17:4900 (1978)), such as enolase, glyceraldehyde-3-phosphate dehydrogenase, hexokinase, pyruvate decarboxylase, phosphofructokinase, glucose-6-phosphate isomerase, 3-phosphoglycerate mutase, pyruvate kinase, triose-phosphate isomerase, phosphoglucose isomerase, and glucokinase. In the construction of suitable expression plasmids, the termination sequences associated with these genes are also ligated into the expression vector 3' of the sequence desired to be expressed to provide polyadenylation of the mRNA and termination. Other promoters that have the additional advantage of transcription controlled by growth conditions are the promoter region for alcohol dehydrogenase 2, isocytochrome C, acid phosphatase, degradative enzymes associated with nitrogen metabolism, and the aforementioned glyceraldehyde-3-phosphate dehydrogenase, and enzymes responsible for maltose and galactose utilization. Any plasmid vector containing yeast-compatible promoter, origin of replication and termination sequences is suitable.
Cell cultures derived from multicellular organisms, such as plants, may be used as hosts to practice this invention. Transgenic plants can be obtained, for example, by transferring plasmids that encode pinoresinol/lariciresinol reductase, and/or dirigent protein, and a selectable marker gene, e.g., the kan gene encoding resistance to kanamycin, into Agrobacterium tumefaciens containing a helper Ti plasmid as described in Hoekema et al., Nature 303:179-181 (1983) and culturing the Agrobacterium cells with leaf slices of the plant to be transformed as described by An et al., Plant Physiology 81:301-305 (1986). Transformation of cultured plant host cells is normally accomplished through Agrobacterium tumefaciens, as described above. Cultures of mammalian host cells and other host cells that do not have rigid cell membrane barriers are usually transformed using the calcium phosphate method as originally described by Graham and Van der Eb (Virology 52:546 (1978)) and modified as described in Sections 16.32-16.37 of Sambrook et al., supra. However, other methods for introducing DNA into cells such as Polybrene (Kawai and Nishizawa, Mol. Cell. Biol. 4:1172 (1984)), protoplast fusion (Schaffner, Proc. Natl. Acad. Sci. USA 77:2163 (1980)), electroporation (Neumann et al., EMBO J. 1: (1982)), and direct microinjection into nuclei (Capecchi, Cell 22:479 (1980)) may also be used. Additionally, animal transformation strategies are reviewed in Monastersky G.M. and Robl, J.M., Strategies in Transgenic Animal Science, ASM Press, Washington, D.C. (1995). Transformed plant calli may be selected through the selectable marker by growing the cells on a medium containing, e.g., kanamycin, and appropriate amounts of phytohormone such as naphthalene acetic acid and benzyladenine for callus and shoot induction. The plant cells may then be regenerated and the resulting plants transferred to soil using techniques well known to those skilled in the art.
In addition, a gene regulating pinoresinol/lariciresinol reductase production, or dirigent protein production, can be incorporated into the plant along with a necessary promoter which is inducible. In the practice of this embodiment of the invention, a promoter that only responds to a specific external or internal stimulus is fused to the target cDNA. Thus, the gene will not be transcribed except in response to the specific stimulus. As long as the gene is not being transcribed, its gene product is not produced.
An illustrative example of a responsive promoter system that can be used in the practice of this invention is the glutathione-S-transferase (GST) system in maize.
GSTs are a family of enzymes that can detoxify a number of hydrophobic electrophilic compounds that often are used as pre-emergent herbicides (Weigand et al., Plant Molecular Biology 7:235-243 (1986)). Studies have shown that the GSTs are directly involved in causing this enhanced herbicide tolerance.
This action is primarily mediated through a specific 1.1 kb mRNA transcription product. In short, maize has a naturally occurring quiescent gene already present that can respond to external stimuli and that can be induced to produce a gene product.
This gene has previously been identified and cloned. Thus, in one embodiment of this invention, the promoter is removed from the GST responsive gene and attached to a pinoresinol/lariciresinol reductase gene, or a dirigent protein gene, that previously has had its native promoter removed. This engineered gene is the combination of a promoter that responds to an external chemical stimulus and a gene responsible for successful production of pinoresinol/lariciresinol reductase or dirigent protein.
In addition to the methods described above, several methods are known in the art for transferring cloned DNA into a wide variety of plant species, including gymnosperms, angiosperms, monocots and dicots (see, e.g., Glick and Thompson, eds., Methods in Plant Molecular Biology, CRC Press, Boca Raton, Florida (1993)).
Representative examples include electroporation-facilitated DNA uptake by protoplasts (Rhodes et al., Science 240(4849):204-207 (1988)); treatment of protoplasts with polyethylene glycol (Lyznik et al., Plant Molecular Biology 13:151-161 (1989)); and bombardment of cells with DNA laden microprojectiles (Klein et al., Plant Physiol. 91:440-444 (1989) and Boynton et al., Science 240(4858):1534-1538 (1988)). Numerous methods now exist, for example, for the transformation of cereal crops (see, e.g., McKinnon, G.E. and Henry, R.J., J. Cereal Science, 22(3):203-210 (1995); Mendel, R.R. and Teeri, T.H., Plant and Microbial Biotechnology Research Series, 3:81-98, Cambridge University Press (1995); McElroy, D. and Brettell, R.I.S., Trends in Biotechnology, 12(2):62-68 (1994); Christou et al., Trends in Biotechnology, 10(7):239-246 (1992); Christou, P. and Ford, T.L., Annals of Botany, 75(5):449-454 (1995); Park et al., Plant Molecular Biology, 32(6):1135-1148 (1996); Altpeter et al., Plant Cell Reports, 16:12- (1996)). Additionally, plant transformation strategies and techniques are reviewed in Birch, R.G., Ann Rev Plant Phys Plant Mol Biol 48:297 (1997); Forester et al., Exp. Agric. 33:15-33 (1997). Minor variations make these technologies applicable to a broad range of plant species.
Each of these techniques has advantages and disadvantages. In each of the techniques, DNA from a plasmid is genetically engineered such that it contains not only the gene of interest, but also selectable and screenable marker genes. A selectable marker gene is used to select only those cells that have integrated copies of the plasmid (the construction is such that the gene of interest and the selectable and screenable genes are transferred as a unit). The screenable gene provides another check for the successful culturing of only those cells carrying the genes of interest. A commonly used selectable marker gene is neomycin phosphotransferase II (NPT II). This gene conveys resistance to kanamycin, a compound that can be added directly to the growth media on which the cells grow. Plant cells are normally susceptible to kanamycin and, as a result, die. The presence of the NPT II gene overcomes the effects of the kanamycin and each cell with this gene remains viable. Another selectable marker gene which can be employed in the practice of this invention is the gene which confers resistance to the herbicide glufosinate (Basta). A screenable gene commonly used is the β-glucuronidase gene (GUS). The presence of this gene is characterized using a histochemical reaction in which a sample of putatively transformed cells is treated with a GUS assay solution. After an appropriate incubation, the cells containing the GUS gene turn blue. Preferably, the plasmid will contain both selectable and screenable marker genes.
The plasmid containing one or more of these genes is introduced into either plant protoplasts or callus cells by any of the previously mentioned techniques. If the marker gene is a selectable gene, only those cells that have incorporated the DNA package survive under selection with the appropriate phytotoxic agent. Once the appropriate cells are identified and propagated, plants are regenerated. Progeny from the transformed plants must be tested to ensure that the DNA package has been successfully integrated into the plant genome.
Mammalian host cells may also be used in the practice of the invention. Examples of suitable mammalian cell lines include monkey kidney CV1 line transformed by SV40 (COS-7, ATCC CRL 1651); human embryonic kidney line 293S (Graham et al., J. Gen. Virol. 36:59 (1977)); baby hamster kidney cells (BHK, ATCC CCL 10); Chinese hamster ovary cells (Urlaub and Chasin, Proc. Natl. Acad. Sci. USA 77:4216 (1980)); mouse sertoli cells (TM4, Mather, Biol. Reprod. 23:243 (1980)); monkey kidney cells (CV1-76, ATCC CCL 70); African green monkey kidney cells (VERO-76, ATCC CRL-1587); human cervical carcinoma cells (HELA, ATCC CCL 2); canine kidney cells (MDCK, ATCC CCL 34); buffalo rat liver cells (BRL 3A, ATCC CRL 1442); human lung cells (WI38, ATCC CCL 75); human liver cells (Hep G2, HB 8065); mouse mammary tumor cells (MMT 060562, ATCC CCL 51); rat hepatoma cells (HTC, ML54, Baumann et al., J. Cell Biol. 85:1 (1980)); and TRI cells (Mather et al., Annals N. Y. Acad. Sci. 383:44 (1982)). Expression vectors for these cells ordinarily include (if necessary) DNA sequences for an origin of replication, a promoter located in front of the gene to be expressed, a ribosome binding site, an RNA splice site, a polyadenylation site, and a transcription terminator site.
Promoters used in mammalian expression vectors are often of viral origin. These viral promoters are commonly derived from polyoma virus, Adenovirus 2, and most frequently Simian Virus 40 (SV40). The SV40 virus contains two promoters that are termed the early and late promoters. These promoters are particularly useful because they are both easily obtained from the virus as one DNA fragment that also contains the viral origin of replication (Fiers et al., Nature 273:113 (1978)). Smaller or larger SV40 DNA fragments may also be used, provided they contain the approximately 250-bp sequence extending from the HindIII site toward the BglI site located in the viral origin of replication.
Alternatively, promoters that are naturally associated with the foreign gene (homologous promoters) may be used provided that they are compatible with the host cell line selected for transformation.
An origin of replication may be obtained from an exogenous source, such as SV40 or other virus (e.g., Polyoma, Adeno, VSV, BPV) and inserted into the cloning vector. Alternatively, the origin of replication may be provided by the host cell chromosomal replication mechanism. If the vector containing the foreign gene is integrated into the host cell chromosome, the latter is often sufficient.
The use of a secondary DNA coding sequence can enhance production levels of pinoresinol/lariciresinol reductase or dirigent protein in transformed cell lines.
The secondary coding sequence typically comprises the enzyme dihydrofolate reductase (DHFR). The wild-type form of DHFR is normally inhibited by the chemical methotrexate (MTX). The level of DHFR expression in a cell will vary depending on the amount of MTX added to the cultured host cells. An additional feature of DHFR that makes it particularly useful as a secondary sequence is that it can be used as a selection marker to identify transformed cells. Two forms of DHFR
are available for use as secondary sequences, wild-type DHFR and MTX-resistant DHFR. The type of DHFR used in a particular host cell depends on whether the host cell is DHFR deficient (such that it either produces very low levels of DHFR
endogenously, or it does not produce functional DHFR at all). DHFR-deficient cell lines such as the CHO cell line described by Urlaub and Chasin, supra, are transformed with wild-type DHFR coding sequences. After transformation, these DHFR-deficient cell lines express functional DHFR and are capable of growing in a culture medium lacking the nutrients hypoxanthine, glycine and thymidine.
Nontransformed cells will not survive in this medium.
The MTX-resistant form of DHFR can be used as a means of selecting for transformed host cells in those host cells that endogenously produce normal amounts of functional DHFR that is MTX sensitive. The CHO-K1 cell line (ATCC No. CL 61) possesses these characteristics, and is thus a useful cell line for this purpose. The addition of MTX to the cell culture medium will permit only those cells transformed with the DNA encoding the MTX-resistant DHFR to grow. The nontransformed cells will be unable to survive in this medium.
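The choice between the two DHFR forms described above can be summarized as a simple decision rule. The following Python sketch is illustrative only; the class and function names are hypothetical and are not part of the disclosure.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SelectionPlan:
    dhfr_form: str          # DHFR coding sequence carried by the vector
    selective_medium: str   # medium in which only transformed cells grow


def choose_dhfr_selection(host_is_dhfr_deficient: bool) -> SelectionPlan:
    """Pick the DHFR form and selective medium for a given host cell line."""
    if host_is_dhfr_deficient:
        # DHFR-deficient hosts are rescued by wild-type DHFR.
        return SelectionPlan(
            dhfr_form="wild-type DHFR",
            selective_medium="medium lacking hypoxanthine, glycine and thymidine",
        )
    # Hosts with normal, MTX-sensitive DHFR need the MTX-resistant form.
    return SelectionPlan(
        dhfr_form="MTX-resistant DHFR",
        selective_medium="medium containing methotrexate (MTX)",
    )


if __name__ == "__main__":
    print(choose_dhfr_selection(True))
    print(choose_dhfr_selection(False))
```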
Prokaryotes may also be used as host cells for the initial cloning steps of this invention. They are particularly useful for rapid production of large amounts of DNA, for production of single-stranded DNA templates used for site-directed mutagenesis, for screening many mutants simultaneously, and for DNA sequencing of the mutants generated. Suitable prokaryotic host cells include E. coli K12 strain 294 (ATCC No. 31,446), E. coli strain W3110 (ATCC No. 27,325), E. coli X1776 (ATCC No. 31,537), and E. coli B; however, many other strains of E. coli, such as HB101, JM101, NM522, NM538, NM539, and many other species and genera of prokaryotes including bacilli such as Bacillus subtilis, other enterobacteriaceae such as Salmonella typhimurium or Serratia marcescens, and various Pseudomonas species may all be used as hosts. Prokaryotic host cells or other host cells with rigid cell walls are preferably transformed using the calcium chloride method as described in section 1.82 of Sambrook et al., supra. Alternatively, electroporation may be used for transformation of these cells. Prokaryote transformation techniques are set forth in Dower, W. J., in Genetic Engineering, Principles and Methods, 12:275-296, Plenum Publishing Corp. (1990); Hanahan et al., Meth. Enzymol., 204:63 (1991).
As a representative example, cDNA sequences encoding dirigent protein or pinoresinol/lariciresinol reductase may be transferred to the (His)6•Tag pET vector (commercially available from Novagen) for overexpression in E. coli as heterologous host. This pET expression plasmid has several advantages in high level heterologous expression systems. The desired cDNA insert is ligated in frame to plasmid vector sequences encoding six histidines followed by a highly specific protease recognition site (thrombin) that are joined to the amino terminus codon of the target protein. The histidine "block" of the expressed fusion protein promotes very tight binding to immobilized metal ions and permits rapid purification of the recombinant protein by immobilized metal ion affinity chromatography. The histidine leader sequence is then cleaved at the specific proteolysis site by treatment of the purified protein with thrombin, and the dirigent protein or pinoresinol/lariciresinol reductase eluted. This overexpression-purification system has high capacity, excellent resolving power and is fast, and the chance of a contaminating E. coli protein exhibiting similar binding behavior (before and after thrombin proteolysis) is extremely small.
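The layout of such a fusion protein, and the effect of thrombin treatment, can be sketched as follows. This is a simplified illustration under stated assumptions: the leader is modeled as an initiator methionine, six histidines and the thrombin recognition sequence LVPRGS (thrombin cleaving between R and G), any additional vector-encoded linker residues are omitted, and the target sequence is a hypothetical placeholder rather than a sequence of the invention.

```python
from typing import Tuple

HIS_TAG = "H" * 6            # six histidines
THROMBIN_SITE = "LVPRGS"     # thrombin cleaves between R and G
CUT_OFFSET = 4               # number of site residues left on the leader (LVPR)


def build_fusion(target: str) -> str:
    """Join an N-terminal His leader and thrombin site to the target protein."""
    return "M" + HIS_TAG + THROMBIN_SITE + target


def thrombin_cleave(fusion: str) -> Tuple[str, str]:
    """Return (histidine leader, released protein) after thrombin cleavage."""
    site_start = fusion.find(THROMBIN_SITE)
    if site_start < 0:
        raise ValueError("no thrombin recognition site found")
    cut = site_start + CUT_OFFSET
    return fusion[:cut], fusion[cut:]


if __name__ == "__main__":
    hypothetical_target = "MKTAYIAK"  # placeholder, not a real sequence
    leader, released = thrombin_cleave(build_fusion(hypothetical_target))
    print(leader, released)
```

In this simplified model the released protein retains the two residues (GS) that follow the cleavage point.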
As will be apparent to those skilled in the art, any plasmid vectors containing replicon and control sequences that are derived from species compatible with the host cell may also be used in the practice of the invention. The vector usually has a replication site, marker genes that provide phenotypic selection in transformed cells, one or more promoters, and a polylinker region containing several restriction sites for insertion of foreign DNA. Plasmids typically used for transformation of E. coli include pBR322, pUC18, pUC19, pUC118, pUC119, and Bluescript M13, all of which are described in Sections 1.12-1.20 of Sambrook et al., supra. However, many other suitable vectors are available as well. These vectors contain genes coding for ampicillin and/or tetracycline resistance which enable cells transformed with these vectors to grow in the presence of these antibiotics.
The promoters most commonly used in prokaryotic vectors include the β-lactamase (penicillinase) and lactose promoter systems (Chang et al., Nature 275:615 (1978); Itakura et al., Science 198:1056 (1977); Goeddel et al., Nature 281:544 (1979)) and a tryptophan (trp) promoter system (Goeddel et al., Nucl. Acids Res. 8:4057 (1980); EPO Appl. Publ. No. 36,776), and the alkaline phosphatase systems. While these are the most commonly used, other microbial promoters have been utilized, and details concerning their nucleotide sequences have been published, enabling a skilled worker to ligate them functionally into plasmid vectors (see Siebenlist et al., Cell 20:269 (1980)).
Many eukaryotic proteins normally secreted from the cell contain an endogenous secretion signal sequence as part of the amino acid sequence. Thus, proteins normally found in the cytoplasm can be targeted for secretion by linking a signal sequence to the protein. This is readily accomplished by ligating DNA encoding a signal sequence to the 5' end of the DNA encoding the protein and then expressing this fusion protein in an appropriate host cell. The DNA encoding the signal sequence may be obtained as a restriction fragment from any gene encoding a protein with a signal sequence. Thus, prokaryotic, yeast, and eukaryotic signal sequences may be used herein, depending on the type of host cell utilized to practice the invention. The DNA and amino acid sequence encoding the signal sequence portion of several eukaryotic genes including, for example, human growth hormone, proinsulin, and proalbumin are known (see Stryer, Biochemistry, W.H. Freeman and Company, New York, NY, p. 769 (1988)), and can be used as signal sequences in appropriate eukaryotic host cells. Yeast signal sequences, as for example acid phosphatase (Arima et al., Nucleic Acids Res. 11:1657 (1983)), alpha-factor, alkaline phosphatase and invertase may be used to direct secretion from yeast host cells. Prokaryotic signal sequences from genes encoding, for example, LamB or OmpF (Wong et al., Gene 68:193 (1988)), MalE, PhoA, or beta-lactamase, as well as other genes, may be used to target proteins from prokaryotic cells into the culture medium.
Trafficking sequences from plants, animals and microbes can be employed in the practice of the invention to direct the gene product to the cytoplasm, endoplasmic reticulum, mitochondria or other cellular components, or to target the protein for export to the medium. These considerations apply to the overexpression of pinoresinol/lariciresinol reductase or dirigent protein, and to direction of expression within cells or intact organisms to permit gene product function in any desired location.
Suitable vectors containing DNA encoding replication sequences, regulatory sequences, phenotypic selection genes and the dirigent protein DNA or pinoresinol/lariciresinol reductase DNA of interest are constructed using standard recombinant DNA procedures. Isolated plasmids and DNA fragments are cleaved, tailored, and ligated together in a specific order to generate the desired vectors, as is well known in the art (see, for example, Sambrook et al., supra).
As discussed above, pinoresinol/lariciresinol reductase variants, or dirigent protein variants, are preferably produced by means of mutation(s) that are generated using the method of site-specific mutagenesis. This method requires the synthesis and use of specific oligonucleotides that encode both the sequence of the desired mutation and a sufficient number of adjacent nucleotides to allow the oligonucleotide to stably hybridize to the DNA template.
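The structure of such a mutagenic oligonucleotide (the altered codon flanked on each side by template-matching nucleotides) can be illustrated with a minimal Python sketch. The function name, the default flank length and the template sequence are hypothetical and chosen only for illustration; a working design would also need to consider strand orientation and melting temperature, which are not modeled here.

```python
def design_mutagenic_oligo(template: str, codon_start: int,
                           new_codon: str, flank: int = 12) -> str:
    """Return an oligo carrying new_codon with `flank` matching bases per side.

    codon_start is the 0-based index of the first base of the codon to replace.
    """
    if len(new_codon) != 3:
        raise ValueError("new_codon must be exactly three nucleotides")
    left_start = codon_start - flank
    right_end = codon_start + 3 + flank
    if left_start < 0 or right_end > len(template):
        raise ValueError("not enough flanking sequence in the template")
    left = template[left_start:codon_start]        # matches template upstream
    right = template[codon_start + 3:right_end]    # matches template downstream
    return left + new_codon + right


if __name__ == "__main__":
    # Hypothetical 39-nt template; replace the codon CTG at index 18 with GCG.
    template = "ATGGCTAGCAAAGGTGAACTGTTCACCGGTGTTGTTCCG"
    print(design_mutagenic_oligo(template, 18, "GCG", flank=9))
```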
A dirigent protein gene and/or pinoresinol/lariciresinol reductase gene, or an antisense nucleic acid fragment complementary to all or part of a dirigent protein gene or pinoresinol/lariciresinol reductase gene, may be introduced, as appropriate, into any plant species for a variety of purposes including, but not limited to: altering or improving the color, texture, durability and pest-resistance of wood tissue, especially heartwood tissue; reducing the formation of lignans and/or lignins in plant species, such as corn, which are useful as animal fodder, thereby enhancing the availability of the cellulose fraction of the plant material to the digestive system of animals ingesting the plant material; reducing the lignan/lignin content of plant species utilized in pulp and paper production, thereby making pulp and paper production easier and cheaper; improving the defensive capability of a plant against predators and pathogens by enhancing the production of defensive lignans or lignins;
the alteration of other ecological interactions mediated by lignans or lignins;
producing elevated levels of optically-pure lignan enantiomers as medicines or food additives; introducing, enhancing or inhibiting the production of dirigent proteins or pinoresinol/lariciresinol reductases, or the production of pinoresinol or lariciresinol and their derivatives. A dirigent protein and/or pinoresinol/lariciresinol reductase gene may be introduced into any organism for a variety of purposes including, but not limited to: introducing, enhancing or inhibiting the production of dirigent protein and/or pinoresinol/lariciresinol reductase, or the production of pinoresinol or lariciresinol and their derivatives.
The foregoing may be more fully understood in connection with the following representative examples, in which "Plasmids" are designated by a lower case p followed by an alphanumeric designation. The starting plasmids used in this invention are either commercially available, publicly available on an unrestricted basis, or can be constructed from such available plasmids using published procedures. In addition, other equivalent plasmids are known in the art and will be apparent to the ordinary artisan.
"Digestion", "cutting" or "cleaving" of DNA refers to catalytic cleavage of the DNA with an enzyme that acts only at particular locations in the DNA.
These enzymes are called restriction endonucleases, and the site along the DNA
sequence where each enzyme cleaves is called a restriction site. The restriction enzymes used in this invention are commercially available and are used according to the instructions supplied by the manufacturers. (See also Sections 1.60-1.61 and Sections 3.38-3.39 of Sambrook et al., supra.)
"Recovery" or "isolation" of a given fragment of DNA from a restriction digest means separation of the resulting DNA fragment on a polyacrylamide or an agarose gel by electrophoresis, identification of the fragment of interest by comparison of its mobility versus that of marker DNA fragments of known molecular weight, removal of the gel section containing the desired fragment, and separation of the gel from DNA. This procedure is known generally. For example, see Lawn et al. (Nucleic Acids Res. 9:6103-6114 (1982)), and Goeddel et al. (Nucleic Acids Res., supra).
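Digestion of a linear DNA molecule and the resulting fragment sizes can be sketched in a few lines of Python. This is a minimal illustration, not a replacement for the cited procedures: only two well-known enzymes are included (HindIII, A^AGCTT, and EcoRI, G^AATTC), only the top strand is modeled, and the example sequence is hypothetical.

```python
from typing import Iterable, List

# Recognition sequence and top-strand cut offset within the site.
RESTRICTION_ENZYMES = {
    "HindIII": ("AAGCTT", 1),  # A^AGCTT
    "EcoRI": ("GAATTC", 1),    # G^AATTC
}


def digest(seq: str, enzymes: Iterable[str]) -> List[str]:
    """Cut a linear sequence with the named enzymes; return fragments in order."""
    cuts = set()
    for name in enzymes:
        site, offset = RESTRICTION_ENZYMES[name]
        start = seq.find(site)
        while start != -1:
            cuts.add(start + offset)
            start = seq.find(site, start + 1)
    bounds = [0] + sorted(cuts) + [len(seq)]
    return [seq[a:b] for a, b in zip(bounds, bounds[1:])]


if __name__ == "__main__":
    # Hypothetical 23-nt sequence with one HindIII and one EcoRI site.
    example = "GGG" + "AAGCTT" + "CCCC" + "GAATTC" + "TTTT"
    fragments = digest(example, ["HindIII", "EcoRI"])
    # Listing fragments largest first mirrors their order on a gel.
    print(sorted((len(f) for f in fragments), reverse=True))
```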
The following examples merely illustrate the best mode now contemplated for practicing the invention, but should not be construed to limit the invention. All literature citations herein are expressly incorporated by reference.

Purification of Dirigent Protein from Forsythia intermedia
Plant Materials. Forsythia intermedia plants were either obtained from Bailey's Nursery (var. Lynwood Gold, St. Paul, MN) and maintained in Washington State University greenhouse facilities, or were gifts from the local community.
Initial Extraction and Ammonium Sulphate Precipitation. Solubilization of bound proteins was carried out at 4°C. Frozen Forsythia intermedia stems (2 kg) were pulverized in a Waring Blendor (Model CB6) in the presence of liquid nitrogen. The resulting powder was homogenized with 0.1 M KH2PO4-K2HPO4 buffer (pH 7.0, 4 liters) containing 5 mM dithiothreitol, and filtered through four layers of cheesecloth. The insoluble residue was consecutively extracted, with continuous agitation at 250 rpm, as follows: with chilled (-20°C) re-distilled acetone (4 liters, 3 x 30 min); 0.1 M KH2PO4-K2HPO4 buffer (pH 6.5) containing 0.1% β-mercaptoethanol (solution A, 8 liters, 30 min); solution A containing 1% Triton X100 (8 liters, 4 hours) and finally solution A (8 liters, 16 hours). Between each extraction, the residue was filtered through one layer of Miracloth (Calbiochem). Solubilization of the (+)-pinoresinol forming system was achieved by mechanically stirring the residue in solution A containing 1 M NaCl (8 liters, 4 hours). The homogenate was decanted and the resulting solution consecutively filtered through Miracloth (Calbiochem) and glass fiber (G6, Fisher Sci.). The filtrate was concentrated in an Amicon cell (Model 2000; YM 30 membrane) to a final volume of 800 ml, and subjected to (NH4)2SO4 fractionation. Proteins precipitating between 40 and 80% saturation were recovered by centrifugation (15,000g, 30 min) and the (NH4)2SO4 pellet stored at -20°C until required.
Mono S Column Chromatography. Purification of 78-kD dirigent protein and partial purification of oxidase. The ammonium sulfate pellet (obtained from 2 kg of F. intermedia stems) was reconstituted in 40 mM MES [2-(N-Morpholino)ethanesulfonic acid] buffer, adjusted to pH 5.0 with 6 M NaOH (solution B, 30 ml), the slurry being centrifuged (3,600g, 5 min), and the supernatant dialyzed overnight against solution B (4 liters). The dialyzed extract was filtered (0.22 µm) and the sample (35 to 40 mg proteins) was applied to a MonoS HR5/5 (50 mm by 5 mm) column equilibrated in solution B at 4°C. After eluting (flow rate ml min-1 cm-2) with solution B (13 ml), proteins were desorbed with Na2SO4 in solution B, using a linear gradient from 0 to 100 mM in 8 ml and holding at this concentration for 32 ml, then implementing a series of step gradients at 133 mM for 50 ml, 166 mM for 50 ml, 200 mM for 40 ml, 233 mM for 40 ml and finally 333 mM Na2SO4 for 40 ml. Fractions capable of forming (+)-pinoresinol from E-coniferyl alcohol were eluted with 333 mM Na2SO4, combined and stored (-80°C) until needed.
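The Mono S elution program described above (a 13 ml wash, a linear ramp, a hold, and five salt steps) can be written as a table of segments, from which the salt concentration at any elution volume follows. The Python sketch below is illustrative; the names are hypothetical, and the volumes and concentrations are those stated in the text.

```python
# Each segment: (volume in ml, starting Na2SO4 in mM, ending Na2SO4 in mM).
MONO_S_PROGRAM = [
    (13.0, 0.0, 0.0),      # wash with solution B
    (8.0, 0.0, 100.0),     # linear gradient
    (32.0, 100.0, 100.0),  # hold
    (50.0, 133.0, 133.0),
    (50.0, 166.0, 166.0),
    (40.0, 200.0, 200.0),
    (40.0, 233.0, 233.0),
    (40.0, 333.0, 333.0),  # (+)-pinoresinol-forming fractions elute here
]

TOTAL_VOLUME_ML = sum(volume for volume, _, _ in MONO_S_PROGRAM)


def na2so4_mM(elution_volume_ml: float) -> float:
    """Programmed Na2SO4 concentration at a given cumulative elution volume."""
    if not 0.0 <= elution_volume_ml <= TOTAL_VOLUME_ML:
        raise ValueError("volume outside the elution program")
    start = 0.0
    for volume, c0, c1 in MONO_S_PROGRAM:
        end = start + volume
        if elution_volume_ml <= end:
            fraction = (elution_volume_ml - start) / volume
            return c0 + fraction * (c1 - c0)
        start = end
    return MONO_S_PROGRAM[-1][2]


if __name__ == "__main__":
    print(TOTAL_VOLUME_ML, na2so4_mM(17.0), na2so4_mM(250.0))
```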
POROS SP-M Matrix Column Chromatography (First Column). Fractions from 15 individual elutions from the MonoS HR5/5 column (333 mM Na2SO4) were combined (18.5 mg proteins, 180 ml) and dialyzed overnight against solution C. The dialyzed enzyme solution (190 ml) was filtered (0.22 µm) and an aliquot (47 ml) was applied to the POROS SP-M column. All separations on a POROS SP-M matrix (100 mm by 4.6 mm), previously equilibrated in 25 mM MES-HEPES-sodium acetate buffer (pH 5.0, solution C), were performed at a flow rate of 60 ml min-1 cm-2 and at room temperature. After elution with solution C (12 ml), the proteins were desorbed with a linear Na2SO4 gradient (0 to 0.7 M in 66.5 ml) in solution C, whereupon the concentration established was held for an additional 16.6 ml. Under these conditions, separation of four fractions (I, II, III and IV) was achieved at ~40, 47, 55 and 61 mS, respectively. This purification step was repeated three times with the remaining dialyzed enzymatic extract, and fractions I, II, III, and IV from each experiment were separately combined. When protease inhibitors [that is, phenyl-methanesulfonyl fluoride (0.1 mmol ml-1), EDTA (0.5 nmol ml-1), pepstatin A (1 µg ml-1), and antipain (1 µg ml-1)] were added during the solubilization and all subsequent purification stages, no differences were observed in the elution profiles of fractions I, II, III, and IV.
POROS SP-M Matrix Column Chromatography (Second Column). Fraction I
from the first POROS SP-M Matrix column chromatography step (2.62 mg proteins, 40 ml, 24.6 mS) was diluted in filtered, cold distilled water until the conductivity reached ~8 mS (final volume = 150 ml). The diluted protein solution was then applied onto a POROS SP-M column (100 mm by 4.6 mm). After elution with solution C (12 ml), fraction I was desorbed using a linear Na2SO4 gradient from 0 to 0.25 M in 20 ml, whereupon the concentration established was held for another 25 ml. This was followed by another linear Na2SO4 gradient from 0.25 to 0.7 M in 26 ml which was then held at 0.7 M for an additional 16.6 ml. Fractions eluted at ~30 mS (the ionic strength of the eluent was measured with a flow-through detector) were combined (15 ml, 1.3 mg), diluted with water and rechromatographed. The resulting protein (eluted at ~30 mS with the gradient described above) was stored (-80°C) until needed.
Gel filtration. An aliquot from fraction I (595.5 µg proteins, 3 ml, eluted at ~30 mS) was concentrated to 0.6 ml (Centricon 10, Amicon) and loaded onto a (73.2 cm by 1.6 cm, Pharmacia-LKB) gel chromatographic column equilibrated in 0.1 M MES-HEPES-sodium acetate buffer (pH 5.0) containing 50 mM Na2SO4 at 4°C. An apparently homogenous 78-kD dirigent protein (242 µg) was eluted (flow rate 0.25 ml min-1 cm-2) as a single component at 133 ml (Vo = 105 ml). Molecular weights were estimated by comparison of their elution profiles with the standard proteins, β-amylase (200,000), alcohol dehydrogenase (150,000), bovine serum albumin (66,000), ovalbumin (45,000), carbonic anhydrase (29,000) and cytochrome c (12,400).

Characterization of the Purified Dirigent Protein
Molecular Weight and Isoelectric Point Determination. Polyacrylamide gel electrophoresis (PAGE) was performed in Laemmli's buffer system with gradient (4 to 15% acrylamide, Bio-Rad) gels under denaturing and reducing conditions.
Proteins were visualized by silver staining. Gel filtration (S200) chromatography of fraction I gave a protein of native molecular weight ~78 kD, whereas SDS-polyacrylamide gel electrophoresis showed a single band at ~27 kD, suggesting that the native protein exists as a trimer. Isoelectric focusing of the native protein on a polyacrylamide gel (pH 3 to 10 gradient) revealed the presence of six bands.
After isoelectric focusing, each of these bands was electroblotted onto a polyvinylidene fluoride (PVDF) membrane and subjected to amino terminal sequencing, which established that all had similar sequences, indicating a series of isoforms.
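The trimer inference drawn from the native (~78 kD) and denatured (~27 kD) molecular weights is a simple ratio, shown here as a short Python check. The function name is hypothetical, and the calculation assumes identical subunits.

```python
from typing import Tuple


def estimate_subunit_count(native_kd: float, subunit_kd: float) -> Tuple[int, float]:
    """Return (nearest whole number of subunits, raw native/subunit ratio)."""
    ratio = native_kd / subunit_kd
    return round(ratio), ratio


if __name__ == "__main__":
    count, ratio = estimate_subunit_count(native_kd=78.0, subunit_kd=27.0)
    print(count, ratio)  # ratio is close to 3, consistent with a trimer
```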
The ultraviolet-visible spectrum of the protein had only a characteristic protein absorbance at 280 nm with a barely perceptible shoulder at 330 nm. Inductively coupled plasma (ICP) analysis gave no indication of any metal being present in the protein. Thus, the 78-kD dirigent protein lacks any detectable catalytically active oxidative center.
Assay of the Ability of the Purified Dirigent Protein to Form (+)-Pinoresinol from E-Coniferyl Alcohol. The four fractions (I to IV) from the first POROS SP-M chromatographic step (Example 1) were individually rechromatographed, with each fraction subsequently assayed for (+)-pinoresinol-forming activity with E-[9-3H]coniferyl alcohol as substrate for one hour. Fraction I (containing dirigent protein) had very little (+)-pinoresinol-forming activity (<5% of total activity loaded onto the POROS SP-M column), whereas fraction III catalyzed nonspecific oxidative coupling to give (±)-dehydrodiconiferyl alcohols, (±)-pinoresinols, and (±)-erythro/threo guaiacylglycerol 8-O-4'-coniferyl alcohol ethers. Thus, Fraction III appeared to contain an endogenous plant oxygenating protein.
Although the putative oxidase preparation (Fraction III) was not purified to electrophoretic homogeneity, the electron paramagnetic resonance (EPR) spectrum of this protein preparation resembled that of a typical plant laccase, i.e., a class of naturally-occurring plant oxygenase proteins. We then studied the fate of E-[9-3H]coniferyl alcohol (2 µmol ml-1, 14.7 kBq) in the presence of, respectively, the oxidase (fraction III), the 78-kD dirigent protein (Fraction I), and both fraction III and the 78-kD protein together. With the fraction III preparation alone, only nonspecific bimolecular radical coupling occurs to give (±)-dehydrodiconiferyl alcohols, (±)-pinoresinols and (±)-erythro/threo guaiacylglycerol 8-O-4'-coniferyl alcohol ethers. With the 78-kD protein by itself, however, a small amount of (+)-pinoresinol formation (<5% over 10 hours) was observed, this being presumed to result from residual traces of oxidizing capacity in the preparation. When both fraction III and the 78-kD protein were combined, full catalytic activity and regio- and stereo-specificity in the product was reestablished, whereby essentially only (+)-pinoresinol was formed. Additionally, with fraction III alone, and when fraction III was combined with the 78-kD protein, the rates of substrate depletion and dimeric product formation were nearly identical. Moreover, essentially no turnover of the dimeric lignan products occurred in either case in the presence of the oxidase, over the time-period (8 hours) examined: subsequent dimer oxidation does not occur when E-coniferyl alcohol, the preferred substrate, is still present in the assay mixture.
The 78-kD protein therefore appears to determine the specificity of the bimolecular phenoxy radical coupling reaction.
Gel filtration studies were also carried out with mixtures of the dirigent and fraction III proteins, in order to establish if any detectable protein-protein interaction might account for the stereoselectivity. But no evidence in support of complex formation (i.e., to higher molecular size entities) was observed.

Effect of the 78-kD Dirigent Protein on Plant Laccase-Catalyzed Monolignol Coupling
E-coniferyl alcohol coupling assay. E-[9-3H]Coniferyl alcohol (4 µmol ml-1, 29.3 kBq) was incubated with a 120-kD laccase (previously purified from Forsythia intermedia stem tissue) over a 24-hour period, in the presence and absence of the dirigent protein, as follows. Each assay consisted of E-[9-3H]coniferyl alcohol (4 µmol ml-1, 29.3 kBq, 7.3 MBq mole liter-1; or 2 µmol ml-1, 14.7 kBq with fraction III), the 78-kD dirigent protein, an oxidase or oxidant, or both [final concentrations: 770 pmol ml-1 dirigent protein; 10.7 pmol protein ml-1 Forsythia laccase; 12 µg protein ml-1 fraction III; 0.5 µmol ml-1 FMN; 0.5 µmol ml-1 FAD; 1 and 10 µmol ml-1 ammonium peroxydisulfate] in buffer (0.1 M MES-HEPES-sodium acetate, pH 5.0) to a total volume of 250 µl. The enzymatic reaction was initiated by addition of E-[9-3H]coniferyl alcohol. Controls were performed in the presence of buffer alone.
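The absolute amounts of each component in a single 250 µl assay follow directly from the stated final concentrations. The Python sketch below works through that arithmetic, reading the substrate concentration as 4 µmol ml-1 and the dirigent protein and laccase concentrations as 770 and 10.7 pmol ml-1; the variable and function names are hypothetical.

```python
ASSAY_VOLUME_ML = 0.250  # total assay volume of 250 microliters


def per_assay(concentration_per_ml: float) -> float:
    """Convert a final concentration (amount per ml) to amount per assay."""
    return concentration_per_ml * ASSAY_VOLUME_ML


# Substrate: 4 umol/ml is 4000 nmol/ml.
substrate_nmol = per_assay(4000.0)
# Proteins, in pmol.
dirigent_pmol = per_assay(770.0)
laccase_pmol = per_assay(10.7)

# Molar excess of substrate over dirigent protein (both converted to pmol).
substrate_per_dirigent = (substrate_nmol * 1000.0) / dirigent_pmol

if __name__ == "__main__":
    print(substrate_nmol, dirigent_pmol, laccase_pmol)
    print(round(substrate_per_dirigent))
```

On these figures the substrate is present in roughly five-thousand-fold molar excess over the dirigent protein.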
After one hour incubation at 30°C while shaking, the assay mixture was extracted with ethyl acetate (EtOAc, 500 µl) containing (±)-pinoresinols (7.5 µg), (±)-dehydrodiconiferyl alcohols (3.5 µg) and erythro/threo (±)-guaiacylglycerol 8-O-4'-coniferyl alcohol ethers (7.5 µg) as radiochemical carriers and ferulic acid (15.0 µg) as an internal standard. After centrifugation (13,800g, 5 min), the EtOAc soluble components were removed and the extraction procedure repeated with EtOAc (500 µl). The EtOAc soluble components from each assay were combined, the solutions evaporated to dryness in vacuo and redissolved in methanol-water solution (1:1; 100 µl), and an aliquot (50 µl) thereof was subjected to reversed-phase column chromatography (Waters, Nova-Pak C18, 150 mm by 3.8 mm). The elution conditions were as follows: acetonitrile/3% acetic acid in H2O (5:95) from 0 to 5 min, then linear gradients to ratios of 10:90 between 5 and 20 min, then to 20:80 between 20 and 45 min and finally to 50:50 between 45 and 60 min, at a flow rate of 8.8 ml min-1 cm-2.
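The reversed-phase elution conditions amount to a piecewise-linear gradient in acetonitrile content. A minimal Python sketch of that program follows, assuming the stated ratios are volume percentages of acetonitrile and that each ramp is linear in time; the names are hypothetical.

```python
# (time in min, percent acetonitrile) breakpoints of the gradient.
GRADIENT = [
    (0.0, 5.0),
    (5.0, 5.0),    # isocratic 5:95 for the first 5 min
    (20.0, 10.0),  # to 10:90 between 5 and 20 min
    (45.0, 20.0),  # to 20:80 between 20 and 45 min
    (60.0, 50.0),  # to 50:50 between 45 and 60 min
]


def acetonitrile_percent(time_min: float) -> float:
    """Linearly interpolate the acetonitrile percentage at a given time."""
    if not GRADIENT[0][0] <= time_min <= GRADIENT[-1][0]:
        raise ValueError("time outside the 0 to 60 min program")
    for (t0, p0), (t1, p1) in zip(GRADIENT, GRADIENT[1:]):
        if t0 <= time_min <= t1:
            return p0 + (p1 - p0) * (time_min - t0) / (t1 - t0)
    raise AssertionError("unreachable: breakpoints cover the whole range")


if __name__ == "__main__":
    for t in (0.0, 12.5, 32.5, 52.5, 60.0):
        print(t, acetonitrile_percent(t))
```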
Fractions corresponding to E-coniferyl alcohol, erythro/threo guaiacylglycerol 8-O-4'-coniferyl alcohol ethers, (±)-dehydrodiconiferyl alcohols and (±)-pinoresinols were individually collected, aliquots removed for liquid scintillation counting, and the remainder freeze-dried. Pinoresinol-containing fractions were redissolved in methanol (100 µl) and subjected to chiral column chromatography (Daicel, Chiralcel OD, 50 mm by 4.6 mm) with a solution of hexanes and ethanol (1:1) as the mobile phase (flow rate 3 ml min-1 cm-2), whereas dehydrodiconiferyl alcohol fractions were subjected to Chiralcel OF (250 mm by 4.6 mm) column chromatography eluted with a solution of hexanes and isopropanol (9:1) as the mobile phase (flow rate 2.4 ml min-1 cm-2), the radioactivity of the eluent being measured with a flow-through detector (Radiomatic, Model A 120).
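Flow rates in these procedures are given per unit of column cross-section (ml min-1 cm-2). Converting them to the volumetric flow rate set on a pump requires only the column inner diameter, as in the Python sketch below. It assumes the second dimension quoted for each column is its inner diameter; the function name is hypothetical.

```python
import math


def volumetric_flow_ml_per_min(linear_flow_ml_min_cm2: float,
                               inner_diameter_mm: float) -> float:
    """Convert a flow rate per cm2 of cross-section to ml per minute."""
    radius_cm = inner_diameter_mm / 10.0 / 2.0
    cross_section_cm2 = math.pi * radius_cm ** 2
    return linear_flow_ml_min_cm2 * cross_section_cm2


if __name__ == "__main__":
    columns = {
        "Nova-Pak C18 (3.8 mm)": (8.8, 3.8),
        "Chiralcel OD (4.6 mm)": (3.0, 4.6),
        "Chiralcel OF (4.6 mm)": (2.4, 4.6),
    }
    for name, (flow, diameter) in columns.items():
        print(name, round(volumetric_flow_ml_per_min(flow, diameter), 2))
```

Under this assumption the three columns run at approximately 1.0, 0.5 and 0.4 ml min-1, respectively.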
Results of E-coniferyl alcohol coupling assay. Incubation with laccase alone gave only racemic dimeric products, with (±)-dehydrodiconiferyl alcohols predominating. In the presence of the dirigent protein, however, the process was now primarily stereoselective, affording (+)-pinoresinol, rather than being nonspecific as observed when only laccase was present. The rates of both E-coniferyl alcohol (substrate) depletion and the formation of the dimeric lignans were similar with and without the dirigent protein. A substantial difference was noted in the subsequent turnover of the lignan products observed after E-coniferyl alcohol depletion.
With the laccase alone no turnover occurred, but when both proteins were present the disappearance of the products was significant. In order to understand the difference, assays were conducted where bovine serum albumin (BSA) and ovalbumin were individually added to the laccase-containing solutions at levels matching the weight concentrations of the dirigent protein. In this way, it was established that the differences in product turnover were simply due to stabilization of laccase activity at the higher protein concentrations, although interestingly the dirigent protein, BSA and ovalbumin afforded somewhat different degrees of protection. The findings were quite comparable when a fungal laccase (from Trametes versicolor) was used in place of the plant laccase. When the oxidizing capacity (i.e., laccase concentration) was lowered five-fold, only (+)-pinoresinol formation was observed. Thus, complete stereoselectivity is preserved when the oxidative capacity does not exceed a point where the dirigent protein is saturated.
Stereoselective E-coniferyl alcohol coupling. Assays were also conducted with E-[9-2H2, OC2H3]coniferyl alcohol and the dirigent protein in the presence of laccase as follows. E-[9-2H2, OC2H3]coniferyl alcohol (2 µmol ml-1) was incubated in the presence of dirigent protein (770 pmol ml-1), the purified plant laccase (4.1 pmol ml-1) and buffer (0.1 M MES-HEPES-sodium acetate, pH 5.0) in a total volume of 250 µl. After one hour incubation, the reaction mixture was extracted with EtOAc, but with the addition of an internal standard and radiochemical carriers omitted. After reversed-phase column chromatography, the enzymatically formed pinoresinol was collected, freeze-dried, redissolved in methanol (100 µl) and subjected to chiral column chromatography (Daicel, Chiralcel OD, 50 mm by 4.6 mm) with detection at 280 nm and analysis by mass spectral fragmentation in the EI mode (Waters, Integrity System). Liquid chromatography-mass spectrometry (LC-MS) analysis of the resulting (+)-pinoresinol (>99% enantiomeric excess) gave a molecular ion with a mass to charge ratio (m/z) 368, thus establishing the presence of 2H atoms and verifying that together the laccase and dirigent protein catalyzed stereoselective coupling of E-[9-2H2, OC2H3]coniferyl alcohol.
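The observed molecular ion at m/z 368 can be rationalized with nominal masses. The Python sketch below assumes that pinoresinol has the formula C20H22O6, that each labelled coniferyl alcohol carries five deuterium atoms (two at C-9 and three in the methoxyl group), and that all ten labels are retained in the dimer; the names are hypothetical.

```python
from typing import Dict

NOMINAL_MASS = {"C": 12, "H": 1, "D": 2, "O": 16}


def nominal_mass(formula: Dict[str, int]) -> int:
    """Sum nominal atomic masses over an element-count dictionary."""
    return sum(NOMINAL_MASS[element] * count for element, count in formula.items())


def replace_h_with_d(formula: Dict[str, int], n_deuterium: int) -> Dict[str, int]:
    """Return a copy of the formula with n hydrogens exchanged for deuterium."""
    if n_deuterium > formula.get("H", 0):
        raise ValueError("more deuterium labels than hydrogens available")
    labelled = dict(formula)
    labelled["H"] -= n_deuterium
    labelled["D"] = labelled.get("D", 0) + n_deuterium
    return labelled


PINORESINOL = {"C": 20, "H": 22, "O": 6}
LABELS_PER_MONOMER = 2 + 3  # 9-2H2 plus OC2H3

labelled_pinoresinol = replace_h_with_d(PINORESINOL, 2 * LABELS_PER_MONOMER)

if __name__ == "__main__":
    print(nominal_mass(PINORESINOL), nominal_mass(labelled_pinoresinol))
```

Under these assumptions the unlabelled dimer has nominal mass 358 and the fully labelled dimer 368, in agreement with the reported molecular ion.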
Other auxiliary one-electron oxidants can also facilitate stereoselective coupling with the dirigent protein. Ammonium peroxydisulfate readily undergoes homolytic cleavage (A. Usaitis, R. Makuska, Polymer 35:4896 (1994)) and is routinely used as a one-electron oxidant in acrylamide polymerization.
Ammonium peroxydisulfate was first incubated with E-[9-3H]coniferyl alcohol (4 µmol ml-1, 29.3 kBq) for 6 hours using the E-coniferyl alcohol coupling assay procedure described above. Nonspecific bimolecular radical coupling was observed, to afford predominantly (±)-dehydrodiconiferyl alcohols as well as the other racemic lignans (Table 1). However, when the dirigent protein was added, the stereoselectivity of coupling was dramatically altered to give primarily (+)-pinoresinol at both concentrations of oxidant, together with small amounts of racemic lignans. This established that even an inorganic oxidant, such as ammonium peroxydisulfate, could promote (+)-pinoresinol synthesis in the presence of the dirigent protein, even if it was not oxidatively as selective toward the monolignol as was the fraction III oxidase or laccase.

Table 1.
Effect of dirigent protein on product distribution from E-coniferyl alcohol oxidized by ammonium peroxydisulfate (6 hour assay).
Oxidant | Dirigent protein (770 pmol ml-1) | E-Coniferyl alcohol depleted, in dimer equivalents (nmol ml-1) | (±)-Guaiacylglycerol 8-O-4'-coniferyl alcohol ethers (nmol ml-1) | (±)-Dehydrodiconiferyl alcohols (nmol ml-1) | (±)-Pinoresinol (nmol ml-1) | (+)-Pinoresinol (nmol ml-1) | Total dimers (nmol ml-1)
Ammonium peroxydisulfate (1 µmol ml-1) | absent | 200 ± 4 | 10 ± 1 | 35 ± 2 | 16 ± 0 | 0 | 61 ± 3
Ammonium peroxydisulfate (1 µmol ml-1) | present | 250 ± 55 | 6 ± 0 | 13 ± 1 | 0 | 130 ± 10 | 149 ± 11
Ammonium peroxydisulfate (10 µmol ml-1) | absent | 860 ± 30 | 90 ± 4 | 250 ± 10 | 135 ± 4 | 0 | 475 ± 17
Ammonium peroxydisulfate (10 µmol ml-1) | present | 1030 ± 25 | 30 ± 1 | 90 ± 3 | 0 | 450 ± 10 | 570 ± 14
Dirigent protein | present | 61 ± 20 | 5 ± 1 | 8 ± 1 | 0 | 55 ± 1 | 68 ± 3
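The mean values of Table 1 can be checked for internal consistency, and the share of (+)-pinoresinol among the dimers computed, with a short Python sketch. The data below are the table means only (standard deviations omitted); the class, field and function names are hypothetical, and the enantiomeric excess calculation assumes that the (±)-pinoresinol pool is an exact 1:1 mixture.

```python
from typing import NamedTuple


class Row(NamedTuple):
    oxidant: str
    dirigent_present: bool
    ethers: float           # (±)-guaiacylglycerol 8-O-4' ethers, nmol/ml
    dehydrodiconiferyl: float
    racemic_pinoresinol: float
    plus_pinoresinol: float
    total_dimers: float


TABLE_1 = [
    Row("peroxydisulfate, 1 umol/ml", False, 10, 35, 16, 0, 61),
    Row("peroxydisulfate, 1 umol/ml", True, 6, 13, 0, 130, 149),
    Row("peroxydisulfate, 10 umol/ml", False, 90, 250, 135, 0, 475),
    Row("peroxydisulfate, 10 umol/ml", True, 30, 90, 0, 450, 570),
    Row("dirigent protein only", True, 5, 8, 0, 55, 68),
]


def product_sum(row: Row) -> float:
    """Sum of the individual dimer classes; should equal total_dimers."""
    return (row.ethers + row.dehydrodiconiferyl
            + row.racemic_pinoresinol + row.plus_pinoresinol)


def plus_pinoresinol_fraction(row: Row) -> float:
    """Fraction of all dimers recovered as (+)-pinoresinol."""
    return row.plus_pinoresinol / row.total_dimers


def pinoresinol_ee(row: Row) -> float:
    """Enantiomeric excess of total pinoresinol, racemic pool taken as 1:1."""
    total = row.racemic_pinoresinol + row.plus_pinoresinol
    if total == 0:
        raise ValueError("no pinoresinol formed")
    return row.plus_pinoresinol / total


if __name__ == "__main__":
    for row in TABLE_1:
        print(row.oxidant, row.dirigent_present,
              round(plus_pinoresinol_fraction(row), 3), pinoresinol_ee(row))
```

With the dirigent protein present, (+)-pinoresinol accounts for roughly 79 to 87% of the dimers formed; without it, none is detected.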

Effect of Other Oxygenating Agents on the Stereospecific Conversion of E-Coniferyl Alcohol to (+)-Pinoresinol. The effects of incubating E-coniferyl alcohol (4 µmol ml-1, 29.3 kBq) with flavin mononucleotide (FMN) and flavin adenine dinucleotide (FAD) were investigated since, in addition to their roles as enzyme cofactors, they can also oxidize various organic substrates (T.C. Bruice, Acc. Chem. Res. 13:256 (1980)). E-[9-3H]coniferyl alcohol was respectively incubated with FMN and FAD for 48 hours. To obtain the FMN, snake (Naja naja atra, Formosan cobra) venom was added to a solution of FAD (5 µmol ml-1 in H2O) and, after 30 min incubation at 30°C, the enzymatically formed FMN was separated from the protein mixture by filtration through a Centricon 10 (Amicon) microconcentrator. In every instance, E-coniferyl alcohol oxidation was more rapid in the presence of FMN than FAD. Although these differences between the FMN and FAD catalyzed rates of E-coniferyl alcohol oxidation were not anticipated, a consistent pattern was sustained: racemic lignan products were obtained, with the (±)-dehydrodiconiferyl alcohols predominating as before. When the time courses were repeated in the presence of the dirigent protein, a dramatic change in stereoselectivity was observed, where essentially only (+)-pinoresinol formation occurred. Again, the rates of E-coniferyl alcohol depletion, when adjusted for the traces of residual oxidizing capacity (<5% over 10 hours) in the dirigent protein preparation, were dependent only upon [FMN] and [FAD], as were the total amounts of dimers formed. When full depletion of E-coniferyl alcohol occurs, the corresponding lignan dimers can begin to undergo oxidative changes as a function of time; specifically, FMN is able subsequently to oxidize pinoresinol, in open solution, after the E-coniferyl alcohol has been fully depleted.
Investigation of Substrate-Specific Stereoselectivity. The coupling stereoselectivity was substrate specific. Neither E-p-[9-3H]coumaryl (4 µmol ml⁻¹, 44.5 kBq) nor E-[8-14C]sinapyl alcohols (4 µmol ml⁻¹, 8.3 kBq), which differ from E-coniferyl alcohol only by a methoxyl group substituent on the aromatic ring, yielded stereoselective products when incubated for 6 hours with FMN and ammonium peroxydisulfate respectively, in the presence and absence of the dirigent protein. Incubations were carried out as described above with the following modifications: E-p-[9-3H]coumaryl (4 µmol ml⁻¹, 44.5 kBq) or E-[8-14C]sinapyl alcohols (4 µmol ml⁻¹, 8.3 kBq) were used as substrates and, after 6 hour incubation at 30°C, the reaction mixture was extracted with EtOAc but without addition of radiochemical carriers. E-Sinapyl alcohol readily underwent coupling to afford syringaresinol, but chiral HPLC analysis revealed that the resulting products were, in every instance, racemic (Table 2). Interestingly, by itself, the 78-kD dirigent protein preparation catalyzed a low level of dimer formation, as previously noted, but only gave rise to racemic (±)-syringaresinol formation, which is presumably a consequence of the residual traces of contaminating oxidizing capacity present in the protein preparation.
In an analogous manner, no stereoselective coupling was observed with E-p-coumaryl alcohol as substrate. That is, only E-coniferyl alcohol undergoes stereoselective coupling in the presence of the dirigent protein. Given the marked substrate specificity of the dirigent protein for E-coniferyl alcohol, it will be of considerable interest to determine, in the future, how it differs from that affording (+)-syringaresinol in Eucommia ulmoides (T. Deyama, Chem. Pharm. Bull. 31, (1983)).
Table 2.
Effect of dirigent protein on coupling of E-sinapyl alcohol (6 hour assay).
Oxidant | Dirigent protein (770 pmol ml⁻¹) | E-Sinapyl alcohol depleted, in dimer equivalents (nmol ml⁻¹) | Racemic (±)-syringaresinols (nmol ml⁻¹)
FMN (0.5 µmol ml⁻¹) | absent | 570 ± 100 | 290 ± 40
FMN (0.5 µmol ml⁻¹) | present | 610 ± 110 | 340 ± 40
Ammonium peroxydisulfate (10 µmol ml⁻¹) | absent | 1400 ± 120 | 1020 ± 40
Ammonium peroxydisulfate (10 µmol ml⁻¹) | present | 1520 ± 10 | 1060 ± 30
None (dirigent protein only) | present | 110 ± 10 | 50 ± 10

Although the inventors do not intend to be bound by any particular mechanism for stereoselective coupling, three distinct possibilities can be envisaged.
The most likely is that the oxidase or oxidant generates free-radical species from E-coniferyl alcohol, and that the latter are the true substrates that bind to the dirigent protein prior to coupling. The other two possibilities would require that E-coniferyl alcohol molecules are bound and oriented on the dirigent protein, thereby ensuring that only (+)-pinoresinol formation occurs upon subsequent oxidative coupling: this could occur either if both substrate phenolic hydroxyl groups were exposed so that they could readily be oxidized by an oxidase or oxidant, or if an electron transfer mechanism were operative between the oxidase or oxidant and an electron acceptor site or sites on the dirigent protein.
Among the three alternative mechanisms, three lines of evidence suggest "capture" of phenoxy radical intermediates by the dirigent protein. First, the rates of both substrate depletion and product formation are largely unaffected by the presence of the dirigent protein. If capture of the free-radical intermediates is the operative mechanism, the dirigent protein would only affect the specificity of coupling when single-electron oxidation of coniferyl alcohol is rate-determining. Second, an electron transfer mechanism is currently ruled out, since we observed no new ultraviolet-visible chromophores in either the presence or absence of an auxiliary oxidase or oxidant, under oxidizing conditions. Third, preliminary kinetic data (as disclosed in Example 4) support the concept of free-radical capture based on the formal values of Michaelis constant (Km) and maximum velocity (Vmax) characterizing the conversion of E-coniferyl alcohol into (+)-pinoresinol, with the dirigent protein alone and in the presence of the various oxidases or oxidants.

Kinetic Characterization of the Conversion of E-Coniferyl Alcohol to (+)-Pinoresinol in the Presence of Dirigent Protein and an Oxygenating Agent
Assays were carried out as described in Example 3 by incubating a series of E-[9-3H]coniferyl alcohol concentrations (between 8.00 and 0.13 µmol ml⁻¹, 7.3 MBq mole⁻¹ liter⁻¹) with dirigent protein (770 pmol ml⁻¹) alone and in presence of Forsythia laccase (2.1 pmol ml⁻¹), fraction III (12 µg protein ml⁻¹), or FMN (0.5 µmol ml⁻¹). Assays with dirigent protein, in presence or absence of FMN, were incubated at 30°C for 1 hour, whereas assays with Forsythia laccase or fraction III in presence or absence of dirigent protein were incubated at 30°C for 15 min. If free-radical capture by the dirigent protein is the operative mechanism, the Michaelis-Menten parameters obtained will only represent formal rather than true values, because the highest free-energy intermediate state during the conversion of E-coniferyl alcohol into (+)-pinoresinol is still unknown and the relation between the concentration of substrate and that of the corresponding intermediate free-radical in open solution has not been delineated.
Bearing these qualifications in mind, we estimated formal Km and Vmax values for the dirigent protein preparation. As noted earlier, it was capable of engendering formation of low levels of both (+)-pinoresinol from E-coniferyl alcohol, and racemic (±)-syringaresinols from E-sinapyl alcohol, because of traces of contaminating oxidizing capacity. With this preparation (Table 3), a formal Km of 10 ± 6 mM and Vmax of 0.02 ± 0.02 mol s⁻¹ mol⁻¹ were obtained. However, with addition of fraction III, laccase, and FMN, the formal Km values (mM) were reduced to 1.6 ± 0.3, 0.100 ± 0.003, and 0.10 ± 0.01, respectively, whereas the Vmax values were far less affected at these concentrations of auxiliary oxidase/oxidant.
Formal Km and Vmax values were calculated for the laccase and fraction III oxidase with respect to E-coniferyl alcohol conversion into the three racemic lignans. However, no direct comparisons can be made to the 78-kD protein, since the formal Km values involve only the corresponding oxidases. For completeness, the Km (mM) and Vmax (mol s⁻¹ mol⁻¹ enzyme) were as follows: with respect to the laccase, 0.200 ± 0.001 and 3.9 ± 0.2 for (±)-erythro/threo guaiacylglycerol 8-O-4'-coniferyl alcohol ethers, 0.3000 ± 0.0003 and 13.1 ± 0.6 for (±)-dehydrodiconiferyl alcohols, and 0.300 ± 0.002 and 7.54 ± 0.50 for (±)-pinoresinols; with respect to the fraction III oxidase (estimated to have a native molecular weight of 80 kDa), 2.2 ± 0.3 and 0.20 ± 0.03 for (±)-erythro/threo guaiacylglycerol 8-O-4'-coniferyl alcohol ethers, 2.2 ± 0.2 and 0.7 ± 0.1 for (±)-dehydrodiconiferyl alcohols, and 3.7 ± 0.7 and 0.6 ± 0.1 for (±)-pinoresinols.
These preliminary kinetic parameters are in harmony with the finding that the dirigent protein does not substantially affect the rate of E-coniferyl alcohol depletion in the presence of fraction III, laccase and FMN. Together, both sets of results accord with the working hypothesis that the dirigent protein functions by capturing free-radical intermediates, which then undergo stereoselective coupling.
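To illustrate how the formal parameters behave, the Michaelis-Menten rate law v = Vmax[S]/(Km + [S]) can be evaluated with the mean values reported in Table 3. The Python sketch below is only an illustration under that assumption: it ignores the reported uncertainties, the function and dictionary names are hypothetical, and, as the text stresses, the parameters are formal rather than true values.

```python
def michaelis_menten(s_mM, km_mM, vmax):
    """Rate predicted by v = Vmax * [S] / (Km + [S])."""
    return vmax * s_mM / (km_mM + s_mM)


# Formal mean values from Table 3:
# (Km in mM, Vmax in mol s^-1 mol^-1 dirigent protein).
FORMAL = {
    "dirigent protein alone": (10.0, 0.02),
    "fraction III": (1.6, 0.10),
    "laccase": (0.100, 0.0600),
    "FMN": (0.10, 0.024),
}

# 4 mM corresponds to the 4 umol ml^-1 substrate level used in the coupling assays.
for name, (km, vmax) in FORMAL.items():
    v = michaelis_menten(4.0, km, vmax)
    print(f"{name}: v = {v:.4f}, fraction of Vmax = {v / vmax:.2f}")
```

At a substrate concentration equal to the formal Km the predicted rate is half of Vmax, so the lower formal Km values obtained with laccase or FMN imply near-saturation at substrate concentrations at which the dirigent protein preparation alone would be far from saturated.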
Table 3.
Effect of various oxidants on formal Km and Vmax values for the dirigent protein (770 pmol ml⁻¹) during (+)-pinoresinol formation from E-coniferyl alcohol.
Oxidase/Oxidant | Formal Km (mM) | Vmax (mol s⁻¹ mol⁻¹ dirigent protein)
Dirigent protein | 10 ± 6 | 0.02 ± 0.02
Fraction III (12 µg protein ml⁻¹) | 1.6 ± 0.3 | 0.10 ± 0.03
Laccase (2.07 pmol ml⁻¹) | 0.100 ± 0.003 | 0.0600 ± 0.0002
FMN (0.5 µmol ml⁻¹) | 0.10 ± 0.01 | 0.024 ± 0.001

Cloning of the Dirigent Protein cDNA From Forsythia intermedia
Plant Materials - Forsythia intermedia plants were either obtained from Bailey's Nursery (var. Lynwood Gold, St. Paul, MN) and maintained in Washington State University greenhouse facilities, or were gifts from the local community.
Materials - All solvents and chemicals used were reagent or HPLC grade. Taq thermostable DNA polymerase was obtained from Promega, whereas restriction enzymes were from Gibco BRL (HaeIII), Boehringer Mannheim (Sau3A) and Promega (TaqI). pT7Blue T-vector and competent NovaBlue cells were purchased from Novagen and radiolabeled nucleotide ([α-32P]dCTP) was from DuPont NEN.
Oligonucleotide primers for polymerase chain reaction (PCR) and sequencing were synthesized by Gibco BRL Life Technologies. GENECLEAN II® kits (BIO 101 Inc.) were used for purification of PCR fragments, with the gel-purified DNA concentrations determined by comparison to a low DNA mass ladder (Gibco BRL) in 1.5% agarose gels.
Instrumentation - UV (including RNA and DNA determinations at OD260) spectra were recorded on a Lambda 6 UV/VIS spectrophotometer. A Temptronic II thermocycler (Thermolyne) was used for all PCR amplifications. Purification of DNA for sequencing employed a QIAwell Plus plasmid purification system (QIAGEN) followed by PEG precipitation (Sambrook, J., Fritsch, E. F., and Maniatis, T. (1994) Molecular Cloning: A Laboratory Manual, 3 volumes, 3rd Ed., Cold Spring Harbor Laboratory, Cold Spring Harbor, NY), with DNA sequences determined using an Applied Biosystems Model 373A automated sequencer. Amino acid sequences were obtained using an Applied Biosystems protein sequencer with on-line HPLC detection, according to the manufacturer's instructions.
Dirigent Protein Amino Acid Sequencing - The dirigent protein N-terminal amino acid sequence (SEQ ID No:1) was obtained from the purified protein using an Applied Biosystems protein sequencer with on-line HPLC detection. For trypsin digestion, the purified enzyme (150 pmol) was suspended in 0.1 M Tris-HCl (50 µl, pH 8.5, Boehringer Mannheim, sequencing grade), with urea added to give a final concentration of 8 M in 77.5 µl. The mixture was incubated for 15 min at 50°C, following which 100 mM iodoacetamide (2.5 µl) was added, with the whole kept at room temperature for 15 min. Trypsin (1 µg in 20 µl) was then added, with the mixture digested for 24 h at 37°C, following which TFA (4 µl) was added to stop the enzymatic reaction. The resulting mixture was subjected to reversed phase HPLC analysis (C-8 column, Applied Biosystems), this being eluted with a linear gradient over 2 h from 0 to 100% acetonitrile (in 0.1% TFA) at a flow rate of 0.2 ml/min with detection at 280 nm. Fractions containing individual oligopeptide peaks were collected manually and directly submitted to amino acid sequencing (SEQ ID Nos:2-7).
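The peptides expected from a trypsin digestion such as the one described above can be predicted in silico. The Python sketch below applies the commonly used simplified rule that trypsin cleaves on the C-terminal side of lysine (K) or arginine (R) unless the next residue is proline (P). The input sequence is a made-up placeholder, not the dirigent protein sequence, and the function name is hypothetical.

```python
def tryptic_digest(protein):
    """Return peptides from cleavage after K or R, except when followed by P."""
    peptides = []
    start = 0
    for i, residue in enumerate(protein):
        next_residue = protein[i + 1] if i + 1 < len(protein) else ""
        if residue in "KR" and next_residue != "P":
            peptides.append(protein[start:i + 1])
            start = i + 1
    if start < len(protein):
        # C-terminal peptide that does not end in K or R.
        peptides.append(protein[start:])
    return peptides


# Hypothetical sequence used only to exercise the rule.
print(tryptic_digest("MKTAYRPLGKVSR"))
```

Real digests also contain missed cleavages and modified residues (for example cysteines alkylated by iodoacetamide), which this simplified rule does not model.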
Forsythia intermedia stem cDNA Library Synthesis - Total RNA (~300 µg/g fresh weight) was obtained (Dong, Z.D., and Dunstan, D.I. (1996) Plant Cell Reports 15:516-521) from young green stems of greenhouse-grown Forsythia intermedia plants (var. Lynwood Gold). A Forsythia intermedia stem cDNA library was constructed using 5 µg of purified poly A+ mRNA (Oligotex-dT™ Suspension, QIAGEN) with the ZAP-cDNA® synthesis kit, the Uni-ZAP™ XR vector and the Gigapack® II Gold packaging extract (Stratagene), with a titer of 1.2 × 10⁶ PFU for the primary library. A portion (30 ml) of the amplified library (1.2 × 10» PFU/ml; 158 ml total) (Sambrook, J. et al., supra) was used to obtain pure cDNA library DNA (Ausubel, F.M., Brent, R., Kingston, R.E., Moore, D.D., Seidman, J.G., Smith, J.A., and Struhl, K. (1991) Current Protocols in Molecular Biology, 2 volumes, Greene Publishing Associates and Wiley-Interscience, John Wiley & Sons, NY) for PCR.
Dirigent Protein DNA Probe Synthesis - The N-terminal and internal peptide amino acid sequences were used to construct the degenerate oligonucleotide primers. Purified F. intermedia cDNA library DNA (5 ng) was used as the template in 100 µl PCR reactions (10 mM Tris-HCl [pH 9.0], 50 mM KCl, 0.1% Triton X-100, 2.5 mM MgCl2, 0.2 mM each dNTP and 2.5 units Taq DNA polymerase) with primer PSINT1 (SEQ ID No:8) (100 pmol) and either primer PSI7R (SEQ ID No:11) (20 pmol), primer PSI2R (SEQ ID No:10) (20 pmol) or primer PSI1R (SEQ ID No:9) (20 pmol). PCR amplification was carried out in a thermocycler as follows: cycles of 1 min at 94°C, 2 min at 50°C and 3 min at 72°C; with 5 min at 72°C and an indefinite hold at 4°C after the final cycle. Single-primer, template-only and primer-only reactions were performed as controls. PCR products were resolved in 1.5% agarose gels, where a single band (~370-, 155- or 125-bp, respectively) was observed for each reaction.
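When degenerate primers are designed from peptide sequence, as described above, one practical consideration is how many distinct nucleotide sequences can encode a given stretch of residues. The Python sketch below counts that degeneracy using the number of codons per amino acid in the standard genetic code and picks the least degenerate window of a peptide. The peptide shown is a made-up placeholder and the function names are hypothetical; the actual primer sequences (SEQ ID Nos:8-11) are not reproduced here.

```python
# Number of codons per amino acid in the standard genetic code.
CODONS_PER_RESIDUE = {
    "A": 4, "R": 6, "N": 2, "D": 2, "C": 2, "Q": 2, "E": 2, "G": 4,
    "H": 2, "I": 3, "L": 6, "K": 2, "M": 1, "F": 2, "P": 4, "S": 6,
    "T": 4, "W": 1, "Y": 2, "V": 4,
}


def degeneracy(peptide):
    """Number of distinct coding sequences for the peptide."""
    count = 1
    for residue in peptide:
        count *= CODONS_PER_RESIDUE[residue]
    return count


def best_window(peptide, length):
    """Least degenerate stretch of `length` residues and its degeneracy."""
    windows = [peptide[i:i + length] for i in range(len(peptide) - length + 1)]
    best = min(windows, key=degeneracy)
    return best, degeneracy(best)


# Hypothetical peptide used only for illustration.
print(best_window("LSRMWDEFK", 5))
```

Stretches rich in methionine and tryptophan (one codon each) give the lowest degeneracy, whereas leucine, serine and arginine (six codons each) inflate it quickly.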
To determine the nucleotide sequence of the amplified bands, five 100 µl PCR reactions were performed as above with PSINT1 (SEQ ID No:8) + PSI7R (SEQ ID No:11), PSINT1 (SEQ ID No:8) + PSI2R (SEQ ID No:10) and PSINT1 (SEQ ID No:8) + PSI1R (SEQ ID No:9) primer pairs. The 5 reactions from each primer pair were concentrated (Microcon 30, Amicon Inc.) and washed with TE buffer (10 mM Tris-HCl, pH 8.0, 1 mM EDTA; 2 × 200 µl), with the PCR products subsequently recovered in TE buffer (2 × 50 µl). These were resolved in preparative 1.5% agarose gels. Each gel-purified PCR product (~0.2 pmol) was then ligated into the pT7Blue T-vector and transformed into competent NovaBlue cells, according to Novagen's instructions. Insert sizes were determined using the rapid boiling lysis and PCR technique (with R20mer and U19mer primers) according to the manufacturer's instructions. Restriction analyses were performed to determine whether all inserts from the reactions utilizing each of the foregoing primer pairs were the same, as follows: to 20 µl each of a 100 µl PCR reaction (insert of interest amplified with R20mer (SEQ ID No:74) and U19mer (SEQ ID No:75) primers) were added 4 units HaeIII, 1.5 units Sau3A or 5 units TaqI restriction enzyme.
Restriction digestions were allowed to proceed for 60 min at 37°C for HaeIII and Sau3A and at 65°C for TaqI reactions. Restriction products were resolved in 1.5% agarose gels, giving one restriction group for each insert tested. Five recombinant plasmids from PSINT1 (SEQ ID No:8) + PSI7R (SEQ ID No:11) (called pT7PSI1-pT7PSI5) and 2 recombinant plasmids from PSINT1 (SEQ ID No:8) + PSI2R (SEQ ID No:10) (called pT7PSI6 and pT7PSI7) PCR products were selected for DNA sequencing; all contained the same open reading frame (ORF) (SEQ ID No:69). The dirigent protein probe was next constructed as follows: five 100 µl PCR reactions were performed as above with 10 ng pT7PSI1 DNA (SEQ ID No:69) with primers PSINT1 (SEQ ID No:8) and PSI7R (SEQ ID No:11). Gel-purified pT7PSI1 insert (50 ng) was used with Pharmacia's T7QuickPrime® kit and [α-32P]dCTP, according to kit instructions, to produce a radiolabeled probe (in 0.1 ml), which was purified over BioSpin 6 columns (Bio-Rad) and added to carrier DNA (0.5 mg/ml sheared salmon sperm DNA [Sigma], 0.9 ml).
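The restriction comparison described above rests on the fact that identical inserts give identical fragment patterns. The Python sketch below predicts fragment lengths for a linear sequence using the recognition sites of the three enzymes named in the text (HaeIII, GG^CC; Sau3A, ^GATC; TaqI, T^CGA). The demonstration sequence is a made-up placeholder rather than any insert from this work, and the function and variable names are hypothetical.

```python
# Recognition site and cut offset on the top strand (0 = cut before first base).
SITES = {
    "HaeIII": ("GGCC", 2),
    "Sau3A": ("GATC", 0),
    "TaqI": ("TCGA", 1),
}


def fragment_lengths(seq, enzyme):
    """Fragment lengths (5' to 3') after complete digestion of a linear sequence."""
    site, offset = SITES[enzyme]
    cuts = []
    pos = seq.find(site)
    while pos != -1:
        cuts.append(pos + offset)
        pos = seq.find(site, pos + 1)
    bounds = [0] + cuts + [len(seq)]
    # Drop zero-length pieces that arise when a cut falls at the very end.
    return [b - a for a, b in zip(bounds, bounds[1:]) if b > a]


# Hypothetical 21-nt sequence containing one site for each enzyme.
DEMO = "AAGGCCTTTGATCAATCGAGG"
for enzyme in SITES:
    print(enzyme, fragment_lengths(DEMO, enzyme))
```

Two inserts that yield the same list of fragment lengths for all three enzymes would fall into the same restriction group, which is the criterion applied in the text before sequencing.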
Library Screening - 600,000 PFU of F. intermedia amplified cDNA library were plated for primary screening, according to Stratagene's instructions. Plaques were blotted onto Magna Nylon membrane circles (Micron Separations Inc.), which were then allowed to air dry. The membranes were placed between two layers of Whatman® 3MM Chr paper. cDNA library phage DNA was fixed to the membranes and denatured in one step by autoclaving for 2 min at 100°C with fast exhaust. The membranes were washed for 30 min at 37°C in 6X standard saline citrate (SSC) and 0.1% SDS and prehybridized for 5 h with gentle shaking at 57-58°C in preheated 6X SSC, 0.5% SDS and 5X Denhardt's reagent (hybridization solution, 300 ml) in a crystallization dish (190 × 75 mm). The [32P]radiolabeled probe was denatured (boiling, 10 min), quickly cooled (ice, 15 min) and added to a preheated fresh hybridization solution (60 ml, 58°C) in a crystallization dish (150 × 75 mm). The prehybridized membranes were next added to this dish, which was then covered with plastic wrap. Hybridization was performed for 18 h at 57-58°C with gentle shaking.
The membranes were washed in 4X SSC and 0.5% SDS for 5 min at room temperature, transferred to 2X SSC and 0.5% SDS (at room temperature) and incubated at 57-58°C for 20 min with gentle shaking, wrapped with plastic wrap to prevent drying and finally exposed to Kodak X-OMAT AR film for 24 h at -80°C with intensifying screens. Twenty positive plaques were purified through two more rounds of screening with hybridization conditions as above.
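The wash and hybridization solutions above are specified as multiples of SSC. As a small worked example, the Python sketch below converts an "nX SSC" strength into molar concentrations, assuming the conventional 20X stock of 3.0 M NaCl and 0.3 M sodium citrate; that stock composition is the standard recipe and is an assumption here, since the text does not state it, and the function name is hypothetical.

```python
# Conventional 20X SSC stock (assumed; not stated in the text).
STOCK_STRENGTH = 20
STOCK_NACL_M = 3.0
STOCK_CITRATE_M = 0.3


def ssc_molarity(strength):
    """Return (NaCl, sodium citrate) molarity for an nX SSC solution."""
    factor = strength / STOCK_STRENGTH
    return STOCK_NACL_M * factor, STOCK_CITRATE_M * factor


# Strengths used for prehybridization/hybridization and the two washes.
for strength in (6, 4, 2):
    nacl, citrate = ssc_molarity(strength)
    print(f"{strength}X SSC: {nacl:.2f} M NaCl, {citrate:.3f} M sodium citrate")
```

Moving from 6X to 2X SSC lowers the salt concentration threefold, which is what makes the final wash more stringent than the hybridization itself.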
In vivo Excision and Sequencing of Dirigent Protein cDNA-containing Phagemids - Purified cDNA clones were rescued from the phage following Stratagene's in vivo excision protocol. Both strands of several different cDNAs that coded for dirigent protein were completely sequenced using overlapping sequencing primers. Two distinct cDNAs were identified, called pPSD_Fi1 (SEQ ID No:12) and pPSD_Fi2 (SEQ ID No:14).
Sequence Analysis - DNA and amino acid sequence analyses were performed using the Unix-based GCG Wisconsin Package (Program Manual for the Wisconsin Package, Version 8, September 1994, Genetics Computer Group, 575 Science Drive, Madison, Wisconsin, USA 53711; Rice, P. (1996) Program Manual for the EGCG Package, Peter Rice, The Sanger Centre, Hinxton Hall, Cambridge, CB10 1RQ, England) and the ExPASy World Wide Web molecular biology server (Geneva University Hospital and University of Geneva, Geneva, Switzerland).

Expression of Functional Dirigent Protein in Spodoptera frugiperda
Attempts to express functional dirigent protein in Escherichia coli failed. Consequently, we expressed the dirigent protein in Spodoptera frugiperda utilizing a baculovirus expression system. The full-length 1.2 kb cDNA clone for the dirigent protein (PSD) in F. intermedia, containing both the 5' and 3' untranslated regions, was excised from the pBlueScript (Stratagene) derived plasmid pPSD_Fi1 (SEQ ID No:12) using the restriction endonucleases BamH I and Xho I. This 1.2 kb fragment was directionally subcloned into these same restriction sites in the multiple cloning site of the baculovirus transfer vector pBlueBac4 (Invitrogen, San Diego, CA). This produced the 6.0 kb construct pBB4/PSD, which generates a non-fusion dirigent protein with translation being initiated at the dirigent protein cDNA start codon. This construct was then co-transfected with linearized Bac-N-Blue DNA (Invitrogen) into Spodoptera frugiperda Sf9 cells by the technique of cationic liposome mediated transfection to produce, by means of homologous recombination, the recombinant Autographa californica nuclear polyhedrosis viral (AcMNPV) DNA Bac-N-Blue dirigent protein (BB/PSD), which was purified from plaques according to procedures described by Invitrogen. The final recombinant AcMNPV-BB/PSD contains the PSD gene under the polyhedrin promoter control and the essential sequence needed for replication of the recombinant virus. To verify that the dirigent protein was successfully expressed in the insect cell culture, log phase Sf9 cells infected with the AcMNPV-PSD recombinant viral high titer stock were used to obtain heterologous protein production. Maximal dirigent protein yield occurred by 48-70 hours post-infection. As determined by SDS-PAGE and (+)-pinoresinol forming activity, the protein was found secreted into the medium and showed a molecular mass and activity which corresponded to the indigenous protein originally isolated from Forsythia intermedia.

Isolation of Dirigent Protein Clones from Thuja plicata and Tsuga heterophylla
The coding region of a Forsythia dirigent protein cDNA, psd-Fi1 (SEQ ID No:12), was used to screen cDNA libraries from Thuja plicata and Tsuga heterophylla. The conditions and methods were as disclosed in Example 5, except that hybridization was carried out at 45-50°C. Two dirigent protein cDNAs were isolated from Tsuga heterophylla (SEQ ID Nos:16, 18), and eight dirigent protein cDNAs were isolated from Thuja plicata (SEQ ID Nos:20, 22, 24, 26, 28, 30, 32, 34).

Purification of Pinoresinol/lariciresinol Reductases from Forsythia Intermedia
Plant Materials. Forsythia intermedia plants were either obtained from Bailey's Nursery (var. Lynwood Gold, St. Paul, MN) and maintained in Washington State University greenhouse facilities, or were gifts from the local community.
Materials. All solvents and chemicals used were reagent or HPLC grade. Unlabeled (±)-pinoresinols and (±)-lariciresinols were synthesized as described (Katayama, T. et al., Phytochemistry 32:581-591 (1993)). [4R-3H]NADPH was obtained as previously reported (Chu, A. et al., J. Biol. Chem. 268:27026-27033 (1993)) by modification of the procedure of Moran et al. (Moran, R.G. et al., Anal. Biochem. 138:196-204 (1984)), and [4R-2H]NADPH was prepared according to Anderson and Lin (Anderson, J.A., and Lin B.K., Phytochemistry 32:811-812 (1993)). Yeast glucose-6-phosphate dehydrogenase (Type IX, 22.32 mmol h⁻¹ mg⁻¹) and yeast hexokinase (Type F300, 15.12 mmol h⁻¹ mg⁻¹) were purchased from Sigma and dihydrofolate reductase (Lactobacillus casei, 33.48 mmol h⁻¹ mg⁻¹) was obtained from Biopure Co. Affi-Gel Blue Gel (100-200 mesh) and Bio-Gel HT Hydroxyapatite were purchased from Bio-Rad, whereas Phenyl Sepharose CL-4B, MonoQ HR 5/5, MonoP HR 5/20, Superose 6, Superose 12, Superdex 75, PD-10 columns, molecular weight standards and Polybuffer 74 were obtained from Pharmacia LKB Biotechnology, Inc. Adenosine 2',5'-diphosphate Sepharose and Reactive Yellow 3 Agarose were from Sigma Chemical Co.
Instrumentation. 1H Nuclear magnetic resonance spectra (300 and 500 MHz) were recorded on Bruker AMX300 and Varian VXR500S spectrometers, respectively, using CDCl3 as solvent with chemical shifts (δ ppm) reported downfield from tetramethylsilane (internal standard). UV (including RNA and DNA determinations at OD260) and mass spectra were obtained on Lambda 6 UV/VIS and VG 7070E (ionizing voltage 70 eV) spectrophotometers, respectively. High performance liquid chromatography was carried out using either reversed-phase (Waters, Nova-pak C18, 150 × 3.9 mm inner diameter) or chiral (Daicel, Chiralcel OD or Chiralcel OC, 240 × 4.6 mm inner diameter) columns, with detection at 280 nm (Chu, A. et al., J. Biol. Chem. 268:27026-27033 (1993)). Radioactive samples were analyzed in Ecolume (ICN) and measured using a liquid scintillation counter (Packard, Tricarb 2000 CA). Amino acid sequences were obtained using an Applied Biosystems protein sequencer with on-line HPLC detection, according to the manufacturer's instructions.
Enzyme Assays. Pinoresinol and lariciresinol reductase activities were assayed by monitoring the formation of (+)-[3H]lariciresinol and (-)-[3H]secoisolariciresinol (Chu, A. et al., J. Biol. Chem. 268:27026-27033 (1993)).
Briefly, each assay for pinoresinol reductase activity consisted of (±)-pinoresinols (5 mM in MeOH, 20 µl), the enzyme preparation at the corresponding stage of purity (100 µl), and buffer (20 mM Tris-HCl, pH 8.0, 110 µl). The enzymatic reaction was initiated by addition of [4R-3H]NADPH (10 mM, 6.79 kBq/mmol in 20 µl of double-distilled H2O). After 30 min incubation at 30°C with shaking, the assay mixture was extracted with EtOAc (500 µl) containing (±)-lariciresinols (20 µg) and (±)-secoisolariciresinols (20 µg) as radiochemical carriers. After centrifugation (13,800 × g, 5 min), the EtOAc solubles were removed and the extraction procedure was repeated. For each assay, the EtOAc solubles were combined, with an aliquot (100 µl) removed for determination of its radioactivity using liquid scintillation counting. The remainder of the combined EtOAc solubles was evaporated to dryness in vacuo, reconstituted in MeOH/3% acetic acid in H2O (30:70, 100 µl) and subjected to reversed phase and chiral column HPLC.
Controls were performed using either denatured enzyme (boiled for 10 min) or in the absence of (±)-pinoresinols as substrate.
Lariciresinol reductase activity was assayed by monitoring the formation of (-)-[3H]secoisolariciresinol. These assays were carried out exactly as described above, except that (±)-lariciresinols (5 mM in MeOH, 20 µl) were used as substrates, with (±)-secoisolariciresinols (20 µg) added as radiochemical carriers.
General Procedures for Enzyme Purification. Protein purification procedures were carried out at 4°C with chromatographic eluents monitored at 280 nm, unless otherwise stated. Protein concentrations were determined by the method of Bradford (Bradford, M.M., Anal. Biochem. 72:248-254 (1976)) using γ-globulin as standard.
Polyacrylamide gel electrophoresis used gradient (4-15%, Bio-Rad) gels under denaturing and reducing conditions, these being performed in Laemmli's buffer system (Laemmli, U.K., Nature 227:680-685 (1970)). Proteins were visualized by silver staining (Morrissey, J.H., Anal. Biochem. 117:307-310 (1981)).
Preparation of crude extracts. F. intermedia stems (20 kg) were harvested, cut into 3-6 cm sections, and stored at -20°C until needed. Batches of stems (2 kg) were frozen in liquid nitrogen and pulverized in a Waring Blendor. The resulting powder was homogenized with potassium phosphate buffer (0.1 mM, pH 7.0, 4 L), containing 5 mM dithiothreitol. The homogenate was filtered through four layers of cheesecloth into a beaker containing 10% (w/v) polyvinylpolypyrrolidone. The filtrate was centrifuged (12,000 × g, 15 min). The resulting supernatant was fractionated with (NH4)2SO4, with proteins precipitating between 40 and 60% saturation recovered by centrifugation (10,000 × g, 1 h). The pellet was next reconstituted in a minimum amount of Tris-HCl buffer (20 mM, pH 8.0), containing 5 mM dithiothreitol (buffer A) and desalted using prepacked PD-10 columns (Sephadex G-25 medium) equilibrated with buffer A.
Affinity (Affi Blue Gel) Chromatography. The crude enzyme preparation (191 mg in buffer A, 5 nmol h⁻¹ mg⁻¹) was applied to an Affi Blue Gel column (2.6 × 70 cm) equilibrated in buffer A. After washing the column with 200 ml of buffer A, pinoresinol/lariciresinol reductase was eluted with a linear NaCl gradient (1.5-5 M in 300 ml) in buffer A at a flow rate of 1 ml min⁻¹. Active fractions were stored (-80°C) until needed.
Hydrophobic Interaction Chromatography (Phenyl Sepharose). After thawing, ten preparations resulting from the Affi Blue chromatography step (150 mg, 51 nmol h⁻¹ mg⁻¹) were combined and applied to a Phenyl Sepharose column (1 × 10 cm) equilibrated in buffer A, containing 5 M NaCl. The column was washed with two bed volumes of the same buffer. Pinoresinol/lariciresinol reductase was eluted using a linear gradient of decreasing concentration of NaCl (5-0 M in 40 ml) in buffer A at a flow rate of 1 ml min⁻¹. Fractions catalyzing pinoresinol/lariciresinol reduction were pooled.
Hydroxyapatite I Chromatography. Active protein (31 mg, 91 nmol h⁻¹ mg⁻¹) from the Phenyl Sepharose purification step was applied to a hydroxyapatite column (1.6 × 70 cm) equilibrated in 10 mM potassium phosphate buffer, pH 7.0, containing 5 mM dithiothreitol (buffer B). Pinoresinol/lariciresinol reductase was eluted with a linear gradient of potassium phosphate buffer, pH 7.0 (0.01-0.4 M in 200 ml) at a flow rate of 1 ml min⁻¹. Active fractions were combined. The buffer was then exchanged with buffer A using PD-10 prepacked columns.
Affinity (2',5'-ADP Sepharose) Chromatography. The enzyme solution resulting from the hydroxyapatite purification step (6.5 mg, 463 nmol h⁻¹ mg⁻¹) was next loaded on a 2',5'-ADP Sepharose (1 × 10 cm) column, previously equilibrated in buffer A containing 2.5 mM EDTA (buffer A') and then washed with 25 ml of buffer A'. Pinoresinol/lariciresinol reductase was eluted with a step gradient of NADP+ (0.3 mM in 10 ml) in buffer A' at a flow rate of 0.5 ml min⁻¹. [NAD+ (up to 3 mM) did not elute pinoresinol/lariciresinol reductase activity.] Because of the interference of the absorbance of the NADP+, it was not possible to directly monitor the eluent at 280 nm. Protein concentrations for each fraction were determined spectrophotometrically according to Bradford (Bradford, M.M., Anal. Biochem. 72:248-254 (1976)).
Hydroxyapatite II Chromatography. Fractions from the 2',5'-ADP Sepharose column that exhibited pinoresinol/lariciresinol reductase activity (0.85 mg, 1051 nmol h⁻¹ mg⁻¹) were combined and directly applied to a second hydroxyapatite column (1 × 3 cm), equilibrated in buffer B, with the enzyme eluted with a linear gradient of potassium phosphate buffer, pH 7.0 (0.01-0.4 M in 45 ml) at a flow rate of 1 ml min⁻¹.

Affinity (Affi Yellow) Chromatography - Active fractions (160 µg, 7960 nmol h⁻¹ mg⁻¹) from the second hydroxyapatite column purification step were next applied to a Reactive Yellow 3 Agarose column (1 × 3 cm), equilibrated in buffer A. Pinoresinol/lariciresinol reductase was eluted with a linear NaCl gradient (0-2.5 M in 100 ml) at a flow rate of 1 ml min⁻¹.
Fast Protein Liquid Chromatography (Superose 12 Chromatography) - Combined fractions from the Affi Yellow purification step having the highest activity (50 µg, 10,940 nmol h⁻¹ mg⁻¹) were pooled and concentrated to 1 ml, using a Centricon 10 microconcentrator (Amicon, Inc.). The enzyme solution was then applied in portions of 200 µl to a fast protein liquid chromatography column (Superose 12, HR 10/30). Gel filtration was performed in a buffer containing 20 mM Tris-HCl, pH 8.0, 150 mM NaCl and 5 mM dithiothreitol at a flow rate of 0.4 ml min⁻¹. Pinoresinol/lariciresinol reductase was eluted with 12.8 ml of the mobile phase. The active fractions which coincided with the UV profile (absorbance at 280 nm) were pooled (20 µg, 15,300 nmol h⁻¹ mg⁻¹) and desalted (PD-10 prepacked columns).
The foregoing purification protocol resulted in a 3060-fold purification of (+)-pinoresinol/(+)-lariciresinol reductase. As for many of the enzymes involved in phenylpropanoid metabolism, the protein was in very low abundance, i.e., 20 kg of F. intermedia stems yielded only ~20 µg of the purified (+)-pinoresinol/(+)-lariciresinol reductase.
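The 3060-fold figure follows directly from the specific activities quoted at each stage of the protocol. The Python sketch below recomputes the fold purification for every step, taking the crude extract (5 nmol h⁻¹ mg⁻¹) as the reference; the step labels, list layout and function name are illustrative assumptions, while the specific activities are the values stated in the text.

```python
# Specific activity (nmol h^-1 mg^-1) after each step, as quoted in the text.
STEPS = [
    ("Crude extract (40-60% ammonium sulfate cut)", 5),
    ("Affi Blue Gel", 51),
    ("Phenyl Sepharose", 91),
    ("Hydroxyapatite I", 463),
    ("2',5'-ADP Sepharose", 1051),
    ("Hydroxyapatite II", 7960),
    ("Affi Yellow", 10940),
    ("Superose 12", 15300),
]


def fold_purification(steps):
    """Fold purification of each step relative to the first entry."""
    reference = steps[0][1]
    return [(name, activity / reference) for name, activity in steps]


for name, fold in fold_purification(STEPS):
    print(f"{name}: {fold:.1f}-fold")
```

The last entry evaluates to 15,300 / 5 = 3060, matching the value stated above; the largest single gain (about 7.6-fold) comes from the second hydroxyapatite column.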

Characterization of Purified Pinoresinol/lariciresinol Reductases from Forsythia Intermedia
Isoelectric Focussing and pI Determination. In all stages of the purification protocol, (+)-pinoresinol/(+)-lariciresinol reductase activities coeluted. Given this observation, it was essential to unambiguously ascertain whether more than one form of the protein existed, i.e., whether one form of the protein catalyzed the reduction of pinoresinol, and another form of the protein catalyzed the reduction of lariciresinol.
To this end, the isoelectric point of pinoresinol/lariciresinol reductase was estimated by chromatofocussing on a MonoP HR 5/20 FPLC column. Active fractions from the Superose 12 gel filtration column (Example 1) were pooled and the buffer exchanged with 25 mM Bis-Tris, pH 7.1, using prepacked PD-10 columns, equilibrated in the same buffer. The preparation so obtained was loaded on the chromatofocussing column and a pH gradient between 7.1 and 3.9 was formed, using Polybuffer 74 as eluent at a flow rate of 0.5 ml min⁻¹. Aliquots (200 µl) of each fraction were assayed for pinoresinol/lariciresinol reductase activities. The remainder of the fractions was used to determine the pH gradient.
Molecular Weight Determination. Application of the pinoresinol/lariciresinol reductase preparation from the MonoP HR 5/20 FPLC column to SDS-gradient gel electrophoresis (4-15% polyacrylamide) revealed the presence of two protein bands of similar apparent molecular weight, whose separation was achieved via anion exchange chromatography on a MonoQ HR 5/5 FPLC matrix. Pooled fractions from the Superose 12 purification step (Example 1) were applied to a MonoQ HR 5/5 column (Pharmacia), equilibrated in buffer A. The column was washed with 10 ml of buffer A and pinoresinol/lariciresinol reductase activity was eluted using a linear NaCl gradient (0-500 mM in 50 ml) in buffer A at a flow rate of 0.5 ml min-1.
Aliquots (30 µl) of the collected fractions were analyzed by SDS polyacrylamide gel electrophoresis, using a gradient (4-15% acrylamide) gel. Proteins were visualized by silver staining. Active fractions 34 through 37 (27,760 nmol h-1 mg-1) and 38 through 41 (30,790 nmol h-1 mg-1) were pooled separately and immediately used for characterization.
The two protein bands thus resolved under denaturing conditions had apparent molecular masses of ~36 and ~35 kDa, respectively. Each of the two reductase forms had a pI of ~5.7.
Native molecular weights of each reductase isoform were estimated via comparison of their elution behavior on Superose 12, Superose 6 and Superdex gel filtration FPLC columns with the elution behavior of calibrated molecular weight standards. Gel filtration was carried out as set forth in Example 8. For each reductase, an apparent native molecular weight of 59,000 was calculated based on its elution volume, in contrast to the values of 36,000 and 35,000 obtained by SDS-polyacrylamide gel electrophoresis. While the reason for the discrepancy between the molecular weights from gel filtration and SDS-PAGE remains unknown, it can tentatively be proposed that, although the native protein likely exists as a dimer, it could also be a monomer of asymmetric shape, thereby altering its effective Stokes radius (Cantor, C.R., and Schimmel, P.R., Biophysical Chemistry, Part II, W.H. Freeman and Company, San Francisco, CA (1980); Stellwagen, E., Methods in Enzymology 182:317-328 (1990)), as reported for human thioredoxin reductase (Oblong, J.E., et al., Biochemistry 32:7271-7277 (1993)) and yeast metalloendopeptidase (Hrycyna, C.A., and Clarke, S., Biochemistry 32:11293-11301 (1993)).
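The ambiguity between a dimer and an asymmetric monomer can be seen by comparing the native and subunit masses directly. The sketch below is illustrative only: it uses the masses reported above and simply computes the ratio of the gel filtration estimate to each SDS-PAGE estimate; the variable names are assumptions.

```python
# Apparent masses reported in the text (Da).
native_mass = 59_000
subunit_masses = {"36 kDa form": 36_000, "35 kDa form": 35_000}

# A ratio near 1 would indicate a compact monomer and a ratio near 2 a dimer.
ratios = {name: native_mass / mass for name, mass in subunit_masses.items()}

for name, ratio in ratios.items():
    print(f"{name}: native/subunit = {ratio:.2f}")
```

Both ratios fall between 1 and 2, so the gel filtration result alone does not distinguish the two interpretations offered in the text.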

pH and Temperature Optima. To determine the pH optimum of pinoresinol/lariciresinol reductase, the enzyme preparation from the Superose 12 gel filtration step (Example 8) was assayed utilizing standard assay conditions (Example 8), except that the buffer was replaced with 50 mM Bis-Tris Propane buffer in the pH range of 6.3 to 9.4. The pH optimum was found to be pH 7.4.
The temperature optimum of pinoresinol/lariciresinol reductase was examined in the range between 4°C and 80°C under standard assay conditions (Example 8) utilizing the enzyme preparation from the gel filtration step (Example 8). At optimum pH, the temperature optimum for the reductase activity was established to be ~30°C.
Kinetic Parameters. Velocity studies were carried out to ascertain whether the two reductase isoforms catalyzed distinct reductions, i.e., the conversion of (+)-pinoresinol to (+)-lariciresinol, and of (+)-lariciresinol to (-)-secoisolariciresinol, respectively, or whether either displayed a preference for (+)-pinoresinol or (+)-lariciresinol as substrate. The initial velocity studies were carried out individually utilizing the two isoforms of the enzyme, and individually employing both (+)-pinoresinol and (+)-lariciresinol as substrates. Initial velocity studies were performed in triplicate experiments, using 50 mM Bis-Tris Propane buffer, pH 7.4, containing 5 mM dithiothreitol, pure enzyme (after MonoQ anion-exchange chromatography), and ten different substrate concentrations (between 8.8 and 160 µM) at a constant NADPH concentration (80 µM). Incubations were carried out at 30°C for 10 min (within the linear kinetic range). Kinetic parameters were determined from Lineweaver-Burk plots.
Importantly, the kinetic parameters were essentially the same for both the 35 kDa and the 36 kDa forms of the enzyme (i.e., Km for pinoresinol: 27 ± 1.5 µM for the 35 kDa form of the enzyme and 23 ± 1.3 µM for the 36 kDa form of the enzyme; Km for lariciresinol: 121 ± 5.0 µM for the 35 kDa form of the enzyme and 123 ± 6.0 µM for the 36 kDa form of the enzyme). In an analogous manner, apparent maximum velocities (expressed as µmol h-1 mg-1 of protein) were also essentially identical (i.e., Vmax for pinoresinol: 16.2 ± 0.4 for the 35 kDa form of the enzyme and 17.3 ± 0.5 for the 36 kDa form of the enzyme; for lariciresinol: 25.2 ± 0.7 for the 35 kDa form of the enzyme and 29.9 ± 0.7 for the 36 kDa form of the enzyme). Thus, all available evidence suggests that (+)-pinoresinol/(+)-lariciresinol reductase exists as two isoforms, with each capable of catalyzing the reduction of both substrates.
How this reduction is carried out, i.e., whether both reductions are done in tandem, in either quinone or furano ring form, awaits further study using a more abundant protein source.
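The Lineweaver-Burk procedure used for the kinetic parameters can be sketched as a linear regression of 1/v against 1/[S]. The example below is a minimal illustration, not the analysis actually performed: the ten substrate concentrations are a hypothetical spacing within the stated 8.8 to 160 µM range, and the velocities are synthetic, noise-free values generated from the Michaelis-Menten equation using the reported Km and Vmax for pinoresinol with the 35 kDa form.

```python
def michaelis_menten(s, km, vmax):
    """Initial velocity predicted by the Michaelis-Menten equation."""
    return vmax * s / (km + s)


def lineweaver_burk(substrate, velocity):
    """Estimate (Km, Vmax) from a least-squares line through 1/v versus 1/[S].

    The double-reciprocal form is 1/v = (Km/Vmax) * (1/[S]) + 1/Vmax.
    """
    xs = [1.0 / s for s in substrate]
    ys = [1.0 / v for v in velocity]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    vmax = 1.0 / intercept
    km = slope * vmax
    return km, vmax


# Hypothetical concentration series (µM); the text gives only the range and count.
concentrations = [8.8, 12.0, 16.0, 20.0, 30.0, 40.0, 60.0, 80.0, 120.0, 160.0]

# Synthetic velocities (µmol h-1 mg-1), not measured data.
velocities = [michaelis_menten(s, km=27.0, vmax=16.2) for s in concentrations]

km_fit, vmax_fit = lineweaver_burk(concentrations, velocities)
print(f"Km = {km_fit:.1f} µM, Vmax = {vmax_fit:.1f} µmol h-1 mg-1")
```

With real, noisy measurements the double-reciprocal transform weights low-concentration points heavily, so a direct nonlinear fit would often be preferred today; the linear form is shown here because it is the method named in the text.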
Enzymatic Formation of (+)-[7'R-2H]Lariciresinol. Since the two (+)-pinoresinol/(+)-lariciresinol reductase isoforms exhibited essentially identical catalytic characteristics, the Superose 12 enzyme preparation (Example 8), containing both isoforms, was used to examine the stereospecificity of the hydride transfer. The strategy adopted utilized selective deuterium labeling, using [4R-2H]NADPH as cofactor for the reduction of (+)-pinoresinol, with the enzymatic product, (+)-lariciresinol, being analyzed by 1H NMR and mass spectroscopy. Thus, a solution of (±)-pinoresinols (5.2 mM in MeOH, 4 ml) was added to Tris-HCl buffer (20 mM, pH 8.0, containing 5 mM dithiothreitol, 22 ml) and stereospecifically deutero-labeled [4R-2H]NADPH (20 mM in H2O, 4 ml) prepared via the method of Anderson and Lin (Anderson, J.A., and Lin, B.K., Phytochemistry 32:811-812 (1993)), with the whole added to the enzyme preparation (20 ml). After incubation at 30°C for 1 h with shaking, the assay mixture was extracted with EtOAc (2 x 50 ml). The EtOAc-soluble fractions were combined, washed with saturated NaCl (50 ml), dried (Na2SO4), and evaporated to dryness in vacuo. The resulting extract was reconstituted in a minimum amount of EtOAc, applied to a silica gel column (0.5 x 7 cm), and eluted with EtOAc/hexanes (1:2). Fractions containing the enzymatic product were combined and evaporated to dryness.
The enzymatic product was established to be (+)-[7'R-2H]lariciresinol, as evidenced by the disappearance of the 7'-proR proton at δ 2.51 ppm due to its replacement by deuterium and by its molecular ion at m/z 361 (M+ + 1) corresponding to the presence of one deuterium atom at C-7'. 1H NMR (300 MHz) (CDCl3): 2.39 (m, 1H, C8H), 2.71 (m, 1H, C8'H), 2.88 (d, 1H, J7'S,8' = 5.0 Hz, C7'HS), 3.73 (dd, 1H, J8',9'b = 7.0 Hz, J9'a,9'b = 8.5 Hz, C9'Hb), 3.76 (dd, 1H, J8,9S = 6.5 Hz, J9R,9S = 8.5 Hz, C9HS), 3.86 (s, 3H, OCH3), 3.88 (s, 3H, OCH3), 3.92 (dd, 1H, J8,9R = 6.0 Hz, J9R,9S = 9.5 Hz, C9HR), 4.04 (dd, 1H, J8',9'a = 7.0 Hz, J9'a,9'b = 8.5 Hz, C9'Ha), 4.77 (d, 1H, J7,8 = 6.6 Hz, C7H), 6.68-6.70 (m, 2H, ArH), 6.75-6.85 (m, 4H, ArH); MS m/z (%): 361 (M+ + 1, 71.2), 360 (M+, 31.1), 237 (11.1), 153 (41.5), 152 (20.2), 151 (67.0), 138 (100), 137 (71.1).
Thus, hydride transfer from (+)-pinoresinol to (+)-lariciresinol had occurred in a manner whereby only the 7'-proR hydrogen position of (+)-lariciresinol was deuterated. An analogous result was observed for the conversion of (+)-lariciresinol into (-)-secoisolariciresinol, thereby establishing that the overall hydride transfer was completely stereospecific.

Amino Acid Sequence Analysis of Purified Pinoresinol/Lariciresinol Reductase from Forsythia intermedia

Pinoresinol/Lariciresinol Reductase Amino Acid Sequencing. The (+)-pinoresinol/(+)-lariciresinol reductase N-terminal amino acid sequence was obtained from each of the purified proteins, and from a mixture of both, using an Applied Biosystems protein sequencer with on-line HPLC detection. The N-terminal sequence was the same for both isoforms (SEQ ID No:36).
For trypsin digestion, 150 pmol of the enzyme purified from the Superose 12 column (Example 8) was suspended in 0.1 M Tris-HCl (50 µl, pH 8.5), with urea added to give a final concentration of 8 M in 77.5 µl. The mixture was incubated for [...] min at 50°C, then 100 mM iodoacetamide (2.5 µl) was added, with the whole kept at room temperature for 15 min. Trypsin (1 µg in 20 µl) was then added, with the mixture digested for 24 h at 37°C, after which TFA (4 µl) was added to stop the enzymatic reaction.
The resulting mixture was subjected to reversed phase HPLC analysis (C-8 column, Applied Biosystems), this being eluted with a linear gradient over 2 h from 0 to 100% acetonitrile (in 0.1% TFA) at a flow rate of 0.2 ml/min with detection at 280 nm. Fractions containing individual oligopeptide peaks were collected manually and directly submitted to amino acid sequencing. Four tryptic fragments were resolved in sufficient quantity to permit amino acid sequence determination (SEQ ID Nos:37-40).
Cyanogen bromide digestion was performed by incubation of 150 pmol of the reductase purified from the Superose 12 column (Example 8) with 0.5 M cyanogen bromide in 70% formic acid for 40 h at 37°C, following which the cyanogen bromide and formic acid were removed by centrifugation under reduced pressure (SpeedVac).
The resulting oligopeptide fragments were separated by HPLC and three were resolved in sufficient quantity to permit sequencing (SEQ ID Nos:41-43).

Cloning of Pinoresinol/Lariciresinol Reductase from Forsythia intermedia

Plant Materials. Forsythia intermedia plants were either obtained from Bailey's Nursery (var. Lynwood Gold, St. Paul, MN) and maintained in Washington State University greenhouse facilities, or were gifts from the local community.

Materials. All solvents and chemicals used were reagent or HPLC grade.
UV RNA and DNA determinations at OD260 were obtained on a Lambda 6 UV/VIS spectrophotometer. A Temptronic II thermocycler (Thermolyne) was used for all PCR amplifications. Taq thermostable DNA polymerase was obtained from Promega, whereas restriction enzymes were from Gibco BRL (HaeIII), Boehringer Mannheim (Sau3A) and Promega (TaqI). pT7Blue T-vector and competent NovaBlue cells were purchased from Novagen, and radiolabeled nucleotides ([α-32P]dCTP and [γ-32P]ATP) were from DuPont NEN.
Oligonucleotide primers for polymerase chain reaction (PCR) and sequencing were synthesized by Gibco BRL Life Technologies. GENECLEAN II® kits (BIO 101 Inc.) were used for purification of PCR fragments, with the gel-purified DNA concentrations determined by comparison to a low DNA mass ladder (Gibco BRL) in 1.5% agarose gels.
Forsythia RNA Isolation. Initial attempts to isolate functional F. intermedia RNA from fast-growing, green stem tissue were unsuccessful, owing to facile oxidation by the plant's phenolic constituents. This problem was overcome by use of an RNA isolation procedure, specifically designed for woody plant tissue, which uses low pH and reducing conditions in the extraction buffer to prevent oxidation (Dong, Z.D., and Dunstan, D.I., Plant Cell Reports 15:516-521 (1996)).
Forsythia intermedia Stem cDNA Library Synthesis. Total RNA (~300 µg/g fresh weight) was obtained from young green stems of greenhouse-grown Forsythia intermedia plants (var. Lynwood Gold) (Dong, Z.D., and Dunstan, D.I., Plant Cell Reports 15:516-521 (1996)). A Forsythia intermedia stem cDNA library was constructed using 5 µg of purified poly A+ mRNA (Oligotex-dT™ Suspension, QIAGEN) with the ZAP-cDNA® synthesis kit, the Uni-ZAP™ XR vector and the Gigapack® II Gold packaging extract (Stratagene), with a titer of 1.2 x 10^6 PFU for the primary library. A portion (30 ml) of the amplified library (1.2 x 10^10 PFU/ml; 158 ml total) was used to obtain pure cDNA library DNA for PCR (Sambrook, J., et al., Molecular Cloning: A Laboratory Manual, 3 volumes, 3rd Ed., Cold Spring Harbor Laboratory, Cold Spring Harbor, NY (1994); Ausubel, F.M., et al., Current Protocols in Molecular Biology, 2 volumes, Greene Publishing Associates and Wiley-Interscience, John Wiley & Sons, NY (1991)).
Pinoresinol/Lariciresinol Reductase DNA Probe Synthesis. The N-terminal and internal peptide amino acid sequences were used to construct the degenerate oligonucleotide primers. Specifically, the primer PLRNS (SEQ ID No:44) was based on the sequence of amino acids 7 to 13 of the N-terminal peptide (SEQ ID No:36).
The primer PLR14R (SEQ ID No:45) was based on the sequence of amino acids 2 to 8 of the internal peptide sequence set forth in SEQ ID No:37. The primer PLR15R (SEQ ID No:46) was based on the sequence of amino acids 9 to 15 of the internal peptide sequence set forth in SEQ ID No:37. The latter stretch of amino acids also corresponded to the sequence of amino acids 4 to 10 of the cyanogen bromide-generated internal fragment set forth in SEQ ID No:41.
Purified F. intermedia cDNA library DNA (5 ng) was used as the template in 100 µl PCR reactions (10 mM Tris-HCl [pH 9.0], 50 mM KCl, 0.1% Triton X-100, 2.5 mM MgCl2, 0.2 mM each dNTP and 2.5 units Taq DNA polymerase) with primer PLRNS (SEQ ID No:44) (100 pmol) and either primer PLR15R (SEQ ID No:46) (20 pmol) or primer PLR14R (SEQ ID No:45) (20 pmol). PCR amplification was carried out in a thermocycler as follows: 35 cycles of 1 min at 94°C, 2 min at 50°C and 3 min at 72°C; with 5 min at 72°C and an indefinite hold at 4°C after the final cycle. Single-primer, template-only and primer-only reactions were performed as controls. PCR products were resolved in 1.5% agarose gels. The combination of primers PLRNS (SEQ ID No:44) and PLR14R (SEQ ID No:45) yielded a single band of 380 bp corresponding to bases 22 to 393 of SEQ ID No:47. The combination of primers PLRNS (SEQ ID No:44) and PLR15R (SEQ ID No:46) yielded a single band of 400 bp corresponding to bases 22 to 423 of SEQ ID No:47.
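The relationship between the quoted band sizes and the sequence coordinates can be checked with a short calculation. This is a minimal sketch assuming inclusive, 1-based base numbering; the function names are illustrative, and the program duration ignores ramp times and the final 4°C hold.

```python
def amplicon_length(first_base: int, last_base: int) -> int:
    """Length of a product spanning inclusive, 1-based coordinates."""
    return last_base - first_base + 1


def program_minutes(cycles: int, step_minutes: tuple, final_extension: int) -> int:
    """Programmed hold time of a cycling protocol, excluding temperature ramps."""
    return cycles * sum(step_minutes) + final_extension


# Coordinates within SEQ ID No:47 as given in the text.
products = {
    "PLRNS + PLR14R": amplicon_length(22, 393),
    "PLRNS + PLR15R": amplicon_length(22, 423),
}

# 35 cycles of 1 min, 2 min and 3 min, followed by a 5 min final extension.
run_time = program_minutes(35, (1, 2, 3), 5)

for primers, length in products.items():
    print(f"{primers}: {length} bp")
print(f"programmed run time: {run_time} min")
```

The computed lengths differ slightly from the nominal band sizes, which is expected because sizes read from an agarose gel are approximate.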
To determine the nucleotide sequence of the two amplified bands, five 100 µl PCR reactions were performed as above with each of the following combinations of template and primers: 380 bp amplified product plus primers PLRNS (SEQ ID No:44) and PLR14R (SEQ ID No:45); 400 bp amplified product plus primers PLRNS (SEQ ID No:44) and PLR15R (SEQ ID No:46). The five reactions from each combination of primers and template were concentrated (Microcon 30, Amicon Inc.) and washed with TE buffer (10 mM Tris-HCl, pH 8.0, 1 mM EDTA; 2 x 200 µl), with the PCR products subsequently recovered in TE buffer (2 x 50 µl).
These were resolved in preparative 1.5% agarose gels. Each gel-purified PCR product (~0.2 pmol) was then ligated into the pT7Blue T-vector and transformed into competent NovaBlue cells, according to Novagen's instructions. Insert sizes were determined using the rapid boiling lysis and PCR technique, utilizing R20mer (SEQ ID No:74) and U19mer (SEQ ID No:75) primers, according to the manufacturer's (Novagen's) instructions.
Restriction analysis was performed to determine whether all inserts for each combination of primers and template were the same. Restriction analysis was carried out as follows: each of the inserts was amplified by PCR utilizing the R20 (SEQ ID No:74) and U19 (SEQ ID No:75) primers. To 20 µl portions of a 100 µl PCR reaction were added 4 units HaeIII, 1.5 units Sau3A or 5 units TaqI restriction enzyme. Restriction digestions were allowed to proceed for 60 min at 37°C for HaeIII and Sau3A and at 65°C for TaqI reactions. Restriction products were resolved in 1.5% agarose gels, giving one restriction group for all inserts tested.
Five of the resulting recombinant plasmids were selected for DNA sequencing. The inserts from three of the recombinant plasmids (called pT7PLR1-pT7PLR3) were generated by a combination of primers PLRNS (SEQ ID No:44) and PLR15R (SEQ ID No:46) with the 400 bp PCR product as substrate. The inserts from the remaining two recombinant plasmids (called pT7PLR4 and pT7PLR5) were generated from a combination of primers PLRNS (SEQ ID No:44) and PLR14R (SEQ ID No:45) and the 380 bp PCR product as substrate. All five sequenced PCR products contained the same open reading frame.
The (+)-pinoresinol/(+)-lariciresinol reductase probe was constructed as follows: five 100 µl PCR reactions were performed as described above with 10 ng pT7PLR3 DNA with primers PLRNS (SEQ ID No:44) and PLR15R (SEQ ID No:46).
Gel-purified pT7PLR3 cDNA insert (50 ng) was used with Pharmacia's T7QuickPrime® kit and [α-32P]dCTP, according to kit instructions, to produce a radiolabeled probe (in 0.1 ml), which was purified over BioSpin 6 columns (Bio-Rad) and added to carrier DNA (0.9 ml of 0.5 mg/ml sheared salmon sperm DNA obtained from Sigma).
Library Screening. 600,000 PFU of F. intermedia amplified cDNA library were plated for primary screening, according to Stratagene's instructions.
Plaques were blotted onto Magna Nylon membrane circles (Micron Separations Inc.), which were then allowed to air dry. The membranes were placed between two layers of Whatman® 3MM Chr paper. cDNA library phage DNA was fixed to the membranes and denatured in one step by autoclaving for 2 min at 100°C with fast exhaust. The membranes were washed for 30 min at 37°C in 6X standard saline citrate (SSC) and 0.1% SDS and prehybridized for 5 h with gentle shaking at 57-58°C in preheated 6X SSC, 0.5% SDS and 5X Denhardt's reagent (hybridization solution, 300 ml) in a crystallization dish (190x75 mm).
The [32P]radiolabeled probe was denatured (boiling, 10 min), quickly cooled (ice, 15 min) and added to a preheated fresh hybridization solution (60 ml, 58°C) in a crystallization dish (150x75 mm). The prehybridized membranes were next added to this dish, which was then covered with plastic wrap. Hybridization was performed for 18 h at 57-58°C with gentle shaking. The membranes were washed in [...] and 0.5% SDS for 5 min at room temperature, transferred to 2X SSC and 0.5% SDS (at room temperature) and incubated at 57-58°C for 20 min with gentle shaking, wrapped with plastic wrap to prevent drying and finally exposed to Kodak X-OMAT AR film for 24 h at -80°C with intensifying screens.
This screening procedure resulted in more than 350 positive plaques, with twenty (of different signal intensities) being subjected to two additional rounds of screening. After final purification, six of the twenty cDNAs were subcloned by in vivo excision into pBluescript. These six cDNAs were called plr-Fi1 to plr-Fi6 (SEQ ID Nos:47, 49, 51, 53, 55, 57).
In vivo Excision and Sequencing of plr-Fi1 to plr-Fi6 Phagemids. The six purified cDNA clones were rescued from the phage following Stratagene's in vivo excision protocol. Both strands of the six different cDNAs (plr-Fi1 to plr-Fi6) that coded for (+)-pinoresinol/(+)-lariciresinol reductase were completely sequenced using overlapping sequencing primers.
Purification of DNA for sequencing employed a QIAwell Plus plasmid purification system (QIAGEN) followed by PEG precipitation (Sambrook, J., Molecular Cloning: A Laboratory Manual, 3 volumes, 3rd Ed., Cold Spring Harbor Laboratory, Cold Spring Harbor, NY (1994)), with DNA sequences determined using an Applied Biosystems Model 373A automated sequencer. DNA and amino acid sequence analyses were performed using the Unix-based GCG Wisconsin Package (Program Manual for the Wisconsin Package, Version 8, September 1994, Genetics Computer Group, 575 Science Drive, Madison, Wisconsin, USA 53711; Rice, P., Program Manual for the EGCG Package, Peter Rice, The Sanger Centre, Hinxton Hall, Cambridge, CB10 1RQ, England (1996)) and the ExPASy World Wide Web molecular biology server (Geneva University Hospital and University of Geneva, Geneva, Switzerland).
All six cDNAs had the same coding region but different 5'-untranslated regions. On the other hand, analysis of the 3'-untranslated region of each of the six cDNAs established that all were truncated versions of the longest cDNA's 3'-region.
Preliminary RNA gel blot analysis with total RNA from greenhouse-grown plant stem tips confirmed a single transcript with a length of approximately 1.2 kb.
RNA Gel Blot Analysis. For RNA gel blot analysis, total RNA (30 µg per lane) from F. intermedia stem tips was separated by size by denaturing agarose gel electrophoresis. The RNA was transferred to charged nylon membranes (GeneScreen Plus®, DuPont NEN), cross-linked to the membrane (Stratalinker from Stratagene), prehybridized, hybridized with the same probe used to screen the cDNA library during cDNA cloning, and washed according to the manufacturer's instructions for aqueous hybridization conditions. The membrane was then exposed to Kodak X-OMAT film for 48 h at -80°C with intensifying screens.

Expression of (+)-Pinoresinol/(+)-Lariciresinol Reductase cDNA plr-Fi1 in E. coli

Expression in Escherichia coli. In order to confirm that the putative (+)-pinoresinol/(+)-lariciresinol reductase cDNAs encoded functional (+)-pinoresinol/(+)-lariciresinol reductase, the cDNAs were heterologously expressed in E. coli.
Heterologous expression was also necessary in order to obtain sufficient protein to enable the systematic study of the precise biochemical mechanism of (+)-pinoresinol/(+)-lariciresinol reductase at a future date.
Examination of the six putative (+)-pinoresinol/(+)-lariciresinol reductase clones revealed that one, plr-Fi1 (SEQ ID No:47), was in frame with the α-complementation particle of β-galactosidase in pBluescript. This was fortuitous, since it potentially provided a facile means to express the fully functional fusion protein, and hence to provide proof that the cloned sequence was correct.
Purified plasmid DNA from plr-Fi1 (SEQ ID No:47) was transformed into NovaBlue cells according to Novagen's instructions. Transformed cells (5 ml cultures) were grown at 37°C with shaking (225 rpm) to mid log phase (OD600 = 0.5) in LB medium (Sambrook, J., Molecular Cloning: A Laboratory Manual, 3 volumes, 3rd Ed., Cold Spring Harbor Laboratory, Cold Spring Harbor, NY (1994)) supplemented with 12.5 µg ml-1 tetracycline and 50 µg ml-1 ampicillin. IPTG (isopropyl β-D-thiogalactopyranoside) was then added to a final concentration of 10 mM, and the cells were allowed to grow for 2 h. Cells were collected by centrifugation and resuspended in 500 µl (per 5 ml culture tube) buffer (20 mM Tris-HCl, pH 8.0, 5 mM dithiothreitol). Lysozyme (5 µl of 0.1 mg ml-1, Research Organics, Inc.) was next added and, following incubation for 10 min, the cells were lysed by sonication (3 x 15 s). After centrifugation at 14,000 x g at 4°C for 10 min, the supernatant was removed and assayed for (+)-pinoresinol/(+)-lariciresinol reductase activity (210 µl supernatant per assay) as described in Example 8.
Catalytic activity was established by incubating cell-free extracts for 2 h at 30°C with (±)-pinoresinols (0.4 mM) and [4R-3H]NADPH (0.8 mM) under standard conditions. Following incubation, unlabeled (±)-lariciresinols and (±)-secoisolariciresinols were added as radiochemical carriers, with each lignan isolated by reversed-phase HPLC. Controls included assays of a pinoresinol/lariciresinol reductase cDNA which contains an out-of-frame cDNA insert, with all assay components, as well as plr-Fi1 (SEQ ID No:47) and an out-of-frame pinoresinol/lariciresinol reductase cDNA with no substrate except [4R-3H]NADPH. Separation of products and chiral identification were performed by HPLC as previously described (Chu, A., et al., J. Biol. Chem. 268:27026-27033 (1993)).
Subsequent chiral HPLC analysis revealed that both (+)-lariciresinol and (-)-secoisolariciresinol, but not the corresponding antipodes, were radiolabeled (total activity: 54 nmol h-1 mg-1). By contrast, no catalytic activity was detected either in the absence of (±)-pinoresinols, or when control cells were used which contained a plasmid in which the cDNA insert was not in frame with the β-galactosidase gene.
Thus, the heterologously expressed (+)-pinoresinol/(+)-lariciresinol reductase and the plant protein function in precisely the same enantiospecific manner.

Sequence and Homology Analysis of the cDNA Insert of Clone plr-Fi1 (SEQ ID No:47) Encoding (+)-Pinoresinol/(+)-Lariciresinol Reductase

Sequence Analysis. The full length sequence of the cloned (+)-pinoresinol/(+)-lariciresinol reductase plr-Fi1 (SEQ ID No:47) contained all of the peptide sequences determined by Edman degradation of digest fragments.
The single ORF predicts a polypeptide of 312 amino acids (SEQ ID No:48) with a calculated molecular mass of 34.9 kDa, in close agreement with the value (~35 or ~36 kDa) estimated previously by SDS-PAGE for the two isoforms of (+)-pinoresinol/(+)-lariciresinol reductase. An equal number of acidic and basic residues are also present, with a theoretical isoelectric point (pI) of 7.09, in contrast to that experimentally obtained by chromatofocussing (pI ~5.7).
The amino acid composition reveals seven methionine residues.
Interestingly, the N-terminus of the plant-purified enzyme lacks the initial methionine, this being the most common post-translational protein modification known. Consequently, the first methionine in the cDNA can be considered to be the site of translational initiation. The sequence analysis also reveals a possible N-glycosylation site at residue 215 (although no secretory targeting signal is present), and seven possible protein phosphorylation sites at residues 50 and 228 (protein kinase C-type), residues 228, 250, 302 and 303 (casein kinase II-type) and residue 301 (tyrosine kinase type).
Regions of the pinoresinol/lariciresinol reductase polypeptide chain (SEQ ID No:48) were also identified that contained conserved sequences associated with NADPH binding (Jornvall, H., in Dehydrogenases Requiring Nicotinamide Coenzymes (Jeffery, J., ed.) pp. 126-148, Birkhauser Verlag, Basel (1980); Branden, C., and Tooze, J., Introduction to Protein Structure, pp. 141-159, Garland Publishing, Inc., New York and London (1991); Wierenga, R.K., et al., J. Mol. Biol. 187:101- (1986)). There is a limited number of invariant amino acids in the sequences of different reductases which are viewed as indicative of NADPH binding sites.
These include three conserved glycine residues with the sequence G-X-G-X-X-G (SEQ ID No:76), where X is any residue, and six conserved hydrophobic residues.
The glycine-rich region is considered to play a central role in positioning the NADPH in its correct conformation. In this regard, a comparison of the N-terminal region of (+)-pinoresinol/(+)-lariciresinol reductase with the conserved NADPH-binding regions of Drosophila melanogaster alcohol dehydrogenase (Branden, C., and Tooze, J., Introduction to Protein Structure, pp. 141-159, Garland Publishing, Inc., New York and London (1991)), Pinus taeda cinnamyl alcohol dehydrogenase (MacKay, J.J., et al., Mol. Gen. Genet. 247:537-545 (1995)), dogfish muscle lactate dehydrogenase (Branden, C., and Tooze, J., Introduction to Protein Structure, pp. 141-159, Garland Publishing, Inc., New York and London (1991)) and human erythrocyte glutathione reductase (Branden, C., and Tooze, J., Introduction to Protein Structure, pp. 141-159, Garland Publishing, Inc., New York and London (1991)) revealed some interesting parallels. The invariant glycine residues are aligned in every case, as are four of the six hydrophobic residues required for the correct packing in the formation of the domain. Hence, the NADPH-binding site of the (+)-pinoresinol/(+)-lariciresinol reductase isoforms is localized close to the N-terminus.
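A search for the G-X-G-X-X-G motif described above can be sketched with a regular expression. The example is illustrative only: the peptide string is an invented N-terminal fragment used to exercise the function, not SEQ ID No:48, and the function name is an assumption.

```python
import re

# G-X-G-X-X-G, where X is any residue. The lookahead allows overlapping hits.
NADPH_MOTIF = re.compile(r"(?=(G.G..G))")


def find_nadph_motifs(protein: str):
    """Return (1-based start position, matched hexapeptide) for each motif hit."""
    return [(m.start() + 1, m.group(1)) for m in NADPH_MOTIF.finditer(protein.upper())]


# Hypothetical sequence for demonstration; not taken from the patent.
example_fragment = "MGKSKVLIIGGTGYLGRRLVKASLA"

hits = find_nadph_motifs(example_fragment)
for position, hexapeptide in hits:
    print(f"motif {hexapeptide} starts at residue {position}")
```

A hit near the start of the chain, as in this invented fragment, would be consistent with the N-terminal localization of the NADPH-binding site described in the text. The motif alone is short enough to occur by chance, which is why the text also considers the flanking hydrophobic residues.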
Homology Analysis: Comparison to Isoflavone Reductase. A BLAST search (Altschul, S.F., et al., J. Mol. Biol. 215:403-410 (1990)) was conducted with the translated amino acid sequence of (+)-pinoresinol/(+)-lariciresinol reductase (SEQ ID No:48) against the non-redundant peptide database at the National Center for Biotechnology Information. Significant homology was noted for (+)-pinoresinol/(+)-lariciresinol reductase with various isoflavone reductases from the legumes Cicer arietinum (Tiemann, K., et al., Eur. J. Biochem. 200:751- (1991)) (63.5% similarity, 44.4% identity), Medicago sativa (Paiva, N.L., et al., Plant Mol. Biol. 17:653-667 (1991)) (62.6% similarity, 42.0% identity) and Pisum sativum (Paiva, N.L., et al., Arch. Biochem. Biophys. 312:501-510 (1994)) (61.6% similarity, 41.3% identity). This observation is of considerable interest since isoflavonoids are formed via a related branch of phenylpropanoid-acetate pathway metabolism. Specifically, isoflavone reductases catalyze the reduction of α,β-unsaturated ketones during isoflavonoid formation. For example, the Medicago sativa L. isoflavone reductase catalyzes the stereospecific conversion of 2'-hydroxyformononetin to (3R)-vestitone in the biosynthesis of the phytoalexin (-)-medicarpin (Paiva, N.L., et al., Plant Mol. Biol. 17:653-667 (1991)). This sequence similarity may be significant given that both lignans and isoflavonoids are offshoots of general phenylpropanoid metabolism, with comparable plant defense functions and pharmacological roles, e.g., as "phytoestrogens". Consequently, since both reductases catalyze very similar reactions, it is tempting to speculate that the isoflavone reductases may have evolved from (+)-pinoresinol/(+)-lariciresinol reductase. This is considered likely since the lignans are present in the pteridophytes, hornworts, gymnosperms and angiosperms; hence their pathways apparently evolved prior to those of the isoflavonoids (Gang et al., in Phytochemicals for Pest Control, Hedin et al., eds, ACS Symposium Series, Washington D.C., 658:58-59 (1997)).
Comparable homology was also observed with putative isoflavone reductase "homologs" from Arabidopsis thaliana (Babiychuk, E., et al., Direct Submission (25-MAY-1995) to the EMBL/GenBank/DDBJ databases (1995)) (65.9% similarity, 50.8% identity), Nicotiana tabacum (Hibi, N., et al., Plant Cell 6:723-735 (1994)) (64.6% similarity, 47.2% identity), Solanum tuberosum (van Eldik, G.J., et al., (1995) Direct Submission (06-OCT-1995) to the EMBL/GenBank/DDBJ databases) (65.5% similarity, 47.7% identity), Zea mays (Petrucco, S., et al., Plant Cell 8:69-80 (1996)) (61.6% similarity, 44.9% identity) and especially Lupinus albus (Attuci, S., et al., Personal communication and direct submission (06/6/96) to the EMBL/GenBank/DDBJ databases (1996)) (85.9% similarity, 66.2% identity).

By contrast, homology with other NADPH-dependent reductases was significantly lower: for example, dihydroflavonol reductases from Petunia hybrida (Beld, M., et al., Plant Mol. Biol. 13:491-502 (1989)) (43.2% similarity, 21.5% identity) and Hordeum vulgare (Kristiansen, K.N., and Rohde, W., Mol. Gen. Genet. 230:49-59 (1991)) (46.2% similarity, 21.1% identity), chalcone reductase from Medicago sativa (Ballance, G.M., and Dixon, R.A., Plant Physiol. 107:1027-1028 (1995)) (39.5% similarity, 15.8% identity), chalcone reductase "homolog" from Sesbania rostrata (Goormachtig, S., et al., (1995) Direct Submission (13-MAR-1995) to the EMBL/GenBank/DDBJ databases) (47.6% similarity, 24.1% identity), cholesterol dehydrogenase from Nocardia sp. (Horinouchi, S., et al., Appl. Environ. Microbiol. 57:1386-1393 (1991)) (46.6% similarity, 21.0% identity) and 3β-hydroxy-5-ene steroid dehydrogenase from Rattus norvegicus (Zhao, H.-F., et al., Journal Endocrinology 127:3237-3239 (1990)) (43.5% similarity, 20.6% identity).
Thus, sequence analysis establishes significant homology between (+)-pinoresinol/(+)-lariciresinol reductase, isoflavone reductases and putative isoflavone reductase "homologs" which do not possess isoflavone reductase activity.
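The percent identity and percent similarity figures quoted above come from pairwise comparisons of aligned amino acid sequences. As a rough illustration of how such figures are derived, the following Python sketch scores two pre-aligned sequences. It is not the GCG algorithm used for the values reported here: the function name, the residue groups counted as "similar", and the two toy sequences are assumptions chosen only for illustration.

```python
# Illustrative only: the residue groups treated as "similar" are an assumption,
# not the scoring scheme of the GCG Wisconsin Package.
SIMILAR_GROUPS = [
    set("ILVM"), set("FYW"), set("KRH"), set("DE"),
    set("NQ"), set("ST"), set("AG"),
]


def percent_identity_similarity(seq_a, seq_b, gap="-"):
    """Return (% identity, % similarity) over the ungapped columns of an alignment."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    # Keep only columns where neither sequence has a gap.
    columns = [(a, b) for a, b in zip(seq_a.upper(), seq_b.upper())
               if a != gap and b != gap]
    if not columns:
        raise ValueError("no ungapped columns to compare")
    identical = sum(a == b for a, b in columns)
    # A column is "similar" if identical or if both residues share a group.
    similar = sum(a == b or any(a in g and b in g for g in SIMILAR_GROUPS)
                  for a, b in columns)
    return 100.0 * identical / len(columns), 100.0 * similar / len(columns)


if __name__ == "__main__":
    # Hypothetical toy alignment, one-letter amino acid codes.
    ident, simil = percent_identity_similarity("MKV-LDE", "MRVALEE")
    print(f"{ident:.1f}% identity, {simil:.1f}% similarity")
```

Because identity counts only exact matches, it can never exceed similarity, which is the pattern seen in every pair of values reported above.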

cDNA Cloning of Thuja plicata (-)-Pinoresinol/(-)-Lariciresinol Reductases
Plant Materials. Western red cedar plants (Thuja plicata) were maintained in Washington State University greenhouse facilities.
Materials. All solvents and chemicals used were reagent or HPLC grade.
Taq thermostable DNA polymerase and restriction enzymes (SacI and XbaI) were obtained from Promega. pT7Blue T-vector and competent NovaBlue cells were purchased from Novagen and radiolabeled nucleotide ([α-32P]dCTP) was purchased from DuPont NEN.
Oligonucleotide primers for polymerase chain reaction (PCR) and sequencing were synthesized by Gibco BRL Life Technologies. GENECLEAN II® kits (BIO 101 Inc.) were used for purification of PCR fragments, with the gel-purified DNA concentrations determined by comparison to a low DNA mass ladder (Gibco BRL) in 1.3% agarose gels.
Instrumentation. UV (including RNA and DNA determinations at OD260) spectra were recorded on a Lambda 6 UV/VIS spectrophotometer. A Temptronic II thermocycler (Thermolyne) was used for all PCR amplifications. Purification of plasmid DNA for sequencing employed a QIAwell Plus plasmid purification system (Qiagen) followed by PEG precipitation (Sambrook, J., et al., Molecular Cloning: A Laboratory Manual, 3 volumes, 3rd Ed., Cold Spring Harbor Laboratory, Cold Spring Harbor, NY (1994)) or Wizard® Plus SV Minipreps DNA Purification System (Promega), with DNA sequences determined using an Applied Biosystems Model 373A automated sequencer.
Thuja plicata cDNA Library Synthesis. Total RNA (6.7 µg/g fresh weight) was obtained from young green leaves (including stems) of greenhouse-grown western red cedar plants (Thuja plicata) according to the method of Lewinsohn et al. (Lewinsohn, E., et al., Plant Mol. Biol. Rep. 12:20-25 (1994)). A T. plicata cDNA library was constructed using 3 µg of purified poly(A)+ mRNA (Oligotex-dT™ Suspension, Qiagen) with the ZAP-cDNA® synthesis kit, the Uni-ZAP™ XR vector, and the Gigapack® II Gold packaging extract (Stratagene), with a titer of 1.2 pfu for the primary library. The amplified library (7.1 × 10^8 pfu/ml; 28 ml total) was used for screening (Sambrook, J., et al., Molecular Cloning: A Laboratory Manual, 3 volumes, 3rd Ed., Cold Spring Harbor Laboratory, Cold Spring Harbor, NY (1994)).
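When planning a screen it helps to know how many phage the amplified library holds and what volume delivers a chosen number of plaques. The sketch below works this out from the amplified titer and total volume given above. The plating target of 50,000 pfu is a hypothetical figure, not one taken from this description, and the function names are illustrative.

```python
AMPLIFIED_TITER_PFU_PER_ML = 7.1e8  # amplified library titer stated in the text
AMPLIFIED_VOLUME_ML = 28.0          # total amplified library volume stated in the text


def total_pfu(titer_pfu_per_ml, volume_ml):
    """Total plaque-forming units contained in a phage stock."""
    return titer_pfu_per_ml * volume_ml


def volume_ul_for(target_pfu, titer_pfu_per_ml):
    """Volume of stock (in microliters) that contains target_pfu plaque-forming units."""
    if titer_pfu_per_ml <= 0:
        raise ValueError("titer must be positive")
    return target_pfu / titer_pfu_per_ml * 1000.0


if __name__ == "__main__":
    print(f"library holds {total_pfu(AMPLIFIED_TITER_PFU_PER_ML, AMPLIFIED_VOLUME_ML):.3g} pfu")
    # Hypothetical plating density of 50,000 pfu per plate.
    print(f"{volume_ul_for(50_000, AMPLIFIED_TITER_PFU_PER_ML):.4f} ul per plate before dilution")
```

In practice a stock this concentrated would be diluted first so that the pipetted volume is measurable.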
T. plicata (-)-Pinoresinol/(-)-Lariciresinol Reductase cDNA Synthesis. T. plicata (-)-pinoresinol/(-)-lariciresinol reductase cDNA was obtained from mRNA by a reverse transcription-polymerase chain reaction (RT-PCR) strategy (Sambrook, J., et al., Molecular Cloning: A Laboratory Manual, 3 volumes, 3rd Ed., Cold Spring Harbor Laboratory, Cold Spring Harbor, NY (1994)). First-strand cDNA was synthesized from the purified mRNA previously used for the synthesis of the T. plicata cDNA library, described above. Purified mRNA (150 ng) was mixed with linker-primer (1.4 µg) from the ZAP-cDNA® synthesis kit (Stratagene), heated to 70°C for 10 min, and quickly chilled on ice. The mixture of denatured mRNA template and linker-primer was then mixed with First Strand Buffer (Life Technologies), 10 mM DTT, 0.5 mM each dNTP, and 200 units of SuperScript™ II (Life Technologies) in a final volume of 20 µl. The reaction was carried out at 42°C for 50 min and then stopped by heating (70°C, 15 min). E. coli RNase H (1.5 units, 1 µl) was added to the solution and incubated at 37°C for 20 min.
The first-strand reaction (2 µl) was next used as the template in 100-µl PCR reactions (10 mM Tris-HCl, pH 9.0, 50 mM KCl, 0.1% Triton X-100, 1.5 mM MgCl2, 0.2 mM each dNTP, and 5 units of Taq DNA polymerase) with primer CR6-NT (5'GCACATAAGAGTATGGATAAG3')(SEQ ID No:60) (10 pmol) and primer XhoI-Poly(dT) (5'GTCTCGAGTTTTTTTTTTTTTTTTTT3')(SEQ ID No:59) (10 pmol). PCR amplification was carried out in a thermocycler as described in (Dinkova-Kostova, A.T., et al., J. Biol. Chem. 271:29473-29482 (1996)) except for the annealing temperature at 52°C. PCR products were resolved in 1.3% agarose gels, where at least two bands possessing the expected length (about 1,200-bp) were observed. The bands were extracted from the gel. The gel-purified PCR products (56 ng) were then ligated into the pT7Blue T-vector (50 ng) and transformed into competent NovaBlue cells, according to Novagen's instructions.
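The annealing temperature used above can be put in context with a quick melting-temperature estimate for the gene-specific primer. The sketch below applies the Wallace rule of thumb, Tm ≈ 2(A+T) + 4(G+C) in °C, to CR6-NT (SEQ ID No:60). The rule is a rough approximation intended for short oligonucleotides and is an assumption introduced here; the description does not state how the annealing temperature was chosen, and the function names are illustrative.

```python
def wallace_tm(primer):
    """Rule-of-thumb melting temperature (deg C): 2*(A+T) + 4*(G+C)."""
    seq = primer.upper()
    if not seq or set(seq) - set("ACGT"):
        raise ValueError("primer must contain only A, C, G, T")
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


def gc_percent(primer):
    """Percentage of G and C bases in the primer."""
    seq = primer.upper()
    if not seq:
        raise ValueError("primer must not be empty")
    return 100.0 * (seq.count("G") + seq.count("C")) / len(seq)


CR6_NT = "GCACATAAGAGTATGGATAAG"  # primer sequence as given in the text

if __name__ == "__main__":
    print(len(CR6_NT), "nt;", f"{gc_percent(CR6_NT):.1f}% GC;", wallace_tm(CR6_NT), "deg C")
```

For CR6-NT (21 nt, 8 G/C and 13 A/T) the estimate is 58°C, a few degrees above the annealing temperature used here. The XhoI-Poly(dT) primer is dominated by its dT tract and is not well described by this rule.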
The size and orientation of the inserted cDNAs were determined using the rapid boiling lysis and PCR technique, following the manufacturer's (Novagen's) instructions, with the following primer combinations: R20-mer (SEQ ID No:74) with U19-mer (SEQ ID No:75); R20-mer (SEQ ID No:74) with CR6-NT (SEQ ID No:60); U19-mer (SEQ ID No:75) with CR6-NT (SEQ ID No:60). The CR6-NT primer end of the inserted DNAs was located next to the U19-mer primer site of the T-vector. The T-vectors containing the inserted cDNAs were purified with Wizard® Plus SV Minipreps DNA Purification System. Five inserted cDNAs were completely sequenced using overlapping sequencing primers and were shown to be identical except that polyadenylation sites were different. Therefore, the longest cDNA, designated plr-Tp1 (SEQ ID No:61), was used for detection of enzyme activity using the pBluescript expression system.
Sequence Analysis - DNA and amino acid sequence analyses were performed using the Unix-based GCG Wisconsin Package (Program Manual for the Wisconsin Package, Version 8, September 1994, Genetics Computer Group, 575 Science Drive, Madison, Wisconsin, USA 53711 (1996); Rice, P., Program Manual for the EGCG Package, Peter Rice, The Sanger Centre, Hinxton Hall, Cambridge, CB10 1RQ, England) and the ExPASy World Wide Web molecular biology server (Geneva University Hospital and University of Geneva, Geneva, Switzerland).
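A simple consistency check that complements this kind of sequence analysis is to confirm that an annotated coding region translates to the stated number of residues. The sketch below does this for the dirigent protein cDNAs in the sequence listing that follows, using the CDS LOCATION ranges and protein LENGTH values as read from that listing. The helper names, and the assumption that each CDS range excludes the stop codon, are introduced here for illustration.

```python
# (label, CDS start, CDS end, stated protein length in amino acids),
# as read from the sequence listing entries for SEQ ID NO:12 through NO:25.
RECORDS = [
    ("SEQ ID NO:12/13", 26, 583, 186),
    ("SEQ ID NO:14/15", 19, 573, 185),
    ("SEQ ID NO:16/17", 109, 688, 195),
    ("SEQ ID NO:18/19", 71, 625, 185),
    ("SEQ ID NO:20/21", 25, 591, 189),
    ("SEQ ID NO:22/23", 80, 655, 192),
    ("SEQ ID NO:24/25", 94, 669, 192),
]


def codons_in_cds(start, end):
    """Number of whole codons in an inclusive 1-based range, or None if not a multiple of 3."""
    length = end - start + 1
    return length // 3 if length % 3 == 0 else None


def inconsistent_records(records):
    """Labels whose CDS range does not match the stated protein length."""
    return [label for label, start, end, n_residues in records
            if codons_in_cds(start, end) != n_residues]


if __name__ == "__main__":
    for label, start, end, n_residues in RECORDS:
        print(label, end - start + 1, "nt ->", codons_in_cds(start, end), "codons; stated", n_residues)
    print("needs review:", inconsistent_records(RECORDS))
```

Six of the seven pairs agree. The range given for SEQ ID NO:16 spans 580 nucleotides, which is not a multiple of three, so that LOCATION value (or the way it survived in this copy) deserves a second look against the original filing.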

cDNA Cloning and Expression of Thuja plicata (+)-Pinoresinol/(+)-Lariciresinol Reductase
T. plicata (+)-Pinoresinol/(+)-Lariciresinol Reductase cDNA Cloning. After plr-Tp1 was cloned and sequenced, the full-length clone was used to screen the T. plicata cDNA library as described in Example 11, except that the entire plr-Tp1 cDNA insert was used as a probe. Several positive clones were sequenced, revealing one new, unique cDNA which was called plr-Tp2. This cDNA encodes a reductase with high sequence similarity to plr-Tp1 (~81% similarity at the amino acid level), but with substrate specificity properties identical to the original Forsythia intermedia reductase, as described below.
Enzyme Assays. Pinoresinol and lariciresinol reductase activities were assayed by monitoring the formation of [3H]lariciresinol and [3H]secoisolariciresinol as set forth in Example 8, with the following modifications. Briefly, each assay for pinoresinol reductase activity consisted of (±)-pinoresinols (5 mM in MeOH, 20 µl) and the enzyme preparation (i.e., total protein extract from E. coli, 210 µl). The enzymatic reaction was initiated by addition of [4R-3H]NADPH (10 mM, 6.79 kBq/mmol in distilled H2O, 20 µl). After a 3 hour incubation at 30°C with shaking, the assay mixture was extracted with EtOAc (500 µl) containing (±)-lariciresinols (20 µg) and (±)-secoisolariciresinols (20 µg) as radiochemical carriers. After centrifugation (13,800 x g, 5 min), the EtOAc solubles were removed and the extraction procedure was repeated. For each assay, the EtOAc solubles were combined, with an aliquot (100 µl) removed for determination of its radioactivity using liquid scintillation counting. The remainder of the combined EtOAc solubles was evaporated to dryness in vacuo, reconstituted in MeOH/H2O (30:70, 100 µl) and subjected to reversed phase and chiral column HPLC.
Lariciresinol reductase activity was assayed by monitoring the formation of (+)-[3H]secoisolariciresinol. These assays were carried out exactly as described above, except that (±)-lariciresinols (5 mM in MeOH, 20 µl) were used as substrates, with (±)-secoisolariciresinols (20 µg) added as radiochemical carriers.
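The working concentrations in these assays follow from the stock concentrations and volumes given above. The sketch below computes them, assuming that the substrate, enzyme preparation and NADPH solutions are the only contributors to the assay volume and that volumes are additive. The function and variable names are illustrative.

```python
def final_concentration_mM(stock_mM, added_ul, total_ul):
    """Concentration after diluting added_ul of a stock into total_ul."""
    if total_ul <= 0:
        raise ValueError("total volume must be positive")
    return stock_mM * added_ul / total_ul


# Stock concentrations (mM) and volumes (microliters) as stated in the text.
SUBSTRATE_STOCK_MM, SUBSTRATE_UL = 5.0, 20.0   # lignan substrate in MeOH
ENZYME_UL = 210.0                              # E. coli total protein extract
NADPH_STOCK_MM, NADPH_UL = 10.0, 20.0          # [4R-3H]NADPH in water

TOTAL_UL = SUBSTRATE_UL + ENZYME_UL + NADPH_UL  # assumes additive volumes

if __name__ == "__main__":
    substrate = final_concentration_mM(SUBSTRATE_STOCK_MM, SUBSTRATE_UL, TOTAL_UL)
    nadph = final_concentration_mM(NADPH_STOCK_MM, NADPH_UL, TOTAL_UL)
    methanol_percent = 100.0 * SUBSTRATE_UL / TOTAL_UL
    print(f"total {TOTAL_UL:.0f} ul; substrate {substrate:.2f} mM; "
          f"NADPH {nadph:.2f} mM; MeOH {methanol_percent:.0f}% v/v")
```

Under these assumptions each assay has a 250 µl volume with 0.4 mM substrate and 0.8 mM NADPH, so the cofactor is present in twofold molar excess over the racemic substrate.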
Expression of plr-Tp1 in E. coli - In order for the open reading frame (ORF) of plr-Tp1 to be in frame with the β-galactosidase gene α-complementation particle in pBluescript SK(-), plr-Tp1 was excised out of pT7Blue T-vector with SacI and XbaI, gel-purified, and then ligated into the expression vector digested with these same enzymes. This plasmid, pPCR-Tp1, was transformed into NovaBlue cells according to Novagen's instructions. The transformed cells (5-ml cultures) were grown at 37°C in LB medium (Sambrook, J., et al., Molecular Cloning: A Laboratory Manual, 3 volumes, 3rd Ed., Cold Spring Harbor Laboratory, Cold Spring Harbor, NY (1994)) supplemented with 50 µg ml^-1 carbenicillin with shaking (225 rpm) to mid log phase (A600 = 0.5-0.7). The cells were next collected by centrifugation (1000 x g, 10 min) and resuspended in fresh LB medium supplemented with 10 mM IPTG (isopropyl β-D-thiogalactopyranoside) and 50 µg ml^-1 carbenicillin to an absorbance of 0.6 (at 600 nm). The cells, allowed to grow overnight, were collected by centrifugation and resuspended in 500-700 µl (per ml culture tube) of buffer (50 mM Tris-HCl, pH 7.5, 2 mM EDTA, and 5 mM DTT). Next, the cells were lysed by sonication (5 x 45 s) and after centrifugation (17500 x g, 4°C, 10 min) the supernatant was removed and assayed for (-)-pinoresinol/(-)-lariciresinol reductase activity as described above.
Controls included assays of pBluescript (SK(-)) without insert DNA (as negative control) or with pPLR-Fi1 (cDNA of authentic F. intermedia (+)-pinoresinol/(+)-lariciresinol reductase in frame) as stereospecific control, as well as pPLR-Tp1 with no substrate except [4R-3H]NADPH.
The results showed that both (-)-lariciresinol and (+)-secoisolariciresinol were radiolabeled and that no incorporation of radioactivity was found in (-)-secoisolariciresinol. However, accumulation of radiolabel into (+)-lariciresinol was also observed, although at a much slower rate than that observed for (-)-lariciresinol. These results indicate that plr-Tp1 can use both (-)-pinoresinol and (+)-pinoresinol as substrates, with the former being converted via (-)-lariciresinol completely to (+)-secoisolariciresinol, and the latter being converted much more slowly to (+)-lariciresinol, but not further to (-)-secoisolariciresinol.
Expression of plr-Tp2 in E. coli. The plr-Tp2 cDNA was found to be in frame with the β-galactosidase gene α-complementation particle in pBluescript SK(-). When evaluated for activity and substrate specificity, as described above, plr-Tp2 was found to possess the same substrate specificity and product formation as the original Forsythia intermedia reductase (Dinkova-Kostova, A.T., et al., J. Biol. Chem. 271:29473-29482 (1996)) except that a small amount of (-)-lariciresinol was also detected. This is interesting, because plr-Tp2 has a higher sequence similarity to plr-Tp1 than it does to the Forsythia reductase.
All the above observations were confirmed using deuterolabeled substrates (±)-[9,9'-2H2, OC2H3]pinoresinols with isolation of the corresponding lignans; each was then subjected to chiral column chromatography and HPLC-mass spectral analysis.

Cloning of Additional Pinoresinol/Lariciresinol Reductases from Thuja plicata and Tsuga heterophylla
Two additional pinoresinol/lariciresinol reductases were cloned from a Thuja plicata young stem cDNA library as described in Example 15 for the cloning of plr-Tp2. The two additional pinoresinol/lariciresinol reductases were designated plr-Tp3 (SEQ ID No:65) and plr-Tp4 (SEQ ID No:67).
Two additional pinoresinol/lariciresinol reductases were cloned from a Tsuga heterophylla young stem cDNA library as described in Example 15 for the cloning of plr-Tp2. The two additional pinoresinol/lariciresinol reductases from Tsuga heterophylla were designated plr-Tp3 (SEQ ID No:69) and plr-Tp4 (SEQ ID No:71).
While the preferred embodiment of the invention has been illustrated and described, it will be appreciated that various changes can be made therein without departing from the spirit and scope of the invention.

SEQUENCE LISTING
(1) GENERAL INFORMATION:
(i) APPLICANT: Lewis, Norman G
Davin, Laurence B
Dinkova-Kostova, Albena T
Fujita, Masayuki
Gang, David R
Sarkanen, Simo
(ii) TITLE OF INVENTION: Recombinant Pinoresinol/Lariciresinol Reductases, Recombinant Dirigent Proteins and Methods of Use
(iii) NUMBER OF SEQUENCES: 76
(iv) CORRESPONDENCE ADDRESS:
(A) ADDRESSEE: Christensen, O'Connor, Johnson & Kindness
(B) STREET: 1420 Fifth Avenue, Suite 2800
(C) CITY: Seattle
(D) STATE: Washington
(E) COUNTRY: USA
(F) ZIP: WA 98101-2397
(v) COMPUTER READABLE FORM:
(A) MEDIUM TYPE: Floppy disk
(B) COMPUTER: IBM PC compatible
(C) OPERATING SYSTEM: PC-DOS/MS-DOS
(D) SOFTWARE: PatentIn Release #1.0, Version #1.30
(vi) CURRENT APPLICATION DATA:
(A) APPLICATION NUMBER:
(B) FILING DATE:
(C) CLASSIFICATION:
(viii) ATTORNEY/AGENT INFORMATION:
(A) NAME: Shelton, Dennis K
(B) REGISTRATION NUMBER: 26,997
(C) REFERENCE/DOCKET NUMBER: WSUR111351
(ix) TELECOMMUNICATION INFORMATION:
(A) TELEPHONE: 206 682 8100
(B) TELEFAX: 206 224 0779
(2) INFORMATION FOR SEQ ID NO:1:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 28 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: not relevant
(ii) MOLECULE TYPE: peptide
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(v) FRAGMENT TYPE: N-terminal
(vi) ORIGINAL SOURCE:
(A) ORGANISM: Forsythia intermedia dirigent protein N-terminal sequence
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:1:
Lys Pro Arg Pro Xaa Arg Xaa Xaa Lys Glu Leu Val Phe Tyr Phe Xaa Asp Ile Leu Phe Lys Gly Xaa Asn Tyr Asn Xaa Ala
(2) INFORMATION FOR SEQ ID NO:2:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 24 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: not relevant
(D) TOPOLOGY: not relevant
(ii) MOLECULE TYPE: peptide
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(v) FRAGMENT TYPE: Forsythia intermedia dirigent protein internal tryptic fragment
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:2:
Thr Ala Met Ala Val Pro Phe Asn Tyr Gly Asp Leu Val Val Phe Asp Asp Pro Ile Thr Leu Asp Asn Asn
(2) INFORMATION FOR SEQ ID NO:3:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 16 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: not relevant
(D) TOPOLOGY: not relevant
(ii) MOLECULE TYPE: peptide
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(v) FRAGMENT TYPE: Forsythia intermedia dirigent protein internal tryptic fragment
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:3:
Tyr Val Gly Thr Leu Asn Phe Ala Gly Ala Asp Pro Leu Leu Xaa Lys
(2) INFORMATION FOR SEQ ID NO:4:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 15 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: not relevant
(D) TOPOLOGY: not relevant
(ii) MOLECULE TYPE: peptide
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(v) FRAGMENT TYPE: Forsythia intermedia dirigent protein internal tryptic fragment
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:4:
Asp Ile Ser Val Ile Gly Gly Thr Gly Asp Phe Phe Met Ala Arg
(2) INFORMATION FOR SEQ ID NO:5:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 15 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: not relevant
(D) TOPOLOGY: not relevant
(ii) MOLECULE TYPE: peptide
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(v) FRAGMENT TYPE: Forsythia intermedia dirigent protein internal tryptic fragment
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:5:
Gly Val Ala Thr Leu Met Thr Asp Ala Phe Glu Gly Asp Xaa Tyr
(2) INFORMATION FOR SEQ ID NO:6:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 10 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: not relevant
(D) TOPOLOGY: not relevant
(ii) MOLECULE TYPE: peptide
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(v) FRAGMENT TYPE: Forsythia intermedia dirigent protein internal tryptic fragment
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:6:
Ala Gln Gly Met Tyr Phe Tyr Asp Gln Lys
(2) INFORMATION FOR SEQ ID NO:7:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 5 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: not relevant
(D) TOPOLOGY: not relevant
(ii) MOLECULE TYPE: peptide
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(v) FRAGMENT TYPE: Forsythia intermedia dirigent protein internal tryptic fragment
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:7:
Tyr Asn Ala Trp Leu
(2) INFORMATION FOR SEQ ID NO:8:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 21 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: other nucleic acid
(A) DESCRIPTION: "PCR primer PSINT1"
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:8:

(2) INFORMATION FOR SEQ ID NO:9:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 20 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: other nucleic acid
(A) DESCRIPTION: "PCR primer PSI1R"
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:9:

(2) INFORMATION FOR SEQ ID NO:10:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 20 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: other nucleic acid
(A) DESCRIPTION: "PCR primer PSI2R"
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:10:

(2) INFORMATION FOR SEQ ID NO:11:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 19 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: other nucleic acid
(A) DESCRIPTION: "PCR primer PSI?R"
(iii) HYPOTHETICAL: NO
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:11:

(2) INFORMATION FOR SEQ ID NO:12:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 901 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(vi) ORIGINAL SOURCE:
(A) ORGANISM: Forsythia intermedia clone psd-fi1
(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 26..583
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:12:

CAAAC ACA

MetVal SerLysThr GlnIleVal Ala LeuPhe LeuCys PheLeuThr SerThr SerSerAla ThrTyrGly Arg LysPro ArgPro ArgArgPro CysLys GluLeuVal PheTyrPhe His AspVal LeuPhe LysGlyAsn AsnTyr HisAsnAla ThrSerAla Ile ValGly SerPro GlnTrpGly AsnLys ThrAlaMet AlaValPro Phe AsnTyr GlyAsp LeuValVal PheAsp AspProIle ThrLeuAsp Asn AsnLeu HisSer ProProVal GlyArg AlaGlnGly MetTyrPhe Tyr AspGln LysAsn ThrTyrAsn AlaTrp LeuGlyPhe SerPheLeu Phe AsnSer ThrLys TyrValG1y ThrLeu AsnPheAla GlyAlaAsp Pro LeuLeu AsnLys ThrArgAsp IleSer ValIleGly GlyThrGly Asp Phe Phe Met Ala Arg Gly Val Ala Thr Leu Met Thr Asp Ala Phe Glu l55 160 165 CGT ATT

Gly Asp Val Tyr Phe Arg Leu Val Asp Asn Leu Tyr Glu Cys Arg Ile 170 175 180 l85 ATATAT ATATTTCATA

Trp TGTACTATTG F,AAAAAAAAA AAAAAAAA 901 (2) INFORMATION FOR SEQ ID N0:13:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 186 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein: Forsythia intermedia PSD-Fi1 protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:13:
Met Val Ser Lys Thr Gln Ile Val Ala Leu Phe Leu Cys Phe Leu Thr Ser Thr Ser Ser Ala Thr Tyr Gly Arg Lys Pro Arg Pro Arg Arg Pro Cys Lys Glu Leu Val Phe Tyr Phe His Asp Val Leu Phe Lys Gly Asn Asn Tyr His Asn Ala Thr Ser Ala Ile Val Gly Ser Pro Gln Trp Gly Asn Lys Thr Ala Met Ala Val Pro Phe Asn Tyr Gly Asp Leu Val Val Phe Asp Asp Pro Ile Thr Leu Asp Asn Asn Leu His Ser Pro Pro Val Gly Arg Ala Gln Gly Met Tyr Phe Tyr Asp Gln Lys Asn Thr Tyr Asn Ala Trp Leu Gly Phe Ser Phe Leu Phe Asn Ser Thr Lys Tyr Val Gly Thr Leu Asn Phe Ala Gly Ala Asp Pro Leu Leu Asn Lys Thr Arg Asp Ile Ser Val Ile Gly Gly Thr Gly Asp Phe Phe Met Ala Arg Gly Val Ala Thr Leu Met Thr Asp Ala Phe Glu Gly Asp Val Tyr Phe Arg Leu Arg Val Asp Ile Asn Leu Tyr Glu Cys Trp
(2) INFORMATION FOR SEQ ID NO:14:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 858 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: Forsythia intermedia cDNA PSD-Fi2
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 19..573
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:14:

GAGGAAAA
ATG
GCA
GCT
AAA

Met Ala Ala Lys Thr GlnThrThr AlaLeuPhe TCC GCC

LeuCys Leu Leu Ile Cys Ile Val TyrGlyHis LysThrArg Ser Ala CTC GTT

SerArg Arg Pro Cys Lys Glu Phe PhePheHis AspIleLeu Leu Val AAT GCC

TyrLeu Gly Tyr Asn Arg Asn Thr AlaValIle ValAlaSer Asn Ala GCC ATG

ProGln Trp Gly Asn Lys Thr Ala LysProPhe AsnPheGly Ala Met Asp Leu Val Val Phe Asp Asp Pro Ile Thr Leu Asp Asn Asn Leu His Ser Pro Pro Val Gly Arg Ala Gln Gly Thr Tyr Phe Tyr Asp Gln Trp CTT

Ser Ile TyrGly AlaTrp Gly PheSerPheLeu PheAsnSer Thr Leu AAT

Asp Tyr ValGly ThrLeu Phe AlaGlyAlaAsp ProLeuIle Asn Asn GTA

Lys Thr ArgAsp IleSer Ile GlyGlyThrGly AspPhePhe Met Val GTG

Ala Arg GlyVal AlaThr Ser ThrAspAlaPhe GluGlyAsp Val Val GAT

Tyr Phe ArgLeu ArgVal Ile ArgLeuTyrGlu CysTrp Asp TAAATTTACC TTATTTTTCC GTTTGACTCG GATTTGACTA

ATTTTCTTGA ATAATGTCTT

CTGTAATCCT TGTTTTTGAT CGATTTTATC AATTAGTGAT

CAATTTGTGG TGTTTGGTTC

ATATTTTAAT CTGTTAAAAA CAAAAGCCAA TAACCACAAC

AAATTGTGGT CGTAGGGAGT

TTTTTCCGTT AAGGGGAAAA CCATGTGTTA CTACGTTTTC

AAAAGTATGT AATTTCATTC

AAAATTTGCT TTTCAATCAT P,~1AAAAAAAAAAAAA

CTTCTTCAAA

(2) INFORMATION FOR SEQ ID NO:15:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 185 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: Forsythia intermedia dirigent protein PSD-Fi2
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:15:
Met Ala Ala Lys Thr Gln Thr Thr Ala Leu Phe Leu Cys Leu Leu Ile Cys Ile Ser Ala Val Tyr Gly His Lys Thr Arg Ser Arg Arg Pro Cys Lys Glu Leu Val Phe Phe Phe His Asp Ile Leu Tyr Leu Gly Tyr Asn Arg Asn Asn Ala Thr Ala Val Ile Val Ala Ser Pro Gln Trp Gly Asn Lys Thr Ala Met Ala Lys Pro Phe Asn Phe Gly Asp Leu Val Val Phe Asp Asp Pro Ile Thr Leu Asp Asn Asn Leu His Ser Pro Pro Val Gly Arg Ala Gln Gly Thr Tyr Phe Tyr Asp Gln Trp Ser Ile Tyr Gly Ala Trp Leu Gly Phe Ser Phe Leu Phe Asn Ser Thr Asp Tyr Val Gly Thr Leu Asn Phe Ala Gly Ala Asp Pro Leu Ile Asn Lys Thr Arg Asp Ile Ser Val Ile Gly Gly Thr Gly Asp Phe Phe Met Ala Arg Gly Val Ala Thr Val Ser Thr Asp Ala Phe Glu Gly Asp Val Tyr Phe Arg Leu Arg Val Asp Ile Arg Leu Tyr Glu Cys Trp
(2) INFORMATION FOR SEQ ID NO:16:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 998 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: Tsuga heterophylla dirigent protein cDNA PSD-Th1
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 109..688
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:16:

TC

CCCATATCTT TCAATG GCAATCAAG l15 CTTCTATAAT
CACTTTAGTC
TATAAGATTG

Met AlaIleLys AATCGT AAT AGA GCT GTG CAC TTT TGGCTT CTACTGTCC l63 TTG TGT CTA

AsnArg Asn Arg Ala Val His Phe TrpLeu LeuLeuSer Leu Cys Leu l90 195 200 205 GAT GGG AGC

SerVal Leu Leu Gln Thr Ser Lys TrpLys LysHisArg Asp Gly Ser CTG GTG TAT

LeuArg Lys Pro Cys Arg Asn Leu PheHis AspValIle Leu Val Tyr AAC GCT TCC

TyrAsn Gly Ser Asn Ala Lys Thr ThrLeu ValGlyAla Asn Ala Ser _77_ ProHis GlySerAsn LeuThrLeu LeuAla GlyLysAsp AsnHis Phe GlyAsp LeuAlaVal PheAspAsp ProIle ThrLeuAsp AsnAsn Phe HisSer ProProVal GlyArgAla GlnGly PheTyrPhe TyrAsp Met LysAsn ThrPheSer SerTrpLeu GlyPhe ThrPheVal LeuAsn Ser ThrAsp TyrLysGly ThrIleThr PheSer GlyAlaAsp ProIle Leu ThrLys TyrArgAsp IleSerVal ValGly GlyThrGly AspPhe Ile MetAla ArgGlyIle AlaThrIle SerThr AspAlaTyr GluGly Asp ValTyr PheArgLeu CysValAsn IleThr LeuTyrGlu CysTyr (2) INFORMATION FOR SEQ ID N0:17:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 195 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: Tsuga heterophylla dirigent protein PSD-Th1
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:17:
Met Ala Ile Lys Asn Arg Asn Arg Ala Val His Leu Cys Phe Leu Trp Leu Leu Leu Ser Ser Val Leu Leu Gln Thr Ser Asp Gly Lys Ser Trp Lys Lys His Arg Leu Arg Lys Pro Cys Arg Asn Leu Val Leu Tyr Phe His Asp Val Ile Tyr Asn Gly Ser Asn Ala Lys Asn Ala Thr Ser Thr Leu Val Gly Ala Pro His Gly Ser Asn Leu Thr Leu Leu Ala Gly Lys Asp Asn His Phe Gly Asp Leu Ala Val Phe Asp Asp Pro Ile Thr Leu Asp Asn Asn Phe His Ser Pro Pro Val Gly Arg Ala Gln Gly Phe Tyr Phe Tyr Asp Met Lys Asn Thr Phe Ser Ser Trp Leu Gly Phe Thr Phe Val Leu Asn Ser Thr Asp Tyr Lys Gly Thr Ile Thr Phe Ser Gly Ala Asp Pro Ile Leu Thr Lys Tyr Arg Asp Ile Ser Val Val Gly Gly Thr Gly Asp Phe Ile Met Ala Arg Gly Ile Ala Thr Ile Ser Thr Asp Ala Tyr Glu Gly Asp Val Tyr Phe Arg Leu Cys Val Asn Ile Thr Leu Tyr Glu Cys Tyr
(2) INFORMATION FOR SEQ ID NO:18:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 899 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: Tsuga heterophylla dirigent protein PSD-Th2 cDNA
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 71..625
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:18:

Met Ala Ile Lys Ser Asn Arg Ala Val Arg Phe Cys Phe _79_ ValTrpLeu LeuLeu LeuGln SerGlyPheVal PhePro LeuProGln 2l0 215 220 ProCysArg AsnLeu ValLeu TyrPheHisAsp ValLeu TyrAsnGly PheAsnAla HisAsn AlaThr SerThrLeuVal GlyAla ProGlnGly AlaAsnLeu ThrLeu LeuAla GlyLysAspAsn HisPhe GlyAspLeu AlaValPhe AspAsp ProIle ThrLeuAspAsn AsnPhe GlnSerPro ProValGly ArgAla GlnGly PheTyrPheTyr AspMet LysAsnThr PheSerSer TrpLeu GlyPhe ThrPheValLeu AsnSer ThrAspTyr LysGlyThr IleThr PheSer GlyAlaAspPro IleLeu ThrLysTyr ArgAspIle SerVal ValGly GlyThrGlyAsp PheIle MetAlaArg GlyIleAla ThrIle SerThr AspAlaTyrGlu GlyAsp ValTyrPhe ArgLeuArg ValAsn IleThr LeuTyrGluCys Tyr (2) INFORMATION FOR SEQ ID N0:19:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 185 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: Tsuga heterophylla dirigent protein translated from PSD-Th2
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:19:
Met Ala Ile Lys Ser Asn Arg Ala Val Arg Phe Cys Phe Val Trp Leu Leu Leu Leu Gln Ser Gly Phe Val Phe Pro Leu Pro Gln Pro Cys Arg Asn Leu Val Leu Tyr Phe His Asp Val Leu Tyr Asn Gly Phe Asn Ala His Asn Ala Thr Ser Thr Leu Val Gly Ala Pro Gln Gly Ala Asn Leu Thr Leu Leu Ala Gly Lys Asp Asn His Phe Gly Asp Leu Ala Val Phe Asp Asp Pro Ile Thr Leu Asp Asn Asn Phe Gln Ser Pro Pro Val Gly Arg Ala Gln Gly Phe Tyr Phe Tyr Asp Met Lys Asn Thr Phe Ser Ser Trp Leu Gly Phe Thr Phe Val Leu Asn Ser Thr Asp Tyr Lys Gly Thr Ile Thr Phe Ser Gly Ala Asp Pro Ile Leu Thr Lys Tyr Arg Asp Ile Ser Val Val Gly Gly Thr Gly Asp Phe Ile Met Ala Arg Gly Ile Ala Thr Ile Ser Thr Asp Ala Tyr Glu Gly Asp Val Tyr Phe Arg Leu Arg Val Asn Ile Thr Leu Tyr Glu Cys Tyr
(2) INFORMATION FOR SEQ ID NO:20:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 873 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: Thuja plicata dirigent protein PSD-Tp1 cDNA
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 25..591
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:20:

AGAT ATA
ATG

Met Ser ArgI1eAla PheHisLeu Cys Phe Met GlyLeuLeu LeuSerSerThr ValLeuArg AsnValAsp Gly His Ala TrpLysArg GlnLeuProMet ProCysLys AsnLeuVal Leu Tyr Phe HisAspIle LeuTyrAsnGly LysAsnIle HisAsnAla Thr Ala Ala LeuValAla AlaProAlaTrp GlyAsnLeu ThrThrPhe Ala Glu Pro PheLysPhe GlyAspValVal ValPheAsp AspProIle Thr Leu Asp AsnAsnLeu HisSerProPro ValGlyArg AlaGlnGly Phe Tyr Leu TyrAsnMet LysThrThrTyr AsnAlaTrp LeuGlyPhe Thr Phe Val LeuAsnSer ThrAspTyrLys GlyThrIle ThrPheAsn Gly Ala Asp ProProLeu ValLysTyrArg AspIleSer ValValGly Gly Thr Gly AspPheLeu MetAlaArgGly IleAlaThr LeuSerThr Asp Ala Ile GluGlyAsn ValTyrPheArg LeuArgVal AsnIleThr Leu TAAC GAGAG GTTTAG

Tyr Glu CysTyr . TAGTGTGTTG GGCTGTTTAC TTAAAGTCGA CGTTCTATGC AGTTGAAGTC TTTGTTTAGA 691 i TATTTTGAGT CGAAAAAAAA AAAAAAAAAA P,F~~AAAAAAA P,~~~AAAAAAA AAAAAAAAAA 8 71 (2) INFORMATION FOR SEQ ID N0:21:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 189 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:21:
Met Ser Arg Ile Ala Phe His Leu Cys Phe Met Gly Leu Leu Leu Ser Ser Thr Val Leu Arg Asn Val Asp Gly His Ala Trp Lys Arg Gln Leu Pro Met Pro Cys Lys Asn Leu Val Leu Tyr Phe His Asp Ile Leu Tyr Asn Gly Lys Asn Ile His Asn Ala Thr Ala Ala Leu Val Ala Ala Pro Ala Trp Gly Asn Leu Thr Thr Phe Ala Glu Pro Phe Lys Phe Gly Asp Val Val Val Phe Asp Asp Pro Ile Thr Leu Asp Asn Asn Leu His Ser Pro Pro Val Gly Arg Ala Gln Gly Phe Tyr Leu Tyr Asn Met Lys Thr Thr Tyr Asn Ala Trp Leu Gly Phe Thr Phe Val Leu Asn Ser Thr Asp Tyr Lys Gly Thr Ile Thr Phe Asn Gly Ala Asp Pro Pro Leu Val Lys Tyr Arg Asp Ile Ser Val Val Gly Gly Thr Gly Asp Phe Leu Met Ala Arg Gly Ile Ala Thr Leu Ser Thr Asp Ala Ile Glu Gly Asn Val Tyr Phe Arg Leu Arg Val Asn Ile Thr Leu Tyr Glu Cys Tyr
(2) INFORMATION FOR SEQ ID NO:22:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 867 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: Thuja plicata dirigent protein PSD-Tp2 cDNA
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 80..655
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:22:

GCTGGTTCAG ACCCAAACAT
TAATCTATGT
CTT

TTTGCAAAA AAA
ATG TTT
GCA CTG
ATG CAT
AAG TTC

MetAla Me t AlaAla LysPhe His Lys Leu Phe LeuPhe IleTrp LeuLeuVal CysThr ValLeu LeuLysSer AlaAsp CysHis ArgTrp LysLysLys IlePro GluPro CysLysAsn LeuVal LeuTyr PheHis AspIleLeu TyrAsn GlySer AsnLysHis AsnAla ThrSer AlaIle ValGlyAla ProLys GlyAla AsnLeuThr IleLeu ThrGly AsnAsn HisPheGly AspVal ValVal PheAspAsp ProIle ThrLeu AspAsn AsnLeuHis SerThr ProVal GlyArgAla GlnGly PheTyr PheTyr AspMetLys AsnThr PheAsn SerTrpLeu GlyPhe 300 305 3l0 ThrPhe ValLeu AsnSerThr AsnTyr LysGly ThrIleThr PheAsn GlyAla AspPro IleLeuThr LysTyr ArgAsp IleSerVal ValGly GlyThr GlyAsp PheLeuMet AlaArg GlyIle AlaThrIle SerThr i~

GAT GCA TAC GAG GGA GAT GTT TTC CGT AGG.GTG AAT ATC ACT 640 TAT CTT

Asp Ala Tyr Glu Gly Asp Val Phe Arg Arg Val Asn Ile Thr Tyr Leu T GTTCTAATCT
CTAATTTGAG

Leu Tyr Glu Cys Tyr (2) INFORMATION FOR SEQ ID N0:23:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 192 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:23:
Met Ala Met Lys Ala Ala Lys Phe Leu His Phe Leu Phe Ile Trp Leu Leu Val Cys Thr Val Leu Leu Lys Ser Ala Asp Cys His Arg Trp Lys Lys Lys Ile Pro Glu Pro Cys Lys Asn Leu Val Leu Tyr Phe His Asp Ile Leu Tyr Asn Gly Ser Asn Lys His Asn Ala Thr Ser Ala Ile Val Gly Ala Pro Lys Gly Ala Asn Leu Thr Ile Leu Thr Gly Asn Asn His Phe Gly Asp Val Val Val Phe Asp Asp Pro Ile Thr Leu Asp Asn Asn Leu His Ser Thr Pro Val Gly Arg Ala Gln Gly Phe Tyr Phe Tyr Asp Met Lys Asn Thr Phe Asn Ser Trp Leu Gly Phe Thr Phe Val Leu Asn Ser Thr Asn Tyr Lys Gly Thr Ile Thr Phe Asn Gly Ala Asp Pro Ile Leu Thr Lys Tyr Arg Asp Ile Ser Val Val Gly Gly Thr Gly Asp Phe Leu Met Ala Arg Gly Ile Ala Thr Ile Ser Thr Asp Ala Tyr Glu Gly Asp Val Tyr Phe Arg Leu Arg Val Asn Ile Thr Leu Tyr Glu Cys Tyr
(2) INFORMATION FOR SEQ ID NO:24:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 919 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: Thuja plicata dirigent protein PSD-Tp3 cDNA
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 94..669
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:24:
CGTAGGAAAT ATCTCAGAGG TGTTGTACGA

GAGCCGAAAA AATATATAAA
TTGAGATAAT

TGCAGATGTT TCT ACA
GTT GCT
GCT
AGA

Val LysThr Ala Ser Ala Arg GTT CTGCATTTA TGC TTTCTATGG CTTCTAGTA TCTGCAATC TTCATA l62 Val LeuHisLeu Cys PheLeuTrp LeuLeuVal SerAlaIle PheIle 200 205 2l0 215 Lys SerAlaAsp Cys ArgSerTrp LysLysLys LeuProLys ProCys Arg AsnLeuVal Leu TyrPheHis AspIleIle TyrAsnGly LysAsn Ala GluAsnAla Thr SerAlaLeu ValSerAla ProGlnGly AlaAsn Leu ThrI1eMet Thr GlyAsnAsn HisPheGly AsnLeuAla ValPhe Asp AspProIle Thr LeuAspAsn AsnLeuHis SerProPro ValGly Arg AlaGlnGly Phe TyrPheTyr AspMetLys AsnThrPhe SerAla i~

TrpLeu GlyPhe ThrPheVal LeuAsnSer ThrAsp HisLysGly Ser IleThr PheAsn GlyAlaAsp ProIleLeu ThrLys TyrArgAsp Ile SerVal ValGly GlyThrGly AspPheLeu MetAla ArgGlyIle Ala ThrIle SerThr AspSerTyr GluGlyAsp ValTyr PheArgLeu Arg CCTTGCTCTG

ValAsn IleThr LeuTyrGlu CysTyr CCAGATAAAG TATGTCATGT GCTTTGACAA AAAAAAAAAA AAAAA 914
(2) INFORMATION FOR SEQ ID NO:25:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 192 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:25:
Val Ser Lys Thr Ala Ala Arg Val Leu His Leu Cys Phe Leu Trp Leu
Leu Val Ser Ala Ile Phe Ile Lys Ser Ala Asp Cys Arg Ser Trp Lys
Lys Lys Leu Pro Lys Pro Cys Arg Asn Leu Val Leu Tyr Phe His Asp
Ile Ile Tyr Asn Gly Lys Asn Ala Glu Asn Ala Thr Ser Ala Leu Val
Ser Ala Pro Gln Gly Ala Asn Leu Thr Ile Met Thr Gly Asn Asn His
Phe Gly Asn Leu Ala Val Phe Asp Asp Pro Ile Thr Leu Asp Asn Asn
Leu His Ser Pro Pro Val Gly Arg Ala Gln Gly Phe Tyr Phe Tyr Asp
Met Lys Asn Thr Phe Ser Ala Trp Leu Gly Phe Thr Phe Val Leu Asn
Ser Thr Asp His Lys Gly Ser Ile Thr Phe Asn Gly Ala Asp Pro Ile
Leu Thr Lys Tyr Arg Asp Ile Ser Val Val Gly Gly Thr Gly Asp Phe
Leu Met Ala Arg Gly Ile Ala Thr Ile Ser Thr Asp Ser Tyr Glu Gly
Asp Val Tyr Phe Arg Leu Arg Val Asn Ile Thr Leu Tyr Glu Cys Tyr
(2) INFORMATION FOR SEQ ID NO:26:
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 704 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: Thuja plicata dirigent protein PSD-Tp4 cDNA
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 3..416
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:26:

GCC GTT GGA
CAC GCA
AAT GCC
GCA CCT
ACA
TCT
GCA

Asn Leu ProGlu Ala Val Gly His Ala Asn Ala Ala Thr Ser Ala GGT AAT

AlaAsn Leu Thr Ile Met Thr Asn HisPhe GlyAsnIle Ala Gly Asn CTT GAC

ValPhe Asp Asp Pro Ile Thr Asn AsnLeu HisSerPro Ser Leu Asp TAC TTC

ValGly Arg Ala Gln Gly Phe Tyr AspMet LysAspThr Phe Tyr Phe TTT GTG

AsnAla Trp Leu Gly Phe Thr Leu AsnSer ThrAspHis Lys Phe Val GCA GAT

GlyThr Ile Thr Phe Asn Gly Pro IleLeu ThrLysTyr Arg Ala Asp TCT ACA TTC

Asp Ile Val Val Gly Gly Gly Asp Leu Met Ala Arg Gly Ser Thr Phe ACC TCA GGA

Ile Ala Ile Ser Thr Asp Tyr Glu Asp Val Tyr Phe Arg Thr Ser Gly 305 310 315 GTC TAT TAC

Leu Arg Asn Ile Thr Leu Glu Cys Val Tyr Tyr ATTTTACAGAGTTTAGTTTT GCCCTCTAGAATATTATGTTTTCAAAATGC TCTATGAAAG 616 A,F~AAAAAAAAP.,~~3AAAAAAA AP 7
(2) INFORMATION FOR SEQ ID NO:27:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 138 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:27:
Asn Ala His Asn Ala Thr Ser Ala Leu Val Ala Ala Pro Glu Gly Ala
Asn Leu Thr Ile Met Thr Gly Asn Asn His Phe Gly Asn Ile Ala Val
Phe Asp Asp Pro Ile Thr Leu Asp Asn Asn Leu His Ser Pro Ser Val
Gly Arg Ala Gln Gly Phe Tyr Phe Tyr Asp Met Lys Asp Thr Phe Asn
Ala Trp Leu Gly Phe Thr Phe Val Leu Asn Ser Thr Asp His Lys Gly
Thr Ile Thr Phe Asn Gly Ala Asp Pro Ile Leu Thr Lys Tyr Arg Asp
Ile Ser Val Val Gly Gly Thr Gly Asp Phe Leu Met Ala Arg Gly Ile
Ala Thr Ile Ser Thr Asp Ser Tyr Glu Gly Asp Val Tyr Phe Arg Leu
Arg Val Asn Ile Thr Leu Tyr Glu Cys Tyr
(2) INFORMATION FOR SEQ ID NO:28:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 820 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: Thuja plicata dirigent protein PSD-Tp5 cDNA
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 43..612
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:28:

GAGAAAATTC TTACCAATAG ATG GCC
CAATAATTT ATT

Met Lys Ile Ala ArgValLeuHis LeuCys PheLeuCysLeu Leu ValSerAlaIleLeu LeuLysSerAla AspCys HisSerTrpLys Lys LysLeuProLysPro 160 165 170 CysLysAsnLeu ValLeu TyrPheHisAsp Ile IleTyrAsnGlyLys AsnAlaGluAsn AlaThr SerAlaLeuVal Ala AlaProGluGlyAla 195 200 205 AsnLeuThrIle MetThr GlyAsnAsnHis Phe GlyAsnLeuAlaVal PheAspAspPro IleThr LeuAspAsnAsn Leu HisSerProProVal GlyArgAlaGln GlyPhe TyrPheTyrAsp Met LysAsnThrPheSer AlaTrpLeuGly PheThr PheValLeuAsn Ser ThrAspHisLysGly

ACT CCA CTG TAC

ThrIle Phe AsnGly Ala Asp Ile Thr Lys Arg Asp Thr Pro Leu Tyr GTT GAT TTG AGA

IleSer Val GlyGly Thr Gly Phe Met Ala Gly Ile Val Asp Leu Arg ATT GAG GAA TTC

AlaThr Ser ThrAsp Ser Tyr Gly Val Tyr Arg Leu Ile Glu Glu Phe AAT TGT TGAGCAAATG
CCTGTCTTCT

ArgVal Ile ThrLeu Tyr Glu Tyr Asn Cys GGGTGCCTTT AATGTCTCTG
GAGGAATAGT

GGAGTCTATT GTCTCTATAT
TTGAAGATTA

ATATATATAT TGAAGAGAAT CTTTTCATTC,AAAAAAAAA B12 GAGATCTGTT P
TTAGGTAGCT

(2) INFORMATION FOR SEQ ID NO:29:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:29:
Met Lys Ala Ile Arg Val Leu His Leu Cys Phe Leu Cys Leu Leu Val
Ser Ala Ile Leu Leu Lys Ser Ala Asp Cys His Ser Trp Lys Lys Lys
Leu Pro Lys Pro Cys Lys Asn Leu Val Leu Tyr Phe His Asp Ile Ile
Tyr Asn Gly Lys Asn Ala Glu Asn Ala Thr Ser Ala Leu Val Ala Ala
Pro Glu Gly Ala Asn Leu Thr Ile Met Thr Gly Asn Asn His Phe Gly
Asn Leu Ala Val Phe Asp Asp Pro Ile Thr Leu Asp Asn Asn Leu His
Ser Pro Pro Val Gly Arg Ala Gln Gly Phe Tyr Phe Tyr Asp Met Lys
Asn Thr Phe Ser Ala Trp Leu Gly Phe Thr Phe Val Leu Asn Ser Thr
Asp His Lys Gly Thr Ile Thr Phe Asn Gly Ala Asp Pro Ile Leu Thr
Lys Tyr Arg Asp Ile Ser Val Val Gly Gly Thr Gly Asp Phe Leu Met
Ala Arg Gly Ile Ala Thr Ile Ser Thr Asp Ser Tyr Glu Gly Glu Val
Tyr Phe Arg Leu Arg Val Asn Ile Thr Leu Tyr Glu Cys Tyr
(2) INFORMATION FOR SEQ ID NO:30:
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2023 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: Thuja plicata dirigent protein PSD-Tp6 cDNA
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 47..616
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:30:

TTGAGAGAAA ATG GCC
ATTCCAATAA
TTTTTTCCCA

Met Lys Ala TTA CTA

IleArg Val Leu Gln Cys Phe Trp LeuLeuVal SerAla Ile Leu Leu GAT AGC

LeuLeu Lys Ser Ala Cys His Trp LysLysLys LeuPro Lys Asp Ser GTG TTC

ProCys Lys Asn Leu Leu Tyr His AspIleIle TyrAsn Gly Val Phe GCA GCA

LysAsn Ala Glu Asn Thr Ser Leu ValAlaAla ProGlu Gly Ala Ala ATG AAT

AlaAsn Leu Thr Ile Thr Gly Asn HisPheGly AsnLeu Ala Met Asn

Val Phe Asp Asp Pro Ile Thr Leu Asp Asn Asn Leu His Ser Pro Pro Val Gly Arg Ala Gln Gly Phe Tyr Phe Tyr Asp Met Lys Asn Thr Phe Ser Ala Trp Leu Gly Phe Thr Phe Val Leu Asn Ser Thr Asp His Lys ATT GCA ATC AAG

Gly Thr Thr Phe Asn Gly Asp Pro Leu Thr Tyr Arg Ile Ala Ile Lys TCT ACA TTC GCC

Asp Ile Val Val Gly Gly Gly Asp Leu Met Arg Gly Ser Thr Phe Ala ACC TCA GGA TAT

Ile Ala Ile Ser Thr Asp Tyr Glu Asp Val Phe Arg Thr Ser Gly Tyr GTC TAT TAC

Leu Arg Asn Ile Thr Leu Lys Cys Val Tyr Tyr
(2) INFORMATION FOR SEQ ID NO:31:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 190 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:31:
Met Lys Ala Ile Arg Val Leu Gln Leu Cys Phe Leu Trp Leu Leu Val
Ser Ala Ile Leu Leu Lys Ser Ala Asp Cys His Ser Trp Lys Lys Lys
Leu Pro Lys Pro Cys Lys Asn Leu Val Leu Tyr Phe His Asp Ile Ile
Tyr Asn Gly Lys Asn Ala Glu Asn Ala Thr Ser Ala Leu Val Ala Ala
Pro Glu Gly Ala Asn Leu Thr Ile Met Thr Gly Asn Asn His Phe Gly
Asn Leu Ala Val Phe Asp Asp Pro Ile Thr Leu Asp Asn Asn Leu His
Ser Pro Pro Val Gly Arg Ala Gln Gly Phe Tyr Phe Tyr Asp Met Lys
Asn Thr Phe Ser Ala Trp Leu Gly Phe Thr Phe Val Leu Asn Ser Thr
Asp His Lys Gly Thr Ile Thr Phe Asn Gly Ala Asp Pro Ile Leu Thr
Lys Tyr Arg Asp Ile Ser Val Val Gly Gly Thr Gly Asp Phe Leu Met
Ala Arg Gly Ile Ala Thr Ile Ser Thr Asp Ser Tyr Glu Gly Asp Val
Tyr Phe Arg Leu Arg Val Asn Ile Thr Leu Tyr Lys Cys Tyr
(2) INFORMATION FOR SEQ ID NO:32:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 913 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: Thuja plicata dirigent protein PSD-Tp7 cDNA
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 77..652
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:32:

Met Ala Ile Trp Asn Gly Arg Val Leu Asn Leu Cys Ile Leu Trp Leu Leu Val Ser Ile Val Leu Leu Asn Gly Ile Asp 205 210 215

AAA AAG
AAG

CysHis SerArgLys LysLys LeuProLys ProCysArg AsnLeuVal LeuTyr PheHisAsp IleIle TyrAsnGly LysAsnAla GlyAsnAla ACATCT ACGCTTGTT GCAGCC CCTCAAGGA GCTAATCTC ACCATTATG 301 ThrSer ThrLeuVal AlaAla ProGlnGly AlaAsnLeu ThrIleMet ThrGly AsnTyrHis PheGly AspLeuSer ValPheAsp AspProIle 270 275 280 ThrVal AspAsnAsn LeuHis SerProPro ValGlyArg AlaGlnGly PheTyr PheTyrAsp MetLys AsnThrPhe SerAlaTrp LeuGlyPhe ThrPhe ValLeuAsn SerThr AspTyrLys GlyThrIle ThrPheGly GlyAla AspProIle LeuAla LysTyrArg AspIleSer ValValGly GlyThr GlyAspPhe LeuMet AlaArgGly IleAlaThr IleAspThr AspAla TyrGluGly AspVal TyrPheArg LeuArgVal AsnIleThr TAGAATAGCT
CAATCTGATA

LeuTyr GluCysTyr
(2) INFORMATION FOR SEQ ID NO:33:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 192 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:33:
Met Ala Ile Trp Asn Gly Arg Val Leu Asn Leu Cys Ile Leu Trp Leu
Leu Val Ser Ile Val Leu Leu Asn Gly Ile Asp Cys His Ser Arg Lys
Lys Lys Leu Pro Lys Pro Cys Arg Asn Leu Val Leu Tyr Phe His Asp
Ile Ile Tyr Asn Gly Lys Asn Ala Gly Asn Ala Thr Ser Thr Leu Val
Ala Ala Pro Gln Gly Ala Asn Leu Thr Ile Met Thr Gly Asn Tyr His
Phe Gly Asp Leu Ser Val Phe Asp Asp Pro Ile Thr Val Asp Asn Asn
Leu His Ser Pro Pro Val Gly Arg Ala Gln Gly Phe Tyr Phe Tyr Asp
Met Lys Asn Thr Phe Ser Ala Trp Leu Gly Phe Thr Phe Val Leu Asn
Ser Thr Asp Tyr Lys Gly Thr Ile Thr Phe Gly Gly Ala Asp Pro Ile
Leu Ala Lys Tyr Arg Asp Ile Ser Val Val Gly Gly Thr Gly Asp Phe
Leu Met Ala Arg Gly Ile Ala Thr Ile Asp Thr Asp Ala Tyr Glu Gly
Asp Val Tyr Phe Arg Leu Arg Val Asn Ile Thr Leu Tyr Glu Cys Tyr
(2) INFORMATION FOR SEQ ID NO:34:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 890 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: Thuja plicata dirigent protein PSD-Tp8 cDNA
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 44..619

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:34:

ACATTTTTGA
TA

MetAla IleTrp 195 AsnGly ArgVal LeuAsn LeuCysIle LeuTrp LeuLeuVal SerIle ValLeu LeuAsn GlyIle AspCysHis SerArg LysLysLys LeuPro LysPro CysArg AsnLeu ValLeuTyr PheHis AspIleIle TyrAsn GlyLys AsnAla GlyAsn AlaThrSer ThrLeu ValAlaAla ProGln GlyAla AsnLeu ThrIle MetThrGly AsnTyr HisPheGly AspLeu AlaVal PheAsp AspPro IleThrVal AspAsn AsnLeuHis SerPro ProVal GlyArg AlaGln GlyPheTyr PheTyr AspMetLys AsnThr PheSer AlaTrp LeuGly PheThrPhe ValLeu AsnSerThr AspTyr 310 315 320 LysGly ThrIle ThrPhe GlyGlyAla AspPro IleLeuAla LysTyr ArgAsp IleSer ValVal GlyGlyThr GlyAsp PheLeuMet AlaArg GlyIle AlaThr IleAsp ThrAspAla TyrGlu GlyAspVal TyrPhe ArgLeu ArgVal AsnIle ThrLeuTyr GluCys Tyr GTATTCTATG TAGAATAGCT GCTATATT ATTTTGAGAG

CAATCTGATA CATAGGTAGT
TG

TAAGTTTTAT AACTAAGTAG TCATTGAA AACTTGGGTG

TGAACCATGA CTCATGCACA
GA

GTTTTCATAT TTTCTAAATA TATTACAT TTATGGATTG

AGTCTGCTCG TTGAGAATTG
AC

GTCAAAAAAA AAAAAAAAAA A 890
(2) INFORMATION FOR SEQ ID NO:35:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 192 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:35:
Met Ala Ile Trp Asn Gly Arg Val Leu Asn Leu Cys Ile Leu Trp Leu
Leu Val Ser Ile Val Leu Leu Asn Gly Ile Asp Cys His Ser Arg Lys
Lys Lys Leu Pro Lys Pro Cys Arg Asn Leu Val Leu Tyr Phe His Asp
Ile Ile Tyr Asn Gly Lys Asn Ala Gly Asn Ala Thr Ser Thr Leu Val
Ala Ala Pro Gln Gly Ala Asn Leu Thr Ile Met Thr Gly Asn Tyr His
Phe Gly Asp Leu Ala Val Phe Asp Asp Pro Ile Thr Val Asp Asn Asn
Leu His Ser Pro Pro Val Gly Arg Ala Gln Gly Phe Tyr Phe Tyr Asp
Met Lys Asn Thr Phe Ser Ala Trp Leu Gly Phe Thr Phe Val Leu Asn
Ser Thr Asp Tyr Lys Gly Thr Ile Thr Phe Gly Gly Ala Asp Pro Ile
Leu Ala Lys Tyr Arg Asp Ile Ser Val Val Gly Gly Thr Gly Asp Phe
Leu Met Ala Arg Gly Ile Ala Thr Ile Asp Thr Asp Ala Tyr Glu Gly
Asp Val Tyr Phe Arg Leu Arg Val Asn Ile Thr Leu Tyr Glu Cys Tyr
(2) INFORMATION FOR SEQ ID NO:36:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 30 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: not relevant
(D) TOPOLOGY: not relevant
(ii) MOLECULE TYPE: peptide
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(v) FRAGMENT TYPE: N-terminal sequence from Forsythia intermedia (+)-pinoresinol/(+)-lariciresinol reductase
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:36:
Gly Lys Ser Lys Val Leu Ile Ile Gly Gly Thr Gly Tyr Leu Gly Arg
Arg Leu Val Lys Ala Ser Leu Ala Gln Gly His Glu Thr Tyr
(2) INFORMATION FOR SEQ ID NO:37:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 16 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: not relevant
(D) TOPOLOGY: not relevant
(ii) MOLECULE TYPE: peptide
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(v) FRAGMENT TYPE: internal tryptic fragment from Forsythia intermedia (+)-pinoresinol/(+)-lariciresinol reductase
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:37:
Phe Met Asp Ile Ala Met Xaa Pro Gly Lys Val Thr Leu Asp Glu Lys
(2) INFORMATION FOR SEQ ID NO:38:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 13 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: not relevant
(D) TOPOLOGY: not relevant
(ii) MOLECULE TYPE: peptide
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(v) FRAGMENT TYPE: internal tryptic fragment from Forsythia intermedia (+)-pinoresinol/(+)-lariciresinol reductase
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:38:
Leu Pro Xaa Glu Phe Gly Met Asp Pro Ala Lys Phe Met
(2) INFORMATION FOR SEQ ID NO:39:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 8 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: not relevant
(D) TOPOLOGY: not relevant
(ii) MOLECULE TYPE: peptide
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(v) FRAGMENT TYPE: internal tryptic fragment from Forsythia intermedia (+)-pinoresinol/(+)-lariciresinol reductase
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:39:
Glu Val Val Gln Xaa Xaa Glu Lys
(2) INFORMATION FOR SEQ ID NO:40:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 10 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: not relevant
(D) TOPOLOGY: not relevant
(ii) MOLECULE TYPE: peptide
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(v) FRAGMENT TYPE: internal tryptic fragment from Forsythia intermedia (+)-pinoresinol/(+)-lariciresinol reductase
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:40:
Tyr Xaa Ser Val Glu Glu Tyr Leu Lys Arg
(2) INFORMATION FOR SEQ ID NO:41:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 12 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: not relevant
(D) TOPOLOGY: not relevant
(ii) MOLECULE TYPE: peptide
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(v) FRAGMENT TYPE: internal cyanogen bromide fragment from Forsythia intermedia (+)-pinoresinol/(+)-lariciresinol reductase
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:41:
Met Glu Pro Gly Lys Val Thr Leu Asp Glu Lys Met
(2) INFORMATION FOR SEQ ID NO:42:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 7 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: not relevant
(D) TOPOLOGY: not relevant
(ii) MOLECULE TYPE: peptide
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(v) FRAGMENT TYPE: internal cyanogen bromide fragment from Forsythia intermedia (+)-pinoresinol/(+)-lariciresinol reductase
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:42:
Met Asp Pro Ala Lys Phe Met
(2) INFORMATION FOR SEQ ID NO:43:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 7 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: not relevant
(D) TOPOLOGY: not relevant
(ii) MOLECULE TYPE: peptide
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(v) FRAGMENT TYPE: internal cyanogen bromide fragment from Forsythia intermedia (+)-pinoresinol/(+)-lariciresinol reductase
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:43:
Met Leu Ile Ser Phe Lys Met
(2) INFORMATION FOR SEQ ID NO:44:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 20 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: other nucleic acid
(A) DESCRIPTION: "PCR primer PLRNS"
(iii) HYPOTHETICAL: NO
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:44:

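The peptide fragments in SEQ ID NOs:36 to 43 are written in three-letter amino acid code, with Xaa marking an unidentified residue. As an illustration only, the following Python sketch converts such a record to the one-letter code commonly used by sequence search tools. The function name `to_one_letter` and the lookup table name are assumptions made for this example and are not part of the listing.

```python
# Standard three-letter to one-letter amino acid codes.
# "Xaa" (unidentified residue) is mapped to "X".
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
    "Xaa": "X",
}


def to_one_letter(three_letter_sequence: str) -> str:
    """Convert a whitespace-separated three-letter sequence to one-letter code.

    Raises KeyError on any token that is not a recognised residue code,
    which helps to catch transcription errors such as "I1e" or "Giy".
    """
    return "".join(THREE_TO_ONE[token] for token in three_letter_sequence.split())


if __name__ == "__main__":
    # SEQ ID NO:43 and SEQ ID NO:39 as given in the listing.
    print(to_one_letter("Met Leu Ile Ser Phe Lys Met"))
    print(to_one_letter("Glu Val Val Gln Xaa Xaa Glu Lys"))
```

Because the lookup is strict, a mistyped residue raises an error instead of being silently dropped, which is the safer behaviour when transcribing a listing by hand.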
(2) INFORMATION FOR SEQ ID NO:45:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 19 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: other nucleic acid
(A) DESCRIPTION: "PCR primer PLR14R"
(iii) HYPOTHETICAL: NO
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:45:

(2) INFORMATION FOR SEQ ID NO:46:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 20 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: other nucleic acid
(A) DESCRIPTION: "PCR primer PLR15R"
(iii) HYPOTHETICAL: NO
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:46:

(2) INFORMATION FOR SEQ ID NO:47:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1060 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: Forsythia intermedia cDNA PLR-Fi1
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 28..963
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:47:

GAGAAAAACA GGA GTT
GAGAGAG AAA TTG
A AGC ATC
AA

M et r Gly Lys Lys Val Se Leu Ile TAC

IleGly Gly Thr Gly LeuGlyArg ArgLeu ValLysAla SerLeu Tyr ACA

AlaGln Gly His Glu TyrIleLeu HisArg ProGluIle GlyVal Thr GAA

AspIle Asp Lys Val MetLeuIle SerPhe LysMetGln GlyAla Glu TCT

HisLeu Val Ser Gly PheLysAsp PheAsn SerLeuVal GluAla Ser GTA

ValLys Leu Val Asp ValIleSer AlaIle SerGlyVal HisIle Val CTT

ArgSer His Gln Ile LeuGlnLeu LysLeu ValGluAla IleLys Leu AAG

GluAla Gly Asn Val ArgPheLeu ProSer GluPheGly MetAsp Lys GAT

ProAla Lys Phe Met ThrAlaMet GluPro GlyLysVal ThrLeu Asp GTA

AspGlu Lys Met Val ArgLysAla IleGlu LysAlaGly IlePro Val TTC ACA TAT GTC TCT GCA AAT TGC TTT GCT GGT TAT TTC TTG GGA GGT 531 Phe Thr Tyr Val Ser Ala Asn Cys Phe Ala Gly Tyr Phe Leu Gly Gly ATT

Leu CysGln PheGly LysIleLeu ProSerArg AspPheVal IleIle His GlyAsp GlyAsn LysLysAla IleTyrAsn AsnGluAsp AspIle Ala ThrTyr AlaIle LysThrIle AsnAspPro ArgThrLeu AsnLys Thr IleTyr IleSer ProProLys AsnIleLeu SerGlnArg GluVal 410 415 420 Val GlnThr TrpGlu LysLeuIle GlyLysGlu LeuGlnLys IleThr Leu SerLys GluAsp PheLeuAla SerValLys GluLeuGlu TyrAla Gln GlnVal GlyLeu SerHisTyr HisAspVal AsnTyrGln GlyCys Leu ThrSer PheGlu IleGlyAsp GluGluGlu AlaSerLys LeuTyr Pro GluVal LysTyr ThrSerVal GluGluTyr LeuLysArg TyrVal TCGTTAAATA ATATGTGTTG AATTTTGCTT CCAAAAA 1060
(2) INFORMATION FOR SEQ ID NO:48:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 312 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:48:
Met Gly Lys Ser Lys Val Leu Ile Ile Gly Gly Thr Gly Tyr Leu Gly
Arg Arg Leu Val Lys Ala Ser Leu Ala Gln Gly His Glu Thr Tyr Ile
Leu His Arg Pro Glu Ile Gly Val Asp Ile Asp Lys Val Glu Met Leu
Ile Ser Phe Lys Met Gln Gly Ala His Leu Val Ser Gly Ser Phe Lys
Asp Phe Asn Ser Leu Val Glu Ala Val Lys Leu Val Asp Val Val Ile
Ser Ala Ile Ser Gly Val His Ile Arg Ser His Gln Ile Leu Leu Gln
Leu Lys Leu Val Glu Ala Ile Lys Glu Ala Gly Asn Val Lys Arg Phe
Leu Pro Ser Glu Phe Gly Met Asp Pro Ala Lys Phe Met Asp Thr Ala
Met Glu Pro Gly Lys Val Thr Leu Asp Glu Lys Met Val Val Arg Lys
Ala Ile Glu Lys Ala Gly Ile Pro Phe Thr Tyr Val Ser Ala Asn Cys
Phe Ala Gly Tyr Phe Leu Gly Gly Leu Cys Gln Phe Gly Lys Ile Leu
Pro Ser Arg Asp Phe Val Ile Ile His Gly Asp Gly Asn Lys Lys Ala
Ile Tyr Asn Asn Glu Asp Asp Ile Ala Thr Tyr Ala Ile Lys Thr Ile
Asn Asp Pro Arg Thr Leu Asn Lys Thr Ile Tyr Ile Ser Pro Pro Lys
Asn Ile Leu Ser Gln Arg Glu Val Val Gln Thr Trp Glu Lys Leu Ile
Gly Lys Glu Leu Gln Lys Ile Thr Leu Ser Lys Glu Asp Phe Leu Ala
Ser Val Lys Glu Leu Glu Tyr Ala Gln Gln Val Gly Leu Ser His Tyr
His Asp Val Asn Tyr Gln Gly Cys Leu Thr Ser Phe Glu Ile Gly Asp
Glu Glu Glu Ala Ser Lys Leu Tyr Pro Glu Val Lys Tyr Thr Ser Val
Glu Glu Tyr Leu Lys Arg Tyr Val
(2) INFORMATION FOR SEQ ID NO:49:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1112 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: Forsythia intermedia cDNA PLR-Fi2
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 44..979
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:49:

GAGCTCGTGC
CGCACAGAGA
AAAACAGAGA

Met GlyLysSer AAAGTT TTGATC ATTGGGGGT ACAGGGTAC TTAGGGAGG AGATTGGTT 103 LysVal LeuIle IleGlyGly ThrGlyTyr LeuGlyArg ArgLeuVal LysAla SerLeu AlaGlnGly HisGluThr TyrIleLeu HisArgPro GluIle GlyVal AspIleAsp LysValGlu MetLeuIle SerPheLys MetGln GlyAla HisLeuVal SerGlySer PheLysAsp PheAsnSer LeuVal GluAla ValLysLeu ValAspVal ValIleSer AlaIleSer GlyVal HisIle ArgSerHis GlnIleLeu LeuGlnLeu LysLeuVal GluAla IleLys GluAlaGly AsnValLys ArgPheLeu ProSerGlu PheGly MetAsp ProAlaLys PheMetAsp ThrAlaMet GluProGly LysVal ThrLeu AspGluLys MetValVal ArgLysAla IleGluLys AAT

AlaGlyIle ProPhe ThrTyr ValSerAla AsnCysPhe AlaGlyTyr PheLeuGly GlyLeu CysGln PheGlyLys IleLeuPro SerArgAsp PheValIle IleHis GlyAsp GlyAsnLys LysAlaIle TyrAsnAsn GluAspAsp IleAla ThrTyr AlaIleLys ThrIleAsn AspProArg ThrLeuAsn LysThr IleTyr IleSerPro ProLysAsn IleLeuSer GlnArgGlu ValVal GlnThr TrpGluLys LeuIleGly LysGluLeu GlnLysIle ThrLeu SerLys GluAspPhe LeuAlaSer ValLysGlu LeuGluTyr AlaGln GlnVal GlyLeuSer HisTyrHis AspValAsn TyrGlnGly CysLeu ThrSer PheGluIle GlyAspGlu GluGluAla SerLysLeu TyrPro GluVal LysTyrThr SerValGlu GluTyrLeu AAGCGTTAC GTGTAGTTGAAAG 1019 CTTTCCATTA
TTATTGTAAT
AATATTTAAA

LysArgTyr Val GTCGATTGAA ATGGAATTTT GAAGTCAAAA AAA 1112
(2) INFORMATION FOR SEQ ID NO:50:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 312 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:50:
Met Gly Lys Ser Lys Val Leu Ile Ile Gly Gly Thr Gly Tyr Leu Gly
Arg Arg Leu Val Lys Ala Ser Leu Ala Gln Gly His Glu Thr Tyr Ile
Leu His Arg Pro Glu Ile Gly Val Asp Ile Asp Lys Val Glu Met Leu
Ile Ser Phe Lys Met Gln Gly Ala His Leu Val Ser Gly Ser Phe Lys
Asp Phe Asn Ser Leu Val Glu Ala Val Lys Leu Val Asp Val Val Ile
Ser Ala Ile Ser Gly Val His Ile Arg Ser His Gln Ile Leu Leu Gln
Leu Lys Leu Val Glu Ala Ile Lys Glu Ala Gly Asn Val Lys Arg Phe
Leu Pro Ser Glu Phe Gly Met Asp Pro Ala Lys Phe Met Asp Thr Ala
Met Glu Pro Gly Lys Val Thr Leu Asp Glu Lys Met Val Val Arg Lys
Ala Ile Glu Lys Ala Gly Ile Pro Phe Thr Tyr Val Ser Ala Asn Cys
Phe Ala Gly Tyr Phe Leu Gly Gly Leu Cys Gln Phe Gly Lys Ile Leu
Pro Ser Arg Asp Phe Val Ile Ile His Gly Asp Gly Asn Lys Lys Ala
Ile Tyr Asn Asn Glu Asp Asp Ile Ala Thr Tyr Ala Ile Lys Thr Ile
Asn Asp Pro Arg Thr Leu Asn Lys Thr Ile Tyr Ile Ser Pro Pro Lys
Asn Ile Leu Ser Gln Arg Glu Val Val Gln Thr Trp Glu Lys Leu Ile
Gly Lys Glu Leu Gln Lys Ile Thr Leu Ser Lys Glu Asp Phe Leu Ala
Ser Val Lys Glu Leu Glu Tyr Ala Gln Gln Val Gly Leu Ser His Tyr
His Asp Val Asn Tyr Gln Gly Cys Leu Thr Ser Phe Glu Ile Gly Asp
Glu Glu Glu Ala Ser Lys Leu Tyr Pro Glu Val Lys Tyr Thr Ser Val
Glu Glu Tyr Leu Lys Arg Tyr Val
(2) INFORMATION FOR SEQ ID NO:51:
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1124 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: Forsythia intermedia cDNA PLR-Fi3
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 29..964
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:51:

GAGGAAAAAC ATG AGC
AGAGA

Met Gly Lys Lys ValLeu Ile Ser ATTGGG GGT ACA GGG TTA GGGAGGAGA TTG GTTAAG GCAAGT TTA l00 TAC

IleGly Gly Thr Gly Leu GlyArgArg Leu ValLys AlaSer Leu Tyr ACA

AlaGln Gly His Glu Tyr IleLeuHis Arg ProGlu IleGly Val Thr GATATT GAT AAA GTT ATG CTAATATCA TTT AAA.ATGCAAGGA GCT 196 GAA

AspIle Asp Lys Val Met LeuIleSer Phe LysMet GlnGly Ala Glu TCT

HisLeu Val Ser Gly Phe LysAspPhe Asn SerLeu ValGlu Ala Ser GTA

ValLys Leu Val Asp Val IleSerAla Ile SerGly ValHis Ile Val CTT

ArgSer His Gln Ile Leu GlnLeuLys Leu ValGlu AlaIle Lys Leu AAG

GluAla Gly Asn Val Arg PheLeuPro Ser GluPhe GlyMet Asp Lys GAT

ProAla Lys Phe Met Thr AlaMetGlu Pro GlyLys ValThr Leu Asp GTA

AspGlu Lys Met Val Arg LysAlaIle Glu LysAla GlyIle Pro Val AAT

PheThrTyr ValSerAla AsnCysPhe AlaGly TyrPheLeu GlyGly LeuCysGln PheGlyLys IleLeuPro SerArg AspPheVal IleIle HisGlyAsp GlyAsnLys LysAlaIle TyrAsn AsnGluAsp AspIle 500 505 510 AlaThrTyr AlaIleLys ThrIleAsn AspPro ArgThrLeu AsnLys ThrIleTyr IleSerPro ProLysAsn IleLeu SerGlnArg GluVal ValGlnThr TrpGluLys LeuIleGly LysGlu LeuGlnLys IleThr LeuSerLys GluAspPhe LeuAlaSer ValLys GluLeuGlu TyrAla GlnGlnVal GlyLeuSer HisTyrHis AspVal AsnTyrGln GlyCys LeuThrSer PheGluIle GlyAspGlu GluGlu AlaSerLys LeuTyr ProGluVal LysTyrThr SerValGlu GluTyr LeuLysArg TyrVal 610 615 620 GAAGTCATCT TCTCCACAAT ATTAGTCCAA ATAAAAAAAA 1124
(2) INFORMATION FOR SEQ ID NO:52:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 312 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:52:
Met Gly Lys Ser Lys Val Leu Ile Ile Gly Gly Thr Gly Tyr Leu Gly
Arg Arg Leu Val Lys Ala Ser Leu Ala Gln Gly His Glu Thr Tyr Ile
Leu His Arg Pro Glu Ile Gly Val Asp Ile Asp Lys Val Glu Met Leu
Ile Ser Phe Lys Met Gln Gly Ala His Leu Val Ser Gly Ser Phe Lys
Asp Phe Asn Ser Leu Val Glu Ala Val Lys Leu Val Asp Val Val Ile
Ser Ala Ile Ser Gly Val His Ile Arg Ser His Gln Ile Leu Leu Gln
Leu Lys Leu Val Glu Ala Ile Lys Glu Ala Gly Asn Val Lys Arg Phe
Leu Pro Ser Glu Phe Gly Met Asp Pro Ala Lys Phe Met Asp Thr Ala
Met Glu Pro Gly Lys Val Thr Leu Asp Glu Lys Met Val Val Arg Lys
Ala Ile Glu Lys Ala Gly Ile Pro Phe Thr Tyr Val Ser Ala Asn Cys
Phe Ala Gly Tyr Phe Leu Gly Gly Leu Cys Gln Phe Gly Lys Ile Leu
Pro Ser Arg Asp Phe Val Ile Ile His Gly Asp Gly Asn Lys Lys Ala
Ile Tyr Asn Asn Glu Asp Asp Ile Ala Thr Tyr Ala Ile Lys Thr Ile
Asn Asp Pro Arg Thr Leu Asn Lys Thr Ile Tyr Ile Ser Pro Pro Lys
Asn Ile Leu Ser Gln Arg Glu Val Val Gln Thr Trp Glu Lys Leu Ile
Gly Lys Glu Leu Gln Lys Ile Thr Leu Ser Lys Glu Asp Phe Leu Ala
Ser Val Lys Glu Leu Glu Tyr Ala Gln Gln Val Gly Leu Ser His Tyr
His Asp Val Asn Tyr Gln Gly Cys Leu Thr Ser Phe Glu Ile Gly Asp
Glu Glu Glu Ala Ser Lys Leu Tyr Pro Glu Val Lys Tyr Thr Ser Val
Glu Glu Tyr Leu Lys Arg Tyr Val
(2) INFORMATION FOR SEQ ID NO:53:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1097 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: Forsythia intermedia cDNA PLR-Fi4
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 29..964
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:53:

GAGGAAAAAC GGA AAA TTG
AGAGAGAG AAA

Met Lys SerLysVal Ile Gly Leu Ile Gly GlyThr GlyTyrLeu GlyArgArgLeu ValLysAla SerLeu GCT CAA GGTCAT GAAACATAC ATTCTGCATAGG CCTGAAATT GGTGTT 148 Ala Gln GlyHis GluThrTyr IleLeuHisArg ProGluIle GlyVal Asp Ile AspLys ValGluMet LeuIleSerPhe LysMetGln GlyAla His Leu ValSer GlySerPhe LysAspPheAsn SerLeuVal GluAla Val Lys LeuVal AspValVal IleSerAlaIle SerGlyVal HisIle Arg Ser HisGln IleLeuLeu GlnLeuLysLeu ValGluAla IleLys Glu Ala GlyAsn ValLysArg PheLeuProSer GluPheGly MetAsp Pro Ala LysPhe MetAspThr AlaMetGluPro GlyLysVal ThrLeu Asp Glu LysMet ValValArg LysAlaIleGlu LysAlaGly IlePro 450 455 460

AAT

PheThr TyrVal SerAla CysPhe AlaGlyTyr PheLeuGly Gly Asn ATT

LeuCys GlnPhe GlyLys LeuPro SerArgAsp PheValIle Ile Ile AA.A

HisGly AspGly AsnLys AlaIle TyrAsnAsn GluAspAsp Ile Lys ACA

AlaThr TyrAla IleLys IleAsn AspProArg ThrLeuAsn Lys Thr CCA

ThrIle TyrIle SerPro LysAsn IleLeuSer GlnArgGlu Val Pro Val Gln Thr Trp Glu Lys Leu Ile Gly Lys Glu Leu Gln Lys Ile Thr Leu Ser Lys Glu Asp Phe Leu Ala Ser Val Lys Glu Leu Glu Tyr Ala Gln Gln Val Gly Leu Ser His Tyr His Asp Val Asn Tyr Gln Gly Cys GGA GAG

LeuThr SerPhe GluIle Asp Glu Glu Ala SerLysLeu Tyr Gly Glu AGT TAC

ProGlu ValLys TyrThr Val Glu Glu Leu LysArgTyr Val Ser Tyr TAGTTGAAAG CTTTCCATTA AATATTTAAATCAGTATGTA GTTTTAAATT

TTATTGTAAT

TCGTTAAATA ATATGTGTTG CAAACGAGTGGTCGATTGAA

AATTTTGCTT ATGGAATTTT
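
Each cDNA record in this listing gives a CDS LOCATION as an inclusive nucleotide range, and the protein record that follows gives a LENGTH in amino acids. As a sanity check on a transcribed record, the span length divided by three can be compared with the stated protein length. The Python sketch below shows one way to do that. The function name `cds_codon_count` is an assumption made for this example, and the check assumes, as the records used here suggest, that the stated range covers only the coding triplets.

```python
def cds_codon_count(location: str) -> int:
    """Return the number of codons in an inclusive 'start..end' CDS range.

    Raises ValueError if the span is not a whole number of triplets,
    which usually points to a mistranscribed start or end coordinate.
    """
    start_text, end_text = location.split("..")
    start, end = int(start_text), int(end_text)
    span = end - start + 1  # both coordinates are inclusive
    if span <= 0 or span % 3 != 0:
        raise ValueError(f"CDS span {location!r} is {span} nt, not a multiple of 3")
    return span // 3


if __name__ == "__main__":
    # (CDS location, stated protein length) pairs taken from the listing:
    # SEQ ID NO:26/27, SEQ ID NO:32/33 and SEQ ID NO:47/48.
    for location, protein_length in [("3..416", 138), ("77..652", 192), ("28..963", 312)]:
        codons = cds_codon_count(location)
        print(location, codons, codons == protein_length)
```

A range that fails the multiple-of-three test, or that disagrees with the protein length, is worth checking against the original filing before the sequence is used.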

(2) INFORMATION FOR SEQ ID NO:54:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 312 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:54:
Met Gly Lys Ser Lys Val Leu Ile Ile Gly Gly Thr Gly Tyr Leu Gly
Arg Arg Leu Val Lys Ala Ser Leu Ala Gln Gly His Glu Thr Tyr Ile
Leu His Arg Pro Glu Ile Gly Val Asp Ile Asp Lys Val Glu Met Leu
Ile Ser Phe Lys Met Gln Gly Ala His Leu Val Ser Gly Ser Phe Lys
Asp Phe Asn Ser Leu Val Glu Ala Val Lys Leu Val Asp Val Val Ile
Ser Ala Ile Ser Gly Val His Ile Arg Ser His Gln Ile Leu Leu Gln
Leu Lys Leu Val Glu Ala Ile Lys Glu Ala Gly Asn Val Lys Arg Phe
Leu Pro Ser Glu Phe Gly Met Asp Pro Ala Lys Phe Met Asp Thr Ala
Met Glu Pro Gly Lys Val Thr Leu Asp Glu Lys Met Val Val Arg Lys
Ala Ile Glu Lys Ala Gly Ile Pro Phe Thr Tyr Val Ser Ala Asn Cys
Phe Ala Gly Tyr Phe Leu Gly Gly Leu Cys Gln Phe Gly Lys Ile Leu
Pro Ser Arg Asp Phe Val Ile Ile His Gly Asp Gly Asn Lys Lys Ala
Ile Tyr Asn Asn Glu Asp Asp Ile Ala Thr Tyr Ala Ile Lys Thr Ile
Asn Asp Pro Arg Thr Leu Asn Lys Thr Ile Tyr Ile Ser Pro Pro Lys
Asn Ile Leu Ser Gln Arg Glu Val Val Gln Thr Trp Glu Lys Leu Ile
Gly Lys Glu Leu Gln Lys Ile Thr Leu Ser Lys Glu Asp Phe Leu Ala
Ser Val Lys Glu Leu Glu Tyr Ala Gln Gln Val Gly Leu Ser His Tyr
His Asp Val Asn Tyr Gln Gly Cys Leu Thr Ser Phe Glu Ile Gly Asp
Glu Glu Glu Ala Ser Lys Leu Tyr Pro Glu Val Lys Tyr Thr Ser Val
Glu Glu Tyr Leu Lys Arg Tyr Val
(2) INFORMATION FOR SEQ ID NO:55:
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: Forsythia intermedia cDNA PLR-Fi5
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 31..966
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:55:

ACAGA ATG
GGA
AAA
AGC
AAA
GTT
TTG
ATC

Met Ser u Ile Gly Lys Lys Val Le TAC

IleGly Gly Thr Gly Leu GlyArgArg Leu ValLysAlaSer Leu Tyr ACA

AlaGln Gly His Glu Tyr IleLeuHis Arg ProGluIleGly Val Thr GAA

AspIle Asp Lys Val Met LeuIleSer Phe LysMetGlnGly Ala Glu TCT

HisLeu Val Ser Gly Phe LysAspPhe Asn SerLeuValGlu Ala Ser GTA

ValLys Leu Val Asp Val IleSerAla Ile SerGlyValHis Ile Val CTT

ArgSer His Gln Ile Leu GlnLeuLys Leu ValGluAlaIle Lys Leu AAG

GluAla Gly Asn Val Arg PheLeuPro Ser GluPheGlyMet Asp Lys GAT

ProAla Lys Phe Met Thr AlaMetGlu Pro GlyLysValThr Leu Asp GTA

AspGlu Lys Met Val Arg LysAlaIle Glu LysAlaGlyIle Pro Val PheThr TyrValSer AlaAsnCys PheAla GlyTyrPhe LeuGlyGly LeuCys GlnPheGly LysIleLeu ProSer ArgAspPhe ValIleIle HisGly AspGlyAsn LysLysAla IleTyr AsnAsnGlu AspAspIle AlaThr TyrAlaIle LysThrIle AsnAsp ProArgThr LeuAsnLys 515 520 525 ThrIle TyrIleSer ProProLys AsnIle LeuSerGln ArgGluVal ValGln ThrTrpGlu LysLeuIle GlyLys GluLeuGln LysIleThr LeuSer LysGluAsp PheLeuAla SerVal LysGluLeu GluTyrAla GlnGln ValGlyLeu SerHisTyr HisAsp ValAsnTyr GlnGlyCys LeuThr SerPheGlu IleGlyAsp GluGlu GluAlaSer LysLeuTyr ProGlu ValLysTyr ThrSerVal GluGlu TyrLeuLys ArgTyrVal GAAGTCATCT TCTCCAAAAA AAA 1109
(2) INFORMATION FOR SEQ ID NO:56:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 312 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:56:
Met Gly Lys Ser Lys Val Leu Ile Ile Gly Gly Thr Gly Tyr Leu Gly
Arg Arg Leu Val Lys Ala Ser Leu Ala Gln Gly His Glu Thr Tyr Ile
Leu His Arg Pro Glu Ile Gly Val Asp Ile Asp Lys Val Glu Met Leu
Ile Ser Phe Lys Met Gln Gly Ala His Leu Val Ser Gly Ser Phe Lys
Asp Phe Asn Ser Leu Val Glu Ala Val Lys Leu Val Asp Val Val Ile
Ser Ala Ile Ser Gly Val His Ile Arg Ser His Gln Ile Leu Leu Gln
Leu Lys Leu Val Glu Ala Ile Lys Glu Ala Gly Asn Val Lys Arg Phe
Leu Pro Ser Glu Phe Gly Met Asp Pro Ala Lys Phe Met Asp Thr Ala
Met Glu Pro Gly Lys Val Thr Leu Asp Glu Lys Met Val Val Arg Lys
Ala Ile Glu Lys Ala Gly Ile Pro Phe Thr Tyr Val Ser Ala Asn Cys
Phe Ala Gly Tyr Phe Leu Gly Gly Leu Cys Gln Phe Gly Lys Ile Leu
Pro Ser Arg Asp Phe Val Ile Ile His Gly Asp Gly Asn Lys Lys Ala
Ile Tyr Asn Asn Glu Asp Asp Ile Ala Thr Tyr Ala Ile Lys Thr Ile
Asn Asp Pro Arg Thr Leu Asn Lys Thr Ile Tyr Ile Ser Pro Pro Lys
Asn Ile Leu Ser Gln Arg Glu Val Val Gln Thr Trp Glu Lys Leu Ile
Gly Lys Glu Leu Gln Lys Ile Thr Leu Ser Lys Glu Asp Phe Leu Ala
Ser Val Lys Glu Leu Glu Tyr Ala Gln Gln Val Gly Leu Ser His Tyr
His Asp Val Asn Tyr Gln Gly Cys Leu Thr Ser Phe Glu Ile Gly Asp
Glu Glu Glu Ala Ser Lys Leu Tyr Pro Glu Val Lys Tyr Thr Ser Val
Glu Glu Tyr Leu Lys Arg Tyr Val
(2) INFORMATION FOR SEQ ID NO:57:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1107 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: Forsythia intermedia cDNA PLR-Fi6
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 27..962
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:57:

GAGAAAACAG GGA ATC
AGAGAG AAA
AGC
AAA

MetGly Ser ValLeu Ile Lys Lys Ile GGG GGTACA GGGTAC TTAGGG AGGAGATTG GTTAAG GCAAGTTTA GCT 101 Gly GlyThr GlyTyr LeuGly ArgArgLeu ValLys AlaSerLeu Ala CAA GGTCAT GAAACA TACATT CTGCATAGG CCTGAA ATTGGTGTT GAT 149 Gln GlyHis GluThr TyrIle LeuHisArg ProGlu IleGlyVal Asp Ile AspLys ValGlu MetLeu IleSerPhe LysMet GlnGlyAla His Leu ValSer GlySer PheLys AspPheAsn SerLeu ValGluAla Val Lys LeuVal AspVal ValIle SerAlaIle SerGly ValHisIle Arg AGC CATCAA ATTCTT CTTCAA CTCAAGCTT GTTGAA GCTATTAAA GAG 341 Ser HisGln IleLeu LeuGln LeuLysLeu ValGlu AlaIleLys Glu 405 410 415 Ala GlyAsn ValLys ArgPhe LeuProSer GluPhe GlyMetAsp Pro Ala LysPhe MetAsp ThrAla MetGluPro GlyLys ValThrLeu Asp GAG AAGATG GTGGTA AGGAAA GCAATTGAA AAGGCT GGGATTCCT TTC

Glu LysMet ValVal ArgLys AlaIleGlu LysAla GlyIlePro Phe

AAT

ThrTyr ValSerAla AsnCys PheAlaGly TyrPhe LeuGlyGly Leu CysGln PheGlyLys IleLeu ProSerArg AspPhe ValIleIle His GlyAsp GlyAsnLys LysAla IleTyrAsn AsnGlu AspAspIle Ala ThrTyr AlaIleLys ThrIle AsnAspPro ArgThr LeuAsnLys Thr 515 520 525 IleTyr IleSerPro ProLys AsnIleLeu SerGln ArgGluVal Val GlnThr TrpGluLys LeuIle GlyLysGlu LeuGln LysIleThr Leu SerLys GluAspPhe LeuAla SerValLys GluLeu GluTyrAla Gln GlnVal GlyLeuSer HisTyr HisAspVal AsnTyr GlnGlyCys Leu ThrSer PheGluIle GlyAsp GluGluGlu AlaSer LysLeuTyr Pro GluVal LysTyrThr SerVal GluGluTyr LeuLys ArgTyrVal
(2) INFORMATION FOR SEQ ID NO:58:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 312 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:58:

Met Gly Lys Ser Lys Val Leu Ile Ile Gly Gly Thr Gly Tyr Leu Gly Arg Arg Leu Val Lys Ala Ser Leu Ala Gln Gly His Glu Thr Tyr Ile 20 25 30 Leu His Arg Pro Glu Ile Gly Val Asp Ile Asp Lys Val Glu Met Leu Ile Ser Phe Lys Met Gln Gly Ala His Leu Val Ser Gly Ser Phe Lys Asp Phe Asn Ser Leu Val Glu Ala Val Lys Leu Val Asp Val Val Ile Ser Ala Ile Ser Gly Val His Ile Arg Ser His Gln Ile Leu Leu Gln 85 90 95 Leu Lys Leu Val Glu Ala Ile Lys Glu Ala Gly Asn Val Lys Arg Phe 100 105 110 Leu Pro Ser Glu Phe Gly Met Asp Pro Ala Lys Phe Met Asp Thr Ala Met Glu Pro Gly Lys Val Thr Leu Asp Glu Lys Met Val Val Arg Lys Ala Ile Glu Lys Ala Gly Ile Pro Phe Thr Tyr Val Ser Ala Asn Cys 145 150 155 160 Phe Ala Gly Tyr Phe Leu Gly Gly Leu Cys Gln Phe Gly Lys Ile Leu Pro Ser Arg Asp Phe Val Ile Ile His Gly Asp Gly Asn Lys Lys Ala Ile Tyr Asn Asn Glu Asp Asp Ile Ala Thr Tyr Ala Ile Lys Thr Ile Asn Asp Pro Arg Thr Leu Asn Lys Thr Ile Tyr Ile Ser Pro Pro Lys 210 215 220 Asn Ile Leu Ser Gln Arg Glu Val Val Gln Thr Trp Glu Lys Leu Ile Gly Lys Glu Leu Gln Lys Ile Thr Leu Ser Lys Glu Asp Phe Leu Ala Ser Val Lys Glu Leu Glu Tyr Ala Gln Gln Val Gly Leu Ser His Tyr His Asp Val Asn Tyr Gln Gly Cys Leu Thr Ser Phe Glu Ile Gly Asp Glu Glu Glu Ala Ser Lys Leu Tyr Pro Glu Val Lys Tyr Thr Ser Val Glu Glu Tyr Leu Lys Arg Tyr Val
(2) INFORMATION FOR SEQ ID NO:59:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 26 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: other nucleic acid
(A) DESCRIPTION: "cDNA synthesis linker primer"
(iii) HYPOTHETICAL: NO
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:59:

(2) INFORMATION FOR SEQ ID NO:60:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 21 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: other nucleic acid
(A) DESCRIPTION: "cDNA synthesis primer"
(iii) HYPOTHETICAL: NO
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:60:

(2) INFORMATION FOR SEQ ID NO:61:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1190 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: Thuja plicata cDNA PLR-Tp1
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 13..951
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:61:
GCACATAAGA GT ATG CAT AAG AAG ACC ACA GTT CTG ATA GTG GGG GGC 48
Met Asp Lys Lys Ser Arg Val Leu Ile Val Gly Gly
AGA

ThrGlyTyr IleGly LysArg IleValAsn AlaSerIle SerLeuGly HisProThr TyrVal LeuPhe ArgProGlu ValValSer AsnIleAsp LysValGln MetLeu LeuTyr PheLysGln LeuGlyAla LysLeuIle GluAlaSer LeuAsp AspHis GlnArgLeu ValAspAla LeuLysGln ValAspVal ValIle SerAla LeuAlaGly GlyValLeu SerHisHis IleLeuGlu GlnLeu LysLeu ValGluAla IleLysGlu AlaGlyAsn IleLysArg PheLeu ProSer GluPheGly MetAspPro AspIleMet GluHisAla LeuGln ProGly SerIleThr PheIleAsp LysArgLys ValArgArg AlaIle GluAla AlaSerIle ProTyrThr TyrValSer SerAsnMet PheAla GlyTyr PheAlaGly SerLeuAla GlnLeuAsp GlyHisMet MetPro ProArg AspLysVal LeuIleTyr GlyAspGly AsnValLys GlyIle TrpVal AspGluAsp AspValGly ThrTyrThr IleLysSer IleAsp AspPro GlnThrLeu AsnLysThr MetTyrIle ArgProPro MetAsn IleLeu SerGlnLys GluValIle GlnIleTrp Glu Arg Leu Ser Glu Gln Asn Leu Asp Lys Ile Tyr Ile Ser Ser Gln 550 555 560 GACTTTCTT GCAGATATG AAAGAT AAATCATAT GAAGAGAAG ATTGTA 816 AspPheLeu AlaAspMet LysAsp LysSerTyr GluGluLys IleVal ArgCysHis LeuTyrGln IlePhe PheArgGly AspLeuTyr AsnPhe GluIleGly ProAsnAla IleGlu AlaThrLys LeuTyrPro GluVal AAATACGTA ACCATGGAT TCATAT TTAGAGCGC TATGTTTGAATATCTT 961 LysTyrVal ThrMetAsp SerTyr LeuGluArg TyrVal 615 620 625 TATAATTATTTATAGATCTTATTTTAAATA AAAAAAAAAA AAAAAAAAAA 1190
(2) INFORMATION FOR SEQ ID NO:62:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 313 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:62:
Met Asp Lys Lys Ser Arg Val Leu Ile Val Gly Gly Thr Gly Tyr Ile Gly Lys Arg Ile Val Asn Ala Ser Ile Ser Leu Gly His Pro Thr Tyr Val Leu Phe Arg Pro Glu Val Val Ser Asn Ile Asp Lys Val Gln Met Leu Leu Tyr Phe Lys Gln Leu Gly Ala Lys Leu Ile Glu Ala Ser Leu Asp Asp His Gln Arg Leu Val Asp Ala Leu Lys Gln Val Asp Val Val Ile Ser Ala Leu Ala Gly Gly Val Leu Ser His His Ile Leu Glu Gln Leu Lys Leu Val Glu Ala Ile Lys Glu Ala Gly Asn Ile Lys Arg Phe Leu Pro Ser Glu Phe Gly Met Asp Pro Asp Ile Met Glu His Ala Leu 115 120 125 Gln Pro Gly Ser Ile Thr Phe Ile Asp Lys Arg Lys Val Arg Arg Ala 130 135 140 Ile Glu Ala Ala Ser Ile Pro Tyr Thr Tyr Val Ser Ser Asn Met Phe Ala Gly Tyr Phe Ala Gly Ser Leu Ala Gln Leu Asp Gly His Met Met Pro Pro Arg Asp Lys Val Leu Ile Tyr Gly Asp Gly Asn Val Lys Gly Ile Trp Val Asp Glu Asp Asp Val Gly Thr Tyr Thr Ile Lys Ser Ile Asp Asp Pro Gln Thr Leu Asn Lys Thr Met Tyr Ile Arg Pro Pro Met Asn Ile Leu Ser Gln Lys Glu Val Ile Gln Ile Trp Glu Arg Leu Ser Glu Gln Asn Leu Asp Lys Ile Tyr Ile Ser Ser Gln Asp Phe Leu Ala Asp Met Lys Asp Lys Ser Tyr Glu Glu Lys Ile Val Arg Cys His Leu Tyr Gln Ile Phe Phe Arg Gly Asp Leu Tyr Asn Phe Glu Ile Gly Pro Asn Ala Ile Glu Ala Thr Lys Leu Tyr Pro Glu Val Lys Tyr Val Thr Met Asp Ser Tyr Leu Glu Arg Tyr Val
(2) INFORMATION FOR SEQ ID NO:63:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1151 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: Thuja plicata cDNA PLR-Tp2
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 61..996
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:63:

MetGlu GluSerSer ArgVal LeuIleVal GlyGlyThr GlyTyrIle GlyArg ArgIleVal LysAla SerIleAla LeuGlyHis ProThrPhe IleLeu PheArgLys GluVal ValSerAsp ValGluLys ValGluMet LeuLeu SerPheLys LysAsn GlyAlaLys LeuLeuGlu AlaSerPhe AspAsp HisGluSer LeuVal AspAlaVal LysGlnVal AspValVal IleSer AlaValAla GlyAsn HisMetArg HisHisIle LeuGlnGln LeuLys LeuValGlu AlaIle LysGluAla GlyAsnIle LysArgPhe 410 415 420 425 ValPro SerGluPhe GlyMet AspProGly LeuMetGlu HisAlaMet AlaPro GlyAsnIle ValPhe IleAspLys IleLysVal ArgGluAla IleGlu AlaAlaSer IlePro HisThrTyr IleSerAla AsnIlePhe AlaGly TyrLeuVal GlyGly LeuAlaGln LeuGlyArg ValMetPro ProSer GluLysVal IleLeu TyrGlyAsp GlyAsnVal LysAlaVal TrpVal AspGluAsp AspVal GlyIleTyr ThrIleLys AlaIleAsp AspPro HisThrLeu AsnLys ThrMetTyr IleArgPro ProLeuAsn IleLeu SerGlnLys GluVal ValGluLys TrpGluLys LeuSerGly Lys Ser Leu Asn Lys Ile Asn Ile Ser Val Glu Asp Phe Leu Ala Gly Met Glu Gly Gln Ser Tyr Gly Glu Gln Ile Gly Ile Ser His Phe Tyr TTT GGA

Gln MetPheTyr Arg GlyAspLeu Tyr Asn Glu Ile Pro Asn Phe Gly GTA ACA

Gly ValGluAla Ser GlnLeuTyr Pro Glu Lys Tyr Thr Val Val Thr 605 6l0 615 ATATCTAAAT

Asp SerTyrMet Glu ArgTyrLeu TTAATTTAAG CTTTCTAAAA TTTGACATTATGCTAAATAA

GTTTTTATAT AAATGGAGAG

TATCTAGATA ATAATATTGA TAAAAATTATTGGGATTAAA AAAAAAAAA

CCAATCATAT A

(2) INFORMATION FOR SEQ ID N0:64:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 312 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:64:
Met Glu Glu Ser Ser Arg Val Leu Ile Val Gly Gly Thr Gly Tyr Ile Gly Arg Arg Ile Val Lys Ala Ser Ile Ala Leu Gly His Pro Thr Phe Ile Leu Phe Arg Lys Glu Val Val Ser Asp Val Glu Lys Val Glu Met Leu Leu Ser Phe Lys Lys Asn Gly Ala Lys Leu Leu Glu Ala Ser Phe Asp Asp His Glu Ser Leu Val Asp Ala Val Lys Gln Val Asp Val Val Ile Ser Ala Val Ala Gly Asn His Met Arg His His Ile Leu Gln Gln Leu Lys Leu Val Glu Ala Ile Lys Glu Ala Gly Asn Ile Lys Arg Phe Val Pro Ser Glu Phe Gly Met Asp Pro Gly Leu Met Glu His Ala Met 115 120 125 Ala Pro Gly Asn Ile Val Phe Ile Asp Lys Ile Lys Val Arg Glu Ala Ile Glu Ala Ala Ser Ile Pro His Thr Tyr Ile Ser Ala Asn Ile Phe Ala Gly Tyr Leu Val Gly Gly Leu Ala Gln Leu Gly Arg Val Met Pro Pro Ser Glu Lys Val Ile Leu Tyr Gly Asp Gly Asn Val Lys Ala Val 180 185 190 Trp Val Asp Glu Asp Asp Val Gly Ile Tyr Thr Ile Lys Ala Ile Asp Asp Pro His Thr Leu Asn Lys Thr Met Tyr Ile Arg Pro Pro Leu Asn 210 215 220 Ile Leu Ser Gln Lys Glu Val Val Glu Lys Trp Glu Lys Leu Ser Gly Lys Ser Leu Asn Lys Ile Asn Ile Ser Val Glu Asp Phe Leu Ala Gly Met Glu Gly Gln Ser Tyr Gly Glu Gln Ile Gly Ile Ser His Phe Tyr Gln Met Phe Tyr Arg Gly Asp Leu Tyr Asn Phe Glu Ile Gly Pro Asn Gly Val Glu Ala Ser Gln Leu Tyr Pro Glu Val Lys Tyr Thr Thr Val Asp Ser Tyr Met Glu Arg Tyr Leu
(2) INFORMATION FOR SEQ ID NO:65:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1308 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: Thuja plicata cDNA PLR-Tp3
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 164..1105
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:65:

AATTTGACTG
TGAAAGTGGA
TGCACATAAG

Met AspLys Lys Ser Arg ValLeuIle ValGly GlyThrGly PheIleGly LysArg Ile Val Lys AlaSerLeu AlaLeu GlyHisPro ThrTyrVal LeuPhe Arg CCA GAA GCCCTCTCT TACATT GACAAAGTG CAGATGTTG ATATCC TTC 319 Pro Glu AlaLeuSer TyrIle AspLysVal GlnMetLeu IleSer Phe Lys Gln LeuGlyAla LysLeu LeuGluAla SerLeuAsp AspHis Gln Gly Leu ValAspVal ValLys GlnValAsp ValValIle SerAla Val Ser Gly GlyLeuVal ArgHis HisIleLeu AspGlnLeu LysLeu Val 400 405 410 GAG GCA ATTAAAGAA GCTGGC AATATTAAG AGATTTCTT CCTTCA GAA 511 Glu Ala IleLysGlu AlaGly AsnIleLys ArgPheLeu ProSer Glu 415 420 425 Phe Gly MetAspPro AspVal ValGluAsp ProLeuGlu ProGly Asn Ile Thr PheIleAsp LysArg LysValArg ArgAlaIle GluAla Ala Thr Ile ProTyrThr TyrVal SerSerAsn MetPheAla GlyPhe Phe Ala Gly SerLeuAla GlnLeu GlnAspAla ProArgMet MetPro Ala Arg Asp LysValLeu IleTyr GlyAspGly AsnValLys GlyVal Tyr Val Asp GluAspAsp AlaGly IleTyrIle ValLysSer IleAsp Asp

ProArgThr LeuAsn LysThrVal TyrIle ArgProPro MetAsnIle LeuSerGln LysGlu ValValGlu IleTrp GluArgLeu SerGlyLeu SerLeuGlu LysIle TyrValSer GluAsp GlnLeuLeu AsnMetLys AspLysSer TyrVal GluLysMet AlaArg CysHisLeu TyrHisPhe PheIleLys GlyAsp LeuTyrAsn PheGlu IleGlyPro AsnAlaThr GluGlyThr LysLeu TyrProGlu ValLys TyrThrThr MetAspSer ATTTTTCTTA
AATAATAGCT

TyrMetGlu ArgTyr Leu
(2) INFORMATION FOR SEQ ID NO:66:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 314 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:66:
Met Asp Lys Lys Ser Arg Val Leu Ile Val Gly Gly Thr Gly Phe Ile Gly Lys Arg Ile Val Lys Ala Ser Leu Ala Leu Gly His Pro Thr Tyr Val Leu Phe Arg Pro Glu Ala Leu Ser Tyr Ile Asp Lys Val Gln Met Leu Ile Ser Phe Lys Gln Leu Gly Ala Lys Leu Leu Glu Ala Ser Leu Asp Asp His Gln Gly Leu Val Asp Val Val Lys Gln Val Asp Val Val Ile Ser Ala Val Ser Gly Gly Leu Val Arg His His Ile Leu Asp Gln Leu Lys Leu Val Glu Ala Ile Lys Glu Ala Gly Asn Ile Lys Arg Phe Leu Pro Ser Glu Phe Gly Met Asp Pro Asp Val Val Glu Asp Pro Leu Glu Pro Gly Asn Ile Thr Phe Ile Asp Lys Arg Lys Val Arg Arg Ala Ile Glu Ala Ala Thr Ile Pro Tyr Thr Tyr Val Ser Ser Asn Met Phe 145 150 155 160 Ala Gly Phe Phe Ala Gly Ser Leu Ala Gln Leu Gln Asp Ala Pro Arg Met Met Pro Ala Arg Asp Lys Val Leu Ile Tyr Gly Asp Gly Asn Val Lys Gly Val Tyr Val Asp Glu Asp Asp Ala Gly Ile Tyr Ile Val Lys Ser Ile Asp Asp Pro Arg Thr Leu Asn Lys Thr Val Tyr Ile Arg Pro Pro Met Asn Ile Leu Ser Gln Lys Glu Val Val Glu Ile Trp Glu Arg Leu Ser Gly Leu Ser Leu Glu Lys Ile Tyr Val Ser Glu Asp Gln Leu Leu Asn Met Lys Asp Lys Ser Tyr Val Glu Lys Met Ala Arg Cys His Leu Tyr His Phe Phe Ile Lys Gly Asp Leu Tyr Asn Phe Glu Ile Gly Pro Asn Ala Thr Glu Gly Thr Lys Leu Tyr Pro Glu Val Lys Tyr Thr Thr Met Asp Ser Tyr Met Glu Arg Tyr Leu
(2) INFORMATION FOR SEQ ID NO:67:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1287 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: Thuja plicata cDNA PLR-Tp4
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO

(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 61..996
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:67:
ATG AGC ATT GTA GGC

Met GluGluSer SerArg IleLeu ValGly Thr Val Gly GlyTyr IleGly ArgArgIle ValLys AlaSer IleAlaLeu GlyHis ProThr PheIle LeuPheArg LysGlu ValVal SerAspVal GluLys GTGGAG ATGTTA TTGTCCTTC AAAAAG AATGGT GCCAAATTA CTGGAG l93 ValGlu MetLeu LeuSerPhe LysLys AsnGly AlaLysLeu LeuGlu Ala5er PheAsp AspHisGlu SerLeu ValAsp AlaValLys GlnVal AspVal ValIle SerAlaVal AlaGly AsnHis MetArgHis HisIle LeuGln GlnLeu LysLeuVal GluAla IleLys GluAlaGly AsnIle 410 4l5 420 LysArg PheVal ProSerGlu PheGly MetAsp ProGlyLeu MetAsp HisAla MetAla ProGlyAsn IleVal PheIle AspLysIle LysVal ArgGlu AlaIle GluAlaAla AlaIle ProHis ThrTyrIle SerAla AsnIle PheAla GlyTyrLeu ValGly GlyLeu AlaGlnLeu GlyArg ValMet ProPro SerAspLys ValPhe LeuTyr GlyAspGly AsnVal LysAla ValTrp IleAspGlu GluAsp ValGly IleTyrThr IleLys Ala Ile Asp Asp Pro Arg Thr Leu Asn Lys Thr Val Tyr Ile Arg Pro Pro Leu Asn Val Leu Ser Gln Lys Glu Val Val G1u Lys Trp Glu Lys Leu Ser Arg Lys Ser Leu Asp Lys Ile Tyr Met Ser Val Glu Asp Phe Leu Ala Gly Met Glu Gly Gln Ser Tyr Gly Glu Lys Ile Gly Ile Sex His Phe Tyr Gln Met Phe Tyr Lys Gly Asp Leu Tyr Asn Phe Glu Ile Gly Pro Asn Gly Val Glu Ala Ser Gln Leu Tyr Pro Gly Val Lys Tyr Thr Thr Val Asp Ser Tyr Met Glu Arg Tyr Leu GACTTTTTCC CTTTAACTGC ATGCTCAACA TATTTTATACAAACAAGCTAATGTCTTTTA1l96 T T T F-~AAAAAA P~~;AAAAAAAP. A 12 (2) INFORMATION
FOR SEQ ID NO:68:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 312 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:68:

Met Glu Glu Ser Ser Arg Ile Leu Val Val Gly Gly Thr Gly Tyr Ile Gly Arg Arg Ile Val Lys Ala Ser Ile Ala Leu Gly His Pro Thr Phe Ile Leu Phe Arg Lys Glu Val Val Ser Asp Val Glu Lys Val Glu Met Leu Leu Ser Phe Lys Lys Asn Gly Ala Lys Leu Leu Glu Ala Ser Phe Asp Asp His Glu Ser Leu Val Asp Ala Val Lys Gln Val Asp Val Val Ile Ser Ala Val Ala Gly Asn His Met Arg His His Ile Leu Gln Gln Leu Lys Leu Val Glu Ala Ile Lys Glu Ala Gly Asn Ile Lys Arg Phe 100 105 110 Val Pro Ser Glu Phe Gly Met Asp Pro Gly Leu Met Asp His Ala Met 115 120 125 Ala Pro Gly Asn Ile Val Phe Ile Asp Lys Ile Lys Val Arg Glu Ala Ile Glu Ala Ala Ala Ile Pro His Thr Tyr Ile Ser Ala Asn Ile Phe 145 150 155 160 Ala Gly Tyr Leu Val Gly Gly Leu Ala Gln Leu Gly Arg Val Met Pro Pro Ser Asp Lys Val Phe Leu Tyr Gly Asp Gly Asn Val Lys Ala Val 180 185 190 Trp Ile Asp Glu Glu Asp Val Gly Ile Tyr Thr Ile Lys Ala Ile Asp 195 200 205 Asp Pro Arg Thr Leu Asn Lys Thr Val Tyr Ile Arg Pro Pro Leu Asn 210 215 220 Val Leu Ser Gln Lys Glu Val Val Glu Lys Trp Glu Lys Leu Ser Arg Lys Ser Leu Asp Lys Ile Tyr Met Ser Val Glu Asp Phe Leu Ala Gly Met Glu Gly Gln Ser Tyr Gly Glu Lys Ile Gly Ile Ser His Phe Tyr Gln Met Phe Tyr Lys Gly Asp Leu Tyr Asn Phe Glu Ile Gly Pro Asn Gly Val Glu Ala Ser Gln Leu Tyr Pro Gly Val Lys Tyr Thr Thr Val Asp Ser Tyr Met Glu Arg Tyr Leu
(2) INFORMATION FOR SEQ ID NO:69:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1282 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: Tsuga heterophylla cDNA PLR-Th1
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 2..922
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:69:

CTA GGT
ATA AGA
GTG AAA
GGT TTT
GGC
ACA
GGA
TAC
AT

Arg Val Leu Ile Val Gly Gly Thr Gly Tyr Ile Gly Arg Lys Phe 315 320 325 Val LysAlaSer LeuAlaLeu GlyHisPro Thr PheValLeuSer Arg CCA GAAGTAGGG TTTGACATT GAGAAGGTG CAC ATGTTGCTCTCC TTC 142 Pro GluValGly PheAspIle GluLysVal His MetLeuLeuSer Phe Lys GlnAlaGly AlaArgLeu LeuGluGly Ser PheGluAspPhe Gln Ser LeuValAla AlaLeuLys GlnValAsp Val ValIleSerAla Val Ala GlyAsnHis PheArgAsn LeuIleLeu Gln GlnLeuLysLeu Val Glu AlaIleLys GluAlaGly AsnIleLys Arg PheLeuProSer Glu Phe GlyMetGlu ProAspLeu MetGluHis Ala LeuGluProGly Asn Ala ValPheIle AspLysArg LysValArg Arg AlaIleGluAla Ala Gly IleProTyr ThrTyrVal SerSerAsn Ile PheAlaGlyTyr Leu Ala GlyGlyLeu AlaGlnIle GlyArgLeu Met ProProArgAsp Glu Val ValIleTyr GlyAspGly AsnValLys Ala ValTrpValAsp Glu ATA ACA
ATC

AspAsp ValGly IleTyrThr LeuLysThr IleAspAsp ProArgThr LeuAsn LysThr ValTyrIle ArgProLeu LysAsnIle LeuSerGln LysGlu LeuVal AlaLysTrp GluLysLeu SerGlyLys CysLeuLys LysThr TyrIle SerAlaGlu AspPheLeu AlaGlyIle GluAspGln ProTyr GluHis GlnValGly IleSerHis PheTyrGln MetPheTyr SerGly AspLeu TyrAsnPhe GluIleGly ProAspGly ArgGluAla ThrVal LeuTyr ProGluVal GlnTyrThr ThrMetAsp SerTyrLeu GAAGGTTAAT
GTTCTACGAC
ATGAATCCCA

LysArg TyrLeu
(2) INFORMATION FOR SEQ ID NO:70:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 307 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:70:
Arg Val Leu Ile Val Gly Gly Thr Gly Tyr Ile Gly Arg Lys Phe Val Lys Ala Ser Leu Ala Leu Gly His Pro Thr Phe Val Leu Ser Arg Pro Glu Val Gly Phe Asp Ile Glu Lys Val His Met Leu Leu Ser Phe Lys Gln Ala Gly Ala Arg Leu Leu Glu Gly Ser Phe Glu Asp Phe Gln Ser Leu Val Ala Ala Leu Lys Gln Val Asp Val Val Ile Ser Ala Val Ala Gly Asn His Phe Arg Asn Leu Ile Leu Gln Gln Leu Lys Leu Val Glu Ala Ile Lys Glu Ala Gly Asn Ile Lys Arg Phe Leu Pro Ser Glu Phe 100 105 110 Gly Met Glu Pro Asp Leu Met Glu His Ala Leu Glu Pro Gly Asn Ala Val Phe Ile Asp Lys Arg Lys Val Arg Arg Ala Ile Glu Ala Ala Gly Ile Pro Tyr Thr Tyr Val Ser Ser Asn Ile Phe Ala Gly Tyr Leu Ala Gly Gly Leu Ala Gln Ile Gly Arg Leu Met Pro Pro Arg Asp Glu Val 165 170 175 Val Ile Tyr Gly Asp Gly Asn Val Lys Ala Val Trp Val Asp Glu Asp 180 185 190 Asp Val Gly Ile Tyr Thr Leu Lys Thr Ile Asp Asp Pro Arg Thr Leu 195 200 205 Asn Lys Thr Val Tyr Ile Arg Pro Leu Lys Asn Ile Leu Ser Gln Lys Glu Leu Val Ala Lys Trp Glu Lys Leu Ser Gly Lys Cys Leu Lys Lys Thr Tyr Ile Ser Ala Glu Asp Phe Leu Ala Gly Ile Glu Asp Gln Pro Tyr Glu His Gln Val Gly Ile Ser His Phe Tyr Gln Met Phe Tyr Ser Gly Asp Leu Tyr Asn Phe Glu Ile Gly Pro Asp Gly Arg Glu Ala Thr Val Leu Tyr Pro Glu Val Gln Tyr Thr Thr Met Asp Ser Tyr Leu Lys Arg Tyr Leu
(2) INFORMATION FOR SEQ ID NO:71:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1328 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: Tsuga heterophylla cDNA PLR-Th2
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 20..946
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:71:

CGAGCTAAC ATA
ATG GTG
AGC GGT
AGA GGC
GTT ACA
GGA

Met Ser Arg Val Leu Ile Val Gly Gly Thr Gly 310 315 TyrIleGly ArgLysPhe ValLysAla SerLeu AlaLeuGly HisPro ThrPheVal LeuSerArg ProGluVal GlyPhe AspIleGlu LysVal HisMetLeu LeuSerPhe LysGlnAla GlyAla ArgLeuLeu GluGly SerPheGlu AspPheGln SerLeuVal AlaAla LeuLysGln ValAsp ValValIle SerAlaVal AlaGlyAsn HisPhe ArgAsnLeu IleLeu GlnGlnLeu LysLeuVal GluAlaIle LysGlu AlaArgAsn IleLys ArgPheLeu ProSerGlu PheGlyMet AspPro AspLeuMet GluHis AlaLeuGlu ProGlyAsn AlaValPhe IleAsp LysArgLys ValArg ArgAlaIle GluAlaAla GlyIlePro TyrThr TyrValSer SerAsn Ile Phe Ala Gly Tyr Leu Ala Gly Gly Leu Ala Gln Ile Gly Arg Leu Met Pro Pro Arg Asp Glu Val Val Ile Tyr Gly Asp Gly Asn Val Lys ACA

Ala ValTrp ValAspGlu AspAspVal GlyIle TyrThrLeu LysThr 495 500 505 510 Ile AspAsp ProArgThr LeuAsnLys ThrVal TyrIleArg ProLeu 515 520 525 Lys AsnIle LeuSerGln LysGluLeu ValAla LysTrpGlu LysLeu Ser GlyLys PheLeuLys LysThrTyr IleSer AlaGluAsp PheLeu Ala GlyIle GluAspGln ProTyrGlu HisGln ValGlyIle SerHis Phe TyrGln MetPheTyr SerGlyAsp LeuTyr AsnPheGlu IleGly Pro AspGly ArgGluAla ThrMetLeu TyrPro GluValGln TyrThr ACC ATGGAT TCTTATTTG AAGCGCTAC TTATAAGCAGGAT GAAGGTTAAT

Thr MetAsp SerTyrLeu LysArgTyr Leu GTTCTACGAC ATGAATCCCA CGAGAAATAC CAGAAATCTT CATTCAAGAT CAAATAATGG 1026
(2) INFORMATION FOR SEQ ID NO:72:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 309 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:72:
Met Ser Arg Val Leu Ile Val Gly Gly Thr Gly Tyr Ile Gly Arg Lys Phe Val Lys Ala Ser Leu Ala Leu Gly His Pro Thr Phe Val Leu Ser Arg Pro Glu Val Gly Phe Asp Ile Glu Lys Val His Met Leu Leu Ser Phe Lys Gln Ala Gly Ala Arg Leu Leu Glu Gly Ser Phe Glu Asp Phe Gln Ser Leu Val Ala Ala Leu Lys Gln Val Asp Val Val Ile Ser Ala Val Ala Gly Asn His Phe Arg Asn Leu Ile Leu Gln Gln Leu Lys Leu Val Glu Ala Ile Lys Glu Ala Arg Asn Ile Lys Arg Phe Leu Pro Ser 100 105 110 Glu Phe Gly Met Asp Pro Asp Leu Met Glu His Ala Leu Glu Pro Gly 115 120 125 Asn Ala Val Phe Ile Asp Lys Arg Lys Val Arg Arg Ala Ile Glu Ala 130 135 140 Ala Gly Ile Pro Tyr Thr Tyr Val Ser Ser Asn Ile Phe Ala Gly Tyr Leu Ala Gly Gly Leu Ala Gln Ile Gly Arg Leu Met Pro Pro Arg Asp 165 170 175 Glu Val Val Ile Tyr Gly Asp Gly Asn Val Lys Ala Val Trp Val Asp 180 185 190 Glu Asp Asp Val Gly Ile Tyr Thr Leu Lys Thr Ile Asp Asp Pro Arg Thr Leu Asn Lys Thr Val Tyr Ile Arg Pro Leu Lys Asn Ile Leu Ser 210 215 220 Gln Lys Glu Leu Val Ala Lys Trp Glu Lys Leu Ser Gly Lys Phe Leu Lys Lys Thr Tyr Ile Ser Ala Glu Asp Phe Leu Ala Gly Ile Glu Asp Gln Pro Tyr Glu His Gln Val Gly Ile Ser His Phe Tyr Gln Met Phe Tyr Ser Gly Asp Leu Tyr Asn Phe Glu Ile Gly Pro Asp Gly Arg Glu Ala Thr Met Leu Tyr Pro Glu Val Gln Tyr Thr Thr Met Asp Ser Tyr Leu Lys Arg Tyr Leu
(2) INFORMATION FOR SEQ ID NO:73:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 355 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA probe used to isolate Forsythia intermedia dirigent protein cDNA clone
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:73:

(2) INFORMATION FOR SEQ ID NO:74:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 20 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: other nucleic acid
(A) DESCRIPTION: "PCR primer R20"
(iii) HYPOTHETICAL: NO
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:74:

(2) INFORMATION FOR SEQ ID NO:75:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 19 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: other nucleic acid
(A) DESCRIPTION: "PCR primer U19"
(iii) HYPOTHETICAL: NO
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:75:

(2) INFORMATION FOR SEQ ID NO:76:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 6 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: not relevant
(D) TOPOLOGY: not relevant
(ii) MOLECULE TYPE: peptide (NADPH) binding motif
(iii) HYPOTHETICAL: NO
(v) FRAGMENT TYPE: internal
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:76:
Gly Xaa Gly Xaa Xaa Gly
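
For illustration only, and not as part of the sequence listing, the motif of SEQ ID NO:76 can be located in a reductase sequence with a short Python sketch. It assumes the standard three-letter to one-letter amino acid codes and treats Xaa as "any residue"; the names `THREE_TO_ONE`, `to_one_letter`, `motif_to_regex`, and `find_motif` are hypothetical helpers introduced here, and the test input is the first 16 residues of SEQ ID NO:56 as listed above.

```python
import re

# Standard three-letter to one-letter amino acid codes. "Xaa" (any residue)
# maps to "X" so that a motif such as SEQ ID NO:76 can be converted as well.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
    "Xaa": "X",
}


def to_one_letter(three_letter_seq: str) -> str:
    """Convert a space-separated three-letter sequence to one-letter code."""
    return "".join(THREE_TO_ONE[res] for res in three_letter_seq.split())


def motif_to_regex(motif: str) -> str:
    """Turn a three-letter motif into a regex; Xaa becomes a wildcard."""
    return "".join("." if aa == "X" else aa for aa in to_one_letter(motif))


def find_motif(protein: str, motif: str = "Gly Xaa Gly Xaa Xaa Gly") -> list:
    """Return 1-based start positions of every (possibly overlapping) match."""
    # A lookahead is used so that overlapping occurrences are all reported.
    pattern = re.compile("(?=(" + motif_to_regex(motif) + "))")
    return [m.start() + 1 for m in pattern.finditer(to_one_letter(protein))]


# First 16 residues of SEQ ID NO:56 (Forsythia intermedia reductase).
N_TERMINUS_SEQ_ID_56 = (
    "Met Gly Lys Ser Lys Val Leu Ile Ile Gly Gly Thr Gly Tyr Leu Gly"
)

if __name__ == "__main__":
    print(find_motif(N_TERMINUS_SEQ_ID_56))  # prints [11]
```

Under these assumptions the motif is found at residues 11 to 16 (Gly Thr Gly Tyr Leu Gly) of SEQ ID NO:56. The same call can be applied to any of the other three-letter protein sequences in the listing.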

Claims (58)

The embodiments of the invention in which an exclusive property or privilege is claimed are defined as follows:
1. An isolated protein from a lignan biosynthetic pathway selected from the group consisting of dirigent protein and pinoresinol/lariciresinol reductases, wherein when the isolated protein is a pinoresinol/lariciresinol reductase the isolated protein has an enzymatic activity of at least 51 nmol h⁻¹ mg⁻¹.
2. An isolated protein of Claim 1 having the biological activity of dirigent protein.
3. An isolated protein of Claim 2 having the biological activity of dirigent protein from Forsythia.
4. An isolated protein of Claim 3 having the biological activity of dirigent protein from Forsythia intermedia.
5. An isolated protein of Claim 2 having the biological activity of dirigent protein from Tsuga.
6. An isolated protein of Claim 5 having the biological activity of dirigent protein from Tsuga heterophylla.
7. An isolated protein of Claim 2 having the biological activity of dirigent protein from Thuja.
8. An isolated protein of Claim 7 having the biological activity of dirigent protein from Thuja plicata.
9. An isolated protein of Claim 1 having the biological activity of dirigent protein selected from the group consisting of SEQ ID Nos:13, 15, 17, 19, 21, 23, 25, 27, 29, 31, 33 and 35.
10. An isolated protein of Claim 1 having the biological activity of pinoresinol/lariciresinol reductase.
11. An isolated protein of Claim 10 having the biological activity of pinoresinol/lariciresinol reductase from Forsythia.
12. An isolated protein of Claim 11 having the biological activity of pinoresinol/lariciresinol reductase from Forsythia intermedia.
13. An isolated protein of Claim 10 having the biological activity of pinoresinol/lariciresinol reductase from Tsuga.
14. An isolated protein of Claim 13 having the biological activity of pinoresinol/lariciresinol reductase from Tsuga heterophylla.
15. An isolated protein of Claim 10 having the biological activity of pinoresinol/lariciresinol reductase from Thuja.
16. An isolated protein of Claim 15 having the biological activity of pinoresinol/lariciresinol reductase from Thuja plicata.
17. An isolated protein of Claim 1 having the biological activity of pinoresinol/lariciresinol reductase selected from the group consisting of SEQ ID Nos:48, 50, 52, 54, 56, 58, 62, 64, 66, 68, 70 and 72.
18. An isolated nucleotide sequence encoding a dirigent protein.
19. An isolated nucleotide sequence encoding a dirigent protein from a Forsythia species.
20. A nucleotide sequence of Claim 19 encoding a dirigent protein from Forsythia intermedia.
21. An isolated nucleotide sequence encoding a protein having the biological activity of SEQ ID No:13 or SEQ ID No:15.
22. An isolated nucleotide sequence of Claim 19 which encodes the amino acid sequence of SEQ ID No:13 or SEQ ID No:15.
23. An isolated nucleotide sequence of Claim 19 having the sequence of SEQ ID No:12 or SEQ ID No:14.
24. An isolated nucleotide sequence encoding a dirigent protein from a Tsuga species.
25. A nucleotide sequence of Claim 24 encoding a dirigent protein from Tsuga heterophylla.
26. An isolated nucleotide sequence encoding a protein having the biological activity of SEQ ID No:17 or SEQ ID No:19.
27. An isolated nucleotide sequence of Claim 24 which encodes the amino acid sequence of SEQ ID No:17 or SEQ ID No:19.
28. An isolated nucleotide sequence of Claim 24 having the sequence of SEQ ID No:16 or SEQ ID No:18.
29. An isolated nucleotide sequence encoding a dirigent protein from a Thuja species.
30. A nucleotide sequence of Claim 29 encoding a dirigent protein from Thuja plicata.
31. An isolated nucleotide sequence encoding a protein having the biological activity of any one of SEQ ID Nos:21, 23, 25, 27, 29, 31, 33 or 35.
32. An isolated nucleotide sequence of Claim 29 which encodes the amino acid sequence of any one of SEQ ID Nos:21, 23, 25, 27, 29, 31, 33 or 35.
33. An isolated nucleotide sequence of Claim 29 having the sequence of any one of SEQ ID Nos:20, 22, 24, 26, 28, 30, 32 or 34.
34. An isolated nucleotide sequence encoding a pinoresinol/lariciresinol reductase from a Forsythia species.
35. A nucleotide sequence of Claim 34 encoding a pinoresinol/lariciresinol reductase from Forsythia intermedia.
36. An isolated nucleotide sequence encoding a protein having the biological activity of any one of SEQ ID Nos:48, 50, 52, 54, 56 or 58.
37. An isolated nucleotide sequence of Claim 34 which encodes the amino acid sequence of any one of SEQ ID Nos:48, 50, 52, 54, 56 or 58.
38. An isolated nucleotide sequence of Claim 34 having the sequence of any one of SEQ ID Nos:47, 49, 51, 53, 55 or 57.
39. An isolated nucleotide sequence encoding a pinoresinol/lariciresinol reductase from a Thuja species.
40. A nucleotide sequence of Claim 39 encoding a pinoresinol/lariciresinol reductase from Thuja plicata.
41. An isolated nucleotide sequence encoding a protein having the biological activity of any one of SEQ ID Nos:62, 64, 66 or 68.
42. An isolated nucleotide sequence of Claim 39 which encodes the amino acid sequence of any one of SEQ ID Nos:62, 64, 66 or 68.
43. An isolated nucleotide sequence of Claim 39 having the sequence of any one of SEQ ID Nos:61, 63, 65 or 67.
44. An isolated nucleotide sequence encoding a pinoresinol/lariciresinol reductase from a Tsuga species.
45. A nucleotide sequence of Claim 44 encoding a pinoresinol/lariciresinol reductase from Tsuga heterophylla.
46. An isolated nucleotide sequence encoding a protein having the biological activity of SEQ ID No:70 or SEQ ID No:72.
47. An isolated nucleotide sequence of Claim 44 which encodes the amino acid sequence of SEQ ID No:70 or SEQ ID No:72.
48. An isolated nucleotide sequence of Claim 44 having the sequence of SEQ ID No:69 or SEQ ID No:71.
49. A replicable expression vector comprising a nucleotide sequence encoding a protein having the biological activity of a dirigent protein selected from the group consisting of SEQ ID Nos:13, 15, 17, 19, 21, 23, 25, 27, 29, 31, 33 and 35.
50. A replicable expression vector comprising a nucleotide sequence encoding a protein having the biological activity of a pinoresinol/lariciresinol reductase selected from the group consisting of SEQ ID Nos:48, 50, 52, 54, 56, 58, 62, 64, 66, 68, 70 and 72.
51. A host cell comprising a vector of Claim 49.
52. A host cell comprising a vector of Claim 50.
53. A method of enhancing the expression of pinoresinol/lariciresinol reductase in a suitable host cell comprising introducing into the host cell an expression vector that comprises a nucleotide sequence encoding a protein having the biological activity of a protein selected from the group consisting of SEQ ID Nos:48, 50, 52, 54, 56, 58, 62, 64, 66, 68, 70 and 72.
54. A method of modifying the expression of pinoresinol/lariciresinol reductase in a suitable host cell comprising introducing into the host cell an expression vector that comprises a nucleotide sequence that expresses an RNA that is complementary to all or part of a nucleic acid molecule selected from the group consisting of SEQ ID Nos:47, 49, 51, 53, 55, 57, 61, 63, 65, 67, 69 and 71.
55. A method of enhancing the expression of dirigent protein in a suitable host cell comprising introducing into the host cell an expression vector that comprises a nucleotide sequence encoding a protein having the biological activity of a protein selected from the group consisting of SEQ ID Nos:13, 15, 17, 19, 21, 23, 25, 27, 29, 31, 33 and 35.
56. A method of modifying the expression of dirigent protein in a suitable host cell comprising introducing into the host cell an expression vector that comprises a nucleotide sequence that expresses an RNA that is complementary to all or part of a nucleic acid molecule selected from the group consisting of SEQ ID Nos:12, 14, 16, 18, 20, 22, 24, 26, 28, 30, 32 and 34.
57. A method of producing optically-pure lignans comprising introducing into a host cell an expression vector that comprises a nucleotide sequence encoding a dirigent protein capable of directing a bimolecular phenoxy coupling reaction to produce an optically pure lignan, and purifying the optically pure lignan from the host cell.
58. The method of Claim 57 wherein the nucleotide sequence is selected from the group consisting of SEQ ID Nos:12, 14, 16, 18, 20, 22, 24, 26, 28, 30, 32 and 34.
CA002270905A 1996-11-08 1997-11-07 Recombinant pinoresinol/lariciresinol reductase, recombinant dirigent protein, and methods of use Abandoned CA2270905A1 (en)

Applications Claiming Priority (5)

Application Number Priority Date Filing Date Title
US3052296P 1996-11-08 1996-11-08
US60/030,522 1996-11-08
US5438097P 1997-07-31 1997-07-31
US60/054,380 1997-07-31
PCT/US1997/020391 WO1998020113A1 (en) 1996-11-08 1997-11-07 Recombinant pinoresinol/lariciresinol reductase, recombinant dirigent protein, and methods of use

Publications (1)

Publication Number Publication Date
CA2270905A1 (en) 1998-05-14

Family

ID=26706133

Family Applications (1)

Application Number Title Priority Date Filing Date
CA002270905A Abandoned CA2270905A1 (en) 1996-11-08 1997-11-07 Recombinant pinoresinol/lariciresinol reductase, recombinant dirigent protein, and methods of use

Country Status (5)

Country Link
EP (1) EP0948602A1 (en)
JP (1) JP2001507931A (en)
AU (1) AU728116B2 (en)
CA (1) CA2270905A1 (en)
WO (1) WO1998020113A1 (en)

Families Citing this family (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6210942B1 (en) * 1996-11-08 2001-04-03 Washington State University Research Foundation Recombinant pinoresinol/lariciresinol reductase, recombinant dirigent protein, and methods of use
US20020174452A1 (en) * 2000-09-07 2002-11-21 Lewis Norman G. Monocot seeds with increased lignan content
WO2002061039A2 (en) * 2000-10-25 2002-08-08 Washington State University Research Foundation Thuja plicata dirigent protein promotors
CN1886499B (en) 2003-09-30 2011-10-05 Suntory Holdings Limited A gene encoding an enzyme for catalyzing biosynthesis of lignan, and use thereof
JP4667007B2 (en) 2004-11-02 2011-04-06 Suntory Holdings Limited Lignan glycosylation enzyme and its use
US8288613B2 (en) 2007-12-28 2012-10-16 Suntory Holdings Limited Lignan hydroxylase
CN112322621B (en) * 2020-11-10 2022-07-22 Guizhou University Eucommia DIR1 gene MeJA response promoter and application thereof
CN113603757B (en) * 2021-08-20 2023-05-26 Kunming University of Science and Technology Lily regale Dirigent similar protein gene LrDIR1 and application thereof

Also Published As

Publication number Publication date
JP2001507931A (en) 2001-06-19
WO1998020113A1 (en) 1998-05-14
AU5199398A (en) 1998-05-29
EP0948602A1 (en) 1999-10-13
AU728116B2 (en) 2001-01-04

Similar Documents

Publication Publication Date Title
CA2302873C (en) Nucleic and amino acid sequences for a novel transketolase from (mentha piperita)
CA2052792C (en) Method and composition for increasing sterol accumulation in higher plants
CA2270711A1 (en) Improved production of isoprenoids
CA2348155A1 (en) Nucleic acids encoding taxus geranylgeranyl diphosphate synthase, and methods of use
US6210942B1 (en) Recombinant pinoresinol/lariciresinol reductase, recombinant dirigent protein, and methods of use
CA2306207A1 (en) Geranyl diphosphate synthase from mint (mentha piperita)
CA2304799A1 (en) Monoterpene synthases from common sage (salvia officinalis)
CA2353084A1 (en) Plant 1-deoxy-d-xylulose 5-phosphate reductoisomerase
CA2270905A1 (en) Recombinant pinoresinol/lariciresinol reductase, recombinant dirigent protein, and methods of use
US6420159B2 (en) 1-deoxy-D-xylulose-5-phosphate reductoisomerases, and methods of use
CA2353306A1 (en) Nucleic acid sequences encoding isoflavone synthase
CA2276110A1 (en) Gene for adenylate cyclase and its use
EP1226265B1 (en) Bioproduction of para-hydroxycinnamic acid
CA2330167A1 (en) Genes of carotenoid biosynthesis and metabolism and methods of use thereof
CA2326380A1 (en) Recombinant secoisolariciresinol dehydrogenase, and methods of use
US6703229B2 (en) Aryl propenal double bond reductase
MXPA01000843A (en) Recombinant dehydrodiconiferyl alcohol benzylic ether reductase, and methods of use.
CA2381710A1 (en) D-gluconolactone oxidase gene and method for producing recombinant d-gluconolactone oxidase
CA2498381A1 (en) Acc gene
MXPA00010446A (en) Recombinant secoisolariciresinol dehydrogenase, and methods of use

Legal Events

Date Code Title Description
EEER Examination request
FZDE Dead