CA2319853A1

CA2319853A1 - Floral homeotic genes for manipulation of flowering in poplar trees and other plant species

Info

Publication number: CA2319853A1
Application number: CA 2319853
Authority: CA
Inventors: Steven H. Strauss; William H. Rottman; Amy M. Brunner; Lorraine A. Sheppard
Original assignee: Individual
Current assignee: Oregon State Board of Higher Education
Priority date: 1999-10-01
Filing date: 2000-10-02
Publication date: 2001-04-01

Abstract

Four floral homeotic genes from Poplar are disclosed. The disclosed nucleic acid molecules are useful for producing transgenic plants having modified fertility characteristics, particularly sterility.

Description

RJP:SLR 245-53375 9/29/00 Express Mail Label No.: EM295145989US
Date of Deposit: October 1, 1999
FLORAL HOMEOTIC GENES FOR MANIPULATION OF FLOWERING IN POPLAR AND OTHER PLANT SPECIES
FIELD
This invention relates to nucleic acid molecules isolated from Populus species, and methods of using these molecules and derivatives thereof to produce plants, particularly trees such as Populus species, that have modified fertility characteristics.
BACKGROUND
The increasing demand for pulp and paper products and the diminishing availability of productive forest lands are being addressed in part by efforts to develop trees that produce increased yields in shorter growth periods. Many such efforts are focused on the production of transgenic trees having modified growth characteristics, such as reduced lignin content (see, for example, U.S. Patent No. 5,451,514, "Modification of Lignin Synthesis in Plants"), and resistance to insects, viruses and herbicides. A major concern with the production of transgenic trees is the possibility that the transgenic traits might be introduced into indigenous tree populations by cross-fertilization. Thus, for example, the introduction of genes for insect resistance into indigenous tree populations could accelerate the evolution of resistant insects, adversely affect endangered insect species and interfere with normal food chains. Because of these concerns, the U.S. and other governments have instituted regulatory review processes to assess the risks associated with proposed environmental releases of transgenic plants (both for field trials and commercial production).
Genetic engineering of sterility into trees offers the possibility of securing introduced genes in the engineered tree; trees that produce neither pollen nor seeds will not be able to transmit introduced genes by normal routes of reproduction. Additional potential benefits of engineering sterility into trees include increased wood yields and reduced production of allergens such as pollen. For a review of engineering reproductive sterility in forest trees, see Strauss et al. (1995a,b).
Two primary methods for engineering sterility have been described. In the first method, termed genetic ablation, a cytotoxic gene is expressed under the control of a reproductive tissue-specific promoter. Cytotoxic genes employed in this method to date include RNase (Mariani et al., 1990; Mariani et al., 1992; Reynaerts et al., 1993; Goldman et al., 1994), ADP-ribosyl transferase (Thorsness et al., 1991; Kandasamy, 1993; Thorsness et al., 1993), the Agrobacterium rolC gene (Schmulling, 1993), and glucanase (Worrall et al., 1992; Paul et al., 1992).
The expression of the cytotoxic gene results (ideally) in the death of all cells in which the reproductive tissue-specific promoter is active. It is therefore critical that the promoter be highly specific to the reproductive tissue to avoid pleiotropic effects on vegetative tissue. For this reason, genome position effects on the transgene need to be monitored (see Strauss et al., 1995a,b). The success of genetic ablation methods in trees will thus depend on the availability of a suitable reproductive tissue-specific promoter for the tree species in question.
The second method for engineering sterility involves inhibiting the expression of genes that are essential for reproduction. This can be accomplished in a number of ways, including the use of antisense RNA, sense suppression and promoter-based suppression.
Details and applications of antisense (Kooter, 1993; Mol et al., 1994; Van der Meer et al., 1992; Pnueli et al., 1994), sense suppression (Flavell, 1994; Jorgensen, 1992; Taylor et al., 1992) and promoter-based suppression (Brusslan et al., 1993; Matzke et al., 1993) technologies in plants have been described in the scientific literature. The key to the use of any of these methods in the production of sterile trees is the identification of appropriate indigenous genes, i.e., disruption of the expression of such genes must result in the abolition of correct reproductive tissue development.
Genes specifically expressed in reproductive tissues have been isolated from a number of plant species (for a review, see Strauss et al., 1995a). Genes that have been characterized as acting early in the development of floral structures include LEAFY (LFY) from Arabidopsis (Weigel et al., 1992), APETALA1 (AP1) from Arabidopsis (Mandel et al., 1992a,b), and FLORICAULA (FLO) from Antirrhinum (Coen et al., 1990), which regulate the transition from inflorescence to floral meristems. APETALA2 (AP2) appears to regulate the AGAMOUS gene (AG), which plays a role in differentiation of male and female floral tissues (see Okamuro et al., 1993). DEFICIENS (DEF) is a floral homeotic gene from Antirrhinum that is expressed throughout flower development (Schwarz-Sommer et al., 1992).
The majority of floral homeotic genes are members of the MADS-box family of transcription factors (Yanofsky et al., 1990). The MADS-box is a conserved region of approximately 60 amino acid residues. MADS is an acronym for the first four known genes in which the MADS-box was identified: yeast minichromosomal maintenance factor (MCM1), the floral homeotic genes AG and DEF, and human serum response factor (SRF). Plant MADS-box genes contain four domains: the highly conserved MADS-box region located near or at the 5' end of the translated region in plant genes; the L or linker region between the MADS and K domains; the K domain, a moderately conserved keratin-like region predicted to form amphipathic α-helices; and a highly variable carboxy-terminal region. The K-box is only present in plant MADS-box genes. It is thought to be involved in protein-protein interactions (Pnueli et al., 1991).
Studies have shown that the organization of the MADS domain in plants is similar to that in SRF; the basic N-terminal portion of the domain is required for DNA-binding and the C-terminal half of the box is required for dimerization. Because MADS proteins bind DNA as dimers, the MADS box as well as a C-terminal extension that is involved in dimerization are required for DNA-binding. The C-terminal extension varies throughout the gene family. C-terminal deletions indicate that the minimal DNA-binding domain of AP1 and AG includes the MADS-box and part of the L region, whereas AP3 and PI require a portion of the K box in addition to the MADS and L regions (Riechmann et al., 1996). The difference in the sizes of the minimal binding domains is thought to reflect the dimerization characteristics of the respective proteins: AP1 and AG bind DNA as homodimers whereas AP3/PI and their Antirrhinum homologs DEF/GLO bind as heterodimers.
MADS-box proteins bind to a motif in target gene promoters referred to as the CArG-box. CArG-box motifs are also found in the promoters of MADS-box genes, where they are thought to be targets for auto-regulation. Riechmann et al. (1996) used circular permutation and phasing analysis to detect conformational changes in DNA that resulted from MADS-box protein binding. They found that bound AP1, AP3/PI, and AG all induce DNA bending oriented toward the minor groove. For a review of MADS box biology, see Ma, 1994; Purugganan et al., 1995; and Yanofsky, 1995. AG and DEF have been characterized as MADS box genes; while FLO and LFY appear to encode transcription factors and have proline-rich and acidic domains, they are not MADS box genes.
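As a rough illustration of how candidate CArG-box motifs might be located in a promoter sequence, the sketch below scans a DNA string for the commonly cited consensus CC(A/T)6GG. The consensus pattern, the function name find_carg_boxes and the example sequence are assumptions introduced here for illustration only; they are not taken from this disclosure, and functional binding sites may deviate from the consensus.

```python
import re

# Commonly cited CArG-box consensus: CC, six A/T bases, GG.
# Assumption: the disclosure itself does not state the motif sequence.
# A lookahead is used so that overlapping matches are also reported.
CARG_PATTERN = re.compile(r"(?=(CC[AT]{6}GG))")


def find_carg_boxes(sequence):
    """Return (1-based start, motif) for every consensus match on the given strand."""
    seq = sequence.upper()
    return [(m.start() + 1, m.group(1)) for m in CARG_PATTERN.finditer(seq)]


# Hypothetical promoter fragment, not from the Sequence Listing.
example = "ttgaCCATATATGGacgtCCTTTTAAGGa"
hits = find_carg_boxes(example)
for start, motif in hits:
    print(start, motif)
```

Because the reverse complement of CC(A/T)6GG again fits the same pattern, scanning one strand is sufficient for this particular consensus.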
Following a functional analysis of MADS box genes, Mizukami et al. (1996) created deletion mutants of AG in which various domains of the gene, including the MADS and K boxes, were deleted. Based on their results, they proposed that dominant negative mutations of MADS box genes could be created by deleting all or part of the MADS domain, by deleting all or part of the K domain, or by deleting various portions of the 3' region of the AG open reading frame.
It was proposed that the proteins encoded by these deletion mutants would be able to bind either the target DNA (i.e., the nucleotide sequence to which the transcription factor binds) or the protein co-factors required for transcription, but not both. Thus, it was proposed that such mutant proteins would interfere with the functioning of the coexisting corresponding endogenous gene. The studies of floral homeotic genes discussed in the preceding paragraphs have been primarily undertaken in model plants such as Arabidopsis and Antirrhinum; few, if any, studies have addressed the genetics of flowering in tree species at the molecular level.
Species of the genus Populus are becoming increasingly important in the forestry industry, particularly for pulp and paper production, in part because of their fast growth characteristics. This group includes aspens (species of Populus section Leuce and their hybrids), and hybrids between black cottonwood (P. trichocarpa Torr. and Gray, also classified as P. balsamifera subsp. trichocarpa; Brayshaw, 1965) and eastern cottonwood (P. deltoides L.). These species are also well suited to manipulation by genetic engineering because they are fast-growing, have relatively small genomes, are easy to regenerate in vitro, and are susceptible to transformation with Agrobacterium. To date, however, relatively few genes have been cloned from these species. Notably, the genetic basis underlying floral development in these species is almost completely uncharacterized.
Floral development in the genus Populus is significantly different from what is seen in a typical hermaphroditic annual (Nagaraj, 1952; Boes and Strauss, 1994). The apices of the branches do not become inflorescences. The flowers are borne on axillary inflorescences, or catkins, with male and female flowers found on separate trees, although occasionally mixed inflorescences or hermaphroditic flowers are seen. The inflorescences appear from dormant buds in the spring, usually occurring from about five years of age. Instead of the usual structure of four concentric whorls of organs (sepals outermost, followed by petals, then stamens surrounding one or more carpels in the center), the Populus flower apparently has only two whorls (a reduced perianth cup surrounding either stamens or carpels). Unlike several other species that produce unisexual flowers through developmental arrest or degeneration of one set of organs (Cheng et al., 1983; Grant et al., 1994), Populus does not initiate male organs in female flowers or vice versa (Boes and Strauss, 1994; Sheppard, 1997). After releasing pollen or seeds, the entire inflorescences are shed (Kaul, 1995). By late spring, the inflorescence buds for the next year's flowers have already been initiated in the axils of the current year's leaves, and will develop for several more months before going dormant.
The availability of genes that control floral development in Populus species would permit the production of genetically engineered sterile trees. In turn, the ability to control fertility of Populus trees in this way would be of great value for the environmental safety and biosafety of Populus trees engineered for improved agronomic characteristics. It is to such genes that the present invention is directed.
SUMMARY OF THE DISCLOSURE
The present invention provides four floral homeotic genes from Populus trichocarpa. The four genes are herein termed PTLF, PTD, PTAG-1 and PTAG-2. These genes are homologs of floral homeotic genes isolated from other plant species. Specifically, PTLF is a homolog of LEAFY (LFY) and FLORICAULA (FLO), PTD is a homolog of DEFICIENS (DEF), and PTAG-1 and PTAG-2 are homologs of AGAMOUS (AG). The Populus genes are shown to be expressed in floral tissues; for example, PTLF is expressed in immature inflorescences on which floral primordia are developing, whereas PTD is expressed strongly in stamen primordia from the onset of organogenesis. PTD is also expressed at low levels in carpel primordia.
The invention provides the nucleic acid sequences of these four Populus genes, the corresponding cDNA sequences and the deduced amino acid sequences of the encoded polypeptides. Along with these sequences, the present invention also provides methods of using the gene and cDNA sequences to produce genetically engineered Populus species and other trees having modified fertility characteristics, including sterility.
Genetic constructs useful in producing genetically engineered Populus and other trees include antisense versions of PTLF, PTD, PTAG-1 and PTAG-2, dominant negative mutants of these genes, and constructs useful for sense suppression. In addition, the promoter sequences of these genes may be used to obtain floral-specific expression of genes such as cytotoxins that may be employed in genetic ablation strategies to produce trees having modified fertility characteristics, including sterility.
In one aspect, the invention provides isolated nucleic acid molecules comprising portions of the disclosed nucleic acid sequences. Such molecules comprise at least 15 consecutive nucleotides of the disclosed PTLF, PTD, PTAG-1 or PTAG-2 nucleic acid sequences, and may be longer, comprising at least 20, 25, 50, or 100 consecutive nucleotides of these sequences. Such molecules are useful, among other things, as primers and probes for amplifying all or parts of the disclosed sequences and for detecting the expression of the nucleic acid molecules in cells, such as cells of transgenic plants. Thus, in one aspect, such molecules are useful to monitor the expression of transgenes comprising some portion of the PTD, PTLF, PTAG-1 or PTAG-2 molecules.
Modification of the fertility traits of plants, such as Populus species, may also be obtained by introducing genetic constructs containing variants of all or portions of the disclosed PTD, PTLF, PTAG-1 or PTAG-2 sequences. Such variants are provided by the invention and may comprise a nucleotide sequence of at least 50 (or, for example, at least 100) nucleotides in length, which sequence hybridizes under stringent conditions to the disclosed nucleic acid sequences.
Alternatively, such variants may share a specified percentage of sequence identity with the disclosed nucleic acid sequences (e.g., at least 75% or at least 90% sequence identity) as determined using a specified sequence alignment program.
The disclosed nucleic acid molecules and variant forms of these molecules may be assembled in nucleic acid vectors for introduction into cells, such as plant cells. Thus, another aspect of the invention comprises the disclosed nucleic acid molecules and variants thereof, and vectors comprising these molecules.
In another embodiment, the invention provides transgenic plants comprising the vectors.
Such transgenic plants may have altered phenotypes (compared to non-transgenic plants of the same species) including modified fertility characteristics. Modified fertility characteristics include modifications in the timing of flowering, for example, advancing the timing of flowering relative to non-transgenic plants of the same species, and sterility. Sterility may be complete sterility, or may be male only or female only sterility. Examples of transgenic plants provided by the present invention include genetically engineered sterile Populus and Eucalyptus species.
In another embodiment, the invention provides transgenic plants that comprise a recombinant expression cassette, wherein the recombinant expression cassette comprises a promoter sequence operably linked to a first nucleic acid sequence, and wherein the first nucleic acid sequence comprises all or part of one of the disclosed nucleic acid molecules, or a variant of one of the disclosed nucleic acid molecules. By way of example, such transgenic plants include plants in which the first nucleic acid is arranged in reverse orientation to the promoter sequence in the recombinant expression cassette, such that an antisense RNA is produced.
In another example, such transgenic plants include plants in which the first nucleic acid is a dominant negative mutant of PTD, PTLF, PTAG-1 or PTAG-2, produced by deletion of part of the coding region, such as the 3' portion of the open reading frame, or all or part of a MADS or K-box region of the coding region. In other embodiments, the promoter sequence driving expression of the first nucleic acid may be a promoter that confers enhanced expression of the first nucleic acid molecule in floral tissues of the plant relative to non-floral tissues.
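The relationship between a sense sequence and the antisense RNA transcribed from an insert placed in reverse orientation can be sketched as follows. The function names and the short example fragment are hypothetical and serve only to illustrate the reverse-complement relationship; they do not represent any disclosed construct.

```python
# Watson-Crick pairing for DNA bases (upper and lower case).
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(dna):
    """Return the reverse complement of a DNA string."""
    return dna.translate(_COMPLEMENT)[::-1]


def antisense_rna(sense_dna):
    """Return the RNA complementary to the sense strand, written 5' to 3'.

    This is the transcript expected when the sense sequence is cloned in
    reverse orientation relative to the promoter.
    """
    return reverse_complement(sense_dna).upper().replace("T", "U")


sense = "ATGGCTCGT"  # hypothetical sense-strand fragment
antisense = antisense_rna(sense)
print(antisense)
```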
In other embodiments, the expression of at least one endogenous gene in transgenic plants containing such a recombinant expression cassette will be modified as a result of the cassette. In particular embodiments, that modified expression will affect the fertility of the plant, and will render the plant sterile.
In yet other embodiments, the invention provides transgenic plants comprising a recombinant expression cassette, wherein the recombinant expression cassette comprises a promoter sequence operably linked to a first nucleic acid sequence, and wherein the promoter sequence is a promoter sequence from PTD, PTLF, PTAG-1 or PTAG-2. In particular embodiments, the first nucleic acid sequence encodes a cytotoxic polypeptide.
These and other aspects of the invention are described in more detail below.
SEQUENCE LISTING
The nucleic and amino acid sequences listed in the accompanying Sequence Listing are shown using standard letter abbreviations for nucleotide bases, and three letter code for amino acids. Only one strand of each nucleic acid sequence is shown, but the complementary strand is understood to be included by any reference to the displayed strand.
Seq. I.D. No. 1 shows the nucleic acid sequence of the PTD gene. The sequence comprises the following regions:

Nucleotide numbers    Feature
1-1872                5' regulatory region
1752-1756             probable CAAT box
1782-1786             probable CAAT box
1845-1851             probable TATA box
1873-2188             Exon 1 (including inferred 5' UTR)
2189-2327             Intron 1
2328-2394             Exon 2
2395-2484             Intron 2
2485-2546             Exon 3
2547-2652             Intron 3
2653-2752             Exon 4
2753-3309             Intron 4
3310-3351             Exon 5
3352-3432             Intron 5
3433-3477             Exon 6
3478-3584             Intron 6
3585-4000             Exon 7
3765-4285             3' regulatory region (including 3' UTR)
3765-4000             3' UTR
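The PTD exon and intron coordinates listed above can be checked for internal consistency with a short script. The sketch below uses only the coordinates given for the PTD gene; the variable and function names are hypothetical, and the exon total it reports includes the inferred 5' UTR and the 3' UTR, so it is not expected to equal the length of the ORF.

```python
# PTD exon/intron coordinates as listed for the PTD gene (1-based, inclusive).
PTD_FEATURES = [
    ("Exon 1", 1873, 2188), ("Intron 1", 2189, 2327),
    ("Exon 2", 2328, 2394), ("Intron 2", 2395, 2484),
    ("Exon 3", 2485, 2546), ("Intron 3", 2547, 2652),
    ("Exon 4", 2653, 2752), ("Intron 4", 2753, 3309),
    ("Exon 5", 3310, 3351), ("Intron 5", 3352, 3432),
    ("Exon 6", 3433, 3477), ("Intron 6", 3478, 3584),
    ("Exon 7", 3585, 4000),
]


def feature_length(start, end):
    """Length of a 1-based inclusive interval."""
    return end - start + 1


def is_contiguous(features):
    """True if each feature starts immediately after the previous one ends."""
    return all(nxt[1] == prev[2] + 1 for prev, nxt in zip(features, features[1:]))


exon_lengths = {
    name: feature_length(start, end)
    for name, start, end in PTD_FEATURES
    if name.startswith("Exon")
}

print("contiguous:", is_contiguous(PTD_FEATURES))
print("total exon length:", sum(exon_lengths.values()))
```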

Seq. I.D. No. 2 shows the nucleic acid sequence of the PTD cDNA.
Seq. I.D. No. 3 shows the nucleic acid sequence of the PTD ORF.
Seq. I.D. No. 4 shows the amino acid sequence of the PTD polypeptide. The sequence comprises the following regions:
Amino Acid numbers    Feature
1-57                  MADS domain
87-154                K-domain
Seq. I.D. No. 5 shows the nucleic acid sequence of the PTLF gene. The sequence comprises the following regions:

Nucleotide numbers    Feature
1-2638                5' regulatory region
2477-2481             probable CAAT box
2536-2542             probable TATA box
2568-2574             probable TATA box
2628-3074             Exon 1
3075-3655             Intron 1
3656-3990             Exon 2
3991-4679             Intron 2
4680-5197             Exon 3
5043-5197             3' UTR
5043-5656             3' regulatory region (including 3' UTR)

Seq. I.D. No. 6 shows the nucleic acid sequence of the PTLF cDNA.
Seq. I.D. No. 7 shows the nucleic acid sequence of the PTLF ORF.
Seq. I.D. No. 8 shows the amino acid sequence of the PTLF polypeptide.
Seq. I.D. No. 9 shows the nucleic acid sequence of the PTAG-1 gene. The sequence comprises the following regions:
Nucleotide numbers    Feature
1-2410                5' regulatory region
2411-2588             Exon 1
2589-3056             Intron 1
3057-3296             Exon 2
3297-8161             Intron 2
8162-8243             Exon 3
8244-8894             Intron 3
8895-8956             Exon 4
8957-9041             Intron 4
9042-9141             Exon 5
9142-9284             Intron 5
9285-9326             Exon 6
9327-9529             Intron 6
9530-9571             Exon 7
9572-9711             Intron 7
9712-9878             Exon 8
9879-10930            Intron 8
10931-11215           Exon 9
10935-11485           3' regulatory region (including 3' UTR)
10935-11215           3' UTR

Seq. I.D. No. 10 shows the nucleic acid sequence of the PTAG-1 cDNA.
Seq. I.D. No. 11 shows the nucleic acid sequence of the PTAG-1 ORF.
Seq. I.D. No. 12 shows the amino acid sequence of the PTAG-1 polypeptide. The sequence comprises the following regions:
Amino Acid numbers    Feature
17-72                 MADS domain
106-172               K-domain
Seq. I.D. No. 13 shows the nucleic acid sequence of the PTAG-2 gene. The sequence comprises the following regions:
Nucleotide numbers    Feature
1-2336                5' regulatory region
2118-2122             probable CAAT box
2256-2262             probable TATA box
2337-2421             Exon 1
2422-2913             Intron 1
2914-3153             Exon 2
3154-7035             Intron 2
7036-7117             Exon 3
7118-7946             Intron 3
7947-8008             Exon 4
8009-8094             Intron 4
8095-8194             Exon 5
8195-8331             Intron 5
8332-8373             Exon 6
8374-8529             Intron 6
8530-8571             Exon 7
8572-8700             Intron 7
8701-8863             Exon 8
8864-9396             Intron 8
9397-9691             Exon 9
8863-10007            3' regulatory region (including 3' UTR)
8863-8863 joined to 9397-9691    3' UTR

Seq. I.D. No. 14 shows the nucleic acid sequence of the PTAG-2 cDNA.
Seq. I.D. No. 15 shows the nucleic acid sequence of the PTAG-2 ORF.
Seq. I.D. No. 16 shows the amino acid sequence of the PTAG-2 polypeptide. The sequence comprises the following regions:
Amino Acid numbers    Feature
16-72                 MADS domain
106-172               K-domain
Seq. I.D. Nos. 17-24 show oligonucleotide primers that may be used to amplify portions of the disclosed floral homeotic nucleic acid sequences.
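When planning deletions of a MADS or K-domain coding region, it can be convenient to convert the residue ranges listed above into nucleotide positions within the corresponding ORF. The sketch below does this under the assumption that each ORF sequence begins at the first nucleotide of codon 1; that assumption, and the function and variable names, are introduced here for illustration and are not stated in the disclosure.

```python
# MADS and K-domain residue ranges as listed for the three MADS-box polypeptides.
DOMAINS = {
    "PTD": {"MADS": (1, 57), "K": (87, 154)},
    "PTAG-1": {"MADS": (17, 72), "K": (106, 172)},
    "PTAG-2": {"MADS": (16, 72), "K": (106, 172)},
}


def aa_range_to_orf_nt(aa_start, aa_end):
    """Map a 1-based inclusive residue range to a 1-based inclusive ORF range.

    Assumes codon 1 occupies ORF nucleotides 1-3.
    """
    return (3 * (aa_start - 1) + 1, 3 * aa_end)


orf_ranges = {
    gene: {name: aa_range_to_orf_nt(*aa) for name, aa in domains.items()}
    for gene, domains in DOMAINS.items()
}

for gene, ranges in orf_ranges.items():
    print(gene, ranges)
```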
DETAILED DESCRIPTION
I. Definitions and Abbreviations
Unless otherwise noted, technical terms are used according to conventional usage. Definitions of common terms in molecular biology may be found in Benjamin Lewin, Genes V, published by Oxford University Press, 1994 (ISBN 0-19-854287-9); Kendrew et al. (eds.), The Encyclopedia of Molecular Biology, published by Blackwell Science Ltd., 1994 (ISBN 0-632-02182-9); and Robert A. Meyers (ed.), Molecular Biology and Biotechnology: a Comprehensive Desk Reference, published by VCH Publishers, Inc., 1995 (ISBN 1-56081-569-8).

In order to facilitate review of the various embodiments of the invention, the following definitions of terms are provided:
Isolated: An "isolated" biological component (such as a nucleic acid or protein or organelle) has been substantially separated or purified away from other biological components in the cell of the organism in which the component naturally occurs, i.e., other chromosomal and extra-chromosomal DNA and RNA, proteins and organelles. Nucleic acids and proteins that have been "isolated" include nucleic acids and proteins purified by standard purification methods. The term also embraces nucleic acids and proteins prepared by recombinant expression in a host cell as well as chemically synthesized nucleic acids.
cDNA (complementary DNA): A piece of DNA lacking internal, non-coding segments (introns). cDNA is synthesized in the laboratory by reverse transcription from messenger RNA extracted from cells.
Oligonucleotide: A linear polynucleotide sequence of up to about 100 nucleotide bases in length.
Operably linked: A first nucleic acid sequence is operably linked with a second nucleic acid sequence when the first nucleic acid sequence is placed in a functional relationship with the second nucleic acid sequence. For instance, a promoter is operably linked to a coding sequence if the promoter affects the transcription or expression of the coding sequence.
Generally, operably linked DNA sequences are contiguous and, where necessary to join two protein-coding regions, in the same reading frame.
ORF (open reading frame): A series of nucleotide triplets (codons) coding for amino acids without any termination codons. These sequences are usually translatable into a peptide.
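The ORF definition above can be illustrated with a minimal forward-strand scan. This is a sketch under stated assumptions: it treats ATG as the start codon and TAA, TAG and TGA as termination codons (the standard genetic code), reports only ORFs that end at an in-frame stop, and uses a hypothetical function name and test sequence.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def longest_orf(dna):
    """Return the longest ATG-initiated run of in-frame codons with no stop codon.

    Only the three forward reading frames are scanned, the terminating stop
    codon is excluded from the result, and an empty string is returned when
    no stop-terminated ORF is found.
    """
    seq = dna.upper()
    best = ""
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                if i - start > len(best):
                    best = seq[start:i]
                start = None
    return best


# Hypothetical sequence containing two ORFs in the same frame.
print(longest_orf("CCATGGCTTGATAAATGAAACCCGGGTAG"))
```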
Ortholog: Two nucleotide or amino acid sequences are orthologs of each other if they share a common ancestral sequence and diverged when a species carrying that ancestral sequence split into two species. Orthologous sequences are also homologous sequences.
Probes and primers: Molecules useful as nucleic acid probes and primers may readily be prepared based on the nucleic acids provided by this invention. Typically, but not necessarily, such molecules are oligonucleotides, i.e., linear nucleic acid molecules of up to about 100 nucleotide bases in length. However, longer nucleic acid molecules, up to and including the full length of a particular floral homeotic gene, may also be employed for such purposes.
A nucleic acid probe comprises at least one copy (and typically many copies) of an isolated nucleic acid molecule of known sequence that is used in a nucleic acid hybridization protocol.
Generally (but not always) the nucleic acid molecule is attached to a detectable label or reporter molecule. Typical labels include radioactive isotopes, ligands, chemiluminescent agents, and enzymes. Methods for labeling and guidance in the choice of labels appropriate for various purposes are discussed, e.g., in Sambrook et al. (1989) and Ausubel et al. (1987).
Primers are short nucleic acids, usually DNA oligonucleotides 8-10 nucleotides or more in length, and more typically 15-25 nucleotides in length. Primers may be annealed to a complementary target DNA strand by nucleic acid hybridization to form a hybrid between the primer and the target DNA strand, and then extended along the target DNA strand by a DNA polymerase enzyme. Primer pairs can be used for amplification of a nucleic acid sequence, e.g., by the polymerase chain reaction (PCR) or other nucleic-acid amplification methods known in the art.
Methods for preparing and using probes and primers are described, for example, in Sambrook et al. (1989), Ausubel et al. (1987), and Innis et al. (1990).
PCR primer pairs can be derived from a known sequence, for example, by using computer programs intended for that purpose such as Primer (Version 0.5, © 1991, Whitehead Institute for Biomedical Research, Cambridge, MA).
One of skill in the art will appreciate that the specificity of a particular probe or primer increases with its length. Thus, for example, a primer comprising 20 consecutive nucleotides of the cDNA disclosed in Seq. I.D. No. 2 will anneal to a target sequence such as a homologous sequence in Eucalyptus contained within a Eucalyptus cDNA library with a higher specificity than a corresponding primer of only 15 nucleotides. Thus, in order to obtain greater specificity, probes and primers may be selected that comprise 20, 25, 30, 35, 40, 50, 75, 100 or more consecutive nucleotides of the disclosed nucleic acid sequences.
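One simplified way to see why specificity rises with primer length is to estimate how often a given n-mer would be expected to match a target purely by chance. The sketch below assumes a random target with equiprobable bases and counts perfect matches on one strand only; this model, the function name and the target size of 10^9 nucleotides are illustrative assumptions, not part of the disclosure.

```python
def expected_chance_matches(primer_length, target_length):
    """Expected number of perfect chance matches of one n-mer on one strand.

    Assumes each target position is independently A, C, G or T with
    probability 1/4, so a given n-mer matches a given position with
    probability (1/4) ** n.
    """
    positions = max(target_length - primer_length + 1, 0)
    return positions / 4 ** primer_length


target = 10 ** 9  # hypothetical target size in nucleotides
for n in (15, 20, 25):
    print(n, expected_chance_matches(n, target))
```

Under this model, each additional nucleotide lowers the expected number of chance matches roughly fourfold, so a 20-mer is about 4^5 = 1024 times less likely to match by chance than a 15-mer.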
The invention thus includes isolated nucleic acid molecules that comprise specified lengths of the disclosed floral homeotic sequences. Such molecules may comprise at least 8-10, 15, 20, 25, 30, 35, 40, 50, 75, or 100 consecutive nucleotides of these sequences and may be obtained from any region of the disclosed sequences. By way of example, the floral homeotic genes shown in the Sequence Listing may be apportioned into halves or quarters based on sequence length, and the isolated nucleic acid molecules may be derived from the first or second halves of the molecules, or any of the four quarters. The PTD cDNA, shown in Seq. I.D. No. 2, may be used to illustrate this. This cDNA is 924 nucleotides in length and so may be hypothetically divided into halves (nucleotides 1-462 and 463-924) or quarters (nucleotides 1-231, 232-462, 463-693 and 694-924).
Nucleic acid molecules may be selected that comprise at least 8-10, 15, 20, 25, 30, 35, 40, 50, 75 or 100 consecutive nucleotides of any of these portions of the floral homeotic genes. Thus, one such nucleic acid molecule might comprise at least 25 consecutive nucleotides of the region comprising nucleotides 1-924 of the disclosed floral homeotic genes.
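The apportionment into halves and quarters described above can be reproduced for a sequence of any length. The sketch below is one possible way to compute the ranges; the function names are hypothetical, and the only value taken from the text is the 924-nucleotide length of the PTD cDNA.

```python
def split_into_parts(length, parts):
    """Split positions 1..length into contiguous 1-based inclusive ranges.

    Ranges are of near-equal size; earlier ranges absorb any remainder.
    """
    base, extra = divmod(length, parts)
    ranges, start = [], 1
    for i in range(parts):
        size = base + (1 if i < extra else 0)
        ranges.append((start, start + size - 1))
        start += size
    return ranges


def count_windows(start, end, n):
    """Number of distinct runs of n consecutive nucleotides within start..end."""
    return max(end - start + 1 - n + 1, 0)


PTD_CDNA_LENGTH = 924  # length stated in the text for the PTD cDNA

halves = split_into_parts(PTD_CDNA_LENGTH, 2)
quarters = split_into_parts(PTD_CDNA_LENGTH, 4)
print(halves)
print(quarters)
# Number of possible 25-nucleotide fragments within the first quarter.
print(count_windows(*quarters[0], 25))
```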
Purified: The term purified does not require absolute purity; rather, it is intended as a relative term. Thus, for example, a purified PTAG-1 protein preparation is one in which the PTAG-1 protein is more pure than the protein in its natural environment within a cell. Generally, a preparation of a floral homeotic protein is purified such that the floral homeotic protein represents at least 5% of the total protein content of the preparation. For particular applications, higher purity may be desired, such that preparations in which the floral homeotic protein represents at least 50% or at least 75% of the total protein content may be employed.

Recombinant: A recombinant nucleic acid is one that has a sequence that is not naturally occurring or has a sequence that is made by an artificial combination of two otherwise separated segments of sequence. This artificial combination is often accomplished by chemical synthesis or, more commonly, by the artificial manipulation of isolated segments of nucleic acids, e.g., by genetic engineering techniques.
Transformed: A transformed cell is a cell into which has been introduced a nucleic acid molecule by molecular biology techniques. As used herein, the term transformation encompasses all techniques by which a nucleic acid molecule might be introduced into such a cell, including Agrobacterium-mediated transformation, transfection with viral vectors, transformation with plasmid vectors and introduction of naked DNA by electroporation, lipofection, and particle gun acceleration.
Transgenic plant: As used herein, this term refers to a plant that contains recombinant genetic material not normally found in plants of this type and which has been introduced into the plant in question (or into progenitors of the plant) by human manipulation.
Thus, a plant that is grown from a plant cell into which recombinant DNA is introduced by transformation is a transgenic plant, as are all offspring of that plant that contain the introduced transgene (whether produced sexually or asexually).
Vector: A nucleic acid molecule as introduced into a host cell, thereby producing a transformed host cell. A vector may include nucleic acid sequences that permit it to replicate in a host cell, such as an origin of replication. A vector may also include one or more selectable marker genes and other genetic elements known in the art.
Sequence identity: The relatedness of two nucleic acid sequences, or two amino acid sequences, is typically expressed in terms of the identity between the sequences (in the case of amino acid sequences, similarity is an alternative assessment). Sequence identity is frequently measured in terms of percentage identity; the higher the percentage, the more similar the two sequences are. Homologs of a disclosed floral homeotic protein or nucleic acid sequence will possess a relatively high degree of sequence identity when aligned using standard methods.
Methods of alignment of sequences for comparison are well known in the art. Various programs and alignment algorithms are described in: Smith and Waterman (1981); Needleman and Wunsch (1970); Pearson and Lipman (1988); Higgins and Sharp (1988); Higgins and Sharp (1989); Corpet et al. (1988); Huang et al. (1992); and Pearson et al. (1994). Altschul et al. (1994) presents a detailed consideration of sequence alignment methods and homology calculations.
The NCBI Basic Local Alignment Search Tool (BLAST) (Altschul et al., 1990) is available from several sources, including the National Center for Biotechnology Information (NCBI, Bethesda, MD) and on the Internet, for use in connection with the sequence analysis programs blastp, blastn, blastx, tblastn and tblastx. It can be accessed at http://www.ncbi.nlm.nih.gov/BLAST/. A description of how to determine sequence identity using this program is available at http://www.ncbi.nlm.nih.gov/BLAST/blast_help.html.
Homologs of the disclosed floral homeotic proteins are typically characterized by possession of at least 50% sequence identity counted over the full length alignment with the amino acid sequence of a selected floral homeotic protein using the NCBI Blast 2.0, gapped blastp set to default parameters. Proteins with even greater similarity to the reference sequences will show increasing percentage identities when assessed by this method, such as at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 90% or at least 95% sequence identity.
When less than the entire sequence is being compared for sequence identity, homologs will typically possess at least 75% sequence identity over short windows of 10-20 amino acids, and may possess sequence identities of at least 85% or at least 90% or 95% depending on their similarity to the reference sequence. Methods for determining sequence identity over such short windows are described at http://www.ncbi.nlm.nih.gov/BLAST/blast_FAQs.html. One of skill in the art will appreciate that these sequence identity ranges are provided for guidance only; it is entirely possible that strongly significant homologs could be obtained that fall outside of the ranges provided. The present invention provides not only the peptide homologs as described above, but also nucleic acid molecules that encode such homologs.
Homologs of the disclosed floral homeotic nucleic acids are typically characterized by possession of at least 50% sequence identity counted over the full length alignment with the nucleic acid sequence of a selected floral homeotic gene using the NCBI Blast 2.0, blastn set to default parameters. Homologs with even greater similarity to the reference sequences will show increasing percentage identities when assessed by this method, such as at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 90% or at least 95% sequence identity.
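The percentage thresholds above can be illustrated with a minimal sketch. It assumes the two sequences have already been aligned (for example by BLAST) and simply counts identical columns; it is not the BLAST implementation, and the function names and example fragments are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of alignment columns holding the same residue ('-' marks a gap)."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    if not aligned_a:
        raise ValueError("alignment is empty")
    matches = sum(1 for a, b in zip(aligned_a, aligned_b) if a == b and a != "-")
    return 100.0 * matches / len(aligned_a)


def best_window_identity(aligned_a: str, aligned_b: str, window: int = 15) -> float:
    """Highest percent identity found in any window of `window` columns."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    span = min(window, len(aligned_a))
    return max(
        percent_identity(aligned_a[i:i + span], aligned_b[i:i + span])
        for i in range(len(aligned_a) - span + 1)
    )


def is_candidate_homolog(identity: float, threshold: float = 50.0) -> bool:
    """Apply an 'at least N%' cutoff, such as the 50% full-length guidance."""
    return identity >= threshold


# Hypothetical pre-aligned protein fragments, for illustration only.
REFERENCE = "MGRGKIEIKRIEN"
CANDIDATE = "MGRGKIEIKKIEN"

full_length = percent_identity(REFERENCE, CANDIDATE)
print(f"full-length identity: {full_length:.1f}%")
print("passes 50% guidance:", is_candidate_homolog(full_length))
print("best 10-residue window:", best_window_identity(REFERENCE, CANDIDATE, window=10))
```

Because the ranges are stated to be guidance only, the cutoff is kept as a parameter rather than a fixed constant.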
An alternative indication that two nucleic acid molecules are closely related is that the two molecules hybridize to each other under stringent conditions. Stringent conditions are sequence dependent and are different under different environmental parameters.
Generally, stringent conditions are selected to be about 5°C to 20°C lower than the thermal melting point (Tm) for the specific sequence at a defined ionic strength and pH. The Tm is the temperature (under defined ionic strength and pH) at which 50% of the target sequence hybridizes to a perfectly matched probe. Conditions for nucleic acid hybridization and calculation of stringencies can be found in Sambrook et al. (1989) and Tijssen (1993). Nucleic acid molecules that hybridize under stringent conditions to a disclosed nucleic acid sequence will typically hybridize to a probe corresponding to either the entire cDNA or selected portions of the cDNA under wash conditions of 0.2x SSC, 0.1% SDS at 65°C.
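As a rough illustration of how a wash temperature relates to Tm, the sketch below uses one commonly cited empirical approximation for long DNA duplexes, Tm = 81.5 + 16.6 log10[Na+] + 0.41(%G+C) - k/N. This formula is an assumption brought in for illustration and is not taken from this document; published variants use different length constants k (values such as 500, 600 and 675 appear), so k is left as a parameter. The probe length and G+C content are hypothetical.

```python
import math

# 1x SSC is 0.15 M NaCl plus 0.015 M trisodium citrate, i.e. about 0.195 M Na+.
SSC_1X_SODIUM_MOLAR = 0.195


def approx_tm_celsius(percent_gc: float, length_nt: int, sodium_molar: float,
                      length_constant: float = 600.0) -> float:
    """Empirical Tm estimate for a long DNA duplex (illustrative approximation)."""
    if not 0.0 <= percent_gc <= 100.0:
        raise ValueError("percent_gc must be between 0 and 100")
    if length_nt <= 0 or sodium_molar <= 0:
        raise ValueError("length and sodium concentration must be positive")
    return (81.5
            + 16.6 * math.log10(sodium_molar)
            + 0.41 * percent_gc
            - length_constant / length_nt)


def degrees_below_tm(wash_temp_c: float, tm_c: float) -> float:
    """How far the wash temperature sits below the estimated Tm."""
    return tm_c - wash_temp_c


# Hypothetical probe: 480 nt, 50% G+C, washed in 0.2x SSC at 65 degrees C.
tm = approx_tm_celsius(50.0, 480, 0.2 * SSC_1X_SODIUM_MOLAR)
print(f"estimated Tm: {tm:.1f} C")
print(f"wash is {degrees_below_tm(65.0, tm):.1f} C below Tm")
```

For these assumed inputs the 65°C wash falls within the 5°C to 20°C window below Tm described above; real probes should be evaluated with the methods in the cited references.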
Nucleic acid sequences that do not show a high degree of identity may nevertheless encode similar amino acid sequences, due to the degeneracy of the genetic code. It is understood that changes in nucleic acid sequence can be made using this degeneracy to produce multiple nucleic acid sequences that all encode substantially the same protein.
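The effect of codon degeneracy can be shown with a short sketch using the standard genetic code. The first sequence is the 27-nucleotide forward primer given later in this document (Seq. I.D. No. 17); the second is a hypothetical variant constructed here by swapping in synonymous codons.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate an in-frame DNA sequence into one-letter amino acid codes."""
    dna = dna.upper()
    if len(dna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of three")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


def nucleotide_identity(a: str, b: str) -> float:
    """Percent identical positions between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return 100.0 * sum(x == y for x, y in zip(a, b)) / len(a)


ORIGINAL = "ATGGGTCGTGGAAAGATTGAAATCAAG"  # from the text (Seq. I.D. No. 17)
VARIANT = "ATGGGCCGCGGCAAAATCGAGATTAAA"   # hypothetical synonymous recoding

print(translate(ORIGINAL), translate(VARIANT))
print(f"nucleotide identity: {nucleotide_identity(ORIGINAL, VARIANT):.1f}%")
```

The two sequences differ at eight of 27 positions yet translate to the same nine residues.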
Floral Specific Promoter: As used herein, the term "floral specific promoter" refers to a regulatory sequence which confers gene expression only in, or predominantly in, floral tissues.
The complete sequences of four floral specific promoters are disclosed herein: the promoter of PTD, located within the 5' regulatory region comprising nucleotides 1-1872 of Seq. I.D. No. 1; the promoter of PTLF, located within the 5' regulatory region comprising nucleotides 1-2638 of Seq. I.D. No. 5; the promoter of PTAG-1, located within the 5' regulatory region comprising nucleotides 1-2410 of Seq. I.D. No. 9; and the promoter of PTAG-2, located within the 5' regulatory region comprising nucleotides 1-2336 of Seq. I.D. No. 13. Accordingly, these promoter sequences may be used to produce transgene constructs that are specifically or predominantly expressed in floral tissues. One of skill in the art will recognize that effective floral-specific expression may be achieved with less than the entire promoter sequences noted above. Thus, by way of example, floral-specific expression may be obtained by employing sequences comprising 500 nucleotides or fewer (e.g., 250, 200, 150, or 100 nucleotides) upstream of the start codon, AUG, of the disclosed gene sequences.
The determination of whether a particular sub-region of the disclosed sequences operates to confer floral specific expression in a particular system (taking into account the plant species into which the construct is being introduced, the level of expression required, etc.) is performed using known methods, such as operably linking the promoter sub-region to a marker gene (e.g., GUS), introducing such constructs into plants and then determining the level of expression of the marker gene in floral and other plant tissues. Sub-regions which confer only or predominantly floral expression are considered to contain the necessary elements to confer floral specific expression.
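Selecting candidate promoter sub-regions of the lengths mentioned above (500, 250, 200, 150 or 100 nucleotides upstream of the start codon) can be sketched as simple slicing. The sequence below is a placeholder, not one of the disclosed sequences, and the function names are hypothetical; the position of the start codon must be supplied from the actual sequence listing.

```python
def upstream_fragment(sequence: str, start_codon_index: int, length: int) -> str:
    """Return up to `length` nucleotides immediately 5' of the start codon.

    `start_codon_index` is the 0-based index of the first base of the start codon.
    """
    if length <= 0:
        raise ValueError("length must be positive")
    if not 0 <= start_codon_index <= len(sequence):
        raise ValueError("start codon index lies outside the sequence")
    begin = max(0, start_codon_index - length)
    return sequence[begin:start_codon_index]


def candidate_subregions(sequence: str, start_codon_index: int,
                         lengths=(500, 250, 200, 150, 100)) -> dict:
    """Map each requested length to the corresponding upstream fragment."""
    return {n: upstream_fragment(sequence, start_codon_index, n) for n in lengths}


# Placeholder: 600 nt of filler followed by a start codon (illustration only).
PLACEHOLDER = "ACGT" * 150 + "ATGGGTCGT"
START = 600

fragments = candidate_subregions(PLACEHOLDER, START)
for size, fragment in fragments.items():
    print(size, len(fragment))
```

Each fragment would then be fused to a marker gene and tested as described above; the slicing itself says nothing about which sub-region retains floral specificity.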
II. Methods
The four floral homeotic genes were obtained, and the present invention can be practiced, using standard molecular biology and plant transformation procedures, unless otherwise noted. Standard molecular biology procedures are described in Sambrook et al. (1989), Ausubel et al. (1987) and Innis et al. (1990).
III. Isolation and Characterization of PTLF
Genomic DNA was purified from dormant vegetative buds of a single Populus trichocarpa tree using a modified CTAB extraction technique (Wagner et al., 1987). After centrifugation to pellet nuclei, a large gummy pellet of resin was evident. This was left intact during the resuspension of nuclei, and then discarded. Normal yield of DNA was approximately 1 mg per 40 g of tissue. A genomic library was constructed from DNA partially digested with Sau3A, filled in with DNA Pol I and dATP and dGTP, and ligated into LambdaGem-12 vector (Stratagene) having partially filled-in Xho I sites. Packaging of the DNA into phage particles was performed with GigaPack Gold II (Stratagene).
RNA was extracted using the lithium dodecyl sulfate method of Baker et al. (1990), and purified by centrifugation through a 5.7 M CsCl pad. After redissolving the RNA pellet in TE, pH 8.0, NaCl was added to 400 mM and the RNA was precipitated with EtOH to remove excess CsCl.
PolyA+ RNA was selected using oligo dT-cellulose columns (mRNA Separation Kit, Clontech).
RNA was stored at -80°C until use. Ten-microgram samples of total RNA were used as templates for single-stranded cDNA synthesis. Reactions included 50 mM Tris-HCl (pH 8.3), 75 mM KCl, 10 mM dithiothreitol, 3 mM MgCl2, 100 µM each dNTP, 4 µg primer XT, 10 µCi [α-32P]-dCTP, and 200 U M-MLV reverse transcriptase (Gibco BRL) in 50 µL. Incubations were performed at 37°C for 1 hr, then the cDNA was purified with GeneClean (BIO101) silica matrix.
Typical yields were 10-40 ng of cDNA, as determined by 32P incorporation. The size ranges of the cDNA samples were characterized by alkaline gel electrophoresis. cDNA products ranged from 500 to 4000 bases in length, with an average size of 1000 bases. The DNA was diluted to 0.25 ng/µL in 10 mM Tris-HCl, 1 mM EDTA (pH 8.0) and stored at -20°C.
cDNA libraries were prepared using the Lambda-ZAP cDNA cloning kit (Stratagene). From 5 µg of polyA+ RNA, approximately 10^6 clones were recovered per preparation, with an average size of 1 kb and a size range of 500 bp to 3 kb. A hybridization probe for the Populus FLO/LFY homolog was obtained by touchdown PCR (Don et al., 1991) of the cDNA library with a degenerate primer specific to a highly conserved region of the FLO and LFY genes and a primer specific for the vector plus 3'-end of polyadenylated cDNAs. The PCR protocol was as follows:
(94°C, 30 sec; 60°C, 30 sec; 71°C, 1 min) x 2,
(94°C, 30 sec; 58°C, 30 sec; 71°C, 1 min) x 2,
(94°C, 30 sec; 56°C, 30 sec; 71°C, 1 min) x 2,
(94°C, 30 sec; 54°C, 30 sec; 71°C, 1 min) x 2,
(94°C, 30 sec; 52°C, 30 sec; 71°C, 1 min) x 2,
(94°C, 30 sec; 50°C, 30 sec; 71°C, 1 min) x 8,
(94°C, 30 sec; 52°C, 30 sec; 71°C, 1 min) x 25.
The approximately 480 bp fragment obtained was gel-purified and subcloned into pBluescript SK(-) for further characterization.
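The touchdown schedule above can be written out as data, which makes the cycle count and total hold time easy to check. This is only a bookkeeping sketch: the temperatures, times and cycle numbers are those stated in the protocol, while the class and function names are hypothetical and instrument ramp times are ignored.

```python
from dataclasses import dataclass

# Steps shared by every cycle in the stated protocol.
DENATURE_C, DENATURE_S = 94, 30
ANNEAL_S = 30
EXTEND_C, EXTEND_S = 71, 60


@dataclass(frozen=True)
class Stage:
    anneal_c: int  # annealing temperature for this stage, in degrees C
    cycles: int    # number of cycles run at this annealing temperature


TOUCHDOWN = (
    Stage(60, 2), Stage(58, 2), Stage(56, 2), Stage(54, 2),
    Stage(52, 2), Stage(50, 8), Stage(52, 25),
)


def total_cycles(stages) -> int:
    return sum(stage.cycles for stage in stages)


def hold_seconds(stages) -> int:
    """Total programmed hold time, excluding temperature ramps."""
    return total_cycles(stages) * (DENATURE_S + ANNEAL_S + EXTEND_S)


def expand(stages):
    """Yield (denature, anneal, extend) temperatures for every cycle in order."""
    for stage in stages:
        for _ in range(stage.cycles):
            yield (DENATURE_C, stage.anneal_c, EXTEND_C)


print("cycles:", total_cycles(TOUCHDOWN))
print("hold time (min):", hold_seconds(TOUCHDOWN) / 60)
```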
The PTLF genomic clone was isolated by screening the genomic library using probes derived from the PTLF cDNA sequence. Sequencing of the cDNA was performed using the dideoxy-terminator-based Sequenase 2.0 kit (United States Biochemical Corp.), according to the methods described by the manufacturer. Most sequencing of the cDNA and subclones of the gene was done using universal primers on nested deletions created with ExoIII (Henikoff, 1984). Gaps were filled in by sequencing from specific primers synthesized at Oregon State University. Sequence analysis was performed using PCGENE (Intelligenetics).
A total of 5,656 bp of the PTLF gene locus was sequenced, including 2,638 bp upstream of the initiation codon and 457 bp downstream of the polyA addition site. This sequence is available on GenBank (http://www.ncbi.nlm.nih.gov/Entrez/nucleotide.html) under accession number U93196 and is shown in Seq. I.D. No. 5. The positions of the two introns found in both FLO and LFY are conserved in PTLF. The longest cDNA obtained (Seq. I.D. No. 6) includes an open reading frame (Seq. I.D. No. 7) that encodes a predicted polypeptide of 377 amino acid residues (Seq. I.D. No. 8). Comparison of the deduced PTLF amino acid sequence with several FLO/LFY homologs revealed conserved amino- and carboxyl-terminal domains (133 and 175 residues, respectively, in PTLF) linked by a poorly conserved, highly charged domain (69 residues). The overall sequence identity between PTLF and FLO (Coen et al., 1990) is 79%, with 88% amino acid sequence similarity.
Due to the limited seasonal availability of inflorescence and flower tissue, and the difficulty of obtaining large amounts of developing meristems, the levels of PTLF expression were compared using RT-PCR. PTLF was detected most strongly in developing inflorescences, with no significant differences between samples from male and female trees.
For in situ hybridization analysis, tissue samples from various sources were fixed, embedded, sectioned, and hybridized as described by Kelly et al. (1995), with the following modifications. Sections were 10 µm in thickness. Probes were generated from a plasmid consisting of the PTLF cDNA inserted between the EcoR I and Kpn I sites of the vector pBluescriptII SK(-), and were not alkaline hydrolyzed. A PTLF antisense probe hybridized strongly to the floral meristems and developing flowers of both male and female plants. PTLF was not detected in the apical inflorescence meristem, but was seen in the flanking nascent floral meristems. Developing flowers showed expression in the immature carpels and anthers. Both male and female flowers exhibited some hybridization on the inner (adaxial) rim of the perianth cup during the middle stages of development. PTLF also showed marked hybridization to bracts.
Hybridization was observed with vegetative buds from mature branches. The pattern of hybridization showed that there was RNA in the axils of the newly formed leaves, but not in the center of the vegetative meristem. There was also significant expression in the tips of the leaf primordia, and in some portions of the surrounding developing leaves.
Overexpression and antisense constructs of PTLF cDNA were produced for analysis in transgenic trees. The insert from the cDNA clone of PTLF was cut out using EcoR I and Kpn I, and the ends were polished with T4 DNA polymerase. The insert was then ligated into the Sma I site of pBI121 (Jefferson et al., 1987). Clones with each orientation were identified by PCR, and the structures of the junction sites near the promoters of both were verified by sequencing of the PCR fragments. Hybrid aspens were used for transformation, in part because of the relative ease of transformation, and in part because of concern that transgenic cottonwoods might interact with native cottonwoods in the vicinity of the experimental site. The P. tremula x alba hybrid aspen female clone 717-1B4 and the P. tremula x tremuloides hybrid aspen male clone 353-38 were transformed with pDW151 (Weigel and Nilsson, 1995) and the above binary vectors using Agrobacterium tumefaciens strain C58 (Leple et al., 1992) with modifications as described by Han et al. (1996).

Although overexpression of LFY in aspens was reported to result in short, bushy plants that flower within a year (Weigel and Nilsson, 1995), no such obvious phenotypes were seen with PTLF. During more than one year of growth in soil in a greenhouse, and an additional year at a field site in Corvallis, OR, few differences were noted for any of the transgenics relative to control plants.
IV. Isolation and Characterization of PTD
The PTD cDNA and gene were isolated by probing the Populus cDNA library described above at low stringency using an EcoR I fragment of pCIT2241 (Ma et al., 1991), which contains the MADS box region of AGL1. The PTD cDNA (Seq. I.D. No. 2) comprises an open reading frame (Seq. I.D. No. 3) encoding a 227 amino acid polypeptide (Seq. I.D. No. 4).
The PTD gene (Seq. I.D. No. 1) consists of seven exons.
The PTD polypeptide is 81% conserved overall with respect to DEF. PTD has MADS and K domains. The MADS domain extends over amino acids 1-57, while the K domain extends over amino acids 87-154. The MADS domain is 93% conserved with respect to DEF, whereas the K domain is 85% conserved at the amino acid level.
To determine whether the promoter of PTD would confer floral-specific expression, 1.9 kb of its promoter and 5' untranslated region were fused to a GUS-intron reporter gene and introduced into Arabidopsis, tobacco and poplar. GUS expression was observed in floral tissues including petals and stamens. This expression pattern is characteristic of a "B function" gene like APETALA3, suggesting that PTD has retained the regulatory motifs (i.e., sequence patterns) that direct it to stamens and petals (though poplar has no true petals). No vegetative GUS expression was observed, except in poplar, where vegetative expression was confined to leaf-like structures subtending induced floral structures.
V. Isolation and Characterization of PTAG-1 and PTAG-2
Two cDNAs and their corresponding genes were isolated from Populus using the methodologies described above and a probe derived from the 3' region of the AG cDNA. Denoted PTAG-1 and PTAG-2, these two sequences are the orthologs of AG.
The genomic, cDNA and open reading frame sequences of PTAG-1 are shown in Seq. I.D. Nos. 9, 10 and 11, respectively. The open reading frame encodes a polypeptide of 241 amino acids in length (Seq. I.D. No. 12). The PTAG-1 polypeptide contains both a MADS domain and a K domain. The MADS domain extends from amino acids 17-72 and the K domain from amino acids 106-172. The PTAG-1 nucleotide and amino acid sequences are available on GenBank under accession number AF052570.
The genomic, cDNA and open reading frame sequences of PTAG-2 are shown in Seq. I.D. Nos. 13, 14 and 15, respectively. The open reading frame encodes a polypeptide of 238 amino acids in length (Seq. I.D. No. 16). The PTAG-2 polypeptide contains both a MADS domain and a K domain. The MADS domain extends from amino acids 16-72 and the K domain from amino acids 106-172. The PTAG-2 nucleotide and amino acid sequences are available on GenBank under accession number AF052571.
Like AG (Yanofsky et al., 1990), both PTAG-1 and PTAG-2 contain 8 introns at conserved positions. All introns have canonical donor (GT) and acceptor (AG) sites.
At the amino acid level, PTAG-1 and PTAG-2 are 89% identical, and show 72-75% sequence similarity with AG.
Because AG is expressed only in floral tissues and is essential for the development of both male and female reproductive organs, it is ideally suited for use in modifying fertility through genetic engineering approaches. In situ hybridization studies show that the PTAG genes in Populus are expressed in the central zone of both male and female floral meristems, and, as with AG, expression begins before reproductive primordia emerge and continues in developing stamens and carpels. Northern analysis of PTAG gene expression in Populus revealed that transcripts are present in immature and mature flowers from both male and female trees. In addition, low levels of PTAG gene expression are present in all vegetative tissues tested.
Interestingly, the transcripts from the vegetative tissues are shorter (by approximately 150-200 bp) than the floral transcripts. This size difference is not due to alternate intron/exon splicing.
EXAMPLES
The following examples are provided to illustrate the scope of the invention.
Example 1 Preferred Method of Making the Populus Genes and cDNAs
With the provision of the four Populus floral homeotic nucleic acid sequences PTD, PTLF, PTAG-1 and PTAG-2, the polymerase chain reaction (PCR) may now be utilized in a preferred method for producing the cDNAs and genes, as well as derivatives of these sequences.
PCR amplification of the sequence may be accomplished by direct PCR from an appropriate cDNA or genomic library. Alternatively, the cDNAs may be amplified by Reverse-Transcription PCR (RT-PCR) using RNA extracted from Populus cells as a template. Similarly, the gene sequences may be directly amplified using Populus genomic DNA as a template. Methods and conditions for both direct PCR and RT-PCR are known in the art and are described in Innis et al. (1990). Suitable plant cDNA and genomic libraries for direct PCR include Populus libraries made by methods described above. Other tree cDNA and genomic libraries may be used in order to amplify orthologous cDNAs of tree species, such as Pinus and Eucalyptus.
The selection of PCR primers will be made according to the portions of the cDNA or gene that are to be amplified. Primers may be chosen to amplify small segments of the cDNA or gene, or the entire cDNA or genes. Variations in amplification conditions may be required to accommodate primers of differing lengths; such considerations are well known in the art and are discussed in Innis et al. (1990), Sambrook et al. (1989), and Ausubel et al. (1987). By way of example only, the PTD cDNA molecule as shown in Seq. I.D. No. 2 (with the exception of the 3' poly-A tail) may be amplified using the following combination of primers:
5' ATGGGTCGTGGAAAGATTGAAATCAAG 3' (Seq. I.D. No. 17)
5' ATTTGTGAAAAAGAGCTTTTATATTTA 3' (Seq. I.D. No. 18)
The open reading frame portion of the PTD cDNA may be amplified using the following primer pair:
5' ATGGGTCGTGGAAAGATTGAAATCAAG 3' (Seq. I.D. No. 17)
5' AGGAAGGCGAAGTTCATGGGATCCAAA 3' (Seq. I.D. No. 19)
A derivative version of the PTD ORF that lacks the MADS box domain may be amplified using the following primers:
5' TCCACATCGACAAAGAAGATCTACGAT 3' (Seq. I.D. No. 20)
5' AGGAAGGCGAAGTTCATGGGATCCAAA 3' (Seq. I.D. No. 19)
These primers are illustrative only; it will be appreciated by one skilled in the art that many different primers may be derived from the provided cDNA and gene sequences in order to amplify particular regions of the provided nucleic acid molecules. Suitable amplification conditions include those described above for the original isolation of the PTLF cDNA. As is well known in the art, amplification conditions may need to be varied in order to amplify orthologous genes where the sequence identity is not 100%; in such cases, the use of nested primers, as described above, may be beneficial. Resequencing of PCR products obtained by these amplification procedures is recommended; this will facilitate confirmation of the amplified cDNA sequence and will also provide information on natural variation in this sequence in different ecotypes, cultivars and plant populations.
Oligonucleotides that are derived from the PTD, PTLF, PTAG-1 and PTAG-2 cDNA and gene sequences and which are suitable for use as PCR primers to amplify corresponding nucleic acid sequences are encompassed within the scope of the present invention. Preferably, such oligonucleotide primers will comprise a sequence of 15-20 consecutive nucleotides of the selected cDNA or gene sequence. To enhance amplification specificity, primers comprising at least 25, 30, 35, 50 or 100 consecutive nucleotides of the PTD, PTLF, PTAG-1 or PTAG-2 gene or cDNA sequences may be used.
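The primer properties discussed in this Example can be checked with a few lines of code. The sketch below takes the four illustrative primers listed above, reports their lengths and G+C fractions, and shows how every run of N consecutive nucleotides could be enumerated from a template; the helper names and the short template are hypothetical.

```python
# Illustrative primers as listed in this Example.
PRIMERS = {
    "Seq. I.D. No. 17": "ATGGGTCGTGGAAAGATTGAAATCAAG",
    "Seq. I.D. No. 18": "ATTTGTGAAAAAGAGCTTTTATATTTA",
    "Seq. I.D. No. 19": "AGGAAGGCGAAGTTCATGGGATCCAAA",
    "Seq. I.D. No. 20": "TCCACATCGACAAAGAAGATCTACGAT",
}


def gc_fraction(seq: str) -> float:
    """Fraction of G and C bases in a sequence."""
    seq = seq.upper()
    if not seq:
        raise ValueError("sequence is empty")
    return (seq.count("G") + seq.count("C")) / len(seq)


def consecutive_windows(template: str, length: int):
    """Yield every run of `length` consecutive nucleotides from a template."""
    if length <= 0:
        raise ValueError("length must be positive")
    for i in range(len(template) - length + 1):
        yield template[i:i + length]


for name, seq in PRIMERS.items():
    print(f"{name}: {len(seq)} nt, G+C {gc_fraction(seq):.0%}")

# Hypothetical short template, used only to show window enumeration.
print(list(consecutive_windows("ACGTACGT", 4)))
```

Length and G+C content are only two of the considerations in primer choice; the cited references discuss the others.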
Example 2 Use of the Populus Genes and cDNAs to Modify Fertility Characteristics
Once a nucleic acid encoding a protein involved in the determination of a particular plant characteristic, such as flowering, has been isolated, standard techniques may be used to express the nucleic acid in transgenic plants in order to modify that particular plant characteristic. One approach is to clone the nucleic acid into a vector, such that it is operably linked to control sequences (e.g., a promoter) which direct expression of the nucleic acid in plant cells. The transformation vector is then introduced into plant cells by one of a number of techniques (e.g., electroporation and Agrobacterium-mediated transformation) and progeny plants containing the introduced nucleic acid are selected. Preferably all or part of the transformation vector will stably integrate into the genome of the plant cell. That part of the vector which integrates into the plant cell and which contains the introduced nucleic acid and associated sequences for controlling expression (the introduced "transgene") may be referred to as the recombinant expression cassette.
Selection of progeny plants containing the introduced transgene may be made based upon the detection of an altered phenotype. Such a phenotype may result directly from the nucleic acid cloned into the transformation vector or may be manifested as enhanced resistance to a chemical agent (such as an antibiotic) as a result of the inclusion of a dominant selectable marker gene incorporated into the transformation vector.
The choice of (a) control sequences and (b) how the nucleic acid (or selected portions of the nucleic acid) are arranged in the transformation vector relative to the control sequences determine, in part, how the plant characteristic affected by the introduced nucleic acid is modified.
For example, the control sequences may be tissue specific, such that the nucleic acid is only expressed in particular tissues of the plant (e.g., reproductive tissues) and so the affected characteristic will be modified only in those tissues. The nucleic acid sequence may be arranged relative to the control sequence such that the nucleic acid transcript is expressed normally, or in an antisense orientation. Expression of an antisense RNA that is the reverse complement of the cloned nucleic acid will result in a reduction of the targeted gene product (the targeted gene product being the protein encoded by the plant gene from which the introduced nucleic acid was derived). Over-expression of the introduced nucleic acid, resulting from a plus-sense orientation of the nucleic acid relative to the control sequences in the vector, may lead to an increase in the level of the gene product, or may result in a reduction in the level of the gene product due to co-suppression (also termed "sense suppression") of that gene product. In another approach, the nucleic acid sequence may be modified such that certain domains of the encoded peptide are deleted.
Depending on the domain deleted, such modified nucleic acid may act as dominant negative mutations, suppressing the phenotypic effects of the corresponding endogenous gene.
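The difference between sense and antisense arrangement of an insert can be made concrete with a reverse-complement helper. This is a minimal sketch: the function names are hypothetical, and the example fragment is the first 27 nucleotides of the PTD open reading frame as given by the forward primer in Example 1.

```python
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return dna.translate(_COMPLEMENT)[::-1]


def oriented_insert(cdna: str, orientation: str) -> str:
    """Return the insert as it would read 5' to 3' downstream of the promoter."""
    if orientation == "sense":
        return cdna
    if orientation == "antisense":
        return reverse_complement(cdna)
    raise ValueError("orientation must be 'sense' or 'antisense'")


FRAGMENT = "ATGGGTCGTGGAAAGATTGAAATCAAG"  # start of the PTD ORF (Seq. I.D. No. 17)

print("sense:    ", oriented_insert(FRAGMENT, "sense"))
print("antisense:", oriented_insert(FRAGMENT, "antisense"))
```

A transcript made from the antisense arrangement is complementary to the endogenous mRNA, which is the basis of the reduction in gene product described above.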
Successful examples of the modification of plant characteristics by transformation with cloned nucleic acid sequences are replete in the technical and scientific literature. Selected examples, which serve to illustrate the level of knowledge in this field of technology include:
U.S. Patent No. 5,432,068 to Albertson (control of male fertility using externally inducible promoter sequences);
U.S. Patent No. 5,686,649 to Chua (suppression of plant gene expression using processing-defective RNA constructs);

U.S. Patent No. 5,659,124 to Crossland (transgenic male sterile plants);
U.S. Patent No. 5,451,514 to Boudet (modification of lignin synthesis using antisense RNA and co-suppression);
U.S. Patent No. 5,443,974 to Hitz (modification of saturated and unsaturated fatty acid levels using antisense RNA and co-suppression);
U.S. Patent No. 5,530,192 to Murase (modification of amino acid and fatty acid composition using antisense RNA);
U.S. Patent No. 5,455,167 to Voelker (modification of medium chain fatty acids);
U.S. Patent No. 5,231,020 to Jorgensen (modification of flavonoids using co-suppression);
U.S. Patent No. 5,583,021 to Dougherty (modification of virus resistance by expression of plus-sense RNA); and
Mizukami et al. (1996) (dominant negative mutations in floral development using partial deletions of AG).
These examples include descriptions of transformation vector selection, transformation techniques and the production of constructs designed to over-express an introduced nucleic acid, dominant negative mutant forms, untranslatable RNA forms or antisense RNA. In light of the foregoing and the provision herein of the PTD, PTLF, PTAG-1 and PTAG-2 cDNA and gene sequences, it is apparent that one of skill in the art will be able to introduce these cDNAs or genes, or derivative forms of these sequences (e.g., antisense forms), into plants in order to produce plants having modified fertility characteristics, particularly sterility. This Example provides a description of the approaches that may be used to achieve this goal. For convenience the PTD, PTLF, PTAG-1 and PTAG-2 cDNAs and genes disclosed herein will be generically referred to as the "floral homeotic nucleic acids," and the encoded polypeptides as the "floral homeotic polypeptides".
Example 3 provides an exemplary illustration of how an antisense form of one of these floral homeotic nucleic acids, specifically the PTD cDNA, may be introduced into poplar species using Agrobacterium transformation, in order to produce genetically engineered sterile poplars. Example 4 provides an exemplary illustration of how mutant forms of PTAG-1 may be produced and introduced into poplar species to produce modified fertility characteristics.
a. Plant Types
The floral homeotic nucleic acids disclosed herein may be used to produce transgenic plants having modified fertility characteristics. In particular, the amenable plant species include, but are not limited to, members of the genus Populus, including Populus trichocarpa (commonly known as black cottonwood, California poplar and western balsam poplar) and poplar hybrid species. Other woody species that are amenable to fertility modification by the methods disclosed herein include members of the genera Picea, Pinus, Pseudotsuga, Tsuga, Sequoia, Abies, Thuja, Libocedrus, Chamaecyparis and Larix. In particular, members of the genera Eucalyptus, Acacia and Gmelina, which are becoming increasingly important for pulp production, may be engineered for sterility using the nucleic acid sequences and methods disclosed here.
b. Vector construction, choice of promoters
A number of recombinant vectors suitable for stable transfection of plant cells or for the establishment of transgenic plants have been described, including those described in Pouwels et al. (1987), Weissbach and Weissbach (1989), and Gelvin et al. (1990).
Typically, plant transformation vectors include one or more cloned plant genes (or cDNAs) under the transcriptional control of 5' and 3' regulatory sequences and a dominant selectable marker. Such plant transformation vectors typically also contain a promoter regulatory region (e.g., a regulatory region controlling inducible or constitutive, environmentally or developmentally regulated, or cell- or tissue-specific expression), a transcription initiation start site, a ribosome binding site, an RNA processing signal, a transcription termination site, and/or a polyadenylation signal.
Examples of constitutive plant promoters which may be useful for expressing the floral homeotic nucleic acids include: the cauliflower mosaic virus (CaMV) 35S promoter, which confers constitutive, high-level expression in most plant tissues (see, e.g., Odell et al., 1994; Dekeyser et al., 1990; Terada and Shimamoto, 1990; Benfey and Chua, 1990); the nopaline synthase promoter (An et al., 1988); and the octopine synthase promoter (Fromm et al., 1989).
A variety of plant gene promoters that are regulated in response to environmental, hormonal, chemical, and/or developmental signals can also be used for expression of the floral homeotic nucleic acids in plant cells, including promoters regulated by: (a) heat (Callis et al., 1988; Ainley et al., 1993; Gilmartin et al., 1992); (b) light (e.g., the pea rbcS-3A promoter, Kuhlemeier et al., 1989, and the maize rbcS promoter, Schaffner and Sheen, 1991); (c) hormones, such as abscisic acid (Marcotte et al., 1989); (d) wounding (e.g., wun1, Siebertz et al., 1989); and (e) chemicals such as methyl jasmonate or salicylic acid (see also Gatz, 1997).
Alternatively, tissue-specific (root, leaf, flower, and seed, for example) promoters (Carpenter et al., 1992; Denis et al., 1993; Opperman et al., 1993; Stockhause et al., 1997; Roshal et al., 1987; Schernthaner et al., 1988; and Bustos et al., 1989) can be fused to the coding sequence to obtain expression in the respective organs. In addition, the timing of the expression can be controlled by using promoters such as those acting at senescence (Gan and Amasino, 1995) or late seed development (Odell et al., 1994).
The promoter regions of the PTD, PTLF, PTAG-1 or PTAG-2 gene sequences confer floral-specific (or floral-enriched) expression in Populus. Accordingly, these native promoters may be used to obtain floral-specific (or floral-enriched) expression of the introduced transgene.
Plant transformation vectors may also include RNA processing signals, for example, introns, which may be positioned upstream or downstream of the ORF sequence in the transgene.
In addition, the expression vectors may also include additional regulatory sequences from the 3'-untranslated region of plant genes, e.g., a 3' terminator region to increase the stability of the mRNA, such as the PI-II terminator region of potato or the octopine or nopaline synthase 3' terminator regions.
Finally, as noted above, plant transformation vectors may also include dominant selectable marker genes to allow for the ready selection of transformants. Such genes include antibiotic resistance genes (e.g., resistance to hygromycin, kanamycin, bleomycin, G418, streptomycin or spectinomycin) and herbicide resistance genes (e.g., phosphinothricin acetyltransferase).
c. Arrangement of floral homeotic nucleic acid sequence in vector
Modified fertility characteristics in plants may be obtained using the floral homeotic nucleic acid sequences disclosed herein in a variety of forms. Over-expression, sense-suppression, antisense RNA and dominant negative mutant forms of the disclosed floral homeotic nucleic acid sequences may be constructed in order to modulate or supplement the expression of the corresponding endogenous floral homeotic genes, and thereby to produce plants having modified fertility characteristics. Alternatively, the floral-specific (or floral-enriched) expression conferred by the promoters of the disclosed floral homeotic genes may be employed to obtain corresponding expression of cytotoxic products. Such constructs will comprise the appropriate floral homeotic promoter sequence operably linked to a suitable open reading frame (discussed further below) and will be useful in genetic ablation approaches to engineering sterility in plants.
i. Modulation/supplementation of floral homeotic nucleic acid expression
The particular arrangement of the floral homeotic nucleic acid sequence in the transformation vector will be selected according to the type of expression of the sequence that is desired.
Enhanced expression of a floral homeotic nucleic acid may be achieved by operably linking the floral homeotic nucleic acid to a constitutive high-level promoter such as the CaMV
35S promoter. As noted below, modified activity of a floral homeotic polypeptide in planta may also be achieved by introducing into a plant a transformation vector containing a variant form of a floral homeotic nucleic acid, for example a form which varies from the exact nucleotide sequence of the disclosed floral homeotic nucleic acid.
A reduction in the activity of a floral homeotic polypeptide in the transgenic plant may be obtained by introducing into plants antisense constructs based on the floral homeotic nucleic acid sequence. For expression of antisense RNA, the floral homeotic nucleic acid is arranged in reverse orientation relative to the promoter sequence in the transformation vector.
The introduced sequence need not be the full length floral homeotic nucleic acid, and need not be exactly homologous to the floral homeotic nucleic acid found in the plant type to be transformed.
Generally, however, where the introduced sequence is of shorter length, a higher degree of homology to the native floral homeotic nucleic acid sequence will be needed for effective antisense suppression. Preferably, the introduced antisense sequence in the vector will be at least 30 nucleotides in length, and improved antisense suppression will typically be observed as the length of the antisense sequence increases. Preferably, the length of the antisense sequence in the vector will be greater than 100 nucleotides. Transcription of an antisense construct as described results in the production of RNA molecules that are the reverse complement of mRNA molecules transcribed from the endogenous floral homeotic gene in the plant cell. Although the exact mechanism by which antisense RNA molecules interfere with gene expression has not been elucidated, it is believed that antisense RNA molecules bind to the endogenous mRNA molecules and thereby inhibit translation of the endogenous mRNA.
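The reverse-orientation arrangement described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the helper names (`reverse_complement`, `antisense_insert`) and the toy sequence are hypothetical and are not taken from the disclosed sequences. The sketch simply enforces the 30-nucleotide minimum mentioned in the text.

```python
# Illustrative sketch only; function names and the toy sequence are assumptions.
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return dna.translate(COMPLEMENT)[::-1]


def antisense_insert(cdna: str, start: int, length: int) -> str:
    """Return a cDNA fragment in reverse orientation relative to the promoter.

    Raises ValueError if the fragment is shorter than the 30-nucleotide
    minimum stated in the text.
    """
    fragment = cdna[start:start + length]
    if len(fragment) < 30:
        raise ValueError("antisense fragment should be at least 30 nucleotides")
    return reverse_complement(fragment)


# Made-up 144-nt sequence standing in for a cDNA.
toy_cdna = "ATGGCCGATTACAAGGATGACGACGATAAGGGTACC" * 4
insert = antisense_insert(toy_cdna, 0, 120)
print(len(insert), insert[:12])
```

Taking the reverse complement of the insert recovers the original fragment, which is a convenient sanity check when confirming insert orientation by sequence analysis.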
Suppression of endogenous floral homeotic polypeptide activity can also be achieved using ribozymes. Ribozymes are synthetic RNA molecules that possess highly specific endoribonuclease activity. The production and use of ribozymes are disclosed in U.S. Patent No. 4,987,071 to Cech and U.S. Patent No. 5,543,508 to Haseloff. The inclusion of ribozyme sequences within antisense RNAs may be used to confer RNA cleaving activity on the antisense RNA, such that endogenous mRNA molecules that bind to the antisense RNA are cleaved, which in turn leads to an enhanced antisense inhibition of endogenous gene expression.
Constructs in which the floral homeotic nucleic acid (or a variant thereof) is over-expressed may also be used to obtain co-suppression of the endogenous floral homeotic gene in the manner described in U.S. Patent No. 5,231,021 to Jorgensen.
Such co-suppression (also termed sense suppression) does not require that the entire floral homeotic nucleic acid cDNA
or gene be introduced into the plant cells, nor does it require that the introduced sequence be exactly identical to the endogenous floral homeotic nucleic acid gene.
However, as with antisense suppression, the suppressive efficiency will be enhanced as (1) the introduced sequence is lengthened and (2) the sequence similarity between the introduced sequence and the endogenous floral homeotic nucleic acid gene is increased.
Constructs expressing an untranslatable form of the floral homeotic nucleic acid mRNA
may also be used to suppress the expression of endogenous floral homeotic genes. Methods for producing such constructs are described in U.S. Patent No. 5,583,021 to Dougherty et al.
Preferably, such constructs are made by introducing a premature stop codon into the floral homeotic nucleic acid ORF.
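As a rough illustration of the premature stop codon approach, the following Python sketch replaces one internal codon of an open reading frame with a stop codon. The function names and the 7-codon toy ORF are assumptions made for the example; a real construct would be designed against the disclosed ORF and verified by sequencing.

```python
# Illustrative sketch only; names and the toy ORF are assumptions.
STOP_CODONS = {"TAA", "TAG", "TGA"}


def introduce_premature_stop(orf: str, codon_index: int, stop: str = "TAA") -> str:
    """Replace the codon at codon_index (0-based) with a stop codon.

    The start codon and the final codon are excluded, so the new stop is
    always internal (premature) and the reading frame is unchanged.
    """
    if stop not in STOP_CODONS:
        raise ValueError("stop must be TAA, TAG or TGA")
    if len(orf) % 3 != 0:
        raise ValueError("ORF length must be a multiple of three")
    n_codons = len(orf) // 3
    if not 0 < codon_index < n_codons - 1:
        raise ValueError("codon_index must point to an internal codon")
    pos = codon_index * 3
    return orf[:pos] + stop + orf[pos + 3:]


def first_stop_index(orf: str):
    """Return the 0-based index of the first in-frame stop codon, or None."""
    for i in range(0, len(orf) - 2, 3):
        if orf[i:i + 3] in STOP_CODONS:
            return i // 3
    return None


toy_orf = "ATGGCCGATTACAAGGATTGA"  # made-up ORF: 6 sense codons plus TGA
mutated = introduce_premature_stop(toy_orf, 2)
print(first_stop_index(toy_orf), first_stop_index(mutated))
```

Because the substitution is codon-for-codon, the mutated sequence keeps its length and frame while translation would terminate early.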
Finally, dominant negative mutant forms of the disclosed sequences may be used to block endogenous floral homeotic polypeptide activity using approaches similar to that described by Mizukami et al. (1996). Such mutants require the production of mutated forms of the floral homeotic polypeptide that bind either to an endogenous binding target (for example, a nucleic acid sequence in the case of floral homeotic polypeptides, such as PTD, that function as transcription factors) or to a second polypeptide sequence (such as transcription co-factors), but do not function normally after such binding (i.e. do not function in the same manner as the non-mutated form of the polypeptide). By way of example, such dominant mutants can be constructed by deleting all or part of the C-terminal domain of a floral homeotic polypeptide, leaving an intact MADS domain.
Polypeptides lacking all or part of the C-terminal region may bind to the appropriate DNA target, but are unable to interact with protein co-factors, thereby blocking transcription. Alternatively, dominant negative mutants may be produced by deleting all or part of the MADS
domain, or all or part of the K-domain.
ii. Genetic ablation
An alternative approach to modulating floral development is to specifically target a cytotoxic gene product to the floral tissues. This may be achieved by producing transgenic plants that express a cytotoxic gene product under the control of a floral-specific promoter, such as the promoter regions of PTLF, PTD, PTAG-1 and PTAG-2 as disclosed herein. The promoter regions of these gene sequences are generally contained within the first 150 base pairs of sequence upstream of the open reading frame, although floral-specific expression may be conferred by using smaller regions of this sequence. Thus, regions as small as the first 50 base pairs of sequence upstream of the open reading frame may be effective in conferring floral-specific expression.
However, longer regions, such as at least 100, 150, 200 or 250 base pairs of the upstream sequences are preferred.
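For illustration, selecting a candidate promoter fragment of a given length immediately upstream of an open reading frame amounts to a simple slice of the genomic sequence. The Python sketch below assumes a hypothetical helper `upstream_region` and a made-up genomic string whose ORF starts at position 300; neither comes from the disclosure.

```python
# Illustrative sketch only; the helper name, toy sequence and ORF
# position are assumptions, not disclosed data.
def upstream_region(genomic: str, orf_start: int, length: int = 150) -> str:
    """Return `length` base pairs immediately upstream of the ORF start.

    orf_start is the 0-based index of the first base of the start codon.
    """
    if length <= 0:
        raise ValueError("length must be positive")
    if orf_start < length:
        raise ValueError("not enough upstream sequence available")
    return genomic[orf_start - length:orf_start]


toy_genomic = "ACGT" * 75 + "ATGGCCGAT"  # 300 bp upstream, then a toy ORF
orf_start = 300

# Fragment sizes mentioned in the text: 50, 100, 150, 200 and 250 bp.
for size in (50, 100, 150, 200, 250):
    fragment = upstream_region(toy_genomic, orf_start, size)
    print(size, len(fragment))
```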
A number of known cytotoxic gene products may be expressed under the control of the disclosed promoter sequences of the floral homeotic genes. These include:
RNases, such as barnase from Bacillus amyloliquefaciens and RNase-T1 from Aspergillus (Mariani et al., 1990; Mariani et al., 1992; Reynaerts et al., 1993); ADP-ribosyl-transferase (Diphtheria toxin A chain) (Pappenheimer, 1977; Thorness et al., 1991; Kandasamy et al., 1993); RolC from Agrobacterium rhizogenes (Schmulling et al., 1993); DTA (diphtheria toxin A) (Pappenheimer, 1977) and glucanase (Worrall et al., 1992).
d. Transformation and regeneration techniques
Constructs designed as discussed above to modulate or supplement expression of native floral homeotic genes in plants, or to express cytotoxins in a tissue-specific manner, can be introduced into plants by a variety of means. Transformation and regeneration of both monocotyledonous and dicotyledonous plant cells is now routine, and the selection of the most appropriate transformation technique will be determined by the practitioner.
The choice of method will vary with the type of plant to be transformed; those skilled in the art will recognize the suitability of particular methods for given plant types. Suitable methods may include, but are not limited to: electroporation of plant protoplasts; liposome-mediated transformation; polyethylene glycol-mediated transformation; transformation using viruses; micro-injection of plant cells; micro-projectile bombardment of plant cells; vacuum infiltration; and Agrobacterium tumefaciens (AT)-mediated transformation. Typical procedures for transforming and regenerating plants are described in the patent documents listed at the beginning of this section.

Methods that are particularly suited to the transformation of woody species include (for Picea species) methods described in Ellis et al. (1991, 1993) and (for Populus species) the use of A. tumefaciens (Settler, 1993; Strauss et al., 1995a,b), A. rhizogenes (Han et al., 1996) and biolistics (McCown et al., 1991).
e. Selection of transformed plants
Following transformation and regeneration of plants with the transformation vector, transformed plants are preferably selected using a dominant selectable marker incorporated into the transformation vector. Typically, such a marker will confer antibiotic resistance on the seedlings of transformed plants, and selection of transformants can be accomplished by exposing the seedlings to appropriate concentrations of the antibiotic.
After transformed plants are selected and grown to maturity, the effect on fertility can be determined by visual inspection of floral morphology, including the determination of the production of pollen or ova. In addition, the effect on the activity of the endogenous floral homeotic gene may be directly determined by nucleic acid analysis (hybridization or PCR methodologies) or immunoassay of the expressed protein. Antisense or sense suppression of the endogenous floral homeotic gene may be detected by analyzing mRNA expression on Northern blots or by reverse transcription polymerase chain reaction (RT-PCR).
Example 3 Introduction of antisense PTD cDNA into hybrid aspens
By way of example, the following methodology may be used to produce poplar trees with modified expression of PTD. The PTD cDNA (Seq. I.D. No. 2) is excised from the cloning vector and blunt ended using T4 DNA polymerase. The cDNA is then ligated into the Sma I site of pBI121 (Jefferson et al., 1987), and clones containing the cDNA in reverse orientation with respect to the promoter are identified by sequence analysis.
Hybrid aspens, such as the P. tremula x alba hybrid aspen and the P. tremula x tremuloides hybrid aspen, are transformed with pDW151 (Weigel and Nilsson, 1995) and the above binary vectors using Agrobacterium tumefaciens strain C58 (Leple et al., 1992) with modifications as described by Han et al. (1996).
Expression of the antisense transgene is assessed in immature plants by extraction of mRNA and northern blotting using the PTD cDNA as a probe, or by RT-PCR. Levels of PTD
protein are analyzed by extraction and concentration of cellular proteins followed by western blotting, or by in situ hybridization.
Example 4 Expression of mutant PTAG-1 sequences in plants
PTAG-1 mutants are constructed by PCR amplification using standard PCR methodologies as described above and a Populus cDNA library as a template. A mutant form of PTAG-1 in which the MADS box domain is deleted is amplified using the following primer combination:
5' GTCACTTTCTGCAAAAGGCGCAGTGGT 3' (Seq. I.D. No. 21)
5' AACTAACTGAAGGGCCATCTGATCTTG 3' (Seq. I.D. No. 22)
A mutant form of PTAG-1 in which a portion of the 3' region of the encoded polypeptide is deleted is amplified using the following primer combination:
5' ATGGAATATCAAAATGAATCCCTTGAG 3' (Seq. I.D. No. 23)
5' ATTCATGCTCTGTCGCTTTCTTTCATTCT 3' (Seq. I.D. No. 24)
The amplified products are cloned using standard cloning vectors and then ligated into a transformation vector such as pBI121 (Jefferson et al., 1987).
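Before ordering primers such as those listed above, a practitioner might check basic properties such as length and GC content. The short Python sketch below does this for the four primer sequences given in this example. The helper name `gc_fraction` and the choice of properties are assumptions for illustration, not part of the described methodology.

```python
# Illustrative sketch only; the helper and the reported properties are
# assumptions. Primer sequences are copied from the example above.
PRIMERS = {
    "Seq. I.D. No. 21": "GTCACTTTCTGCAAAAGGCGCAGTGGT",
    "Seq. I.D. No. 22": "AACTAACTGAAGGGCCATCTGATCTTG",
    "Seq. I.D. No. 23": "ATGGAATATCAAAATGAATCCCTTGAG",
    "Seq. I.D. No. 24": "ATTCATGCTCTGTCGCTTTCTTTCATTCT",
}


def gc_fraction(seq: str) -> float:
    """Return the fraction of G and C bases in a DNA sequence."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    return (seq.count("G") + seq.count("C")) / len(seq)


for name, seq in PRIMERS.items():
    print(f"{name}: {len(seq)} nt, GC = {gc_fraction(seq):.2f}")
```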
Hybrid aspens, such as the P. tremula x alba hybrid aspen and the P. tremula x tremuloides hybrid aspen, are transformed with pDW151 (Weigel and Nilsson, 1995) and the pBI121 binary vector containing the mutant PTAG-1 construct using Agrobacterium tumefaciens strain C58 (Leple et al., 1992) with modifications as described by Han et al. (1996).
Expression of the mutant PTAG-1 transgenes is assessed in immature plants by extraction of mRNA and northern blotting using the PTAG-1 cDNA as a probe or by RT-PCR.
Levels of mutant protein are analyzed by extraction and concentration of cellular proteins followed by western blotting, or by in situ hybridization.
Example 5 Production of Sequence Variants
As noted above, modification of the activity of floral homeotic polypeptides such as PTD, PTLF, PTAG-1 and PTAG-2 in plant cells can be achieved by transforming plants with a selected floral homeotic nucleic acid (cDNA or gene, or parts thereof), antisense constructs based on the disclosed floral homeotic nucleic acid sequences or other variants on the disclosed sequences.
Sequence variants include not only genetically engineered sequence variants, but also naturally occurring variants that arise within Populus populations, including allelic variants and polymorphisms, as well as variants that occur in different genotypes and species of Populus. These naturally occurring variants may be obtained by PCR amplification from genomic or cDNA
libraries made from genetic material of Populus species, or by RT-PCR from mRNA from such species, or by other methods known in the art, including using the disclosed nucleic acids as probes to hybridize with genetic libraries. Methods and conditions for both direct PCR and RT-PCR are known in the art and are described in Innis et al. (1990).
As noted, variant DNA molecules also include those created by DNA genetic engineering techniques, for example, M13 primer mutagenesis. Details of these techniques are provided in Sambrook et al. (1989), Ch. 15. By the use of such techniques, variants may be created which differ in minor ways from the floral homeotic cDNA or gene sequences disclosed. DNA molecules and nucleotide sequences which are derived from the floral homeotic nucleic acids disclosed include DNA sequences which hybridize under stringent conditions to the DNA sequences disclosed, or fragments thereof.
Nucleic acid molecules and proteins that are variants of those disclosed herein may be identified by the degree of sequence identity that they share with a nucleic acid molecule or protein disclosed herein. Typically, such variants share at least 50% sequence identity with a disclosed nucleic acid or protein, as determined by the methods described above for homologs.
Alternatively, for nucleic acid molecules, variants may be identified by their ability to hybridize to a disclosed sequence under stringent conditions, as described above.
The degeneracy of the genetic code further widens the scope of the present invention as it enables major variations in the nucleotide sequence of a DNA molecule while maintaining the amino acid sequence of the encoded protein. For example, the 32nd amino acid residue of the Poplar PTD protein shown in Seq. I.D. No. 4 is alanine. This is encoded in the Poplar PTD open reading frame by the nucleotide codon triplet GCC. Because of the degeneracy of the genetic code, three other nucleotide codon triplets, GCT, GCA and GCG, also code for alanine. Thus, the nucleotide sequence of the Poplar PTD ORF could be changed at this position to any of these three codons without affecting the amino acid composition of the encoded protein or the characteristics of the protein. Based upon the degeneracy of the genetic code, variant DNA molecules may be derived from the cDNA and gene sequences disclosed herein using standard DNA mutagenesis techniques as described above, or by synthesis of DNA sequences. Thus, this invention also encompasses nucleic acid sequences which encode a floral homeotic protein but which vary from the disclosed nucleic acid sequences by virtue of the degeneracy of the genetic code.
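The alanine example can be reproduced with a small lookup over the standard genetic code. The Python sketch below builds the standard codon table and lists the alternative codons that encode the same amino acid as a given codon. The function name `synonymous_codons` is an assumption for illustration.

```python
# Illustrative sketch only; the function name is an assumption.
from itertools import product

BASES = "TCAG"
# Standard genetic code, one letter per codon, in TCAG order ("*" = stop).
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}


def synonymous_codons(codon: str) -> list:
    """Return the other codons that encode the same amino acid."""
    codon = codon.upper()
    aa = CODON_TABLE[codon]
    return sorted(c for c, a in CODON_TABLE.items() if a == aa and c != codon)


print(CODON_TABLE["GCC"], synonymous_codons("GCC"))
```

Any of the returned codons could replace GCC at the position discussed above without changing the encoded residue.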
One skilled in the art will recognize that DNA mutagenesis techniques may be used not only to produce variant DNA molecules, but will also facilitate the production of proteins which differ in certain structural aspects from the Poplar floral homeotic proteins, yet which proteins are clearly derivative of these proteins. Newly derived proteins may also be selected in order to obtain variations on the characteristic of the Poplar floral homeotic proteins. Such derivatives include those with variations in amino acid sequence including minor deletions, additions and substitutions.
While the site for introducing an amino acid sequence variation is predetermined, the mutation per se need not be predetermined. For example, in order to optimize the performance of a mutation at a given site, random mutagenesis may be conducted at the target codon or region and the expressed protein variants screened for the optimal combination of desired activity. Techniques for making substitution mutations at predetermined sites in DNA having a known sequence as described above are well known.

Amino acid substitutions are typically of single residues; insertions usually will be on the order of about from 1 to 10 amino acid residues; and deletions will range about from 1 to 30 residues. Deletions or insertions preferably are made in adjacent pairs, i.e., a deletion of two residues or insertion of two residues. Substitutions, deletions, insertions or any combination thereof may be combined to arrive at a final construct. Obviously, the mutations that are made in the DNA encoding the protein must not place the sequence out of reading frame and preferably will not create complementary regions that could produce secondary mRNA structure.
Substitutional variants are those in which at least one residue in the amino acid sequence has been removed and a different residue inserted in its place. Such substitutions generally are made in accordance with the following Table 1 when it is desired to finely modulate the characteristics of the protein. Table 1 shows amino acids which may be substituted for an original amino acid in a protein and which are typically regarded as conservative substitutions.

Table 1
Original Residue    Conservative Substitutions
Ala                 ser
Arg                 lys
Asn                 gln; his
Asp                 glu
Cys                 ser
Gln                 asn
Glu                 asp
Gly                 pro
His                 asn; gln
Ile                 leu; val
Leu                 ile; val
Lys                 arg; gln; glu
Met                 leu; ile
Phe                 met; leu; tyr
Ser                 thr
Thr                 ser
Trp                 tyr
Tyr                 trp; phe
Val                 ile; leu

Substantial changes in transcription factor function or other features are made by selecting substitutions that are less conservative than those in Table 1, i.e., selecting residues that differ more significantly in their effect on maintaining (a) the structure of the polypeptide backbone in the area of the substitution, for example, as a sheet or helical conformation, (b) the charge or hydrophobicity of the molecule at the target site, or (c) the bulk of the side chain. The substitutions which in general are expected to produce the greatest changes in protein properties will be those in which (a) a hydrophilic residue, e.g., seryl or threonyl, is substituted for (or by) a hydrophobic residue, e.g., leucyl, isoleucyl, phenylalanyl, valyl or alanyl; (b) a cysteine or proline is substituted for (or by) any other residue; (c) a residue having an electropositive side chain, e.g., lysyl, arginyl, or histidyl, is substituted for (or by) an electronegative residue, e.g., glutamyl or aspartyl; or (d) a residue having a bulky side chain, e.g., phenylalanine, is substituted for (or by) one not having a side chain, e.g., glycine.
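Table 1 can be encoded as a simple lookup to check whether a proposed substitution falls within the listed conservative substitutions. The Python sketch below is one possible encoding; the dictionary and function names are assumptions for illustration. The lookup is directional, following the table as written (for example, Ala to Ser is listed, but Ser to Ala is not).

```python
# Illustrative sketch only; names are assumptions. Entries follow Table 1.
CONSERVATIVE = {
    "Ala": {"Ser"},
    "Arg": {"Lys"},
    "Asn": {"Gln", "His"},
    "Asp": {"Glu"},
    "Cys": {"Ser"},
    "Gln": {"Asn"},
    "Glu": {"Asp"},
    "Gly": {"Pro"},
    "His": {"Asn", "Gln"},
    "Ile": {"Leu", "Val"},
    "Leu": {"Ile", "Val"},
    "Lys": {"Arg", "Gln", "Glu"},
    "Met": {"Leu", "Ile"},
    "Phe": {"Met", "Leu", "Tyr"},
    "Ser": {"Thr"},
    "Thr": {"Ser"},
    "Trp": {"Tyr"},
    "Tyr": {"Trp", "Phe"},
    "Val": {"Ile", "Leu"},
}


def is_conservative(original: str, replacement: str) -> bool:
    """Return True if Table 1 lists `replacement` for `original`."""
    return replacement.capitalize() in CONSERVATIVE.get(original.capitalize(), set())


print(is_conservative("Ala", "Ser"))  # listed in Table 1
print(is_conservative("Phe", "Gly"))  # not listed; a bulky-to-small change
```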
Homologous polypeptides that share at least 50% amino acid sequence identity to the disclosed PTD, PTLF, PTAG-1 or PTAG-2 amino acid sequences as determined using BLAST 2.0, gapped blastp, with default parameters, are encompassed by this invention.
Homologs with even greater similarity to the reference sequences will show increasing percentage identities when assessed by this method, such as at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 90% or at least 95% sequence identity. Such homologous peptides are preferably at least 10 amino acids in length, and more preferably at least 25 or 50 amino acids in length. When less than the entire sequence is being compared for sequence identity, homologs will typically possess at least 75% sequence identity over short windows of 10-20 amino acids, and may possess sequence identities of at least 85% or at least 90% or 95% depending on their similarity to the reference sequence. Also encompassed by the present invention are the nucleic acid sequences that encode these homologous peptides.
Similarly, homologous nucleic acids that share at least 50% nucleotide identity to the disclosed PTD, PTLF, PTAG-1 or PTAG-2 nucleic acid sequences as determined using BLAST 2.0, gapped blastn, with default parameters, are encompassed by this invention. Homologs with even greater similarity to the reference sequences will show increasing percentage identities when assessed by this method, such as at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 90% or at least 95% sequence identity. Such homologous nucleic acids are preferably at least 50 nucleotides in length, and more preferably at least 100 or 250 nucleotides in length.
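To make the identity thresholds concrete, the following Python sketch computes percent identity between two sequences that have already been aligned to equal length. This is a deliberately simplified stand-in and is not the BLAST 2.0 calculation named above; the function names and the toy peptide strings are assumptions for illustration.

```python
# Illustrative sketch only; this is NOT BLAST. It assumes the two
# sequences are pre-aligned and of equal length. Names and toy data
# are assumptions.
def percent_identity(a: str, b: str) -> float:
    """Return percent identity over two equal-length aligned sequences."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)


def meets_threshold(a: str, b: str, threshold: float = 50.0) -> bool:
    """Return True if identity is at least `threshold` percent."""
    return percent_identity(a, b) >= threshold


reference = "MGRGKIEIKR"  # made-up 10-residue window
candidate = "MGRGKVEIKK"  # differs at two positions
print(percent_identity(reference, candidate))
print(meets_threshold(reference, candidate, 75.0))
```

For real comparisons, the gapped blastp or blastn programs with default parameters, as stated in the text, would be used instead of this position-by-position count.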
Example 6 Other Applications of the Disclosed Sequences
The disclosed floral homeotic nucleic acids and polypeptides are useful as laboratory reagents to study and analyze floral gene expression in plants, including plants engineered for modified fertility characteristics. For example, probes and primers derived from the PTD
sequence, as well as monoclonal antibodies specific for the PTD polypeptide may be used to detect and quantify expression of PTD in seedlings transformed with an antisense PTD
construct as described above. Such analyses would facilitate detection of those transformants that display modified PTD expression and which may therefore be good candidates for having modified fertility characteristics.
The production of probes and primers derived from the disclosed sequences is described in detail above. Production of monoclonal antibodies requires that all or part of the protein against which the antibodies are to be raised be purified. With the provision herein of the floral homeotic nucleic acid sequences, as well as the sequences of the encoded polypeptides, this may be achieved by expression in heterologous expression systems, or chemical synthesis of peptide fragments.
Many different expression systems are available for expressing cloned nucleic acid molecules. Examples of prokaryotic and eukaryotic expression systems that are routinely used in laboratories are described in Chapters 16-17 of Sambrook et al. (1989). Such systems may be used to express the floral homeotic polypeptides at high levels to facilitate purification.
By way of example only, high level expression of a floral homeotic polypeptide may be achieved by cloning and expressing the selected cDNA in yeast cells using the pYES2 yeast expression vector (Invitrogen, San Diego, CA). Secretion of the recombinant floral homeotic polypeptide from the yeast cells may be achieved by placing a yeast signal sequence adjacent to the floral homeotic nucleic acid coding region. A number of yeast signal sequences have been characterized, including the signal sequence for yeast invertase. This sequence has been successfully used to direct the secretion of heterologous proteins from yeast cells, including such proteins as human interferon (Chang et al., 1986), human lactoferrin (Liang and Richardson, 1993) and prochymosin (Smith et al., 1985). Alternatively, the polypeptide may be expressed at high level in prokaryotic expression systems, such as E. coli.
Monoclonal or polyclonal antibodies may be produced to the selected floral homeotic polypeptide or portions thereof. Optimally, antibodies raised against a specified floral homeotic polypeptide will specifically detect that polypeptide. That is, for example, antibodies raised against the PTD polypeptide would recognize and bind the PTD polypeptide and would not substantially recognize or bind to other proteins found in poplar cells. The determination that an antibody specifically detects PTD is made by any one of a number of standard immunoassay methods; for instance, the Western blotting technique (Sambrook et al., 1989). To determine that a given antibody preparation (such as one produced in a mouse against PTD) specifically detects PTD by Western blotting, total cellular protein is extracted from poplar cells and electrophoresed on a sodium dodecyl sulfate-polyacrylamide gel. The proteins are then transferred to a membrane (for example, nitrocellulose) by Western blotting, and the antibody preparation is incubated with the membrane. After washing the membrane to remove non-specifically bound antibodies, the presence of specifically bound antibodies is detected by the use of an anti-mouse antibody conjugated to an enzyme such as alkaline phosphatase; application of the substrate 5-bromo-4-chloro-3-indolyl phosphate/nitro blue tetrazolium results in the production of a dense blue compound by immuno-localized alkaline phosphatase. Antibodies which specifically detect PTD
will, by this technique, be shown to bind to substantially only the PTD band (which will be localized at a given position on the gel determined by its molecular weight).
Non-specific binding of the antibody to other proteins may occur and may be detectable as a weak signal on the Western blot. The non-specific nature of this binding will be recognized by one skilled in the art by the weak signal obtained on the Western blot relative to the strong primary signal arising from the specific antibody-PTD binding.
Substantially pure floral homeotic polypeptides suitable for use as an immunogen may be isolated from transformed cells as described above. Concentration of protein in the final preparation is adjusted, for example, by concentration on an Amicon filter device, to the level of a few micrograms per milliliter. Alternatively, peptide fragments of the specified floral homeotic polypeptide may be utilized as immunogens. Such fragments may be chemically synthesized using standard methods, or may be obtained by cleavage of the whole floral homeotic polypeptide followed by purification of the desired peptide fragments. Peptides as short as 3 or 4 amino acids in length are immunogenic when presented to the immune system in the context of a Major Histocompatibility Complex (MHC) molecule, such as MHC class I or MHC class II. Accordingly, peptides comprising at least 3 and preferably at least 4, 5, 6 or 10 or more consecutive amino acids of the disclosed floral homeotic polypeptide amino acid sequences may be employed as immunogens to raise antibodies. Because naturally occurring epitopes on proteins are frequently comprised of amino acid residues that are not adjacently arranged in the peptide when the peptide sequence is viewed as a linear molecule, it may be advantageous to utilize longer peptide fragments from the floral homeotic polypeptide amino acid sequences in order to raise antibodies.
Thus, for example, peptides that comprise at least 10, 15, 20, 25 or 30 consecutive amino acid residues of the floral homeotic polypeptide amino acid sequence may be employed. Monoclonal or polyclonal antibodies to the intact floral homeotic polypeptide or peptide fragments of this protein may be prepared as described below.
Monoclonal antibody to epitopes of the selected floral homeotic polypeptide can be prepared from murine hybridomas according to the classical method of Kohler and Milstein (1975) or derivative methods thereof. Briefly, a mouse is repetitively inoculated with a few micrograms of the selected protein over a period of a few weeks. The mouse is then sacrificed, and the antibody-producing cells of the spleen isolated. The spleen cells are fused by means of polyethylene glycol with mouse myeloma cells, and the excess unfused cells destroyed by growth of the system on selective media comprising aminopterin (HAT media). The successfully fused cells are diluted and aliquots of the dilution placed in wells of a microtiter plate where growth of the culture is continued. Antibody-producing clones are identified by detection of antibody in the supernatant fluid of the wells by immunoassay procedures, such as ELISA, as originally described by Engvall (1980), and derivative methods thereof. Selected positive clones can be expanded and their monoclonal antibody product harvested for use. Detailed procedures for monoclonal antibody production are described in Harlow and Lane (1988).
Having illustrated and described the principles of isolating the Populus floral homeotic genes, the proteins encoded by these genes and modes of use of these biological molecules, it should be apparent to one skilled in the art that the invention can be modified in arrangement and detail without departing from such principles. We claim all modifications coming within the spirit and scope of the claims presented herein.

References
Ainley et al. (1993). Regulatable endogenous production of cytokinins up to "toxic" levels in transgenic plants and plant tissues. Plant Mol. Biol. 22:13-23.
Altschul et al. (1990). J. Mol. Biol. 215:403-410.
Altschul et al. (1994). Nature Genetics 6:119-129.
An et al. (1988). Plant Physiol. 88:547.
Ausubel et al. (1987). In: Current Protocols in Molecular Biology, Greene Publishing Associates and Wiley-Intersciences.
Baker et al. (1990). RNA and DNA isolation from recalcitrant plant tissues. BioTechniques 9:268-272.
Benfey and Chua (1990). The cauliflower mosaic virus 35S promoter: Combinatorial regulation of transcription in plants. Science 250:959-966.
Boes and Strauss (1994). Floral phenology and morphology of black cottonwood, Populus trichocarpa (Salicaceae). Am. J. Bot. 8:562-567.
Brayshaw (1965). The status of the black cottonwood (Populus trichocarpa Torreyand Gray). Can. Field Nat. 79:91-95.
Brusslan et al. (1993). An Arabidopsis mutant with a reduced level of cabl40 RNA is aresult of cosuppression. Plant Cell 5:667-677.
Bustos et al. ( 1989). Plant Cell I : 839.
Callis et al. (1988). Plant Physiol. 88: 965.
Carpenter et al. (1992). Preferential expression of an a-tubulin gene ofArabidopsis in pollen. Plant Cell 4:557-571.
Chang et al. (1986). Saccharomyces cerevisiae secretes and correctly processes human interferon hybrid protein containing yeast invertase signal peptides. Mol. and Cell. Biol. 6:1812 1819.
Cheng et al. (1983). Organ initiation and the development of unisexual flowers in the tassel and ear of Zea mays. Am. J. Bot. 70:450-462.
Coen et al. (1990). FLORICAULA: a homeotic gene required for flower development in Antirrhinum majus. Cell 63:1311-1322.
Corpet et al. (1988). Nucleic Acids Research 16:10881-10890.
Dekeyser et al. (1990). Plant Cell 2:591.
Denis et al. (1993). Expression of engineered nuclear male sterility in Brassica napus. Plant Physiol. 101:1295-1304.
Don et al. (1991). "Touchdown" PCR to circumvent spurious priming during gene amplification. Nucl. Acids Res. 19:4008.
Ellis et al. (1991). Plant Mol. Biol. 17:19-27.
Ellis et al. (1993). Bio/Technology 11:84-89.
Engvall (1980). Enzymol. 70:419.
Flavell (1994). Inactivation of gene expression in plants as a consequence of specific sequence duplication. Proc Natl Acad Sci USA 91:3490-3496.
Fromm et al. (1989). Plant Cell 1:977.
Gan and Amasino (1995). Inhibition of leaf senescence by autoregulated production of cytokinin. Science 270:1986-1988.
Gatz (1997). Chemical control of gene expression. Ann. Rev. Plant Physiol. Plant Mol. Biol. 48:89-108.
Gelvin et al. (1990). Plant Molecular Biology Manual, Kluwer Academic Publishers.
Gilmartin et al. (1992). Characterization of a gene encoding a DNA binding protein with specificity for a light-responsive element. Plant Cell 4:839-849.
Goldman et al. (1994). Female sterile tobacco plants are produced by stigma-specific cell ablation. EMBO J. 13:2976-2984.
Grant et al. (1994). Developmental differences between male and female flowers in the dioecious plant Silene latifolia. Plant J. 6:471-480.
Han et al. (1996). Cellular and molecular biology of Agrobacterium-mediated transformation of plants and its application to genetic transformation of Populus. In: Stettler et al. [eds.] Biology of Populus and its Implications for Management and Conservation, Part I, Chapter 9, pp. 201-222, NRC Research Press, Nat. Res. Coun. of Canada, Ottawa, Ontario.
Harlow and Lane (1988). Antibodies: A Laboratory Manual, Cold Spring Harbor Laboratory, New York.
Henikoff (1984). Unidirectional digestion with exonuclease III creates targeted breakpoints for DNA sequencing. Gene 28:351-359.
Higgins and Sharp (1988). Gene 73:237-244.
Higgins and Sharp (1989). CABIOS 5:151-153.
Huang et al. (1992). Computer Applications in the Biosciences 8:155-165.
Innis et al. (1990). PCR Protocols, A Guide to Methods and Applications, Innis et al. [eds.], Academic Press, Inc., San Diego, California.
Jefferson et al. (1987). GUS fusions: β-glucuronidase as a sensitive and versatile gene fusion marker in higher plants. EMBO J. 6:3901-3907.
Jorgensen (1992). Silencing of plant genes by homologous transgenes. AgBiotech News Info 4:265N-273N.
Kandasamy et al. (1993). Ablation of papillar cell function in Brassica flowers results in the loss of stigma receptivity to pollination. Plant Cell 5:263-275.
Kaul (1995). Reproductive structure and organogenesis in a cottonwood, Populus deltoides (Salicaceae). Int. J. Plant Sci. 156:172-180.
Kawasaki et al. (1990). In: PCR Protocols, A Guide to Methods and Applications, Innis et al. [eds.], pp. 21-27, Academic Press, Inc., San Diego, California.
Kelly et al. (1995). NFL, the tobacco homolog of FLORICAULA and LEAFY, is transcriptionally expressed in both vegetative and floral meristems. Plant Cell 7:225-34.
Kohler and Milstein (1975). Nature 256:495.
Kooter and Mol (1993). Trans-inactivation of gene expression in plants. Curr. Opin. Biotechnol. 4:166-171.
Kuhlemeier et al. (1989). Plant Cell 1:471.
Leple et al. (1992). Transgenic poplars: Expression of chimeric genes using four different constructs. Plant Cell Rep. 11:137-41.
Liang and Richardson (1993). Expression and characterization of human lactoferrin in yeast (Saccharomyces cerevisiae). J. Agric. Food Chem. 41:1800-1807.
Ma (1994). The unfolding drama of flower development: Recent results from genetic and molecular analyses. Genes Dev. 8:745-756.
Ma et al. (1991). AGL1-AGL6, an Arabidopsis gene family with similarity to floral homeotic and transcription factor genes. Genes & Dev. 5:484-495.
Mandel et al. (1992a). Manipulation of flower structure in transgenic tobacco. Cell 71:133-143.
Mandel et al. (1992b). Molecular characterization of the Arabidopsis floral homeotic gene APETALA1. Nature 360:273-277.
Marcotte et al. (1989). Plant Cell 1:969.
Mariani et al. (1990). Induction of male sterility in plants by a chimaeric ribonuclease gene. Nature 347:737-741.
Mariani et al. (1992). A chimaeric ribonuclease-inhibitor gene restores fertility to male-sterile plants. Nature 357:384-387.
Matzke et al. (1993). Genomic imprinting in plants: parental effects and trans-inactivation phenomena. Annu. Rev. Plant Physiol. Plant Mol. Biol. 44:53-76.
McCown et al. (1991). Stable transformation of Populus and incorporation of pest resistance by electric discharge particle acceleration. Plant Cell Rep. 9:590-594.
Mizukami et al. (1996). Plant Cell 8:831-845.
Mol et al. (1994). Post-transcriptional inhibition of gene expression: Sense and antisense genes. In: Paszkowski J (ed.) Homologous Recombination and Gene Silencing in Plants, pp. 309-334, Kluwer Academic Publishers, Dordrecht.
Nagaraj (1952). Floral morphology of Populus deltoides and P. tremuloides. Bot. Gaz. 114:222-243.
Needleman and Wunsch (1970). J. Mol. Biol. 48:443.
Odell et al. (1994). Seed specific gene activation mediated by the Cre/lox site-specific recombination system. Plant Physiol. 106:447-458.
Okamuro et al. (1993). Regulation of Arabidopsis flower development. Plant Cell 5:1183-93.
Opperman et al. (1993). Root knot nematode directed expression of a plant root specific gene. Science 263:221-223.
Pappenheimer (1977). Diphtheria toxin. Annu. Rev. Biochem. 46:69-94.
Paul et al. (1992). The isolation and characterization of the tapetum-specific Arabidopsis thaliana A9 gene. Plant Mol. Biol. 19:611-622.
Pearson and Lipman (1988). Proc. Natl. Acad. Sci. USA 85:2444.
Pearson et al. (1994). Methods in Molecular Biology 24:307-331.
Pnueli et al. (1991). Plant J. 1:255-266.
Pnueli et al. (1994). Isolation of the tomato Agamous gene TAG1 and analysis of its homeotic role in transgenic plants. Plant Cell 6:163-173.
Pouwels et al. (1987). Cloning Vectors: A Laboratory Manual, 1985 supplement.
Purugganan et al. (1995). Molecular evolution of flower development: Diversification of the plant MADS-box regulatory gene family. Genetics 140:345-56.
Reynaerts et al. (1993). Engineered genes for fertility control and their application in hybrid seed production. Sci. Hortic. 55:125-139.
Riechmann et al. (1996). DNA binding properties of Arabidopsis MADS domain homeotic proteins APETALA1, APETALA3, PISTILLATA and AGAMOUS. Nucl. Acids Res. 24(16):3134-3141.
Roshal et al. (1987). EMBO J. 6:1155.
Sambrook et al. (1989). Molecular Cloning: A Laboratory Manual. Cold Spring Harbor Laboratory, Cold Spring Harbor, New York.
Schaffner and Sheen (1991). Plant Cell 3:997.
Schernthaner et al. (1988). EMBO J. 7:1249.
Schmulling et al. (1993). Restoration of fertility by antisense RNA in genetically engineered male sterile tobacco plants. Mol. Gen. Genet. 237:385-394.
Schwarz-Sommer et al. (1992). EMBO J. 11:251-263.
Sheppard (1997). PTD: a Populus trichocarpa gene with homology to floral homeotic transcription factors. Ph.D. dissertation. Oregon State University.
Siebertz et al. (1989). Plant Cell 1:961.
Smith and Waterman (1981). Adv. Appl. Math. 2:482.
Smith et al. (1985). Heterologous protein secretion from yeast. Science 229:1219-1224.
Stettler (1993). Poplar Molecular Network Newsletter 1(1), College of Forest Resources AR-10, University of Washington, Seattle, Washington.
Stockhause et al. (1997). The promoter of the gene encoding the C4 Flaveria spp. Plant Cell 9:479-489.
Strauss et al. (1995a). Molecular Breeding 1:5-26.
Strauss et al. (1995b). TGERC Annual Report: 1994-1995. Forest Research Laboratory, Oregon State University.
Taylor et al. (1992). Conditional male-fertility in chalcone synthase-deficient petunia. J. Hered. 83:11-17.
Terada and Shimamoto (1990). Mol. Gen. Genet. 220:389.
Thorsness et al. (1991). A Brassica S-locus gene promoter targets toxic gene expression and cell death to the pistil and pollen of transgenic Nicotiana. Devel. Biol. 143:173-184.
Thorsness et al. (1993). Genetic ablation of floral cells in Arabidopsis. Plant Cell 5:253-61.
Tijssen (1993). Overview of principles of hybridization and the strategy of nucleic acid probe assays. In: Laboratory Techniques in Biochemistry and Molecular Biology - Hybridization with Nucleic Acid Probes, Part I, Chapter 2. Elsevier, New York.
Van der Meer et al. (1992). Antisense inhibition of flavonoid biosynthesis in petunia anthers results in male sterility. Plant Cell 4:253-262.
Wagner et al. (1987). Chloroplast DNA polymorphisms in lodgepole pine and their hybrids. Proc. Natl. Acad. Sci. USA 84:2097-2100.
Weigel et al. (1992). LEAFY controls floral meristem identity in Arabidopsis. Cell 69:843-59.
Weigel and Nilsson (1995). A developmental switch sufficient for flower initiation in diverse plants. Nature 377:495-500.
Weissbach and Weissbach (1989). Methods for Plant Molecular Biology, Academic Press.
Worrall et al. (1992). Premature dissolution of the microsporocyte callose wall causes male sterility in transgenic tobacco. Plant Cell 4:759-771.
Yanofsky (1995). Floral meristems to floral organs: Genes controlling early events in Arabidopsis flower development. Annu. Rev. Plant Physiol. 46:167-188.
Yanofsky et al. (1990). The protein encoded by the Arabidopsis homeotic gene AGAMOUS resembles transcription factors. Nature 346:35-39.

SEQUENCE LISTING
(1) GENERAL INFORMATION:
(i) APPLICANT: THE STATE OF OREGON ACTING BY AND THROUGH THE STATE BOARD OF HIGHER EDUCATION ON BEHALF OF OREGON STATE UNIVERSITY
(ii) TITLE OF INVENTION: FLORAL HOMEOTIC GENES FOR MANIPULATION OF FLOWERING IN POPLAR TREES AND OTHER PLANT SPECIES
(iii) NUMBER OF SEQUENCES: 24
(iv) CORRESPONDENCE ADDRESS:
(A) ADDRESSEE: SMART & BIGGAR
(B) STREET: P.O. BOX 2999, STATION D
(C) CITY: OTTAWA
(D) STATE: ONT
(E) COUNTRY: CANADA
(F) ZIP: K1P 5Y6
(v) COMPUTER READABLE FORM:
(A) MEDIUM TYPE: Floppy disk
(B) COMPUTER: IBM PC compatible
(C) OPERATING SYSTEM: PC-DOS/MS-DOS
(D) SOFTWARE: ASCII (text)
(vi) CURRENT APPLICATION DATA:
(A) APPLICATION NUMBER: CA 2,319,853
(B) FILING DATE: 02-OCT-2000
(C) CLASSIFICATION:
(vii) PRIOR APPLICATION DATA:
(A) APPLICATION NUMBER:
(B) FILING DATE:
(viii) ATTORNEY/AGENT INFORMATION:
(A) NAME: SMART & BIGGAR
(B) REGISTRATION NUMBER:
(C) REFERENCE/DOCKET NUMBER: 63198-1293
(ix) TELECOMMUNICATION INFORMATION:
(A) TELEPHONE: (613)-232-2486
(B) TELEFAX: (613)-232-8440

(2) INFORMATION FOR SEQ ID NO.: 1:
(i) SEQUENCE CHARACTERISTICS
(A) LENGTH: 4285
(B) TYPE: nucleic acid
(C) STRANDEDNESS:
(D) TOPOLOGY:
(ii) MOLECULE TYPE: DNA
(vi) ORIGINAL SOURCE:
(A) ORGANISM: Populus balsamifera subsp. trichocarpa
(xi) SEQUENCE DESCRIPTION: SEQ ID NO.: 1:

(2) INFORMATION FOR SEQ ID NO.: 2:
(i) SEQUENCE CHARACTERISTICS
(A) LENGTH: 946
(B) TYPE: nucleic acid
(C) STRANDEDNESS:
(D) TOPOLOGY:
(ii) MOLECULE TYPE: DNA
(vi) ORIGINAL SOURCE:
(A) ORGANISM: Populus balsamifera subsp. trichocarpa
(ix) FEATURE
(A) NAME/KEY: CDS
(B) LOCATION: (1)..(684)
(xi) SEQUENCE DESCRIPTION: SEQ ID NO.: 2:

Met Gly Arg Gly Lys Ile Glu Ile Lys Lys Ile Glu Asn Pro Thr Asn Arg Gln Val Thr Tyr Ser Lys Arg Arg Asn Gly Ile Phe Lys Lys Ala Gln Glu Leu Thr Val Leu Cys Asp Ala Lys Val Ser Leu Ile Met Phe Ser Asn Thr Asn Lys Leu Asn Glu Tyr Ile Ser Pro Ser Thr Ser Thr Lys Lys Ile Tyr Asp Gln Tyr Gln Asn Ala Leu Gly Ile Asp Leu Trp Gly Thr Gln Tyr Glu Lys Met Gln Glu His Leu Arg Lys Leu Asn Asp Ile Asn His Lys Leu Arg Gln Glu Ile Arg Gln Arg Arg Gly Glu Gly Leu Asn Asp Leu Ser Ile Asp His Leu Arg Gly Leu Glu Gln His Met Thr Glu Ala Leu Asn Gly Val Arg Gly Arg Lys Tyr His Val Ile Lys Thr Gln Asn Glu Thr Tyr Arg Lys Lys Val Lys Asn Leu Glu Glu Arg His Gly Asn Leu Leu Met Glu Tyr Glu Ala Lys Leu Glu Asp Arg Gln Tyr Gly Leu Val Asp Asn Glu Ala Ala Val Ala Leu Ala Asn Gly Ala Ser Asn Leu Tyr Ala Phe Arg Leu His His Gly His Asn His His His His Leu Pro Asn Leu His Leu Gly Asp Gly Phe Gly Ala His Glu Leu Arg Leu Pro
GCTCTTTTTC ACAAATAAAA AAAAAAAAAA AAAAAAAAAA AA 946
(2) INFORMATION FOR SEQ ID NO.: 3:
(i) SEQUENCE CHARACTERISTICS
(A) LENGTH: 681
(B) TYPE: nucleic acid
(C) STRANDEDNESS:
(D) TOPOLOGY:
(ii) MOLECULE TYPE: DNA
(vi) ORIGINAL SOURCE:
(A) ORGANISM: Populus balsamifera subsp. trichocarpa
(ix) FEATURE
(A) NAME/KEY: CDS
(B) LOCATION: (1)..(681)
(xi) SEQUENCE DESCRIPTION: SEQ ID NO.: 3:

Met Gly Arg Gly Lys Ile Glu Ile Lys Lys Ile Glu Asn Pro Thr Asn Arg Gln Val Thr Tyr Ser Lys Arg Arg Asn Gly Ile Phe Lys Lys Ala Gln Glu Leu Thr Val Leu Cys Asp Ala Lys Val Ser Leu Ile Met Phe Ser Asn Thr Asn Lys Leu Asn Glu Tyr Ile Ser Pro Ser Thr Ser Thr Lys Lys Ile Tyr Asp Gln Tyr Gln Asn Ala Leu Gly Ile Asp Leu Trp Gly Thr Gln Tyr Glu Lys Met Gln Glu His Leu Arg Lys Leu Asn Asp Ile Asn His Lys Leu Arg Gln Glu Ile Arg Gln Arg Arg Gly Glu Gly Leu Asn Asp Leu Ser Ile Asp His Leu Arg Gly Leu Glu Gln His Met Thr Glu Ala Leu Asn Gly Val Arg Gly Arg Lys Tyr His Val Ile Lys Thr Gln Asn Glu Thr Tyr Arg Lys Lys Val Lys Asn Leu Glu Glu Arg His Gly Asn Leu Leu Met Glu Tyr Glu Ala Lys Leu Glu Asp Arg Gln Tyr Gly Leu Val Asp Asn Glu Ala Ala Val Ala Leu Ala Asn Gly Ala Ser Asn Leu Tyr Ala Phe Arg Leu His His Gly His Asn His His His His Leu Pro Asn Leu His Leu Gly Asp Gly Phe Gly Ala His Glu Leu Arg Leu Pro
(2) INFORMATION FOR SEQ ID NO.: 4:
(i) SEQUENCE CHARACTERISTICS
(A) LENGTH: 227
(B) TYPE: amino acid
(C) STRANDEDNESS:
(D) TOPOLOGY:
(ii) MOLECULE TYPE: polypeptide
(vi) ORIGINAL SOURCE:
(A) ORGANISM: Populus balsamifera subsp. trichocarpa
(xi) SEQUENCE DESCRIPTION: SEQ ID NO.: 4:
Met Gly Arg Gly Lys Ile Glu Ile Lys Lys Ile Glu Asn Pro Thr Asn Arg Gln Val Thr Tyr Ser Lys Arg Arg Asn Gly Ile Phe Lys Lys Ala Gln Glu Leu Thr Val Leu Cys Asp Ala Lys Val Ser Leu Ile Met Phe Ser Asn Thr Asn Lys Leu Asn Glu Tyr Ile Ser Pro Ser Thr Ser Thr Lys Lys Ile Tyr Asp Gln Tyr Gln Asn Ala Leu Gly Ile Asp Leu Trp Gly Thr Gln Tyr Glu Lys Met Gln Glu His Leu Arg Lys Leu Asn Asp Ile Asn His Lys Leu Arg Gln Glu Ile Arg Gln Arg Arg Gly Glu Gly Leu Asn Asp Leu Ser Ile Asp His Leu Arg Gly Leu Glu Gln His Met Thr Glu Ala Leu Asn Gly Val Arg Gly Arg Lys Tyr His Val Ile Lys Thr Gln Asn Glu Thr Tyr Arg Lys Lys Val Lys Asn Leu Glu Glu Arg His Gly Asn Leu Leu Met Glu Tyr Glu Ala Lys Leu Glu Asp Arg Gln Tyr Gly Leu Val Asp Asn Glu Ala Ala Val Ala Leu Ala Asn Gly Ala Ser Asn Leu Tyr Ala Phe Arg Leu His His Gly His Asn His His His His Leu Pro Asn Leu His Leu Gly Asp Gly Phe Gly Ala His Glu Leu Arg Leu Pro
(2) INFORMATION FOR SEQ ID NO.: 5:

(i) SEQUENCE CHARACTERISTICS
(A) LENGTH:
(B) TYPE: nucleic acid
(C) STRANDEDNESS:
(D) TOPOLOGY:
(ii) MOLECULE TYPE: DNA
(vi) ORIGINAL SOURCE:
(A) ORGANISM: Populus balsamifera subsp. trichocarpa
(xi) SEQUENCE DESCRIPTION: SEQ ID NO.: 5:

AAAGAAAAAA
AGACAAAAAA

(2) INFORMATION FOR SEQ ID NO.: 6:

(i) SEQUENCE CHARACTERISTICS
(A) LENGTH:
(B) TYPE: nucleic acid
(C) STRANDEDNESS:
(D) TOPOLOGY:
(ii) MOLECULE TYPE: DNA
(vi) ORIGINAL SOURCE:
(A) ORGANISM: Populus balsamifera subsp. trichocarpa
(ix) FEATURE
(A) NAME/KEY: CDS
(B) LOCATION: (12)..(1145)
(xi) SEQUENCE DESCRIPTION: SEQ ID NO.: 6:

Met Asp Pro Glu Ala Phe Thr Ala Ser Leu Phe Lys Trp Asp Thr Arg Ala Met Val Pro His Pro Asn Arg Leu Leu Glu Met Val Pro Pro Pro Gln Gln Pro Pro Ala Ala Ala Phe Ala Val Arg Pro Arg Glu Leu Cys Gly Leu Glu Glu Leu Phe Gln Ala Tyr Gly Ile Arg Tyr Tyr Thr Ala Ala Lys Ile Ala Glu Leu Gly Phe Thr Val Asn Thr Leu Leu Asp Met Lys Asp Glu Glu Leu Asp Glu Met Met Asn Ser Leu Ser Gln Ile Phe Arg Trp Asp Leu Leu Val Gly Glu Arg Tyr Gly Ile Lys Ala Ala Val Arg Ala Glu Arg Arg Arg Leu Asp Glu Glu Asp Pro Arg Arg Arg Gln Leu Leu Ser Gly Asp Asn Asn Thr Asn Thr Leu Asp Ala Leu Ser Gln Glu Gly Phe Ser Glu Glu Pro Val Gln Gln Asp Lys Glu Ala Ala Gly Ser Gly Gly Arg Gly Thr Trp Glu Ala Val Ala Ala Gly Glu Arg Lys Lys Gln Ser Gly Arg Lys Lys Gly Gln Arg Lys Val Val Asp Leu Asp Gly Asp Asp Glu His Gly Gly Ala Ile Cys Glu Arg Gln Arg Glu His Pro Phe Ile Val Thr Glu Pro Gly Glu Val Ala Arg Gly Lys Lys Asn Gly Leu Asp Tyr Leu Phe His Leu Tyr Glu Gln Cys Arg Asp Phe Leu Ile Gln Val Gln Ser Ile Ala Lys Glu Arg Gly Glu Lys Cys Pro Thr Lys Val Thr Asn Gln Val Phe Arg Tyr Ala Lys Lys Ala Gly Ala Ser Tyr Ile Asn Lys Pro Lys Met Arg His Tyr Val His Cys Tyr Ala Leu His Cys Leu Asp Glu Asp Ala Ser Asn Ala Leu Arg Arg Ala Phe Lys Glu Arg Gly Glu Asn Val Gly Ala Trp Arg Gln Ala Cys Tyr Lys Pro Leu Val Ala Ile Ala Ser Arg Gln Gly Trp Asp Ile Asp Ser Ile Phe Asn Ala His Pro Arg Leu Ala Ile Trp Tyr Val Pro Thr Lys Leu Arg Gln Leu Cys Tyr Ala Glu Arg Asn Ser Ala Thr Ser Ser Ser Ser Val Ser Gly Thr Gly Gly His Leu Pro Phe
(2) INFORMATION FOR SEQ ID NO.: 7:
(i) SEQUENCE CHARACTERISTICS
(A) LENGTH: 1131
(B) TYPE: nucleic acid
(C) STRANDEDNESS:
(D) TOPOLOGY:
(ii) MOLECULE TYPE: DNA
(vi) ORIGINAL SOURCE:
(A) ORGANISM: Populus balsamifera subsp. trichocarpa
(ix) FEATURE
(A) NAME/KEY: CDS
(B) LOCATION: (1)..(1131)
(xi) SEQUENCE DESCRIPTION: SEQ ID NO.: 7:

Met Asp Pro Glu Ala Phe Thr Ala Ser Leu Phe Lys Trp Asp Thr Arg Ala Met Val Pro His Pro Asn Arg Leu Leu Glu Met Val Pro Pro Pro Gln Gln Pro Pro Ala Ala Ala Phe Ala Val Arg Pro Arg Glu Leu Cys Gly Leu Glu Glu Leu Phe Gln Ala Tyr Gly Ile Arg Tyr Tyr Thr Ala Ala Lys Ile Ala Glu Leu Gly Phe Thr Val Asn Thr Leu Leu Asp Met Lys Asp Glu Glu Leu Asp Glu Met Met Asn Ser Leu Ser Gln Ile Phe Arg Trp Asp Leu Leu Val Gly Glu Arg Tyr Gly Ile Lys Ala Ala Val Arg Ala Glu Arg Arg Arg Leu Asp Glu Glu Asp Pro Arg Arg Arg Gln Leu Leu Ser Gly Asp Asn Asn Thr Asn Thr Leu Asp Ala Leu Ser Gln Glu Gly Phe Ser Glu Glu Pro Val Gln Gln Asp Lys Glu Ala Ala Gly Ser Gly Gly Arg Gly Thr Trp Glu Ala Val Ala Ala Gly Glu Arg Lys Lys Gln Ser Gly Arg Lys Lys Gly Gln Arg Lys Val Val Asp Leu Asp Gly Asp Asp Glu His Gly Gly Ala Ile Cys Glu Arg Gln Arg Glu His Pro Phe Ile Val Thr Glu Pro Gly Glu Val Ala Arg Gly Lys Lys Asn Gly Leu Asp Tyr Leu Phe His Leu Tyr Glu Gln Cys Arg Asp Phe Leu Ile Gln Val Gln Ser Ile Ala Lys Glu Arg Gly Glu Lys Cys Pro Thr Lys Val Thr Asn Gln Val Phe Arg Tyr Ala Lys Lys Ala Gly Ala Ser Tyr Ile Asn Lys Pro Lys Met Arg His Tyr Val His Cys Tyr Ala Leu His Cys Leu Asp Glu Asp Ala Ser Asn Ala Leu Arg Arg Ala Phe Lys Glu Arg Gly Glu Asn Val Gly Ala Trp Arg Gln Ala Cys Tyr Lys Pro Leu Val Ala Ile Ala Ser Arg Gln Gly Trp Asp Ile Asp Ser Ile Phe Asn Ala His Pro Arg Leu Ala Ile Trp Tyr Val Pro Thr Lys Leu Arg Gln Leu Cys Tyr Ala Glu Arg Asn Ser Ala Thr Ser Ser Ser Ser Val Ser Gly Thr Gly Gly His Leu Pro Phe
(2) INFORMATION FOR SEQ ID NO.: 8:
(i) SEQUENCE CHARACTERISTICS
(A) LENGTH: 377
(B) TYPE: amino acid
(C) STRANDEDNESS:
(D) TOPOLOGY:
(ii) MOLECULE TYPE: polypeptide
(vi) ORIGINAL SOURCE:
(A) ORGANISM: Populus balsamifera subsp. trichocarpa
(xi) SEQUENCE DESCRIPTION: SEQ ID NO.: 8:
Met Asp Pro Glu Ala Phe Thr Ala Ser Leu Phe Lys Trp Asp Thr Arg Ala Met Val Pro His Pro Asn Arg Leu Leu Glu Met Val Pro Pro Pro Gln Gln Pro Pro Ala Ala Ala Phe Ala Val Arg Pro Arg Glu Leu Cys Gly Leu Glu Glu Leu Phe Gln Ala Tyr Gly Ile Arg Tyr Tyr Thr Ala Ala Lys Ile Ala Glu Leu Gly Phe Thr Val Asn Thr Leu Leu Asp Met Lys Asp Glu Glu Leu Asp Glu Met Met Asn Ser Leu Ser Gln Ile Phe Arg Trp Asp Leu Leu Val Gly Glu Arg Tyr Gly Ile Lys Ala Ala Val Arg Ala Glu Arg Arg Arg Leu Asp Glu Glu Asp Pro Arg Arg Arg Gln Leu Leu Ser Gly Asp Asn Asn Thr Asn Thr Leu Asp Ala Leu Ser Gln Glu Gly Phe Ser Glu Glu Pro Val Gln Gln Asp Lys Glu Ala Ala Gly Ser Gly Gly Arg Gly Thr Trp Glu Ala Val Ala Ala Gly Glu Arg Lys Lys Gln Ser Gly Arg Lys Lys Gly Gln Arg Lys Val Val Asp Leu Asp Gly Asp Asp Glu His Gly Gly Ala Ile Cys Glu Arg Gln Arg Glu His Pro Phe Ile Val Thr Glu Pro Gly Glu Val Ala Arg Gly Lys Lys Asn Gly Leu Asp Tyr Leu Phe His Leu Tyr Glu Gln Cys Arg Asp Phe Leu Ile Gln Val Gln Ser Ile Ala Lys Glu Arg Gly Glu Lys Cys Pro Thr Lys Val Thr Asn Gln Val Phe Arg Tyr Ala Lys Lys Ala Gly Ala Ser Tyr Ile Asn Lys Pro Lys Met Arg His Tyr Val His Cys Tyr Ala Leu His Cys Leu Asp Glu Asp Ala Ser Asn Ala Leu Arg Arg Ala Phe Lys Glu Arg Gly Glu Asn Val Gly Ala Trp Arg Gln Ala Cys Tyr Lys Pro Leu Val Ala Ile Ala Ser Arg Gln Gly Trp Asp Ile Asp Ser Ile Phe Asn Ala His Pro Arg Leu Ala Ile Trp Tyr Val Pro Thr Lys Leu Arg Gln Leu Cys Tyr Ala Glu Arg Asn Ser Ala Thr Ser Ser Ser Ser Val Ser Gly Thr Gly Gly His Leu Pro Phe
(2) INFORMATION FOR SEQ ID NO.: 9:
(i) SEQUENCE CHARACTERISTICS
(A) LENGTH: 11485
(B) TYPE: nucleic acid
(C) STRANDEDNESS:
(D) TOPOLOGY:
(ii) MOLECULE TYPE: DNA
(vi) ORIGINAL SOURCE:
(A) ORGANISM: Populus balsamifera subsp. trichocarpa
(xi) SEQUENCE DESCRIPTION: SEQ ID NO.: 9:

(2) INFORMATION FOR SEQ ID NO.: 10:

(i) SEQUENCE CHARACTERISTICS
(A) LENGTH:
(B) TYPE: nucleic acid
(C) STRANDEDNESS:
(D) TOPOLOGY:
(ii) MOLECULE TYPE: DNA
(vi) ORIGINAL SOURCE:
(A) ORGANISM: Populus balsamifera subsp. trichocarpa
(ix) FEATURE
(A) NAME/KEY: CDS
(B) LOCATION: (196)..(921)
(xi) SEQUENCE DESCRIPTION: SEQ ID NO.: 10:

Met Glu Tyr Gln Asn Glu Ser Leu Glu Ser Ser Pro Leu Arg Lys Leu Gly Arg Gly Lys Val Glu Ile Lys Arg Ile Glu Asn Thr Thr Asn Arg Gln Val Thr Phe Cys Lys Arg Arg Ser Gly Leu Leu Lys Lys Ala Tyr Glu Leu Ser Val Leu Cys Asp Ala Glu Val Ala Leu Ile Val Phe Ser Ser Arg Gly Arg Leu Tyr Glu Tyr Ser Asn Asp Ser Val Lys Ser Thr Ile Glu Arg Tyr Lys Lys Ala Ser Ala Asp Ser Ser Asn Thr Gly Ser Val Ser Glu Ala Asn Ala Gln Tyr Tyr Gln Gln Glu Ala Ala Lys Leu Arg Ser Gln Ile Gly Asn Leu Gln Asn Ser Asn Arg His Met Leu Gly Glu Ala Leu Ser Ser Leu Ser Val Lys Glu Leu Lys Ser Leu Glu Ile Arg Leu Glu Lys Gly Ile Ser Arg Ile Arg Ser Lys Lys Asn Glu Leu Leu Phe Ala Glu Ile Glu Tyr Met Gln Lys Arg Glu Val Asp Leu His Asn Asn Asn Gln Leu Leu Arg Ala Lys Ile Ser Glu Asn Glu Arg Lys Arg Gln Ser Met Asn Leu Met Pro Gly Gly Ala Asp Phe Glu Ile Val Gln Ser Gln Pro Tyr Asp Ser Arg Asn Tyr Ser Gln Val Asn Gly Leu Gln Pro Ala Ser His Tyr Ser His Gln Asp Gln Met Ala Leu Gln Leu Val
(2) INFORMATION FOR SEQ ID NO.: 11:
(i) SEQUENCE CHARACTERISTICS
(A) LENGTH: 723
(B) TYPE: nucleic acid
(C) STRANDEDNESS:
(D) TOPOLOGY:
(ii) MOLECULE TYPE: DNA
(vi) ORIGINAL SOURCE:
(A) ORGANISM: Populus balsamifera subsp. trichocarpa
(ix) FEATURE
(A) NAME/KEY: CDS
(B) LOCATION: (1)..(723)
(xi) SEQUENCE DESCRIPTION: SEQ ID NO.: 11:

Met Glu Tyr Gln Asn Glu Ser Leu Glu Ser Ser Pro Leu Arg Lys Leu Gly Arg Gly Lys Val Glu Ile Lys Arg Ile Glu Asn Thr Thr Asn Arg Gln Val Thr Phe Cys Lys Arg Arg Ser Gly Leu Leu Lys Lys Ala Tyr Glu Leu Ser Val Leu Cys Asp Ala Glu Val Ala Leu Ile Val Phe Ser Ser Arg Gly Arg Leu Tyr Glu Tyr Ser Asn Asp Ser Val Lys Ser Thr Ile Glu Arg Tyr Lys Lys Ala Ser Ala Asp Ser Ser Asn Thr Gly Ser Val Ser Glu Ala Asn Ala Gln Tyr Tyr Gln Gln Glu Ala Ala Lys Leu Arg Ser Gln Ile Gly Asn Leu Gln Asn Ser Asn Arg His Met Leu Gly Glu Ala Leu Ser Ser Leu Ser Val Lys Glu Leu Lys Ser Leu Glu Ile Arg Leu Glu Lys Gly Ile Ser Arg Ile Arg Ser Lys Lys Asn Glu Leu Leu Phe Ala Glu Ile Glu Tyr Met Gln Lys Arg Glu Val Asp Leu His Asn Asn Asn Gln Leu Leu Arg Ala Lys Ile Ser Glu Asn Glu Arg Lys Arg Gln Ser Met Asn Leu Met Pro Gly Gly Ala Asp Phe Glu Ile Val Gln Ser Gln Pro Tyr Asp Ser Arg Asn Tyr Ser Gln Val Asn Gly Leu Gln Pro Ala Ser His Tyr Ser His Gln Asp Gln Met Ala Leu Gln Leu Val
(2) INFORMATION FOR SEQ ID NO.: 12:
(i) SEQUENCE CHARACTERISTICS
(A) LENGTH: 241
(B) TYPE: amino acid
(C) STRANDEDNESS:
(D) TOPOLOGY:
(ii) MOLECULE TYPE: polypeptide
(vi) ORIGINAL SOURCE:
(A) ORGANISM: Populus balsamifera subsp. trichocarpa
(xi) SEQUENCE DESCRIPTION: SEQ ID NO.: 12:
Met Glu Tyr Gln Asn Glu Ser Leu Glu Ser Ser Pro Leu Arg Lys Leu Gly Arg Gly Lys Val Glu Ile Lys Arg Ile Glu Asn Thr Thr Asn Arg Gln Val Thr Phe Cys Lys Arg Arg Ser Gly Leu Leu Lys Lys Ala Tyr Glu Leu Ser Val Leu Cys Asp Ala Glu Val Ala Leu Ile Val Phe Ser Ser Arg Gly Arg Leu Tyr Glu Tyr Ser Asn Asp Ser Val Lys Ser Thr Ile Glu Arg Tyr Lys Lys Ala Ser Ala Asp Ser Ser Asn Thr Gly Ser Val Ser Glu Ala Asn Ala Gln Tyr Tyr Gln Gln Glu Ala Ala Lys Leu Arg Ser Gln Ile Gly Asn Leu Gln Asn Ser Asn Arg His Met Leu Gly Glu Ala Leu Ser Ser Leu Ser Val Lys Glu Leu Lys Ser Leu Glu Ile Arg Leu Glu Lys Gly Ile Ser Arg Ile Arg Ser Lys Lys Asn Glu Leu Leu Phe Ala Glu Ile Glu Tyr Met Gln Lys Arg Glu Val Asp Leu His Asn Asn Asn Gln Leu Leu Arg Ala Lys Ile Ser Glu Asn Glu Arg Lys Arg Gln Ser Met Asn Leu Met Pro Gly Gly Ala Asp Phe Glu Ile Val Gln Ser Gln Pro Tyr Asp Ser Arg Asn Tyr Ser Gln Val Asn Gly Leu Gln Pro Ala Ser His Tyr Ser His Gln Asp Gln Met Ala Leu Gln Leu Val
(2) INFORMATION FOR SEQ ID NO.: 13:

(i) SEQUENCE CHARACTERISTICS
(A) LENGTH:
(B) TYPE: nucleic acid
(C) STRANDEDNESS:
(D) TOPOLOGY:
(ii) MOLECULE TYPE: DNA
(vi) ORIGINAL SOURCE:
(A) ORGANISM: Populus balsamifera subsp. trichocarpa
(xi) SEQUENCE DESCRIPTION: SEQ ID NO.: 13:

AGTCCCTTTC
ATTGAGGTCT CAGTCTTCCT ATAGCGTATT CTCTAATTAA TTCCAAGATA AAAAAAAAAA 2160
AAAAAGTACA AATGATTTAA

(2) INFORMATION FOR SEQ ID NO.: 14:
(i) SEQUENCE CHARACTERISTICS
(A) LENGTH: 1159
(B) TYPE: nucleic acid
(C) STRANDEDNESS:
(D) TOPOLOGY:
(ii) MOLECULE TYPE: DNA
(vi) ORIGINAL SOURCE:
(A) ORGANISM: Populus balsamifera subsp. trichocarpa
(ix) FEATURE
(A) NAME/KEY: CDS
(B) LOCATION: (99)..(815)
(xi) SEQUENCE DESCRIPTION: SEQ ID NO.: 14:

Met Ala Tyr Gln Asn Glu Pro Gln Glu Ser Ser Pro Leu Arg Lys Leu Gly Arg Gly Lys Val Glu Ile Lys Arg Ile Glu Asn Thr Thr Asn Arg Gln Val Thr Phe Cys Lys Arg Arg Asn Gly Leu Leu Lys Lys Ala Tyr Glu Leu Ser Val Leu Cys Asp Ala Glu Val Ala Leu Ile Val Phe Ser Ser Arg Gly Arg Leu Tyr Glu Tyr Ser Asn Asn Ser Val Lys Ser Thr Ile Glu Arg Tyr Lys Lys Ala Cys Ala Asp Ser Ser Asn Asn Gly Ser Val Ser Glu Ala Asn Ala Gln Phe Tyr Gln Gln Glu Ala Ala Lys Leu Arg Ser Gln Ile Gly Asn Leu Gln Asn Ser Asn Arg Asn Met Leu Gly Glu Ser Leu Ser Ala Leu Ser Val Lys Glu Leu Lys Ser Leu Glu Ile Lys Leu Glu Lys Gly Ile Gly Arg Ile Arg Ser Lys Lys Asn Glu Leu Leu Phe Ala Glu Ile Glu Tyr Met Gln Lys Arg Glu Ile Asp Leu His Asn Asn Asn Gln Leu Leu Arg Ala Lys Ile Ala Glu Asn Glu Arg Lys Arg Gln His Met Asn Leu Met Pro Gly Gly Val Asn Phe Glu Ile Met Gln Ser Gln Pro Phe Asp Ser Arg Asn Tyr Ser Gln Val Asn Gly Leu Pro Pro Ala Asn His Tyr Pro His Glu Asp Gln Leu Phe Ser
TGACTAACTT ATTATATATT TTGTCTTATA TTTCTTAAAA AAAAAAAAAA AAAAAAAAAA 1135
(2) INFORMATION FOR SEQ ID NO.: 15:
(i) SEQUENCE CHARACTERISTICS
(A) LENGTH: 714
(B) TYPE: nucleic acid
(C) STRANDEDNESS:
(D) TOPOLOGY:
(ii) MOLECULE TYPE: DNA
(vi) ORIGINAL SOURCE:
(A) ORGANISM: Populus balsamifera subsp. trichocarpa
(ix) FEATURE
(A) NAME/KEY: CDS
(B) LOCATION: (1)..(714)
(xi) SEQUENCE DESCRIPTION: SEQ ID NO.: 15:

Met Ala Tyr Gln Asn Glu Pro Gln Glu Ser Ser Pro Leu Arg Lys Leu Gly Arg Gly Lys Val Glu Ile Lys Arg Ile Glu Asn Thr Thr Asn Arg Gln Val Thr Phe Cys Lys Arg Arg Asn Gly Leu Leu Lys Lys Ala Tyr Glu Leu Ser Val Leu Cys Asp Ala Glu Val Ala Leu Ile Val Phe Ser Ser Arg Gly Arg Leu Tyr Glu Tyr Ser Asn Asn Ser Val Lys Ser Thr Ile Glu Arg Tyr Lys Lys Ala Cys Ala Asp Ser Ser Asn Asn Gly Ser Val Ser Glu Ala Asn Ala Gln Phe Tyr Gln Gln Glu Ala Ala Lys Leu Arg Ser Gln Ile Gly Asn Leu Gln Asn Ser Asn Arg Asn Met Leu Gly Glu Ser Leu Ser Ala Leu Ser Val Lys Glu Leu Lys Ser Leu Glu Ile Lys Leu Glu Lys Gly Ile Gly Arg Ile Arg Ser Lys Lys Asn Glu Leu Leu Phe Ala Glu Ile Glu Tyr Met Gln Lys Arg Glu Ile Asp Leu His Asn Asn Asn Gln Leu Leu Arg Ala Lys Ile Ala Glu Asn Glu Arg Lys Arg Gln His Met Asn Leu Met Pro Gly Gly Val Asn Phe Glu Ile Met Gln Ser Gln Pro Phe Asp Ser Arg Asn Tyr Ser Gln Val Asn Gly Leu Pro Pro Ala Asn His Tyr Pro His Glu Asp Gln Leu Phe Ser
(2) INFORMATION FOR SEQ ID NO.: 16:
(i) SEQUENCE CHARACTERISTICS
(A) LENGTH: 238
(B) TYPE: amino acid
(C) STRANDEDNESS:
(D) TOPOLOGY:
(ii) MOLECULE TYPE: polypeptide
(vi) ORIGINAL SOURCE:
(A) ORGANISM: Populus balsamifera subsp. trichocarpa
(xi) SEQUENCE DESCRIPTION: SEQ ID NO.: 16:
Met Ala Tyr Gln Asn Glu Pro Gln Glu Ser Ser Pro Leu Arg Lys Leu Gly Arg Gly Lys Val Glu Ile Lys Arg Ile Glu Asn Thr Thr Asn Arg Gln Val Thr Phe Cys Lys Arg Arg Asn Gly Leu Leu Lys Lys Ala Tyr Glu Leu Ser Val Leu Cys Asp Ala Glu Val Ala Leu Ile Val Phe Ser Ser Arg Gly Arg Leu Tyr Glu Tyr Ser Asn Asn Ser Val Lys Ser Thr Ile Glu Arg Tyr Lys Lys Ala Cys Ala Asp Ser Ser Asn Asn Gly Ser Val Ser Glu Ala Asn Ala Gln Phe Tyr Gln Gln Glu Ala Ala Lys Leu Arg Ser Gln Ile Gly Asn Leu Gln Asn Ser Asn Arg Asn Met Leu Gly Glu Ser Leu Ser Ala Leu Ser Val Lys Glu Leu Lys Ser Leu Glu Ile Lys Leu Glu Lys Gly Ile Gly Arg Ile Arg Ser Lys Lys Asn Glu Leu Leu Phe Ala Glu Ile Glu Tyr Met Gln Lys Arg Glu Ile Asp Leu His Asn Asn Asn Gln Leu Leu Arg Ala Lys Ile Ala Glu Asn Glu Arg Lys Arg Gln His Met Asn Leu Met Pro Gly Gly Val Asn Phe Glu Ile Met Gln Ser Gln Pro Phe Asp Ser Arg Asn Tyr Ser Gln Val Asn Gly Leu Pro Pro Ala Asn His Tyr Pro His Glu Asp Gln Leu Phe Ser
(2) INFORMATION FOR SEQ ID NO.: 17:
(i) SEQUENCE CHARACTERISTICS
(A) LENGTH: 27
(B) TYPE: nucleic acid
(C) STRANDEDNESS:
(D) TOPOLOGY:
(ii) MOLECULE TYPE: DNA
(vi) ORIGINAL SOURCE:
(A) ORGANISM: Artificial Sequence
(ix) FEATURE
(C) OTHER INFORMATION: Description of Artificial Sequence: oligonucleotide primer
(xi) SEQUENCE DESCRIPTION: SEQ ID NO.: 17:

(2) INFORMATION FOR SEQ ID NO.: 18:
(i) SEQUENCE CHARACTERISTICS
(A) LENGTH: 27
(B) TYPE: nucleic acid
(C) STRANDEDNESS:
(D) TOPOLOGY:
(ii) MOLECULE TYPE: DNA
(vi) ORIGINAL SOURCE:
(A) ORGANISM: Artificial Sequence
(ix) FEATURE
(C) OTHER INFORMATION: Description of Artificial Sequence: oligonucleotide primer
(xi) SEQUENCE DESCRIPTION: SEQ ID NO.: 18:

(2) INFORMATION FOR SEQ ID NO.: 19:
(i) SEQUENCE CHARACTERISTICS
(A) LENGTH: 27
(B) TYPE: nucleic acid
(C) STRANDEDNESS:
(D) TOPOLOGY:
(ii) MOLECULE TYPE: DNA
(vi) ORIGINAL SOURCE:
(A) ORGANISM: Artificial Sequence
(ix) FEATURE
(C) OTHER INFORMATION: Description of Artificial Sequence: oligonucleotide primer
(xi) SEQUENCE DESCRIPTION: SEQ ID NO.: 19:

(2) INFORMATION FOR SEQ ID NO.: 20:
(i) SEQUENCE CHARACTERISTICS
(A) LENGTH: 27
(B) TYPE: nucleic acid
(C) STRANDEDNESS:
(D) TOPOLOGY:
(ii) MOLECULE TYPE: DNA
(vi) ORIGINAL SOURCE:
(A) ORGANISM: Artificial Sequence
(ix) FEATURE
(C) OTHER INFORMATION: Description of Artificial Sequence: oligonucleotide primer
(xi) SEQUENCE DESCRIPTION: SEQ ID NO.: 20:

(2) INFORMATION FOR SEQ ID NO.: 21:
(i) SEQUENCE CHARACTERISTICS
(A) LENGTH: 27
(B) TYPE: nucleic acid
(C) STRANDEDNESS:
(D) TOPOLOGY:
(ii) MOLECULE TYPE: DNA
(vi) ORIGINAL SOURCE:
(A) ORGANISM: Artificial Sequence
(ix) FEATURE
(C) OTHER INFORMATION: Description of Artificial Sequence: oligonucleotide primer
(xi) SEQUENCE DESCRIPTION: SEQ ID NO.: 21:

(2) INFORMATION FOR SEQ ID NO.: 22:
(i) SEQUENCE CHARACTERISTICS
(A) LENGTH: 27
(B) TYPE: nucleic acid
(C) STRANDEDNESS:
(D) TOPOLOGY:
(ii) MOLECULE TYPE: DNA
(vi) ORIGINAL SOURCE:
(A) ORGANISM: Artificial Sequence
(ix) FEATURE
(C) OTHER INFORMATION: Description of Artificial Sequence: oligonucleotide primer
(xi) SEQUENCE DESCRIPTION: SEQ ID NO.: 22:

(2) INFORMATION FOR SEQ ID NO.: 23:
(i) SEQUENCE CHARACTERISTICS
(A) LENGTH: 27
(B) TYPE: nucleic acid
(C) STRANDEDNESS:
(D) TOPOLOGY:
(ii) MOLECULE TYPE: DNA
(vi) ORIGINAL SOURCE:
(A) ORGANISM: Artificial Sequence
(ix) FEATURE
(C) OTHER INFORMATION: Description of Artificial Sequence: oligonucleotide primer
(xi) SEQUENCE DESCRIPTION: SEQ ID NO.: 23:

(2) INFORMATION FOR SEQ ID NO.: 24:
(i) SEQUENCE CHARACTERISTICS
(A) LENGTH: 29
(B) TYPE: nucleic acid
(C) STRANDEDNESS:
(D) TOPOLOGY:
(ii) MOLECULE TYPE: DNA
(vi) ORIGINAL SOURCE:
(A) ORGANISM: Artificial Sequence
(ix) FEATURE
(C) OTHER INFORMATION: Description of Artificial Sequence: oligonucleotide primer
(xi) SEQUENCE DESCRIPTION: SEQ ID NO.: 24:

Claims

1. An isolated nucleic acid molecule comprising at least 15 consecutive nucleotides of a nucleic acid sequence selected from the group consisting of Seq. I.D. Nos. 1, 2, 3, 5, 6, 7, 9, 10, 11, 13, 14 and 15.

2. An isolated nucleic acid molecule according to claim 1 wherein the nucleic acid molecule includes at least 25 consecutive nucleotides of the specified nucleic acid sequence.

3. An isolated nucleic acid molecule according to claim 1 wherein the nucleic acid molecule includes at least 50 consecutive nucleotides of the specified nucleic acid sequence.

4. A recombinant nucleic acid molecule comprising a promoter sequence operably linked to a nucleic acid molecule according to claim 1.

5. A recombinant nucleic acid molecule according to claim 4 wherein the nucleic acid molecule is arranged in antisense orientation relative to the promoter.

6. A cell transformed with a recombinant nucleic acid molecule according to claim 4.

7. A cell transformed with a recombinant nucleic acid molecule according to claim 5.

8. A transgenic plant comprising a recombinant nucleic acid molecule according to claim 4.

9. A transgenic plant comprising a recombinant nucleic acid molecule according to

10. A transgenic plant according to claim 8 wherein the activity of at least one endogenous gene in the plant is modified as a result of the presence of the recombinant nucleic acid molecule.

11. A transgenic plant according to claim 10 wherein the plant is a Populus species and the affected endogenous gene is selected from the group consisting of PTD, PTLF, PTAG-1 and PTAG-2.

12. A transgenic plant according to claim 10 wherein the plant has a modified phenotype relative to non-transgenic plants of the same species.

13. A transgenic plant according to claim 12 wherein the modified phenotype is a modified fertility phenotype.

14. A transgenic plant comprising a recombinant nucleic acid molecule, wherein the recombinant nucleic acid molecule comprises a promoter sequence operably linked to a first nucleic acid sequence, and wherein the promoter sequence is a promoter sequence from PTD, PTLF, PTAG-1 or PTAG-2.

15. A transgenic plant according to claim 14 wherein the first nucleic acid sequence encodes a cytotoxic polypeptide.

16. A transgenic plant according to claim 14 wherein the plant is a Populus species.

17. An isolated nucleic acid molecule comprising a nucleotide sequence of at least 50 nucleotides in length wherein said molecule shares at least 75% sequence identity with a nucleic acid selected from the group consisting of Seq. I.D. Nos. 1, 2, 3, 5, 6, 7, 9, 10, 11, 13, 14 and 15.

18. An isolated nucleic acid molecule according to claim 17 wherein the molecule comprises a nucleotide sequence of at least 100 nucleotides in length and wherein said molecule shares at least 90% sequence identity with a nucleic acid selected from the group consisting of Seq. I.D. Nos. 1, 2, 3, 5, 6, 7, 9, 10, 11, 13, 14 and 15.

19. A recombinant nucleic acid molecule comprising a promoter sequence operably linked to a nucleic acid molecule according to claim 17.

20. A recombinant nucleic acid molecule according to claim 19 wherein the nucleic acid molecule is arranged in antisense orientation relative to the promoter.

21. A cell transformed with a recombinant nucleic acid molecule according to claim 19.

22. A transgenic plant comprising a recombinant nucleic acid molecule according to claim 19.

23. A purified protein having an amino acid sequence selected from the group consisting of:
(a) Seq. I.D. No. 4;
(b) Seq. I.D. No. 8;
(c) Seq. I.D. No. 12;
(d) Seq. I.D. No. 16; and
(e) sequences that differ from (a)-(d) by one or more conservative amino acid substitutions.

24. An isolated nucleic acid molecule encoding a protein according to claim 23.

25. An isolated nucleic acid molecule according to claim 24 wherein the nucleic acid molecule comprises a sequence selected from the group consisting of Seq. I.D. Nos. 1, 2, 3, 5, 6, 7, 9, 10, 11, 13, 14 and 15.
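
By way of illustration only, the thresholds recited in claims 1 and 17 (a run of at least 15 consecutive nucleotides; at least 75% identity over at least 50 nucleotides) and the antisense orientation of claims 5 and 20 can be sketched in Python. The sketch assumes an ungapped, fixed-window comparison, which is a simplification and is not presented as the alignment method of the specification. The function names and the toy sequences are hypothetical and are not taken from the sequence listing.

```python
# Illustrative sketch only: function names and toy sequences are hypothetical.
# Identity is computed over ungapped windows, a simplifying assumption.

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement, i.e. the sequence read in antisense orientation."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def shares_consecutive_run(candidate: str, reference: str, min_len: int = 15) -> bool:
    """True if candidate and reference share an identical run of min_len nucleotides."""
    candidate, reference = candidate.upper(), reference.upper()
    if len(candidate) < min_len or len(reference) < min_len:
        return False
    # Index every min_len-long substring of the reference for fast lookup.
    ref_kmers = {reference[i:i + min_len] for i in range(len(reference) - min_len + 1)}
    return any(
        candidate[i:i + min_len] in ref_kmers
        for i in range(len(candidate) - min_len + 1)
    )


def best_ungapped_identity(candidate: str, reference: str, window: int = 50) -> float:
    """Highest percent identity between any window-long stretch of candidate
    and any window-long stretch of reference, compared position by position."""
    candidate, reference = candidate.upper(), reference.upper()
    best = 0.0
    for i in range(len(candidate) - window + 1):
        cand_win = candidate[i:i + window]
        for j in range(len(reference) - window + 1):
            ref_win = reference[j:j + window]
            matches = sum(a == b for a, b in zip(cand_win, ref_win))
            best = max(best, 100.0 * matches / window)
    return best


# Toy data (hypothetical, not from the sequence listing).
toy_reference = "ACGTTGCA" * 10           # 80 nt
toy_fragment = toy_reference[10:30]       # 20 nt taken directly from the reference

# A 50 nt variant carrying five point changes relative to the reference.
swap = {"A": "C", "C": "A", "G": "T", "T": "G"}
toy_variant = "".join(
    swap[base] if pos % 10 == 0 else base
    for pos, base in enumerate(toy_reference[:50])
)

if __name__ == "__main__":
    print(shares_consecutive_run(toy_fragment, toy_reference, min_len=15))
    print(best_ungapped_identity(toy_variant, toy_reference, window=50) >= 75.0)
    print(reverse_complement(toy_fragment))
```

A gapped local alignment would generally be used in practice to score identity between real sequences; the exhaustive window scan above is quadratic in sequence length and is intended only to make the numeric thresholds concrete.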