WO2010002276A1

WO2010002276A1 - Compositions and methods for improving trees

Info

Publication number: WO2010002276A1
Application number: PCT/NZ2009/000127
Authority: WO
Inventors: Sheree Alice Cato
Original assignee: New Zealand Forest Research Institute Limited
Priority date: 2008-06-30
Filing date: 2009-06-30
Publication date: 2010-01-07

Abstract

The invention provides a method for identifying a tree with a genotype indicative of increased diameter growth, the method including detecting in the tree, or a sample derived from the tree, by direct or indirect methods, the presence of an allele of the dehydrin gene that includes an additional serine (S) residue in the casein kinase II (CKII)-hydrophilic charged domain of the dehydrin gene, relative to more commonly identified alleles, such as those disclosed herein. The invention also provides the isolated polynucleotides of such alleles, constructs, host cells, plant cells and plants comprising such polynucleotides. The invention also provides methods for producing plants with increased diameter growth, using the polynucleotides of the invention, and plants produced by the methods.

Description

COMPOSITIONS AND METHODS FOR IMPROVING TREES

FIELD OF THE INVENTION

The present invention relates to methods and compositions for identifying or producing trees with increased diameter growth.

BACKGROUND

The diameter growth of trees is important to the forestry industry. Increased diameter growth would provide increased yield or biomass per unit time or area. Increased yield or biomass is beneficial in biomaterial, biofuel/bioenergy, solid wood, pulp and paper applications. It is therefore of significant interest and value to the forestry industry to adopt breeding strategies aimed at developing trees with increased diameter growth.

It is possible to measure diameter growth in trees, and to select trees with relatively increased diameter growth for use as parents in breeding programs designed to produce offspring with increased growth rate. However, trees may need to reach a relatively mature growth stage (i.e. 8 to 10 years of age) before useful diameter growth data can be collected.

Marker assisted selection (MAS) is an approach that is often used to identify plants or animals with an alteration in a particular trait using a genetic marker associated with the trait. The alteration in the trait may be desirable and be advantageously selected for, or non-desirable and advantageously selected against, in selective breeding programs. MAS allows breeders to identify and select plants or animals at a relatively immature growth stage, and is particularly valuable for traits that are not revealed until the plant or animal reaches advanced maturity. The best markers for MAS are the causal polymorphisms or mutations, but where these are not available, markers that are linked, and preferably in linkage disequilibrium, with the causal mutation can also be used. Such information can be used to accelerate genetic gain, increase selection intensity, and/or reduce trait measurement costs, and thereby has utility in commercial breeding programmes. Applying such approaches to diameter growth in trees of course requires the availability of markers linked to the diameter growth trait. It would therefore be beneficial to have available markers that could be used to identify trees with increased diameter growth.

Advances in genetic manipulation provide the tools to transform plants, including trees, to contain and express foreign genes. This has led to the development of plants capable of expressing pharmaceuticals and other chemicals, plants with increased pest resistance, increased stress tolerance and many other beneficial traits. To use such approaches for increasing diameter growth in trees, it is necessary to identify genes that can influence diameter growth when introduced into trees by the genetic manipulation techniques.

It is an object of the invention to provide methods and compositions for identifying or producing trees with increased diameter growth, and/or at least to provide the public with a useful choice.

SUMMARY OF THE INVENTION

The present invention results from the applicants' discovery that particular alleles of the dehydrin gene, when present in the homozygous state, are associated with increased diameter growth. The alleles encode dehydrin proteins that include an additional serine in a casein kinase II-hydrophilic charged domain, relative to proteins of more commonly identified alleles. Specifically, in more commonly identified alleles, two serine residues are present at amino acid positions 161 and 162 within the casein kinase II-hydrophilic charged domain of the dehydrin proteins. In the allele responsible, when in the homozygous state, for the increased diameter growth rate phenotype, an additional serine is present such that three serines are present at amino acid positions 161, 162 and 163. Alleles with the additional serine residue have been designated S+ alleles by the applicants.

The invention provides methods for identifying and selecting trees with genotypes indicative of increased diameter growth based on detection of the presence of the S+ alleles, preferably in the homozygous state. The invention also provides transgenic methods for producing trees with increased diameter growth by manipulating expression of S+ dehydrin alleles in trees.

In the first aspect the invention provides a method for identifying a tree with a genotype indicative of increased diameter growth, the method including detecting in the tree, or a sample derived from the tree, by direct or indirect methods, the presence of an allele of the dehydrin gene that includes an additional serine (S) residue in the casein kinase II (CKII)-hydrophilic charged domain of the dehydrin gene, relative to more commonly identified alleles, such as those disclosed herein.

In one embodiment the casein kinase II-hydrophilic charged domain comprises the sequence motif: SnD[E/D]CG[G/V]KEEKK (SEQ ID NO:69), where n=2-3.

Preferably the domain consists of the sequence motif SnD[E/D]CG[G/V]KEEKK (SEQ ID NO:69), where n=2-3.
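The motif above can be read as a pattern: a run of two or three serines, followed by D, then E or D, then C, G, then G or V, then KEEKK. The following is a minimal sketch, in Python, of how such a pattern might be searched for in a protein sequence and the number of serines counted. The function names and the short test sequences are hypothetical and for illustration only; they are not sequences of the invention. The lookbehind that anchors the start of the serine run is an added assumption, so that a run longer than three serines is not reported as a match.

```python
import re

# Motif from the text: S(n) D [E/D] C G [G/V] K E E K K, with n = 2-3.
# Assumption: the serine run must not be preceded by another serine, so
# the captured group reflects the full length of the run.
CKII_MOTIF = re.compile(r"(?<!S)(S{2,3})D[ED]CG[GV]KEEKK")


def serine_count(protein):
    """Return the number of serines heading the motif, or None if no motif is found."""
    match = CKII_MOTIF.search(protein.upper())
    return len(match.group(1)) if match else None


def looks_like_s_plus(protein):
    """True when the motif is present with three serines (n = 3)."""
    return serine_count(protein) == 3


if __name__ == "__main__":
    # Hypothetical toy sequences, not taken from the sequence listing.
    print(serine_count("MAAASSDECGGKEEKKLL"))
    print(looks_like_s_plus("MAAASSSDDCGVKEEKKLL"))
```

A sequence in which the motif is absent, or in which the serine run falls outside n=2-3, returns None rather than a count.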

In a preferred embodiment, two copies of the allele are detected, that is the method detects the presence of the alleles in the homozygous state.

Serine (S) + dehydrin allele polypeptides

The expressed proteins of commonly identified dehydrin alleles include serine (S) residues at amino acid positions 161 and 162, but not amino acid position 163 in the dehydrin protein. Examples of such more commonly identified alleles are shown in the sequences of SEQ ID NO:1-21 and 28-34.

In one embodiment of the method of the invention the polypeptide of the allele detected is characterised by the presence of an additional serine (S) residue such that three serine (S) residues are present at amino acid positions 161, 162 and 163 in the dehydrin protein.

Preferably the polypeptide of the dehydrin allele detected, comprises a sequence with at least 70% identity to the polypeptide sequence of any one of SEQ ID NO:22-27.

Preferably the polypeptide of the dehydrin allele detected comprises the sequence of any one of SEQ ID NO:22-27.

Serine (S) + dehydrin allele polynucleotides

The polynucleotides of commonly identified dehydrin alleles encode polypeptides with serine (S) residues at amino acid positions 161 and 162, but not at amino acid position 163, in the dehydrin protein. Examples of the polynucleotide sequence of such more commonly identified alleles are shown in the sequences of SEQ ID NO:35-55 and 62-68.

In one embodiment of the method of the invention the polynucleotide of the dehydrin allele detected comprises a sequence with at least 70% identity to the polynucleotide sequence of any one of SEQ ID NO: 56-61.

Preferably the polynucleotide contains codons encoding serine at nucleotide positions 481-483, 484-486 and 487-489. Preferably the codon at nucleotide position 484-486 encodes the additional serine. Preferably the codon at nucleotide position 484-486 is TCT.
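The nucleotide positions given above correspond to amino acid positions 161, 162 and 163 when the coding sequence is numbered from the first nucleotide of the first codon. The following Python sketch illustrates that arithmetic and a simple check for serine codons. It assumes an in-frame coding sequence with 1-based numbering; the function names and the toy sequence are hypothetical and are not taken from the sequence listing.

```python
# Standard genetic code: the six codons that encode serine.
SERINE_CODONS = {"TCT", "TCC", "TCA", "TCG", "AGT", "AGC"}


def codon_span(aa_position):
    """1-based, inclusive nucleotide span of the codon for a 1-based amino acid position."""
    start = 3 * (aa_position - 1) + 1
    return start, start + 2


def codon_at(cds, aa_position):
    """Return the codon of an in-frame coding sequence at the given amino acid position."""
    start, end = codon_span(aa_position)
    return cds[start - 1:end].upper()


def has_three_serines(cds, first_position=161):
    """True when three consecutive codons starting at first_position all encode serine."""
    return all(
        codon_at(cds, first_position + offset) in SERINE_CODONS
        for offset in range(3)
    )


if __name__ == "__main__":
    for position in (161, 162, 163):
        print(position, codon_span(position))
```

Under these assumptions, amino acid position 162 maps to nucleotide positions 484-486, which is where the text places the preferred TCT codon.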

Preferably the polynucleotide of the dehydrin allele detected comprises the polynucleotide sequence of any one of SEQ ID NO: 56-61.

Preferably presence of the homozygous pair of alleles with the additional serine (S+ alleles) is in LD with the increased diameter growth trait.

More preferably presence of the homozygous pair of alleles is in LD with the increased diameter growth trait at a D' value of at least 0.1, more preferably at least 0.2, more preferably at least 0.3, more preferably at least 0.4, more preferably at least 0.5.

More preferably the presence of the homozygous pair of alleles is in LD with the increased diameter growth trait at a R² value of at least 0.05, more preferably at least 0.075, more preferably at least 0.1, more preferably at least 0.2, more preferably at least 0.3, more preferably at least 0.4, more preferably at least 0.5.

Presence of either or both of the homozygous pair of alleles may be detected directly, or may be detected indirectly by detecting a marker that is linked to the specified allele.

Preferably the marker is in linkage disequilibrium (LD) with the S+ allele. That is, the marker is in LD with the additional serine in the polypeptide of the S+ allele, or in LD with the additional codon encoding serine in the polynucleotide of the S+ allele.

Preferably the marker is in LD with the S+ allele at a D' value of at least 0.1, more preferably at least 0.2, more preferably at least 0.3, more preferably at least 0.4, more preferably at least 0.5.

Preferably the marker is in LD with the S+ allele at a R² value of at least 0.05, more preferably at least 0.075, more preferably at least 0.1, more preferably at least 0.2, more preferably at least 0.3, more preferably at least 0.4, more preferably at least 0.5.

Table 1 shows a list of markers that are in linkage disequilibrium with the allele, or characteristic serine or codon encoding serine in the casein kinase II (CKII)-hydrophilic charged domain of the S+ dehydrin protein.

Table 1

It will be appreciated by those skilled in the art that the protein isoforms are encoded by corresponding nucleic acid alleles. Thus the method of the invention can be applied by detecting the presence of the specified nucleotides in the polynucleotides encoding the allelic polypeptides, or by detecting the presence of the specified amino acids in the encoded allelic polypeptides.

In a preferred embodiment the invention provides a method for identifying a tree with a genotype indicative of increased radial growth, the method comprising detection of: a) serine at amino acid position 163 of a dehydrin protein with at least 70% identity to the sequence of any one of SEQ ID NO: 22-27; b) an amino acid marker in linkage disequilibrium with the serine in a); c) the TCT codon at nucleotide position 484-486 in the dehydrin encoding polynucleotide of any one of SEQ ID NO: 56-61; or d) a nucleotide marker in linkage disequilibrium with the TCT codon in c).

Preferably the dehydrin protein in a) comprises the sequence of any one of SEQ ID NO:22-27.

It will be appreciated that although the additional serine in the protein of the S+ allele is at position 162, there is also a serine at position 162 in the more commonly occurring S- alleles. Therefore presence of a serine residue at 163 can be diagnostic of the S+ allele.

Preferably the dehydrin encoding polynucleotide in c) comprises the sequence of any one of SEQ ID NO: 56-61.

The nucleic acid alleles, or linked nucleic acid markers, may be detected by any suitable method. Preferably the alleles or markers are detected using a polymerase chain reaction (PCR) step. PCR methods are well known to those skilled in the art and are described for example in Mullis et al., Eds. 1994 The Polymerase Chain Reaction, Birkhauser, incorporated herein by reference.

Preferably a PCR product is produced by amplifying the marker with primers comprising sequence complementary to sequence of the tree genome flanking the polymorphism or marker. Any suitable primer pair may be used. Preferably the PCR is performed using at least one primer selected from those set forth in Table 2. Preferably the PCR is performed using at least one primer pair selected from those set forth in Table 2.

Table 2: Exemplary primers for amplifying PCR products comprising the allele specific nucleotides of the invention

In one embodiment presence of the nucleotide characteristic of a specific allele, or marker, is identified by assessing the size of the PCR product amplified. Size may be estimated by running the PCR product through an agarose or acrylamide gel. Preferably a size standard is also run in the gel for comparison with the PCR product. Alternatively, if one of the primers is fluorescently labelled, the size of the PCR product may also be assessed by electrophoresis through denaturing polymer on a DNA sequencer (e.g. Applied Biosystems). Preferably a fluorescent size standard and Hi-Di formamide are added to the PCR product and samples are run using standard electrophoresis conditions and analysis software.

PCR products can also be sequenced directly in order to identify S+/S+, S+/S- or S-/S- individuals.

Other methods for detecting the presence of nucleotides characteristic of a specific allele are also contemplated, such as but not limited to probe-based methods, which are well known to those skilled in the art as described in Sambrook et al., Molecular Cloning: A Laboratory Manual, 2nd Ed. Cold Spring Harbor Press, 1987, incorporated herein by reference.

Use of other methods, such as the oligonucleotide ligation assay (OLA), is also included within the scope of the invention. OLA methods are well known to those skilled in the art.

In one embodiment presence of the S+ allele is detected directly by detecting presence of the additional serine amino acid characteristic of the S+ allele.

The presence of amino acids characteristic of specified alleles may thus also be detected in a protein, or polypeptide, sample derived from the tree. Any suitable method for detecting the presence of the characteristic amino acid in a protein or polypeptide may be applied. Typical methods involve the use of antibodies for detection of the protein polymorphism. Methods for producing and using antibodies are well known to those skilled in the art and are described for example in Antibodies, A Laboratory Manual, Harlow & Lane, Eds, Cold Spring Harbor Laboratory, 1998.

Identification of the homozygous S+/S+ state

Amplicons may be deemed to be either homozygous S+/S+, heterozygous S+/S- or homozygous S-/S- based on size differences on a gel. The S+/S+ state yields a single band that is 3 bp larger than the corresponding S- band. Heterozygous (S+/S-) individuals give two bands on a gel (3 bp apart in size). Homozygous S-/S- individuals give a band that is 3 bp smaller than the corresponding S+ band. Alternatively, a direct sequencing approach may be used to identify S+/S+ individuals.
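The band-size rules described above can be expressed as a simple classification. The following Python sketch is illustrative only: the function name, the expected S- amplicon size and the tolerance parameter are hypothetical, since the actual amplicon size depends on the primer pair used.

```python
def call_genotype(band_sizes_bp, s_minus_size_bp, tolerance_bp=0.0):
    """Classify an individual from its observed amplicon sizes.

    band_sizes_bp   -- observed band sizes in base pairs
    s_minus_size_bp -- expected S- amplicon size for the primer pair (assumed known)
    tolerance_bp    -- allowed sizing error; must stay below 1.5 bp so that
                       the S+ and S- bands, which differ by 3 bp, cannot be confused
    """
    if tolerance_bp >= 1.5:
        raise ValueError("tolerance must be below 1.5 bp to separate S+ from S-")

    s_plus_size_bp = s_minus_size_bp + 3  # the extra serine codon adds 3 bp

    has_minus = any(abs(b - s_minus_size_bp) <= tolerance_bp for b in band_sizes_bp)
    has_plus = any(abs(b - s_plus_size_bp) <= tolerance_bp for b in band_sizes_bp)

    if has_plus and has_minus:
        return "S+/S-"
    if has_plus:
        return "S+/S+"
    if has_minus:
        return "S-/S-"
    return "no call"


if __name__ == "__main__":
    # 200 bp is a hypothetical S- amplicon size chosen for illustration.
    print(call_genotype([203], 200))
    print(call_genotype([200, 203], 200))
    print(call_genotype([200], 200))
```

Samples whose bands match neither expected size are returned as "no call" and could instead be resolved by direct sequencing.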

Selection method

In a further aspect the invention provides a method for selecting a tree with a genotype indicative of increased diameter growth, the method comprising selecting a tree identified by a method of the invention.

Transgenic methods

In a further aspect the invention provides a method for producing a tree cell or tree with increased diameter growth, the method comprising transformation of a tree cell or tree with a polynucleotide encoding a dehydrin protein that includes an additional serine (S) residue in the casein kinase II (CKII)-hydrophilic charged domain of the dehydrin gene, relative to more commonly identified alleles, such as those disclosed herein.

In one embodiment the casein kinase II-hydrophilic charged domain comprises the sequence motif: SnD[E/D]CG[G/V]KEEKK (SEQ ID NO:69), where n=2-3.

Preferably the domain consists of the sequence motif SnD[E/D]CG[G/V]KEEKK (SEQ ID NO:69), where n=2-3.

Serine (S) + dehydrin allele polynucleotides transformed

In one embodiment the polynucleotide transformed encodes a dehydrin protein characterised by the presence of an additional serine (S) residue within a casein kinase II (CKII)-hydrophilic charged domain of the dehydrin protein.

Preferably three serine (S) residues are present at amino acid positions 161, 162 and 163 in the sequence of the dehydrin protein. Preferably the dehydrin protein comprises a sequence with at least 70% identity to the polypeptide sequence of any one of SEQ ID NO:22-27.

Preferably three serine (S) residues are present at amino acid positions 161, 162 and 163 in the sequence of the dehydrin protein shown in the specified sequences.

Preferably the dehydrin protein comprises the sequence of any one of SEQ ID NO:22-27.

Preferably the polynucleotide transformed comprises a sequence with at least 70% identity to the polynucleotide sequence of any one of SEQ ID NO: 56-61.

Preferably the polynucleotide comprises the polynucleotide sequence of any one of SEQ ID NO:56-61.

In one embodiment expression of the non-S+ (S-) dehydrin allele is disrupted. Expression may be disrupted by any method such as but not limited to the silencing methods described herein.

In a further aspect the invention provides a tree cell or tree produced by a method of the invention.

Polynucleotides encoding polypeptides

In a further aspect the invention provides an isolated polynucleotide encoding a polypeptide with the sequence of any one of SEQ ID NO:22-27 or a variant thereof, wherein the variant is a polypeptide capable of increasing diameter growth when expressed in the homozygous state in a plant.

In one embodiment the polypeptide has at least 70% identity to the sequence of any one of SEQ ID NO:22 to 27. In a further embodiment the polypeptide includes serine (S) residues at amino acid positions 161, 162 and 163.

Preferably the polypeptide comprises the sequence of any one of SEQ ID NO:22 to 27.

Preferably the polypeptide consists of the sequence of any one of SEQ ID NO:22 to 27.

Polynucleotides

In a further aspect the invention provides an isolated polynucleotide comprising the sequence of any one of SEQ ID NO: 56-61 or a variant thereof, wherein the variant encodes a polypeptide capable of increasing diameter growth when expressed in the homozygous state in a plant.

In one embodiment the polynucleotide comprises a sequence with at least 70% identity to the polynucleotide sequence of any one of SEQ ID NO: 56-61.

In a further embodiment the polynucleotide includes a codon encoding serine at nucleotide positions 484-486. Preferably the codon at nucleotide position 484-486 is TCT.

In a further embodiment the polynucleotide includes codons encoding serine at nucleotide positions 481-483, 484-486 and 487-489. Preferably the codon at nucleotide position 484-486 is TCT.

Preferably the polynucleotide comprises the sequence of any one of SEQ ID NO: 56-61.

Preferably the polynucleotide consists of the sequence of any one of SEQ ID NO: 56-61.

Polypeptides

In a further aspect the invention provides an isolated polypeptide with the sequence of any one of SEQ ID NO: 22 to 27 or a variant thereof, wherein the variant is a polypeptide capable of increasing diameter growth when expressed in the homozygous state in a plant.

In one embodiment the polypeptide has at least 70% identity to the sequence of any one of SEQ ID NO:22 to 27.

In a further embodiment the polypeptide includes serine (S) residues at amino acid positions 161, 162 and 163.

Preferably the polypeptide comprises the sequence of any one of SEQ ID NO:22 to 27.

Preferably the polypeptide consists of the sequence of any one of SEQ ID NO:22 to 27.

In a further aspect the invention provides a polynucleotide encoding a polypeptide of the invention.

Constructs

In a further aspect the invention provides a genetic construct comprising a polynucleotide of the invention.

In one embodiment the genetic construct is an expression construct.

In a further aspect the invention provides a vector comprising a polynucleotide, genetic construct or expression construct of the invention.

In a further aspect the invention provides a host cell comprising a polynucleotide, genetic construct or expression construct of the invention. In a further aspect the invention provides a host cell genetically modified to express a polynucleotide of the invention.

In a further aspect the invention provides a plant cell comprising a genetic construct or the expression construct of the invention.

In a further aspect the invention provides a plant cell genetically modified to express a polynucleotide of the invention.

In a further aspect the invention provides a plant which comprises a plant cell of the invention.

Methods — using polynucleotides encoding polypeptides

In a further aspect the invention provides a method of producing a plant with increased diameter growth, the method comprising transformation of a plant with: a) a polynucleotide encoding a polypeptide with the sequence of any one of SEQ ID NO:22 to 27, or a variant of the polypeptide, wherein the variant is capable of increasing diameter growth in a plant; b) a polynucleotide comprising a fragment, of at least 15 nucleotides in length, of the polynucleotide of a); or c) a polynucleotide comprising a complement of the polynucleotide of a) or b).

Preferably the plant is transformed with a genetic construct or vector comprising the polynucleotide.

In one embodiment the isolated polynucleotide of a) encodes a polypeptide with at least 70% identity to the sequence of any one of SEQ ID NO:22 to 27.

In a preferred embodiment the isolated polynucleotide of a) encodes a polypeptide with the amino acid sequence of any one of SEQ ID NO:22 to 27.

Methods using polynucleotides

In a further aspect the invention provides a method of producing a plant with increased diameter growth, the method comprising transformation of a plant cell or plant with: a) a polynucleotide comprising the nucleotide sequence of any one of SEQ ID NO: 56-61, or a variant thereof wherein the variant encodes a dehydrin polypeptide capable of increasing diameter growth in a plant; b) a polynucleotide comprising a fragment, of at least 15 nucleotides in length, of the polynucleotide of a); or c) a polynucleotide comprising a complement of the polynucleotide of a) or b).

In one embodiment the isolated polynucleotide of a) comprises a sequence with at least 70% identity to the sequence of any one of SEQ ID NO: 56-61.

In a more preferred embodiment the isolated polynucleotide of a) comprises the sequence of any one of SEQ ID NO: 56-61.

Preferably the plants of the invention are trees, and the plant cells are tree cells.

Trees

The trees in the methods of the invention may be from any tree species.

Preferred trees are those from gymnosperm species such as, but not limited to: Abies amabilis, Abies balsamea, Abies concolor, Abies grandis, Abies lasiocarpa, Abies magnifica, Abies procera, Chamaecyparis lawsoniana, Chamaecyparis nootkatensis, Chamaecyparis thyoides, Juniperus virginiana, Larix decidua, Larix laricina, Larix leptolepis, Larix occidentalis, Larix sibirica, Libocedrus decurrens, Picea abies, Picea engelmannii, Picea glauca, Picea mariana, Picea pungens, Picea rubens, Picea sitchensis, Pinus banksiana, Pinus brutia, Pinus caribaea, Pinus clausa, Pinus contorta, Pinus coulteri, Pinus echinata, Pinus eldarica, Pinus elliottii, Pinus jeffreyi, Pinus lambertiana, Pinus monticola, Pinus nigra, Pinus palustris, Pinus pinaster, Pinus ponderosa, Pinus radiata, Pinus resinosa, Pinus rigida, Pinus serotina, Pinus strobus, Pinus sylvestris, Pinus taeda, Pinus virginiana, Pseudotsuga menziesii, Sequoia gigantea, Sequoia sempervirens, Taxodium distichum, Tsuga canadensis, Tsuga heterophylla, Tsuga mertensiana, Thuja occidentalis, and Thuja plicata.

Particularly preferred trees are those of the Pinus genus, including but not limited to: Pinus banksiana, Pinus brutia, Pinus caribaea, Pinus clausa, Pinus contorta, Pinus coulteri, Pinus echinata, Pinus eldarica, Pinus elliottii, Pinus jeffreyi, Pinus lambertiana, Pinus monticola, Pinus nigra, Pinus palustris, Pinus pinaster, Pinus ponderosa, Pinus radiata, Pinus resinosa, Pinus rigida, Pinus serotina, Pinus strobus, Pinus sylvestris, Pinus taeda, and Pinus virginiana.

Preferred Pinus species are those from the subgenus Pinus subsection Trifolia. Preferred subsection Trifolia species include P. taeda, P. radiata, P. attenuata, P. muricata, P. teocote, P. greggii, P. herrerae, P. devoniana, P. pseudostrobus, and P. contorta.

Preferred Pinus species also include those selected from the group including Pinus radiata, Pinus taeda, Pinus sylvestris and Pinus pinaster.

Particularly preferred Pinus species include Pinus radiata and Pinus taeda.

A particularly preferred Pinus species is Pinus radiata.

DETAILED DESCRIPTION OF THE INVENTION

Definitions

In this specification where reference has been made to patent specifications, other external documents, or other sources of information, this is generally for the purpose of providing a context for discussing the features of the invention. Unless specifically stated otherwise, reference to such external documents is not to be construed as an admission that such documents, or such sources of information, in any jurisdiction, are prior art, or form part of the common general knowledge in the art.

The term "comprising" as used in this specification means "consisting at least in part of". When interpreting each statement in this specification that includes the term "comprising", features other than that or those prefaced by the term may also be present. Related terms such as "comprise" and "comprises" are to be interpreted in the same manner.

The term "dehydrin" in relation to polypeptides, proteins, polynucleotides and genes has the same meaning as dehydrin as commonly used by those skilled in the art.

Dehydrins (DHNs) are part of a large group of highly hydrophilic proteins known as LEA (Late Embryogenesis Abundant). The distinctive feature of all DHNs is a conserved, lysine-rich 15-amino acid domain, EKKGIMDKIKEKLPG, named the K-segment. It is usually present near the C-terminus. Other typical dehydrin features are: a track of Ser residues (the S-segment); a consensus motif, T/VDEYGNP (the Y-segment), located near the N-terminus; and less conserved regions, usually rich in polar amino acids (the Phi-segments). The number and order of the Y-, S- and K-segments define different DHN sub-classes: Y(n)SK(n), Y(n)Kn, SK(n), K(n) and K(n)S (Rorat, T., 2006, Plant dehydrins - tissue location, structure and function, Cell Mol Biol Lett. 2006; 11(4):536-56. Epub 2006 Sept).

Dehydrins are distributed in a wide range of organisms including the higher plants, algae, yeast and cyanobacteria. They accumulate late in embryogenesis, and in nearly all the vegetative tissues during normal growth conditions and in response to stress leading to cellular dehydration (e.g. drought, low temperature and salinity). DHNs are localized in different cell compartments, such as the cytosol, nucleus, mitochondria, vacuole, and the vicinity of the plasma membrane; however, they are primarily localized to the cytoplasm and nucleus.

In vitro experiments have revealed that some DHNs (YSK(n)-type) bind to lipid vesicles that contain acidic phospholipids, and others (K(n)S) were shown to bind metals and have the ability to scavenge hydroxyl radicals [Asghar, R. et al. Protoplasma 177 (1994) 87-94], protect lipid membranes against peroxidation or display cryoprotective activity towards freezing-sensitive enzymes. The SK(n)-and K-type seem to be directly involved in cold acclimation processes. The main question arising from the in vitro findings is whether each DHN structural type could possess a specific function and tissue distribution. Much recent in vitro data clearly indicates that dehydrins belonging to different subclasses exhibit distinct functions.

"Diameter growth" as used herein means an increase in the stem/trunk/bole diameter over a period of time. Diameter growth can be inferred from the stem circumference, growth-ring widths, or pith-to-bark increment cores.

Diameter growth can be measured by several methods well known to those skilled in the art. Such methods include the measurement of growth-ring widths using, for example, pith-to-bark increment cores or wood-rounds. Measurements can be performed with either a ruler/tape measure, or via x-ray densitometry or Silviscan. Diameter growth can also be inferred from measurements of the stem circumference.

Methods for measuring diameter growth are also provided in the examples section of this specification.

"Polymorphism" is a condition in DNA in which the most frequent variant (or allele) has a population frequency which does not exceed 99%.

The term "linkage disequilibrium" or LD as used herein, refers to a derived statistical measure of the strength of the association or co-occurrence of two independent genetic markers. Various statistical methods can be used to summarize linkage disequilibrium (LD) between two markers but in practice only two, termed D' and R², are widely used.
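For orientation, the two measures named above can be computed from allele and haplotype frequencies using the standard population-genetics formulas. The following Python sketch is a minimal illustration under stated assumptions: both loci are biallelic, phased haplotype frequencies are available, and the statistic written R² in this specification is taken to be the one commonly written r². The function name and the example frequencies are hypothetical and are not data from the specification.

```python
def ld_stats(p_a, p_b, p_ab):
    """Return (D', r squared) for two biallelic loci.

    p_a  -- frequency of allele A at the first locus
    p_b  -- frequency of allele B at the second locus
    p_ab -- frequency of the haplotype carrying both A and B
    """
    if not (0 < p_a < 1 and 0 < p_b < 1):
        raise ValueError("both loci must be polymorphic (0 < frequency < 1)")

    # Raw disequilibrium coefficient: observed minus expected haplotype frequency.
    d = p_ab - p_a * p_b

    # D' scales |D| by the largest value it could take given the allele frequencies.
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = abs(d) / d_max

    # r squared is the squared correlation between the alleles at the two loci.
    r_squared = d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))
    return d_prime, r_squared


if __name__ == "__main__":
    # Hypothetical frequencies chosen for illustration only.
    d_prime, r_squared = ld_stats(p_a=0.5, p_b=0.5, p_ab=0.4)
    print(round(d_prime, 3), round(r_squared, 3))
```

Values computed in this way could then be compared against thresholds such as the D' and R² values recited in this specification.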

Markers linked to, and/or in LD with, the specified polymorphisms may be of any type including, but not limited to, SNPs, substitutions, insertions, deletions, indels and simple sequence repeats (SSRs).

The abbreviation "SSR" stands for a "simple sequence repeat" and refers to any short sequence, for example, a mono-, di-, tri- or tetra-nucleotide that is repeated at least once in a particular nucleotide sequence. These sequences are also known in the art as "microsatellites." A SSR can be represented by the general formula (N1 N2 . . . Ni)n, wherein N represents nucleotides A, T, C or G, i represents the number of the nucleotides in the base repeat, and n represents the number of times the base is repeated in a particular DNA sequence. The base repeat, i.e., N1 N2 . . . Ni, is also referred to herein as an "SSR motif." For example, (ATC)4 refers to a tri-nucleotide ATC motif that is repeated four times in a particular sequence. In other words, (ATC)4 is a shorthand version of "ATCATCATCATC."

The term "complement of a SSR motif" refers to a complementary strand of the represented motif. For example, the complement of the (ATG) motif is (TAC).
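The SSR shorthand and the complement of a motif, as defined above, can be illustrated with a few lines of Python. This is a sketch only; the function names are hypothetical. Following the example given in the text, the complement is taken base by base without reversing the motif.

```python
# Base-pairing table for DNA: A pairs with T, C pairs with G.
_COMPLEMENT = str.maketrans("ATCG", "TAGC")


def expand_ssr(motif, n):
    """Expand the shorthand (motif)n into the full repeated sequence."""
    return motif * n


def complement_motif(motif):
    """Base-by-base complement of a motif, e.g. ATG becomes TAC."""
    return motif.upper().translate(_COMPLEMENT)


if __name__ == "__main__":
    print(expand_ssr("ATC", 4))
    print(complement_motif("ATG"))
```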

The term "SSR locus" refers to a location on a chromosome of a SSR motif; locus may be occupied by any one of the alleles of the repeated motif. "Allele" is one of several alternative forms of the SSR motif occupying a given locus on the chromosome. For example, the (ATC)8 locus refers to the fragment of the chromosome containing this repeat, while (ATC)4 and (ATC)7 repeats represent two different alleles of the (ATC)8 locus. As used herein, the term locus refers to the repeated SSR motif and the flanking 5' and 3' non-repeated sequences. SSR loci of the invention are useful as genetic markers, such as for determination of polymorphism.

The terms "tree", "tree plant" and "plants" can be used interchangeably throughout this specification.

The term "polynucleotide(s)," as used herein, means a single or double-stranded deoxyribonucleotide or ribonucleotide polymer of any length but preferably at least 15 nucleotides, and includes, as non-limiting examples, coding and non-coding sequences of a gene, sense and antisense sequences, complements, exons, introns, genomic DNA, cDNA, pre-mRNA, mRNA, rRNA, siRNA, miRNA, tRNA, ribozymes, recombinant polynucleotides, isolated and purified naturally occurring DNA or RNA sequences, synthetic RNA and DNA sequences, nucleic acid probes, primers and fragments.

The term "primer" refers to a short polynucleotide, usually having a free 3' OH group, that is hybridized to a template and used for priming polymerization of a polynucleotide complementary to the target.

The term "probe" refers to a short polynucleotide that is used to detect a polynucleotide sequence, that is complementary to the probe, in a hybridization-based assay.

Polynucleotides and fragments

The term "polynucleotide(s)," as used herein, means a single or double-stranded deoxyribonucleotide or ribonucleotide polymer of any length but preferably at least 15 nucleotides, and include as non-limiting examples, coding and non-coding sequences of a gene, sense and antisense sequences complements, exons, introns, genomic DNA, cDNA, pre-mRNA, mRNA, rRNA, siRNA, miRNA, tRNA, ribozymes, recombinant polynucleotides, isolated and purified naturally occurring DNA or RNA sequences, synthetic RNA and DNA sequences, nucleic acid probes, primers and fragments.

A "fragment" of a polynucleotide sequence provided herein is a subsequence of contiguous nucleotides that is capable of specific hybridization to a target of interest, e.g., a sequence that is at least 15 nucleotides in length. The fragments of the invention comprise at least 15 nucleotides, preferably at least 20 nucleotides, more preferably at least 30 nucleotides, more preferably at least 50 nucleotides and most preferably at least 60 nucleotides of contiguous nucleotides of a polynucleotide of the invention. A fragment of a polynucleotide sequence can be used in antisense, gene silencing, triple helix or ribozyme technology, or as a primer, a probe, included in a microarray, or used in polynucleotide-based selection methods of the invention.

The term "probe" refers to a short polynucleotide that is used to detect a polynucleotide sequence, that is complementary to the probe, in a hybridization-based assay. The probe may consist of a "fragment" of a polynucleotide as defined herein.

Polypeptides and fragments

The term "polypeptide", as used herein, encompasses amino acid chains of any length but preferably at least 5 amino acids, including full-length proteins, in which amino acid residues are linked by covalent peptide bonds. Polypeptides of the present invention may be purified natural products, or may be produced partially or wholly using recombinant or synthetic techniques. The term may refer to a polypeptide, an aggregate of a polypeptide such as a dimer or other multimer, a fusion polypeptide, a polypeptide fragment, a polypeptide variant, or derivative thereof.

A "fragment" of a polypeptide is a subsequence of the polypeptide that performs a function that is required for the biological activity and/or provides three dimensional structure of the polypeptide. The term may refer to a polypeptide, an aggregate of a polypeptide such as a dimer or other multimer, a fusion polypeptide, a polypeptide fragment, a polypeptide variant, or derivative thereof capable of performing the above enzymatic activity.

The term "isolated" as applied to the polynucleotide or polypeptide sequences disclosed herein is used to refer to sequences that are removed from their natural cellular environment. An isolated molecule may be obtained by any method or combination of methods including biochemical, recombinant, and synthetic techniques.

The term "recombinant" refers to a polynucleotide sequence that is removed from sequences that surround it in its natural context and/or is recombined with sequences that are not present in its natural context.

A "recombinant" polypeptide sequence is produced by translation from a "recombinant" polynucleotide sequence.

The term "derived from" with respect to polynucleotides and polypeptides of the invention being "derived from" a particular genus or species, means that the polynucleotide or polypeptide has the same sequence as a polynucleotide or polypeptide found naturally in that genus or species. The polynucleotide or polypeptide which is derived from a genus or species may therefore be produced synthetically or recombinantly.

Variants

As used herein, the term "variant" refers to polynucleotide or polypeptide sequences different from the specifically identified sequences, wherein one or more nucleotides or amino acid residues is deleted, substituted, or added. Variants may be naturally occurring allelic variants, or non-naturally occurring variants. Variants may be from the same or from other species and may encompass homologues, paralogues and orthologues. In certain embodiments, variants of the inventive polynucleotides and polypeptides possess biological activities that are the same or similar to those of the inventive polynucleotides or polypeptides. The term "variant" with reference to polynucleotides and polypeptides encompasses all forms of polynucleotides and polypeptides as defined herein.

Polynucleotide variants

Variant polynucleotide sequences preferably exhibit at least 50%, more preferably at least 51%, more preferably at least 52%, more preferably at least 53%, more preferably at least 54%, more preferably at least 55%, more preferably at least 56%, more preferably at least 57%, more preferably at least 58%, more preferably at least 59%, more preferably at least 60%, more preferably at least 61%, more preferably at least 62%, more preferably at least 63%, more preferably at least 64%, more preferably at least 65%, more preferably at least 66%, more preferably at least 67%, more preferably at least 68%, more preferably at least 69%, more preferably at least 70%, more preferably at least 71%, more preferably at least 72%, more preferably at least 73%, more preferably at least 74%, more preferably at least 75%, more preferably at least 76%, more preferably at least 77%, more preferably at least 78%, more preferably at least 79%, more preferably at least 80%, more preferably at least 81%, more preferably at least 82%, more preferably at least 83%, more preferably at least 84%, more preferably at least 85%, more preferably at least 86%, more preferably at least 87%, more preferably at least 88%, more preferably at least 89%, more preferably at least 90%, more preferably at least 91%, more preferably at least 92%, more preferably at least 93%, more preferably at least 94%, more preferably at least 95%, more preferably at least 96%, more preferably at least 97%, more preferably at least 98%, and most preferably at least 99% identity to a specified polynucleotide sequence. Identity is found over a comparison window of at least 20 nucleotide positions, preferably at least 50 nucleotide positions, more preferably at least 100 nucleotide positions, and most preferably over the entire length of the specified polynucleotide sequence.

Polynucleotide sequence identity can be determined in the following manner. The subject polynucleotide sequence is compared to a candidate polynucleotide sequence using BLASTN (from the BLAST suite of programs, version 2.2.5 [Nov 2002]) in bl2seq (Tatiana A. Tatusova, Thomas L. Madden (1999), "Blast 2 sequences - a new tool for comparing protein and nucleotide sequences", FEMS Microbiol Lett. 174:247-250), which is publicly available from NCBI (ftp://ftp.ncbi.nih.gov/blast/). The default parameters of bl2seq are utilized except that filtering of low complexity parts should be turned off.

The identity of polynucleotide sequences may be examined using the following unix command line parameters:

bl2seq -i nucleotideseq1 -j nucleotideseq2 -F F -p blastn

The parameter -F F turns off filtering of low complexity sections. The parameter -p selects the appropriate algorithm for the pair of sequences. The bl2seq program reports sequence identity as both the number and percentage of identical nucleotides in a line "Identities = ".
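As a minimal sketch of how the reported identity might be read programmatically, the following Python fragment extracts the counts from an "Identities = " line. It assumes the line takes the form "Identities = 188/200 (94%)"; the sample line is illustrative and is not output from an actual run, and the function name parse_identities is hypothetical.

```python
import re
from typing import Optional, Tuple

# Assumed layout of the report line: "Identities = <identical>/<aligned> (<percent>%)".
IDENTITIES = re.compile(r"Identities\s*=\s*(\d+)/(\d+)\s*\((\d+)%\)")


def parse_identities(report: str) -> Optional[Tuple[int, int, int]]:
    """Return (identical, aligned, percent) from the first Identities line, if any."""
    match = IDENTITIES.search(report)
    if match is None:
        return None
    identical, aligned, percent = (int(group) for group in match.groups())
    return identical, aligned, percent


sample_line = " Identities = 188/200 (94%)"  # illustrative, not real program output
print(parse_identities(sample_line))
```

The percentage obtained in this way can then be compared against the preferred identity thresholds recited above.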

Polynucleotide sequence identity may also be calculated over the entire length of the overlap between a candidate and subject polynucleotide sequences using global sequence alignment programs (e.g. Needleman, S. B. and Wunsch, C. D. (1970) J. Mol. Biol. 48, 443-453). A full implementation of the Needleman-Wunsch global alignment algorithm is found in the needle program in the EMBOSS package (Rice, P., Longden, I. and Bleasby, A. EMBOSS: The European Molecular Biology Open Software Suite, Trends in Genetics June 2000, vol 16, No 6, pp. 276-277) which can be obtained from http://www.hgmp.mrc.ac.uk/Software/EMBOSS/. The European Bioinformatics Institute server also provides the facility to perform EMBOSS-needle global alignments between two sequences on line at http://www.ebi.ac.uk/emboss/align/.
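For illustration only, a global alignment of the Needleman-Wunsch type can be sketched in a few lines of Python. This is not the EMBOSS needle implementation: the scoring values (match +1, mismatch -1, linear gap penalty -2) are assumptions chosen for the example, and identity is expressed here as a percentage of alignment columns, which is only one of several conventions.

```python
def needleman_wunsch(a: str, b: str, match: int = 1, mismatch: int = -1, gap: int = -2):
    """Globally align a and b; return the two gapped strings of one optimal alignment."""
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    for i in range(1, rows):          # first column: a aligned against gaps only
        score[i][0] = i * gap
    for j in range(1, cols):          # first row: b aligned against gaps only
        score[0][j] = j * gap
    for i in range(1, rows):
        for j in range(1, cols):
            diag = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            score[i][j] = max(diag, score[i - 1][j] + gap, score[i][j - 1] + gap)

    # Trace back from the bottom-right cell to recover one optimal alignment.
    i, j = len(a), len(b)
    out_a, out_b = [], []
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + (
            match if a[i - 1] == b[j - 1] else mismatch
        ):
            out_a.append(a[i - 1]); out_b.append(b[j - 1]); i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-"); i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1]); j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b))


def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Identical, non-gap columns as a percentage of all alignment columns."""
    if not aligned_a:
        return 0.0
    matches = sum(x == y and x != "-" for x, y in zip(aligned_a, aligned_b))
    return 100.0 * matches / len(aligned_a)


aligned = needleman_wunsch("ACGT", "ACT")
print(aligned, percent_identity(*aligned))
```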

Alternatively the GAP program may be used which computes an optimal global alignment of two sequences without penalizing terminal gaps. GAP is described in the following paper: Huang, X. (1994) On Global Sequence Alignment. Computer Applications in the Biosciences 10, 227-235.

Use of BLASTN as described above is preferred for use in the determination of sequence identity for polynucleotide variants according to the present invention.

Polynucleotide variants of the present invention also encompass those which exhibit a similarity to one or more of the specifically identified sequences that is likely to preserve the functional equivalence of those sequences and which could not reasonably be expected to have occurred by random chance. Such sequence similarity with respect to polynucleotides may be determined using the publicly available bl2seq program from the BLAST suite of programs (version 2.2.5 [Nov 2002]) from NCBI (ftp://ftp.ncbi.nih.gov/blast/).

The similarity of polynucleotide sequences may be examined using the following unix command line parameters:

bl2seq -i nucleotideseq1 -j nucleotideseq2 -F F -p tblastx

The parameter -F F turns off filtering of low complexity sections. The parameter -p selects the appropriate algorithm for the pair of sequences. This program finds regions of similarity between the sequences and for each such region reports an "E value" which is the expected number of times one could expect to see such a match by chance in a database of a fixed reference size containing random sequences. The size of this database is set by default in the bl2seq program. For small E values, much less than one, the E value is approximately the probability of such a random match.
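The statement that a small E value approximates a probability can be checked numerically. The sketch below assumes the usual BLAST statistical model, in which the number of chance matches follows a Poisson distribution so that the probability of at least one chance match is 1 - exp(-E); this relation is standard but is not recited in the specification, and the function name is hypothetical.

```python
import math


def chance_match_probability(e_value: float) -> float:
    """Probability of at least one chance match, assuming Poisson-distributed hits."""
    # -expm1(-E) equals 1 - exp(-E) but stays accurate for very small E.
    return -math.expm1(-e_value)


for e in (1e-10, 0.01, 1.0, 5.0):
    print(e, chance_match_probability(e))
```

For E values well below one the two quantities are nearly equal, whereas for E values of one or more they diverge, which is why the approximation is limited to small E values.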

Variant polynucleotide sequences preferably exhibit an E value of less than 1 x 10^-10, more preferably less than 1 x 10^-20, more preferably less than 1 x 10^-30, more preferably less than 1 x 10^-40, more preferably less than 1 x 10^-50, more preferably less than 1 x 10^-60, more preferably less than 1 x 10^-70, more preferably less than 1 x 10^-80, more preferably less than 1 x 10^-90 and most preferably less than 1 x 10^-100 when compared with any one of the specifically identified sequences. Alternatively, variant polynucleotides of the present invention hybridize to a specified polynucleotide sequence, or complements thereof under stringent conditions.

The term "hybridize under stringent conditions", and grammatical equivalents thereof, refers to the ability of a polynucleotide molecule to hybridize to a target polynucleotide molecule (such as a target polynucleotide molecule immobilized on a DNA or RNA blot, such as a Southern blot or Northern blot) under defined conditions of temperature and salt concentration. The ability to hybridize under stringent hybridization conditions can be determined by initially hybridizing under less stringent conditions then increasing the stringency to the desired stringency.

With respect to polynucleotide molecules greater than about 100 bases in length, typical stringent hybridization conditions are no more than 25 to 30° C (for example, 10° C) below the melting temperature (Tm) of the native duplex (see generally, Sambrook et al., Eds, 1987, Molecular Cloning, A Laboratory Manual, 2nd Ed. Cold Spring Harbor Press; Ausubel et al., 1987, Current Protocols in Molecular Biology, Greene Publishing). Tm for polynucleotide molecules greater than about 100 bases can be calculated by the formula Tm = 81.5 + 0.41% (G + C) - log (Na+) (Sambrook et al., Eds, 1987, Molecular Cloning, A Laboratory Manual, 2nd Ed. Cold Spring Harbor Press; Bolton and McCarthy, 1962, PNAS 84:1390). Typical stringent conditions for a polynucleotide of greater than 100 bases in length would be hybridization conditions such as prewashing in a solution of 6X SSC, 0.2% SDS; hybridizing at 65° C, 6X SSC, 0.2% SDS overnight; followed by two washes of 30 minutes each in 1X SSC, 0.1% SDS at 65° C and two washes of 30 minutes each in 0.2X SSC, 0.1% SDS at 65° C.

With respect to polynucleotide molecules having a length less than 100 bases, exemplary stringent hybridization conditions are 5 to 10° C below Tm. On average, the Tm of a polynucleotide molecule of length less than 100 bp is reduced by approximately (500/oligonucleotide length)° C.
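The temperature ranges given above for longer and shorter molecules can be summarised in a short Python sketch. It takes an already determined Tm as input rather than computing one, treats a length of exactly 100 bases as "short" (an assumption, since the text does not address that boundary), and uses hypothetical function names.

```python
from typing import Tuple


def stringent_window(tm_celsius: float, length_bases: int) -> Tuple[float, float]:
    """Return (lowest, highest) hybridization temperature regarded as stringent."""
    if length_bases > 100:
        # Longer molecules: no more than 25 to 30 degrees C below Tm.
        return tm_celsius - 30.0, tm_celsius
    # Shorter molecules: 5 to 10 degrees C below Tm.
    return tm_celsius - 10.0, tm_celsius - 5.0


def short_duplex_tm_reduction(length_bases: int) -> float:
    """Approximate average Tm reduction (degrees C) for molecules under 100 bases."""
    return 500.0 / length_bases


print(stringent_window(85.0, 500))    # window for a long probe with Tm of 85
print(stringent_window(60.0, 20))     # window for a 20-base oligonucleotide with Tm of 60
print(short_duplex_tm_reduction(25))  # reduction for a 25-base oligonucleotide
```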

With respect to the DNA mimics known as peptide nucleic acids (PNAs) (Nielsen et al., Science. 1991 Dec 6;254(5037):1497-500) Tm values are higher than those for DNA-DNA or DNA-RNA hybrids, and can be calculated using the formula described in Giesen et al., Nucleic Acids Res. 1998 Nov 1;26(21):5004-6. Exemplary stringent hybridization conditions for a DNA-PNA hybrid having a length less than 100 bases are 5 to 10° C below the Tm.

Variant polynucleotides of the present invention also encompass polynucleotides that differ from the sequences of the invention but that, as a consequence of the degeneracy of the genetic code, encode a polypeptide having similar activity to a polypeptide encoded by a polynucleotide of the present invention. A sequence alteration that does not change the amino acid sequence of the polypeptide is a "silent variation". Except for ATG (methionine) and TGG (tryptophan), other codons for the same amino acid may be changed by art recognized techniques, e.g., to optimize codon expression in a particular host organism.
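The degeneracy described above can be illustrated with the standard genetic code. The Python sketch below is an illustration under stated assumptions: it builds the standard codon table (nuclear code, bases ordered T, C, A, G), lists the synonymous codons for a given codon, and tests whether a change between two coding sequences is silent. The function names and the short example sequences are hypothetical and are not sequences of the invention.

```python
from itertools import product

# Standard genetic code, codons enumerated with bases in the order T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(cds: str) -> str:
    """Translate a coding sequence codon by codon, ignoring any trailing partial codon."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, usable, 3))


def synonymous_codons(codon: str) -> list:
    """All codons encoding the same amino acid as the given codon."""
    amino_acid = CODON_TABLE[codon.upper()]
    return sorted(c for c, aa in CODON_TABLE.items() if aa == amino_acid)


def is_silent_variation(original: str, variant: str) -> bool:
    """True if the nucleotide sequences differ but encode the same polypeptide."""
    return original.upper() != variant.upper() and translate(original) == translate(variant)


print(synonymous_codons("ATG"), synonymous_codons("TGG"))  # one codon each
print(synonymous_codons("GCT"))                            # alanine has four codons
print(is_silent_variation("ATGGCTTGG", "ATGGCCTGG"))       # GCT -> GCC, both alanine
```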

Polynucleotide sequence alterations resulting in conservative substitutions of one or several amino acids in the encoded polypeptide sequence without significantly altering its biological activity are also included in the invention. A skilled artisan will be aware of methods for making phenotypically silent amino acid substitutions (see, e.g., Bowie et al, 1990, Science 247, 1306).

Variant polynucleotides due to silent variations and conservative substitutions in the encoded polypeptide sequence may be determined using the publicly available bl2seq program from the BLAST suite of programs (version 2.2.5 [Nov 2002]) from NCBI (ftp://ftp.ncbi.nih.gov/blast/) via the tblastx algorithm as previously described.

Polypeptide Variants

The term "variant" with reference to polypeptides encompasses naturally occurring, recombinantly and synthetically produced polypeptides. Variant polypeptide sequences preferably exhibit at least 50%, more preferably at least 51%, more preferably at least 52%, more preferably at least 53%, more preferably at least 54%, more preferably at least 55%, more preferably at least 56%, more preferably at least 57%, more preferably at least 58%, more preferably at least 59%, more preferably at least 60%, more preferably at least 61%, more preferably at least 62%, more preferably at least 63%, more preferably at least 64%, more preferably at least 65%, more preferably at least 66%, more preferably at least 67%, more preferably at least 68%, more preferably at least 69%, more preferably at least 70%, more preferably at least 71%, more preferably at least 72%, more preferably at least 73%, more preferably at least 74%, more preferably at least 75%, more preferably at least 76%, more preferably at least 77%, more preferably at least 78%, more preferably at least 79%, more preferably at least 80%, more preferably at least 81%, more preferably at least 82%, more preferably at least 83%, more preferably at least 84%, more preferably at least 85%, more preferably at least 86%, more preferably at least 87%, more preferably at least 88%, more preferably at least 89%, more preferably at least 90%, more preferably at least 91%, more preferably at least 92%, more preferably at least 93%, more preferably at least 94%, more preferably at least 95%, more preferably at least 96%, more preferably at least 97%, more preferably at least 98%, and most preferably at least 99% identity to a sequence of the present invention. Identity is found over a comparison window of at least 20 amino acid positions, preferably at least 50 amino acid positions, more preferably at least 100 amino acid positions, and most preferably over the entire length of a polypeptide of the invention.

Polypeptide sequence identity can be determined in the following manner. The subject polypeptide sequence is compared to a candidate polypeptide sequence using BLASTP (from the BLAST suite of programs, version 2.2.5 [Nov 2002]) in bl2seq, which is publicly available from NCBI (ftp://ftp.ncbi.nih.gov/blast/). The default parameters of bl2seq are utilized except that filtering of low complexity regions should be turned off.

Polypeptide sequence identity may also be calculated over the entire length of the overlap between candidate and subject polypeptide sequences using global sequence alignment programs. EMBOSS-needle (available at http://www.ebi.ac.uk/emboss/align/) and GAP (Huang, X. (1994) On Global Sequence Alignment. Computer Applications in the Biosciences 10, 227-235), as discussed above, are also suitable global sequence alignment programs for calculating polypeptide sequence identity.

Use of BLASTP as described above is preferred for use in the determination of polypeptide variants according to the present invention.

Polypeptide variants of the present invention also encompass those which exhibit a similarity to one or more of the specifically identified sequences that is likely to preserve the functional equivalence of those sequences and which could not reasonably be expected to have occurred by random chance. Such sequence similarity with respect to polypeptides may be determined using the publicly available bl2seq program from the BLAST suite of programs (version 2.2.5 [Nov 2002]) from NCBI (ftp://ftp.ncbi.nih.gov/blast/).

The similarity of polypeptide sequences may be examined using the following unix command line parameters:

bl2seq -i peptideseq1 -j peptideseq2 -F F -p blastp

Variant polypeptide sequences preferably exhibit an E value of less than 1 x 10^-10, more preferably less than 1 x 10^-20, more preferably less than 1 x 10^-30, more preferably less than 1 x 10^-40, more preferably less than 1 x 10^-50, more preferably less than 1 x 10^-60, more preferably less than 1 x 10^-70, more preferably less than 1 x 10^-80, more preferably less than 1 x 10^-90 and most preferably less than 1 x 10^-100 when compared with any one of the specifically identified sequences.

The parameter -F F turns off filtering of low complexity sections. The parameter -p selects the appropriate algorithm for the pair of sequences. This program finds regions of similarity between the sequences and for each such region reports an "E value" which is the expected number of times one could expect to see such a match by chance in a database of a fixed reference size containing random sequences. For small E values, much less than one, this is approximately the probability of such a random match.

Conservative substitutions of one or several amino acids of a described polypeptide sequence without significantly altering its biological activity are also included in the invention. A skilled artisan will be aware of methods for making phenotypically silent amino acid substitutions (see, e.g., Bowie et al., 1990, Science 247, 1306).

Constructs, vectors and components thereof

The term "genetic construct" refers to a polynucleotide molecule, usually double-stranded DNA, which may have inserted into it another polynucleotide molecule (the insert polynucleotide molecule) such as, but not limited to, a cDNA molecule. A genetic construct may contain the necessary elements that permit transcribing the insert polynucleotide molecule, and, optionally, translating the transcript into a polypeptide. The insert polynucleotide molecule may be derived from the host cell, or may be derived from a different cell or organism and/or may be a recombinant polynucleotide. Once inside the host cell the genetic construct may become integrated in the host chromosomal DNA. The genetic construct may be linked to a vector.

The term "vector" refers to a polynucleotide molecule, usually double stranded DNA, which is used to transport the genetic construct into a host cell. The vector may be capable of replication in at least one additional host system, such as E. coli.

The term "expression construct" refers to a genetic construct that includes the necessary elements that permit transcribing the insert polynucleotide molecule, and, optionally, translating the transcript into a polypeptide. An expression construct typically comprises in a 5' to 3' direction: a) a promoter functional in the host cell into which the construct will be transformed, b) the polynucleotide to be expressed, and c) a terminator functional in the host cell into which the construct will be transformed.

The term "coding region" or "open reading frame" (ORF) refers to the sense strand of a genomic DNA sequence or a cDNA sequence that is capable of producing a transcription product and/or a polypeptide under the control of appropriate regulatory sequences. The coding sequence is identified by the presence of a 5' translation start codon and a 3' translation stop codon. When inserted into a genetic construct, a "coding sequence" is capable of being expressed when it is operably linked to promoter and terminator sequences.

"Operably-linked" means that the sequence to be expressed is placed under the control of regulatory elements that include promoters, tissue-specific regulatory elements, temporal regulatory elements, enhancers, repressors and terminators.

The term "noncoding region" refers to untranslated sequences that are upstream of the translational start site and downstream of the translational stop site. These sequences are also referred to respectively as the 5' UTR and the 3' UTR. These regions include elements required for transcription initiation and termination and for regulation of translation efficiency. Terminators are sequences, which terminate transcription, and are found in the 3' untranslated ends of genes downstream of the translated sequence. Terminators are important determinants of mRNA stability and in some cases have been found to have spatial regulatory functions.

The term "promoter" refers to nontranscribed cis-regulatory elements upstream of the coding region that regulate gene transcription. Promoters comprise cis-initiator elements which specify the transcription initiation site and conserved boxes such as the TATA box, and motifs that are bound by transcription factors.

A "transgene" is a polynucleotide that is taken from one organism and introduced into a different organism by transformation. The transgene may be derived from the same species or from a different species as the species of the organism into which the transgene is introduced.

An "inverted repeat" is a sequence that is repeated, where the second half of the repeat is in the complementary strand, e.g.,

(5')GATCTA TAGATC(3')

(3')CTAGAT ATCTAG(5')

Read-through transcription will produce a transcript that undergoes complementary base-pairing to form a hairpin structure provided that there is a 3-5 bp spacer between the repeated regions.
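The inverted repeat shown above can be verified computationally: the second arm is the reverse complement of the first. The following Python sketch is illustrative only; the function names are hypothetical, the spacer "ACGT" is an arbitrary example, and the 3 to 5 bp spacer test simply restates the condition given in the text.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def is_inverted_repeat(left_arm: str, right_arm: str) -> bool:
    """True if the right arm is the reverse complement of the left arm."""
    return reverse_complement(left_arm) == right_arm.upper()


def hairpin_candidate(left_arm: str, spacer: str, right_arm: str) -> bool:
    """Inverted repeat arms separated by a 3 to 5 bp spacer, as described in the text."""
    return is_inverted_repeat(left_arm, right_arm) and 3 <= len(spacer) <= 5


print(reverse_complement("GATCTA"))                   # TAGATC
print(hairpin_candidate("GATCTA", "ACGT", "TAGATC"))  # arbitrary 4 bp spacer
```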

A "transgenic plant" refers to a plant which contains new genetic material as a result of genetic manipulation or transformation. The new genetic material may be derived from a plant of the same species as the resulting transgenic plant or from a different species.

The terms "to alter expression of" and "altered expression" of a polynucleotide or polypeptide of the invention, are intended to encompass the situation where genomic DNA corresponding to a polynucleotide of the invention is modified thus leading to altered expression of a polynucleotide or polypeptide of the invention. Modification of the genomic DNA may be through genetic transformation or other methods known in the art for inducing mutations. The "altered expression" can be related to an increase or decrease in the amount of messenger RNA and/or polypeptide produced and may also result in altered activity of a polypeptide due to alterations in the sequence of a polynucleotide and polypeptide produced.

The invention provides methods for selecting and producing plants altered in diameter growth, relative to suitable control plants.

Suitable control plants may include non-transformed plants of the same species and variety, or plants of the same species or variety transformed with a control construct.

Methods for isolating polynucleotides

The polynucleotide molecules of the invention can be isolated by using a variety of techniques known to those of ordinary skill in the art. By way of example, such polynucleotides can be isolated through use of the polymerase chain reaction (PCR) described in Mullis et al., Eds. 1994 The Polymerase Chain Reaction, Birkhauser, incorporated herein by reference. The polynucleotides of the invention can be amplified using primers, as defined herein, derived from the polynucleotide sequences of the invention.

Further methods for isolating polynucleotides of the invention, or polynucleotides useful in methods of the invention, include use of all, or portions of, the polynucleotides set forth herein as hybridization probes. The technique of hybridizing labelled polynucleotide probes to polynucleotides immobilized on solid supports such as nitrocellulose filters or nylon membranes, can be used to screen the genomic or cDNA libraries. Exemplary hybridization and wash conditions are: hybridization for 20 hours at 65°C in 5.0 X SSC, 0.5% sodium dodecyl sulfate, 1 X Denhardt's solution; washing (three washes of twenty minutes each at 55°C) in 1.0 X SSC, 1% (w/v) sodium dodecyl sulfate, and optionally one wash (for twenty minutes) in 0.5 X SSC, 1% (w/v) sodium dodecyl sulfate, at 60°C. An optional further wash (for twenty minutes) can be conducted under conditions of 0.1 X SSC, 1% (w/v) sodium dodecyl sulfate, at 60°C.

The polynucleotide fragments of the invention may be produced by techniques well-known in the art such as restriction endonuclease digestion and oligonucleotide synthesis. A partial polynucleotide sequence may be used, in methods well-known in the art, to identify the corresponding full length polynucleotide sequence. Such methods include PCR-based methods, 5'RACE (Frohman MA, 1993, Methods Enzymol. 218: 340-56), hybridization-based methods and computer/database-based methods. Further, by way of example, inverse PCR permits acquisition of unknown sequences, flanking the polynucleotide sequences disclosed herein, starting with primers based on a known region (Triglia et al., 1998, Nucleic Acids Res 16, 8186, incorporated herein by reference). The method uses several restriction enzymes to generate a suitable fragment in the known region of a gene. The fragment is then circularized by intramolecular ligation and used as a PCR template. Divergent primers are designed from the known region. In order to physically assemble full-length clones, standard molecular biology approaches can be utilized (Sambrook et al., Molecular Cloning: A Laboratory Manual, 2nd Ed. Cold Spring Harbor Press, 1987).

It may be beneficial, when producing a transgenic plant from a particular species, to transform such a plant with a sequence or sequences derived from that species. The benefit may be to alleviate public concerns regarding cross-species transformation in generating transgenic organisms. Additionally when down-regulation of a gene is the desired result, it may be necessary to utilise a sequence identical (or at least highly similar) to that in the plant, for which reduced expression is desired. For these reasons among others, it is desirable to be able to identify and isolate orthologues of a particular gene in several different plant species. Variants (including orthologues) may be identified by the methods described.

Methods for identifying variants

Physical methods

Variant polynucleotides may be identified using PCR-based methods (Mullis et al, Eds. 1994 The Polymerase Chain Reaction, Birkhauser). Typically, the polynucleotide sequence of a primer, useful to amplify variant polynucleotide molecules by PCR, may be based on a sequence encoding a conserved region of the corresponding amino acid sequence.

Alternatively, library screening methods well known to those skilled in the art (Sambrook et al., Molecular Cloning: A Laboratory Manual, 2nd Ed. Cold Spring Harbor Press, 1987) may be employed. When identifying variants of the probe sequence, hybridisation and/or wash stringency conditions will typically be reduced relative to when exact sequence matches are sought.

Polypeptide variants of the invention may be identified by physical methods, for example by screening expression libraries using antibodies raised against polypeptides of the invention (Sambrook et al, Molecular Cloning: A Laboratory Manual, 2nd Ed. Cold Spring Harbor Press, 1987) or by identifying polypeptides from natural sources with the aid of such antibodies.

Computer based methods

The variant sequences of the invention, including both polynucleotide and polypeptide variants, may also be identified by computer-based methods well-known to those skilled in the art, using public domain sequence alignment algorithms and sequence similarity search tools to search sequence databases (public domain databases include Genbank, EMBL, Swiss-Prot, PIR and others). See, e.g., Nucleic Acids Res. 29: 1-10 and 11-16, 2001 for examples of online resources. Similarity searches retrieve and align target sequences for comparison with a sequence to be analyzed (i.e., a query sequence). Sequence comparison algorithms use scoring matrices to assign an overall score to each of the alignments.

An exemplary family of programs useful for identifying variants in sequence databases is the BLAST suite of programs (version 2.2.5 [Nov 2002]) including BLASTN, BLASTP, BLASTX, tBLASTN and tBLASTX, which are publicly available from (ftp://ftp.ncbi.nih.gov/blast/) or from the National Center for Biotechnology Information (NCBI), National Library of Medicine, Building 38 A, Room 8N805, Bethesda, MD 20894 USA. The NCBI server also provides the facility to use the programs to screen a number of publicly available sequence databases. BLASTN compares a nucleotide query sequence against a nucleotide sequence database. BLASTP compares an amino acid query sequence against a protein sequence database. BLASTX compares a nucleotide query sequence translated in all reading frames against a protein sequence database. tBLASTN compares a protein query sequence against a nucleotide sequence database dynamically translated in all reading frames. tBLASTX compares the six-frame translations of a nucleotide query sequence against the six-frame translations of a nucleotide sequence database. The BLAST programs may be used with default parameters or the parameters may be altered as required to refine the screen.
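The five programs described above differ only in the type of query, the type of database, and which side is translated. As a hedged summary, the Python sketch below records those combinations in a lookup table; the data structure and the function name candidate_programs are hypothetical conveniences and are not part of the BLAST suite.

```python
from typing import List, NamedTuple


class BlastProgram(NamedTuple):
    query: str                 # type of query sequence supplied
    database: str              # type of sequence held in the database
    query_translated: bool     # query translated in all reading frames
    database_translated: bool  # database translated in all reading frames


PROGRAMS = {
    "blastn": BlastProgram("nucleotide", "nucleotide", False, False),
    "blastp": BlastProgram("protein", "protein", False, False),
    "blastx": BlastProgram("nucleotide", "protein", True, False),
    "tblastn": BlastProgram("protein", "nucleotide", False, True),
    "tblastx": BlastProgram("nucleotide", "nucleotide", True, True),
}


def candidate_programs(query: str, database: str) -> List[str]:
    """List the programs that accept the given query and database types."""
    return [
        name
        for name, program in PROGRAMS.items()
        if program.query == query and program.database == database
    ]


print(candidate_programs("nucleotide", "nucleotide"))
print(candidate_programs("protein", "nucleotide"))
```

Where both the query and the database are nucleotide sequences, the choice between a direct comparison and a translated comparison depends on whether similarity is sought at the nucleotide level or at the level of the encoded polypeptide.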

The use of the BLAST family of algorithms, including BLASTN, BLASTP, and BLASTX, is described in the publication of Altschul et al., Nucleic Acids Res. 25: 3389-3402, 1997.

The "hits" to one or more database sequences by a queried sequence produced by BLASTN, BLASTP, BLASTX, tBLASTN, tBLASTX, or a similar algorithm, align and identify similar portions of sequences. The hits are arranged in order of the degree of similarity and the length of sequence overlap. Hits to a database sequence generally represent an overlap over only a fraction of the sequence length of the queried sequence.

The BLASTN, BLASTP, BLASTX, tBLASTN and tBLASTX algorithms also produce "Expect" values for alignments. The Expect value (E) indicates the number of hits one can "expect" to see by chance when searching a database of the same size containing random contiguous sequences. The Expect value is used as a significance threshold for determining whether the hit to a database indicates true similarity. For example, an E value of 0.1 assigned to a polynucleotide hit is interpreted as meaning that in a database of the size of the database screened, one might expect to see 0.1 matches over the aligned portion of the sequence with a similar score simply by chance. For sequences having an E value of 0.01 or less over aligned and matched portions, the probability of finding a match by chance in that database is 1 % or less using the BLASTN, BLASTP, BLASTX, tBLASTN or tBLASTX algorithm.

Multiple sequence alignments of a group of related sequences can be carried out with CLUSTALW (Thompson, J.D., Higgins, D.G. and Gibson, T.J. (1994) CLUSTALW: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. Nucleic Acids Research, 22:4673-4680, http://www-igbmc.u-strasbg.fr/BioInfo/ClustalW/Top.html), T-COFFEE (Cedric Notredame, Desmond G. Higgins, Jaap Heringa, T-Coffee: A novel method for fast and accurate multiple sequence alignment, J. Mol. Biol. (2000) 302: 205-217) or PILEUP, which uses progressive, pairwise alignments (Feng and Doolittle, 1987, J. Mol. Evol. 25, 351). Pattern recognition software applications are available for finding motifs or signature sequences. For example, MEME (Multiple Em for Motif Elicitation) finds motifs and signature sequences in a set of sequences, and MAST (Motif Alignment and Search Tool) uses these motifs to identify similar or the same motifs in query sequences. The MAST results are provided as a series of alignments with appropriate statistical data and a visual overview of the motifs found. MEME and MAST were developed at the University of California, San Diego.

PROSITE (Bairoch and Bucher, 1994, Nucleic Acids Res. 22, 3583; Hofmann et al, 1999, Nucleic Acids Res. 27, 215) is a method of identifying the functions of uncharacterized proteins translated from genomic or cDNA sequences. The PROSITE database (www.expasy.org/prosite) contains biologically significant patterns and profiles and is designed so that it can be used with appropriate computational tools to assign a new sequence to a known family of proteins or to determine which known domain(s) are present in the sequence (Falquet et al, 2002, Nucleic Acids Res. 30, 235). Prosearch is a tool that can search SWISS-PROT and EMBL databases with a given sequence pattern or signature.

Methods for isolating polypeptides

The polypeptides of the invention, including variant polypeptides, may be prepared using peptide synthesis methods well known in the art such as direct peptide synthesis using solid phase techniques (e.g. Stewart et al, 1969, in Solid-Phase Peptide Synthesis, WH Freeman Co, San Francisco, California), or automated synthesis, for example using an Applied Biosystems 431A Peptide Synthesizer (Foster City, California). Mutated forms of the polypeptides may also be produced during such syntheses.

The polypeptides and variant polypeptides of the invention may also be purified from natural sources using a variety of techniques that are well known in the art (e.g. Deutscher, 1990, Ed, Methods in Enzymology, Vol. 182, Guide to Protein Purification).

Alternatively the polypeptides and variant polypeptides of the invention may be expressed recombinantly in suitable host cells and separated from the cells as discussed below.

Methods for producing constructs and vectors

The genetic constructs of the present invention comprise one or more polynucleotide sequences of the invention and/or polynucleotides encoding polypeptides of the invention, and may be useful for transforming, for example, bacterial, fungal, insect, mammalian or plant organisms. The genetic constructs of the invention are intended to include expression constructs as herein defined.

Methods for producing and using genetic constructs and vectors are well known in the art and are described generally in Sambrook et al., Molecular Cloning: A Laboratory Manual, 2nd Ed. Cold Spring Harbor Press, 1987; and Ausubel et al., Current Protocols in Molecular Biology, Greene Publishing, 1987.

Methods for producing host cells comprising constructs and vectors

The invention provides a host cell which comprises a genetic construct or vector of the invention. Host cells may be derived from, for example, bacterial, fungal, insect, mammalian or plant organisms.

Host cells comprising genetic constructs, such as expression constructs, of the invention are useful in methods well known in the art (e.g. Sambrook et al., Molecular Cloning: A Laboratory Manual, 2nd Ed. Cold Spring Harbor Press, 1987; Ausubel et al., Current Protocols in Molecular Biology, Greene Publishing, 1987) for recombinant production of polypeptides of the invention. Such methods may involve the culture of host cells in an appropriate medium in conditions suitable for or conducive to expression of a polypeptide of the invention. The expressed recombinant polypeptide, which may optionally be secreted into the culture, may then be separated from the host cells or culture medium by methods well known in the art (e.g. Deutscher, Ed, 1990, Methods in Enzymology, Vol. 182, Guide to Protein Purification).

Host cells of the invention may also be useful in methods for production of an enzymatic product generated by an expressed polypeptide of the invention. Such methods may involve culturing the host cells of the invention in a medium suitable for expression of a recombinant polypeptide of the invention, optionally in the presence of additional enzymatic substrate for the expressed polypeptide of the invention. The enzymatic product produced may then be separated from the host cells or medium by a variety of art standard methods.

Methods for producing plant cells and plants comprising constructs and vectors

The invention further provides plant cells.

Production of these plants with altered diameter growth may be achieved through methods of the invention. Such methods may involve the transformation of these plant cells and plants with a construct designed to alter expression of a polynucleotide or polypeptide capable of modulating diameter growth in such plant cells and plants. Such methods also include the transformation of plant cells and plants with a combination of constructs designed to alter expression of one or more polynucleotides or polypeptides capable of modulating diameter growth in such plant cells and plants.

Methods for transforming plant cells, plants and portions thereof with polynucleotides are described in Draper et al, 1988, Plant Genetic Transformation and Gene Expression. A Laboratory Manual. Blackwell Sci. Pub. Oxford, p. 365; Potrykus and Spangenberg, 1995, Gene Transfer to Plants. Springer-Verlag, Berlin; and Gelvin et al., 1993, Plant Molecular Biol. Manual. Kluwer Acad. Pub. Dordrecht. A review of transgenic plants, including transformation techniques, is provided in Galun and Breiman, 1997, Transgenic Plants. Imperial College Press, London.

Methods for genetic manipulation of plants

A number of strategies for genetically manipulating plants are available (e.g. Birch, 1997, Ann Rev Plant Phys Plant Mol Biol, 48, 297). For example, strategies may be designed to increase expression of a polynucleotide/polypeptide in a plant cell, tissue, organ and/or at a particular developmental stage where/when it is normally expressed, or to ectopically express a polynucleotide/polypeptide in a cell, tissue, organ and/or at a particular developmental stage where/when it is not normally expressed. The expressed polynucleotide/polypeptide may be derived from the plant species to be transformed or may be derived from a different plant species. Transformation strategies may also be designed to reduce expression of a polynucleotide/polypeptide in a plant cell, tissue, organ or at a particular developmental stage where/when it is normally expressed. Such strategies are known as gene silencing strategies.

Genetic constructs for expression of genes in transgenic plants typically include promoters for driving the expression of one or more cloned polynucleotides, terminators and selectable marker sequences to detect the presence of the genetic construct in the transformed plant.

The promoters suitable for use in the constructs of this invention are functional in a cell, tissue or organ of a monocot or dicot plant and include cell-, tissue- and organ-specific promoters, cell cycle specific promoters, temporal promoters, inducible promoters, constitutive promoters that are active in most plant tissues, and recombinant promoters. Choice of promoter will depend upon the temporal and spatial expression desired for the cloned polynucleotide. The promoters may be those normally associated with a transgene of interest, or promoters which are derived from genes of other plants, viruses, and plant pathogenic bacteria and fungi. Those skilled in the art will, without undue experimentation, be able to select promoters that are suitable for use in modifying and modulating plant traits using genetic constructs comprising the polynucleotide sequences of the invention. Examples of constitutive plant promoters include the CaMV 35S promoter, the nopaline synthase promoter, the octopine synthase promoter, and the Ubi 1 promoter from maize. Plant promoters which are active in specific tissues, or which respond to internal developmental signals or external abiotic or biotic stresses, are described in the scientific literature. Exemplary promoters are described, e.g., in WO 02/00894, which is herein incorporated by reference.

Exemplary terminators that are commonly used in plant transformation genetic constructs include, e.g., the cauliflower mosaic virus (CaMV) 35S terminator, the Agrobacterium tumefaciens nopaline synthase or octopine synthase terminators, the Zea mays zein gene terminator, the Oryza sativa ADP-glucose pyrophosphorylase terminator and the Solanum tuberosum PI-II terminator.

Selectable markers commonly used in plant transformation include the neomycin phosphotransferase II gene (NPT II), which confers kanamycin resistance; the aadA gene, which confers spectinomycin and streptomycin resistance; the phosphinothricin acetyl transferase (bar) gene for Ignite (AgrEvo) and Basta (Hoechst) resistance; and the hygromycin phosphotransferase gene (hpt) for hygromycin resistance.

Use of genetic constructs comprising reporter genes (coding sequences which express an activity that is foreign to the host, usually an enzymatic activity and/or a visible signal, e.g. luciferase, GUS or GFP), which may be used for promoter expression analysis in plants and plant tissues, is also contemplated. The reporter gene literature is reviewed in Herrera-Estrella et al., 1993, Nature 303, 209, and Schrott, 1995, In: Gene Transfer to Plants (Potrykus, T., Spangenberg, Eds), Springer Verlag, Berlin, pp. 325-336.

Gene silencing strategies may be focused on the gene itself or regulatory elements which effect expression of the encoded polypeptide. "Regulatory elements" is used here in the widest possible sense and includes other genes which interact with the gene of interest.

Genetic constructs designed to decrease or silence the expression of a polynucleotide/polypeptide of the invention may include an antisense copy of a polynucleotide of the invention. In such constructs the polynucleotide is placed in an antisense orientation with respect to the promoter and terminator.

An "antisense" polynucleotide is obtained by inverting a polynucleotide or a segment of the polynucleotide so that the transcript produced will be complementary to the mRNA transcript of the gene, e.g.,

5'GATCTA 3' (coding strand)
3'CTAGAT 5' (antisense strand)

5'GAUCUA 3' (mRNA)
3'CUAGAU 5' (antisense RNA)
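The relationship between the coding strand, the mRNA and an antisense transcript can be sketched in a few lines. This is an illustration only, using the six-base example sequence given above; the function names are hypothetical.

```python
# Complement table for unambiguous DNA bases.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of a DNA sequence, written 5' to 3'."""
    return dna.translate(_COMPLEMENT)[::-1]


def transcribe(dna: str) -> str:
    """Write a DNA sequence as RNA by replacing T with U."""
    return dna.replace("T", "U")


coding_strand = "GATCTA"                                       # 5' to 3'
mrna = transcribe(coding_strand)                               # same sense as the coding strand
antisense_rna = transcribe(reverse_complement(coding_strand))  # complementary to the mRNA

print("mRNA          5'-" + mrna + "-3'")
print("antisense RNA 5'-" + antisense_rna + "-3'")
```

Read in the 3' to 5' direction, the antisense RNA produced here is CUAGAU, which pairs base for base with the mRNA written 5' to 3'.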

Genetic constructs designed for gene silencing may also include an inverted repeat. An 'inverted repeat' is a sequence that is repeated where the second half of the repeat is in the complementary strand, e.g.,

5'-GATCTA TAGATC-3'

3'-CTAGAT ATCTAG-5'

The transcript formed may undergo complementary base pairing to form a hairpin structure. Usually a spacer of at least 3-5 bp between the repeated region is required to allow hairpin formation.
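An inverted repeat of this kind can be assembled and checked programmatically. The following is a minimal sketch assuming a simple arm-spacer-arm layout; the five-base spacer shown is a hypothetical placeholder, not a sequence taken from this specification.

```python
# Complement table for unambiguous DNA bases.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of a DNA sequence, written 5' to 3'."""
    return dna.translate(_COMPLEMENT)[::-1]


def is_inverted_repeat(left_arm: str, right_arm: str) -> bool:
    """True when the right arm is the reverse complement of the left arm,
    so that a transcript spanning both arms can fold back on itself."""
    return right_arm == reverse_complement(left_arm)


def build_inverted_repeat(arm: str, spacer: str) -> str:
    """Join an arm, a spacer and the arm's reverse complement (5' to 3')."""
    return arm + spacer + reverse_complement(arm)


print(is_inverted_repeat("GATCTA", "TAGATC"))
print(build_inverted_repeat("GATCTA", "TTCAA"))  # spacer is a placeholder
```

The spacer length is passed in explicitly so that the 3-5 bp minimum noted above can be varied when designing a construct.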

Another silencing approach involves the use of a small antisense RNA targeted to the transcript equivalent to an miRNA (Llave et al., 2002, Science 297, 2053). Use of such small antisense RNA corresponding to a polynucleotide of the invention is expressly contemplated.

The term genetic construct as used herein also includes small antisense RNAs and other such polynucleotides useful for effecting gene silencing.

Transformation with an expression construct, as herein defined, may also result in gene silencing through a process known as sense suppression (e.g. Napoli et al., 1990, Plant Cell 2, 279; de Carvalho Niebel et al., 1995, Plant Cell, 7, 347). In some cases sense suppression may involve over-expression of the whole or a partial coding sequence but may also involve expression of a non-coding region of the gene, such as an intron or a 5' or 3' untranslated region (UTR). Chimeric partial sense constructs can be used to coordinately silence multiple genes (Abbott et al., 2002, Plant Physiol. 128(3): 844-53; Jones et al., 1998, Planta 204: 499-505). The use of such sense suppression strategies to silence the expression of a polynucleotide of the invention is also contemplated.

The polynucleotide inserts in genetic constructs designed for gene silencing may correspond to coding sequence and/or non-coding sequence, such as promoter and/or intron and/or 5' or 3' UTR sequence, of the corresponding gene.

Other gene silencing strategies include dominant negative approaches and the use of ribozyme constructs (McIntyre, 1996, Transgenic Res, 5, 257).

Pre-transcriptional silencing may be brought about through mutation of the gene itself or its regulatory elements. Such mutations may include point mutations, frameshifts, insertions, deletions and substitutions.

The following are representative publications disclosing genetic transformation protocols that can be used to genetically transform the following plant species: rice (Alam et al, 1999, Plant Cell Rep. 18, 572); maize (US Patent Serial Nos. 5,177,010 and 5,981,840); wheat (Ortiz et al, 1996, Plant Cell Rep. 15, 877); tomato (US Patent Serial No. 5,159,135); potato (Kumar et al, 1996, Plant J. 9: 821); cassava (Li et al., 1996, Nat. Biotechnology 14, 736); lettuce (Michelmore et al, 1987, Plant Cell Rep. 6, 439); tobacco (Horsch et al, 1985, Science 227, 1229); cotton (US Patent Serial Nos. 5,846,797 and 5,004,863); grasses (US Patent Nos. 5,187,073 and 6,020,539); peppermint (Niu et al, 1998, Plant Cell Rep. 17, 165); citrus plants (Pena et al, 1995, Plant Sci. 104, 183); caraway (Krens et al, 1997, Plant Cell Rep. 17, 39); banana (US Patent Serial No. 5,792,935); soybean (US Patent Nos. 5,416,011; 5,569,834; 5,824,877; 5,563,04455 and 5,968,830); pineapple (US Patent Serial No. 5,952,543); poplar (US Patent No. 4,795,855); monocots in general (US Patent Nos. 5,591,616 and 6,037,522); brassica (US Patent Nos. 5,188,958; 5,463,174 and 5,750,871); cereals (US Patent No. 6,074,877); gymnosperm tree species and pine species (Henderson, A. R. and C. Walter (2006) Genetic Engineering in Conifer Plantation Forestry, Silvae Genetica 55 (6): 253-262). Other species are contemplated and suitable methods and protocols are available in the scientific literature for use by those skilled in the art.

Several further methods known in the art may be employed to alter expression of a nucleotide and/or polypeptide of the invention. Such methods include but are not limited to TILLING (Till et al, 2003, Methods Mol Biol, 236, 205), so-called "Deletagene" technology (Li et al, 2001, Plant Journal 27(3), 235) and the use of artificial transcription factors such as synthetic zinc finger transcription factors (e.g. Jouvenot et al, 2003, Gene Therapy 10, 513). Additionally, antibodies or fragments thereof targeted to a particular polypeptide may also be expressed in plants to modulate the activity of that polypeptide (Jobling et al, 2003, Nat. Biotechnol., 21(1), 35). Transposon tagging approaches may also be applied. Additionally, peptides interacting with a polypeptide of the invention may be identified through technologies such as phage-display (Dyax Corporation). Such interacting peptides may be expressed in or applied to a plant to affect activity of a polypeptide of the invention. Use of each of the above approaches in alteration of expression of a nucleotide and/or polypeptide of the invention is specifically contemplated.

Methods for selecting plants

Methods are also provided for selecting plants altered in diameter growth. Such methods involve testing plants for altered expression of a polynucleotide or polypeptide of the invention. Such methods may be applied at a young age or early developmental stage, when the alteration of diameter growth may not necessarily be measurable, to accelerate breeding programs.

The expression of a polynucleotide, such as a messenger RNA, is often used as an indicator of expression of a corresponding polypeptide. Exemplary methods for measuring the expression of a polynucleotide include but are not limited to Northern analysis, RT-PCR and dot-blot analysis (Sambrook et al., Molecular Cloning: A Laboratory Manual, 2nd Ed. Cold Spring Harbor Press, 1987). Polynucleotides or portions of the polynucleotides of the invention are thus useful as probes or primers, as herein defined, in methods for the identification of plants with altered diameter growth. The polynucleotides of the invention may be used as probes in hybridization experiments, or as primers in PCR-based experiments, designed to identify such plants.

Alternatively antibodies may be raised against polypeptides of the invention. Methods for raising and using antibodies are standard in the art (see for example: Antibodies, A Laboratory Manual, Harlow & Lane, Eds, Cold Spring Harbor Laboratory, 1998). Such antibodies may be used in methods to detect altered expression of polypeptides which modulate diameter growth in plants. Such methods may include ELISA (Kemeny, 1991, A Practical Guide to ELISA, NY Pergamon Press) and Western analysis (Towbin & Gordon, 1994, J Immunol Methods, 72, 313).

These approaches for analysis of polynucleotide or polypeptide expression and the selection of plants with altered expression are useful in conventional breeding programs designed to produce varieties altered in diameter growth.

Plants

The plants of the invention may be grown and either selfed or crossed with a different plant strain and the resulting hybrids, with the desired phenotypic characteristics, may be identified. Two or more generations may be grown to ensure that the subject phenotypic characteristics are stably maintained and inherited. Plants resulting from such standard breeding approaches also form an aspect of the present invention.

This invention may also be said broadly to consist in the parts, elements and features referred to or indicated in the specification of the application, individually or collectively, and any or all combinations of any two or more said parts, elements or features, and where specific integers are mentioned herein which have known equivalents in the art to which this invention relates, such known equivalents are deemed to be incorporated herein as if individually set forth.

Exemplary methods for assessing alteration in diameter growth in plants of the invention are provided in the Example section below.

BRIEF DESCRIPTION OF THE DRAWINGS

Figure 1 shows an alignment of the amino acid sequences of 34 dehydrin alleles from three pine species. Amino acids in alleles PPH2-PTH11 that are identical to those of the top sequence (PPH1) are indicated by dots, the letters indicate amino acid changes, and alignment gaps are indicated by dashes. The translated protein sequence contained an eight-amino-acid, serine-repeat motif and three repeated K-like segments (KIKEK(I/L)PGH) and thus could be classified as acidic SK₃-type dehydrins (Campbell and Close 1997). The dehydrin S-segment and K-segments are shown. "T" indicates sites (i.e. [S/T]x₂[D/E]) phosphorylated by CKII (i.e. S/T is phosphorylated) (Pinna 2002). Radical amino acid changes are indicated by "*". The CKII-hydrophilic charged motif (SₙD[E/D]CG[G/V]KEEKK (SEQ ID NO:69), where n=2-3) is highlighted in grey. It is similar to the S-segment, which is underlined, along with another similar motif. All three motifs are followed by a K-segment. Serine insertions that alter the CKII-hydrophilic charged motif are highlighted with grey shading and introduce an extra CKII phosphorylation site in P. taeda (PTH1, 2 and 4) and P. radiata (PRH10-12) alleles. PP = Pinus pinaster, PR = Pinus radiata, PT = Pinus taeda. The numbers indicate different alleles.
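The effect of an additional serine on the number of CKII consensus sites can be illustrated with regular expressions. This is a sketch only: the two peptide strings below are hypothetical instances of the motif written out for n=2 and n=3, not sequences copied from the figures, and the pattern names are illustrative.

```python
import re

# CKII consensus [S/T]-x-x-[D/E]; the lookahead lets overlapping sites be counted.
CKII_SITE = re.compile(r"(?=[ST]..[DE])")

# CKII-hydrophilic charged motif: S(n) D [E/D] C G [G/V] K E E K K, with n = 2-3.
CHARGED_MOTIF = re.compile(r"S{2,3}D[ED]CG[GV]KEEKK")


def count_ckii_sites(peptide: str) -> int:
    """Count positions where a Ser/Thr is followed two residues later by Asp/Glu."""
    return len(CKII_SITE.findall(peptide))


# Hypothetical motif instances (assumptions for illustration).
two_serines = "SSDECGGKEEKK"     # n = 2
three_serines = "SSSDECGGKEEKK"  # n = 3, one additional serine

for name, peptide in (("n=2", two_serines), ("n=3", three_serines)):
    matches = bool(CHARGED_MOTIF.fullmatch(peptide))
    print(name, "matches motif:", matches, "| CKII sites:", count_ckii_sites(peptide))
```

In this toy case the n=3 form carries one more consensus site than the n=2 form, which is the kind of change described for the serine insertion.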

Figure 2 shows an alignment of the polynucleotide coding sequences of the same dehydrin alleles aligned in Figure 1. The inserted codon (TCT) at nucleotide positions 484-486 in P. taeda (PTH1, 2 and 4) and P. radiata (PRH10-12) alleles, which encodes the additional serine at amino acid position 162 in the corresponding polypeptides (as shown in Figure 1), is highlighted with grey shading. PP = Pinus pinaster, PR = Pinus radiata, PT = Pinus taeda. The numbers indicate different alleles.
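The correspondence between nucleotide positions 484-486 and amino acid position 162 follows from simple codon arithmetic, sketched below. The sketch assumes 1-based numbering that starts at the first base of the coding sequence; the helper names are hypothetical.

```python
# The six standard codons for serine (standard genetic code).
SERINE_CODONS = {"TCT", "TCC", "TCA", "TCG", "AGT", "AGC"}


def codon_number(nucleotide_position: int) -> int:
    """Map a 1-based coding-sequence position to its 1-based codon number."""
    return (nucleotide_position - 1) // 3 + 1


def codon_span(codon: int) -> tuple:
    """Return the first and last 1-based nucleotide positions of a codon."""
    return (3 * codon - 2, 3 * codon)


print([codon_number(p) for p in (484, 485, 486)])
print(codon_span(162))
print("TCT" in SERINE_CODONS)
```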

Figure 3 shows the associations between PrDhn1 and radial growth rate (i.e. diameter growth). The graph shows the average radial growth rate of trees with nil, one, or two copies of the S+ dehydrin allele.

Figure 4 shows a phylogenetic tree of the dhn-1 locus generated using the Neighbor-Joining method and 10,000 bootstrap replicates. Comparisons were made across 1351 bp (from -511 to +834 bp) in three pine species, P. pinaster, P. radiata, and P. taeda. Numbers below each branch denote the average pairwise nucleotide difference between adjoining branches. Numbers above the branch at each node denote the bootstrap confidence that the branch order is correct (values > 50% are shown). Each tree is rooted on the mid-point. Only SNPs were compared, so haplotypes which differ by an insertion/deletion event appear on the same branch on the tree.

EXAMPLES

The invention will now be illustrated with reference to the following non-limiting examples.

Example 1: Demonstration of linkage of a homozygous pair of dehydrin alleles in trees to radial/diameter growth rate

Summary

P. radiata D. Don is grown commercially for wood and pulp throughout NZ, Australia, and Chile. The species also occurs naturally in five discrete populations, three on the west coast of California (USA) and two on islands off the Baja California Peninsula; namely Monterey, Año Nuevo, Cambria, Cedros Island and Guadalupe Island. In New Zealand, P. radiata populations are at most one or two generations of selection beyond their wild ancestors, and are thought to be a 60:40 mix of Año Nuevo and Monterey alleles (Burdon et al. 1992). Association tests showed linkage between alleles of the dehydrin gene and radial/diameter growth rate in P. radiata. In the NZ population, trees with two copies of the S+ allele had significantly higher radial growth rates when compared to heterozygous S+/S- trees, or homozygous S- trees. In pine species recombination rates are high and linkage disequilibrium (LD) decays rapidly (i.e. to an R² < 0.25 over a 2 kb region) (Brown et al. 2004; Gonzalez-Martinez et al. 2006); thus if a SNP(s) within a gene is found to be associated with a trait of interest in pine, the gene/locus is likely to be the causative agent.

Materials and methods

Plant material

For sequence analysis, tissue was collected from P. radiata D. Don (235 chromosomes), P. taeda L. (281 chromosomes), and P. pinaster Ait. (73 chromosomes) and DNA was extracted from diploid or haploid tissue as described previously (Plomion et al. 1995; Cato et al. 2001; Xu et al. 2008). The P. radiata trees were collected from the New Zealand breeding population. P. pinaster trees were sourced from the Aquitaine region in France (Plomion et al. 1995), and P. taeda trees were collected from natural populations throughout southeast USA (Xu et al. 2008).

Needle tissue was also collected from the following eight Pinus species from the subgenus Pinus subsection Trifoliae (the number of trees assayed per species is given in brackets): P. attenuata (4), P. muricata (4), P. teocote (1), P. greggii (4), P. herrerae (2), P. devoniana (3), P. pseudostrobus (1), and P. contorta (3). Genomic DNA was extracted from needle tissue as described above.

For SNP analysis, needle tissue was collected from 1517 P. radiata trees in New Zealand (NZ) and genomic DNA extracted as described above. The trees were grown from seed collected from unimproved plantation forests throughout NZ during the 1960s.

Amplification and sequencing of the dehydrin gene

A 1300-1700 bp region of genomic DNA was amplified by PCR using Platinum Taq (Invitrogen) under conditions recommended by the manufacturer. The primer sequences are listed in Table 2. The forward primer, PrDhn1promF1, was used to amplify all alleles except PRH11 and PRH12 (which were amplified with PrDhn1promF3). Each forward primer was used in conjunction with the reverse primer, PrDhn1R1. PCR products were purified using a PEG precipitation protocol (Rosenthal et al. 1993) and sequenced directly using a dideoxy chain termination method with an automated sequencer. Sequence trace files were evaluated using SEQUENCHER V.4.4 (Gene Codes). For most amplicons, it was possible to evaluate both the forward and reverse sequences.

Table 2 PCR primer sequences

Name            Sequence (5' to 3')                       Position in PrDhn1 (bp)
PrDhn1promF1    TTCCGGAAACTTTGGTTTAAG (SEQ ID NO:72)      -492 to -472
PrDhn1promF3    CTCCAAGGTGTTCGTTGTGG (SEQ ID NO:73)       -885 to -866
PrDhn1R1        TTTCTTTGCTCTTGTCCTTCG (SEQ ID NO:74)      +462 to +442

Analysis of polymorphisms at the dehydrin locus

In the coding region, five SNPs and two indels were assayed. A single multiplexed PCR was performed which amplified four SNPs (at bp positions +575, +616, +617, and +705) and two indels (at bp positions +594 and +675) in the coding region. PCR amplifications were carried out under standard conditions. Each SNP was assayed by a different forward primer and the following eight PCR primers were used: PrDhn1+575Fa, PrDhn1+575Fc, PrDhn1+616Ft, PrDhn1+617Fgc, PrDhn1+617Fgt, PrDhn1+705Fc, PrDhn1+705Ft, and PrDhn1R (Table 3). PCR products were diluted 100-fold in distilled water, and 1 μl of diluted PCR product was added to 0.01 μl of GS LIZ 500 size standard (Applied Biosystems) and 9.9 μl Hi-Di formamide (Applied Biosystems). PCR products were electrophoresed through POP4 polymer (Applied Biosystems) in a 36-cm capillary array on a 3100 DNA analyzer using standard electrophoretic conditions (Applied Biosystems). The electrophoresis data were analysed using GENESCAN ANALYSIS v3.7 and GENOTYPER v3.7 software (Applied Biosystems).

Table 3 PCR primer sequences

Name             Sequence (5' to 3')                                       Position in PrDhn1 (bp)
PrDhn1+575Fa     6FAM-CGGGACACCAGGAAAAACTA (SEQ ID NO:75)                  +556 to +575
PrDhn1+575Fc     HEX-CGGGACACCAGGAAAAACTC (SEQ ID NO:76)                   +556 to +575
PrDhn1+616Ft     HEX-CATTCTTCAGATGAGTGTGAGGT (SEQ ID NO:77)                +593 to +616
PrDhn1+617Fgc    HEX-TTCTTCAGATGAGTGTGGAGGC (SEQ ID NO:78)                 +596 to +617
PrDhn1+617Fgt    6FAM-TTCTTCAGATGAGTGTGGAGGT (SEQ ID NO:79)                +596 to +617
PrDhn1+705Fc     HEX-CCCTGGTGATGGAAAGTACC (SEQ ID NO:80)                   +686 to +705
PrDhn1+705Ft     6FAM-CTCTGGTGATGGAAAGCACT (SEQ ID NO:81)                  +686 to +705
PrDhn1+750Fc     6FAM-AGGAGAAGAAGTTGGGTATGC (SEQ ID NO:82)                 +730 to +750
PrDhn1+750Fg     HEX-AGGAGAAGAAGTTGGGTATGG (SEQ ID NO:83)                  +730 to +750
PrDhn1-463Fa     VIC-GCGTAGTAAAACATATTGACCTAACTA (SEQ ID NO:84)            -437 to -463
PrDhn1-463Fg     6FAM-GCGTAGTAAAACATATTGACCTAACTG (SEQ ID NO:85)           -437 to -463
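Several of the dye-labelled forward primers in Table 3 come in pairs that share the same position and carry different fluorescent labels. The sketch below checks, for four such pairs, whether the two primers are identical except for the final 3' base, which would be consistent with an allele-specific primer design. That interpretation is an assumption made for illustration and is not stated in the specification; the sequences are copied from Table 3 without the dye prefixes, and the function name is hypothetical.

```python
# Primer pairs from Table 3 (dye labels removed); keys are SNP positions.
PRIMER_PAIRS = {
    "+575": ("CGGGACACCAGGAAAAACTA", "CGGGACACCAGGAAAAACTC"),
    "+617": ("TTCTTCAGATGAGTGTGGAGGC", "TTCTTCAGATGAGTGTGGAGGT"),
    "+750": ("AGGAGAAGAAGTTGGGTATGC", "AGGAGAAGAAGTTGGGTATGG"),
    "-463": ("GCGTAGTAAAACATATTGACCTAACTA", "GCGTAGTAAAACATATTGACCTAACTG"),
}


def differs_only_at_3prime_base(primer_a: str, primer_b: str) -> bool:
    """True when two primers have equal length and differ only at the last base."""
    return (
        len(primer_a) == len(primer_b)
        and primer_a[:-1] == primer_b[:-1]
        and primer_a[-1] != primer_b[-1]
    )


for position, (first, second) in PRIMER_PAIRS.items():
    result = differs_only_at_3prime_base(first, second)
    print(position, first[-1] + "/" + second[-1], "3'-terminal difference only:", result)
```

With a design of this kind, each allele would yield a product carrying a different dye, so the two alleles could be distinguished by colour after capillary electrophoresis.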

A SNP at position +750 bp was assayed separately using PCR primers PrDhn1+750Fc, PrDhn1+750Fg, and PrDhn1R (Table 3) (i.e. each SNP was amplified with a different primer). Likewise, in the promoter region, one SNP (at -463 bp) and three indels (at -504, -634, and -692 bp) were amplified using PCR primers PrDhn1-463Fa, PrDhn1-463Fg, and PrDhn1promR (Table 3). All products were analysed on the 3100 DNA analyzer as described in the preceding paragraph.

Diameter growth measurements

In the NZ population, ring widths were measured from 5-mm cores (spanning the pith to the bark) by x-ray densitometry (Cown and Clement 1983). Each core was collected at breast height (1.4 m). For each individual, the sum of the ring widths from rings five to twelve was calculated for each core and used as a measure of diameter growth rate at breast height. A mixed model with fixed effects for experiment and treatment and random plot effects was fitted to the ring width data. The standardised ring widths were calculated as the standardised residuals of this model. Calculations were performed using the R language for data analysis and graphics, available at http://www.r-project.org/ (Ihaka and Gentleman 1996).
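The ring-width summary can be sketched as follows. This is a simplified illustration and not the analysis performed: a plain z-score is used as a stand-in for the standardised residuals of the mixed model, the ring widths shown are hypothetical values, and the function names are illustrative.

```python
from statistics import mean, stdev


def diameter_growth(ring_widths_mm: dict, first_ring: int = 5, last_ring: int = 12) -> float:
    """Sum the widths of rings first_ring..last_ring (inclusive) for one core.

    ring_widths_mm maps ring number (counted from the pith) to width in mm.
    """
    return sum(ring_widths_mm[ring] for ring in range(first_ring, last_ring + 1))


def standardise(values: list) -> list:
    """Centre and scale values to z-scores (simplified stand-in for the
    standardised residuals of a fitted mixed model)."""
    centre, spread = mean(values), stdev(values)
    return [(value - centre) / spread for value in values]


# Hypothetical cores: every ring of a core given the same width, for clarity.
cores = [{ring: width for ring in range(1, 16)} for width in (2.0, 3.0, 4.0)]
growth = [diameter_growth(core) for core in cores]
print(growth)
print(standardise(growth))
```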

Association Tests

Tests for an association between gene polymorphisms (individual SNPs or haplotypes) and ring widths were performed in POWERMARKER V3.25 which is available at http://statgen.ncsu.edu/powermarker/ using a single-locus F-test (Liu and Muse 2005).
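A single-locus F-test of this kind compares the variation in ring width between genotype classes with the variation within them. The sketch below computes a one-way analysis-of-variance F statistic from first principles; it is an illustration under stated assumptions, the genotype groups contain hypothetical toy values rather than data from the study, and POWERMARKER's own implementation may differ in detail.

```python
from statistics import mean


def one_way_f(groups: list) -> tuple:
    """Return (F, df_between, df_within) for a one-way analysis of variance.

    groups is a list of lists, one list of trait values per genotype class.
    """
    k = len(groups)
    n = sum(len(group) for group in groups)
    grand_mean = sum(sum(group) for group in groups) / n
    ss_between = sum(len(group) * (mean(group) - grand_mean) ** 2 for group in groups)
    ss_within = sum((x - mean(group)) ** 2 for group in groups for x in group)
    df_between, df_within = k - 1, n - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical standardised ring widths for three genotype classes.
toy_groups = [
    [1.0, 2.0, 3.0],  # S-/S-
    [2.0, 3.0, 4.0],  # S+/S-
    [5.0, 6.0, 7.0],  # S+/S+
]
f_stat, df_b, df_w = one_way_f(toy_groups)
print(f"F({df_b}, {df_w}) = {f_stat:.2f}")
```

Converting the F statistic to a p-value requires the F distribution, which is left to a statistics package and is not shown here.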

Evolutionary analysis

Consensus neighbor-joining and maximum parsimony trees were constructed based on the number of pairwise nucleotide differences between haplotypes using 10,000 bootstrap-generated multiple alignments (Kumar, Tamura, and Nei 2004). Insertions and deletions were excluded from the analysis.
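Counting pairwise nucleotide differences while excluding insertions and deletions can be sketched as below. The aligned haplotypes are hypothetical toy sequences and the function names are illustrative; tree building and bootstrapping are not shown.

```python
from itertools import combinations


def pairwise_differences(seq_a: str, seq_b: str, gap: str = "-") -> int:
    """Count mismatched columns between two aligned sequences,
    skipping any column in which either sequence has a gap."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(
        1
        for a, b in zip(seq_a, seq_b)
        if a != gap and b != gap and a != b
    )


def difference_matrix(haplotypes: dict) -> dict:
    """Return the pairwise difference count for every pair of haplotypes."""
    return {
        (first, second): pairwise_differences(haplotypes[first], haplotypes[second])
        for first, second in combinations(sorted(haplotypes), 2)
    }


# Hypothetical aligned haplotypes; hapB differs from hapA only by a deletion.
toy_haplotypes = {
    "hapA": "ACGTTCTA",
    "hapB": "ACGT---A",
    "hapC": "ACCTTCTG",
}
print(difference_matrix(toy_haplotypes))
```

Because gapped columns are skipped, two haplotypes that differ only by an insertion or deletion have a distance of zero in this scheme.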

Results

Molecular basis of polymorphisms

Association tests in two NZ populations of P. radiata

In order to test whether SNPs, or haplotypes, at the dehydrin locus were associated with growth rate variation in the NZ P. radiata population, association tests were performed between individual tree phenotypes and the dehydrin haplotypes. Trees with two copies of the S+ allele (i.e. trees that were homozygous for the serine insertion in the dehydrin gene) had significantly higher radial growth rates (an increase of 0.2 standard deviations from the mean; p-value = 0.009) than trees that were either heterozygous (i.e. S+/S-) or homozygous for the S- allele (i.e. S-/S-). In percentage terms, trees with two copies of the S+ allele had, on average, a 2.5% increase in diameter growth rate over trees that had only one or nil copies of this allele.

In P. radiata, this serine insertion was in LD with 17 other SNPs, and collectively, these polymorphisms differentiated PRH10-PRH12 from PRH1-PRH9. When all Pinus species from the section Trifoliae are considered, the serine insertion was only in LD with one other SNP (at position 540 within the dehydrin gene).

Detection of the S+ allele in other Pinus species

In total nine other species from the subgenus Pinus subsection Trifoliae were sequenced at the dehydrin locus. The nine species included P. taeda, P. attenuata, P. muricata, P. teocote, P. greggii, P. herrerae, P. devoniana, P. pseudostrobus, and P. contorta. The S+ allele was present in all nine of these species, providing strong evidence that it is found throughout the section Trifoliae.

Evolutionary analysis

Phylogenetic analysis of the dhn-1 locus from all three species in the coding DNA (834 bp) plus the promoter region (511 bp) revealed up to seven distinct phylogenetic branches, with up to 14 SNPs defining each branch (Figure 4).

The phylogenetic tree based on the dehydrin allele sequences, including the S+ or S- alleles of SEQ ID NO: 1 to 34 (polypeptide) and 35 to 68 (polynucleotide), shows the relatedness of the Pinus species, particularly the relatedness of P. radiata, P. taeda and P. sylvestris.

This evolutionary data, together with the identification of S+ alleles in several other Pinus species discussed above, indicates that the association between presence of the S+/S+ homozygous state and altered diameter growth phenotype is likely to hold across different Pinus species, particularly subsection Trifoliae species.

Discussion

Evidence has been presented for a recessive non-additive association between an insertion at the dehydrin locus and growth rate in pine trees, where trees homozygous for the S+ allele show an increase in radial/diameter growth rate. The results demonstrated that genetic variation at the dehydrin locus underpins changes in growth rate, a crucial trait in timber tree breeding.

References

Brown, G. R., G. P. Gill, R. J. Kuntz, C. H. Langley and D. B. Neale (2004). Nucleotide diversity and linkage disequilibrium in loblolly pine. Proc. Natl. Acad. Sci. USA 101: 15255-15260.

Burdon, R. D., J. A. Zabkiewicz and I. A. Andrew (1992). Genetic survey of Pinus radiata. 8: Population differences in monoterpene composition of cortical oleoresin. N.Z. J. For. Sci. 22: 257-273.

Campbell, S. A. and T. J. Close (1997). Dehydrins: genes, proteins, and associations with phenotypic traits. New Phytol. 137: 61-74.

Cato, S. A., R. C. Gardner, J. Kent and T. E. Richardson (2001). A rapid PCR-based method for genetically mapping ESTs. Theor. Appl. Genet. 102: 296-306.

Cown, D. J. and B. C. Clement (1983). A wood densitometer using direct scanning with X-rays. Wood Sci. Technol. 17: 91-99.

Gonzalez-Martinez, S. C., E. Ersoz, G. R. Brown, N. C. Wheeler and D. B. Neale (2006). DNA sequence variation and selection of tag single-nucleotide polymorphisms at candidate genes for drought-stress response in Pinus taeda L. Genetics 172: 1915-1926.

Ihaka, R. and R. Gentleman (1996). R: a language for data analysis and graphics. J. Comput. Graphical Stat. 5: 299-314.

Kumar, S., K. Tamura and M. Nei (2004). MEGA3: integrated software for molecular evolutionary genetics analysis and sequence alignment. Briefings in Bioinformatics 5: 150-163.

Liu, K. and S. V. Muse (2005). PowerMarker: an integrated analysis environment for genetic marker analysis. Bioinformatics 21: 2128-2129.

Pinna, L. A. (2002). Protein kinase CK2: A challenge to canons. J. Cell Sci. 115: 3873-3878.

Plomion, C., N. Bahrman, C. E. Durel and D. M. O'Malley (1995). Genomic mapping in Pinus pinaster (maritime pine) using RAPD and protein markers. Heredity 74: 661-668.

Rosenthal, A., O. Coutelle and M. Craxton (1993). Large-scale production of DNA sequencing templates by microtitre format PCR. Nucleic Acids Res. 21: 173-174.

Xu, S., C. Tauer and C. Nelson (2008). Genetic diversity within and among populations of shortleaf pine (Pinus echinata Mill.) and loblolly pine (Pinus taeda L.). Tree Genetics & Genomes in press.

The above Examples illustrate practice of the invention. It will be appreciated by those skilled in the art that numerous variations and modifications may be made without departing from the spirit and scope of the invention.

Summary of Sequences

Claims

1. A method for identifying a tree with a genotype indicative of increased diameter growth, the method including detecting in the tree, or a sample derived from the tree, by direct or indirect methods, the presence of an allele of the dehydrin gene that encodes a dehydrin protein including an additional serine (S) residue in the casein kinase II (CKII)-hydrophilic charged domain of the dehydrin gene, relative to more commonly identified alleles disclosed herein.

2. The method of claim 1, wherein the CKII-hydrophilic charged domain comprises the sequence motif: SnD[E/D]CG[G/V]KEEKK (SEQ ID NO:69), where n=2-3.

3. The method of claim 1 or 2, wherein two copies of the allele are detected, so the method detects the presence of the alleles in the homozygous state.

4. The method of any preceding claim, wherein three serine (S) residues are present at amino acid positions 161, 162 and 163 in the dehydrin protein.

5. The method of any preceding claim, wherein the dehydrin protein comprises a sequence with at least 70% identity to the polypeptide sequence of any one of SEQ ID NO:22-27.

6. The method of any preceding claim, wherein the dehydrin protein comprises the polypeptide sequence of any one of SEQ ID NO:22-27.

7. The method of any preceding claim, wherein the dehydrin allele detected comprises a sequence with at least 70% identity to the polynucleotide sequence of any one of SEQ ID NO: 56-61.

8. The method of any preceding claim, wherein the dehydrin allele detected contains codons encoding serine at nucleotide positions 481-483, 484-486 and 487-489.

9. The method of claim 8, wherein the codon at nucleotide position 484-486 encodes the additional serine.

10. The method of claim 8 or 9, wherein the codon at nucleotide position 484-486 is TCT.

11. The method of any preceding claim, wherein the dehydrin allele detected comprises the polynucleotide sequence of any one of SEQ ID NO: 56-61.

12. The method of any one of claims 3 to 11, wherein presence of the homozygous pair of alleles is in linkage disequilibrium (LD) with the increased diameter growth trait.

13. The method of any preceding claim, wherein presence of either or both of the alleles is detected indirectly by detecting a marker that is linked to the specified allele.

14. The method of claim 13, wherein the marker is in linkage disequilibrium (LD) with the allele.

15. The method of any one of claims 3 to 11, wherein the method comprises detection of: a) serine at amino acid position 163 of a dehydrin protein with at least 70% identity to the sequence of any one of SEQ ID NO: 22-27; b) an amino acid marker in linkage disequilibrium with the serine in a); c) the TCT codon at nucleotide position 484-486 in the dehydrin encoding polynucleotide with at least 70% identity to the sequence of any one of SEQ ID NO: 56-61; or d) a nucleotide marker in linkage disequilibrium with the TCT codon in c).

16. The method of claim 15, wherein the dehydrin protein in a) comprises the sequence of any one of SEQ ID NO:22-27.

17. The method of claim 15, wherein the dehydrin encoding polynucleotide in c) comprises the sequence of any one of SEQ ID NO: 56-61.

18. The method of any preceding claim, wherein the allele, or linked marker, is detected using a polymerase chain reaction (PCR) step.

19. The method of claim 18, wherein at least one primer comprising the sequence of any one of SEQ ID NO: 56 to 61 is used in the PCR step.

20. A method for selecting a tree with a genotype indicative of increased diameter growth, the method comprising selecting a tree identified by the method of any one of claims 1 to 19.

21. An isolated polynucleotide encoding a polypeptide with the sequence of any one of SEQ ID NO:22-27 or a variant thereof, wherein the variant is a polypeptide capable of increasing diameter growth when expressed in the homozygous state in a plant.

22. The isolated polynucleotide of claim 21, wherein the polypeptide has at least 70% identity to the sequence of any one of SEQ ID NO:22 to 27.

23. The isolated polynucleotide of claim 22, wherein the polypeptide includes serine (S) residues at amino acid positions 161, 162 and 163.

24. The isolated polynucleotide of any one of claims 21 to 23 wherein the polypeptide comprises the sequence of any one of SEQ ID NO:22 to 27.

25. An isolated polynucleotide comprising the sequence of any one of SEQ ID NO: 56-61 or a variant thereof, wherein the variant encodes a polypeptide capable of increasing diameter growth when expressed in the homozygous state in a plant.

26. The isolated polynucleotide of claim 25 comprising a sequence with at least 70% identity to the polynucleotide sequence of any one of SEQ ID NO: 56-61.

27. The isolated polynucleotide of claim 25 or 26 including a codon encoding serine at nucleotide positions 484-486.

28. The isolated polynucleotide of any one of claims 25 to 26 comprising the sequence of any one of SEQ ID NO: 56-61.

29. An isolated polypeptide with the sequence of any one of SEQ ID NO:22 to 27 or a variant thereof, wherein the variant is a polypeptide capable of increasing diameter growth when expressed in the homozygous state in a plant.

30. The isolated polypeptide of claim 29 comprising a sequence with at least 70% identity to the sequence of any one of SEQ ID NO:22 to 27.

31. The isolated polypeptide of claim 30 comprising the sequence of any one of SEQ ID NO:22 to 27.

32. A genetic construct comprising a polynucleotide of any one of claims 21 to 28.

33. The genetic construct of claim 32, that is an expression construct.

34. A host cell comprising a polynucleotide of any one of claims 21 to 28, or a genetic construct of claim 32 or 33.

35. A host cell genetically modified to express a polynucleotide of any one of claims 21 to 28.

36. The host cell of claim 34 or 35, that is a plant cell.

37. A plant comprising the plant cell of claim 36.

38. A method of producing a plant with increased diameter growth, the method comprising transformation of a plant with: a) a polynucleotide of any one of claims 21 to 28; b) a polynucleotide comprising a fragment, of at least 15 nucleotides in length, of the polynucleotide of a); or c) a polynucleotide comprising a complement of the polynucleotide of a) or b).

39. The method of claim 38, wherein the plant is transformed with a genetic construct or vector comprising the polynucleotide.

40. The method of claim 38 or 39 in which expression of a dehydrin allele encoding a dehydrin protein comprising only two serine (S) amino acids in the CKII-hydrophilic charged domain of SEQ ID NO: 69 is disrupted.

41. A plant produced by the method of any one of claims 38 to 40.

42. The plant of claim 37 or 41, wherein the plant is a tree plant.

43. A part, fruit, seed, harvested material, propagule or progeny of a plant of claim 41 or 42.

44. A part, fruit, seed, harvested material, propagule or progeny of claim 43, that is genetically modified to comprise at least one polynucleotide of any one of claims 21 to 28 or a genetic construct of claim 32 or 33.