WO2001088139A2

WO2001088139A2 - Proteins interacting with a sucrose transporter

Info

Publication number: WO2001088139A2
Application number: PCT/US2001/015315
Authority: WO
Inventors: Howard D. Grimes; Aaron M. Elmer; Kimberly A. Murphy
Original assignee: Washington State University Research Foundation
Priority date: 2000-05-12
Filing date: 2001-05-11
Publication date: 2001-11-22
Also published as: WO2001088139A3; AU2001259746A1

Abstract

Nucleic acid molecules encoding plant proteins that interact with a sucrose transporter are disclosed. These molecules may be introduced into plants in order to alter sugar transport and/or allocation within the plants.

Description

PROTEINS INTERACTING WITH A SUCROSE TRANSPORTER

STATEMENT REGARDING FEDERALLY SPONSORED RESEARCH

This work was supported at least in part by National Science Foundation Grants IBN-9218811 and IBN-9514410.

BACKGROUND

The allocation of carbon assimilates between tissues and organ systems is of fundamental importance in plant biology. In higher plants, carbon is assimilated via photosynthesis in source organs such as leaves. Assimilated carbon, most often in the form of sucrose, is actively imported into the phloem for long-distance transport and distribution to the different sink organs such as roots and developing seeds. Thus, the loading of sucrose into the phloem, its unloading from the phloem at the sinks, and its active transport into sink tissues controls many aspects of plant growth, development, and productivity (Zimmermann and Ziegler, in Encyclopedia of Plant Physiology New Series (eds. Zimmermann and Milburn), Springer, Berlin, 245-271, 1975).

Both the long-distance transport of sucrose and the transport of sucrose across the plasma membrane of individual cells rely on proton-motive-force-driven sucrose/H⁺ symporters (Bush, Plant Physiol. 93:1590-1596, 1990; Riesmeier et al., EMBO J. 11:4705-4713, 1992). Plant sucrose transporters (SUTs) have been cloned from several plant species (Riesmeier et al., EMBO J. 11:4705-4713, 1992; Gahrtz et al., Plant J. 6:697-706, 1994; Sauer and Stolz, Plant J. 6:67-77, 1994; Chiou and Bush, Proc. Natl. Acad. Sci. U.S.A. 95:4784-4788, 1998). These plant SUTs are electrogenic, secondary active-transport proteins that couple sucrose transport to the H⁺ electrochemical potential difference across the plant-cell plasma membrane (Riesmeier et al., EMBO J. 11:4705-4713, 1992; Bush, Annu. Rev. Plant Physiol. Plant Mol. Biol. 44:513-542, 1993). The SUT genes encode highly hydrophobic proteins consisting of twelve membrane-spanning domains characterized by the "6+6" topology, where the two sets of six transmembrane domains are separated by a cytoplasmic loop (Ward et al., Int. Rev. Cytol. 178:41-71, 1998; Rentsch et al., J. Membrane Biol. 162:177-190, 1998; Lalonde et al., Plant Cell 11:707-726, 1999). This cytosolic loop domain represents one of the few domains of the SUTs that are in contact with the cytoplasm and may thus be important in regulation or as a site of protein-protein interaction. In yeast, for instance, the sugar permease family contains 34 genes, of which 20 comprise the hexose transporter (HXT) subfamily (Andre, Yeast 11:1575-1611, 1995; Nelissen et al., FEMS Microbiol. Rev. 21:113-134, 1997). The expression of these HXT genes is sensitive to the concentration of glucose, which is sensed by the transporter homologs SNF3 and RGT2 (Ozcan et al., Proc. Natl. Acad. Sci. U.S.A. 93:12428-12432, 1996; Ozcan et al., EMBO J. 17:2566-2573, 1998). Distinct sites on the large cytoplasmic-signaling domain of these transporter homologs are involved in this sugar-sensing mechanism (Ozcan et al., Proc. Natl. Acad. Sci. U.S.A. 93:12428-12432, 1996).

Sucrose uptake in developing seeds affects two significant agricultural characteristics of the mature seed: the carbohydrate content of the resulting seed grain, and the vitality of the seedling that emerges when the seed grain is planted.
Enhanced sucrose uptake activity in developing seeds may be desirable where it is advantageous to increase the carbohydrate content of the seed (e.g., where the seed is the primary plant material harvested, such as soybean). In contrast, decreased sucrose uptake activity in seeds might be desirable where the vegetative material of the plant is harvested.

Despite advances in identifying and characterizing sucrose transporters of plants, very little is known of their regulation, transport to the plasma membrane, or maintenance in the plasma membrane. Regulation of sucrose transport and the physical location of SUTs in plasma membranes may depend on the interaction of SUTs with other proteins. It would be beneficial to identify putative interacting partners of plant SUTs in order to provide a focus for subsequent studies on the regulation and function of these transporters in vivo, and to permit modification of sucrose transport and/or carbon allocation in plants. Thus, plants having modified sucrose-uptake activity during, for instance, seed development would be of significant agricultural importance, and it is to such plants that the present invention is directed.

SUMMARY

This invention relates to the isolation of a new soybean SUT (GmSUT1), and its interaction with two new proteins (GMA1 and GMA2, disclosed herein) that contain ankyrin repeat motifs. The identification of these interacting proteins provides a fuller understanding of sucrose-transport processes in higher plants. Provision of the sequences encoding these proteins enables modification of sucrose transport and/or sugar allocation in plants.

A new soybean sucrose/H⁺ (proton) symporter (GmSUT1) that is expressed in developing cotyledons is disclosed herein. The yeast two-hybrid system was used to identify two proteins that interact with the cytosolic loop of GmSUT1. Both interacting proteins contain multiple ankyrin repeat motifs near their C-terminus. Because of this feature, these two proteins have been termed "Glycine max ankyrin-related" (GMA) proteins. The interaction indicated by the yeast two-hybrid screen was verified by demonstrating that both a GST-GMA1 fusion protein and purified, iodinated GMA1 were able to bind to membrane fractions containing GmSUT1. Deletion of three ankyrin repeats (GMA1Δ) abolishes interaction with GmSUT1 in the yeast two-hybrid system and results in a reduction of approximately 75% in membrane binding for both GST-GMA1Δ and purified ¹²⁵I-GMA1Δ. In contrast, deletion of all four ankyrin repeats restores interaction of the modified protein (GMAΔ4) with a GmSUT1 bait vector (Example 2, Table 3).

The cytosolic loop of GmSUT1 interacts with an ankyrin-related protein; this interaction is mediated at least in part by the ankyrin repeats. This finding opens several new possibilities concerning how plant sucrose/H⁺ symporters are regulated, transported within the cell, positioned at the cell surface, and potentially brought into proximity with other proteins involved in sucrose metabolism. With the provision herein of sequences encoding GmSUT1, GMA1 and GMA2, these processes can be manipulated.

Embodiments of this invention include purified proteins that have GMA protein biological activity (defined below). Particular examples of such proteins will have an amino acid sequence as shown in SEQ ID NO: 4 or SEQ ID NO: 6. Other examples of proteins according to this invention will have amino acid sequences that differ from those shown in SEQ ID NO: 4 or SEQ ID NO: 6 by one or more conservative amino acid substitutions, or which have at least 70%, or in other embodiments at least 90%, sequence identity to these sequences.

Embodiments of this invention also include purified GmSUT1 proteins. A particular example of such proteins has an amino acid sequence as shown in SEQ ID NO: 2. Other examples of proteins according to this invention will have amino acid sequences that differ from that shown in SEQ ID NO: 2 by one or more conservative amino acid substitutions, or which have at least 70%, or in other embodiments at least 90%, sequence identity to this sequence.

Further embodiments of the invention include nucleic acid molecules that encode these proteins, as well as vectors, cells, or plants that include such sequences through genetic engineering techniques.

Another embodiment of the invention is a method of modifying the level of expression of at least one GmSUT1 or GMA protein in a plant. In particular examples, this method involves expressing in the plant a recombinant genetic construct that includes a nucleic acid molecule with at least 30 consecutive nucleotides of the sequence shown in SEQ ID NO: 1, SEQ ID NO: 3 or SEQ ID NO: 5. In specific embodiments, this sequence will be operably linked to a promoter, in either the sense or antisense direction.

BRIEF DESCRIPTION OF THE FIGURES

FIG. 1 Hydropathy plot and multiple sequence alignment of plant sucrose/H⁺ symporters. (A) Kyte-Doolittle hydropathy plot of the deduced amino acid sequence of GmSUT1 reveals 12 putative membrane-spanning domains. (B) Multiple-sequence alignment of plant sucrose transporters from tobacco (ntsutla), potato (stsuctr), soybean (gmsut), Vicia faba (vfz93774), and tomato (sos21) showing the highly conserved transmembrane-spanning regions and the divergent N-terminus and cytosolic loop domains.

FIG. 2 Multiple sequence alignment of the deduced amino acid sequences of GMA1 and GMA2 with a region of the human erythrocyte ankyrin (RBC). Solid lines indicate ankyrin repeat motifs and the dashed line indicates a more diverged, fourth putative ankyrin repeat.

FIG. 3 RNA gel blot analysis of GMA and GmSUT mRNA levels during soybean cotyledon development. Total RNAs were extracted from five stages of developing cotyledons. The RNAs (10 μg/lane) were fractionated on a 1.0% agarose formaldehyde gel and blotted to a nitrocellulose membrane. Equal loading was verified by EtBr staining. The RNA blot was hybridized with a ³²P-labeled GmSUT1 or GMA1 probe. The sizes of GmSUT (2.2 kb) and GMA (1.5 kb) are indicated. The probes used for both GMA1 and GmSUT1 were full-length cDNAs and were thus incapable of distinguishing between GMA1 and GMA2, and between different soybean SUTs, respectively. Labeling of this figure reflects this.

FIG. 4 Interaction of 25 μg of GST, GST-GMA1 or GST-GMA1Δ with membranes (50 μg protein) isolated from soybean cotyledons. "Controls" were loaded with 1 μg of GST or fusion proteins (no membranes) and indicate the molecular weight of GST ("GST"), GST-GMA1 ("GMA1"), and GST-GMA1Δ ("GMA1Δ"). "Membranes" were loaded with 15 μg of protein per lane and show the interaction of GST-GMA1 with the membrane fraction ("GMA1"), the slight interaction of GST with the membrane fraction ("GST"), and the reduced binding of GST-GMA1Δ with the membrane fraction ("GMA1Δ").

FIG. 5 Interaction of ¹²⁵I-GMA1 (929 ng) or ¹²⁵I-GMA1Δ (887 ng) with membrane fractions (25 μg protein) washed in either Na₂CO₃ (pH 9.5) or potassium iodide (KI). Each experiment was performed three times and standard errors are shown. Data are expressed as % maximum and the maximum value was obtained in the KI-washed membranes incubated with ¹²⁵I-GMA1. "Comp" shows the level of radioactivity remaining with the membrane fraction when 1000 ng of unlabeled GST-GMA1 was included during incubation. "Control" represents the level of radioactivity associated with the membrane fraction after incubation of ¹²⁵I-GST-GMA1 with heat-denatured membrane fractions.

FIG. 6 Immunolocalization of GmSUT1 (labeled with secondary antibody conjugated to 15 nm gold particles) and GMA1 (labeled with secondary antibody conjugated to 5 nm gold particles) in soybean cotyledons. Key: CP = cytoplasm; CW = cell wall; V = vacuole.

SEQUENCE LISTING

The nucleic acid and amino acid sequences listed in the accompanying sequence listing are shown using standard letter abbreviations for nucleotide bases, and three-letter code for amino acids, as defined in 37 C.F.R. § 1.822. Only one strand of each nucleic acid sequence is shown, but the complementary strand is understood as included by any reference to the displayed strand. In the accompanying sequence listing:

SEQ ID NO: 1 is the nucleic acid and corresponding deduced amino acid sequence of the GmSUT1 cDNA.

SEQ ID NO: 2 is the deduced amino acid sequence of GmSUT1.

SEQ ID NO: 3 is the nucleic acid and corresponding deduced amino acid sequence of the GMA1 cDNA.

SEQ ID NO: 4 is the deduced amino acid sequence of GMA1.

SEQ ID NO: 5 is the nucleic acid and corresponding deduced amino acid sequence of the GMA2 cDNA.

SEQ ID NO: 6 is the deduced amino acid sequence of GMA2.

SEQ ID NOs: 7 through 10 are oligonucleotide primers useful in amplification of GMA encoding sequences.

SEQ ID NOs: 11 and 12 are oligonucleotide primers used to isolate GmSUT1.

DETAILED DESCRIPTION

I. Abbreviations

ANK: ankyrin

GMA1: Glycine max ankyrin-related protein 1

GMA1Δ: mutated Glycine max ankyrin-related protein 1

GMAΔ4: mutated Glycine max ankyrin-related protein, all four ANK regions removed

GMA2: Glycine max ankyrin-related protein 2

GmSUT1: sucrose/H⁺ symporter for Glycine max cotyledons

GST: glutathione-S-transferase

ORF: open reading frame

SUT(s): sucrose/H⁺ symporter(s)

II. Explanations of Terms

Unless otherwise noted, technical terms are used according to conventional usage. Definitions of common terms in molecular biology may be found in Benjamin Lewin, Genes VII, published by Oxford University Press, Inc., 2000 (ISBN 019879276X); Kendrew et al. (eds.), The Encyclopedia of Molecular Biology, published by Blackwell Science Ltd., 1994 (ISBN 0-632-02182-9); and Robert A. Meyers (ed.), Molecular Biology and Biotechnology: a Comprehensive Desk Reference, published by VCH Publishers, Inc., 1995 (ISBN 1-56081-569-8). In order to facilitate review of the various embodiments of the invention, the following explanations of terms are provided:

Amplification: Amplification of a nucleic acid molecule (e.g., a DNA or RNA molecule) refers to use of a technique that increases the number of copies of a nucleic acid molecule in a specimen. An example of amplification is the polymerase chain reaction, in which a biological sample collected from a subject is contacted with a pair of oligonucleotide primers, under conditions that allow for the hybridization of the primers to nucleic acid template in the sample. The primers are extended under suitable conditions, dissociated from the template, and then re-annealed, extended, and dissociated to amplify the number of copies of the nucleic acid. The product of amplification may be characterized by electrophoresis, restriction endonuclease cleavage patterns, oligonucleotide hybridization or ligation, and/or nucleic acid sequencing using standard techniques. Other examples of amplification include strand displacement amplification, as disclosed in U.S. Patent No. 5,744,311; transcription-free isothermal amplification, as disclosed in U.S. Patent No. 6,033,881; repair chain reaction amplification, as disclosed in WO 90/01069; ligase chain reaction amplification, as disclosed in EP-A-320 308; gap filling ligase chain reaction amplification, as disclosed in U.S. Patent No. 5,427,930; and NASBA™ RNA transcription-free amplification, as disclosed in U.S. Patent No. 6,025,134.
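By way of illustration only, the repeated anneal, extend, and dissociate cycle of the polymerase chain reaction can be modeled as a per-cycle multiplication of the template count. The sketch below assumes an idealized reaction in which each cycle multiplies the copy number by (1 + efficiency); the function name `ideal_pcr_copies` and the `efficiency` parameter are hypothetical and are not taken from this disclosure, and real reactions plateau well before this model predicts.

```python
def ideal_pcr_copies(initial_copies: float, cycles: int, efficiency: float = 1.0) -> float:
    """Estimate copy number after a number of PCR cycles.

    Assumption (not from the source text): every cycle multiplies the copy
    number by (1 + efficiency), where efficiency = 1.0 means perfect doubling.
    """
    if initial_copies < 0 or cycles < 0:
        raise ValueError("initial_copies and cycles must be non-negative")
    if not 0.0 <= efficiency <= 1.0:
        raise ValueError("efficiency must lie between 0.0 and 1.0")
    return initial_copies * (1.0 + efficiency) ** cycles


if __name__ == "__main__":
    # Compare perfect doubling with a hypothetical 90% per-cycle efficiency.
    for n_cycles in (10, 20, 30):
        perfect = ideal_pcr_copies(1, n_cycles)
        reduced = ideal_pcr_copies(1, n_cycles, efficiency=0.9)
        print(f"{n_cycles} cycles: {perfect:.3g} copies (ideal), {reduced:.3g} copies (90%)")
```

Under this simplified model a single template molecule yields 2^n copies after n cycles, which is why a modest number of cycles suffices to produce a detectable product.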

Antisense, Sense, and Antigene: Double-stranded DNA (dsDNA) has two strands, a 5' → 3' strand, referred to as the "plus strand," and a 3' → 5' strand (the reverse complement), referred to as the "minus strand." Because RNA polymerase adds nucleotides in a 5' → 3' direction, the minus strand of the DNA serves as the template for the RNA during transcription. Thus, the RNA formed will have a sequence complementary to the minus strand and identical to the plus strand (except that U is substituted for T). Antisense molecules are molecules that are specifically hybridizable or specifically complementary to either RNA or the plus strand of DNA. Sense molecules are molecules that are specifically hybridizable or specifically complementary to the minus strand of DNA. Antigene molecules are either antisense or sense molecules directed to a dsDNA target.
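The relationship among the plus strand, the minus strand, the RNA transcript, and an antisense molecule can be sketched in a few lines. The following is a minimal illustration only; the function names and the six-base example sequence are hypothetical, and all sequences are written 5' to 3'.

```python
# Translation table for Watson-Crick complements of DNA bases.
_DNA_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of a DNA sequence, written 5' to 3'."""
    return dna.translate(_DNA_COMPLEMENT)[::-1]


def transcribe(plus_strand: str) -> str:
    """Return the RNA identical to the plus strand, with U substituted for T."""
    return plus_strand.upper().replace("T", "U")


def antisense_rna(plus_strand: str) -> str:
    """Return an RNA fully complementary to the transcript of the plus strand."""
    return reverse_complement(plus_strand).upper().replace("T", "U")


if __name__ == "__main__":
    plus = "ATGGCC"                      # hypothetical plus-strand fragment
    print(reverse_complement(plus))      # minus strand, read 5' to 3'
    print(transcribe(plus))              # RNA transcript
    print(antisense_rna(plus))           # antisense RNA against that transcript
```

For the example fragment, the minus strand reads GGCCAT, the transcript reads AUGGCC, and the antisense RNA reads GGCCAU.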

Binding or stable binding: An oligonucleotide binds (e.g., stably binds) to a target nucleic acid if a sufficient amount of the oligonucleotide forms base pairs, or is hybridized to its target nucleic acid, to permit detection of that binding. Binding can be detected by either physical or functional properties of the target:oligonucleotide complex. Binding between a target and an oligonucleotide can be detected by any procedure known to one skilled in the art, including both functional and physical binding assays. Binding may be detected functionally by determining whether binding has an observable effect upon a biosynthetic process such as expression of a gene, DNA replication, transcription, translation and the like.

Physical methods of detecting the binding of complementary strands of DNA or RNA are well known in the art, and include such methods as DNAse I or chemical footprinting, gel shift and affinity cleavage assays, Northern blotting, dot blotting and light-absorption detection procedures. For example, one method that is widely used, because it is so simple and reliable, involves observing a change in light absorption of a solution containing an oligonucleotide (or an analog) and a target nucleic acid at 220 to 300 nm as the temperature is slowly increased. If the oligonucleotide or analog has bound to its target, there is a sudden increase in absorption at a characteristic temperature as the oligonucleotide (or analog) and target disassociate from each other, or melt. The binding between an oligomer and its target nucleic acid is frequently characterized by the temperature (T_m) at which 50% of the oligomer is melted from its target. A higher T_m means a stronger or more stable complex relative to a complex with a lower T_m.

cDNA (complementary DNA): A piece of DNA lacking internal, non-coding segments (introns) and transcriptional regulatory sequences. cDNA may also contain untranslated regions (UTRs) that are responsible for translational control in the corresponding RNA molecule. cDNA is usually synthesized in the laboratory by reverse transcription from messenger RNA extracted from cells.

Complementarity and percentage complementarity: Molecules with complementary nucleic acids form a stable duplex or triplex when the strands bind (hybridize) to each other by forming Watson-Crick, Hoogsteen or reverse Hoogsteen base pairs. Stable binding occurs when an oligonucleotide remains detectably bound to a target nucleic acid sequence under the required conditions.

Complementarity is the degree to which bases in one nucleic acid strand base pair with the bases in a second nucleic acid strand. Complementarity is conveniently described by percentage, i.e., the proportion of nucleotides that form base pairs between two strands or within a specific region or domain of two strands. For example, if 10 nucleotides of a 15-nucleotide oligonucleotide form base pairs with a targeted region of a DNA molecule, that oligonucleotide is said to have 66.67% complementarity to the region of DNA targeted.
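The percentage calculation described above can be illustrated with a short sketch. It assumes an ungapped, antiparallel alignment of two equal-length strands and counts only Watson-Crick pairs; the function name and the example sequences are hypothetical and are not part of the sequence listing.

```python
# Watson-Crick pairs, including A-U for RNA strands.
_WATSON_CRICK = {
    ("A", "T"), ("T", "A"), ("G", "C"), ("C", "G"), ("A", "U"), ("U", "A"),
}


def percent_complementarity(oligo: str, target: str) -> float:
    """Percentage of oligo bases that pair with an equal-length target region.

    Both sequences are written 5' to 3', so the oligo is compared against the
    target read in reverse (antiparallel alignment, no gaps).
    """
    if not oligo or len(oligo) != len(target):
        raise ValueError("sequences must be non-empty and of equal length")
    paired = sum(
        (a, b) in _WATSON_CRICK
        for a, b in zip(oligo.upper(), reversed(target.upper()))
    )
    return 100.0 * paired / len(oligo)


if __name__ == "__main__":
    target = "ACGTACGTACGTACG"           # hypothetical 15-nucleotide target region
    oligo = "GCATGGTACGTACGT"            # pairs at 10 of its 15 positions
    print(f"{percent_complementarity(oligo, target):.2f}%")
```

With 10 of 15 positions paired, the example reproduces the 66.67% figure given in the text.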

In the present invention, "sufficient complementarity" means that a sufficient number of base pairs exist between the oligonucleotide and the target sequence to achieve detectable binding, and in the case of the binding of an antigene molecule, to disrupt expression of gene products (such as GMA1 and/or GMA2). When expressed or measured by percentage of base pairs formed, the percentage complementarity that fulfills this goal can range from as little as about 50% complementarity to full (100%) complementarity. In general, sufficient complementarity is at least about 50%, for example about 75%, about 90% or 95%, and/or about 98% or even 100% complementarity. A thorough treatment of the qualitative and quantitative considerations involved in establishing binding conditions that allow one skilled in the art to design appropriate oligonucleotides for use under the desired conditions is provided by Beltz et al. (Methods Enzymol. 100:266-285, 1983), and by Sambrook et al. (eds.) (Molecular Cloning: A Laboratory Manual, 2nd ed., vol. 1-3, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, NY, 1989).

DNA (deoxyribonucleic acid): DNA is a long-chain polymer which comprises the genetic material of most living organisms (some viruses have genes comprising ribonucleic acid (RNA)). The repeating units in DNA polymers are four different nucleotides, each of which comprises one of the four bases, adenine, guanine, cytosine and thymine, bound to a deoxyribose sugar to which a phosphate group is attached. Triplets of nucleotides (referred to as codons) code for each amino acid in a polypeptide, or for a stop signal. The term codon is also used for the corresponding (and complementary) sequences of three nucleotides in the mRNA into which the DNA sequence is transcribed.

Unless otherwise specified, any reference to a DNA molecule is intended to include the reverse complement of that DNA molecule. Except where single-strandedness is required by the text herein, DNA molecules, though written to depict only a single strand, encompass both strands of a double-stranded DNA molecule. Thus, a reference to the nucleic acid molecule that encodes GMA1 or GMA2, or a fragment thereof, encompasses both the sense strand and its reverse complement. Thus, for instance, it is appropriate to generate probes or primers from the reverse complement sequence of the disclosed nucleic acid molecules.

Deletion: The removal of a sequence of DNA, the regions on either side of the removed sequence being joined together.

Genomic target sequence: A sequence of nucleotides located in a particular region in the human genome that corresponds to one or more specific genetic abnormalities, such as a nucleotide polymorphism, a deletion, or an amplification. The target can be for instance a coding sequence; it can also be the non-coding strand that corresponds to a coding sequence.

GMA protein biological activity: Glycine max ankyrin-related proteins (GMAs) are involved in protein-protein interaction at the plasma membrane, and in particular may associate or directly interact with sucrose/H⁺ transporter(s), such as the herein disclosed GmSUT1. GMA1, for instance, is believed to play an important role in establishing the physical location of GmSUT1 within the plasma membrane, and may also link this sucrose transporter to other proteins involved in sucrose metabolism via the plasma membrane-cytoskeleton network. GMA protein biological activity can be examined at least through membrane-association assays, as described below, wherein purified GMA protein (or a derivative or fragment thereof) is assayed for its ability to specifically associate with a cellular membrane fraction that contains one or more sucrose/H⁺ transporters (such as GmSUT1). A reduction in the amount of specific association of the GMA protein (or GMA-derived protein) is generally indicative of a loss of GMA protein biological activity. The ability of a putative GMA protein to compete with a known GMA protein (such as GMA1) for specific binding on a cellular membrane fraction is indicative of GMA protein biological activity in the putative GMA protein.

Alterations in the sequence of a GMA protein, or in the expression level (either over- or under-expression), temporal or spatial regulation of GMA expression, can be used to modify sucrose transport in plant cells. In some instances, this modification can be used to alter sugar allocation throughout the plant, either by directing greater sugar uptake to a storage site (increasing the reserves of that storage site) or by directing less sugar uptake to reduce storage and/or increase availability of the sugar for other plant tissues. By way of example only, increased GMA activity in a storage site tissue (such as a seed tissue) can provide increased uptake of sugar into the tissue (e.g., into the seeds), thereby providing a richer, more nutritious plant tissue. If this tissue is, for instance, seeds, this can provide a more nutritious seed (e.g., a more nutritious grain or bean or other agronomically important seed).

Glycine max ankyrin-related proteins (GMAs) were previously referred to as soybean ankyrin-related proteins (SARs); the name has been changed to follow gene-naming conventions more closely.

Hybridization: Oligonucleotides and their analogs hybridize by hydrogen bonding, which includes Watson-Crick, Hoogsteen or reversed Hoogsteen hydrogen bonding, between complementary bases. Generally, nucleic acid consists of nitrogenous bases that are either pyrimidines (cytosine (C), uracil (U), and thymine (T)) or purines (adenine (A) and guanine (G)). These nitrogenous bases form hydrogen bonds between a pyrimidine and a purine, and the bonding of the pyrimidine to the purine is referred to as "base pairing." More specifically, A will hydrogen bond to T or U, and G will bond to C. "Complementary" refers to the base pairing that occurs between two distinct nucleic acid sequences or two distinct regions of the same nucleic acid sequence. For example, a therapeutically effective oligonucleotide can be complementary to a GMA1 or GMA2 encoding mRNA, or a GMA1 or GMA2 encoding dsDNA.

"Specifically hybridizable" and "specifically complementary" are terms that indicate a sufficient degree of complementarity such that stable and specific binding occurs between the oligonucleotide (or its analog) and the DNA or RNA target. The oligonucleotide or oligonucleotide analog need not be 100% complementary to its target sequence to be specifically hybridizable. An oligonucleotide or analog is specifically hybridizable when binding of the oligonucleotide or analog to the target DNA or RNA molecule interferes with the normal function of the target DNA or RNA, and there is a sufficient degree of complementarity to avoid non-specific binding of the oligonucleotide or analog to non-target sequences under conditions where specific binding is desired, for example under physiological conditions in the case of in vivo assays or systems. Such binding is referred to as specific hybridization.

Hybridization conditions resulting in particular degrees of stringency will vary depending upon the nature of the hybridization method of choice and the composition and length of the hybridizing nucleic acid sequences. Generally, the temperature of hybridization and the ionic strength (especially the Na⁺ concentration) of the hybridization buffer will determine the stringency of hybridization, though wash times also influence stringency. Calculations regarding hybridization conditions required for attaining particular degrees of stringency are discussed by Sambrook et al. (eds.) (Molecular Cloning: A Laboratory Manual, 2nd ed., vol. 1-3, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, NY, 1989, chapters 9 and 11), incorporated herein by reference.

For purposes of the present invention, "stringent conditions" encompass conditions under which hybridization will only occur if there is less than 25% mismatch between the hybridization molecule and the target sequence. "Stringent conditions" may be broken down into particular levels of stringency for more precise definition. Thus, as used herein, "moderate stringency" conditions are those under which molecules with more than 25% sequence mismatch will not hybridize; conditions of "medium stringency" are those under which molecules with more than 15% mismatch will not hybridize, and conditions of "high stringency" are those under which sequences with more than 10% mismatch will not hybridize. Conditions of "very high stringency" are those under which sequences with more than 6% mismatch will not hybridize.

Isolated: An "isolated" biological component (such as a nucleic acid molecule, protein or organelle) has been substantially separated or purified away from other biological components in the cell of the organism in which the component naturally occurs, i.e., other chromosomal and extrachromosomal DNA and RNA, proteins and organelles. Nucleic acids and proteins that have been "isolated" include nucleic acids and proteins purified by standard purification methods. The term also embraces nucleic acids and proteins prepared by recombinant expression in a host cell as well as chemically synthesized nucleic acids.
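The mismatch thresholds recited above for moderate, medium, high, and very high stringency lend themselves to a simple lookup. The sketch below is illustrative only: the dictionary and function names are hypothetical, and the treatment of a mismatch exactly equal to a threshold as still hybridizing is an assumption drawn from the "more than" wording of the level definitions.

```python
from typing import Optional

# Maximum percentage mismatch tolerated at each stringency level, as recited
# in the definitions of stringent conditions.
STRINGENCY_MAX_MISMATCH = {
    "moderate": 25.0,
    "medium": 15.0,
    "high": 10.0,
    "very high": 6.0,
}


def strictest_stringency_tolerated(mismatch_percent: float) -> Optional[str]:
    """Return the most stringent level at which hybridization is still expected.

    Returns None when the mismatch exceeds even the moderate-stringency limit.
    Assumption: a mismatch equal to a threshold is treated as tolerated.
    """
    if not 0.0 <= mismatch_percent <= 100.0:
        raise ValueError("mismatch_percent must lie between 0 and 100")
    # Check levels from the smallest tolerated mismatch to the largest.
    for level, limit in sorted(STRINGENCY_MAX_MISMATCH.items(), key=lambda kv: kv[1]):
        if mismatch_percent <= limit:
            return level
    return None


if __name__ == "__main__":
    for mismatch in (5.0, 8.0, 12.0, 20.0, 30.0):
        print(mismatch, strictest_stringency_tolerated(mismatch))
```

A pair of sequences with 8% mismatch, for example, would be expected to hybridize under high stringency but not under very high stringency.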

Nucleotide: "Nucleotide" includes, but is not limited to, a monomer that includes a base linked to a sugar, such as a pyrimidine, purine or synthetic analogs thereof, or a base linked to an amino acid, as in a peptide nucleic acid (PNA). A nucleotide is one monomer in a polynucleotide. A nucleotide sequence refers to the sequence of bases in a polynucleotide.

Oligonucleotide: An oligonucleotide is a plurality of nucleotides joined by native phosphodiester bonds, between about 6 and about 300 nucleotides in length. An oligonucleotide analog refers to moieties that function similarly to oligonucleotides but have non-naturally occurring portions. For example, oligonucleotide analogs can contain non-naturally occurring portions, such as altered sugar moieties or inter-sugar linkages, such as a phosphorothioate oligodeoxynucleotide. Functional analogs of naturally occurring polynucleotides can bind to RNA or DNA, and include peptide nucleic acid (PNA) molecules. Particular oligonucleotides and oligonucleotide analogs can include linear sequences up to about 200 nucleotides in length, for example a sequence (such as DNA or RNA) that is at least 6 bases, for example at least 8, 10, 15, 20, 25, 30, 35, 40, 45, 50, 100 or even 200 bases long, or from about 6 to about 80 bases, for example about 30-50 bases, such as 35, 40 or 45 bases.

Operably linked: A first nucleic acid sequence is operably linked with a second nucleic acid sequence when the first nucleic acid sequence is placed in a functional relationship with the second nucleic acid sequence. For instance, a promoter is operably linked to a coding sequence if the promoter affects the transcription or expression of the coding sequence. Generally, operably linked DNA sequences are contiguous and, where necessary to join two protein-coding regions, in the same reading frame.

Open reading frame: A series of nucleotide triplets (codons) coding for amino acids without any internal termination codons. These sequences are usually translatable into a peptide.

Ortholog: Two nucleic acid or amino acid sequences are orthologs of each other if they share a common ancestral sequence and diverged when a species carrying that ancestral sequence split into two species. Orthologous sequences are also homologous sequences.

Parenteral: Administered outside of the intestine, e.g., not via the alimentary tract. Generally, parenteral formulations are those that will be administered through any possible mode except ingestion. This term especially refers to injections, whether administered intravenously, intrathecally, intramuscularly, intraperitoneally, or subcutaneously, and various surface applications including intranasal, intradermal, and topical application, for instance.

Peptide Nucleic Acid (PNA): An oligonucleotide analog with a backbone comprised of monomers coupled by amide (peptide) bonds, such as amino acid monomers joined by peptide bonds.

Pharmaceutically acceptable carriers: The pharmaceutically acceptable carriers useful in this invention are conventional. Martin (Remington's Pharmaceutical Sciences, published by Mack Publishing Co., Easton, PA, 19th Edition, 1995) describes compositions and formulations suitable for pharmaceutical delivery of the nucleotides and proteins herein disclosed.

In general, the nature of the carrier will depend on the particular mode of administration being employed. For instance, parenteral formulations usually comprise injectable fluids that include pharmaceutically and physiologically acceptable fluids such as water, physiological saline, balanced salt solutions, aqueous dextrose, glycerol or the like as a vehicle. For solid compositions (e.g., powder, pill, tablet, or capsule forms), conventional non-toxic solid carriers can include, for example, pharmaceutical grades of mannitol, lactose, starch, or magnesium stearate. In addition to biologically-neutral carriers, pharmaceutical compositions to be administered can contain minor amounts of non-toxic auxiliary substances, such as wetting or emulsifying agents, preservatives, and pH buffering agents and the like, for example sodium acetate or sorbitan monolaurate.

Probes and primers: Nucleic acid probes and primers can be readily prepared based on the nucleic acid molecules provided in this invention. It is also appropriate to generate probes and primers based on fragments or portions of these disclosed nucleic acid molecules. Other GMA1 or GMA2 probes and primers are probes and primers specific for the reverse complement of the disclosed sequences, as well as probes and primers to 5' or 3' regions surrounding the GMA1 or GMA2 coding sequence.

A probe comprises an isolated nucleic acid attached to a detectable label or other reporter molecule. Typical labels include radioactive isotopes, enzyme substrates, co-factors, ligands, chemiluminescent or fluorescent agents, haptens, and enzymes. Methods for labeling and guidance in the choice of labels appropriate for various purposes are discussed, e.g., in Sambrook et al. (eds.), Molecular Cloning: A Laboratory Manual, CSHL, New York, 1989 and Ausubel et al, Current Protocols in Molecular Biology, John Wiley & Sons, New York, 1998.

Primers are short nucleic acid molecules, for instance DNA oligonucleotides 10 nucleotides or more in length. Longer DNA oligonucleotides may be about 15, 20, 25, 30 or 50 nucleotides or more in length. Primers can be annealed to a complementary target DNA strand by nucleic acid hybridization to form a hybrid between the primer and the target DNA strand, and then the primer extended along the target DNA strand by a DNA polymerase enzyme. Primer pairs can be used for amplification of a nucleic acid sequence, e.g., by the polymerase chain reaction (PCR) or other nucleic-acid amplification methods known in the art.

Methods for preparing and using nucleic acid probes and primers are described, for example, in Sambrook et al. (eds.), Molecular Cloning: A Laboratory Manual, CSHL, New York, 1989, Ausubel et al. (ed.) Current Protocols in Molecular Biology, John Wiley & Sons, New York, 1998, and Innis et al, PCR Protocols, A Guide to Methods and Applications, Academic Press, Inc., San Diego, CA, 1990. PCR primer pairs can be derived from a known sequence, for example, by using computer programs intended for that purpose such as Primer (Version 0.5©, 1991, Whitehead Institute for Biomedical Research, Cambridge, MA). One of ordinary skill in the art will appreciate that the specificity of a particular probe or primer increases with its length. Thus, for example, a primer comprising 30 consecutive nucleotides of a GMA-encoding nucleotide (a "GMA primer" or "GMA probe") will anneal to a target sequence, such as another GMA homolog from a gene family contained within a human genomic DNA library, with a higher specificity than a corresponding primer of only 15 nucleotides. Thus, in order to obtain greater specificity, probes and primers can be selected that comprise at least 30, 35, 40, 45, 50, 55, 50 or more consecutive nucleotides of the disclosed GMA1 or GMA2 nucleotide sequences.

The invention thus includes isolated nucleic acid molecules that comprise specified lengths of the disclosed GMA1 and GMA2 cDNA sequences. Such molecules may comprise at least 10, 15, 20, 25, 30, 35, 40, 45, 50 or 60 consecutive nucleotides of these sequences or more, and may be obtained from any region of the disclosed sequences. By way of example, the GMA1 cDNA sequences may be apportioned into halves or quarters based on sequence length, and the isolated nucleic acid molecules may be derived from the first or second halves of the molecules, or any of the four quarters. By way of example, the soybean GMA1 locus, cDNA, ORF, coding sequence and gene sequences may be apportioned into about halves or quarters based on sequence length, and the isolated nucleic acid molecules (e.g., oligonucleotides) may be derived from the first or second halves of the molecules, or any of the four quarters. The soybean GMA1 cDNA (SEQ ID NO: 3) can be used to illustrate this. The portion of the prototypical GMA1 cDNA shown in SEQ ID NO: 3 is 1215 nucleotides in length (excluding the poly-A tail) and so may be hypothetically divided into about halves (nucleotides 1-607 and 608-1215) or about quarters (nucleotides 1-303, 304-607, 608-912 and 913-1215). The cDNA also could be divided into smaller regions, e.g. about eighths, sixteenths, twentieths, fiftieths and so forth, with similar effect.
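The apportioning described above can be sketched in a few lines of Python. This is only an illustration: the function name `apportion` and the floor-based rounding are assumptions, not part of the disclosure, and because the text describes the regions only as "about" halves or quarters, a computed boundary may differ by one nucleotide from the ranges quoted above.

```python
def apportion(length, parts):
    """Split positions 1..length into `parts` contiguous, roughly equal
    1-based inclusive ranges (hypothetical helper, floor-based rounding)."""
    if parts < 1 or parts > length:
        raise ValueError("parts must be between 1 and length")
    # Cumulative end positions: 0, ~length/parts, ~2*length/parts, ..., length
    bounds = [(i * length) // parts for i in range(parts + 1)]
    return [(bounds[i] + 1, bounds[i + 1]) for i in range(parts)]


# Length quoted in the text for SEQ ID NO: 3, excluding the poly-A tail.
CDNA_LENGTH = 1215

halves = apportion(CDNA_LENGTH, 2)
quarters = apportion(CDNA_LENGTH, 4)
print("halves:  ", halves)
print("quarters:", quarters)
```

The same call with `parts` set to 8, 16, 20 or 50 gives the smaller regions mentioned in the text.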

Another mode of division is to select the 5' (upstream) and/or 3' (downstream) region associated with a specific cDNA or gene.

Nucleic acid molecules may be selected that comprise at least 10, 15, 20, 25, 30, 35, 40, 50 or 100 or more consecutive nucleotides of any of these or other portions of a GMA cDNA and associated flanking regions, e.g. of the disclosed soybean GMA1 or GMA2 coding sequence.

Protein: A biological molecule expressed by an encoding sequence (e.g., a gene) and comprised of amino acids.

Purified: The term "purified" does not require absolute purity; rather, it is intended as a relative term. Thus, for example, a purified protein preparation is one in which the protein referred to is more pure than the protein in its natural environment within a cell or within a production reaction chamber (as appropriate).

Recombinant: A recombinant nucleic acid is one that has a sequence that is not naturally occurring or has a sequence that is made by an artificial combination of two otherwise separated segments of sequence. This artificial combination can be accomplished by chemical synthesis or, more commonly, by the artificial manipulation of isolated segments of nucleic acids, e.g., by genetic engineering techniques.

Representational difference analysis: A PCR-based subtractive hybridization technique used to identify differences in the mRNA transcripts present in closely related cell lines.

Serial analysis of gene expression: The use of short diagnostic sequence tags to allow the quantitative and simultaneous analysis of a large number of transcripts in tissue, as described in Velculescu et al, Science 270:484-487, 1995.

Sequence identity: The similarity between two nucleic acid sequences, or two amino acid sequences, is expressed in terms of the similarity between the sequences, otherwise referred to as sequence identity. Sequence identity is frequently measured in terms of percentage identity (or similarity or homology); the higher the percentage, the more similar the two sequences are. Homologs or orthologs of the GMA1 or GMA2 protein, and the corresponding cDNA or gene sequence, will possess a relatively high degree of sequence identity when aligned using standard methods. This homology will be more significant when the orthologous proteins or genes or cDNAs are derived from species that are more closely related (e.g., human and chimpanzee sequences), compared to species more distantly related (e.g., human and C. elegans sequences). Typically, GMA orthologs are at least 80% identical at the nucleotide level and at least 70% identical at the amino acid level when comparing soybean GMA to an orthologous GMA coding sequence.

Methods of alignment of sequences for comparison are well known in the art. Various programs and alignment algorithms are described in: Smith & Waterman, Adv. Appl. Math. 2: 482, 1981; Needleman & Wunsch, J. Mol. Biol. 48: 443, 1970; Pearson & Lipman, Proc. Natl. Acad. Sci. USA 85: 2444, 1988; Higgins & Sharp, Gene, 73: 237-244, 1988; Higgins & Sharp, CABIOS 5: 151-153, 1989; Corpet et al, Nuc. Acids Res. 16, 10881-90, 1988; Huang et al, Computer Appls. in the Biosciences 8, 155-65, 1992; and Pearson et al, Meth. Mol. Bio. 24, 307-31, 1994. Altschul et al. (J. Mol. Biol. 215:403-410, 1990) presents a detailed consideration of sequence alignment methods and homology calculations.

The NCBI Basic Local Alignment Search Tool (BLAST) (Altschul et al, J. Mol. Biol. 215:403-410, 1990) is available from several sources, including the National Center for Biotechnology Information (NCBI, Bethesda, MD) and on the Internet, for use in connection with the sequence analysis programs blastp, blastn, blastx, tblastn and tblastx. Homologs of the disclosed GMA1 and GMA2 proteins typically possess at least 70% sequence identity counted over full-length alignment with the amino acid sequence of soybean GMA1 and/or GMA2 using the NCBI Blast 2.0, gapped blastp set to default parameters. For comparisons of amino acid sequences of greater than about 30 amino acids, the Blast 2 sequences function is employed using the default BLOSUM62 matrix set to default parameters (gap existence cost of 11, and a per residue gap cost of 1). When aligning short peptides (fewer than around 30 amino acids), the alignment should be performed using the Blast 2 sequences function, employing the PAM30 matrix set to default parameters (open gap 9, extension gap 1 penalties). Proteins with even greater similarity to the reference sequence will show increasing percentage identities when assessed by this method, such as at least 70%, at least 75%, at least 80%, at least 90%, at least 95%, at least 98%, or at least 99% sequence identity. When less than the entire sequence is being compared for sequence identity, homologs will typically possess at least 75% sequence identity over short windows of 10-20 amino acids, and may possess sequence identities of at least 85% or at least 90% or 95% depending on their similarity to the reference sequence. Methods for determining sequence identity over such short windows are described at the NCBI web site.
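To make the percentage figures above concrete, the following Python sketch computes percent identity over an alignment that has already been produced (for example by BLAST) and then over short sliding windows. It does not perform the alignment itself. The function names, the choice of denominator (all alignment columns, gaps included) and the toy sequences are assumptions for illustration; the toy sequences are not taken from any SEQ ID NO.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of alignment columns holding identical residues.
    Assumes both strings come from one pairwise alignment (equal length)."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Ignore columns that are gaps in both sequences.
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if not (a == gap and b == gap)]
    if not columns:
        return 0.0
    matches = sum(1 for a, b in columns if a == b and a != gap)
    return 100.0 * matches / len(columns)


def window_identities(aligned_a, aligned_b, window=10):
    """Percent identity for every window of `window` alignment columns."""
    last_start = len(aligned_a) - window
    return [percent_identity(aligned_a[i:i + window], aligned_b[i:i + window])
            for i in range(last_start + 1)]


# Toy aligned peptides (hypothetical), one mismatch and one gap.
ref = "ACDEFGHIKL"
hit = "ACDEYGHI-L"
print(percent_identity(ref, hit))
print(window_identities(ref, hit, window=5))
```

A candidate homolog could then be compared against the thresholds in the text, for instance by checking whether the full-length value is at least 70.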

It will be appreciated that these sequence identity ranges are provided for guidance only; it is entirely possible that strongly significant homologs could be obtained that fall outside of the ranges provided. The present invention provides not only the peptide homologs that are described above, but also nucleic acid molecules that encode such homologs.

An alternative indication that two nucleic acid molecules are closely related is that the two molecules hybridize to each other under stringent conditions. Stringent conditions are sequence-dependent and are different under different environmental parameters. Generally, stringent conditions are selected to be about 5° C to 20° C lower than the thermal melting point (T_m) for the specific sequence at a defined ionic strength and pH. The T_m is the temperature (under defined ionic strength and pH) at which 50% of the target sequence remains hybridized to a perfectly matched probe or complementary strand. Conditions for nucleic acid hybridization and calculation of stringencies can be found in Sambrook et al. (eds.), Molecular Cloning: A Laboratory Manual, CSHL, NY, 1989 and Tijssen, Laboratory Techniques in Biochemistry and Molecular Biology: Hybridization with Nucleic Acid Probes Part I, Chapter 2, Elsevier, NY, 1993. Nucleic acid molecules that hybridize under stringent conditions to a GMA1 or GMA2 encoding sequence will typically hybridize to a probe based on either an entire GMA1 or GMA2 encoding sequence or selected portions of the encoding sequence under wash conditions of 2 x SSC at 50° C.

Nucleic acid sequences that do not show a high degree of identity may nevertheless encode similar amino acid sequences, due to the degeneracy of the genetic code. It is understood that changes in nucleic acid sequence can be made using this degeneracy to produce multiple nucleic acid molecules that all encode substantially the same protein.

Specific binding agent: An agent that binds substantially only to a defined target. Thus a GMA protein-specific binding agent binds substantially only the GMA protein. As used herein, the term "GMA protein-specific binding agent" includes anti-GMA protein antibodies (and functional fragments thereof) and other agents (such as soluble receptors) that bind substantially only to a GMA protein. Examples of anti-GMA proteins will be specific for GMA1 over GMA2, or vice versa.

Anti-GMA protein antibodies may be produced using standard procedures described in a number of texts, including Harlow and Lane, Antibodies, A Laboratory Manual, CSHL, New York, 1988. The determination that a particular agent binds substantially only to the GMA protein may readily be made by using or adapting routine procedures. One suitable in vitro assay makes use of the Western blotting procedure (described in many standard texts, including Harlow and Lane, Antibodies, A Laboratory Manual, CSHL, New York, 1988). Western blotting may be used to determine that a given GMA protein binding agent, such as an anti-GMA protein monoclonal antibody, binds substantially only to the GMA protein.

Shorter fragments of antibodies can also serve as specific binding agents. For instance, FAbs, Fvs, and single-chain Fvs (SCFvs) that bind to GMA would be GMA-specific binding agents. These antibody fragments are defined as follows: (1) Fab, the fragment which contains a monovalent antigen-binding fragment of an antibody molecule produced by digestion of whole antibody with the enzyme papain to yield an intact light chain and a portion of one heavy chain; (2) Fab', the fragment of an antibody molecule obtained by treating whole antibody with pepsin, followed by reduction, to yield an intact light chain and a portion of the heavy chain; two Fab' fragments are obtained per antibody molecule; (3) (Fab')₂, the fragment of the antibody obtained by treating whole antibody with the enzyme pepsin without subsequent reduction; (4) F(ab')₂, a dimer of two Fab' fragments held together by two disulfide bonds; (5) Fv, a genetically engineered fragment containing the variable region of the light chain and the variable region of the heavy chain expressed as two chains; and (6) single chain antibody ("SCA"), a genetically engineered molecule containing the variable region of the light chain, the variable region of the heavy chain, linked by a suitable polypeptide linker as a genetically fused single chain molecule. Methods of making these fragments are routine.

Subject: Living multi-cellular vertebrate organisms, a category that includes both human and non-human mammals.

Target sequence: "Target sequence" is a portion of single-stranded DNA (ssDNA), double-stranded DNA (dsDNA), or RNA that, upon hybridization to a therapeutically effective oligonucleotide or oligonucleotide analog, results in the inhibition of GMA1 and/or GMA2 expression. Either an antisense or a sense molecule can be used to target a portion of dsDNA, since both will interfere with the expression of that portion of the dsDNA. The antisense molecule can bind to the plus strand, and the sense molecule can bind to the minus strand. Thus, target sequences can be ssDNA, dsDNA, and RNA.

Transformed: A transformed cell is a cell into which has been introduced a nucleic acid molecule by molecular biology techniques. As used herein, the term transformation encompasses all techniques by which a nucleic acid molecule might be introduced into such a cell, including

practice or testing of the present invention, suitable methods and materials are described below. All publications, patent applications, patents, and other references mentioned herein are incorporated by reference in their entirety. In case of conflict, the present specification, including definitions, will control. In addition, the materials, methods, and examples are illustrative only and not intended to be limiting.

III. GMA Protein and Nucleic Acid Sequences

This invention provides GmSUT1, GMA1 and GMA2 proteins and GmSUT1, GMA1 and GMA2 nucleic acid molecules, including cDNA sequences. The prototypical GMA sequences are the soybean sequences, and the invention provides for the use of these sequences to produce transgenic plants, such as corn and rice plants, having increased or decreased levels of GMA (e.g., GMA1 or GMA2) protein.

The partitioning of carbon, mediated by a series of sucrose/H⁺ transporters located in distinct cells and tissues throughout the plant, is a pivotal determinant of plant growth and productivity. The entire process is globally regulated and integrates the availability of, and needs for, sucrose between organ systems that respond to both developmental and environmental signals. Currently, little is known about the regulation of this complex process. Identification of proteins that interact with these sucrose/H⁺ transporters offers new avenues of investigation into how this process might be regulated. It is believed that GMA(s) interacts with GmSUT1, either directly or indirectly, and perhaps other membrane proteins or transporters, for instance as part of a cytoskeleton-plasma membrane network. The solanaceous SUT1 protein is located in the sieve element plasma membrane, while transcription occurs in the companion cells (Kuhn et al, Science 275:1298-1300, 1997). Lalonde et al. (Plant Cell 11:707-726, 1999) suggest two possible pathways to explain this: SUT1 mRNA may be delivered through plasmodesmatal connections by an RNA transport mechanism, or SUT1 translation may occur in the companion cells, with the protein then delivered to the sieve element plasma membrane by an undefined mechanism, perhaps involving cytoskeleton-directed movement.

It is now believed that interaction between GmSUT1 and at least GMA1 (also possibly GMA2) plays an important role in establishing the physical location of this sucrose transporter within the plasma membrane and might also link this sucrose transporter to other proteins involved in sucrose metabolism via the plant plasma membrane-cytoskeleton network. In this latter regard, it is interesting to note that sucrose synthase, which cleaves sucrose into UDP-glucose and fructose, has been shown to associate with the plasma membrane and associated cytoskeleton (Amor et al, Proc. Natl. Acad. Sci. U.S.A. 92:9353-9357, 1995; Winter et al, FEBS Lett. 430:205-208, 1998). The fact that plant sucrose transporters interact with other proteins forces an expansion of our current models of sucrose transport and regulation. The new model must include several novel possibilities, including the control of sucrose/H⁺ transporter polarity on cell membranes, coupling of sucrose/H⁺ transporters with sucrose metabolizing enzymes to create sucrose "metabolons" at the cell surface, initiation of signal transduction events, or other functions that might depend on the interaction of SUTs with proteins containing ANK repeats.

a. Soybean GMA

The soybean GMA1 and GMA2 cDNA sequences are shown in SEQ ID NO: 3 and 5, respectively. The sequences encode proteins that are, respectively, 293 and 268 amino acids in length (SEQ ID NO: 4 shows the amino acid sequence of the GMA1 protein; SEQ ID NO: 6 shows the amino acid sequence of the GMA2 protein). The soybean GMA1 and GMA2 proteins share no significant homology to any known published proteins over their entire length, but do show high sequence homology to ankyrin regions of certain proteins.

As described below, both GMA1 and GMA2 have GMA biological activity, i.e., the ability to interact with one or more proteins or plasma membrane proteins, e.g., GmSUT1, and possibly to influence sugar uptake and/or carbon allocation in a plant.

With the provision herein of the soybean GMA1 and GMA2 cDNA sequences, nucleotide amplification techniques (such as the PCR) may now be utilized to produce nucleic acid sequences encoding the soybean GMA1 or GMA2 proteins. For example, PCR amplification of the soybean GMA1 and GMA2 cDNA sequence may be accomplished by direct PCR from a cDNA library (e.g., a plant cDNA library), or by reverse-transcription PCR (RT-PCR) using RNA extracted from plant cells as a template. Methods and conditions for both direct PCR and RT-PCR are known in the art and are described in Innis et al, PCR Protocols, A Guide to Methods and Applications, Academic Press, Inc., San Diego, CA, 1990. Any cDNA library would be useful for direct PCR. The GMA1 and GMA2 gene sequences can be isolated from other libraries, for instance the IGF Arabidopsis BAC library (Mozo et al, Mol. Gen. Genet. 258(5):562-570, 1998).

Other amplification methods can be used, including strand-displacement amplification, as disclosed in U.S. Patent No. 5,744,311; transcription-free isothermal amplification, as disclosed in U.S. Patent No. 6,033,881; repair chain reaction amplification, as disclosed in WO 90/01069; ligase chain reaction amplification, as disclosed in EP-A-320 308; gap filling ligase chain reaction amplification, as disclosed in U.S. Patent No. 5,427,930; and NASBA™ RNA transcription-free amplification, as disclosed in U.S. Patent No. 6,025,134. The selection of amplification primers will be made according to the portions of the GMA1 or GMA2 cDNA that are to be amplified. Primers may be chosen to amplify small segments of the cDNA, the open reading frame, the entire cDNA molecule or the entire gene sequence. Variations in amplification conditions may be required to accommodate primers of differing lengths; such considerations are well known in the art and are discussed in Innis et al, PCR Protocols, A Guide to Methods and Applications, Academic Press, Inc., San Diego, CA, 1990, Sambrook et al, In Molecular Cloning: A Laboratory Manual, Cold Spring Harbor, NY, 1989, and Ausubel et al, In Current Protocols in Molecular Biology, Greene Publishing Associates and Wiley-Intersciences, 1992.

By way of example only, the soybean GMA1 cDNA molecule as shown in SEQ ID NO: 3 (excluding the poly A tail) may be amplified using the following combination of primers:

primer 1: 5' CCAGCAGAACAAAGA 3' (SEQ ID NO: 7)
primer 2: 5' ATAAGTAAAATTGAA 3' (SEQ ID NO: 8)

The open reading frame portion of the GMA1 cDNA may be amplified using the following primer pair:

primer 3: 5' ATGTCTGGTCTGCTCAATGA 3' (SEQ ID NO: 9)
primer 4: 5' CAGAAAAGCATCTTTCTCAA 3' (SEQ ID NO: 10)
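When working with primer pairs such as these, a few basic properties are commonly checked before ordering oligonucleotides. The Python sketch below reports length, GC fraction, reverse complement and a rough melting temperature for primers 3 and 4. The helper names are hypothetical, and the 2 °C per A/T plus 4 °C per G/C rule of thumb (often called the Wallace rule) is an assumption introduced here, not a method given in this specification; it is only a coarse estimate for short oligonucleotides.

```python
# Translation table mapping each base to its complement.
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def gc_fraction(seq):
    """Fraction of bases that are G or C."""
    s = seq.upper()
    return (s.count("G") + s.count("C")) / len(s)


def wallace_tm(seq):
    """Rough melting temperature in degrees C: 2 per A/T, 4 per G/C.
    Rule-of-thumb assumption, not taken from the specification."""
    s = seq.upper()
    at = s.count("A") + s.count("T")
    gc = s.count("G") + s.count("C")
    return 2 * at + 4 * gc


# Primer sequences as printed in the text.
PRIMERS = {
    "primer 3 (SEQ ID NO: 9)": "ATGTCTGGTCTGCTCAATGA",
    "primer 4 (SEQ ID NO: 10)": "CAGAAAAGCATCTTTCTCAA",
}

for name, seq in PRIMERS.items():
    print(name, len(seq), round(gc_fraction(seq), 2),
          wallace_tm(seq), reverse_complement(seq))
```

The reverse complement is printed because a reverse primer anneals to the strand opposite the one shown in a sequence listing, so locating its binding site in the listed cDNA requires searching for that reverse complement.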

These primers are illustrative only; one of ordinary skill in the art will appreciate that many different primers may be derived from the provided cDNA sequences in order to amplify particular regions of these molecules. Resequencing of PCR products obtained by these amplification procedures is recommended to facilitate confirmation of the amplified sequence and will also provide information on natural variation in this sequence in different ecotypes and plant populations. Oligonucleotides derived from the soybean GMA1 or GMA2 sequence (depending on the amplified sequence) may be used in such sequencing methods.

Oligonucleotides that are derived from the soybean GMA1 or GMA2 cDNA sequences are encompassed within the scope of the present invention. Preferably, such oligonucleotide primers will comprise a sequence of at least 15-20 consecutive nucleotides of the soybean GMA1 or GMA2 cDNA sequences. To enhance amplification specificity, oligonucleotide primers comprising at least 25, 30, 35, 40, 45 or 50 consecutive nucleotides of these sequences may also be used.

b. GMA Encoding Sequences in Other Plant Species

Orthologs of the GMA-encoding sequences are expected to be present in a number of plant species. With the provision herein of the prototypical GMA1 and GMA2 proteins from soybean and cDNA sequences that encode these proteins, cloning of cDNAs and genes that encode GMA1 and GMA2 protein orthologs in other plant species is now enabled. Standard methods can be used. As described above, orthologs of the disclosed soybean GMA1 and GMA2 proteins have GMA protein biological activity (e.g., the ability to interact with one or more plasma membrane proteins, for instance GmSUT1, and to influence sugar uptake and/or carbon allocation in a plant), and typically possess at least 70% sequence identity counted over the full length alignment with the amino acid sequence of soybean GMA1 or GMA2 (respectively), using the NCBI Blast 2.0, gapped blastp set to default parameters. Proteins with even greater similarity to the soybean sequence will show greater percentage identities when assessed by this method, such as at least 75%, at least 80%, at least 85%, at least 90% or at least 95% sequence identity.

Both conventional hybridization and amplification procedures (e.g., PCR amplification) may be utilized to clone sequences encoding GMA protein orthologs. Common to these techniques is the hybridization of probes or primers derived from the soybean GMA1 or GMA2 cDNA sequence to a target nucleotide preparation. This target may be, in the case of conventional hybridization approaches, a cDNA or genomic library or, in the case of amplification, a cDNA or genomic library, or an mRNA preparation. Direct amplification (e.g., conventional PCR amplification) may be performed on cDNA or genomic libraries prepared from the plant species in question, or RT-PCR may be performed using mRNA extracted from the plant cells using standard methods. Amplification primers will comprise at least 15 consecutive nucleotides of the soybean GMA1 or GMA2 cDNA. It will be appreciated that sequence differences between the soybean GMA1 or GMA2 cDNA and the target nucleic acid to be amplified may result in lower amplification efficiencies. To compensate for this difference, longer PCR primers or lower annealing temperatures may be used during the amplification cycle. Where lower annealing temperatures are used, sequential rounds of amplification using nested primer pairs may be necessary to enhance amplification specificity.

For conventional hybridization techniques, the hybridization probe is preferably conjugated with a detectable label such as a radioactive label, and the probe is preferably of at least 20 nucleotides in length. As is well known, increasing the length of hybridization probes tends to give enhanced specificity. The labeled probe derived from the soybean GMA1 or GMA2 cDNA sequence may be hybridized to a plant cDNA or genomic library and the hybridization signal detected using means known in the art. The hybridizing colony or plaque (depending on the type of library used) is then purified and the cloned sequence contained in that colony or plaque isolated and characterized. Orthologs of the soybean GMA1 or GMA2 alternatively may be obtained by immunoscreening an expression library. With the provision herein of the disclosed soybean GMA1 or GMA2 nucleic acid sequences, the protein may be expressed in and purified from a heterologous expression system (e.g., E. coli) and used to raise antibodies (monoclonal or polyclonal) specific for the soybean GMA1 or GMA2 protein. Antibodies may also be raised against synthetic peptides derived from the soybean GMA1 or GMA2 amino acid sequences presented herein. Methods of raising antibodies are well known, and for instance are described in Harlow and Lane, Antibodies, A Laboratory Manual, Cold Spring Harbor Laboratory, NY, 1988. Such antibodies can be used to screen an expression cDNA library produced from the plant from which it is desired to clone the GMA1 or GMA2 ortholog, using routine methods. The selected cDNAs can be confirmed by sequencing.

c. GMA Sequence Variants

With the provision of the soybean GMA1 and GMA2 proteins and GMA1 and GMA2 cDNA sequences herein, the creation of variants of these sequences is now enabled. Variant GMA proteins include proteins that differ in amino acid sequence from the soybean GMA sequence disclosed but which retain GMA protein biological activity (the ability to interact with one or more plasma membrane proteins, e.g., GmSUT1, and to influence sugar uptake and/or carbon allocation in a plant). Such proteins may be produced by manipulating the nucleotide sequence of the soybean GMA1 or GMA2 cDNA using standard procedures, including for instance site-directed or PCR mutagenesis. The simplest modifications involve the substitution of one or more amino acids for amino acids having similar biochemical properties. These so-called conservative substitutions are likely to have minimal impact on the activity of the resultant protein. Table 1 shows amino acids that may be substituted for an original amino acid in a protein, and which are regarded as conservative substitutions.

Table 1.

Original Residue    Conservative Substitutions
Ala                 ser
Arg                 lys
Asn                 gln; his
Asp                 glu
Cys                 ser
Gln                 asn
Glu                 asp
Gly                 pro
His                 asn; gln
Ile                 leu; val
Leu                 ile; val
Lys                 arg; gln; glu
Met                 leu; ile
Phe                 met; leu; tyr
Ser                 thr
Thr                 ser
Trp                 tyr
Tyr                 trp; phe
Val                 ile; leu
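Table 1 can be represented as a simple lookup, which makes it easy to classify a proposed substitution as conservative or not under this table. The Python sketch below transcribes the table with the three-letter codes in their standard spelling (Gln, Ile, Trp); the dictionary and function names are hypothetical. Note that the table is directional: Ser is listed as a substitute for Ala, but Ala is not listed as a substitute for Ser, and the sketch preserves that asymmetry rather than assuming symmetry.

```python
# Table 1 as a mapping: original residue -> residues listed as
# conservative substitutions for it.
CONSERVATIVE = {
    "Ala": {"Ser"},
    "Arg": {"Lys"},
    "Asn": {"Gln", "His"},
    "Asp": {"Glu"},
    "Cys": {"Ser"},
    "Gln": {"Asn"},
    "Glu": {"Asp"},
    "Gly": {"Pro"},
    "His": {"Asn", "Gln"},
    "Ile": {"Leu", "Val"},
    "Leu": {"Ile", "Val"},
    "Lys": {"Arg", "Gln", "Glu"},
    "Met": {"Leu", "Ile"},
    "Phe": {"Met", "Leu", "Tyr"},
    "Ser": {"Thr"},
    "Thr": {"Ser"},
    "Trp": {"Tyr"},
    "Tyr": {"Trp", "Phe"},
    "Val": {"Ile", "Leu"},
}


def is_conservative(original, replacement):
    """True if Table 1 lists `replacement` as a conservative
    substitution for `original` (three-letter codes, any case)."""
    listed = CONSERVATIVE.get(original.capitalize(), set())
    return replacement.capitalize() in listed


print(is_conservative("Ala", "ser"))   # listed in Table 1
print(is_conservative("Ser", "Ala"))   # reverse direction is not listed
print(is_conservative("Phe", "Gly"))   # not listed
```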

More substantial changes in protein functions or other features may be obtained by selecting amino acid substitutions that are less conservative than those listed in Table 1. Such changes include changing residues that differ more significantly in their effect on maintaining polypeptide backbone structure (e.g., sheet or helical conformation) near the substitution, charge, or hydrophobicity of the molecule at the target site, or bulk of a specific side chain. The following substitutions are generally expected to produce the greatest changes in protein properties: (a) a hydrophilic residue (e.g., seryl or threonyl) is substituted for (or by) a hydrophobic residue (e.g., leucyl, isoleucyl, phenylalanyl, valyl or alanyl); (b) a cysteine or proline is substituted for (or by) any other residue; (c) a residue having an electropositive side chain (e.g., lysyl, arginyl, or histidyl) is substituted for (or by) an electronegative residue (e.g., glutamyl or aspartyl); or (d) a residue having a bulky side chain (e.g., phenylalanine) is substituted for (or by) one lacking a side chain (e.g., glycine). The effects of these amino acid substitutions, deletions, or additions may be assessed in GMA1 or GMA2 protein derivatives by analyzing the ability of a gene encoding the derivative protein to interact with one or more plasma membrane proteins, e.g., GmSUT1, and to influence sugar uptake and/or carbon allocation in a plant. Variant GMA cDNAs may be produced by standard DNA mutagenesis techniques, for example, M13 primer mutagenesis. Details of these techniques are provided in Sambrook et al. (eds.), Molecular Cloning: A Laboratory Manual, Cold Spring Harbor, NY, 1989, Ch. 15. By the use of such techniques, variants may be created which differ in minor ways from the soybean GMA1 or GMA2 cDNA sequences disclosed, yet which still encode a protein having GMA protein biological activity.
DNA molecules and nucleotide sequences that are derivatives of those specifically disclosed herein and that differ from those disclosed by the deletion, addition, or substitution of nucleotides while still encoding a protein that has GMA protein biological activity are comprehended by this invention. In their most simple form, such variants may differ from the disclosed sequences by alteration of the coding region to fit the codon usage bias of the particular organism into which the molecule is to be introduced.

Alternatively, the coding region may be altered by taking advantage of the degeneracy of the genetic code to alter the coding sequence such that, while the nucleotide sequence is substantially altered, it nevertheless encodes a protein having an amino acid sequence substantially similar to the disclosed soybean GMA1 and/or GMA2 protein sequence. For example, the 23rd amino acid residue of the soybean GMA1 protein is alanine. This alanine residue is encoded for by the nucleotide codon triplet GCC. Because of the degeneracy of the genetic code, three other nucleotide codon triplets - GCT, GCA and GCG - also code for alanine. Thus, the nucleotide sequence of the soybean GMA1 or GMA2 ORF could be changed at this position to any of these three alternative codons without affecting the amino acid composition or other characteristics of the encoded protein. Based upon the degeneracy of the genetic code, variant DNA molecules may be derived from the cDNA sequences disclosed herein using standard DNA mutagenesis techniques as described above, or by synthesis of DNA sequences. Thus, this invention also encompasses nucleic acid sequences that encode a GMA protein, but which vary from the disclosed nucleic acid sequences by virtue of the degeneracy of the genetic code.
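The alanine example can be reproduced from the standard genetic code. The Python sketch below builds the standard codon table and lists the codons synonymous with a given codon; the function name `synonymous_codons` is hypothetical, and the code assumes the standard (nuclear) genetic code rather than any organelle-specific variant.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order
# (first base slowest, third base fastest); '*' marks stop codons.
BASES = "TCAG"
AMINO_ACIDS = ("FFLLSSSSYY**CC*W"
               "LLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRR"
               "VVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def synonymous_codons(codon):
    """Other codons that encode the same amino acid as `codon`."""
    codon = codon.upper()
    amino_acid = CODON_TABLE[codon]
    return sorted(c for c, aa in CODON_TABLE.items()
                  if aa == amino_acid and c != codon)


# The text's example: alanine encoded by GCC.
print(CODON_TABLE["GCC"], synonymous_codons("GCC"))
```

Replacing a codon with any member of the returned list changes the nucleotide sequence while leaving the encoded amino acid unchanged, which is the kind of degenerate variant the paragraph above describes.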

Variants of the GMA protein may also be defined in terms of their sequence identity with the prototype GMA proteins shown in SEQ ID NOs: 4 and 6. As described above, GMA proteins have GMA biological activity and share at least 70% sequence identity (or more) with the soybean GMA1 and/or GMA2 protein. Nucleic acid sequences that encode such proteins readily may be determined simply by applying the genetic code to the amino acid sequence of a GMA protein, and such nucleic acid molecules may readily be produced by assembling oligonucleotides corresponding to portions of the sequence.

Nucleic acid molecules that are derived from the soybean GMA1 or GMA2 cDNA sequences disclosed include molecules that hybridize under stringent conditions to the disclosed prototypical GMA nucleic acid molecules, or fragments thereof. Stringent conditions are hybridization at 55° C in 6 x SSC, 5 x Denhardt's solution, 0.1% SDS and 100 μg sheared salmon testes DNA, followed by 15-30 minute sequential washes at 55° C in 2 x SSC, 0.1% SDS, then 1 x SSC, 0.1% SDS, and finally 0.2 x SSC, 0.1% SDS. Low-stringency hybridization (to detect less closely related homologs) is performed as described above but at 50° C (both hybridization and wash conditions); however, depending on the strength of the detected signal, the wash steps may be terminated after the first 2 x SSC, 0.1% SDS wash.
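The wash series becomes progressively more stringent because the salt concentration falls at each step. The Python sketch below makes that explicit. It assumes the conventional composition of 1 x SSC (0.15 M NaCl, 0.015 M trisodium citrate), which is not stated in the text above; the function name `sodium_molarity` is hypothetical.

```python
# Conventional composition of 1 x SSC (an assumption; not given in the text).
NACL_M_PER_1X = 0.15
CITRATE_M_PER_1X = 0.015  # trisodium citrate supplies three Na+ per molecule


def sodium_molarity(ssc_fold):
    """Approximate total Na+ concentration (mol/L) at a given SSC strength."""
    return ssc_fold * (NACL_M_PER_1X + 3 * CITRATE_M_PER_1X)


# Hybridization and sequential washes of the stringent protocol, in order.
STRINGENT_STEPS = [
    ("hybridization", 6.0),
    ("wash 1", 2.0),
    ("wash 2", 1.0),
    ("wash 3", 0.2),
]

for name, fold in STRINGENT_STEPS:
    print(f"{name}: {fold} x SSC, about {sodium_molarity(fold):.3f} M Na+")
```

Stopping after the first 2 x SSC wash, as permitted for weak signals, leaves the blot at the highest wash salt concentration and therefore the lowest wash stringency.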

The soybean GMA1 or GMA2 gene or cDNA, and orthologs of these sequences from other plants, may be incorporated into transformation vectors and introduced into plants to produce plants with altered sucrose uptake characteristics and/or altered sugar allocation. Specific examples of variant GMA proteins include those in which one or more of the ankyrin-homology regions (see Figure 2) has been altered.

IV. Introducing GmSUT1 or GMA (e.g., GMA1 or GMA2) into Plants

After a nucleic acid molecule (e.g., cDNA or gene) encoding a protein involved in the determination of a particular plant characteristic has been isolated, standard techniques may be used to express the cDNA in transgenic plants in order to modify that particular plant characteristic. The basic approach is to clone, for instance, the cDNA into a transformation vector, such that it is operably linked to control sequences (e.g., a promoter) that direct expression of the cDNA in plant cells. The transformation vector is then introduced into plant cells by one of a number of techniques (e.g., electroporation) and progeny plants containing the introduced cDNA are selected. In some examples, all or part of the transformation vector will stably integrate into the genome of the plant cell. That part of the transformation vector that integrates into the plant cell and that contains the introduced cDNA and associated sequences for controlling expression (the introduced "transgene") may be referred to as the recombinant expression cassette.

Selection of progeny plants containing the introduced transgene may be based upon the detection of an altered phenotype. Such a phenotype may result directly from the cDNA cloned into the transformation vector or may be manifested as enhanced resistance to a chemical agent (such as an antibiotic) as a result of the inclusion of a dominant selectable marker gene incorporated into the transformation vector.

Successful examples of the modification of plant characteristics by transformation with cloned cDNA sequences are replete in the technical and scientific literature. Selected examples, which serve to illustrate the knowledge in this field of technology, include:

U.S. Patent No. 5,451,514 (modification of lignin synthesis using antisense RNA and co-suppression);

U.S. Patent No. 5,750,385 (modification of plant light-, seed- and fruit-specific gene expression using sense and antisense transformation constructs);

U.S. Patent No. 5,583,021 (modification of virus resistance by expression of plus-sense untranslatable RNA);

U.S. Patent No. 5,589,615 (production of transgenic plants with increased nutritional value via the expression of modified 2S storage albumins);

U.S. Patent No. 5,268,526 (modification of phytochrome expression in transgenic plants);

U.S. Patent No. 5,741,684 (production of plants resistant to herbicides or antibiotics through the use of anti-sense expression);

U.S. Patent No. 5,773,692 (modification of the levels of chlorophyll by transformation of plants with anti-sense messages corresponding to chlorophyll a/b binding protein);

WO 96/13582 (modification of seed VLCFA composition using over-expression, co-suppression and antisense RNA in conjunction with the Arabidopsis FAE1 gene).

These examples include descriptions of transformation vector selection, transformation techniques and the assembly of constructs designed to over-express the introduced nucleic acid, as well as techniques for sense suppression and antisense expression. In light of the foregoing and the provision herein of the soybean GmSUT1, GMA1 and GMA2 cDNA sequences, introduction of these nucleic acid molecules, or orthologous, homologous or derivative forms of these molecules, into plants in order to produce plants having altered GmSUT1 or GMA (e.g., GMA1 and/or GMA2) expression and/or activity is now enabled. Manipulating the expression of GmSUT1, GMA1 and/or GMA2 in plants will be useful to confer altered sugar transport and/or altered sugar (e.g., sucrose) allocation. Alteration of the GMA protein levels in plants could be used to increase the nutritional value of plant tissues, for instance plant seeds or grain.

a. Plant Types

The presence of a system to regulate sugar transport and allocation appears to be universal. Thus, expression of the GmSUT1, GMA1 and/or GMA2 proteins may be modified in a wide range of higher plants to confer altered sucrose uptake and/or sugar allocation, including monocotyledonous and dicotyledonous plants. These include, but are not limited to, Arabidopsis, cotton, tobacco, maize, wheat, rice, barley, soybean, beans in general, rape/canola, alfalfa, flax, sunflower, safflower, brassica, peanut, clover; vegetables such as lettuce, tomato, cucurbits, cassava, potato, carrot, radish, pea, lentils, cabbage, cauliflower, broccoli, Brussels sprouts, peppers; tree fruits such as citrus, apples, pears, peaches, apricots, walnuts; and ornamental flowers.

b. Vector Construction, Choice of Promoters

Nucleic acids according to the invention can be incorporated into recombinant nucleic-acid constructs, typically DNA constructs, capable of being introduced into and replicating in a host cell. Such a construct preferably is a vector that includes sequences that are capable of transcription and translation of a polypeptide-encoding sequence in a given host cell (and may include a replication system, although direct DNA introduction methods conventionally used for monocot transformation do not require this).

For the practice of the present invention, conventional compositions and methods for preparing and using vectors and host cells are employed, as discussed, inter alia, in Sambrook et al. (eds.), Molecular Cloning: A Laboratory Manual, 2nd ed., Cold Spring Harbor Laboratory Press, Plainview, N.Y., 1989, or Ausubel et al., in Current Protocols in Molecular Biology, Greene Publishing Associates and Wiley-Interscience, 1992.

A number of vectors suitable for stable transformation of plant cells or for the establishment of transgenic plants have been described in, e.g., Pouwels et al, Cloning Vectors: A Laboratory Manual, 1985, supp., 1987; Weissbach and Weissbach, Methods for Plant Molecular Biology, Academic Press, 1989; and Gelvin et al, Plant Molecular Biology Manual, Kluwer Academic Publishers, 1990. Typically, plant expression vectors include, for example, one or more cloned plant genes under the transcriptional control of 5'- and 3'- regulatory sequences and a dominant selectable marker. Such plant expression vectors also can contain a promoter regulatory region (e.g., a regulatory region controlling inducible or constitutive, environmentally or developmentally regulated, or cell- or tissue-specific expression), a transcription-initiation start site, a ribosome-binding site, an RNA-processing signal, a transcription-termination site, and/or a polyadenylation signal.

Examples of constitutive plant promoters useful for expressing genes in plant cells include, but are not limited to, the cauliflower mosaic virus (CaMV) 35S promoter, maize ubiquitin (Ubi-1) promoter, rice actin (Act) promoter, nopaline synthase promoter, and the octopine synthase promoter. A variety of plant gene promoters that are regulated in response to environmental, hormonal, chemical, and/or developmental signals also can be used for expression of foreign genes in plant cells, including promoters regulated by heat (e.g., heat shock promoters); light (e.g., pea rbcS-3A or maize rbcS promoters or chlorophyll a/b-binding protein promoter); phytohormones, such as abscisic acid; wounding (e.g., wun1); anaerobiosis (e.g., Adh); and chemicals such as methyl jasmonate, salicylic acid, or safeners. It may also be advantageous to employ well-known organ-specific promoters such as endosperm-, embryo-, root-, phloem-, or trichome-specific promoters. Published examples of such regulated promoters, any of which can be used for expression of the cDNA in plant cells, include those regulated by: (a) heat (Callis et al., Plant Physiol. 88:965, 1988; Ainley et al., Plant Mol. Biol. 22:13-23, 1993; Gilmartin et al., The Plant Cell 4:839-949, 1992); (b) light (e.g., the pea rbcS-3A promoter, Kuhlemeier et al., The Plant Cell 1:471-478, 1989, and the maize rbcS promoter, Schaffer and Sheen, Plant Cell 3:997, 1991); (c) hormones, such as abscisic acid (Marcotte et al., Plant Cell 1:969, 1989); (d) wounding (e.g., wun1, Siebertz et al., Plant Cell 1:961, 1989); and (e) chemicals such as methyl jasmonate or salicylic acid (see Gatz et al., Ann. Rev. Plant Physiol. Plant Mol. Biol. 48:89-108, 1997).

Alternatively, tissue-specific (root, leaf, flower, or seed, for example) promoters (Carpenter et al., The Plant Cell 4:557-571, 1992; Denis et al., Plant Physiol. 101:1295-1304, 1993; Opperman et al., Science 263:221-223, 1993; Stockhause et al., The Plant Cell 9:479-489, 1997; Roshal et al., EMBO J. 6:1155, 1987; Schernthaner et al., EMBO J. 7:1249, 1988; and Bustos et al., Plant Cell 1:839, 1989) can be fused to the coding sequence to obtain protein expression in specific organs. Plant expression vectors optionally include RNA processing signals, e.g., introns, which may be positioned upstream or downstream of a polypeptide-encoding sequence in the transgene. In addition, the expression vectors may also include additional regulatory sequences from the 3'-untranslated region of plant genes, e.g., a 3'-terminator region to increase mRNA stability, such as the PI-II terminator region of potato or the octopine or nopaline synthase 3'-terminator regions.

Such vectors also generally include one or more dominant selectable marker genes, including genes encoding antibiotic resistance (e.g., resistance to hygromycin, kanamycin, bleomycin, G418, streptomycin, paromomycin, or spectinomycin) and herbicide-resistance genes (e.g., phosphinothricin acetyltransferase or glyphosate resistance) to facilitate manipulation in bacterial systems and to select for transformed plant cells.

Screenable markers are also used for plant cell transformation, including color markers such as genes encoding β-glucuronidase (gus) or anthocyanin production, or fluorescent markers such as genes encoding luciferase or green fluorescent protein (GFP).

c. Arrangement of an Encoding Sequence in the Vector

The particular arrangement of the GmSUT1 or GMA (e.g., GMA1 or GMA2) sequence in the transformation vector will be selected according to the type of expression of the sequence that is desired. Where enhanced protein activity (such as GMA1, GMA2, or both proteins) is desired in the plant, a relevant encoding sequence (e.g., a GMA1 or GMA2 ORF) may be operably linked to a constitutive high-level promoter such as the CaMV 35S promoter. As noted below, modification of GMA synthesis may also be achieved by introducing into a plant a transformation vector containing a variant form of a GMA cDNA. In contrast, a reduction of GMA activity (such as GMA1, GMA2, or both) in the transgenic plant may be obtained by introducing into plants an antisense construct based on a GMA cDNA sequence. For antisense suppression, a GMA cDNA (or fragment thereof) is arranged in reverse orientation relative to the promoter sequence in the transformation vector. The introduced sequence need not be a full-length GMA cDNA (e.g., the GMA1 cDNA, SEQ ID NO: 3, or the GMA2 cDNA, SEQ ID NO: 5), and need not be exactly homologous to the native GMA cDNA or gene found in the plant type to be transformed. Generally, however, where the introduced sequence is of shorter length, a higher degree of homology to the native GMA sequence will be needed for effective antisense suppression. The introduced antisense sequence in the vector generally will be at least 30 nucleotides in length, and improved antisense suppression will typically be observed as the length of the antisense sequence increases. In some examples, the length of the antisense sequence in the vector will be greater than 100 nucleotides. Transcription of an antisense construct as described results in the production of RNA molecules that are the reverse complement of mRNA molecules transcribed from the endogenous GMA gene in the plant cell. Although the exact mechanism by which antisense RNA molecules interfere with gene expression has not been elucidated, it is believed that antisense RNA molecules bind to the endogenous mRNA molecules and thereby inhibit translation of the endogenous mRNA. The production and use of anti-sense constructs are disclosed, for instance, in U.S. Pat. Nos. 5,773,692 (using constructs encoding anti-sense RNA for chlorophyll a/b binding protein to reduce plant chlorophyll content), and 5,741,684 (regulating the fertility of pollen in various plants through the use of anti-sense RNA to genes involved in pollen development or function).
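The relationship between a sense cDNA fragment and the antisense RNA transcribed from it can be sketched in Python as follows. This is an illustrative sketch only: the function names are hypothetical and the 12-nucleotide fragment is invented for brevity (it is shorter than the 30-nucleotide minimum discussed above and is not taken from any disclosed sequence).

```python
# Translation table for DNA base complementation (upper and lower case).
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(dna):
    """Return the reverse complement of a DNA sequence."""
    return dna.translate(COMPLEMENT)[::-1]


def antisense_transcript(sense_fragment):
    """RNA produced when the sense-strand fragment is placed in reverse
    orientation relative to the promoter (hypothetical helper)."""
    return reverse_complement(sense_fragment).upper().replace("T", "U")


# Hypothetical sense-strand fragment (illustrative only).
fragment = "ATGGCCTTGAAC"
print(reverse_complement(fragment))    # GTTCAAGGCCAT
print(antisense_transcript(fragment))  # GUUCAAGGCCAU
```

The antisense transcript is complementary, base for base, to the mRNA that would be transcribed from the same fragment in the sense orientation (here AUGGCCUUGAAC), which is the basis for the pairing described above.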

Suppression of endogenous GMA gene expression can also be achieved using ribozymes. Ribozymes are synthetic RNA molecules that possess highly specific endoribonuclease activity. The production and use of ribozymes are disclosed in U.S. Patent No. 4,987,071 to Cech and U.S. Patent No. 5,543,508 to Haseloff. Inclusion of ribozyme sequences within antisense RNAs may be used to confer RNA cleaving activity on the antisense RNA, such that endogenous mRNA molecules that bind to the antisense RNA are cleaved, leading to an enhanced antisense inhibition of endogenous gene expression. Constructs in which a GMA cDNA (or a fragment or variant thereof) is over-expressed may also be used to obtain co-suppression of the endogenous GMA gene (such as the GMA1 or GMA2 gene) in the manner described in U.S. Patent No. 5,231,021 to Jorgensen. Such co-suppression (also termed sense suppression) does not require that the entire GMA cDNA or gene be introduced into the plant cells, nor does it require that the introduced sequence be exactly identical to the endogenous GMA gene sequence. However, as with antisense suppression, the suppressive efficiency will be enhanced as (1) the introduced sequence is lengthened and (2) the sequence similarity between the introduced sequence and the endogenous GMA gene is increased.

Constructs expressing an untranslatable form of a GMA mRNA may also be used to suppress the expression of endogenous GMA activity. Methods for producing such constructs are described in U.S. Patent No. 5,583,021 to Dougherty et al. Preferably, such constructs are made by introducing a premature stop codon into a GMA ORF.

Finally, dominant negative mutant forms of the disclosed sequences may be used to block endogenous GMA activity. Such mutants require the production of mutated forms of the GMA protein that interact with the same molecules as GMA but do not have GMA activity.

The expression and/or activity of GmSUT1 can be influenced similarly, using the relevant GmSUT1-encoding molecules.

d. Transformation and Regeneration Techniques

Transformation and regeneration of both monocotyledonous and dicotyledonous plant cells are now routine, and the most appropriate transformation technique will be determined by the practitioner. The choice of method will vary with the type of plant to be transformed; those skilled in the art will recognize the suitability of particular methods for given plant types. Suitable methods may include, but are not limited to: electroporation of plant protoplasts; liposome-mediated transformation; polyethylene glycol (PEG) mediated transformation; transformation using viruses; micro-injection of plant cells; micro-projectile bombardment of plant cells; vacuum infiltration; and Agrobacterium tumefaciens (AT) mediated transformation. Typical procedures for transforming and regenerating plants are described in the patent documents listed at the beginning of this section.

e. Selection of Transformed Plants

Following transformation and regeneration of plants with the transformation vector, transformed plants are usually selected using a dominant selectable marker incorporated into the transformation vector. Typically, such a marker will confer antibiotic resistance on the seedlings of transformed plants, and selection of transformants can be accomplished by exposing the seedlings to appropriate concentrations of the antibiotic.

After transformed plants are selected and grown to maturity, they can be assayed using the methods described herein to determine whether the sugar transport and/or protein interaction with a sugar transporter has been altered as a result of the introduced transgene.

V. Production of Recombinant GmSUT1, GMA1 or GMA2 Protein in Heterologous Expression Systems

Many different expression systems are available for expressing cloned nucleic acid molecules. Examples of prokaryotic and eukaryotic expression systems that are routinely used in laboratories are described in Chapters 16-17 of Sambrook et al. (eds.), Molecular Cloning: A Laboratory Manual, Cold Spring Harbor, NY, 1989. Such systems may be used to express GMA1 or GMA2 at high levels to facilitate purification of the protein. The purified GMA protein may be used for a variety of purposes. For example, the purified recombinant protein may be used as an immunogen to raise anti-GMA antibodies, or more particularly anti-GMA1 or anti-GMA2 antibodies. Such antibodies are useful both as research reagents (such as in the study of sucrose transport and/or related protein interactions in plants) and diagnostically, to determine expression levels of the protein in plants that are being developed for agricultural or other use. Thus, the antibodies may be used to quantify the level of GMA protein both in non-transgenic plant varieties and in transgenic varieties that are designed to over-express or under-express the relevant GMA protein. Such quantification may be performed using standard immunoassay techniques, such as ELISA and in situ immunofluorescence and others described in Harlow & Lane (Antibodies, A Laboratory Manual, Cold Spring Harbor Laboratory, NY, 1988).

By way of example only, high-level expression of a GMA protein (e.g., GMA1 or GMA2) may be achieved by cloning and expressing the relevant GMA cDNA in yeast cells using the pYES2 yeast expression vector (Invitrogen, Carlsbad, CA). Alternatively, a genetic construct may be produced to direct secretion of the recombinant GMA protein from yeast cells into the growth medium. This approach will facilitate the purification of the GMA protein, if this is necessary. Secretion of the recombinant protein from the yeast cells may be achieved by placing a yeast signal sequence adjacent to the relevant GMA coding region. A number of yeast signal sequences have been characterized, including the signal sequence for yeast invertase. This sequence has been successfully used to direct the secretion of heterologous proteins from yeast cells, including such proteins as human interferon (Chang et al., Mol. and Cell. Biol. 6:1812-1819, 1986), human lactoferrin (Liang and Richardson, J. Agric. Food Chem. 41:1800-1807, 1993), and prochymosin (Smith et al., Science 229:1219-1224, 1985).

Alternatively, the protein may be expressed at high levels in prokaryotic expression systems, such as E. coli, as described in Sambrook et al. (eds.), Molecular Cloning: A Laboratory Manual, Cold Spring Harbor, NY, 1989. Commercially available prokaryotic expression systems include the pBAD expression system and the ThioFusion expression system (Invitrogen, Carlsbad, CA).

The following examples are intended to illustrate but not to limit the invention.

EXAMPLES

Example 1: Isolation and Characterization of GmSUT1

Methods and Materials:

GmSUT1 was PCR cloned from a soybean seed cDNA library. First, a 0.6 kb partial clone was obtained using the reverse primer 5'-TGTTAGCGACGTCGAGGATCCAAAA-3' (SEQ ID NO: 11), designed from the consensus sequences of eight plant SUTs, and a library-specific adapter primer (AP1; Clontech). The forward primer 5'-TCTCTTTCTTTCTTCCTGCTGCTACAATATGGAGC-3' (SEQ ID NO: 12) was designed from the 5'-untranslated region of the partial clone and used along with the AP1 adapter primer to generate a full-length cDNA clone. GmSUT1 was characterized as a functional sucrose transport protein in a heterologous expression system according to the methods of Overvoorde et al., Plant Cell 8:271-280, 1996.

Results:

The isolated full-length cDNA, termed GmSUT1, was sequenced and shown to encode a polypeptide of 55 kD. A hydropathy plot (Kyte and Doolittle, J. Mol. Biol. 157:105-132, 1982) of the deduced amino acid sequence reveals the presence of twelve putative hydrophobic membrane-spanning domains (Fig. 1A). The apparent topology of this polypeptide is the classical "6+6", where each set of six transmembrane domains is separated by a cytosolic loop (Marger and Saier, Trends Biochem. Sci. 18:13-20, 1993). Multiple sequence alignment of GmSUT1 with other Suc/H⁺ transporters indicates that GmSUT1 shares 68-84% identity with these known transporters (Fig. 1B). To verify that GmSUT1 encodes a functional Suc/H⁺ symporter, a chimeric gene was transformed into the DBY2615 yeast strain, which normally cannot grow on sucrose due to a knockout of the SUC2 gene (Kaiser and Botstein, Mol. Cell Biol. 6:2382-2391, 1986), creating the yeast strain CG24. Expression of GmSUT1 in CG24 allowed these cells to grow on medium containing sucrose as a sole carbon source. Neither DBY2615 cells alone nor DBY2615 cells transformed with pMK195 were able to grow on sucrose-containing medium. These results indicate that GmSUT1 encodes a functional sucrose transporter.
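A hydropathy profile of the kind cited above can be sketched in Python as a sliding-window average of the Kyte-Doolittle residue scores. This is a minimal sketch, not the analysis actually performed for Fig. 1A: the 19-residue window and the 1.6 cutoff are commonly used settings for locating candidate membrane-spanning segments and are assumptions here, and the function names are hypothetical.

```python
# Kyte-Doolittle hydropathy index for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(protein, window=19):
    """Mean hydropathy of every full window along the sequence."""
    if window < 1 or window > len(protein):
        raise ValueError("window must be between 1 and the sequence length")
    scores = [KYTE_DOOLITTLE[residue] for residue in protein.upper()]
    return [
        sum(scores[i:i + window]) / window
        for i in range(len(scores) - window + 1)
    ]


def hydrophobic_windows(profile, threshold=1.6):
    """Start indices of windows whose mean hydropathy exceeds the threshold."""
    return [i for i, value in enumerate(profile) if value > threshold]
```

Applied to a deduced protein sequence, runs of consecutive window positions returned by `hydrophobic_windows` would mark candidate membrane-spanning regions, which can then be counted and compared with the expected "6+6" arrangement.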

Example 2: Isolation and Characterization of GMA1 and GMA2

Two-Hybrid cDNA Library Screening

Methods and Materials:

Yeast Two-hybrid Constructs: The Hybrid Hunter™ yeast two-hybrid system from Invitrogen (Carlsbad, CA) was used to screen a soybean cotyledon cDNA library for SUT-binding proteins. Bait plasmids were generated by PCR cloning either the full-length GmSUT1 or a GmSUT1 peptide (F228 through A278) in-frame with the DNA-binding domain of LexA from the pHybLex/Zeo vector, generating pYBSUT1 and pYBSUT1pep, respectively. Testing pYBSUT1 and pYBSUT1pep for expression using LexA antibody revealed that only pYBSUT1pep was sufficiently expressed for use in the yeast two-hybrid screen. The prey plasmid library was generated by cloning a soybean cotyledon cDNA library downstream of the B42 activator domain in the pYESTrp2 vector.

Yeast Two-hybrid Screening: Bait plasmid pYBSUT1pep was transformed into yeast strain L40 [MATa his3Δ200 trp1-901 leu2-3,112 ade2 LYS2::(4lexAop-HIS3) URA3::(8lexAop-lacZ) GAL4] (Invitrogen, Carlsbad, CA) using the PEG/Li-acetate method (Gietz et al., Nucleic Acids Res. 20:1425, 1992). The cDNA library was then transformed into L40-pYBSUT1pep. Transformants growing on his⁻ media were tested using a β-galactosidase filter lift assay (Invitrogen, Carlsbad, CA). Putative positive clones (832) were indicated by blue colonies after 25 minutes in the 30° C incubator. Seventy putative positive clones were selected for further testing. Plasmid DNA was extracted from the clones and transformed into E. coli XL10-Gold cells (Stratagene, La Jolla, CA). Each clone was classified by restriction analysis and partial sequences were obtained using a vector-specific primer (pYESTrp forward). Each putative interactor was checked for autoactivation and histidine prototrophy, eliminating seventeen of the original 70. Additional analysis was performed to test for and eliminate false positives. Specific interactions were identified by directed yeast two-hybrid interactions using various control plasmids. Positive interactions were tested using pYBSUT1pep. Nonspecific interactions with other proteins were tested using three plasmids supplied with the Hybrid Hunter™ yeast two-hybrid system (Invitrogen, Carlsbad, CA): pHybLex/Zeo, pHybLex/Zeo-Fos, and pHybLex/Zeo-Lamin. Nonspecific interactions were also tested using pHybLex/Zeo-SBPl, which encodes a protein previously cloned in our lab.

Results:

Two bait vectors were constructed as discussed above; the first bait vector contained the entire coding sequence for GmSUT1 (pYBSUT1) and the second contained only the cytosolic loop domain (pYBSUT1pep). This cytoplasmic loop encompassed 51 amino acids, F228 through A278 (Fig. 1B). When both were expressed in yeast strain L40, however, pYBSUT1 could not be used because the protein was either not expressed or was rapidly degraded. This was determined by extracting proteins from the transformed yeast, resolving them via SDS-PAGE, electroblotting, and detecting SUT1 indirectly using the LexA antibody. The GmSUT1 peptide was expressed adequately, as evidenced by positive LexA antibody staining, and was used as the bait for subsequent experiments. For the two-hybrid cDNA library screening, yeast L40 cells were sequentially transformed with pYBSUT1pep, and then with 150 μg of a soybean cotyledon cDNA library in the pYPScDNA vector. From a total of 4.2 x 10⁸ transformants, and after elimination of numerous colonies based on a series of control experiments to minimize false positives and autoactivators, 53 clones were found to interact with GmSUT1pep and were isolated. Sequencing of each of these, followed by multiple-sequence analysis, indicated that each of these cDNAs fell into one of four groups (Table 2). The vast majority (46 of the 53 cDNAs) belonged to one of two groups (Table 2; groups A and B), and the clones were highly conserved between these two groups. Because several identical clones were isolated, it appears that our screen was saturated. A member of each of the original four groups was randomly chosen for full-length sequencing and further analysis. The cDNAs from group A (Table 2) defined an ORF of 293 amino acids (879 nucleotides) and those from group B defined an ORF of 268 amino acids (804 nucleotides). Both full-length cDNAs encode proteins of approximately 30-32 kD. The ORFs from A and B were nearly identical except for an additional 25 amino acids at the N-terminus of the clones from group A. The fact that two similar cDNAs were found in soybean suggested that soybean contains two closely related genes, which was verified by Southern blot analysis. Since 87% of the putative interacting clones encoded two closely related proteins, all subsequent work focuses on these clones and their encoded proteins.
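The figures quoted in these results are mutually consistent, as the following Python check illustrates. It uses only numbers stated above; the variable names are introduced here for illustration.

```python
# Counts reported in the screening description.
clones_selected = 70
clones_eliminated = 17
interacting_clones = clones_selected - clones_eliminated

clones_in_groups_a_b = 46
share_a_b = 100 * clones_in_groups_a_b / interacting_clones

# ORF lengths in amino acids; three nucleotides per codon.
orf_aa = {"group A": 293, "group B": 268}
orf_nt = {group: 3 * n_aa for group, n_aa in orf_aa.items()}
n_terminal_extension = orf_aa["group A"] - orf_aa["group B"]

print(interacting_clones)    # 53
print(round(share_a_b))      # 87
print(orf_nt)                # {'group A': 879, 'group B': 804}
print(n_terminal_extension)  # 25
```

The nucleotide counts equal three times the residue counts, so they describe the coding codons only and do not include a stop codon.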

Table 2. Positive clones from the yeast two-hybrid screen that interact with the pYBSUT1pep bait.

+, growth on his⁻ media; Blue, strong β-galactosidase activity. One representative from each group was selected for sequencing and tested for false-positive interactions. Group A contains two sets of clones that have identical amino acid sequences but differ in their 5' sequences.

Computer analysis of GMA1 and GMA2 proteins and encoding sequences:

The ORF sequences representing groups A and B (Table 2) were imported into the NCSA Molecular Workbench and the TBLASTN function was used initially to find similar sequences in the GenBank Plant Sequence database. The same sequences were found for both soybean ORFs and these were distinguished by the presence of several ankyrin (ANK) repeat motifs (Accession #AF034387 and #U70425). Based on the presence of these motifs, these clones and their corresponding proteins were termed soybean ankyrin-related (GMA) cDNAs and proteins. Groups A and B correspond to GMA1 and GMA2, respectively (Table 2).

Because of the presence of multiple ANK repeats in GMA1 and GMA2, an alignment was made between the deduced amino acid sequences of GMA1, GMA2 and a portion of the human erythrocyte ankyrin (Fig. 2), a protein known to interact with plasma membrane proteins (Bennett, J. Biol. Chem. 267:8703-8706, 1992; Bennett, Methods Enzymol. 96:313-324, 1983). Analysis of the GMA1 and GMA2 sequences reveals three ANK motifs at the carboxy terminus (solid lines; Fig. 2). While the ANK repeat is typically a 33 aa sequence composed of DxxGxTPLHLAxxxGxxxVVxLLLxxGADVNA, an unusually high number of amino acid substitutions are tolerated in this motif (Sedgwick and Smerdon, Trends Biochem. Sci. 24:311-316, 1999; Bork, Proteins 17:363-374, 1993; Michaely and Bennett, J. Biol. Chem. 270:31298-31302, 1995). A fourth putative ANK repeat is also present (dashed line; Fig. 2); this repeat is somewhat more diverged. The soybean ANK motifs clearly show diversity from the human erythrocyte ankyrin. For instance, SIVH is substituted for the TPLH in the 1st GMA ANK repeat, and TALH is substituted for the TPLH in the 2nd and 3rd GMA ANK repeats. The soybean ANK-related proteins are considerably smaller (30-32 kD) than the human erythrocyte ankyrin (the spectrin-binding domain is 62 kD; Bennett, J. Biol. Chem. 267:8703-8706, 1992) and contain fewer ANK repeats. While proteins containing ANK repeats have several functions, the original ankyrin protein and related proteins often link plasma membrane proteins to the spectrin/actin cytoskeleton network (Bennett, Methods Enzymol. 96:313-324, 1983; Bennett and Stenbuck, Nature 280:468-473, 1979; Bennett, Physiol. Rev. 70:1029-1065, 1990).

RNA Gel Blot Analysis

Methods and Materials:

Soybean embryos (4, 6, 7, 9, and 11 mm) excised from the seed coat were frozen in liquid nitrogen and stored at -80° C. RNA was isolated from 0.5 g of tissue according to Grimes et al. (Plant Cell 4:1561-1574, 1992). Poly A⁺ mRNA was isolated and run on a 1% agarose-formaldehyde gel according to Sambrook et al. (Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Lab. Press, Plainview, NY, 1989). RNA was transferred to GeneScreenPlus™ (DuPont, Wilmington, DE) membrane according to the manufacturer's recommendations. The blot was hybridized with a 1.5 kb GMA1 probe labeled by random priming according to the manufacturer's directions (Amersham Pharmacia Biotech). The membrane was hybridized (Fig. 3) with 1-5 ng/mL ³²P-GMA1 according to the protocol from Grimes et al., Plant Cell 4:1561-1574, 1992.

Results:

Although the GmSUT1 and GMA mRNA populations exhibit slightly different temporal regulation, both are present in the soybean cotyledon when sucrose is actively imported (Grimes et al., Plant Cell 4:1561-1574, 1992).

Characterization of GMA Proteins

Methods and Materials:

Protein Expression and ¹²⁵I Labeling: Full-length and truncated (GMA1Δ, residues 1 through 240 of SEQ ID NO: 6) forms of GMA1 were generated using PCR and subcloned into pGEX-3X (Amersham Pharmacia Biotech), resulting in pGST-GMA1 and pGST-GMA1Δ. E. coli BL21(DE3) cells transformed with pGEX-3X, pGST-GMA1, and pGST-GMA1Δ were grown to mid-log phase prior to inducing expression with 0.3 mM IPTG. Fusion proteins were expressed for 4-5 hours at 30° C. Purification of GST, GST-GMA1, and GST-GMA1Δ was carried out according to the manufacturer's instructions (Amersham Pharmacia Biotech). GMA1 and GMA1Δ were generated by cleaving the column-bound fusion proteins with Factor Xa (New England Biolabs, Beverly, MA). Cleaved protein from the glutathione column was separated from Factor Xa using a DEAE-Sephadex anion exchange column. GMA1 and GMA1Δ were iodinated with 1 mCi Na¹²⁵I using 10 μL of 0.5 mg/mL chloramine T as an oxidant. The reaction proceeded for 5 minutes at room temperature and was halted by the addition of 10 μL of 10 mg/mL sodium metabisulfite. The labeled protein was separated from the reaction mixture using a 30 cm Sephadex G75-120 column. Specific activity of ¹²⁵I-GMA1 was 4.3 x 10⁶ cpm/μg, and specific activity of ¹²⁵I-GMA1Δ was 4.5 x 10⁶ cpm/μg.
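Specific activity links the radioactivity added to a binding reaction to the mass of labeled protein it represents. The Python sketch below works through that conversion for the 4.0 x 10⁶ cpm input used in the binding assays of this Example, taking the two specific activities as approximately 4.3 x 10⁶ and 4.5 x 10⁶ cpm/μg; the function name `labeled_mass_ng` is hypothetical.

```python
def labeled_mass_ng(total_cpm, specific_activity_cpm_per_ug):
    """Mass of labeled protein (ng) that carries the given radioactivity."""
    return 1000.0 * total_cpm / specific_activity_cpm_per_ug


INPUT_CPM = 4.0e6  # radioactivity added per binding reaction

# Specific activities in cpm per microgram (values as read from the text).
SPECIFIC_ACTIVITY = {
    "125I-GMA1": 4.3e6,
    "125I-GMA1-delta": 4.5e6,
}

for protein, activity in SPECIFIC_ACTIVITY.items():
    print(f"{protein}: about {labeled_mass_ng(INPUT_CPM, activity):.0f} ng")
```

The results, roughly 930 ng and 889 ng, lie close to the 929 ng and 887 ng stated for the binding reactions; the small differences are consistent with the specific activities being reported to two significant figures.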

Protein Interactions — A membrane fraction was isolated from stage A through D soybean cotyledons (Chao et al., Plant Physiol. 107:253-262, 1995), according to the method of Overvoorde and Grimes, J. Biol. Chem. 269:15154-15161, 1994. The resulting membrane pellet was stripped of peripheral membrane proteins using either KI (Bennett, Methods Enzymol. 96:313-324, 1983) or Na₂CO₃ (Overvoorde and Grimes, J. Biol. Chem. 269:15154-15161, 1994). Membranes were divided into aliquots containing 25 μg protein and blocked in 80 μL of interaction buffer containing 100 mM KCl, 7.5 mM sodium phosphate, 0.2 mM EDTA, 0.2 mM DTT, and 2% BSA (pH 7.5) for 15 minutes. Interactions were initiated by adding 25 μg of GST, GST-GMA1, or GST-GMA1Δ in 20 μL of interaction buffer. Interactions proceeded for 30 minutes at room temperature. Membranes were pelleted at 100,000 × g for 30 minutes. The membranes were washed three times with interaction buffer plus 1% BSA. The membrane pellet was solubilized in sample buffer plus urea (Overvoorde and Grimes, J. Biol. Chem. 269:15154-15161, 1994). Proteins were separated on 12% SDS-PAGE and electroblotted to nitrocellulose (Towbin et al., Proc. Natl. Acad. Sci. USA 76:4350-4354, 1979). Nitrocellulose membranes were probed using GST antibodies (Sigma-Aldrich, St. Louis, MO) at a 1:20,000 dilution. Immunoblots were visualized using chemiluminescence (ECL; Pierce Chemical, Rockford, IL).

Alternatively, binding of ¹²⁵I-GMA1 and ¹²⁵I-GMA1Δ to soybean membranes containing the sucrose/H⁺ transporter followed the procedure established for erythrocyte ghosts (Bennett and Stenbuck, J. Biol. Chem. 255:2540-2548, 1980). Blocked membranes were incubated with 4.0 × 10⁶ cpm of ¹²⁵I-GMA1 (929 ng) or ¹²⁵I-GMA1Δ (887 ng) for 30 minutes at room temperature. Competition reactions included 1 μg of unlabeled GST-GMA1 during the blocking with BSA. The control interactions used heat-denatured, Na₂CO₃-washed membranes (Bennett and Stenbuck, J. Biol. Chem. 255:2540-2548, 1980). The ¹²⁵I-GMA1 and ¹²⁵I-GMA1Δ bound to the membrane fraction were measured using an Iso-Data 20/20 γ counter. The experiment was repeated three times, and the data were converted to a percentage of the highest value within each experiment. The data from the three replications were used to calculate means and standard errors.
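The normalization described above (each experiment scaled to its own highest value, then means and standard errors taken across the three replications) can be sketched as follows. This is a minimal Python illustration, not the authors' analysis: the cpm values and treatment labels are invented placeholders, the function names are hypothetical, and the use of the sample standard deviation (n − 1) for the standard error is an assumption, since the text does not specify it.

```python
from math import sqrt
from statistics import mean, stdev


def percent_of_max(experiment):
    """Scale one experiment's counts to a percentage of its highest value."""
    top = max(experiment.values())
    return {label: 100.0 * cpm / top for label, cpm in experiment.items()}


def mean_and_sem(values):
    """Return (mean, standard error of the mean) for replicate values."""
    return mean(values), stdev(values) / sqrt(len(values))


# Placeholder counts (cpm) for three replicate experiments; NOT data from the source.
experiments = [
    {"GMA1": 1000.0, "GMA1-delta": 400.0, "competed": 420.0, "denatured": 100.0},
    {"GMA1": 800.0, "GMA1-delta": 300.0, "competed": 340.0, "denatured": 90.0},
    {"GMA1": 1200.0, "GMA1-delta": 520.0, "competed": 500.0, "denatured": 110.0},
]

# Normalize within each experiment first, then summarize across experiments.
normalized = [percent_of_max(e) for e in experiments]
summary = {
    label: mean_and_sem([n[label] for n in normalized])
    for label in experiments[0]
}
for label, (m, sem) in summary.items():
    print(f"{label}: {m:.1f} % +/- {sem:.1f}")
```

Normalizing within each experiment before averaging removes run-to-run differences in absolute counts, so the treatment with the highest binding sits at 100% in any experiment where it is the maximum.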

Results:

The yeast two-hybrid system was used to verify the specificity of interaction between GMA1 and GmSUT1pep. This was done by directed yeast two-hybrid interactions using known bait (GmSUT1pep) combined with different prey vectors expressing Fos, Lamin, or the sucrose binding protein (SBP1). The latter is a soybean protein able to mediate low levels of sucrose uptake when expressed in a heterologous yeast system (Overvoorde et al., Plant Cell 8:271-280, 1996; Grimes et al., Plant Cell 4:1561-1574, 1992; Overvoorde et al., J. Biol. Chem. 272:15898-15904, 1997). Each of these proteins was expressed and accumulated in yeast as verified by antibody detection. Table 3 confirms the initial GmSUT1:GMA1 interaction and shows that the GmSUT1pep was unable to interact with any of these alternative proteins, suggesting that the interaction of GmSUT1 with GMA1 is specific.

The interaction of human erythrocyte ankyrin and other ankyrin-like proteins with their partners is dependent on the ANK motifs. Deletion of these ANK repeats diminishes or abolishes protein-protein interaction (Inoue et al., Proc. Natl. Acad. Sci. U.S.A. 89:4333-4337, 1992; Rebay et al., Cell 74:319-329, 1993; Ewaskow et al., Biochem. 37:4437-4450, 1998). To determine if this was the case for GMA1, a truncated GMA1 with three of the four 3'-carboxy terminal ANK repeats deleted (GMA1Δ) was constructed and subcloned into the pYESTrp2 vector to create pYPGMA1Δ. Table 3 shows that the GMA1Δ peptide is unable to interact with GmSUT1pep in the yeast two-hybrid system and suggests that the ANK repeats are essential for GMA1-GmSUT1pep interaction. However, another construct (pYPGMAΔ4, encoding the GMAΔ4 protein), in which all four of the ankyrin repeats were deleted, is capable of interacting with GmSUT1. This indicates that at least one region of interaction is at the N-terminal region of the GMA protein, rather than at the ANK domains. Nevertheless, the ANK domains have some role in mediating the interaction. The ANK domains may also mediate or influence the correct folding of the GMA1 protein; thus, the loss of association observed with the GMA1Δ construct may be due at least in part to aberrant folding that obscures or obliterates the GmSUT1 binding site on GMA1.

Table 3. Directed yeast two-hybrid screen testing the specificity of the GMA1 clones.

+, growth on his media; -, no growth on his media; Blue, strong β-Galactosidase activity; White, no β-Galactosidase activity.

If GMA1 interacts with GmSUT1 in vivo, it should be possible to verify this interaction by establishing that GMA1 binds to membrane fractions containing GmSUT1. To test this idea, GMA1 and GMA1Δ were expressed as GST-GMA1 and GST-GMA1Δ fusion proteins in E. coli. The purified fusion proteins were incubated with a membrane fraction isolated from 2 to 6 mm soybean cotyledons. After washing, the membrane fraction, which should contain any interacting protein, was resolved via SDS-PAGE and electroblotted. The presence of bound GST-GMA1 or GST-GMA1Δ in this washed membrane fraction was detected using an anti-GST antibody. Figure 4 shows that GST-GMA1 was able to bind to this membrane fraction, while GST-GMA1Δ exhibited a reduced level of binding. It is also apparent that GST by itself is capable of some interaction with the membrane fraction; however, its level is quite low when compared to GST-GMA1 (Fig. 4).

In a second approach, which also circumvented the complication of GST interaction with the membrane fraction, pure GMA1 and GMA1Δ proteins were obtained and iodinated. These ¹²⁵I-GMA1 and ¹²⁵I-GMA1Δ peptides were incubated with cotyledon membrane fractions that had been washed with either Na₂CO₃ or KI to remove extrinsic membrane proteins. After incubation with iodinated GMA1 or GMA1Δ, the membrane fractions were washed and the level of radioactivity in each fraction determined. Figure 5 clearly demonstrates that ¹²⁵I-GMA1 interacts with membrane fractions washed with either Na₂CO₃ or KI, while the ¹²⁵I-GMA1Δ peptide was significantly (~2.5 fold) diminished in its ability to interact. Furthermore, the ¹²⁵I-GMA1 binding was competed by the inclusion of unlabeled GST-GMA1 (929 ng ¹²⁵I-GMA1 to 1000 ng GST-GMA1), again suggesting that the GMA1 interaction with the membrane fraction was specific (Fig. 5).

Discussion:

Multiple experimental approaches verified that GMA1 interacted with GmSUT1 and that the ANK motifs were critical for this interaction. A mutated GMA1 (GMA1Δ) with three ANK motifs deleted from the C-terminus allows a direct comparison between full-length GMA1 and GMA1Δ. In the yeast two-hybrid system, yeast expressing GMA1 interacted with GmSUT1pep, while those expressing GMA1Δ did not. This suggests that GMA1 ANK motifs are necessary for the interaction of GMA1 and GmSUT1pep. Additional experiments, with both fusion proteins and purified GMA1/GMA1Δ, verify the GMA1:GmSUT1 interaction at the biochemical level.

Specifically, incubation of full-length ¹²⁵I-GMA1 with the membrane fraction verified that GMA1 was able to bind to the membrane fraction, while ¹²⁵I-GMA1Δ exhibited ~75% less affinity for the membrane. It is tempting to speculate that some of the residual 25% binding may be due to the fourth putative ANK repeat that was not deleted. Importantly, binding of ¹²⁵I-GMA1 to the membrane fraction was competed by unlabeled GST-GMA1 to levels virtually identical to those of ¹²⁵I-GMA1Δ. The collective results of the yeast two-hybrid screen, the directed two-hybrid screens, and the biochemical data provide compelling support for the hypothesis that GmSUT1 binds to GMA1 in vivo and that this interaction is dependent upon the ANK motifs. This does not, however, preclude the possibility that GMA1 interacts with other proteins in addition to GmSUT1.

The ANK repeat is one of the most common protein sequence motifs, and proteins with ANK motifs carry out a wide variety of biological activities (Sedgwick and Smerdon, Trends Biochem. Sci. 24:311-316, 1999). The motif has now been detected in >1000 proteins, including cyclin-dependent kinase (CDK) inhibitors, transcriptional regulators, cytoskeleton organizers, developmental regulators, and toxins (Bork, Proteins 17:363-374, 1993; Michaely and Bennett, J. Biol. Chem. 270:31298-31302, 1995). Although the structural and functional diversity of these ANK motif proteins makes a common function seem unlikely, the role of ANK repeats in mediating protein-protein interaction is well documented (Bennett, J. Biol. Chem. 267:8703-8706, 1992; Sedgwick and Smerdon, Trends Biochem. Sci. 24:311-316, 1999). While ANK repeat proteins have been identified in plants (Zhang et al., Plant Cell 4:1575-1588, 1992; Cao et al., Cell 88:57-63, 1997; Albert et al., Plant J. 17:169-179, 1999), this is the first time their involvement in protein-protein interactions has been established.

The function of the human erythrocyte ankyrin might be of special interest in the context of GmSUT1:GMA1 binding. Ankyrin is a structural protein initially isolated based on activity in linking the spectrin skeleton of human erythrocytes to the cytoplasmic surface of the plasma membrane (Bennett, J. Biol. Chem. 267:8703-8706, 1992; Bennett, J. Biol. Chem. 253:2292-2299, 1978). In the erythrocyte, ankyrin associates with at least seven distinct membrane proteins, including transport proteins (Bennett, J. Biol. Chem. 253:2292-2299, 1978; Michaely and Bennett, J. Biol. Chem. 270:31298-31302, 1995). In other systems, several membrane proteins co-localize with ankyrin isoforms in specialized membrane domains including nodes of Ranvier, sites of cell contact, and axon initial segments (Morrow et al., J. Cell Biol. 108:455-465, 1989; Kordeli et al., J. Cell Biol. 110:1341-1352, 1990; Kordeli and Bennett, J. Cell Biol. 114:1243-1259, 1991; Kunimoto et al., J. Cell Biol. 115:1319-1331, 1991). The spectrin-based membrane skeleton aids in determining the protein composition of these membrane domains by interacting with the cytoplasmic domains of several integral membrane proteins. In plant systems, one of the K⁺ channels contains endogenous ANK repeats which are exposed to the cytoplasm (Sentenac et al., Science 256:663-665, 1992; Daram et al., EMBO J. 16:3455-3463, 1997). While the role of these ANK repeats is not yet documented, the authors postulated that they may link the K⁺ channel to the cytoskeleton (Sentenac et al., Science 256:663-665, 1992). GmSUT1 is a novel second electrogenic plant transporter, and this sucrose transporter interacts with at least one protein containing ANK repeats. It may be significant that two plant transporters, both capable of altering ψ (membrane potential), share the potential of interacting with other proteins or the cytoskeleton network via ANK repeats. Since other plant secondary transporters, such as amino acid transporters, are characterized by the "6+6" topology, it is possible that yet other transporters may be found which interact with ankyrin-like proteins.

Example 3: Immunogold Subcellular Localization of GMA1

Polyclonal antibodies were generated in rabbits against a GmSUT1 cytosolic loop peptide (FLSIALLLTLSTIALTYVKEKTVSSEKTVRSSVEEDGSHGGMPCFGQLFGA, corresponding to residues 229 to 279 of SEQ ID NO: 2) and the entire GMA1 protein expressed in E. coli, using known methods. These antibodies were used for immunocytochemical localization of GmSUT1 and GMA1 in soybean cotyledons using known techniques (Fischer et al., Plant J. 19:543-554, 1999; Dubbs & Grimes, Plant Phys. 123:1269-1279, 2000; Dubbs & Grimes, Plant Phys. 123:1281-1288, 2000). In summary, tissue was fixed and embedded in LRWhite (Pelco 3450 Lab, Ted Pella, Redding, CA). Thin sections were produced via ultramicrotome. Sections were attached to grids, blocked with BSA/Tween-20, incubated with primary antibody (α-GmSUT1pep or α-GMA1), washed, incubated with secondary antibody (anti-rabbit IgG conjugated to gold particles of 15 and 5 nm, respectively), washed, and visualized using transmission electron microscopy (TEM).
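As a quick consistency check on the residue numbering, the peptide length should equal the inclusive span of residues 229 to 279. A small Python sketch, with the sequence copied from the text above and variable names chosen here purely for illustration:

```python
# Peptide sequence as given in the text for the GmSUT1 cytosolic loop immunogen.
PEPTIDE = "FLSIALLLTLSTIALTYVKEKTVSSEKTVRSSVEEDGSHGGMPCFGQLFGA"

# 1-based, inclusive residue positions within SEQ ID NO: 2, as stated in the text.
FIRST_RESIDUE, LAST_RESIDUE = 229, 279

expected_length = LAST_RESIDUE - FIRST_RESIDUE + 1
peptide_length = len(PEPTIDE)
print(peptide_length, expected_length, peptide_length == expected_length)
```

The check only confirms that the stated range and the printed sequence agree in length; it does not verify the sequence against SEQ ID NO: 2 itself.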

Representative electron micrographs are shown in Fig. 6, panels A, B, and C. These data show that GmSUT1, as expected, is associated with the plasma membrane of cotyledons. This localization was independent of cotyledon development. GMA1 appears to be present in the cytoplasm, and again this localization was independent of the age or developmental condition of the cotyledon. The examples shown are ~14 days after fertilization.

In view of the many possible embodiments to which the principles of this invention may be applied, it should be recognized that the illustrated embodiments are only examples of the invention and should not be taken as a limitation on the scope of the invention. Rather, the scope of the invention is defined by the following claims. We therefore claim as our invention all that comes within the scope and spirit of these claims.

Claims

We claim:

1. A purified protein having GMA protein biological activity, and comprising an amino acid sequence selected from the group consisting of:

(a) the amino acid sequence shown in SEQ ID NO: 4 or SEQ ID NO: 6;

(b) amino acid sequences that differ from those in (a) by one or more conservative amino acid substitutions; and

(c) amino acid sequences having at least 70% sequence identity to the sequences specified in (a) or (b).

2. A specific binding agent that binds the protein of claim 1.

3. An isolated nucleic acid molecule encoding a protein according to claim 1.

4. An isolated nucleic acid molecule according to claim 3, which molecule comprises a sequence as shown in SEQ ID NO: 3 or SEQ ID NO: 5.

5. A recombinant nucleic acid molecule, comprising a promoter sequence operably linked to a nucleic acid molecule according to claim 3.

6. A cell transformed with a recombinant nucleic acid molecule according to claim 5.

7. A transgenic organism, or part thereof, comprising a recombinant nucleic acid molecule according to claim 5, wherein the transgenic organism is a plant, bacterium, fungus, or animal.

8. The transgenic organism of claim 7, wherein the organism is a plant and the plant is Arabidopsis, cotton, tobacco, maize, wheat, rice, barley, soybean, another bean, rape/canola, alfalfa, flax, sunflower, safflower, brassica, cotton, flax, peanut, clover; a vegetable, lettuce, tomato, cucurbits, cassava, potato, carrot, radish, pea, lentils, cabbage, cauliflower, broccoli, Brussels sprout, pepper, a fruit tree, or an ornamental flower.

9. A method of modifying the level of expression of at least one GMA protein in a plant, comprising: expressing in the plant a recombinant genetic construct comprising a promoter operably linked to a nucleic acid molecule, wherein the nucleic acid molecule comprises at least 20 consecutive nucleotides of the isolated nucleic acid molecule of claim 3.

10. The method of claim 9, wherein the nucleic acid molecule comprises at least 20 consecutive nucleotides of the sequence shown in SEQ ID NO: 3 or SEQ ID NO: 5.

11. A method of modifying the sugar content of a plant or plant part, comprising: expressing in the plant a recombinant genetic construct comprising a promoter operably linked to a nucleic acid molecule, wherein the nucleic acid molecule comprises at least 20 consecutive nucleotides of the sequence shown in SEQ ID NO: 1, SEQ ID NO: 3, or SEQ ID NO: 5.

12. The method of claim 11, wherein the plant part in which the sugar content is modified is influenced by the promoter that is operably linked to the nucleic acid molecule.

13. A purified active sucrose/proton symporter protein, comprising an amino acid sequence selected from the group consisting of:

(a) SEQ ID NO: 2;

(b) amino acid sequences encoding functional fragments of (a); and

(c) amino acid sequences having at least 70% sequence identity with the amino acid sequence of (a), or (b).

14. An isolated nucleic acid molecule encoding the protein according to claim 13.

15. A recombinant nucleic acid molecule comprising a promoter sequence operably linked to the nucleic acid molecule of claim 13.

16. A transgenic plant comprising the recombinant nucleic acid molecule of claim 14.

17. An isolated nucleic acid molecule that:

(a) hybridizes under low-stringency conditions with a nucleic acid probe, the probe comprising a sequence selected as shown in SEQ ID NO: 3 or SEQ ID NO: 5, or a fragment thereof that is at least 30 residues in length; and

(b) encodes a protein having GMA protein biological activity.

18. A GMA protein encoded by the nucleic acid molecule of claim 17.

19. A recombinant nucleic acid molecule, comprising a promoter sequence operably linked to a nucleic acid molecule according to claim 17.

20. A cell transformed with a recombinant nucleic acid molecule according to claim 19.

21. A transgenic organism, or part thereof, comprising a recombinant nucleic acid molecule according to claim 19, wherein the transgenic organism is a plant, bacterium, fungus, or animal.

22. The transgenic organism of claim 21, wherein the organism is a plant and the plant is Arabidopsis, cotton, tobacco, maize, wheat, rice, barley, soybean, another bean, rape/canola, alfalfa, flax, sunflower, safflower, brassica, cotton, flax, peanut, clover; a vegetable, lettuce, tomato, cucurbits, cassava, potato, carrot, radish, pea, lentils, cabbage, cauliflower, broccoli, Brussels sprout, pepper, a fruit tree, or an ornamental flower.

23. A specific binding agent, that binds to the GMA protein of claim 18.

24. An isolated nucleic acid molecule that:

(a) has at least 70% sequence identity with the nucleic acid sequence shown in SEQ ID NO: 3 or SEQ ID NO: 5; and

(b) encodes a protein having GMA protein biological activity.

25. A method of identifying a nucleic acid sequence of a SUT interacting protein, comprising:

(a) hybridizing the nucleic acid sequence to at least 10 contiguous nucleotides of the sequence shown in SEQ ID NO: 3 or SEQ ID NO: 5; and

(b) identifying the nucleic acid sequence as one that encodes the SUT interacting protein.

26. A nucleic acid molecule identified by the method of claim 25.

27. The method of claim 25, wherein hybridizing the nucleic acid sequence is performed under low-stringency conditions.

28. A SUT interacting protein encoded by the nucleic acid molecule of claim 26.

29. A specific binding agent, that binds the SUT interacting protein of claim 28.

30. An isolated nucleic acid molecule comprising at least 15 consecutive nucleotides of the sequences set forth in SEQ ID NO: 1, SEQ ID NO: 3, or SEQ ID NO: 5.

31. An isolated nucleic acid molecule according to claim 30 wherein the molecule comprises at least 20 consecutive nucleotides of the sequences set forth in SEQ ID NO: 1, SEQ ID NO: 3, or SEQ ID NO: 5.

32. An isolated nucleic acid molecule according to claim 30 wherein the molecule comprises at least 30 consecutive nucleotides of the sequences set forth in SEQ ID NO: 1, SEQ ID NO: 3, or SEQ ID NO: 5.

33. An isolated nucleic acid molecule according to claim 30 wherein the molecule comprises at least 50 consecutive nucleotides of the sequences set forth in SEQ ID NO: 1, SEQ ID NO: 3, or SEQ ID NO: 5.

34. An isolated nucleic acid molecule according to claim 30 wherein the molecule comprises at least 75 consecutive nucleotides of the sequences set forth in SEQ ID NO: 1, SEQ ID NO: 3, or SEQ ID NO: 5.

35. An isolated nucleic acid molecule according to claim 30 wherein the molecule comprises at least 100 consecutive nucleotides of the sequences set forth in SEQ ID NO: 1, SEQ ID NO: 3, or SEQ ID NO: 5.