EP1143787A2 - Molecular profiling for heterosis selection - Google Patents

Molecular profiling for heterosis selection

Info

Publication number
EP1143787A2
EP1143787A2 EP00904457A EP00904457A EP1143787A2 EP 1143787 A2 EP1143787 A2 EP 1143787A2 EP 00904457 A EP00904457 A EP 00904457A EP 00904457 A EP00904457 A EP 00904457A EP 1143787 A2 EP1143787 A2 EP 1143787A2
Authority
EP
European Patent Office
Prior art keywords
plant
expression
progeny
plants
dominant
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Withdrawn
Application number
EP00904457A
Other languages
German (de)
English (en)
French (fr)
Inventor
Ben Bowen
Mei Guo
Oscar Smith
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Pioneer Hi Bred International Inc
Original Assignee
Pioneer Hi Bred International Inc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Pioneer Hi Bred International Inc

Classifications

    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q: MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00: Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68: Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6876: Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes
    • C12Q1/6888: Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for detection or identification of organisms
    • C12Q1/6895: Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for detection or identification of organisms for plants, fungi or algae
    • C12Q1/6809: Methods for determination or identification of nucleic acids involving differential detection
    • C12Q2600/00: Oligonucleotides characterized by their use
    • C12Q2600/156: Polymorphic or mutational markers
    • C12Q2600/158: Expression markers

Definitions

  • Hybrid offspring often outperform their parents by a variety of different measures, including yield, adaptability to environmental changes, disease resistance, pest resistance, and the like.
  • the improved properties for the hybrid as compared to the parents are collectively referred to as "hybrid vigor," or "heterosis."
  • Hybridization between parents of dissimilar genetic stock has been used in animal husbandry and especially for improving major plant crops, such as corn, sugarbeet and sunflower.
  • the development of a maize hybrid typically involves three steps: (1) the selection of plants from various germplasm pools for initial breeding crosses, (2) the selfing of the selected plants from the breeding crosses for several generations to produce a series of inbred lines, which, although different from each other, breed true and are highly uniform, and (3) crossing the selected inbred lines with different inbred lines to produce hybrid progeny (sometimes referred to as "F1" hybrids). During the inbreeding process in maize, the vigor of the lines decreases. Vigor is restored when two different inbred lines are crossed to produce hybrid progeny. A consequence of the homozygosity and homogeneity of the inbred lines is that hybrids produced by crossing a defined pair of inbreds are uniform and predictable.
  • It is not known whether heterosis is the result of one or a few general genetic mechanisms, or whether it is the result of many simultaneously interacting processes.
  • Because of the lack of understanding of the molecular basis for heterosis, crop development has relied upon empirical observations of heterosis for hybrids which result from crossing selected inbred crop strains (or resulting from second order crosses, e.g., in which two inbreds are crossed to produce a hybrid which is then crossed with an inbred or hybrid strain to produce a subsequent 3-4 way heterotic hybrid). This laborious process has been conducted on a large scale, resulting in increases in desirable measures of heterosis, such as yield, of several percent per year.
  • Molecular methods have been used to a limited extent to supplement crop breeding programs to select desirable inbreds and hybrids. In general, these procedures have been used to identify genetic markers corresponding to desirable or undesirable loci (e.g., "quantitative trait loci" or QTLs) in plants under analysis. Genetic markers represent (mark the location of) specific loci in the genome of a species or closely related species, and sampling of different genotypes at these marker loci reveals genetic variation.
  • QTLs quantitative trait loci
  • the genetic variation at marker loci can then be described and applied to genetic studies, commercial breeding, diagnostics, cladistic analysis of variance, or genotyping of samples. Because molecular methods are amenable to high throughput analysis and because they do not require yield testing, they can be used to speed the process of crop development. However, although these techniques are of considerable use, and can and do enhance the efficiency of crop breeding programs, they are not currently used, or useful, as a predictor for the more general phenomenon of heterosis.
  • the present invention provides a number of fundamental discoveries which make it possible to correlate molecular methods and the phenomenon of heterosis, as well as a variety of additional aspects which will be apparent upon complete review.
  • a heterologous nucleic acid that results in expression of expression products from silenced genes is introduced into a target plant.
  • Examples of appropriate heterologous nucleic acids include one or more of: a transcription factor which activates a promoter from a silenced gene, a nucleic acid encoded by the silenced gene under the control of a heterologous promoter, and a nucleic acid homologous to the silenced gene with at least one region of difference with the silenced gene, which homologous nucleic acid can recombine with the silenced gene to produce a modified gene. Any of these nucleic acids can be cloned under the control of heterologous promoters and placed into target plants to increase heterosis of the target plants.
  • integrated systems comprising computer databases having expression profile information can be used to select which parental crosses are most likely to result in an increase in the number of expression products (or an optimization of expression products of a selected class, i.e., dominant, under-dominant, over-dominant, additive, or the like) in offspring.
  • consideration of expression profile information provides not only a basis for selecting hybrids from crosses, but, using the methods herein, also identifies desirable crosses to be made.
  • Production and automated consideration of expression profile databases also provides a mechanism for identifying the genetic source of particular expression products, thereby indicating the likely parentage of given hybrids.
  • the invention additionally provides methods of cloning and transducing target plants or animals with dominant, additive, under-dominant and over-dominant genes identified by comparative examination of expression profiles.

BRIEF DESCRIPTION OF THE FIGURES

  • Figure 1 is a scatter plot showing the correlation between the degree of heterosis and % relationship.
  • Figure 2 is a set of bar graphs showing classification of gene expression patterns in hybrid vs. inbred parents.
  • Figure 3 is a line graph showing the correlation between the pattern of gene expression and heterosis.
  • Figure 4 is a set of bar graphs showing dominant, additive and over-/under-dominant RNA expression.
  • Figure 5 is a scatter graph showing the correlation between parental effects on gene expression and heterosis.
  • Figure 6a-c is a set of schematic illustrations showing polymorphic dominant products and their sequences.
  • An "expression profile" is the result of detecting a representative sample of expression products from a cell, tissue or whole organism, or a representation (picture, graph, data table, database, etc.) thereof. For example, many RNA expression products of a cell or tissue can simultaneously be detected on a nucleic acid array, or by the technique of differential display or modification thereof such as CuraGen's "GeneCalling™" technology. Similarly, protein expression products can be tested by various protein detection methods, such as hybridization to peptide or antibody arrays, or by screening phage display libraries.
  • a “portion” or “subportion” of an expression profile, or a “partial profile” is a subset of the data provided by the complete profile, such as the information provided by a subset of the total number of detected expression products.
  • An “expression product” is any product transcribed in a cell from a DNA (e.g., from a gene) or translated from an RNA (e.g., a protein).
  • Example expression products include mRNAs and proteins.
  • a "representative sample" of expression products e.g., from a particular cell, tissue, or whole organism is a sufficiently large number of expression products that statistical comparison of the actual number and/or type of expression products between different cells, tissues, or whole organisms can be made. Ideally, at least about 50%, and typically 60%, 70%, 80%, 90%, 95% or 100% of the total expression products which are detectable by a given technique constitute the "representative sample.”
  • the representative sample will typically include a large number of expression products, as cells, tissues and organisms typically produce a fairly large number of expression products.
  • a typical representative sample of expression products includes between about 100 and 20,000 or more expression products, e.g., about 100-500, 1,000, 1,500, 2,000, 2,500, 3,000, 3,500, 4,000, 4,500, 5,000, 5,500, 6,000, 6,500, 7,000, 7,500, 8,000, 8,500, 9,000, 9,500, 10,000, 10,500, 11,000, 11,500, 12,000, 12,500, 13,000, 13,500, 14,000, 14,500, 15,000, 15,500, 16,000, 16,500, 17,000, 17,500, 18,000, 18,500, 19,000, 19,500, 20,000, or 30,000 expression products, or the like.
  • correlation unless indicated otherwise, is used herein to indicate that a “statistical association” exists between, e.g., an expression product and the degree of heterosis.
  • Dominant expression for an expression product refers to the situation where expression of the product in a progeny differs from one parent, and not the other.
  • Additive expression for an expression product refers to the situation where expression of the product in a progeny falls within the range of the two parents (and may or may not differ from both parents).
  • "Over-dominant" or "under-dominant" expression for an expression product refers to the situation where expression of an expression product in a progeny differs from both parents and falls outside of the range of the two parents, either over the higher parent value, or under the lower parent value, respectively (Figure 2).
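The expression-pattern definitions above can be sketched as a small classifier. This is a minimal illustration, not the patent's procedure: the function name, the relative tolerance `tol` used to decide whether two levels "differ", and the extra "no difference" label are all assumptions. As a simplification, a progeny level inside the parental range is called "dominant" when it matches one parent and "additive" otherwise.

```python
def classify_expression(parent_a, parent_b, progeny, tol=0.1):
    """Classify a progeny expression level relative to its two parents.

    `tol` is a hypothetical relative tolerance for treating two levels as
    equal; the source text does not specify any threshold.
    """
    def same(x, y):
        # Two levels are "the same" if they differ by at most tol (relative).
        return abs(x - y) <= tol * max(abs(x), abs(y), 1e-12)

    low, high = min(parent_a, parent_b), max(parent_a, parent_b)
    if progeny > high and not same(progeny, high):
        return "over-dominant"   # above the higher parent
    if progeny < low and not same(progeny, low):
        return "under-dominant"  # below the lower parent
    if same(parent_a, parent_b):
        return "no difference"   # parents and progeny all alike (assumed label)
    if same(progeny, parent_a) or same(progeny, parent_b):
        return "dominant"        # matches one parent, differs from the other
    return "additive"            # inside the parental range, differs from both
```

For example, with hypothetical parental levels of 10 and 20, a progeny level of 20 is classified as dominant, 15 as additive, 30 as over-dominant, and 5 as under-dominant.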
  • a "biological sample" is a portion of material isolated from a biological source such as a plant, isolated plant tissue, or plant cell, or a portion of material made from such a source, such as a cell extract or the like.
  • a "promoter" is an array of nucleic acid control sequences which direct transcription of a nucleic acid.
  • a promoter includes necessary nucleic acid sequences near the start site of transcription, such as, in the case of a polymerase II type promoter, a TATA element.
  • a promoter also optionally includes distal enhancer or repressor elements which can be located as much as several thousand base pairs from the start site of transcription.
  • a “constitutive” promoter is a promoter which is active in a selected organism under most environmental and developmental conditions.
  • An "inducible" promoter is a promoter which is under environmental or developmental regulation in a selected organism.
  • hybrid plants refers to plants which result from a cross between genetically different individuals.
  • tester parent refers to a parent that is genetically different from a set of lines to which it is crossed. The cross is for purposes of evaluating differences among the lines in topcross combination. Using a tester parent in a sexual cross allows one of skill to determine the genetic differences between the tested lines on the phenotypic trait with expression of quantitative trait loci in a hybrid combination.
  • topcross combination and “hybrid combination” refer to the processes of crossing a single tester parent to multiple lines. The purposes of producing such crosses is to evaluate the ability of the lines to produce desirable phenotypes in hybrid progeny derived from the line by the tester cross.
  • transgenic plant refers to a plant into which exogenous polynucleotides have been introduced by any process other than sexual cross or selfing. Examples of processes by which this can be accomplished are described below, and include Agrobacterium-mediated transformation, biolistic methods, electroporation, in planta techniques, and the like. Such a plant containing the exogenous polynucleotides is referred to here as an R1 generation transgenic plant. Transgenic plants may also arise from sexual cross or by selfing of transgenic plants into which exogenous polynucleotides have been introduced.

DETAILED DESCRIPTION

OVERVIEW OF SELECTION FOR HETEROSIS

  • Crop improvement relies extensively on the phenomenon of heterosis.
  • Inbreds and/or hybrids are crossed to produce heterotic hybrids with desirable traits such as high yield, disease resistance, resistance to heat, cold, salinity, insects, fungi, herbicides, pesticides, etc.
  • Secondary desirable traits such as a particular size or shape of ears, solids content, sugar content, oil content, water content, etc., can also be affected by heterosis.
  • the present invention establishes several correlations between the expression of gene products and heterosis, e.g., with respect to yield.
  • genes are silenced during inbreeding in plants. These correlations provide new methods of selecting heterotic hybrids, without the necessity of field testing every hybrid to monitor heterotic traits.
  • expression of a first representative sample of first expression products e.g., RNAs or proteins
  • a first progeny plant, e.g., a hybrid resulting from crossing two or more parental lines.
  • the expression products produced in the first progeny plant are quantified and/or monitored for the type of expression product (additive, dominant, under-dominant, over-dominant, etc.).
  • the number of first expression products produced in the first progeny plant is statistically associated with a measure of heterosis in the first progeny plant, as is the number of dominant, additive, under-dominant, over-dominant, or silenced expression products.
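One common way to quantify a statistical association of this kind is a Pearson correlation coefficient between the number of detected expression products and a heterosis measure across a set of hybrids. The sketch below is only an illustration: the source does not say which statistic is used, and the function name and any input values are hypothetical.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation between two equal-length numeric sequences.

    Here xs could be expression-product counts per hybrid and ys a
    heterosis measure per hybrid (both hypothetical inputs).
    """
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant input")
    return sxy / sqrt(sxx * syy)
```

A value near +1 would indicate that hybrids with more detected expression products tend to score higher on the chosen heterosis measure; a value near 0 would indicate no linear association.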
  • the plant is then selected (e.g., against similar measures for a second progeny plant, or a population of progeny plants, or against the parental stock) for further testing based upon the number or type of expression products detected.
  • the plant can be selected for one or more characteristics, including: a selected number of expression products, a selected number of dominant expression products, a selected ratio of dominant expression products to total expression products, a desired number of over- or under-dominant expression products, a selected ratio of over- or under-dominant expression products to total expression products, a selected number of additive expression products, and a selected ratio of additive expression products to total expression products.
  • the first progeny plant is selected to maximize the number of dominant expression products and/or to maximize the number of additive expression products, and/or to minimize the number of over- or under-dominant expression products.
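The counts and ratios listed above can be computed directly once each detected expression product has been labeled with its pattern. The following sketch assumes the labels "dominant", "additive", "over-dominant" and "under-dominant"; the function names and the ranking rule (more dominant plus additive products first, fewer over-/under-dominant products as a tie-breaker) are illustrative assumptions, not a rule stated in the source.

```python
from collections import Counter

def profile_summary(patterns):
    """Summarize a list of pattern labels, one per detected expression product."""
    counts = Counter(patterns)
    total = sum(counts.values())
    over_under = counts["over-dominant"] + counts["under-dominant"]

    def ratio(k):
        # Ratio of a class to total products; 0.0 for an empty profile.
        return k / total if total else 0.0

    return {
        "total": total,
        "dominant": counts["dominant"],
        "additive": counts["additive"],
        "over_or_under": over_under,
        "dominant_ratio": ratio(counts["dominant"]),
        "additive_ratio": ratio(counts["additive"]),
        "over_or_under_ratio": ratio(over_under),
    }

def rank_progeny(profiles):
    """Rank progeny names from most to least favorable (assumed rule)."""
    def key(name):
        s = profile_summary(profiles[name])
        return (s["dominant"] + s["additive"], -s["over_or_under"])
    return sorted(profiles, key=key, reverse=True)
```

In practice a breeder would likely weight these criteria differently per crop and trait; the tuple key is just the simplest way to express "maximize one quantity, then minimize another".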
  • Crosses can also be selected to minimize silencing in the progeny plant.
  • the parental plants used to produce the first progeny can also be profiled. Resulting parental expression profiles serve any of a variety of purposes.
  • the parental expression profiles can be compared to the first progeny profile to aid in determining whether the progeny show an increase in the number of expression products as compared to parental stocks (thereby indicating that the progeny is likely to be heterotic).
  • comparison between the parental expression profiles and the progeny profile is used to determine whether the individual expression products represented in the profile are dominant, additive, under-dominant, over-dominant, or the like.
  • the parental expression profiles can also be placed into a database to aid in determining which crosses are most likely to produce heterotic hybrids.
  • Parental plants are selected by identifying plants likely to produce progeny plants with a selected number of expression products which are dominant, over-dominant, under-dominant or additive.
  • parents are selected to produce the first progeny plant by selecting for complementary expression of dominant or additive expression products between the parents, or by selecting against expression of over-dominant or under-dominant expression products in the parents.
  • An additional statistical association relates to the relationship between parental and progeny plants. It is discovered that plants which exhibit an expression profile that is more similar to the maternal plant than to the paternal plant may be more heterotic. Accordingly, comparison of the maternal, paternal and progeny expression profiles can be used to monitor this relationship. In addition, multiple crosses to a single female type can be made (or the results predicted by comparison in a database) and the progeny screened (or predicted) for similarity to the female type.
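A comparison of maternal, paternal and progeny profiles might be sketched as follows. The distance measure (mean absolute difference over the expression products shared by all three profiles) and the function name are assumptions chosen for simplicity; the source does not specify how similarity is measured.

```python
def maternal_bias(maternal, paternal, progeny):
    """Compare a progeny profile with each parent.

    Each argument is a dict mapping an expression-product id to a level.
    Returns (distance_to_maternal, distance_to_paternal, closer_parent).
    """
    shared = maternal.keys() & paternal.keys() & progeny.keys()
    if not shared:
        raise ValueError("profiles share no expression products")
    # Mean absolute difference to each parent over the shared products.
    d_m = sum(abs(progeny[k] - maternal[k]) for k in shared) / len(shared)
    d_p = sum(abs(progeny[k] - paternal[k]) for k in shared) / len(shared)
    if d_m < d_p:
        closer = "maternal"
    elif d_p < d_m:
        closer = "paternal"
    else:
        closer = "neither"
    return d_m, d_p, closer
```

Screening several crosses to a single female type would then amount to calling this function once per progeny and keeping those whose result is "maternal".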
  • silencing was determined to play a significant role in the loss of heterosis due to inbreeding. Accordingly, by comparing parental and progeny plants it is possible to determine which genes are silenced. These genes can be rescued, e.g., by cloning the silenced genes and placing them under the control of heterologous promoters, or other strategies noted herein, and transducing the genes back into target plants (e.g., the parental lines, the hybrids, or any other plant). In addition, by compiling database information for which genes are silenced in inbreds, it is possible to decrease silencing in hybrids by selecting crosses where parents have complementary patterns. It is also possible to use these methods to increase the performance (e.g., grain yield, standability, etc.) of the inbred lines themselves.
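Selecting crosses with complementary silencing patterns can be illustrated with simple set operations over a database of silenced genes per inbred. The sketch assumes, purely for illustration, that a gene stays silenced in the hybrid only when it is silenced in both parents; the function name and data layout are hypothetical.

```python
from itertools import combinations

def rank_crosses_by_silencing(silenced):
    """Rank pairwise crosses by how many genes are silenced in both parents.

    `silenced` maps an inbred line name to the set of gene ids silenced in
    that line. Crosses with the fewest shared silenced genes come first.
    """
    scored = []
    for a, b in combinations(sorted(silenced), 2):
        shared = silenced[a] & silenced[b]  # not complemented by either parent
        scored.append((len(shared), a, b))
    scored.sort()
    return [(a, b, n) for n, a, b in scored]
```

A cross whose parents share no silenced genes is fully complementary under this assumption and sorts to the top of the list.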
  • the first progeny plant selected by any of the methods herein, or a subsequent progeny plant, or a transgenic plant as described above can be subjected to any of the field tests appropriate for monitoring one or more desired traits.
  • the first progeny plant, or a subsequent progeny plant thereof can be tested for a desired phenotypic trait.
  • the phenotypic trait can be compared between the first progeny plant, or a subsequent progeny plant, and a selected hybrid or inbred plant.
  • the expression profile of the selected hybrid or inbred plant can be compared to an expression profile of the first progeny plant, or the subsequent progeny plant.
  • Nucleic acids differentially expressed between the selected hybrid or inbred plant and the first progeny plant, or the subsequent progeny plant, are identified as targets for cloning.
  • genes that are expressed in high yielding hybrids and that are not expressed in low yielding hybrids can be determined by comparisons of the expression profiles for the high and low yielding hybrids.
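If each expression profile is reduced to the set of genes detected as expressed, the comparison between high and low yielding hybrids becomes a set difference. The sketch below makes one stricter assumption than the text: a candidate must be expressed in every high yielding profile and in none of the low yielding profiles. The function name is hypothetical.

```python
def candidate_genes(high_profiles, low_profiles):
    """Genes expressed in all high-yield profiles and in no low-yield profile.

    Each profile is an iterable of gene ids detected as expressed.
    """
    if not high_profiles:
        return set()
    in_all_high = set.intersection(*(set(p) for p in high_profiles))
    in_any_low = set().union(*(set(p) for p in low_profiles))
    return in_all_high - in_any_low
```

Relaxing "every" to "most" high yielding profiles would tolerate missed detections, at the cost of more candidates to follow up.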
  • Nucleic acids from (or corresponding to) the differentially expressed genes are cloned for introduction into target nucleic acids. After identifying which expression products from the representative sample show an additive, dominant, under-dominant, or over-dominant expression pattern, at least a portion of the representative sample, or a nucleic acid corresponding to the expression product, can be cloned.
  • the cloned nucleic acid can then be transduced into target plants to test whether the nucleic acid encodes a useful trait, or to improve traits in the target plant. Further details on expression profiling, cloning of nucleic acids, selection of hybrids, integrated systems, screening methods and the like are set forth below.
  • a variety of tissues can be profiled, with immature tissues being preferentially profiled. Immature tissues are preferred because profiling them increases the rate at which crops can be screened, as a plant does not have to be grown to maturity. However, essentially any tissue, or whole plant, can be profiled.
  • a variety of profiling methods are available, including hybridization of expressed or amplified nucleic acids to a nucleic acid array, hybridization of expressed polypeptides to a protein array, hybridization of peptides or nucleic acids to an antibody array, subtractive hybridization, differential display and others.

CROPS TO BE PROFILED

  • The parental or progeny plants can be inbreds or hybrids.
  • the progeny plant is a hybrid, produced by crossing two different inbred lines, or crossing an inbred line and a hybrid line, or crossing two hybrid lines (which are the result of crossing inbred or hybrid lines), or crossing of more than two lines (e.g., to generate polyploid or recombinant plants) in a single cross.
  • Once a desirable heterotic hybrid is identified, it can be treated as such hybrids typically are in breeding schemes, e.g., it can be produced in quantity as seed; it can be top crossed to inbred lines to produce a 3-way hybrid plant; it can be selfed to produce more inbred lines, or the like.
  • Monocots such as plants in the grass family (Gramineae), such as plants in the sub families Festucoideae and Poacoideae, which together include several hundred genera including plants in the genera Agrostis, Phleum, Dactylis, Sorghum, Setaria, Zea (e.g., corn), Oryza (e.g., rice), Triticum (e.g., wheat), Secale (e.g., rye), Avena (e.g., oats), Hordeum (e.g., barley), Saccharum, Poa, Festuca, Stenotaphrum, Cynodon, Coix, the Olyreae, Phareae and many others. Plants in the family Gramineae are particularly preferred target plants for the methods of the invention.
  • Additional preferred targets include other commercially important crops, e.g., from the families Compositae (the largest family of vascular plants, including at least 1,000 genera, including important commercial crops such as sunflower), and Leguminosae or "pea family,” which includes several hundred genera, including many commercially valuable crops such as pea, beans, lentil, peanut, yam bean, cowpeas, velvet beans, soybean, clover, alfalfa, lupine, vetch, lotus, sweet clover, wisteria, and sweetpea.
  • Common crops applicable to the methods of the invention include Zea mays, rice, soybean, sorghum, wheat, oats, barley, millet, sunflower, and canola.

RNA PROFILING

  • In one preferred embodiment, the expression products which are detected in the methods of the invention are RNAs, e.g., mRNAs expressed from genes within a cell of the plant or tissue profiled.
  • A number of techniques are available for detecting RNAs. For example, northern blot hybridization is widely used for RNA detection, and is generally taught in a variety of standard texts on molecular biology, including: Berger and Kimmel, Guide to
  • RNA can be converted into a double stranded DNA using a reverse transcriptase enzyme and a polymerase. See, Ausubel, Sambrook and Berger, id.
  • detection of mRNAs can be performed by converting, e.g., mRNAs into DNAs, which are subsequently detected in, e.g., a standard "Southern blot" format.
  • DNAs can be amplified to aid in the detection of rare molecules by any of a number of well known techniques, including: the polymerase chain reaction (PCR), the ligase chain reaction (LCR), Qβ-replicase amplification and other RNA polymerase mediated techniques (e.g., NASBA). Examples of these techniques are found in Berger, Sambrook, and Ausubel, id., as well as in Mullis et al., (1987) U.S. Patent No
  • PCR amplicons of up to 40 kb are generated.
  • RNA can be converted into a double stranded DNA suitable for restriction digestion, PCR expansion and sequencing using reverse transcriptase and a polymerase. See, Ausubel, Sambrook and Berger, all supra. These general methods can be used for expression profiling. For example, arrays of probes can be spotted onto a surface and expression products (or in vitro amplified nucleic acids corresponding to expression products) can be labeled and hybridized with the array. For convenience, it may be helpful to use several arrays simultaneously. It is expected that one of skill is familiar with nucleic acid hybridization. General methods of hybridization are found in Berger, Sambrook and Ausubel, supra, and further in Tijssen (1993) Laboratory
  • solid phase arrays are adapted for the rapid and specific detection of multiple polymorphic nucleotides.
  • a nucleic acid probe is chemically linked to a solid support and a target nucleic acid (e.g., an RNA or corresponding amplified DNA) is hybridized to the probe.
  • hybridization is detected by detecting bound fluorescence.
  • hybridization is typically detected by quenching of the label by the bound nucleic acid.
  • detection of hybridization is typically performed by monitoring a signal shift such as a change in color, fluorescent quenching, or the like, resulting from proximity of the two bound labels.
  • an array of probes are synthesized on a solid support.
  • chip masking technologies and photoprotective chemistry it is possible to generate ordered arrays of nucleic acid probes with large numbers of probes.
  • These arrays, which are known, e.g., as "DNA chips," or as very large scale immobilized polymer arrays ("VLSIPS™" arrays), can include millions of defined probe regions on a substrate having an area of about 1 cm² to several cm².
  • arrays of chemicals, nucleic acids, proteins or the like can also be printed on a solid substrate using printing technologies.
  • these procedures provide a method of producing 4^n different oligonucleotide probes on an array using only 4n synthetic steps.
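The step count can be checked with a toy simulation of masked combinatorial synthesis. For each of the n positions there is one coupling step per base, and in each step the mask exposes only the sites whose target sequence needs that base at that position, giving 4 × n steps for all 4^n sequences of length n. The function name and the in-memory representation of sites are illustrative assumptions, not a model of the actual chemistry.

```python
from itertools import product

BASES = "ACGT"

def synthesize_array(n):
    """Simulate masked synthesis of every length-n oligonucleotide.

    Returns (sites, steps): the synthesized sequence at each site and the
    number of coupling steps used.
    """
    targets = ["".join(p) for p in product(BASES, repeat=n)]
    sites = [""] * len(targets)
    steps = 0
    for position in range(n):
        for base in BASES:
            # The "mask" exposes only sites whose next base is `base`.
            for i, target in enumerate(targets):
                if target[position] == base:
                    sites[i] += base
            steps += 1
    return sites, steps
```

For n = 3 the simulation builds 64 distinct trimers in 12 coupling steps, matching the 4^n probes in 4n steps stated above.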
  • Light-directed combinatorial synthesis of oligonucleotide arrays on a glass surface is performed with automated phosphoramidite chemistry and chip masking techniques similar to photo resist technologies in the computer chip industry.
  • a glass surface is derivatized with a silane reagent containing a functional group, e.g., a hydroxyl (for nucleic acid arrays) or amine group (for peptide or peptide nucleic acid arrays) blocked by a photolabile protecting group.
  • Photolysis through a photolithographic mask is used selectively to expose functional groups which are then ready to react with incoming 5'-photoprotected nucleoside phosphoramidites.
  • the phosphoramidites react only with those sites which are illuminated (and thus exposed by removal of the photolabile blocking group).
  • the phosphoramidites only add to those areas selectively exposed from the preceding step. These steps are repeated until the desired array of sequences have been synthesized on the solid surface.
  • Combinatorial synthesis of different oligonucleotide analogues at different locations on the array is determined by the pattern of illumination during synthesis and the order of addition of coupling reagents.
  • Monitoring of hybridization of target nucleic acids to the array is typically performed with fluorescence microscopes or laser scanning microscopes.
  • one of skill is also able to order custom-made arrays and array-reading devices from manufacturers specializing in array manufacture. For example, Affymetrix Corp. in Santa Clara, CA manufactures nucleic acid arrays.
  • probe design is influenced by the intended application. For example, where several allele-specific probe-target interactions are to be detected in a single assay, e.g., on a single nucleic acid chip, it is desirable to have similar melting temperatures for all of the probes. Accordingly, the lengths of the probes are adjusted so that the melting temperatures for all of the probes on the array are closely similar (it will be appreciated that different lengths for different probes may be needed to achieve a particular Tm where different probes have different GC contents). Although melting temperature is a primary consideration in probe design, other factors are also optionally used to further adjust probe construction, such as elimination of self-complementarity in the probe (which can inhibit hybridization of a target nucleotide).
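The interplay between GC content, probe length and melting temperature can be illustrated with the Wallace rule, a rough estimate for short oligonucleotides: Tm ≈ 2 °C × (A + T) + 4 °C × (G + C). The source does not state which Tm model is used, so the rule, the function names and the `min_len` parameter below are assumptions for illustration only.

```python
def wallace_tm(seq):
    """Rough melting temperature (deg C) of a short oligo by the Wallace rule."""
    seq = seq.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc

def probe_for_tm(template, target_tm, min_len=8):
    """Shortest prefix of `template` (at least `min_len` bases) whose
    estimated Tm reaches `target_tm`; None if the template is too short."""
    for length in range(min_len, len(template) + 1):
        probe = template[:length]
        if wallace_tm(probe) >= target_tm:
            return probe
    return None
```

Under this estimate, reaching 48 °C takes 12 bases of a pure GC template but 24 bases of a pure AT template, which is why probes with different GC contents end up with different lengths when a common Tm is targeted.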
  • a restriction site or amplification template for a second primer is incorporated, the primers are optionally longer than those described above by the length of the restriction site, or amplification template site.
  • Standard restriction enzyme sites include 4 base sites, 5 base sites, 6 base sites, 7 base sites, and 8 base sites.
  • An amplification template site for a second primer can be of essentially any length, for example, the site can be about 15-25 nucleotides in length.
  • the amplified products are optionally labeled and are typically resolved by electrophoresis on a polyacrylamide gel; the location(s) where label is present are excised and the labeled product species is/are recovered from the gel portion, typically by elution.
  • the resultant recovered product species can be subcloned into a replicable vector with or without attachment of linkers, amplified further, and/or detected, or even sequenced directly. Sequencing methods are described in Berger, Sambrook and Ausubel, supra.
  • differential display for expression profiling.
  • CuraGen Corp. New Haven CT
  • detected proteins can be derived from one of at least two sources.
  • the proteins which are detected can be either directly isolated from a cell or tissue to be profiled, providing direct detection (and, optionally, quantification) of proteins present in a cell.
  • mRNAs can be translated into cDNA sequences, cloned and expressed. This increases the ability to detect rare RNAs, and makes it possible to immediately associate a detected protein with its coding sequence.
  • it is not necessary even to express nucleic acids in the proper reading frame, as it is typically the presence or absence of an expression product that is, initially, at issue. Even an out-of-frame peptide is an indicator for the presence of a corresponding RNA.
  • hybridization techniques including western blotting, ELISA assays, and the like are available for detection of specific proteins. See, Ausubel, Sambrook and Berger, supra. See also, Antibodies: A Laboratory Manual, (1988) E. Harlow and D. Lane, Cold Spring Harbor Laboratory, Cold Spring Harbor, NY. Non-hybridization based techniques such as two-dimensional electrophoresis can also be used to simultaneously and specifically detect large numbers of proteins.
  • One typical technology for detecting specific proteins involves making antibodies to the proteins. By specifically detecting binding of an antibody and a given protein, the presence of the protein can be detected.
  • one of skill can easily make antibodies using existing techniques, or modify those antibodies which are commercially or publicly available.
  • general methods of producing polyclonal and monoclonal antibodies are known to those of skill in the art. See, e.g., Paul (ed) (1998) Fundamental Immunology, Fourth Edition Raven Press, Ltd., New York; Coligan (1991) Current Protocols in Immunology Wiley/Greene, NY; Harlow and Lane (1989) Antibodies: A Laboratory Manual Cold Spring Harbor Press, NY; Stites et al. (eds.) Basic and Clinical Immunology (4th ed.) Lange Medical Publications, Los-
  • Specific monoclonal and polyclonal antibodies and antisera will usually bind with a KD of at least about 0.1 µM, preferably at least about 0.01 µM or better, and most typically and preferably, 0.001 µM or better.
  • an “antibody” refers to a protein consisting of one or more polypeptides substantially or partially encoded by immunoglobulin genes or fragments of immunoglobulin genes.
  • the recognized immunoglobulin genes include the kappa, lambda, alpha, gamma, delta, epsilon and mu constant region genes, as well as myriad immunoglobulin variable region genes.
  • Light chains are classified as either kappa or lambda.
  • Heavy chains are classified as gamma, mu, alpha, delta, or epsilon, which in turn define the immunoglobulin classes, IgG, IgM, IgA, IgD and IgE, respectively.
  • a typical immunoglobulin (antibody) structural unit is known to comprise a tetramer.
  • Each tetramer is composed of two identical pairs of polypeptide chains, each pair having one "light" (about 25 kD) and one "heavy" chain (about 50-70 kD).
  • the N-terminus of each chain defines a variable region of about 100 to 110 or more amino acids primarily responsible for antigen recognition.
  • the terms variable light chain (VL) and variable heavy chain (VH) refer to these light and heavy chains respectively.
  • Antibodies exist as intact immunoglobulins or as a number of well characterized fragments produced by digestion with various peptidases.
  • pepsin digests an antibody below the disulfide linkages in the hinge region to produce F(ab)'2, a dimer of Fab which itself is a light chain joined to VH-CH1 by a disulfide bond.
  • the F(ab)'2 may be reduced under mild conditions to break the disulfide linkage in the hinge region, thereby converting the F(ab)'2 dimer into an Fab' monomer.
  • the Fab' monomer is essentially an Fab with part of the hinge region (see, Fundamental Immunology. W.E. Paul, ed., Raven Press, N.Y. (1993), for a more detailed description of other antibody fragments).
  • Antibodies include single chain antibodies, including single chain Fv (sFv) antibodies in which a variable heavy and a variable light chain are joined together (directly or through a peptide linker) to form a continuous polypeptide.
  • antibodies or antibody fragments can be arrayed, e.g., by coupling to an amine moiety fixed to a solid phase array, in a manner similar to that described above for construction of nucleic acid arrays.
  • the antibodies can be labeled, or proteins corresponding to expression products can be labeled. In this manner, it is possible to couple hundreds, or even thousands, of different antibodies to an array.
  • a bacteriophage antibody display library is screened with a polypeptide encoded by a cell, or obtained by expression of mRNAs, differential display, subtractive hybridization or the like.
  • Combinatorial libraries of antibodies have been generated in bacteriophage lambda expression systems which are screened as bacteriophage plaques or as colonies of lysogens (Huse et al. (1989) Science 246: 1275; Caton and Koprowski (1990) Proc. Natl. Acad. Sci. (U.S.A.) 87:6450; Mullinax et al (1990) Proc. Natl. Acad. Sci. (U.S.A.) 87:8095; Persson et al.
  • the patterns of hybridization which are detected provide an indication of the presence or absence of protein sequences. As long as the library or array against which a population of proteins are to be screened can be correlated from one experiment to the next -
  • peptide and nucleic acid hybridization to arrays or libraries can be treated in a manner analogous to a bar code label.
  • Any diverse library or array can be used to screen for the presence or absence of complementary molecules, whether RNA, DNA, protein, or a combination thereof.
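The bar code analogy can be made concrete with a short sketch. It assumes that each array position is read as a single intensity value and that one fixed threshold separates presence from absence; the threshold, the read-outs and the function names are hypothetical.

```python
def barcode(signals, threshold):
    """Encode one array read-out as a string of 1s (signal present) and 0s."""
    return "".join("1" if s >= threshold else "0" for s in signals)


def differing_positions(code_a, code_b):
    """Indices of array positions at which two bar codes disagree."""
    if len(code_a) != len(code_b):
        raise ValueError("bar codes must come from the same array layout")
    return [i for i, (a, b) in enumerate(zip(code_a, code_b)) if a != b]


# Hypothetical intensity read-outs for the same six array positions.
plant_1 = barcode([0.9, 0.1, 0.7, 0.0, 0.8, 0.2], threshold=0.5)
plant_2 = barcode([0.8, 0.6, 0.1, 0.0, 0.9, 0.1], threshold=0.5)
print(plant_1, plant_2, differing_positions(plant_1, plant_2))
```

As with a bar code, no knowledge of what each position encodes is needed; only a stable array layout from one experiment to the next is required.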
  • mass spectrometry is in use for identification of large sets of proteins in samples, and is suitable for identification of many proteins in a sequential or parallel fashion.
  • Hutchens et al. U.S. Pat. 5,719,060 describe methods and apparatus for desorption and ionization of analytes for subsequent analysis by mass spectroscopy and/or biosensors.
  • Sample presenting means with probe elements with "Surfaces Enhanced for Laser Desorption/Ionization" (SELDI) described in the '060 patent is particularly useful in the context of the present invention; however, other approaches described in the '060 are also generally applicable to the present invention.
  • Multi-dimensional gel technology is well-known and described e.g., in Ausubel, supra, Volume 2, Chapter 10.
  • Image analysis of multi-dimensional protein separation gels provides an indication of the proteins that are expressed e.g., in a cell or tissue type. It is worth noting that identification of particular proteins is not necessary; instead, positional and pattern information e.g., of protein staining or fluorescing patterns is sufficient to identify sets of protein expression products.
  • metabolites can be monitored by any currently available method, including chromatography, uni- or multi-dimensional gel separations, hybridization to complementary molecules, or the like.
  • the invention provides methods of identifying plant crosses with an increased probability of producing heterotic progeny plants. For example, in a preferred method, the expression profiles for a plurality of plants are compared, and the expression profiles are considered by pair-wise comparison. Desirable crosses produce progeny with a selected or optimal number of expression products, or progeny with a selected number or type of expression products that display a dominant, additive, over-dominant or under-dominant expression pattern. Desirably, these comparisons are performed in an integrated system which includes a computer. The generation and use of databases of expression profile information for performing a variety of comparisons is a feature of the invention.
  • a variety of comparative methods can be performed in an integrated system, e.g., to determine the heterosis (or likely heterosis) of a cross.
  • one simple measure that can be compared across different actual or potential crosses to determine the desirability of a particular cross is to determine the sum of the expressed gene products that differ from a progeny plant in each of a first and second parental plant and the number of expressed gene products that differ between the first and second parental plant. The larger this sum, typically, the more desirable the cross.
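The simple measure just described can be sketched as follows, assuming each expression profile is reduced to the set of expression products detected (presence or absence only). The function name and the product labels are hypothetical.

```python
def cross_score(parent_1: set, parent_2: set, progeny: set) -> int:
    """Sum of products that differ between progeny and each parent,
    plus products that differ between the two parents."""
    diff_parent_1 = len(progeny ^ parent_1)   # symmetric difference
    diff_parent_2 = len(progeny ^ parent_2)
    diff_parents = len(parent_1 ^ parent_2)
    return diff_parent_1 + diff_parent_2 + diff_parents


# Hypothetical presence/absence profiles.
p1 = {"g1", "g2", "g3"}
p2 = {"g2", "g4"}
f1 = {"g1", "g2", "g3", "g4", "g5"}
print(cross_score(p1, p2, f1))
```

Under this reading, a larger score marks a more desirable cross; a quantitative profile would need a rule for what counts as "differing", which is left open here.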
  • matrices of possible expression profile combinations for plants are generated.
  • the expression profiles are compiled in a database in a computer and a matrix of possible pair-wise expression profile combinations for the plants is generated and queried using an integrated system comprising a computer with software for generating and comparing matrices.
  • Subsets of potential crosses from all of the possible pair-wise comparisons which exhibit a maximal number of expression profile differences represent one preferred cross.
  • Useful software aids in determining how many genes are expressed, or whether expressed genes are additive, dominant, over-dominant or under-dominant.
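A minimal sketch of such a classification step is given below. The category rules (hybrid level compared against the two parental levels, with a tolerance) and the tolerance value are illustrative assumptions, not definitions taken from this text.

```python
def classify_expression(parent_1, parent_2, hybrid, tol=0.1):
    """Classify a hybrid expression level relative to its two parents.
    All thresholds are hypothetical and chosen only for illustration."""
    low, high = sorted((parent_1, parent_2))
    mid_parent = (low + high) / 2
    if hybrid > high + tol:
        return "over-dominant"       # above both parents
    if hybrid < low - tol:
        return "under-dominant"      # below both parents
    if high - low <= tol:
        return "no parental difference"
    if abs(hybrid - mid_parent) <= tol:
        return "additive"            # close to the mid-parent value
    if abs(hybrid - high) <= tol or abs(hybrid - low) <= tol:
        return "dominant"            # close to one parent
    return "partially dominant"


for level in (3.0, 2.0, 4.0, 0.5, 2.5):
    print(level, classify_expression(1.0, 3.0, level))
```

Counting the products that fall in each category across a profile then gives the per-cross tallies referred to above.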
  • matrix information can be limited to possible pair-wise crosses for plants from different heterotic groups, or from the same heterotic group.
  • the fidelity of predicted expression profile information increasingly varies as subsequent cross information is considered, and of course, the number of possible crosses increases. Accordingly, typically only one or a few rounds of potential crosses are considered at one time. In any case, selection of a subset of potential crosses from all of the possible pair-wise comparisons which exhibit a maximal number of expression profile differences is desirable. A variety of rules for performing the basic comparisons can be used.
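The pair-wise matrix and its restriction to crosses between or within heterotic groups might look like the sketch below. It assumes presence/absence profiles stored as sets and a simple plant-to-group mapping; all names and data are hypothetical.

```python
from itertools import combinations


def pairwise_differences(profiles, groups=None, between_groups=True):
    """Map each candidate cross (a, b) to the number of differing products.
    If groups is given, keep only between-group pairs (or only
    within-group pairs when between_groups is False)."""
    matrix = {}
    for a, b in combinations(sorted(profiles), 2):
        if groups is not None:
            same_group = groups[a] == groups[b]
            if same_group == between_groups:
                continue
        matrix[(a, b)] = len(profiles[a] ^ profiles[b])
    return matrix


def rank_crosses(matrix):
    """Candidate crosses ordered from most to fewest profile differences."""
    return sorted(matrix.items(), key=lambda kv: (-kv[1], kv[0]))


profiles = {
    "A1": {"g1", "g2", "g3"},
    "A2": {"g2", "g3", "g4"},
    "B1": {"g5", "g6"},
}
groups = {"A1": "A", "A2": "A", "B1": "B"}
print(rank_crosses(pairwise_differences(profiles)))
print(pairwise_differences(profiles, groups, between_groups=True))
print(pairwise_differences(profiles, groups, between_groups=False))
```

Only one round of crosses is considered, in keeping with the remark that predictions become less reliable as further rounds are added.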
  • crosses are identified in which the sum of: (i) expression products produced in a first plant from a first heterotic group (A1) which are not expressed in a second plant from the first heterotic group (A) to which the first plant is crossed (A2), and which are not expressed in a selected third plant from a second heterotic group (B), plus (ii) the expression products produced in A2 which are not produced in A1 and which are not produced in B, is optimized.
  • This optimization results in crosses which achieve elevated numbers of expression products expressed in heterotic hybrid progeny, and also in an optimization of the number of dominant products expressed.
  • optimization is achieved by determining all possible pair-wise combinations from the first heterotic group and identifying the cross which results in the largest sum of expression products, or by determining all possible pair-wise combinations from the first heterotic group and identifying crosses which result in a hybrid progeny (A1 x A2) with a maximal number of differences as compared to B, or by determining all possible pair-wise combinations from the first heterotic group and identifying crosses which result in the hybrid progeny (A1 x A2) having a greater number of differences with B than the number of differences between B and A1, or B and A2.
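The first of these optimization rules can be sketched as below, calling the two plants from the first heterotic group A1 and A2 and the plant from the second group B. The sketch assumes presence/absence profiles stored as sets; the function names and data are hypothetical.

```python
from itertools import combinations


def unique_product_sum(a1: set, a2: set, b: set) -> int:
    """(i) products in A1 but not in A2 or B, plus
    (ii) products in A2 but not in A1 or B."""
    return len(a1 - a2 - b) + len(a2 - a1 - b)


def best_within_group_cross(group_a: dict, tester_b: set):
    """Score every pair from the first heterotic group against tester B
    and return the highest-scoring pair with all scores."""
    scores = {
        (x, y): unique_product_sum(group_a[x], group_a[y], tester_b)
        for x, y in combinations(sorted(group_a), 2)
    }
    best = max(sorted(scores), key=scores.get)
    return best, scores


group_a = {
    "A1": {"g1", "g2", "g3"},
    "A2": {"g3", "g4", "g9"},
    "A3": {"g1", "g2", "g5"},
}
tester_b = {"g3", "g9"}
best, scores = best_within_group_cross(group_a, tester_b)
print(best, scores)
```

The other two rules (differences between the predicted hybrid and B) would need a model of the hybrid profile, which this sketch does not attempt.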
  • Such implementations can also be used to improve selection methods per se. For example, in one method, self- or back-crossed progeny derived from the A1 x A2 hybrid are selected which either retain a set of expression products defined by the sum of expression products expressed in A1 (but not A2 or B) and A2 (but not A1 or B), or which show a larger number of expression products expressed in a topcross with B than does either A1 or A2 when topcrossed with B.
  • One approach for comparing profiles is a nested analysis in which expression profiles are successively grouped together, and the many gene expression differences seen in individual pair-wise comparisons can be ranked hierarchically in a filtering process. This method is useful for identifying genes expressed in one set of genotypes vs. another, e.g. hybrids vs. inbreds or bulked segregants from the two ends of a quantitative phenotypic distribution.
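One possible reading of this filtering process is sketched below: differences from every pair-wise comparison between two groups of profiles are pooled, ranked by how often they recur, and then filtered. The ranking criterion, the function names and the data are assumptions made for illustration.

```python
from collections import Counter
from itertools import product


def rank_differences(group_1: dict, group_2: dict):
    """For each expression product, count the pair-wise comparisons in which
    it is present in the group_1 profile and absent from the group_2 profile."""
    counts = Counter()
    for profile_1, profile_2 in product(group_1.values(), group_2.values()):
        counts.update(profile_1 - profile_2)
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))


def filter_consistent(ranked, n_comparisons):
    """Keep only products that differ in every pair-wise comparison."""
    return [name for name, count in ranked if count == n_comparisons]


hybrids = {"H1": {"g1", "g2", "g3"}, "H2": {"g1", "g3", "g4"}}
inbreds = {"I1": {"g2", "g5"}, "I2": {"g4", "g5"}}
ranked = rank_differences(hybrids, inbreds)
print(ranked)
print(filter_consistent(ranked, len(hybrids) * len(inbreds)))
```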
  • the methods of the invention can include inputting an expression profile for progeny or parental plants into a database of expression profiles. This can be performed manually, but is more typically performed in an automated system.
  • Computer databases of expression profile information can be quite large, with from a few up to several thousand profiles in the database.
  • the database will have expression product profiles of a representative sample of expression products for hybrid progeny plants resulting from at least 10 separate inbred plant crosses, or at least 10 inbred plant expression product profiles.
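A database of this kind could be prototyped as follows. The sketch uses Python's built-in sqlite3 module with an in-memory database; the table layout, column names and plant labels are hypothetical and chosen only to show profiles being entered and retrieved.

```python
import sqlite3


def open_profile_db():
    """Create an in-memory database with one row per (plant, product)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE profile ("
        "plant TEXT NOT NULL, kind TEXT NOT NULL, product TEXT NOT NULL, "
        "PRIMARY KEY (plant, product))"
    )
    return conn


def add_profile(conn, plant, kind, products):
    """Enter one expression profile (kind is e.g. 'inbred' or 'hybrid')."""
    conn.executemany(
        "INSERT INTO profile (plant, kind, product) VALUES (?, ?, ?)",
        [(plant, kind, product) for product in sorted(products)],
    )
    conn.commit()


def load_profile(conn, plant):
    rows = conn.execute("SELECT product FROM profile WHERE plant = ?", (plant,))
    return {row[0] for row in rows}


def count_plants(conn, kind):
    query = "SELECT COUNT(DISTINCT plant) FROM profile WHERE kind = ?"
    return conn.execute(query, (kind,)).fetchone()[0]


conn = open_profile_db()
for i in range(1, 11):
    add_profile(conn, f"inbred_{i}", "inbred", {f"g{i}", "g_shared"})
print(count_plants(conn, "inbred"), load_profile(conn, "inbred_3"))
```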
  • "computer system" or "integrated system" in the context of this invention refers to a system in which data entering a computer corresponds to physical objects or processes external to the computer, e.g., nucleic acid hybridization or protein binding data, and a process that, within a computer, causes a physical transformation of the input signals to different output signals.
  • the input data e.g., hybridization of expression products on a specific array
  • output data e.g., the identification or counting of the sequence hybridized, comparison to similar arrays with different test materials, counting and categorization of expression products or the like.
  • the process within the computer is a program by which positive (or negative) hybridization signals are recognized by the computer system and attributed to a region of an array, or other expression profile format (e.g., simple counting of array signals).
  • the program determines which region of the array the hybridized expression products are located on and, optionally, the specific corresponding sequences which the probe is based on (as noted above, no sequence information is required for making or assessing expression profiles).
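The signal-recognition step might be sketched as follows, assuming the array read-out is already available as a grid of intensities and that a fixed threshold defines a positive signal. The optional probe map, the threshold and all names are hypothetical.

```python
def call_signals(intensities, threshold, probe_map=None):
    """Return (row, column, probe) for every array region whose intensity
    reaches the threshold; probe is None when no sequence is known."""
    calls = []
    for row_index, row in enumerate(intensities):
        for col_index, value in enumerate(row):
            if value >= threshold:
                probe = (probe_map or {}).get((row_index, col_index))
                calls.append((row_index, col_index, probe))
    return calls


# Hypothetical 2 x 2 array read-out; only one region has a known probe.
grid = [[0.9, 0.1],
        [0.2, 0.7]]
probes = {(0, 0): "probe_17"}
calls = call_signals(grid, threshold=0.5, probe_map=probes)
print(calls, len(calls))
```

Regions without an entry in the probe map are still reported, consistent with the point that no sequence information is required to build or assess a profile.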
  • the invention provides integrated systems for plant or plant cell manipulation and hybridization analysis. Typical systems include a digital computer with high-throughput liquid control software, image analysis software, and data interpretation software.
  • a robotic liquid control armature for transferring solutions (e.g., plant cell extracts) from a source to a destination, is typically operably linked to the digital computer.
  • An input device for entering data to the digital computer to control high throughput liquid transfer by the robotic liquid control armature and, optionally, to control transfer by the pinning armature to the solid support is commonly a feature of the integrated system, as is an image scanner for digitizing label signals from labeled probe hybridized to the DNA on the solid support operably linked to the digital computer.
  • the image scanner interfaces with the image analysis software to provide a measurement of probe label intensity, where the probe label intensity measurement is interpreted by the data interpretation software to show whether, and to what degree, the labeled probe hybridizes to a label.
  • High throughput screening systems are commercially available (see, e.g., Zymark Corp., Hopkinton, MA; Air Technical Industries, Mentor, OH; Beckman Instruments, Inc., Fullerton, CA; Precision Systems, Inc., Natick, MA, etc.). These systems typically automate entire procedures including all sample and reagent pipetting, liquid dispensing, timed incubations, and final readings of the microplate in detector(s) appropriate for the assay. These configurable systems provide high throughput and rapid start up as well as a high degree of flexibility and customization. For example, the currently available commercial software package, BioWorks® 1 4®, provided by Beckman Instruments, Inc.
  • Optical images viewed (and, optionally, recorded) by a camera or other recording device are optionally further processed in any of the embodiments herein, e.g., by digitizing the image and/or storing and analyzing the image on a computer.
  • a variety of commercially available peripheral equipment and software is available for digitizing, storing and analyzing a digitized video or digitized optical image, e.g., using PC (Intel x86 or pentium chip-compatible DOS™, OS2™, WINDOWS™, WINDOWS NT™ or WINDOWS95™ based machines), MACINTOSH™, or UNIX based
  • a CCD camera includes an array of picture elements (pixels). The light from the specimen is imaged on the CCD. Particular pixels corresponding to regions of the specimen (e.g., individual hybridization sites on an array of biological polymers) are sampled to obtain light intensity readings for each position. Multiple pixels are processed in parallel to increase speed.
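The pixel sampling step can be illustrated with a small sketch that averages the pixels belonging to each hybridization site. It assumes square sites laid out on a regular grid and processes them one after another rather than in parallel; the image values and function names are hypothetical.

```python
def spot_intensity(image, top, left, size):
    """Mean intensity of the size x size pixel block starting at (top, left)."""
    pixels = [
        image[r][c]
        for r in range(top, top + size)
        for c in range(left, left + size)
    ]
    return sum(pixels) / len(pixels)


def read_array(image, spot_size):
    """Reduce a pixel image to one intensity reading per hybridization site."""
    n_rows = len(image) // spot_size
    n_cols = len(image[0]) // spot_size
    return [
        [spot_intensity(image, r * spot_size, c * spot_size, spot_size)
         for c in range(n_cols)]
        for r in range(n_rows)
    ]


# Hypothetical 4 x 4 pixel image covering a 2 x 2 array of sites.
image = [
    [10, 12, 0, 1],
    [14, 12, 1, 2],
    [0, 0, 8, 8],
    [0, 4, 8, 8],
]
readings = read_array(image, spot_size=2)
print(readings)
```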
  • Integrated systems for hybridization analysis of the present invention typically include a digital computer with high-throughput liquid control software, image analysis software, data interpretation software, a robotic liquid control armature for transferring solutions from a source to a destination operably linked to the digital computer, an input device (e.g., a computer keyboard) for entering data to the digital computer to control high throughput liquid transfer by the robotic liquid control armature and, optionally, an image scanner for digitizing label signals from labeled probe hybridized to expression products, e.g., on a solid support operably linked to the digital computer.
  • the image scanner interfaces with the image analysis software to provide a measurement of probe label intensity. Typically, the probe label intensity measurement is interpreted by the data interpretation software to show whether the labeled probe hybridizes to the DNA on the solid support.
  • Software to support sample processing can be divided into 4 functional categories: 1) liquid transfer control software, 2) image analysis software
  • applications can share information through data files which the applications can read and create.
  • files can be formatted as simple text files and/or in Microsoft Excel® or other worksheet format. This allows viewing and editing of the files through the use of commercially available software such as Microsoft Excel®.
  • a Microsoft Windows® user interface can be developed for most applications using Microsoft Visual Basic 4.0®. Most applications can be developed for a 32-bit environment to run under Microsoft Windows 95® or 98®. 16-bit applications such as image analysis software developed by Optimas Corporation, Optimas 5.0, can also be useful components of the integrated system.
  • nucleic acid encoding an expression product identified as being of interest by the expression profiling techniques noted herein can be cloned. It is expected that many such nucleic acids, particularly dominant and additive nucleic acids, will be encoded by loci responsible for desirable quantitative traits ("QTL"; see, Edwards, et al., (1987) in Genetics 115:113). QTL include genes that control, to some degree, numerically quantifiable phenotypic traits such as disease resistance, crop yield, resistance to environmental extremes, etc. In addition to the methods herein, other experimental paradigms can be used to identify, analyze and select for QTL.
  • One paradigm involves crossing two inbred lines and genotyping multiple marker loci and evaluating one to several quantitative phenotypic traits among the progeny of the cross. QTL are then identified and ultimately selected for based on significant statistical associations between the genotypic values determined by genetic marker technology and the phenotypic variability among the segregating progeny. As applied to the present invention, the particular nucleic acids identified as encoding dominant, additive, or under- or over-dominant expression products, or as encoding silenced expression products, are potential products of QTLs or other genes or loci of interest.
  • nucleic acids which are genetically linked to DNAs encoding these expression products for transduction into cells (e.g., coding sequences for expression products, or genetically linked coding or non-coding sequences), especially to make transgenic plants.
  • the cloned sequences are also useful as molecular tags for selected plant strains, e.g., to identify parentage, and are further useful for encoding expression products, including nucleic acids and polypeptides.
  • expression products which are differentially expressed between heterotic and non-heterotic plants are encoded by QTL and are responsible for the phenotypic effects of the QTL.
  • a DNA linked to a locus encoding an expression product is introduced into plant cells, either in culture or in organs of a plant, e.g., leaves, stems, fruit, seed, etc.
  • the expression of natural or synthetic nucleic acids encoded by nucleic acids linked to expression product coding nucleic acids can be achieved by operably linking a cloned nucleic acid of interest, such as an expression product or a genetically linked nucleic acid, to a promoter, incorporating the construct into an expression vector and introducing the vector into a suitable host cell.
  • an endogenous promoter linked to the nucleic acids can be used.
  • Bacterial cells are often used to increase the number of plasmids containing DNA constructs of this invention.
  • the bacteria are grown to log phase and the plasmids within the bacteria can be isolated by a variety of methods known in the art (see, for instance, Sambrook).
  • kits are commercially available for the purification of plasmids from bacteria.
  • Agrobacterium tumefaciens related vectors to infect plants contain transcription and translation terminators, transcription and translation initiation sequences, and promoters useful for regulation of the expression of the particular nucleic acid.
  • the vectors optionally comprise generic expression cassettes containing at least one independent terminator sequence, sequences permitting replication of the cassette in eukaryotes, or prokaryotes, or both, (e.g., shuttle vectors) and selection markers for both prokaryotic and eukaryotic systems.
  • Vectors are suitable for replication and integration in prokaryotes, eukaryotes, or preferably both.
  • the nucleic acid constructs of the invention are introduced into plant cells, either in culture or in the organs of a plant by a variety of conventional techniques.
  • the DNA construct can be introduced directly into the genomic DNA of the plant cell using techniques such as electroporation and microinjection of plant cell protoplasts, or the DNA constructs can be introduced directly to plant cells using ballistic methods, such as DNA particle bombardment.
  • the DNA constructs are combined with suitable T-DNA flanking regions and introduced into a conventional Agrobacterium tumefaciens host vector.
  • the virulence functions of the Agrobacterium tumefaciens host direct the insertion of the construct and adjacent marker into the plant cell DNA when the cell is infected by the bacteria.
  • Microinjection techniques are known in the art and well described in the scientific and patent literature.
  • the introduction of DNA constructs using polyethylene glycol precipitation is described in Paszkowski, et al, EMBO J. 3:2717 (1984).
  • Electroporation techniques are described in Fromm, et al, Proc. Nat'l. Acad. Sci. USA
  • Agrobacterium tumefaciens-mediated transformation techniques, including disarming and use of binary vectors, are also well described in the scientific literature. See, for example Horsch, et al, Science 233:496-498 (1984), and Fraley, et al., Proc. Nat'l. Acad.
  • Agrobacterium-mediated transformation is a preferred method of transformation of dicots.
  • recombinant DNA vectors suitable for transformation of plant cells are prepared.
  • a DNA sequence coding for the desired mRNA, polypeptide, or non-expressed sequence is transduced into the plant.
  • the sequence is optionally combined with transcriptional and translational initiation regulatory sequences which will direct the transcription of the sequence from the gene in the intended tissues of the transformed plant.
  • Promoters in nucleic acids linked to loci identified by detecting expression products, are identified, e.g., by analyzing the 5' sequences upstream of a coding sequence in linkage disequilibrium with the loci.
  • promoters will be associated with a QTL. Sequences characteristic of promoter sequences can be used to identify the promoter. Sequences controlling eukaryotic gene expression have been extensively studied. For instance, promoter sequence elements include the TATA box consensus sequence (TATAAT), which is usually 20 to 30 base pairs upstream of a transcription start site. In most instances the TATA box aids in accurate transcription initiation. In plants, further upstream from the TATA box, at positions -80 to -100, there is typically a promoter element with a series of adenines surrounding the trinucleotide G (or T) N G. See, e.g., J. Messing, et al, in Genetic Engineering in Plants, pp. 221-227 (Kosuge, Meredith and Hollaender, eds. (1983)). A number of methods are known to those of skill in the art for identifying and characterizing promoter regions in plant genomic DNA.
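A simple scan for the TATA box consensus in the stated window could be sketched as below. It assumes the input is the upstream sequence ending immediately 5' of the transcription start site and measures distance from the first base of the motif; exact matching only, with hypothetical function names and sequences.

```python
def find_tata_boxes(upstream, motif="TATAAT", window=(20, 30)):
    """Return distances (bp upstream of the start site) of every exact
    motif occurrence whose first base lies within the window."""
    upstream = upstream.upper()
    hits = []
    start = upstream.find(motif)
    while start != -1:
        distance = len(upstream) - start  # bases from motif start to start site
        if window[0] <= distance <= window[1]:
            hits.append(distance)
        start = upstream.find(motif, start + 1)
    return hits


# Hypothetical upstream sequences.
candidate = "GC" * 10 + "TATAAT" + "GC" * 10   # motif begins 26 bp upstream
too_far = "TATAAT" + "G" * 34                  # motif begins 40 bp upstream
print(find_tata_boxes(candidate), find_tata_boxes(too_far))
```

A practical search would tolerate mismatches and would also look for the upstream element at -80 to -100; both are outside this sketch.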
  • a plant promoter fragment is optionally employed which directs expression of a nucleic acid in any or all tissues of a regenerated plant.
  • constitutive promoters include the cauliflower mosaic virus (CaMV) 35S transcription initiation region, the 1'- or 2'- promoter derived from T-DNA of Agrobacterium tumefaciens, and other transcription initiation regions from various plant genes known to those of skill.
  • the plant promoter may direct expression of the polynucleotide of the invention in a specific tissue (tissue-specific promoters) or may be otherwise under more precise environmental control (inducible promoters).
  • tissue-specific promoters under developmental control include promoters that initiate transcription only in certain tissues, such as fruit, seeds, or flowers. Any of a number of promoters which direct transcription in plant cells can be suitable.
  • the promoter can be either constitutive or inducible.
  • promoters of bacterial origin which operate in plants include the octopine synthase promoter, the nopaline synthase promoter and other promoters derived from native Ti plasmids. See, Herrera-Estrella et al. (1983), Nature 303:209-213. Viral promoters include the 35S and 19S RNA promoters of cauliflower mosaic virus. See, Odell et al
  • plant promoters include the ribulose-1,5-bisphosphate carboxylase small subunit promoter and the phaseolin promoter.
  • the promoter sequence from the E8 gene and other genes may also be used. The isolation and sequence of the E8 promoter is described in detail in Deikman and Fischer, (1988) EMBO J. 7:3315-3327.
  • a polyadenylation region at the 3'-end of the coding region is typically included. The polyadenylation region can be derived from the natural gene, from a variety of other plant genes, or from T-DNA.
  • the vector comprising the sequences from genes encoding expression products of the invention will typically comprise a nucleic acid subsequence which confers a selectable phenotype on plant cells.
  • the vector comprising the sequence will typically comprise a marker gene which confers a selectable phenotype on plant cells.
  • the marker may encode biocide tolerance, particularly antibiotic tolerance, such as tolerance to kanamycin, G418, bleomycin, hygromycin, or herbicide tolerance, such as tolerance to chlorsulfuron, or phosphinothricin (the active ingredient in the herbicides bialaphos and Basta).
  • crop selectivity to specific herbicides can be conferred by engineering genes into crops which encode appropriate herbicide metabolizing enzymes from other organisms, such as microbes.
  • Padgette et al. (1996) "New weed control opportunities: Development of soybeans with a Roundup Ready™ gene" In: Herbicide-Resistant Crops (Duke, ed.), pp 53-84, CRC Lewis Publishers, Boca
  • genes that confer tolerance to herbicides include: a gene encoding a chimeric protein of rat cytochrome P4507A1 and yeast NADPH-cytochrome P450 oxidoreductase (Shiota, et al. (1994) Plant Physiol. 106(1)17), genes for glutathione reductase and superoxide dismutase (Aono, et al. (1995) Plant Cell Physiol.
  • nucleic acids which can be cloned and introduced into plants to modify or complement expression of a gene, including a silenced gene, a dominant gene, an additive gene or the like, can be any of a variety of constructs, depending on the particular application.
  • a nucleic acid encoding a cDNA expressed from an identified gene can be expressed in a plant under the control of a heterologous promoter.
  • a nucleic acid encoding a transcription factor that regulates a target identified by the methods herein, or that encodes any other moiety affecting transcription, can be cloned and transduced into a plant. Methods of identifying such factors are replete throughout the literature. For a basic introduction to genetic regulation, see, Lewin (1995) Genes V Oxford University Press Inc., NY (Lewin), and the references cited therein.
  • Transformed plant cells which are derived by any of the above transformation techniques can be cultured to regenerate a whole plant which possesses the transformed genotype and thus the desired phenotype.
  • Such regeneration techniques rely on manipulation of certain phytohormones in a tissue culture growth medium, typically relying on a biocide and/or herbicide marker which has been introduced together with the desired nucleotide sequences.
  • Plant regeneration from cultured protoplasts is described in Evans, et al., Protoplasts Isolation and Culture, Handbook of Plant Cell Culture, pp 124-176, Macmillan Publishing Company, New York, (1983), and Binding, Regeneration of Plants. Plant Protoplasts, pp. 21-73, CRC Press, Boca Raton, (1985).
  • Regeneration can also be obtained from plant callus, explants, somatic embryos (Dandekar, et al., J. Tissue Cult. Meth. 12: 145 (1989); McGranahan, et al., Plant Cell Rep 8:512 (1990)), organs, or parts thereof.
  • Such regeneration techniques are described generally in Klee, et al, Ann. Rev. of Plant Phys. 38:467-486 (1987).
  • One of skill will recognize that after the expression cassette is stably incorporated in transgenic plants and confirmed to be operable, it can be introduced into other plants by sexual crossing. Any of a number of standard breeding techniques can be used, depending upon the species to be crossed.
  • GENE SILENCING AND HETEROSIS It is discovered that gene silencing and epigenetic effects play a role in inbreeding depression. As demonstrated herein, the number of genes in hybrids with a dominant pattern of gene expression is correlated with hybrid yield, a component of which is found to be relief from inbreeding depression. Another way of considering genes in this class is to classify them as genes that are expressed at lower levels in one inbred parent than the other.
  • the number of genes in the dominant class was considered as a function of the number of hybrids that share those genes, and the frequency distribution indicated that the overlap between sets of genes contributing to dominant patterns of gene expression in hybrids is essentially random. This suggests that, during the process of inbreeding, expression of a subset of genes may always be altered (and usually reduced), and that the expression of different random subsets of genes is silenced in different inbreds.
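The frequency distribution referred to above can be tabulated with a short sketch: for each gene, count the hybrids in which it shows a dominant pattern, then count the genes at each level of sharing. The data and names are hypothetical, and the sketch only builds the distribution; it does not test whether the overlap is random.

```python
from collections import Counter


def sharing_distribution(dominant_sets: dict) -> dict:
    """Map k -> number of genes with a dominant pattern in exactly k hybrids."""
    hybrids_per_gene = Counter()
    for genes in dominant_sets.values():
        hybrids_per_gene.update(genes)
    return dict(sorted(Counter(hybrids_per_gene.values()).items()))


# Hypothetical sets of dominant-pattern genes for three hybrids.
dominant_sets = {
    "H1": {"g1", "g2", "g3"},
    "H2": {"g2", "g4"},
    "H3": {"g2", "g3", "g5"},
}
distribution = sharing_distribution(dominant_sets)
print(distribution)
```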
  • allelic (and non-allelic) effects have been described where expression in heterozygotes is normal, but in homozygotes trans-inactivation (or silencing) of both alleles occurs.
  • These effects are mediated by cis-acting regulatory sequences that need to be present at more than one copy (e.g. on different chromosome homologs) to mediate the cooperative assembly of multimeric protein complexes responsible for gene silencing (e.g., Polycomb proteins in Drosophila or SIR proteins in yeast).
  • sequences responsible for these effects most likely occur in intergenic regions outside of the chromatin loops flanked by MARs that contain genes.
  • the present invention provides methods of identifying unique expression products and/or unique profiles (or partial profiles). This ability to identify unique expression products provides one way of ascertaining parentage, which, in turn, provides the ability to determine whether a hybrid comprises proprietary material.
  • a source or the sources of a test plant such as a hybrid can be identified.
  • a representative sample of expression products from the test plant is profiled and the resulting test expression profile is compared to a database of known expression profiles for plants from known inbred or hybrid strains (methods of making such databases are described above).
  • the expression profiles for a selected tissue can be entered into a database for any or every proprietary plant (or clone, or any other source of germ plasm) that a corporation owns.
  • profiling a number of plants it is possible to detect unique expression products and/or expression patterns within the expression profile of specific plants. It is also possible to generate likely expression profiles for hybrid products of members of the database. Any of these expression profiles can be compared to an actual expression profile for a test plant suspected of being derived from one or more proprietary plants. For example, a matrix of pair-wise comparisons for potential progeny from the expression profiles in the database can be compared to the test expression profile. Either the entire expression profile or a sub-portion of the expression profile (i.e., a plurality of characters corresponding to expression products found in the overall profile) comprising at least one unique expression marker can be evaluated.
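The database comparison described above can be illustrated with a small sketch. Several things here are assumptions rather than anything stated in the source: profiles are reduced to sets of expression-product IDs, a hybrid's likely profile is approximated as the union of its parents' products, and Jaccard similarity is used as the comparison score. All names and data are hypothetical.

```python
from itertools import combinations

def jaccard(a, b):
    """Jaccard similarity of two sets of expression-product IDs."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0

def unique_markers(database):
    """Products found in exactly one database entry, keyed by that entry."""
    result = {}
    for name, profile in database.items():
        others = set().union(*(p for n, p in database.items() if n != name))
        result[name] = profile - others
    return result

def rank_candidate_crosses(database, test_profile):
    """Score every pairwise cross of database entries against the test profile."""
    scores = []
    for n1, n2 in combinations(sorted(database), 2):
        # Assumption: the hybrid expresses the union of the parental products.
        predicted = database[n1] | database[n2]
        scores.append((jaccard(predicted, test_profile), n1, n2))
    return sorted(scores, reverse=True)

# Hypothetical database of proprietary inbred profiles and a test plant.
database = {
    "inbred_1": {"p1", "p2", "p3"},
    "inbred_2": {"p2", "p4"},
    "inbred_3": {"p5", "p6"},
}
test_profile = {"p1", "p2", "p3", "p4"}
ranked = rank_candidate_crosses(database, test_profile)
markers = unique_markers(database)
```

In this toy case the highest-ranked cross is the pair whose predicted profile matches the test plant most closely, and `markers` lists the products that would serve as unique expression markers for each entry.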
  • EXAMPLE 1 DIFFERENCES IN RNA EXPRESSION PROFILES CORRELATE WITH HETEROSIS Heterosis is a term used to describe the increased vigor of hybrid progeny in comparison to their parents. Although heterosis has been widely used in plant breeding for many decades, the molecular mechanisms underlying the phenomenon were previously unknown. In this example, heterosis was studied as a phenotype using CuraGen (CuraGen Corp., New Haven CT) RNA profiling technology to examine differences in RNA expression between hybrids and their inbred parents. Using this approach, it was possible to sort cDNA fragments into different categories, depending on their relative levels of expression in a given hybrid and its two parents.
  • the degree of heterosis varies tremendously among hybrids from different parental combinations. In current breeding practice, selection for parent combinations which give a high degree of heterosis depends on top-cross yield tests.
  • new methods of monitoring heterosis by identifying genes and gene expression patterns associated with heterosis expression are provided. Specific gene expression patterns associated with heterosis are identified prior to yield testing. This allows screening of larger numbers of top-crosses without having to yield test all combinations.
  • non-optimally expressed genes in existing commercial hybrids can be identified and improved by transgenic manipulation or gene-expression profile assisted selection.
  • Figure 1 graphically represents the correlation between degree of heterosis and % relationship: % relationship is designated on the X axis; F1-MP heterosis in bu/LCR is given on the Y axis. Data was obtained from 4 locations in JH97.
  • RNAs in each F1 hybrid were expressed at the same levels as in both parental inbreds. Genetically distantly related inbreds, e.g., the parents of commercial hybrids, had less than 6% of the mRNAs differentially expressed. The number of differentially expressed RNA bands between two inbred parents was positively correlated with the corresponding hybrid yield, demonstrating that gene expression differences and/or DNA sequence polymorphism between inbred parents are important for heterosis.
  • RNA expression in the hybrid can differ from one inbred parent or the other (dominant), or both (additive or over-/under-dominant).
  • Figure 2 depicts the classification of gene expression patterns in F1 hybrids relative to the inbred parents. RNA levels are provided on the vertical axis. Bands in each class exhibited the following expression patterns: (A) Over/under-dominant class: the level of expression in the F1 hybrid is at least two-fold higher or lower than both parents, which have either equal or different levels of expression. In the (B) additive and (C) dominant classes, the mRNA levels of the inbred parents are different. The majority of RNA expression level differences in both tissues of all hybrids analyzed were in the additive and dominant classes.
  • Additive class: the F1's expression level falls within the range of the two parents.
  • Dominant class: the level of expression in the F1 hybrid is equal to one parent but different from the other. Two-thirds of the differences observed exhibited additive expression, and the rest of the differences demonstrated a dominant expression pattern.
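The three expression classes described for Figure 2 can be expressed as a small decision procedure. The sketch below is illustrative only: the function name is hypothetical, expression levels are assumed to be strictly positive numbers, and the rule that two levels count as "equal" when they are less than two-fold apart is an assumption borrowed from the two-fold criterion of class (A), not a rule stated in the source.

```python
def classify_band(p1, p2, f1, fold=2.0):
    """Classify one band from parental (p1, p2) and F1 expression levels."""
    if min(p1, p2, f1) <= 0:
        raise ValueError("expression levels must be strictly positive")

    def differs(a, b):
        # Assumption: two levels "differ" when they are at least `fold` apart.
        low, high = sorted((a, b))
        return high >= fold * low

    lo_parent, hi_parent = sorted((p1, p2))
    # (A) F1 at least `fold` above the higher parent or below the lower parent.
    if f1 >= fold * hi_parent or f1 * fold <= lo_parent:
        return "over/under-dominant"
    # Additive and dominant classes require the parents themselves to differ.
    if not differs(p1, p2):
        return "no difference"
    f1_matches_p1 = not differs(f1, p1)
    f1_matches_p2 = not differs(f1, p2)
    # (C) F1 equals one parent but differs from the other.
    if f1_matches_p1 != f1_matches_p2:
        return "dominant"
    # (B) F1 lies within the parental range without matching exactly one parent.
    return "additive"

# Hypothetical band intensities (parent 1, parent 2, F1 hybrid).
calls = [classify_band(*levels) for levels in
         [(10, 40, 20), (10, 40, 38), (10, 12, 30), (10, 40, 4)]]
```

With a different equality tolerance the boundary between the additive and dominant classes shifts, so the proportions reported in the text should not be expected to follow from this particular threshold.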
  • RNA fragments correlated with the degree of heterosis.
  • Hybrid yield in bu/LCR is given on the X axis, while % of bands in each expression class is given on the Y axis (% of bands different: dotted line; % of additive bands: dashed line; and % of dominant bands: solid line).
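The relationship plotted here (percentage of bands in an expression class against hybrid yield) is the kind of association usually summarized with a correlation coefficient. As a minimal sketch, and assuming a Pearson coefficient is an acceptable summary (the source does not name the statistic used), it can be computed as follows. The numbers are invented placeholders, not data from the patent.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(var_x * var_y)

# Placeholder values only: % of bands different and hybrid yield per hybrid.
pct_bands_different = [2.0, 3.5, 5.0, 6.0]
hybrid_yield = [90.0, 120.0, 150.0, 160.0]
r = pearson_r(pct_bands_different, hybrid_yield)
```

A positive `r` corresponds to the positive correlation described in the text; with only a handful of hybrids, the coefficient alone says little about significance.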
  • Table 2 The number of genes exhibiting over-/under-dominant, additive or dominant expression patterns in heterotic and non-heterotic hybrids. Expression data derived from 1 replicate/sample. [Note: Discrepancies between Table 2 and Table 3 are likely due to the different number of samples used.]
  • EXAMPLE 2 PREDICTING HETEROSIS FROM ANALYSIS OF SHARED ADDITIVE BANDS; IDENTIFICATION OF GENES INVOLVED IN HETEROSIS Immature ear mRNA was profiled from 10 hybrids and their respective inbred parents.
  • the genotypes profiled included a number of commercial hybrids and a set from the "PAR 27 series," in which PAR 27 was used as a common female with a series of males that differed in percent relationship.
  • The F1 tends to have an expression level closer to the parent with higher expression. This is especially true with the RNA bands that are similar between an F1 hybrid and its male parent.
  • Hybrids derived from two inbreds that have optimal complementation to each other to give rise to a heterozygous condition for most of these regulatory elements had a maximal number of genes "re-activated" and were, therefore, heterotic.
  • Crosses of closely related inbreds or inbred lines that did not have such "optimal complementation" had fewer genes re-activated and produced low heterotic hybrids.
  • RNA profile data described in Example 1 are based on the expression patterns of F1 hybrids relative to their inbred parents, such as additive vs. non-additive classifications and the differences of these categories between heterotic and non-heterotic hybrids. While the results so far were informative, another way of analyzing this data set is by comparing the levels of RNA expression of poor hybrids with heterotic hybrids without any involvement of their parents. In comparing all 10 hybrids, which include 3 breeding crosses and 7 commercial hybrids, a list of bands that have similar expression level among heterotic hybrids but different from the non-heterotic hybrids
  • EXAMPLE 5 EXPRESSION PROFILING USING DIFFERENT TISSUES FROM HYBRIDS AND PARENTS
  • RNA profiling data from hybrid sets were obtained in maize. Five other sets utilized kernel tissue at 13 days after pollination
  • DAP seedling tissue
  • the 14 hybrid sets analyzed included seven from the PAR 27 series, which covers a spectrum of heterosis levels ranging from commercial hybrids to low heterotic hybrids of sibling crosses; four commercial hybrids from diversified genetic backgrounds other than PAR 27 series and three crosses between inbreds of the same heterotic group, typical of those that would be useful for breeding new inbreds.
  • [Table column headings: Band ID, followed by pairwise inbred comparisons (PAR vs. PAR) and PAR 27 hybrid crosses; most inbred numbers are not recoverable.]
  • RNA expression of poor hybrids and heterotic hybrids is compared without any involvement of their parents.
  • This approach examines whether the absolute level of expression of a subset of genes is important for heterosis, in addition to the additive vs. non-additive expression patterns we already found.
  • the F1 hybrids tend to have the same expression levels as the higher parent, i.e., showing overall an up-regulation of gene expression (Table 10).
  • 34 bands that have a similar expression level among heterotic hybrids but different from the non-heterotic hybrids were identified (Table 9; the last three columns are non-heterotic hybrids). For these 34 bands, the 3 poor hybrids show either higher or lower expression than PAR 19, whereas all other hybrids, which are heterotic, show little or no difference in expression relative to PAR 19.
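The band selection described above (similar expression across heterotic hybrids, altered expression in the poor hybrids, all judged against a common reference) could be sketched as a simple filter. The function name, the two-fold threshold for "different", the sample names, and the intensity values are all assumptions for illustration; the source does not give the criterion actually applied.

```python
def discriminating_bands(levels, reference, heterotic, non_heterotic, fold=2.0):
    """Return band IDs that separate heterotic from non-heterotic hybrids."""
    def differs(a, b):
        # Assumption: strictly positive levels that "differ" when at least
        # `fold` apart (higher or lower).
        low, high = sorted((a, b))
        return high >= fold * low

    selected = []
    for band, by_sample in levels.items():
        ref = by_sample[reference]
        # Every heterotic hybrid must be close to the reference level.
        same_in_heterotic = all(not differs(by_sample[h], ref) for h in heterotic)
        # Every poor hybrid must be higher or lower than the reference level.
        changed_in_poor = all(differs(by_sample[h], ref) for h in non_heterotic)
        if same_in_heterotic and changed_in_poor:
            selected.append(band)
    return selected

# Hypothetical band intensities per sample.
levels = {
    "band_1": {"ref": 100, "het_1": 110, "het_2": 95, "poor_1": 300},
    "band_2": {"ref": 100, "het_1": 105, "het_2": 250, "poor_1": 40},
    "band_3": {"ref": 50, "het_1": 50, "het_2": 60, "poor_1": 55},
}
hits = discriminating_bands(levels, "ref", ["het_1", "het_2"], ["poor_1"])
```

Only the first band passes in this toy data: the second varies among the heterotic hybrids, and the third does not change in the poor hybrid.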

Landscapes

  • Chemical & Material Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Organic Chemistry (AREA)
  • Engineering & Computer Science (AREA)
  • Analytical Chemistry (AREA)
  • Proteomics, Peptides & Aminoacids (AREA)
  • Health & Medical Sciences (AREA)
  • Zoology (AREA)
  • Wood Science & Technology (AREA)
  • Biotechnology (AREA)
  • Microbiology (AREA)
  • Biochemistry (AREA)
  • Biophysics (AREA)
  • Molecular Biology (AREA)
  • Physics & Mathematics (AREA)
  • Genetics & Genomics (AREA)
  • General Health & Medical Sciences (AREA)
  • Immunology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • General Engineering & Computer Science (AREA)
  • Botany (AREA)
  • Mycology (AREA)
  • Measuring Or Testing Involving Enzymes Or Micro-Organisms (AREA)
  • Breeding Of Plants And Reproduction By Means Of Culturing (AREA)
EP00904457A 1999-01-21 2000-01-19 Molecular profiling for heterosis selection Withdrawn EP1143787A2 (en)

Applications Claiming Priority (5)

Application Number Priority Date Filing Date Title
US11661799P 1999-01-21 1999-01-21
US116617P 1999-01-21
US16636899P 1999-11-17 1999-11-17
US166368P 1999-11-17
PCT/US2000/001422 WO2000042838A2 (en) 1999-01-21 2000-01-19 Molecular profiling for heterosis selection

Publications (1)

Publication Number Publication Date
EP1143787A2 (en) 2001-10-17

Family

ID=26814421

Family Applications (1)

Application Number Title Priority Date Filing Date
EP00904457A Withdrawn EP1143787A2 (en) 1999-01-21 2000-01-19 Molecular profiling for heterosis selection

Country Status (6)

Country Link
EP (1) EP1143787A2 (es)
AU (1) AU2621300A (es)
CA (1) CA2358509A1 (es)
HU (1) HUP0200319A3 (es)
MX (1) MXPA01007325A (es)
WO (1) WO2000042838A2 (es)

Families Citing this family (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2000070340A2 (en) * 1999-05-14 2000-11-23 Karolinska Innovations Ab Materials and methods relating to disease diagnosis
AU2002364153A1 (en) * 2001-12-11 2003-06-23 Lynx Therapeutics, Inc. Genetic analysis of gene expression in heterosis
JP2007519403A (ja) * 2004-02-03 2007-07-19 Hybrid Biosciences Pty Ltd Method for identifying genes that promote hybrid vigour and hybrid debility, and uses thereof
WO2007012138A1 (en) * 2005-07-29 2007-02-01 Hybrid Biosciences Pty Ltd Identification of genes and their products which promote hybrid vigour or hybrid debility and uses thereof
GB2436564A (en) * 2006-03-31 2007-10-03 Plant Bioscience Ltd Prediction of heterosis and other traits by transcriptome analysis
EP2005193A1 (en) * 2006-04-06 2008-12-24 Monsanto Technology, LLC Method of predicting a trait of interest
US20080083042A1 (en) * 2006-08-14 2008-04-03 David Butruille Maize polymorphisms and methods of genotyping
WO2009086500A1 (en) 2007-12-28 2009-07-09 Pioneer Hi-Bred International, Inc. Using structural variation to analyze genomic differences for the prediction of heterosis
US8865970B2 (en) 2008-10-06 2014-10-21 Yissum Research Development Company Of The Hebrew University Of Jerusalem Ltd. Induced heterosis related mutations
US9842252B2 (en) 2009-05-29 2017-12-12 Monsanto Technology Llc Systems and methods for use in characterizing agricultural products
GB201110888D0 (en) * 2011-06-28 2011-08-10 Vib Vzw Means and methods for the determination of prediction models associated with a phenotype

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO1984004758A1 (en) * 1983-05-26 1984-12-06 Plant Resources Inst Process for genetic mapping and cross-breeding thereon for plants

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
See references of WO0042838A3 *

Also Published As

Publication number Publication date
HUP0200319A3 (en) 2003-12-29
AU2621300A (en) 2000-08-07
MXPA01007325A (es) 2002-06-04
WO2000042838A2 (en) 2000-07-27
HUP0200319A2 (hu) 2002-05-29
CA2358509A1 (en) 2000-07-27
WO2000042838A3 (en) 2001-03-29

Similar Documents

Publication Publication Date Title
Yu et al. A whole‐genome SNP array (RICE 6 K) for genomic breeding in rice
US8039686B2 (en) QTL “mapping as-you-go”
AU2004303836C1 (en) High lysine maize compositions and methods for detection thereof
Barua et al. Identification of RAPD markers linked to a Rhynchosporium secalis resistance locus in barley using near-isogenic lines and bulked segregant analysis
Conaway-Bormans et al. Molecular markers linked to the blast resistance gene Pi-z in rice for use in marker-assisted selection
US20040250317A1 (en) Cotton event moni5985 and compositions and methods for detection thereof
EP2158336A2 (en) Methods for sequence-directed molecular breeding
EP1250461A2 (en) Methods and kits for identifying elite event gat-zm1 in biological samples
JP2006345855A (ja) アレイにおける遺伝子改変植物に特異的なヌクレオチド配列エレメントの同定および/または定量のための方法
EP1143787A2 (en) Molecular profiling for heterosis selection
US9617605B2 (en) Molecular markers associated with yellow flash in glyphosate tolerant soybeans
US20070048768A1 (en) Methods for screening for gene specific hybridization polymorphisms (GSHPs) and their use in genetic mapping and marker development
Lu et al. Genetic basis of maize kernel protein content revealed by high-density bin mapping using recombinant inbred lines
US5332408A (en) Methods and reagents for backcross breeding of plants
US20070192909A1 (en) Methods for screening for gene specific hybridization polymorphisms (GSHPs) and their use in genetic mapping ane marker development
Tirnaz et al. The importance of plant pan-genomes in breeding.
Chen et al. A genotyping platform assembled with high-throughput DNA extraction, codominant functional markers, and automated CE system to accelerate marker-assisted improvement of rice
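CN113604603A (zh) SNP marker linked to the rice bacterial blight resistance gene Xa7 and application thereof
KR100981042B1 (ko) Complementary recessive genes involved in hybrid breakdown in rice, and screening markers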
Wu et al. Applications of DNA marker techniques in plant mutation research.
AU2019312799A1 (en) Method for the quality control of seed lots
US20050250205A1 (en) Use of associations between at least one nucleic sequence polymorphism of the sh2 gene and at least one seed quality characteristic in plant selection methods
MXPA06006574A (es) High lysine maize compositions and methods for detection thereof

Legal Events

Date Code Title Description
PUAI Public reference made under article 153(3) epc to a published international application that has entered the european phase

Free format text: ORIGINAL CODE: 0009012

17P Request for examination filed

Effective date: 20010810

AK Designated contracting states

Kind code of ref document: A2

Designated state(s): AT BE CH CY DE DK ES FI FR GB GR IE IT LI LU MC NL PT SE

AX Request for extension of the european patent

Free format text: AL;LT;LV;MK;RO;SI

17Q First examination report despatched

Effective date: 20040216

STAA Information on the status of an ep patent application or granted ep patent

Free format text: STATUS: THE APPLICATION IS DEEMED TO BE WITHDRAWN

18D Application deemed to be withdrawn

Effective date: 20040629