WO2003057832A2 - Identification of genes controlling complex traits - Google Patents

Identification of genes controlling complex traits

Info

Publication number
WO2003057832A2
Authority
WO
WIPO (PCT)
Prior art keywords
genes
gene
individuals
phenotype
phenotypically
Prior art date
Application number
PCT/US2002/041381
Other languages
English (en)
Other versions
WO2003057832A3 (fr
Inventor
Benjamin A. Bowen
Christian D. Haudenschild
Edward S. Buckler, IV
Original Assignee
Lynx Therapeutics, Inc.
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Lynx Therapeutics, Inc. filed Critical Lynx Therapeutics, Inc.
Priority to AU2002364013A priority Critical patent/AU2002364013A1/en
Publication of WO2003057832A2 publication Critical patent/WO2003057832A2/fr
Publication of WO2003057832A3 publication Critical patent/WO2003057832A3/fr

Classifications

    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q: MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00: Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68: Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6809: Methods for determination or identification of nucleic acids involving differential detection

Definitions

  • the invention herein relates generally to methods and systems for identifying genes that control complex traits.
  • the methods include differential gene expression techniques, gene/phenotype linkage (QTL) techniques, and gene sequence comparison techniques to screen expressed genes that control specific phenotypes.
  • the invention also includes computer systems that facilitate data handling and logical operations used in practice of the methods.
  • the first type of genetic variations are those that qualitatively influence the function of a gene product.
  • one allelic variant of a gene can encode a protein that is more active than another protein encoded by a different allelic variant of the gene.
  • the second basic type of genetic variations are those that quantitatively influence the expression level of a gene product (e.g., an RNA or protein) encoded by a gene.
  • the expression difference can be the result, e.g., of a mutation that alters gene expression, e.g., at the level of RNA transcription, RNA splicing, RNA translation, protein stability, or the like.
  • the present invention focuses on the second of these two basic types of genetic variations, i.e., on identifying genes that display gene expression level polymorphisms that control phenotypes of interest.
  • a number of methods currently exist for correlating gene expression levels with displayed phenotypes, including, e.g., classical genetic mapping, positional cloning, differential gene expression analysis, etc. While many such methods for correlating genes to phenotypes exist, they are all labor intensive and are often imprecise. Difficulties with current methods for correlating genes to phenotypes become particularly apparent where the relevant phenotypes are controlled by multiple genes (multigenic phenotypes), and/or where the traits are the result of complex genetic interactions.
  • QTL quantitative trait locus
  • a method is presented to correlate genes, gene clusters and/or gene products to control of a phenotype. At least two individuals which differ by one or more quantifiable phenotypes are identified. A set of genes which are differentially expressed between the two individuals is determined. The set of differentially expressed genes is mapped to the genome of the individuals. Quantitative trait loci borders established for statistical linkage to the phenotype of interest are also mapped onto the genome. Differentially expressed phenotypically correlated genes falling within the borders of the quantitative trait loci are then identified.
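The correlation analysis described above can be sketched in a few lines of Python. This is only an illustrative sketch: the names `MappedGene`, `QtlInterval` and `phenotypically_correlated`, the base-pair coordinate model, and the example data are all assumptions and do not come from the patent text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MappedGene:
    """A differentially expressed gene placed on the genome (hypothetical model)."""
    gene_id: str
    chromosome: str
    position: int  # base-pair coordinate on the chromosome


@dataclass(frozen=True)
class QtlInterval:
    """Borders of a QTL statistically linked to the phenotype (hypothetical model)."""
    chromosome: str
    start: int
    end: int


def phenotypically_correlated(genes, qtls):
    """Return IDs of differentially expressed genes that fall within any QTL border."""
    selected = []
    for gene in genes:
        inside = any(
            q.chromosome == gene.chromosome and q.start <= gene.position <= q.end
            for q in qtls
        )
        if inside:
            selected.append(gene.gene_id)
    return selected


# Made-up example data: three differentially expressed genes, two QTL.
de_genes = [
    MappedGene("geneA", "chr1", 1_200_000),
    MappedGene("geneB", "chr1", 9_500_000),
    MappedGene("geneC", "chr4", 300_000),
]
qtls = [
    QtlInterval("chr1", 1_000_000, 2_000_000),
    QtlInterval("chr4", 250_000, 400_000),
]
correlated = phenotypically_correlated(de_genes, qtls)
```

Here `geneB` is differentially expressed but lies outside every QTL border, so it is not retained in the phenotypically correlated set.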
  • Phenotypically correlated genes identified by correlation analysis can be selected for further screening or functional studies. Determination of phenotypically correlated genes which are phenotype controlling genes can be further elucidated by application of a contingency test, by repeating the correlation analysis to develop more informative statistical correlations, or by comparing the set of phenotype controlling genes to other sets of phenotype controlling genes selected for correlation to a different, but related, phenotype.
  • the sensitivity, resolution and specificity of the correlation analysis can be modified by optimizing certain parameters.
  • a level of confidence determined by applying a selected p value, LOD score or similar confidence value during correlation analysis can be used to select individuals with optimum differences in displayed phenotype, to provide optimum differences in genetic differential expression and/or to choose optimum QTL borders.
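One way a selected LOD score could be used to choose QTL borders is sketched below. The approach (merging adjacent marker intervals whose LOD score meets a threshold) is an assumption made for illustration, as are the function name `qtl_borders`, the tuple layout, and the example scan values; the patent text does not specify how borders are derived.

```python
def qtl_borders(intervals, lod_threshold):
    """Merge adjacent marker intervals whose LOD score meets the threshold.

    intervals: list of (chromosome, start, end, lod), sorted by chromosome
    and start, where adjacent intervals share a boundary coordinate.
    Returns a list of (chromosome, start, end) QTL regions.
    """
    borders = []
    current = None
    for chrom, start, end, lod in intervals:
        if lod >= lod_threshold:
            if current and current[0] == chrom and current[2] == start:
                # Extend the open region across the adjacent interval.
                current = (chrom, current[1], end)
            else:
                if current:
                    borders.append(current)
                current = (chrom, start, end)
        elif current:
            # Below threshold: close any open region.
            borders.append(current)
            current = None
    if current:
        borders.append(current)
    return borders


# Made-up genome scan: (chromosome, start, end, LOD score).
scan = [
    ("chr1", 0, 10, 1.2),
    ("chr1", 10, 20, 3.4),
    ("chr1", 20, 30, 4.1),
    ("chr1", 30, 40, 2.0),
    ("chr2", 0, 10, 3.5),
]
relaxed = qtl_borders(scan, lod_threshold=3.0)
strict = qtl_borders(scan, lod_threshold=4.0)
```

Raising the threshold narrows the borders (higher specificity, fewer candidate genes), while lowering it widens them (higher sensitivity), which is the kind of trade-off the confidence parameter controls.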
  • the percentage of phenotype controlling genes in a phenotypically correlated gene set is increased where the individuals chosen are from both tails of a phenotypic distribution.
  • a population of individuals can be evaluated to select individuals that display the phenotype at a level beyond a statistical threshold from the population mean. It is preferred to have these statistical adjustments available in computer programs for correlation operations and analysis.
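Selecting individuals from both tails of a phenotypic distribution can be illustrated as follows. This is a minimal sketch that assumes the statistical threshold is expressed as a number of sample standard deviations from the population mean; the function name `select_tails`, the `z_cutoff` parameter, and the root-length values are hypothetical.

```python
from statistics import mean, stdev


def select_tails(phenotypes, z_cutoff=1.5):
    """Return (low_tail, high_tail) lists of individuals beyond the cutoff.

    phenotypes: dict mapping individual name -> measured phenotype value.
    Assumes at least two individuals with non-identical values.
    """
    mu = mean(phenotypes.values())
    sd = stdev(phenotypes.values())
    low = sorted(n for n, v in phenotypes.items() if (v - mu) / sd <= -z_cutoff)
    high = sorted(n for n, v in phenotypes.items() if (v - mu) / sd >= z_cutoff)
    return low, high


# Made-up root lengths for eight recombinant inbred lines.
root_length = {
    "RIL1": 10, "RIL2": 11, "RIL3": 9, "RIL4": 10,
    "RIL5": 20, "RIL6": 1, "RIL7": 10, "RIL8": 10,
}
low_tail, high_tail = select_tails(root_length, z_cutoff=1.5)
```

The two returned groups are the candidates for differential expression comparison; a stricter cutoff yields fewer but more extreme individuals.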
  • Preferred populations of individuals for correlation analysis come from recombinant inbred lines (RILs). Correlation analysis is also generally more effective when individuals selected for study are members of a phenotypically characterized or genetically sequenced species such as Arabidopsis thaliana, Oryza sativa, Mus musculus, Homo sapiens, Fugu rubripes, Drosophila melanogaster, Caenorhabditis elegans, or Saccharomyces cerevisiae. The genomes of many additional species are being sequenced and the information from these species can be used in the practice of the present invention as well.
  • RILs recombinant inbred lines
  • Mapping of gene markers and gene sequencing are practiced on a wide variety of agriculturally and commercially important species.
  • the correlation and contingency testing methods of the invention are broadly applicable and can be applied to individuals in essentially any species of any genera including, e.g., Arabidopsis, Brassica, Zea, Oryza, Triticum, Hordeum, Lolium, Sorghum, Glycine, Medicago, Helianthus, Lactuca, Beta, Vitis, Solanum, Lycopersicon, Capsicum, Gossypium, Hevea, Linum, Prunus, Citrus, Populus, Pinus, Quercus, Mus, Rattus, Sus, Ovis, Equus, Bos, Canis, Homo, Gallus, Fugu, Danio, Tilapia, Oncorhynchus, Crassostrea, Drosophila, Caenorhabditis, Aspergillus, Neurospora, Candida and Saccharomyces.
  • Massively parallel signature sequence (MPSS) data provide gene expression data for determination of differential expression between sets of genes.
  • MPSS data provide quantitative expression data associated with a gene signature sequence.
  • the signature sequence provides an identification tag that aids in comparison between individuals and in mapping genes in a genome.
  • certain low levels of background "noise" can be rejected by comparing only genes expressed in tissue of the individuals at an expression level assigned by a minimal abundance test.
  • the minimum abundance for MPSS analysis is often 0.0001% of total mRNA or less.
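A minimal abundance test of this kind might look like the sketch below. It assumes that a signature is kept when it reaches the minimum fraction of total signatures in at least one of the two individuals; that rule, the function name `filter_by_abundance`, and the toy counts are assumptions for illustration only.

```python
MIN_FRACTION = 0.0001 / 100  # 0.0001% of all signatures, expressed as a fraction


def filter_by_abundance(counts_a, counts_b, min_fraction=MIN_FRACTION):
    """Return the signatures abundant enough in at least one individual.

    counts_a, counts_b: dicts mapping signature -> observed count.
    Assumes both data sets are non-empty.
    """
    total_a = sum(counts_a.values())
    total_b = sum(counts_b.values())
    kept = set()
    for sig in set(counts_a) | set(counts_b):
        frac_a = counts_a.get(sig, 0) / total_a
        frac_b = counts_b.get(sig, 0) / total_b
        if max(frac_a, frac_b) >= min_fraction:
            kept.add(sig)
    return kept


# Toy data sets (1000 signatures each); a coarse 0.5% cutoff is used so the
# effect is visible at this small scale.
counts_a = {"S1": 500, "S2": 1, "S3": 499}
counts_b = {"S1": 300, "S3": 699, "S4": 1}
kept = filter_by_abundance(counts_a, counts_b, min_fraction=0.005)
```

Signatures `S2` and `S4` are seen only once per thousand and are rejected as background before any differential comparison is made.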
  • Correlation analysis can be practiced with RIL individuals selected from both tails of a phenotypic distribution, using MPSS data for differential expression comparison.
  • Independently correlated genes can be selected by repeating correlation analysis on different individuals followed by comparison of the phenotypically correlated gene sets. Individuals which differ by a quantifiable phenotype are identified for correlation analysis. A set of genes which are differentially expressed in the individuals is mapped onto a genome. Relevant quantitative trait loci borders are also mapped onto the genome and differentially expressed genes which fall within the borders are identified to provide a first "phenotypically correlated" gene set. The process above is repeated for new individuals to provide a second phenotypically correlated gene set. Independently correlated genes are selected which are common to the sets of phenotypically correlated genes.
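The final selection step described above amounts to a set intersection. A short sketch, with a hypothetical function name and made-up gene identifiers:

```python
def independently_correlated(*correlated_sets):
    """Return genes common to every phenotypically correlated gene set."""
    if not correlated_sets:
        return set()
    return set.intersection(*(set(s) for s in correlated_sets))


# Made-up results of two correlation analyses on different individuals.
first_set = {"geneA", "geneC", "geneF"}
second_set = {"geneC", "geneF", "geneK"}
independent = independently_correlated(first_set, second_set)
```

Because the function accepts any number of sets, the same call covers a third or fourth repeat of the analysis.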
  • the comparison step can be used to discard differentially expressed genes, e.g., those associated with other phenotypes.
  • the different phenotypes can be subtypes of the general target trait so the independent correlation analyses will provide genes more likely to control the target trait.
  • the different phenotypes can be similar traits scored in different organs or under different environmental conditions or experimental treatments.
  • a method is presented for ranking phenotypically correlated genes according to how often they show up in correlation analyses between a variety of individuals or different phenotypic groups. Two groups of individuals are identified which differ by a quantifiable phenotype of interest. Correlation analyses are carried out in a pairwise manner between each individual of the first group and each individual of the second group. Phenotypically correlated genes from the several correlation analyses are selected and compiled into a complete set of phenotypically correlated genes from all pairwise correlation analysis comparisons. Phenotypically correlated genes are rank-ordered according to the frequency with which each phenotypically correlated gene is selected in all pairwise comparisons. Phenotypically correlated genes selected from the complete set of phenotypically correlated genes that are high in rank order are more likely to be controlling genes for the phenotype of interest.
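The pairwise ranking procedure can be sketched as follows. The `correlate` callback stands in for a full correlation analysis of one pair of individuals; it, the function name `rank_correlated_genes`, and the lookup table of results are hypothetical.

```python
from collections import Counter
from itertools import product


def rank_correlated_genes(group1, group2, correlate):
    """Rank genes by how often they are selected across all pairwise analyses.

    correlate(a, b) must return the phenotypically correlated gene IDs for the
    pair (a, b). Returns (gene, count) pairs, most frequent first; ties are
    broken alphabetically.
    """
    counts = Counter()
    for a, b in product(group1, group2):
        counts.update(set(correlate(a, b)))  # count each gene once per pair
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


# Made-up results for a 2 x 2 cross of high- and low-phenotype individuals.
pair_results = {
    ("hi1", "lo1"): {"geneA", "geneB"},
    ("hi1", "lo2"): {"geneA"},
    ("hi2", "lo1"): {"geneA", "geneC"},
    ("hi2", "lo2"): {"geneA", "geneB"},
}
ranking = rank_correlated_genes(
    ["hi1", "hi2"], ["lo1", "lo2"], lambda a, b: pair_results[(a, b)]
)
```

In this toy cross `geneA` is selected in all four comparisons and therefore ranks first as the most likely phenotype controlling candidate.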
  • a method for genotype contingency testing is presented which selects for genes more likely to be phenotype controlling.
  • the genotype of a particular gene is compared between two individuals.
  • the expression level of the gene is also compared between the individuals.
  • Genes are selected as contingency tested genes if the genotype is different between the individuals and the level of expression is significantly different between the individuals, or if the genotype is not different between the individuals and the level of expression is not significantly different between the individuals.
  • genes are rejected as contingency tested genes if the genotype is different between the individuals and the level of expression is not significantly different between the individuals, or if the genotype is not different between the individuals and the level of expression is significantly (as established by statistical evaluation or selected confidence levels) different between the individuals.
  • Comparison of genotypes of the gene between the individuals can be accomplished using any appropriate gene sequence or linkage data known in the art including identical by descent
  • genotype contingency testing can be repeated between all possible pairwise comparisons of individuals. Genes that are selected in one pairwise test, but rejected in another can be eliminated as phenotype controlling genes. Only those genes that are selected in all genotype contingency tests that are performed are retained at the end of the process.
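The selection and rejection rules of genotype contingency testing reduce to checking that "genotype differs" and "expression differs" agree for every pair of individuals. The sketch below assumes genotypes are given as parental allele labels and uses a simple fold-change cutoff as a stand-in for the statistical test of expression difference; those choices, the function names, and the data are all hypothetical.

```python
from itertools import combinations


def contingency_selected(genotype_differs, expression_differs):
    """Select a gene when genotype and expression either both differ or both agree."""
    return genotype_differs == expression_differs


def contingency_test_all_pairs(genes, individuals, genotype, expr_differs):
    """Retain only genes selected in every pairwise contingency test.

    genotype[individual][gene] -> allele label (e.g. parental origin).
    expr_differs(gene, a, b)   -> True if expression differs significantly.
    """
    retained = []
    for gene in genes:
        if all(
            contingency_selected(
                genotype[a][gene] != genotype[b][gene], expr_differs(gene, a, b)
            )
            for a, b in combinations(individuals, 2)
        ):
            retained.append(gene)
    return retained


# Made-up genotypes (parent of origin) and expression levels.
genotype = {
    "i1": {"g1": "P1", "g2": "P1"},
    "i2": {"g1": "P2", "g2": "P1"},
    "i3": {"g1": "P1", "g2": "P2"},
}
expression = {
    "i1": {"g1": 100, "g2": 50},
    "i2": {"g1": 10, "g2": 52},
    "i3": {"g1": 105, "g2": 49},
}


def fold_change_differs(gene, a, b, min_fold=2.0):
    """Assumed stand-in for a significance test: fold change with a pseudocount."""
    x, y = expression[a][gene], expression[b][gene]
    return (max(x, y) + 1) / (min(x, y) + 1) >= min_fold


retained = contingency_test_all_pairs(
    ["g1", "g2"], ["i1", "i2", "i3"], genotype, fold_change_differs
)
```

Gene `g2` is rejected because individuals `i1` and `i3` carry different genotypes yet express it at nearly the same level, so only `g1` survives all three pairwise tests.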
  • phenotypically correlated genes selected by correlation analyses can be further screened by contingency testing.
  • differentially expressed genes contingency tested for phenotype control can be subjected to independent correlation analyses.
  • Correlation analysis results for more than one phenotype can be compared to discover phenotypically correlated genes correlated to a more general target trait.
  • Multiple correlation analyses for the same phenotype of interest using individuals from a variety of different populations can be compared to filter out differentially expressed genes not associated with the phenotype of interest.
  • a computer system is a valuable tool in the practice of this invention.
  • a correlation analysis computer program can include a computer readable digital instruction set.
  • the instruction set can call for input of differential expression gene sequence data for individuals, input of genome sequence data of the individuals and input of relevant quantitative trait loci border data.
  • the instruction set can manipulate the data to place the genes identified as differentially expressed onto a sequence map of the genome, e.g., if working with a sequenced organism.
  • the instruction set can place the borders of the quantitative trait loci data on to the genome sequence data.
  • the instruction set can direct the comparison of mapped differentially expressed gene sequences to mapped quantitative trait loci borders and direct selection of a set of phenotypically correlated genes common to both maps.
  • the instruction set can call for or direct storage of the set of phenotypically correlated genes to a database for later retrieval.
  • MPSS derived differential expression data are well suited for input into correlation analysis computer programs because this massively parallel method provides a preformed database with sequence data already associated with expression level data.
  • MPSS databases are often too large for practical manual correlation analysis, while they are readily processed for correlation analysis by modern computer systems.
  • a computer program to derive independently correlated genes can include a computer readable digital instruction set calling for repeat of the correlation analysis instruction set to select a second set of phenotypically correlated genes.
  • the instruction set can call for input of the previously stored set of phenotypically correlated genes and direct comparison to the second set.
  • Phenotypically correlated genes common to both sets can be selected and stored to an independently correlated gene database for later retrieval.
  • a computer program to create a database of phenotypic correlations can include a computer readable digital instruction set calling for input of two groups of individuals.
  • the ranking instruction set can direct repeated operation of a correlation analysis subroutine with systematic input and pairwise comparison of differential gene sequence expression data between each individual of each group and each individual of each other group.
  • the ranking instruction set can direct storage of each set of phenotypically correlated genes into a database of phenotypic correlations.
  • the instruction set can direct counting of the frequency with which each phenotypically correlated gene is represented in the database of phenotypic correlations. Finally, the instruction set can direct rank-ordering according to the count.
  • a contingency testing computer program can include a computer readable digital instruction set calling for input of differential expression gene sequence data for two individuals and input of genotype data for the two individuals.
  • the instruction set can direct comparison of genotypes of the individuals.
  • the instruction set can direct selection of genes with different genotypes and significantly different expression, and selection of genes with genotypes that are not different and that do not have significantly different expression.
  • the instruction set can direct storage of selected genes in a contingency tested gene database for later retrieval.
  • "mapping" is determining the genomic position or locus of a nucleic acid sequence, genotype or gene sequence, e.g., within a genome, on a chromosome, or the like.
  • a "phenotype" is the display of one or more traits in an individual organism resulting from the interaction of gene expression and the environment.
  • a "quantifiable phenotype" is displayed in an objectively measurable form (e.g., height, growth rate under selected conditions, etc.). Gene expression itself, e.g., quantity of transcription and/or translation, can be considered a quantifiable phenotype.
  • a "phenotypically correlated gene" is a gene provided by correlation analysis.
  • a "phenotype controlling gene" is a gene of which expression affects display of a particular phenotype.
  • a "quantitative trait locus" is a region of a genome statistically associated with a contribution to the variance of a particular phenotype. Relevant quantitative trait loci in a correlation analysis are statistically associated to the quantifiable phenotype subject to the correlation analysis.
  • a "QTL and genotype dependent gene" is a gene sequence that has been selected by both correlation and contingency test methods.
  • Figure 1 shows two Arabidopsis QTL plots for three growth related traits (root length, aerial mass, and root mass). The LOD score for association of each marker interval in the genome with each phenotype is shown.
  • Figure 2 is a table comparing differential expression data to QTL borders for screening of phenotypically correlated genes.
  • Figure 3 shows a block flow diagram of computer operations to practice automated correlation analysis.
  • Figure 4 shows a block flow diagram of computer operations to practice automated contingency testing.
  • Figure 5 is an overlay chart of mapped differential expression data and mapped
  • Figure 6 is a table comparing MPSS differential expression data to QTL borders to identify phenotypically correlated genes for root growth in ammonium sulfate.
  • Figure 7 shows the narrowing of phenotype controlling gene candidates for root growth response in fertilizer by independent correlation analysis.
  • Figure 8 shows a multiple correlation analysis cross of two groups of three individuals to generate a database of phenotypic correlations for rank-ordered phenotypically correlated genes.
  • This invention comprises methods to discover phenotype controlling gene candidates, and related systems for practicing the methods.
  • Correlation analysis can be used to select phenotypically correlated gene sequences enriched for phenotype controlling genes, as those genes that are differentially expressed in phenotypically different individuals and which map within the borders of relevant quantitative trait loci (QTL).
  • Multiple correlation analyses and rank-ordering methods can further enrich phenotypically correlated genes for phenotype controlling genes.
  • Genotype contingency testing can select phenotype controlling gene candidates as those gene sequences that are the same between individuals with the same phenotype, or as those gene sequences that are different between individuals with a different phenotype.
  • Genotype contingency testing eliminates phenotypically correlated gene sequences whose variation in gene expression is controlled by segregation of factors that do not map within the QTL. Practice of the methods is greatly facilitated through the use of digital systems.
  • the methods above are complementary, e.g., more than one of the methods can be used in conjunction with one another.
  • the combination of genotype contingency testing with correlation analysis, multiple correlation analysis, and/or rank-ordering can provide highly enriched pools of phenotype controlling gene candidates.
  • integrated computer systems comprising databases and operational instruction sets provide an efficient means to practice correlation analysis, multiple correlation analysis, rank-ordering and genotype contingency testing.
  • Gene expression data, differential expression data, gene sequence data and quantitative phenotype data can be maintained in computer databases.
  • Computers can access the data for operations such as statistical evaluations, mathematical operations, comparisons, mapping and selection according to designated parameters to provide phenotype controlling gene candidates.
  • Display of a phenotype is the result of an interplay between the genetic makeup of an organism and environmental influences. Many phenotypes of interest, such as plant growth, or heart disease in humans, are controlled by the interplay of complex environmental influences with complex genetic systems.
  • One approach to studying phenotype control is to hold environmental influences constant so that the genetic contribution to differences between individuals can be evaluated in isolation.
  • In the simplest cases, a single gene controls a phenotype of interest, and QTL analysis can provide a narrow range of candidate genes near the single control gene for study.
  • In more complex cases, a phenotype of interest is controlled by the sequential action of a multigene family or even the interplay of several multigene systems dispersed throughout the genome.
  • the complementary methods of the present invention are well suited to screen for phenotype controlling genes in such complex genetic situations.
  • Phenotype controlling genes also present themselves where individuals with similar genetic makeup, e.g., RILs, are exposed to different environments. Differential gene expression can be observed for genes that respond to stresses or opportunities presented by the environment. For example, by exposing similar individuals to environments plus and minus a nutrient, differential expression can be observed in genes controlling, e.g., growth or fertilizer response phenotypes.
  • the methods of the invention are capable of identifying phenotype controlling genes associated with such environmental influences.
  • Nucleic acid analysis methods can be used in this invention to quantitate and sequence genes and gene products of interest.
  • An introduction to nucleic acid analysis methods is found in available standard texts, including Berger and Kimmel, Guide to Molecular Cloning Techniques, Methods in Enzymology volume 152, Academic Press, Inc., San Diego, CA ("Berger"); Sambrook et al., Molecular Cloning - A Laboratory Manual (2nd Ed.), Vol. 1-3, Cold Spring Harbor Laboratory, Cold Spring Harbor, New York, 1989 ("Sambrook"); and Current Protocols in Molecular Biology, F.M. Ausubel et al., eds., Current Protocols, a joint venture between Greene Publishing Associates, Inc.
  • Methods to evaluate differential transcription of mRNA include subtractive binding methods, Southern blotting analysis, quantitative dot blotting, and nucleic acid arrays. Other methods to determine qualitative and quantitative gene expression include SAGE data, microarrays and cDNA sequencing. See, e.g., Okubo et al., (1992), Large scale cDNA sequencing for analysis of quantitative and qualitative aspects of gene expression, Nature Genetics, 2:173-179; Bachem et al., (1996), Visualization of differential gene expression using a novel method of RNA fingerprinting based on AFLP: analysis of gene expression during potato tuber development, Plant J., 9:745-753; and Shimkets et al., (1999), Gene expression analysis by transcript profiling coupled to database query, Nature Biotechnology, 17:798-803.
  • Methods to quantitate RNA are also well known in the art. These methods include electrophoretic methods on substrates such as agarose, and nucleic acid hybridization methods, such as Northern blotting. Quantitation of mRNA by these methods is complicated by poor sensitivity, narrow quantitative range, a large background of transfer RNA (tRNA) and degradation of RNA by ubiquitous RNase enzymes.
  • Another method, well known in the art, to indirectly evaluate the quantity of mRNA, and to determine the sequences of mRNA is through the preparation of a cDNA library.
  • a cDNA library is prepared by converting messenger ribonucleic acid (mRNA) sequences back to deoxyribonucleic acid (DNA) sequences using the enzymatic action of reverse transcriptase.
  • the resultant cDNA library is more stable than the original mRNA and avoids the issue of tRNA background.
  • the relative quantitative sequence information of the original mRNA population is generally retained in a cDNA library.
  • cDNA array technologies including subtractive cloning, and massively parallel signature sequencing (MPSS).
  • MPSS massively parallel signature sequencing
  • Recombinant inbred lines are well characterized highly homozygotic lines of organisms well suited to screening for phenotype controlling genes. Generally, they are derived by multiple generations of self or sib-matings from a cross of two parental lines, so that there are only two alleles segregating at each polymorphic locus in a RIL population.
  • RILs can be derived, for example, from an F2, an F1BC1, an F1BC2 or from randomly inter-mated populations.
  • RILs can contain multiple regions derived by recombination that are identical-by-descent (IBD) with both parents.
  • RILs that contain only a small fraction (typically 10-20%, on average) of the genome from one parent introgressed into the background of the other parent are frequently referred to as congenic lines, especially in mouse genetics.
  • a gene that controls a quantifiable phenotype is often expressed at different levels in individuals with different phenotypes.
  • the differential gene expression levels are often accentuated between individuals selected from the tails (extreme ends) of a phenotypic distribution. It is an aspect of the invention to select individuals at opposite ends of the population distribution for display of a quantifiable phenotype at a level of confidence determined by a selected p value, LOD score or other statistical measure of confidence.
  • the wide variety of RILs available provides a population of lines displaying phenotypes available for study by the methods of the invention.
  • the predominantly homozygous genetic complement of RILs makes them excellent subjects for differential expression analyses. Where a gene controlling a phenotype is homozygous, the expression level can be higher to provide easier detection in differential expression analyses. Also, the homozygous condition reduces confusion and background noise resulting from companion alleles to a gene under study. The signal to noise ratio is generally increased in differential analyses of the invention when the test subjects are RILs. Homozygosity, genotype consistency, phenotype consistency and availability of a variety of phenotypes make RILs a desirable choice for screening of phenotype controlling genes.
  • RILs are generally created by inbreeding individuals of a species (having a similar phenotype) for at least 8 generations. The RILs are homozygotic for most genes and breed true. Many different RILs can be bred from a species of organism to display a variety of quantifiable phenotypes. RILs derived from Arabidopsis thaliana are an example model subject for correlation analysis because of the simplicity of the plant, its small sequenced genome, and the availability of a wide variety of phenotypes. However, the same principles can be applied to a variety of organisms.
  • Massively parallel signature sequencing is a wide-ranging and sensitive quantitative cDNA analysis tool for preparation of expression profiles. See Brenner et al., "In vitro cloning of complex mixtures of DNA on microbeads: Physical separation of differentially expressed cDNAs", (2000) PNAS 97, 1665-1670.
  • MPSS cDNA is prepared from poly(A) RNA (mRNA) using a biotin-labeled oligo-dT primer. The oligo-dT is designed to prime each mRNA molecule exactly at the poly(A) junction.
  • the cDNA fragments are then digested with DpnII (recognition sequence GATC), and the 3'-most DpnII-poly(A) fragments are purified utilizing the biotin label at the end of each molecule.
  • the fragments are subsequently bound to 5 micron diameter microbeads using a complex set of 32 base tag/antitags. This process yields a library of beads where one mRNA molecule is represented by one microbead, and each microbead contains approximately 100,000 identical cDNA fragments from that mRNA. All molecules are covalently attached to the microbeads at their poly(A) ends; therefore, the DpnII end is available for sequencing reactions.
  • the sequencing process is initiated by ligation of an adaptor molecule and digestion, e.g., with a type IIs restriction enzyme. Approximately 1 million microbeads are then loaded onto a specially designed flow-cell in a way that allows them to stack together along channels to form a tightly packed monolayer in the flow-cell.
  • the flow cell is connected to a computer-controlled microfluidics network that delivers different reagents for the sequencing reactions.
  • a high-resolution CCD camera is positioned directly over the flow-cell in order to capture fluorescent images from the microbeads at specific stages of the sequencing reactions. Analysis of the fluorescent images from the microbeads provides massively parallel collection of quantitative and sequence data sets.
  • the DNA is sequenced through an automated series of adaptor ligations and enzymatic steps.
  • the process is initiated by ligating an adaptor molecule to the GATC (DpnII) single-stranded overhangs, and then digesting the samples with BbvI, which is a type IIs restriction enzyme that cuts DNA at a position 9-13 nucleotides away from the recognition sequence.
  • Another set of adaptors, called encoded adaptors, is hybridized and ligated to the 4 base overhangs on each molecule.
  • the encoded adaptors contain a 4 base overhang with all possible nucleotide combinations at one end, and a single stranded coded sequence at the other end.
  • One member of the encoded adaptor set will find a partner on the DNA molecules attached to the beads in the flow cell.
  • the exact sequence of each encoded adaptor that hybridizes to the DNA on a microbead is decoded through 16 different sequential hybridization reactions with a set of fluorescent decoder probes. This process yields the first 4 nucleotides at the end of each molecule.
  • the encoded adaptor from the first round is removed by digestion with BbvI, and the process is repeated several times. In the end, a signature sequence of 17 bases or more is generated for each bead in the flow cell.
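For a known transcript, the expected signature can be predicted in silico, which is useful when matching observed signatures back to genes. The sketch below assumes that the signature starts with the GATC of the 3'-most DpnII site and is read toward the poly(A) end, and that the input is the sense-strand cDNA without its poly(A) tail; the function name `mpss_signature` and the example sequence are hypothetical.

```python
def mpss_signature(cdna, length=17, site="GATC"):
    """Return the predicted signature at the 3'-most DpnII site, or None.

    None is returned when the sequence has no DpnII site or when fewer than
    `length` bases remain between the site and the 3' end.
    """
    seq = cdna.upper()
    idx = seq.rfind(site)  # position of the 3'-most recognition site
    if idx == -1 or idx + length > len(seq):
        return None
    return seq[idx:idx + length]


# Made-up transcript containing two DpnII sites; only the 3'-most one is used.
cdna = "ATG" + "GATC" + "CCTTT" + "GATC" + "AAACCCGGGTTTAAACC"
signature = mpss_signature(cdna)
```

A lookup table from predicted signature to gene, built this way over an annotated genome, is one plausible route for placing observed signatures on a sequence map.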
  • Each signature sequence in the MPSS data set is analyzed, compared electronically, and all identical signatures are counted.
  • the level of expression of any single gene is calculated by dividing the number of signatures from the gene by the total number of signatures for all mRNAs present in the data set. Reliable sequence and quantitative data can often be acquired with MPSS analysis where a particular mRNA is present at a level of
  • MPSS quantitative data sets for signature sequences can be compared to evaluate gene expression differences between sequences in individuals. Replicate MPSS expression analyses of a single biological sample generally yield only 1% of signatures that are quantitatively different (p < 0.001). This low intra-assay variability allows highly reliable comparisons between individuals with different gene expression levels. Differential gene expression between individuals at opposite extremes (strong and weak) of phenotypic display often differ quantitatively in about 15% of signature sequences.
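The expression-level calculation (signatures for a gene divided by all signatures in the data set) and a comparison between two data sets can be sketched as below. The text does not name the statistical test behind its p values, so the two-proportion z-test used here is purely an assumed stand-in; the function names and counts are likewise hypothetical.

```python
import math
from collections import Counter


def expression_levels(signatures):
    """Return each signature's share of all signatures in the data set."""
    counts = Counter(signatures)
    total = sum(counts.values())
    return {sig: n / total for sig, n in counts.items()}


def two_proportion_p(count_a, total_a, count_b, total_b):
    """Two-sided p value for a difference in signature frequency (assumed test)."""
    pooled = (count_a + count_b) / (total_a + total_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / total_a + 1 / total_b))
    if se == 0:
        return 1.0
    z = (count_a / total_a - count_b / total_b) / se
    return math.erfc(abs(z) / math.sqrt(2))


# Toy data set: four sequenced beads, three of which share a signature.
levels = expression_levels(["S1", "S1", "S1", "S2"])

# Made-up counts for one signature in two individuals, one million beads each.
p_same = two_proportion_p(50, 1000, 50, 1000)
p_diff = two_proportion_p(100, 1_000_000, 300, 1_000_000)
```

A signature would be called differentially expressed when its p value falls below the selected threshold, e.g., 0.001.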
  • the difference in display of the phenotype between the individuals in the invention can be above a statistical threshold, e.g., p < 0.01, p < 0.001, or p < 0.0001, or the like.
  • Contingency testing selects for phenotype controlling genes using gene expression level variation and co-segregation analysis.
  • the phenotype difference and genomic position data used in correlation analyses are independent of and complementary to the co-segregation data of contingency testing.
  • the combination of correlation analyses with contingency testing in the invention provides a common candidate pool highly enriched for phenotype controlling genes.
  • SAGE is a transcript counting technique that generates a tag sequence for each mRNA.
  • SAGE is based on the principles that a short sequence tag derived from a defined position from a mRNA can uniquely identify the transcript and concatenation of the tags allows for high-throughput sequencing.
  • the length of the SAGE tag is about 10 to about 14 nucleotides.
  • the tag sequence is determined using conventional sequencing technologies. See the following publications and references cited within: Velculescu et al, (1995), Serial analysis of gene expression, Science. 270:484-487; and Zhang et al, (1997), Gene expression profiles in normal and cancer cells, Science. 276: 1268-1272.
  • the frequency of a sequence tag derived from the corresponding mRNA transcript is measured.
  • adjustments to consider bias and normalization are optionally included in the present invention. See, e.g., Margulies et al., (2001) Identification and prevention of a GC content bias in SAGE libraries, Nucleic Acids Res., 29(12):E60-0.
  • Microarrays are also technologies that can be applied in the context of the present invention.
  • a microarray is a solid support that contains a variety of nucleic acids fixed to the support in a specified arrangement. mRNAs or cDNAs from a sample are allowed to hybridize to the microarray.
  • Microarrays have the advantage of high throughput analysis of multiple samples.
  • some or all of a variety of variables can be optimized. For example, genes of interest should have corresponding nucleic acids on a given array.
  • detection sensitivity is typically optimized to achieve detection of genes expressed at low levels in the sample under investigation.
  • the array can be designed to detect multiple regions for each gene of interest, because multiple signals can then be detected for, e.g., distinct probe regions within the gene. See also, Kerr and Churchill, (2001), Statistical design and the analysis of gene expression microarray data, Biostatistics 2:183-201; Wodicka et al., (1997), Genome wide expression monitoring in Saccharomyces cerevisiae, Nature Biotech., 15:1359-1367; Lockhart et al., (1996), Expression monitoring by hybridization to high-density oligonucleotide arrays, Nature Biotech., 14:1675-1680; Aach et al., Systematic management and analysis of yeast gene expression data, Genome Res., 10:431-445; and Wittes and Friedman, (1999), Searching for evidence of altered gene expression: a comment on statistical analysis of microarray data, J. Natl. Cancer Inst., 91:400-401.
  • More information regarding microarrays can be found in the following publications and references cited within: Duggan et al., (1999), Expression profiling using cDNA microarrays, Nature Genetics, 21:10-14; Lipshutz et al., High density synthetic oligonucleotide arrays, Nature Genetics Suppl. 21:20-24; Evertsz et al., (2000), Technology and applications of gene expression microarrays, in Microarray Biochip technology. Schena, M., Ed. BioTechniques Books, Natick, MA, pp.149-166; Lockhart and Winzeler, (2000), Genomics, gene expression and DNA arrays, Nature.
  • Quantitative trait loci (QTL) data provide an effective complement to differential expression data in correlation analyses for screening for phenotype controlling genes.
  • QTL population statistics can provide a highly orthogonal screen to a population of differential expression genes for a phenotype of interest.
  • Quantitative trait loci data assign a statistical confidence that particular loci in a genomic sequence contribute a portion of the genetic variance in a quantitatively measurable phenotype.
  • a population of phenotypically defined individuals are scanned for the presence of "marker" nucleic acid sequences.
  • the pattern of markers and phenotypes for each individual in the population is compiled into a database.
  • Statistical evaluation of the database can provide the probability that a certain marker, or an interval between markers, and the phenotype will be present in the same individual of the population.
  • Commonly used procedures to generate QTL statistics include multiple regression analysis, interval mapping, or composite interval mapping, many of which are covered in texts such as Molecular Dissection of Complex Traits.
  • Figure 1 is a chart showing LOD scores for the association of marker intervals with variation in growth traits in Arabidopsis grown under two environments (provision of ammonium sulfate or ammonium nitrate fertilizer). Genomic regions with a higher probability of containing genes controlling the phenotype of growth are in QTL spanning intervals with higher LOD score values.
  • Precision of QTL data depends on the number of markers and the number of individuals with the quantifiable phenotype evaluated.
  • the borders of a QTL can be broadened or narrowed according to the statistical level of confidence selected for evaluation of the data. Generally, QTL borders are such that many genes are covered within the QTL, including false positive genes for control of the phenotype.
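As a rough sketch of how the chosen confidence level broadens or narrows QTL borders, the snippet below keeps only the marker intervals whose LOD score meets a threshold. The interval names, LOD values, and thresholds are hypothetical, and real interval mapping software defines borders in more refined ways.

```python
def qtl_intervals(lod_by_interval, threshold):
    """Return marker intervals whose LOD score meets the chosen threshold."""
    return [name for name, lod in lod_by_interval.items() if lod >= threshold]


# Hypothetical LOD scores for consecutive marker intervals on one chromosome.
lod_scores = {"m1-m2": 1.2, "m2-m3": 3.4, "m3-m4": 2.6, "m4-m5": 0.8}

relaxed = qtl_intervals(lod_scores, threshold=2.5)  # broader QTL
strict = qtl_intervals(lod_scores, threshold=3.0)   # narrower QTL
```

A stricter threshold leaves fewer intervals, and therefore fewer genes (true and false positives alike), inside the QTL.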
  • Correlation analysis is an effective method to screen for phenotype controlling genes in this invention.
  • the complementary screening methods of QTL mapping and differential gene expression analysis are combined to narrow the search for phenotype controlling genes.
  • Differentially expressed gene sequences can be greatly enriched for phenotype controlling genes by rejecting sequences not linked with a high probability to relevant quantitative trait loci (QTL).
  • Databases of gene sequences differentially expressed between individuals having different phenotypes include many false positive "hits" irrelevant to control of the phenotype. Many of the false positive hits can be rejected for not mapping within the relevant QTL (i.e., QTLs established based on the same or related phenotypes).
  • MPSS: massively parallel signature sequencing.
  • Correlation analysis combines differential expression analyses and QTL data to identify phenotypically correlated genes more likely to be phenotype controlling genes (and to reject genes associated with the phenotype that are not actually controlling genes). For example, differential expression analyses often identify a large proportion of differentially expressed false positive genes that do not control the phenotype of interest but are genetically downstream of the controlling genes that map within the QTL. These downstream genes can often be rejected by correlation analysis. Similarly, a large number of genes not actually associated with control of the phenotype are usually included within the borders of a typical QTL of 1-20 cM. Many of these genes will not be hits by differential expression analysis and will thus be rejected in correlation analysis. Correlation analysis can combine QTL analysis with differential expression analysis to accept highly likely (mutually included) controlling genes, and to reject unlikely controlling genes.
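The mutual-inclusion filter described above can be illustrated with a short sketch. The record types, coordinates, and p-value cutoff are assumptions made for the example; the source does not specify a data layout.

```python
from typing import NamedTuple


class Gene(NamedTuple):
    name: str
    chromosome: int
    position: int    # hypothetical base-pair coordinate
    p_value: float   # differential expression p-value


class QTL(NamedTuple):
    chromosome: int
    start: int
    end: int


def correlation_analysis(genes, qtls, max_p=0.001):
    """Keep genes that are differentially expressed AND map inside a QTL."""
    selected = []
    for g in genes:
        if g.p_value >= max_p:
            continue  # inside or outside a QTL, but not differentially expressed
        in_qtl = any(
            q.chromosome == g.chromosome and q.start <= g.position <= q.end
            for q in qtls
        )
        if in_qtl:
            selected.append(g.name)
    return selected


qtls = [QTL(chromosome=1, start=100, end=200)]
genes = [
    Gene("geneA", 1, 150, 0.0001),  # differentially expressed, inside the QTL
    Gene("geneB", 1, 500, 0.0001),  # differentially expressed, outside (e.g. downstream gene)
    Gene("geneC", 1, 180, 0.5),     # inside the QTL, not differentially expressed
]
correlated = correlation_analysis(genes, qtls)
```

Only `geneA` survives: `geneB` is rejected for mapping outside the QTL and `geneC` for not being a differential expression hit.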
  • FIG. 2 is a table demonstrating how correlation analysis can narrow the search for phenotype controlling genes.
  • Correlation analysis was performed on Arabidopsis targeting the phenotype of growth in ammonium nitrate fertilizer.
  • the number that mapped within the borders of five QTL was estimated as 7220 or 4993, depending on the stringency used to determine the borders (i.e., a QTL p-value < 0.01 or < 0.001, respectively).
  • QTL mapping by itself provides a 4-5 fold enrichment for identifying phenotype controlling gene candidates.
  • a single MPSS profile identified 13109 genes that were expressed in roots, narrowing the number of phenotype controlling candidates about another 2-fold.
  • the number of prospective controlling genes can be trimmed to levels capable of evaluation by desired methods. That is, adjustment of confidence levels for differential expression and QTL borders can allow the size of the candidate pool to be changed.
  • a candidate pool of 624 genes was selected.
  • With confidence levels set more stringently, e.g., at p < 0.001 for both QTL and differential expression data, 221 candidates were selected for a pool of genes more likely to be phenotype controlling genes.
  • Phenotypically correlated gene rank-ordering involves repeated correlation analyses to discover the most frequent common phenotypically correlated gene between individuals of different groups. In rank-ordering, at least two groups of individuals are identified which differ by a quantifiable phenotype. Correlation analysis comparisons are carried out pairwise between the individuals of each group to select a complete set of phenotypically correlated genes from all pairwise comparisons. Each of the genes in the complete set of phenotypically correlated genes can be ranked by the frequency with which it was selected among all the pairwise combinations. Phenotypically correlated genes represented in a higher number of correlation analysis combinations are more likely to be phenotype controlling genes.
  • Phenotypically correlated genes are rank-ordered by counting how often they are differentially expressed between individuals in different phenotypic groups. As with other statistical analyses, repetition and independent corroboration can increase the reliability of the data.
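Rank-ordering by pairwise comparison can be sketched as below. The `correlate` callback stands in for a full correlation analysis of one pair of individuals; here it is a lookup into hypothetical precomputed results, and all individual and gene names are invented for the example.

```python
from collections import Counter
from itertools import product


def rank_order(group_a, group_b, correlate):
    """Count how often each gene is selected across all pairwise comparisons."""
    counts = Counter()
    for a, b in product(group_a, group_b):
        counts.update(correlate(a, b))  # one count per gene per pair
    return counts.most_common()         # highest count first


fast = ["F1", "F2"]
slow = ["S1", "S2"]

# Hypothetical phenotypically correlated genes found for each fast/slow pair.
pairwise = {
    ("F1", "S1"): {"g1", "g2"},
    ("F1", "S2"): {"g1"},
    ("F2", "S1"): {"g1", "g3"},
    ("F2", "S2"): {"g1", "g2"},
}

ranking = rank_order(fast, slow, lambda a, b: pairwise[(a, b)])
```

Genes selected in more of the pairwise comparisons rise to the top of the list and are treated as the stronger candidates.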
  • a database of phenotypic correlations can be obtained by comparing differential gene expression within relevant QTL borders between several individuals from groups having different phenotypes. Each of several individuals from a group displaying a phenotype at one end of a phenotypic distribution are compared to each of several individuals from a group which do not display the phenotype, or which display the phenotype at an opposite end of the phenotypic distribution, to determine a set of all significant phenotypically correlated genes between the groups.
  • Multiple independent correlation analyses of genes from a variety of individuals enhance the reliability of correlation analysis by strengthening the underlying basis for the statistical analysis and by eliminating some genes irrelevant to the phenotype.
  • Multiple independent correlation analyses can further enhance the power of complementary methods in this invention by providing complementary test subject individuals.
  • phenotypically correlated genes selected for one phenotype can be compared to another set of phenotypically correlated genes selected for another phenotype (such as weight) to discover phenotype controlling gene candidates for a more general target trait (such as size).
  • candidate genes for overlapping QTL controlling more than one trait e.g. insulin resistance vs. atherosclerosis susceptibility
  • common genes are identified from phenotypically correlated gene sets derived from correlation analyses of individuals from different populations.
  • the individual populations can have the same phenotype of interest but can differ in other phenotypes.
  • the selection of mutually included phenotypically correlated genes from independent correlation analyses provides more reliable statistics and removes many genes associated with irrelevant phenotypes.
  • common genes are identified from phenotypically correlated gene sets derived from correlation analyses of individuals from populations having different phenotypes of interest. Selection of phenotypically correlated genes independently correlated to multiple phenotypes associated with a more general target trait enriches phenotypically correlated genes for phenotype controlling genes. For example, phenotypically correlated genes generally associated with plant growth in ammonium sulfate fertilizer can be compared to phenotypically correlated genes for growth in ammonium nitrate fertilizer to select for candidate genes that control growth in both environments.
  • phenotypically correlated genes independently selected based on root growth and leaf growth parameters could be compared, with environmental factors held constant, to select for candidate genes that control both leaf and root growth.
  • multiple quantifiable phenotypes associated with two or more environmental responses, or two or more quantifiable parameters related to a more general target trait are subjected to correlation analysis to discover common phenotype correlated genes.
  • Those who are skilled in the art can readily conceive of many orthogonal parameters to measure phenotypes or conceive of ways to manipulate environmental factors to investigate general traits by independent correlation analysis.
  • Genotype contingency testing capitalizes on two ideas: 1) the variation of phenotype controlling genes is determined by genetic factors that co-segregate in cis with the QTL; and 2) phenotypically correlated genes that vary as a result of segregation of genetic factors in trans, that map outside of the QTL, can be eliminated as phenotype controlling gene candidates.
  • the genotype contingency test selects genes that are differentially expressed between individuals with a different genotype in a region of a gene and also selects genes not differentially expressed between individuals with a similar genotype in the region of the gene.
  • the genotype contingency test rejects genes that are differentially expressed between individuals with a similar genotype in the region of the gene and genes that are not differentially expressed between individuals with a different genotype in the region of the gene.
  • the genotype contingency test is developed to enrich for phenotype controlling genes and to reject genes less likely to be phenotype controlling genes.
  • a phenotypically correlated gene that is differentially expressed between any two individuals that do not differ in genotype in the region of the phenotypically correlated gene is not likely to be a phenotype controlling gene, because the genes responsible for the difference in expression must map elsewhere in the genome.
  • a phenotypically correlated gene that is expressed at the same level in any two individuals that differ in genotype in the region of the phenotypically correlated gene is also not likely to be a phenotype controlling gene, because variation in the level of expression of a phenotype controlling gene must co-segregate with the QTL.
  • genes are retained and selected, if gene expression is different between individuals that differ in genotype in the region of the gene or if gene expression is not different between individuals that do not differ in genotype in the region of the gene. Further, genes are rejected if gene expression is not different between individuals that differ in genotype in the region of the gene or if gene expression is different between individuals that do not differ in genotype in the region of the gene. Phenotypically correlated genes selected by the genotype contingency test are more likely to be phenotype controlling genes than those genes rejected by the test.
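The selection rule above reduces to a simple agreement test: a gene is retained when the genotype comparison and the expression comparison give the same answer. The sketch below assumes both comparisons have already been reduced to booleans; the function names and example data are hypothetical.

```python
def contingency_select(genotype_differs: bool, expression_differs: bool) -> bool:
    """Retain a gene only if genotype and expression both differ or both match."""
    return genotype_differs == expression_differs


def contingency_filter(observations):
    """observations maps gene -> (genotype_differs, expression_differs)."""
    return [
        gene
        for gene, (genotype_differs, expression_differs) in observations.items()
        if contingency_select(genotype_differs, expression_differs)
    ]


# Hypothetical comparison of one pair of individuals.
pair = {
    "g1": (True, True),    # selected: cis variation co-segregates with expression
    "g2": (False, True),   # rejected: expression differs although genotype matches
    "g3": (True, False),   # rejected: genotype differs but expression matches
    "g4": (False, False),  # selected: nothing contradicts cis control
}
kept = contingency_filter(pair)
```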
  • genotypes in the region of each gene can be deemed “different” if they are not identical by descent or if marker alleles on both sides of the gene are different in the two individuals being compared.
  • individuals for comparison are RILs genotyped typically with 100-1000 marker sequences for high resolution determination of the identity by descent for the entire genome.
  • genotype contingency testing can be carried out by comparing known nucleotide sequences of individual genomes in computer databases, as available.
  • Standard DNA sequencing methods, such as Sanger, Maxam-Gilbert, and automated sequencing methods, can be applied to elucidate specific sequences from each individual to be tested.
  • genotype sequenced species such as Arabidopsis thaliana, Oryza sativa, Homo sapiens, Mus musculus, Caenorhabditis elegans, Drosophila melanogaster and Saccharomyces cerevisiae.
  • the genotype contingency test can be applied to sets of genes before or after complementary gene screening methods, such as differential expression analyses and QTL analyses, to further narrow a candidate pool of phenotype controlling genes.
  • Such sequential testing is not redundant since selection decisions in the complementary methods can be based on independent (orthogonal) parameters.
  • Contingency testing can be repeated, comparing new pairs of individuals to provide multiply contingency tested genes with greater statistical certainty of correlation to phenotype control.
  • Once phenotype controlling genes are identified by any of the methods above, further expression and functional studies can be carried out. Although the contingency test and correlation methods can discover previously undocumented genes, a search of genomic databases can yield analogous sequences that suggest the function of the gene. Genes identified by these methods can be readily synthesized and/or cloned by those skilled in the art for expression and purification. Gene sequences and gene products of the methods can be used in high throughput screening as elements of nucleic acid or peptide arrays. Gene sequences and gene products of this method can be evaluated by in vivo and in vitro assays for use as probes, pharmaceuticals, vaccines, diagnostics, nutritional supplements, growth factors, herbicides, and other products.
  • mapping, comparison and selection operations involving the differential expression and QTL databases would be difficult without the use of an automated system. Indeed, in most cases, correlation analyses are difficult, at best, to perform manually. Computer systems are well suited to make correlation analysis tasks more practical.
  • A computer system, in the context of this invention, refers to a system in which data entering a computer (input data) correspond to physical objects or processes external to the computer, and in which a process within the computer physically transforms the input data into different, usable output data.
  • the input data e.g., expression data, a genomic sequence, genotype information, QTL information, and/or operational commands
  • a new data product e.g., mapped differential expression data.
  • the process within the computer is a program by which expression data, e.g. MPSS data, are recognized and compared to genome sequence data until a match is found.
  • MPSS data the signature sequences associated with differential expression data, are in turn associated with the locations of the matching genomic sequences for output as an associated set of data. Additional operations can then be performed on the associated data set e.g., comparison to QTL data or genotype contingency data.
  • levels of confidence can be adjustable to modify the stringency of comparisons, selections and determinations to affect the quality or quantity of outputs. Adjustment can be made to, but not limited to, the level of phenotypic difference required to distinguish quantifiable phenotypes, the level of expression difference required to determine if genes are differentially expressed, and the level of confidence establishing QTL borders.
  • Correlation analysis can be carried out by a computer system through a series of operations controlled by a computer readable digital instruction set, e.g., as shown in Figure 3.
  • Such an instruction set can include, but not be limited to: 1) input differential expression gene sequences, 2) input a relevant genomic sequence, 3) map the genomic location of the differentially expressed gene by comparison of sequence data, 4) input relevant QTL data, 5) select differentially expressed genes within the borders of QTL data to identify phenotypically correlated gene sequences, and/or 6) store a phenotypically correlated sequence database.
  • Independent correlation analysis can be carried out by a computer system through a series of operations controlled by a computer readable digital instruction set.
  • Such an instruction set can include, but not be limited to: 1) input a first database of phenotypically correlated sequences, 2) input a second database of phenotypically correlated sequences, 3) compare the first database with the second database to determine which genes are common between them, 4) select genes common between the first database and the second database, and/or 5) store an independently correlated gene database.
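The comparison and selection steps for independent correlation analysis amount to a set intersection. A minimal sketch, assuming each database is simply a collection of gene identifiers (the identifiers shown are hypothetical):

```python
def independent_correlation(first_db, second_db):
    """Select genes common to two phenotypically correlated gene databases."""
    return sorted(set(first_db) & set(second_db))


# Hypothetical gene sets from two separate correlation analyses.
sulfate_genes = ["g1", "g4", "g7"]
nitrate_genes = ["g2", "g4", "g7", "g9"]

common = independent_correlation(sulfate_genes, nitrate_genes)
```

Sorting the result is only a convenience for reproducible output; the selection itself does not depend on order.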
  • Rank-ordering genes by multiple correlation to a phenotype can be carried out by a computer system through a series of operations controlled by a computer readable digital instruction set.
  • Such an instruction set can include, but not be limited to: 1) identify a first group and a second group of individuals, 2) identify a storage location for a database of phenotypic correlations 3) select an individual from the first group, 4) select an individual from the second group, 5) loop to a correlation analysis subroutine to identify a set of phenotypically correlated genes common between the individuals, 6) store the set of phenotypically correlated genes in the database of phenotypic correlations, 7) repeat steps 4 and 5 with a different individual from the second group until each individual from the second group has been selected for correlation analysis with the individual from the first group, 8) select a different individual from the first group, 9) repeat steps 4 to 8 until each individual in the first group has been selected, 10) count the number of times each phenotypically correlated gene is represented in the database of phenotypic correlations, 11) rank each phenotypically correlated gene according to the count, and 12) store the rank-order of each phenotypically correlated gene in
  • Genotype contingency testing can be carried out by a computer system through a series of operations controlled by a computer readable digital instruction set.
  • Such an instruction set can include, but not be limited to, instructions to: 1) select a gene for contingency testing, 2) input a value for the level of expression of the gene in a first individual, 3) input a genotype of the gene for the first individual, 4) input a value for the level of expression of the gene in a second individual, 5) input a genotype of the gene for the second individual, 6) compare the genotypes of the gene between the first individual and second individual, 7) compare the levels of expression of the gene between the first individual and second individual, 8) select the gene if the genotype is different between the individuals and the level of expression is different between the individuals, 9) select the gene if the genotype is not different between the individuals and the level of expression is not different between the individuals, and 10) store selected genes in a contingency tested gene database.
  • Contingency testing steps can be repeated in the computer system of the invention, comparing new pairs of individuals to provide multiply contingency tested genes with greater statistical certainty of correlation to phenotype control.
  • Output databases from computerized genotype contingency testing can provide input databases for computerized correlation analyses programs. In a preferred embodiment, output databases from computerized correlation analyses provide input databases for computerized contingency testing programs.
  • Efficiency and productivity of phenotype controlling gene screening can be further increased by interfacing analytical instrumentation with a computer system. Interfacing MPSS instruments, nucleic acid sequencing instruments, and other hardware generating relevant data, to a computer system can avoid the effort and errors of manual data transcription.
  • Arabidopsis thaliana is a small flowering plant of the mustard (Brassicaceae) family that is widely used as a model organism in plant biology. Sequencing of the 125 megabase Arabidopsis genome was essentially complete in the year 2000. Arabidopsis is an excellent subject for the correlation methods described herein because it is well characterized phenotypically, grows quickly to maturity, has a sequenced genome, and has available well characterized recombinant inbred lines (RILs).
  • RILs well characterized recombinant inbred lines
  • Correlation analysis identified phenotypically correlated gene sequences for fertilizer response in Arabidopsis.
  • RILs that differ in growth on ammonium sulfate or ammonium nitrate fertilizer were selected from the two tails of the RIL phenotypic distribution.
  • Root cDNA from pools of fast and slow-growing RILs was profiled by MPSS to identify differentially expressed genes.
  • genes that were differentially expressed (p < 0.0001) in the roots of fast and slow growing plants were mapped to the Arabidopsis genome and screened for inclusion within the borders of QTL controlling growth in ammonium nitrate to identify phenotypically correlated genes.
  • the differentially expressed genes are ordered on the x-axis in the same order that they appear in the Arabidopsis genome (the five chromosomes from which they are derived are aligned below).
  • the left y-axis, "QTL p Value", shown as histogram data, represents the level of confidence that a gene at a point along the genome is linked to a phenotype controlling gene.
  • the level of differential expression between fast and slow growing plants grown on ammonium nitrate (shown as up and down spikes from a center line) is plotted according to the right y-axis. Relatively few differentially expressed genes are phenotypically correlated genes within the QTL borders.
  • a single MPSS profile identified 13686 genes that were expressed in roots on ammonium sulfate, so this narrowed down the number of phenotype controlling candidates about another 2-fold.
  • the candidate pool of phenotype controlling genes narrowed significantly.
  • Adjustment of confidence levels for the differential expression and QTL borders also allowed the size of the candidate pool to be changed. With the confidence level set at p < 0.01 for QTL borders and at p < 0.01 for differential expression confidence, a candidate pool of 716 genes was selected.
  • 202 candidates were selected for a smaller pool of genes more likely to be phenotype controlling genes.
  • Selection of 202 candidates from the estimated 1635 genes that mapped within the QTL borders at this confidence level represents an 8-fold enrichment.
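The enrichment figure follows directly from the two counts given, taking fold enrichment as the ratio of genes within the QTL borders to candidates retained (this definition is an assumption consistent with the stated result):

```latex
\[
\text{fold enrichment} = \frac{1635}{202} \approx 8.1
\]
```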
  • most of the phenotypically correlated genes identified in Figure 6 were different from those identified in Figure 2, possibly because different QTL control growth phenotypes expressed in the two fertilizer environments (see, Figure 1).
  • Candidate phenotype controlling genes for fertilizer response were selected by ranking the number of times a particular gene was selected in a series of correlation analyses between many pairs of MPSS runs derived from groups of fast and slow growing RILs.
  • Figure 8 shows the nine possible correlation analyses between MPSS runs derived from two pools of fast and slow growing RILs, three individual fast growing RILs, and three individual slow growing RILs.
  • MPSS profiles from each of the three fast growing samples were separately compared to MPSS profiles from each of the three slow growing samples to generate a set of phenotypic correlations.
  • Phenotypically correlated genes were rank-ordered by ranking the frequency with which each unique phenotypically correlated gene was selected in all the pairwise correlation analysis comparisons.
  • Contingency testing was used effectively in Arabidopsis to further narrow the search for phenotype controlling genes.
  • a contingency test was completed on 221 phenotypically correlated genes selected in the correlation analysis for ammonium nitrate fertilizer response (Figure 2). Thirty-five (35) phenotypically correlated genes were selected as differentially expressed and having a different IBD genotype, thereby providing QTL and genotype dependent genes.
  • the second contingency test parameter selection of genes that are not differentially expressed and which do not have a different genotype, had no effect since these candidates were previously screened out in the correlation analyses step.
  • Contingency testing of phenotypically correlated genes narrowed the list of ammonium nitrate fertilizer response phenotype controlling candidates here by 84%.
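The 84% figure is consistent with the counts reported for this example, 221 phenotypically correlated genes tested and 35 retained:

```latex
\[
\frac{221 - 35}{221} = \frac{186}{221} \approx 0.84
\]
```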


Abstract

This invention relates to a method for selecting among multiple complementary approaches for screening gene populations highly enriched for phenotype controlling genes. Sequential application of complementary orthogonal screening techniques, such as correlation analysis, multiple correlation analysis, independent correlation analysis, and contingency testing, narrows the search for candidate phenotype controlling genes. Applying a computer system to this screening method facilitates practice of the invention.
PCT/US2002/041381 2001-12-28 2002-12-23 Identification of genes controlling complex traits WO2003057832A2 (fr)

Priority Applications (1)

Application Number Priority Date Filing Date Title
AU2002364013A AU2002364013A1 (en) 2001-12-28 2002-12-23 Identification of genes controlling complex traits

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US34449901P 2001-12-28 2001-12-28
US60/344,499 2001-12-28

Publications (2)

Publication Number Publication Date
WO2003057832A2 true WO2003057832A2 (fr) 2003-07-17
WO2003057832A3 WO2003057832A3 (fr) 2004-03-04

Family

ID=23350778

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2002/041381 WO2003057832A2 (fr) 2001-12-28 2002-12-23 Identification of genes controlling complex traits

Country Status (2)

Country Link
AU (1) AU2002364013A1 (fr)
WO (1) WO2003057832A2 (fr)

Non-Patent Citations (6)

* Cited by examiner, † Cited by third party
Title
JIANG ET AL.: 'Multiple trait analysis of genetic mapping for quantitative trait loci' GENETICS vol. 140, no. 3, July 1995, pages 1111 - 1127, XP002967875 *
LISTER ET AL.: 'Recombinant inbred lines for mapping RFLP and phenotypic markers in Arabidopsis thaliana' THE PLANT JOURNAL vol. 4, no. 4, 1993, pages 745 - 750, XP002967871 *
MUELLER ET AL.: 'Homology-dependent resistance: transgenic virus resistance in plants related to homology-dependent gene silencing' THE PLANT JOURNAL vol. 7, no. 6, 1995, pages 1001 - 1013, XP002967876 *
THORNSBERRY ET AL.: 'Dwarf8 polymorphisms associate with variation in flowering time' NATURE GENETICS vol. 28, no. 3, July 2001, pages 286 - 289, XP002967872 *
ZENG Z.: 'Precision mapping of quantitative trait loci' GENETICS vol. 136, no. 4, April 1994, pages 1457 - 1468, XP002967873 *
ZENG Z.: 'Theoretical basis for separation of multiple linked gene effects in mapping quantitative trait loci' PROC. NATL. ACAD. SCI. USA vol. 90, no. 23, December 1993, pages 10972 - 10976, XP002967874 *

Also Published As

Publication number Publication date
WO2003057832A3 (fr) 2004-03-04
AU2002364013A1 (en) 2003-07-24
AU2002364013A8 (en) 2003-07-24

Similar Documents

Publication Publication Date Title
CN109196123B (zh) SNP molecular marker combination for rice genotyping and application thereof
Breyne et al. Quantitative cDNA-AFLP analysis for genome-wide expression studies
Johnson et al. Genome‐wide population structure analyses of three minor millets: Kodo millet, little millet, and proso millet
Shi et al. Identification of candidate genes associated with cell wall digestibility and eQTL (expression quantitative trait loci) analysis in a Flint × Flint maize recombinant inbred line population
Rusyn et al. Toxicogenetics: population-based testing of drug and chemical safety in mouse models
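CN102277351A (zh) Method for obtaining gene information and functional genes from species without a genome reference sequence
CN108998550B (zh) SNP molecular markers for rice genotyping and application thereof
JP2016165286A (ja) Gene expression profiling with a reduced number of transcript measurements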
Uncu et al. High-throughput single nucleotide polymorphism (SNP) identification and mapping in the sesame (Sesamum indicum L.) genome with genotyping by sequencing (GBS) analysis
Kingsley Identification of causal sequence variants of disease in the next generation sequencing era
Negi et al. Applications and challenges of microarray and RNA-sequencing
CN110444253B (zh) Method and system suitable for pooled-sample gene mapping
Silva et al. A 3K Axiom SNP array from a transcriptome-wide SNP resource sheds new light on the genetic diversity and structure of the iconic subtropical conifer tree Araucaria angustifolia (Bert.) Kuntze
Tavtigian et al. An analysis of unclassified missense substitutions in human BRCA1
US20140019062A1 (en) Nucleic Acid Information Processing Device and Processing Method Thereof
JP5403563B2 (ja) Gene identification method and expression analysis method in comprehensive fragment analysis
AU2004234996B2 (en) Array having substances fixed on support arranged with chromosomal order or sequence position information added thereto, process for producing the same, analytical system using the array and use of these
WO2003057832A2 (fr) Identification of genes controlling complex traits
WO2006109535A1 (fr) DNA sequence analyzer and DNA sequence analysis method and program
Boopathi et al. Molecular Markers and DNA Barcoding in Moringa
Wong et al. Assessing gene expression variation in normal human tissues using GeneTag™, a novel, global, sensitive profiling method
JP6646120B1 (ja) DNA identification method capable of identifying individuals from degraded DNA
Scott et al. Designing a Study for Identifying Genes in Complex Traits
Gibbs et al. Evolving methods for the assembly of large genomes
WO2003050748A2 (fr) Genetic analysis of gene expression in heterosis

Legal Events

Date Code Title Description
AK Designated states

Kind code of ref document: A2

Designated state(s): AE AG AL AM AT AU AZ BA BB BG BR BY BZ CA CH CN CO CR CU CZ DE DK DM DZ EC EE ES FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MA MD MG MK MN MW MX MZ NO NZ OM PH PL PT RO RU SC SD SE SG SK SL TJ TM TN TR TT TZ UA UG US UZ VC VN YU ZA ZM ZW

AL Designated countries for regional patents

Kind code of ref document: A2

Designated state(s): GH GM KE LS MW MZ SD SL SZ TZ UG ZM ZW AM AZ BY KG KZ MD RU TJ TM AT BE BG CH CY CZ DE DK EE ES FI FR GB GR IE IT LU MC NL PT SE SI SK TR BF BJ CF CG CI CM GA GN GQ GW ML MR NE SN TD TG

121 EP: the EPO has been informed by WIPO that EP was designated in this application
DFPE Request for preliminary examination filed prior to expiration of 19th month from priority date (PCT application filed before 20040101)
122 EP: PCT application non-entry in European phase
NENP Non-entry into the national phase in:

Ref country code: JP

WWW WIPO information: withdrawn in national office

Country of ref document: JP