WO2005007816A2 - Compositions and methods for identifying an individual organism based on single nucleotide polymorphisms - Google Patents

Compositions and methods for identifying an individual organism based on single nucleotide polymorphisms

Publication number
WO2005007816A2
Authority
WO
WIPO (PCT)
Prior art keywords
nucleic acid
individual organism
acid sample
single nucleotide
identity
Prior art date
Application number
PCT/US2004/021662
Other languages
French (fr)
Other versions
WO2005007816A3 (en)
Inventor
Christian Jurinke
Ken Abel
Matthew Roberts Nelson
Georgios E. Marnellos
Charles Cantor
Original Assignee
Sequenom, Inc.
Priority date
Filing date
Publication date
Application filed by Sequenom, Inc.
Publication of WO2005007816A2
Publication of WO2005007816A3

Classifications

    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q: MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00: Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68: Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6876: Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes
    • C12Q1/6888: Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for detection or identification of organisms
    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q: MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q2600/00: Oligonucleotides characterized by their use
    • C12Q2600/156: Polymorphic or mutational markers

Definitions

  • the invention relates to genetic methods for identifying an individual organism or matching the genetic profiles of two nucleic acid samples by analyzing the single nucleotide polymorphisms (SNPs) in one or more samples of genetic material and compositions including nucleic acid molecules containing single nucleotide polymorphisms useful for such methods.
  • SNPs single nucleotide polymorphisms
  • VNTR variable number tandem repeat
  • the core repeat is typically a sequence of about 15 base pairs in length, and highly polymorphic VNTR loci can have an average of about 20 alleles (or types of VNTR).
  • DNA restriction sites located on either side of the VNTR are exploited to create DNA fragments from about 0.5 Kb to less than 10 Kb which are then separated by electrophoresis, indicating the number of repeats found in the individual at the particular locus.
  • RFLP methods generally consist of (1) extraction and isolation of DNA, (2) restriction endonuclease digestion; (3) separation of DNA fragments by electrophoresis; (4) capillary transfer; (5) hybridization with radiolabelled probes; (6) autoradiography; and (7) interpretation of results (Lee, H. C. et al., Am. J. Forensic. Med. Pathol. 15(4): 269-282 (1994)).
  • RFLP methods generally combine analysis at about 5 loci and have much higher discriminatory potential than other available tests due to the highly polymorphic nature of the VNTRs. The probability that any two people, except identical twins, will have the same pattern of VNTRs at 5 or more loci is very low.
  • PCR-based methods offer an alternative to RFLP methods.
  • In AmpFLP methods, DNA fragments containing VNTRs are amplified and then separated electrophoretically, without the restriction step of the RFLP method.
  • STRs short tandem repeats
  • Other methods include sequencing of mitochondrial DNA, which is especially suitable for situations where sample DNA is very degraded or in small quantities.
  • A small region of about 1 Kb of the mitochondrial DNA, referred to as the D-Loop locus, has been found useful for typing because of its polymorphic nature, although it results in lower discriminatory potential than RFLP or AmpFLP methods.
  • DNA sequencing is expensive to carry out on a large number of samples.
  • Further available methods include dot-blot methods, which involve using allele-specific oligonucleotide probes that hybridize sequence specifically to one allele of a polymorphic site.
  • Systems include the HLA DQ-alpha kit developed by Cetus Corp.
  • SNP markers sets of SNP markers, probes, primers, and methods for determining the identity of a nucleotide at the SNP markers are also encompassed and are further described herein, and may encompass any further limitation described in this disclosure, alone or in any combination.
  • the methods provided herein offer discriminatory power equal to or better than that of existing methods for identifying an individual, while consistently reducing the cost and time of making the determination.
  • One factor that reduces costs and time is multiplexing.
  • MassEXTEND® primers and assays set forth in Tables 6, 7 and 8 have been optimized so that the same terminator mix can be used for all assays. This improves the efficiency of the assays and enables a greater ease of automation and also allows for grouping of assays into a multiplex format. Another factor that reduces costs and time is performing an identification procedure with informative SNP markers, which minimizes the number of SNP markers required for the determination.
  • a method for identifying an individual organism based on one or more single nucleotide polymorphisms which comprises (a) obtaining or possessing a nucleic acid sample known to be from the individual organism; (b) obtaining or possessing a nucleic acid sample which may or may not be from, or which may or may not be derived from, the individual organism; (c) analyzing the nucleic acid sample of step (a) to detect the identity of at least six nucleotide polymorphisms selected from the group consisting of the polymorphisms set forth in Table 5 (e.g., determining the sequence of a polymorphic variant at a particular position in the nucleic acid); (d) analyzing the nucleic acid sample of step (b) to detect the identity of the at least six single nucleotide polymorphism selected from the group consisting of the polymorphisms set forth in Table 5; and (e) comparing the results of steps (c) and (d), whereby if the
  • The number of SNPs selected may vary, depending upon the power of discrimination required. Therefore, more or fewer than six SNPs may be used. Also, if the relevant information sought is the degree of relatedness between the nucleic acid sample and the individual or between two nucleic acid samples (e.g., do they come from siblings or cousins), then the degree of identity required between the SNP matches may be lessened. For example, if one half of the SNPs match between samples (e.g., three of the six SNPs), that would be indicative of siblings. If one quarter of the SNPs match, that would be indicative of cousins, and so on.
  • a method for identifying a known individual organism based on one or more single nucleotide polymorphisms comprises the following steps: (a) obtaining or possessing a nucleic acid sample from the known individual organism; (b) obtaining or possessing a nucleic acid sample which may or may not be from, or which may or may not be derived from, the known individual organism; (c) analyzing the nucleic acid sample of step (a) to detect the identity of at least six single nucleotide polymorphisms selected from the group consisting of the polymorphisms set forth in Table 5; (d) analyzing the nucleic acid sample of step (b) to detect the identity of at least six nucleotide polymorphisms selected from the group consisting of the polymorphisms set forth in Table 5; and (e) comparing the results of steps (c) and (d), whereby if the identity of each of the six single nucleotide polymorphisms detected in step (
  • the nucleic acid sample is obtained by extraction from a biological source that is associated with a non-biological source.
  • the non-biological source is selected from the group consisting of fabric, carpeting, currency, leather, cordage, tobacco products, hard-surfaced objects, and a biological specimen other than the biological source that is the source of the nucleic acid.
  • the nucleic acid sample is obtained from human tissue, wherein the human tissue is selected from the group of human tissue consisting of epidermis, blood, semen, vaginal cells, hair, saliva, vomit, urine, feces, bone, buccal sample, amniotic fluid containing placental cells or fetal cells, and mixtures of any of the tissues listed above.
  • a method for identifying an individual organism based on single nucleotide polymorphisms which comprises: (a) obtaining or possessing a nucleic acid sample known to be from the individual organism; (b) obtaining or possessing a nucleic acid sample which may or may not be from, or which may or may not be derived from, the individual organism; (c) analyzing the nucleic acid sample of step (a) to detect the identity of each single nucleotide polymorphism selected from the group consisting of any subset of the polymorphisms set forth in Table 5; (d) analyzing the nucleic acid sample of step (b) to detect the identity of each single nucleotide polymorphism selected from the group consisting of any 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46
  • a “genetic fingerprint” or “genetic profile” as disclosed in this application may also be used as follows.
  • a biological specimen is taken from a subject (for example, but not limited to, a medical patient or a veterinary patient) for, for example, diagnostic or pharmacogenomic testing.
  • the sample may be simultaneously, or sequentially, analyzed for the presence of the genetic profile of the subject whose sample is being tested. In this way, the testing entity can be assured that the sample tested and the diagnostic or pharmacogenomic result obtained corresponds to the correct subject.
  • the genetic profile e.g., a result of the SNP assays disclosed herein
  • a kit comprising at least one primer pair for amplifying each nucleic acid single nucleotide polymorphism selected from the group consisting of any two of the polymorphisms set forth in Table 5, and (a) a compartment comprising the primer pairs set forth in Tables 6, 7 or 8; and (b) instructions for use of reagents in the kit.
  • a method for matching a genetic profile of a known individual organism based on one or more single nucleotide polymorphisms which comprises (a) obtaining or possessing a nucleic acid sample from the known individual organism; (b) analyzing the nucleic acid sample of step (a) to detect the identity of at least six of the nucleotide polymorphisms selected from the group consisting of the polymorphisms set forth in Table 5, to obtain an individually-specific result; (c) obtaining or accessing a collection of information containing at least one result of previously generated information on the genetic composition of individual organisms, which previously generated genetic information may contain a result comprising the genetic information of the individual organism of step (a); and (d) comparing the individually-specific result of step (b) with the previously generated information of step (c), whereby if the individually-specific result of step (b) matches the result comprising the genetic information of the individual organism of step (c), then the match of the genetic information of the individual organism is confirmed.
  • the above method may be used with a sample from an unknown individual.
  • samples from two or more unknown individuals may be compared. This is useful for determining if, for example, the same individual, albeit unknown, has committed more than one crime, i.e., is a serial criminal.
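The comparison step of the identification and matching methods above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the SNP identifiers, the dictionary layout and the function names `compare_profiles` and `same_source` are hypothetical and do not come from the application, and a genotype is modelled simply as an unordered pair of alleles.

```python
from typing import Dict, Tuple

Genotype = Tuple[str, str]      # two alleles, order not significant
Profile = Dict[str, Genotype]   # hypothetical layout: SNP id -> genotype


def compare_profiles(reference: Profile, questioned: Profile,
                     min_snps: int = 6) -> Tuple[int, int]:
    """Return (matching SNPs, compared SNPs) for the SNPs typed in both samples."""
    shared = sorted(set(reference) & set(questioned))
    if len(shared) < min_snps:
        raise ValueError(f"need at least {min_snps} shared SNPs, got {len(shared)}")
    # Sorting the allele pair makes ("A", "G") equal to ("G", "A").
    matches = sum(sorted(reference[s]) == sorted(questioned[s]) for s in shared)
    return matches, len(shared)


def same_source(reference: Profile, questioned: Profile, min_snps: int = 6) -> bool:
    """True only if every compared SNP genotype is identical in both samples."""
    matches, compared = compare_profiles(reference, questioned, min_snps)
    return matches == compared
```

The default of six SNPs mirrors the "at least six" wording of the methods; the fraction `matches / compared` could also be inspected when degree of relatedness, rather than identity, is the question.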
  • Figure 1 is a graph that shows the estimated performance of the candidate panel provided herein. Under the assumptions of Hardy-Weinberg equilibrium at each SNP for a population at equilibrium under evolutionary forces, the median probability that two randomly selected, unrelated individuals would share the same multi-SNP genotype is shown in Figure 1. Curves are given under three inbreeding levels for the population (Theta). At the level of 35 SNPs with the modest inbreeding coefficient of 0.01, the probability of two randomly selected unrelated individuals sharing the same multi-SNP genotype is one in 2.4 x 10^15. This is comparable to the values reported for the 13 STR panel in U.S. Caucasians (Holt et al. 2000. Foren. Sci. Int. 112:91-109).
  • a high powered typing system is advantageous when, for example, a suspect is identified by searching a DNA profile database such as that maintained by the U.S. Federal Bureau of Investigation (CODIS) or the United Kingdom's National DNA Database (NDNAD). Since databases may contain large numbers of data entries that are expected to increase consistently, currently used forensic systems will need to be modified to prevent matching overlapping DNA profiles. While database searches generally reinforce the evidence by excluding other possible suspects, low powered typing systems resulting in the identification of several individuals may often tend to diminish the overall case against a suspect.
  • CODIS DNA profile database maintained by the U.S. Federal Bureau of Investigation
  • NDNAD National DNA Database
  • a target population is systematically tested to identify an individual having the same DNA profile as that of a DNA sample.
  • a suspect is chosen at random based on DNA profile from a large population of innocent individuals. Since the population tested can often be large enough that at least one positive match is identified, and it is usually not possible to exhaustively test a population, the usefulness of the evidence will depend on the level of significance of the forensic test. In order to render such an application useful as a sole or primary source of evidence, DNA typing systems of extremely high discriminatory potential are required.
  • a very high-powered DNA typing assay would be required to discriminate between them. This can have important effects if a sample is found to match the suspect's DNA profile and no evidence that the perpetrator is a relative can be found. Accordingly, provided herein is a rapid, simple, inexpensive and accurate technique having a very high resolution value to determine relationships between individuals and differences in degree of relationships. Also provided herein is a very accurate genetic relationship test procedure that uses very small amounts of an original DNA sample, yet produces very accurate results.
  • Described herein are methods for the identification of individuals, which comprise determining the identity of the nucleotides at a set of genetic markers in a biological sample, wherein said set of genetic markers comprises at least that number of SNP markers which allows for the useful discrimination between individual organisms, based on their individual pattern of SNPs.
  • the present invention provides an extensive set of SNP markers allowing an effectively high discriminatory potential that is cheaper, faster and more efficient than the genetic markers used in current forensic typing systems. Also, SNP markers can be genotyped in individuals with much higher efficiency and accuracy than the genetic markers used in current forensic typing systems.
  • the invention comprises determining the identity of a nucleotide at a SNP marker by single or multiple nucleotide primer extension, which does not require electrophoresis as in the techniques described above and results in a lower rate of experimental error.
  • SNP markers and SNP markers of the invention may be used, and may be selected according to the discriminatory power desired. SNP markers, primers, and methods for determining the identity of said SNP markers are further described herein. The use of SNPs which segregate independently is preferred. If a conservative estimate of male chromosome sizes is used, 77 unlinked SNPs could be identified. If a more liberal and generally applicable sex averaged estimate is used, unlinked SNPs from the 94 regions provided in Table 2 could be identified. A 100 SNP panel of nearly unlinked SNPs is possible. Additionally, SNP allele frequencies should approach an approximate frequency of 0.500 in most major ethnic groups, and SNPs should not be subject to selection. Additional considerations are set forth below.
  • the discriminatory potential of the methods provided herein is equal to or better than the discriminatory potential of reported methods.
  • the discriminatory potential of a SNP marker typing method can be calculated.
  • the discriminatory potential of the forensic test can be determined in terms of the profile frequency, also referred to as the random match probability, by applying the product rule.
  • the product rule involves multiplying the allelic frequencies of all the individual alleles tested, and multiplying by an additional factor of 2 for each heterozygous locus.
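The product rule described above can be written out as a short Python sketch. It assumes independent loci and a per-locus table of allele frequencies; the locus names, the data layout and the function name `profile_frequency` are illustrative assumptions, not part of the application.

```python
from typing import Dict, Tuple


def profile_frequency(genotypes: Dict[str, Tuple[str, str]],
                      allele_freqs: Dict[str, Dict[str, float]]) -> float:
    """Random match probability of a multi-locus profile by the product rule.

    Multiplies the frequencies of both alleles at every locus and applies
    an extra factor of 2 at each heterozygous locus.
    """
    frequency = 1.0
    for locus, (allele_1, allele_2) in genotypes.items():
        frequency *= allele_freqs[locus][allele_1] * allele_freqs[locus][allele_2]
        if allele_1 != allele_2:   # heterozygous locus
            frequency *= 2.0
    return frequency
```

For a heterozygote A/G at a locus with frequencies 0.5/0.5 and a homozygote A/A at a locus where A has frequency 0.9, this gives 2 x 0.5 x 0.5 x 0.9 x 0.9 = 0.405.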
  • the discriminatory potential of SNP marker typing can be considered in the context of forensic science.
  • the genotype of this DNA sample can be determined for several genetic markers, and the profile A of the perpetrator can thereby be determined.
  • one suspect (S) is available for typing.
  • the same set of genetic markers, such as the SNP markers of the invention, are typed and the same profile A is obtained for (S) and (P).
  • Two hypotheses are thus presented as follows: (1) S is P (event C); or (2) S is not P (event C̄).
  • the ratio L of both probabilities can then be calculated using the following equation:
  • E(L) can thus be expressed as 3^N.
  • For VNTR-based DNA typing systems, assuming the VNTRs have 10 alleles, E(L) can be expressed as 55^N. Based on these results, the number of SNP markers or VNTRs needed to obtain, on average, a ratio of at least 10^6 or 10^8 can be calculated, and are set forth below in Table 3.
  • DNA typing systems and methods of the invention may comprise genotyping a set of at least 13 or at least 17 SNP markers to obtain a ratio of at least 10^6 or 10^8, assuming a flat distribution of L across the SNP markers.
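The marker counts quoted above can be reproduced with a small Python sketch. It rests on one assumption that is consistent with the 3^N and 55^N expressions but is not spelled out in this extract: the expected ratio per marker equals the number of possible genotypes, k(k+1)/2 for a marker with k alleles. The function names are hypothetical.

```python
def genotype_count(n_alleles: int) -> int:
    """Number of distinct unordered genotypes for a marker with n_alleles alleles."""
    return n_alleles * (n_alleles + 1) // 2


def markers_needed(n_alleles: int, target_ratio: int) -> int:
    """Smallest N such that genotype_count(n_alleles) ** N >= target_ratio."""
    per_marker = genotype_count(n_alleles)
    n_markers, expected_ratio = 0, 1
    while expected_ratio < target_ratio:   # exact integer arithmetic
        expected_ratio *= per_marker
        n_markers += 1
    return n_markers
```

With two alleles per SNP this yields 13 markers for 10^6 and 17 markers for 10^8, matching the figures in the text. For a 10-allele VNTR the same sketch gives 4 and 5 markers; those two values are computed here and are not quoted from Table 3.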
  • a greater number of SNP markers is genotyped to obtain a higher L value.
  • at least 1, 2, 3, 4, 5, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 37, 40, 50, 56, 60, 70, 80, 90, 100 or more of the SNP markers are genotyped.
  • DNA typing systems and methods of the invention using a larger number of SNP markers allow for uneven distributions of L across the SNP markers. For example, assuming unrelated individuals, a set of independent markers having an allelic frequency of 0.1/0.9, and the genetic profile of a homozygote at each genetic locus for the major allele, 66 SNP markers are required to obtain a ratio of 10^6, and 88 SNP markers are required to obtain a ratio of 10^8.
  • this is a first estimation of the upper bound of markers required in a DNA typing system.
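The upper-bound figures of 66 and 88 markers can be checked with the following Python sketch. It assumes that the ratio contributed by each locus is the reciprocal of the genotype frequency, so that a homozygote for a major allele of frequency 0.9 contributes 1/0.81 per locus; the function name is hypothetical.

```python
import math


def markers_for_ratio(genotype_frequency: float, target_ratio: float) -> int:
    """Smallest N with (1 / genotype_frequency) ** N >= target_ratio."""
    if not 0.0 < genotype_frequency < 1.0:
        raise ValueError("genotype_frequency must lie strictly between 0 and 1")
    per_locus_ratio = 1.0 / genotype_frequency
    return math.ceil(math.log(target_ratio) / math.log(per_locus_ratio))


# Homozygote for the major allele (frequency 0.9) at every locus.
major_homozygote = 0.9 ** 2
```

Calling `markers_for_ratio(major_homozygote, 1e6)` gives 66 and `markers_for_ratio(major_homozygote, 1e8)` gives 88, in agreement with the text.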
  • Table 4 below (Weir (1996)) lists probabilities for several different types of relationships, assuming alleles A_i and A_j, and population frequencies p_i and p_j, and lists likelihood ratios assuming genetic loci having allele frequencies of 0.1.
  • the DNA typing systems and methods of the present invention may further take into account effects of subpopulations on the discriminatory potential.
  • DNA typing systems consider close familial relationships, but do not take into account membership in the same population. While population membership is expected to have little effect, the invention may further comprise genotyping a larger set of SNP markers to achieve higher discriminatory potential.
  • a larger set of SNP markers may be optimized for typing selected populations; alternatively, the ceiling principle may be used to study allele frequencies from individuals in various populations of interest, taking for any particular genotype the maximum allele frequency found among the populations.
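The ceiling principle mentioned above can be illustrated with a brief Python sketch: for each allele, the largest frequency observed in any of the populations of interest is used. The population names, the nested-dictionary layout and the function name are assumptions made for this example only.

```python
from typing import Dict


def ceiling_frequencies(population_freqs: Dict[str, Dict[str, float]]) -> Dict[str, float]:
    """For each allele, take the maximum frequency found among the populations."""
    alleles = set()
    for freqs in population_freqs.values():
        alleles.update(freqs)
    return {
        allele: max(freqs.get(allele, 0.0) for freqs in population_freqs.values())
        for allele in alleles
    }
```

Because each allele takes its highest observed value, the resulting frequencies need not sum to one; the effect is a deliberately conservative (larger) genotype frequency when they are fed into a product-rule calculation.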
  • any markers known in the art may be used with the SNP markers of the present invention in the DNA typing methods and systems described herein, for example in any one of the following web sites offering collections of SNPs and information about those SNPs: The Genetic Annotation Initiative (http address cgap.nci.nih.gov/GAI). An NIH-run site which contains information on candidate SNPs thought to be related to cancer and tumorigenesis generally. dbSNP Polymorphism Repository (http address www.ncbi.nlm.nih.gov/SNP/). A more comprehensive NIH-run database containing information on SNPs with broad applicability in biomedical research.
  • This website provides access to SNPs that have been organized by chromosomes and cytogenetic location.
  • the site is run by Washington University.
  • HGBase (http address hgbase.cgr.ki.se/).
  • HGBASE is an attempt to summarize all known sequence variations in the human genome, to facilitate research into how genotypes affect common diseases, drug responses, and other complex phenotypes, and is run by the Karolinska Institute of Sweden.
  • the SNP Consortium Database (http address snp.cshl.org/db/snp/map). A collection of SNPs and related information resulting from the collaborative effort of a number of large pharmaceutical and information processing companies.
  • SNP markers provided in the following patents and patent applications may also be used with the SNP markers of the invention in the DNA typing methods and systems described above: PCT/IB00/00184, filed Feb. 11, 2000; PCT/D398/01193, filed Jul. 17, 1998; PCT Publication No. WO 99/54500, filed Apr. 21, 1999; and PCT/IB00/00403, filed Mar.
  • nucleic acid includes DNA molecules (e.g., a complementary DNA (cDNA) and genomic DNA (gDNA)) and RNA molecules (e.g., mRNA, rRNA, and tRNA) and analogs of DNA or RNA, for example, by use of nucleotide analogs.
  • the nucleic acid molecule can be single-stranded and it often is double-stranded.
  • isolated or purified nucleic acid refers to nucleic acids that are separated from other nucleic acids present in the natural source of the nucleic acid.
  • isolated includes nucleic acids which are separated from the chromosome with which the genomic DNA is naturally associated.
  • An “isolated” nucleic acid often is free of sequences which naturally flank the nucleic acid (i.e., sequences located at the 5' and/or 3' ends of the nucleic acid) in the genomic DNA of the organism from which the nucleic acid is derived.
  • the isolated nucleic acid molecule can contain less than about 5 kb, 4 kb, 3 kb, 2 kb, 1 kb, 0.5 kb or 0.1 kb of 5' and/or 3' nucleotide sequences which flank the nucleic acid molecule in genomic DNA of the cell from which the nucleic acid is derived.
  • an "isolated" nucleic acid molecule, such as a cDNA molecule, often is substantially free of other cellular material, or culture medium when produced by recombinant techniques, or substantially free of chemical precursors or other chemicals when chemically synthesized.
  • As used herein, the term “gene” refers to a nucleotide sequence that encodes a polypeptide. Also included herein are nucleic acid fragments. These fragments often are a nucleotide sequence identical to a nucleotide sequence in Tables 6, 7 or 8, a nucleotide sequence substantially identical to a nucleotide sequence in Tables 6, 7 or 8, or a nucleotide sequence that is complementary to the foregoing. The nucleic acid fragment may be identical, substantially identical or homologous to a nucleotide sequence of Tables 6, 7 or 8.
  • nucleic acid fragment may encode a full-length or mature polypeptide of the invention, or the nucleic acid fragment may encode a domain or part of a domain of a polypeptide of the invention.
  • An example of a nucleic acid fragment is an oligonucleotide.
  • oligonucleotide refers to a nucleic acid comprising about 8 to about 50 covalently linked nucleotides, often comprising from about 8 to about 35 nucleotides, and more often from about 10 to about 25 nucleotides.
  • the backbone and nucleotides within an oligonucleotide may be the same as those of naturally occurring nucleic acids, or analogs or derivatives of naturally occurring nucleic acids, provided that oligonucleotides having such analogs or derivatives retain the ability to hybridize specifically to a nucleic acid comprising a targeted polymorphism. Oligonucleotides described herein may be used as hybridization probes or as components of assays, for example, as described herein.
  • Oligonucleotides typically are synthesized using standard methods and equipment, such as the ABI™ 3900 High Throughput DNA Synthesizer and the EXPEDITE™ 8909 Nucleic Acid Synthesizer, both of which are available from Applied Biosystems (Foster City, CA). Analogs and derivatives are exemplified in U.S. Pat. Nos.
  • Oligonucleotides also may be linked to a second moiety.
  • the second moiety may be an additional nucleotide sequence such as a tail sequence (e.g., a polyadenosine tail), an adapter sequence (e.g., phage M13 universal tail sequence), and others.
  • the second moiety may be a non-nucleotide moiety such as a moiety which facilitates linkage to a solid support or a label to facilitate detection of the oligonucleotide.
  • Such labels include, without limitation, a radioactive label, a fluorescent label, a chemiluminescent label, a paramagnetic label, and the like.
  • the second moiety may be attached to any position of the oligonucleotide, provided the oligonucleotide can hybridize to the nucleic acid comprising the polymorphism.
  • Nucleic acids substantially identical to those described herein can be utilized. Substantially identical nucleic acids sometimes are 90% or more identical to a reference nucleic acid. Calculations of sequence identity often are performed as follows.
  • Sequences are aligned for optimal comparison purposes (e.g., gaps can be introduced in one or both of a first and a second amino acid or nucleic acid sequence for optimal alignment and non-homologous sequences can be disregarded for comparison purposes).
  • the length of a reference sequence aligned for comparison purposes is sometimes 30% or more, 40% or more, 50% or more, often 60% or more, and more often 70%, 80%, 90%, 100% of the length of the reference sequence.
  • the nucleotides or amino acids at corresponding nucleotide or polypeptide positions, respectively, are then compared among the two sequences.
  • the percent identity between the two sequences is a function of the number of identical positions shared by the sequences, taking into account the number of gaps, and the length of each gap, introduced for optimal alignment of the two sequences. Comparison of sequences and determination of percent identity between two sequences can be accomplished using a mathematical algorithm.
  • Percent identity between two amino acid or nucleotide sequences can be determined using the algorithm of Meyers & Miller, CABIOS 4: 11-17 (1989), which has been incorporated into the ALIGN program (version 2.0), using a PAM120 weight residue table, a gap length penalty of 12 and a gap penalty of 4. Also, percent identity between two amino acid sequences can be determined using the Needleman & Wunsch, J. Mol. Biol.
  • a set of parameters often used is a Blosum 62 scoring matrix with a gap open penalty of 12, a gap extend penalty of 4, and a frameshift gap penalty of 5.
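As a rough illustration of the percent identity calculation described above, the following Python sketch scores two sequences that have already been aligned. It is not the ALIGN or Needleman & Wunsch procedure: the alignment itself is assumed to be given, gaps are written as "-", and the choice to count every gapped column as a non-identical position is an assumption of this sketch.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent of alignment columns that hold the same residue in both sequences.

    Both inputs must be pre-aligned (equal length, gaps marked with `gap`).
    Columns containing a gap are counted as non-identical.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    if not aligned_a:
        raise ValueError("aligned sequences must not be empty")
    identical = sum(
        1 for x, y in zip(aligned_a, aligned_b) if x == y and x != gap
    )
    return 100.0 * identical / len(aligned_a)
```

Under this convention, two ten-column sequences that differ only by a single gapped column score 90%, which is the threshold the text gives for "substantially identical" nucleic acids.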
  • Another manner for determining if two nucleic acids are substantially identical is to assess whether a polynucleotide homologous to one nucleic acid will hybridize to the other nucleic acid under stringent conditions.
  • stringent conditions refers to conditions for hybridization and washing. Stringent conditions are known to those skilled in the art and can be found in Current Protocols in Molecular Biology, John Wiley & Sons, N.Y., 6.3.1-6.3.6 (1989). Aqueous and non-aqueous methods are described in that reference and either can be used.
  • An example of stringent hybridization conditions is hybridization in 6X sodium chloride/sodium citrate (SSC) at about 45°C, followed by one or more washes in 0.2X SSC, 0.1% SDS at 50°C.
  • Another example of stringent hybridization conditions is hybridization in 6X sodium chloride/sodium citrate (SSC) at about 45°C, followed by one or more washes in 0.2X SSC, 0.1% SDS at 55°C.
  • a further example of stringent hybridization conditions is hybridization in 6X sodium chloride/sodium citrate (SSC) at about 45°C, followed by one or more washes in 0.2X SSC, 0.1% SDS at 60°C.
  • stringent hybridization conditions are hybridization in 6X sodium chloride/sodium citrate (SSC) at about 45°C, followed by one or more washes in 0.2X SSC, 0.1% SDS at 65°C. More often, stringency conditions are 0.5M sodium phosphate, 7% SDS at 65°C, followed by one or more washes at 0.2X SSC, 1% SDS at 65°C.
  • SSC sodium chloride/sodium citrate
  • Determining the presence or absence of a polymorphic variant in both chromosomal complements represented in a nucleic acid sample from a subject having a copy of each chromosome is useful for determining the zygosity of an individual for the polymorphic variant (i.e., whether the individual is homozygous or heterozygous for the polymorphic variant).
  • Any detection method known in the art may be utilized to determine whether a sample includes the presence or absence of a polymorphic variant described herein. While many detection methods include a process in which a DNA region carrying the polymorphic site of interest is amplified, ultrasensitive detection methods which do not require amplification may be utilized in the detection method, thereby eliminating the amplification process.
  • Polymorphism detection methods known in the art include, for example, primer extension or microsequencing methods, ligase sequence determination methods (e.g., U.S. Pat. Nos. 5,679,524 and 5,952,174, and WO 01/27326), mismatch sequence determination methods (e.g., U.S. Pat. Nos.
  • microarray sequence determination methods include restriction fragment length polymorphism (RFLP) procedures, PCR-based assays (e.g., TAQMAN® PCR System (Applied Biosystems)), nucleotide sequencing methods, hybridization methods, conventional dot blot analyses, single strand conformational polymorphism analysis (SSCP, e.g., U.S. Patent Nos. 5,891,625 and 6,013,499; Orita et al., Proc. Natl. Acad. Sci.
  • RFLP restriction fragment length polymorphism
  • One of skill in the art can utilize the determined nucleotide sequence flanking a polymorphic site in a database search to determine where the polymorphic site is located in genomic DNA (e.g., a BLAST search may be utilized to determine genomic orientation).
  • Primer extension polymorphism detection methods typically are carried out by hybridizing a complementary oligonucleotide to a nucleic acid carrying the polymorphic site. In these methods, the oligonucleotide typically hybridizes adjacent to the polymorphic site.
  • the term "adjacent" refers to the 3' end of the extension oligonucleotide being sometimes 1 nucleotide from the 5' end of the polymorphic site, often 2 or 3, and at times 4, 5, 6, 7, 8, 9, or 10 nucleotides from the 5' end of the polymorphic site, in the nucleic acid when the extension oligonucleotide is hybridized to the nucleic acid.
  • the extension oligonucleotide then is extended by one or more nucleotides, often 1, 2, or 3 nucleotides, and the number and/or type of nucleotides that are added to the extension oligonucleotide determine which polymorphic variant or variants are present.
  • Oligonucleotide extension methods are disclosed, for example, in U.S. Patent Nos. 4,656,127; 4,851,331; 5,679,524; 5,834,189; 5,876,934; 5,908,755; 5,912,118; 5,976,802; 5,981,186; 6,004,744; 6,013,431; 6,017,702; 6,046,005; 6,087,095; 6,210,891; and WO 01/20039.
  • the extension products can be detected in any manner, such as by fluorescence methods (see, e.g., Chen & Kwok, Nucleic Acids Research 25: 347-353 (1997) and Chen et al, Proc. Natl. Acad. Sci.
  • Microsequencing detection methods often incorporate an amplification process that precedes the extension step.
  • the amplification process typically amplifies a region from a nucleic acid sample that comprises the polymorphic site.
  • Amplification can be carried out by utilizing a pair of oligonucleotide primers in a polymerase chain reaction (PCR), in which one oligonucleotide primer typically is complementary to a region 3' of the polymorphism and the other typically is complementary to a region 5' of the polymorphism.
  • PCR primer pair may be used in methods disclosed in U.S. Patent Nos. 4,683,195; 4,683,202; 4,965,188; 5,656,493; 5,998,143; 6,140,054; WO 01/27327; and WO 01/27329, for example.
  • PCR primer pairs may also be used in any commercially available machines that perform PCR, such as any of the GENEAMP ® Systems available from Applied Biosystems.
  • Mismatch sequence determination methods typically are based upon the specificity of polymerases and ligases. Polymerization reactions place particularly stringent requirements on correct base pairing of the 3' end of the amplification primer, and the joining of two oligonucleotides hybridized to a target DNA sequence is quite sensitive to mismatches close to the ligation site, especially at the 3' end. Discrimination between two alleles can be achieved by allele specific amplification, which is a selective strategy whereby one of the alleles is amplified without amplification of the other allele. This is accomplished by placing a polymorphic base at the 3' end of one of the amplification primers.
  • Oligonucleotide ligation assays utilize two oligonucleotides designed to be capable of hybridizing to abutting sequences of a single strand of a target molecule.
  • One of the oligonucleotides may be biotinylated, and the other may be detectably labeled.
  • oligonucleotides will hybridize such that their termini abut, and create a ligation substrate that can be captured and detected.
  • OLA is capable of detecting polymorphic sites and may be advantageously combined with PCR as described by Nickerson et al, Proc. Natl. Acad. Sci. U.S.A. 87: 8923-8927 (1990). In this method, PCR is used to achieve the exponential amplification of target DNA, which is then detected using OLA.
  • Ligase chain reaction (LCR) detection methods utilize two pairs of probes to exponentially amplify a specific target.
  • each pair of oligonucleotides is selected to permit the pair to hybridize to abutting sequences of the same strand of the target.
  • Such hybridization forms a substrate for a template-dependent ligase.
  • LCR can be performed with oligonucleotides having the proximal and distal sequences of the same strand of a polymorphic site.
  • either oligonucleotide will be designed to include the polymorphic site.
  • the reaction conditions are selected such that the oligonucleotides can be ligated together only if the target molecule contains or lacks the specific nucleotide(s) that is complementary to the polymorphic site on the oligonucleotide.
  • the oligonucleotides will not include a polymorphic site, such that when they hybridize to the target molecule, a "gap" is created as described for example in WO 90/01069.
  • This embodiment is termed gap LCR (GLCR).
  • This gap is then "filled" with complementary dNTPs (as mediated by DNA polymerase), or by an additional pair of oligonucleotides.
  • GLCR gap LCR
  • Another technique that may be used to analyze polymorphisms involves multicomponent integrated systems, which miniaturize and compartmentalize processes such as PCR and capillary electrophoresis reactions in a single functional device.
  • An example of such a technique is disclosed in U.S. Pat. No. 5,589,136, which describes the integration of PCR amplification and capillary electrophoresis in chips.
  • a microarray can be utilized for determining whether a polymorphic variant is present or absent in a nucleic acid sample.
  • a microarray may include any oligonucleotides described herein, and methods for making and using oligonucleotide microarrays suitable for prognostic use are disclosed in U.S. Pat. Nos.
  • the microarray typically comprises a solid support and the oligonucleotides may be linked to this solid support by covalent bonds or by non-covalent interactions.
  • the oligonucleotides may also be linked to the solid support directly or by a spacer molecule.
  • a microarray may comprise one or more oligonucleotides complementary to a polymorphic site within a nucleotide sequence in Tables 6, 7 or 8.
  • Polymorphism detection methods can be carried out within an integrated system.
  • An example of an integrated system is a microfluidic system. These systems comprise a pattern of microchannels designed onto a glass, silicon, quartz, or plastic wafer included on a microchip. The movements of the samples are controlled by electric, electroosmotic or hydrostatic forces applied across different areas of the microchip.
  • the microfluidic system may integrate nucleic acid amplification, microsequencing, capillary electrophoresis and a detection method such as laser-induced fluorescence detection.
  • a kit may also be utilized for determining whether a polymorphic variant is present or absent in a nucleic acid sample.
  • a kit often comprises one or more pairs of oligonucleotide primers useful for amplifying a fragment of a nucleotide sequence in Tables 6, 7 or 8, or a substantially identical sequence thereof, where the fragment includes a polymorphic site.
  • the kit sometimes comprises a polymerizing agent, for example, a thermostable nucleic acid polymerase such as one disclosed in U.S. Pat. Nos. 4,889,818 or 6,077,664.
  • the kit often comprises an elongation oligonucleotide that hybridizes to a nucleotide sequence in a nucleic acid sample adjacent to the polymorphic site.
  • the kit includes an elongation oligonucleotide, it also often comprises chain elongating nucleotides, such as dATP, dTTP, dGTP, dCTP, and dITP, including analogs of dATP, dTTP, dGTP, dCTP and dITP, provided that such analogs are substrates for a thermostable nucleic acid polymerase and can be incorporated into a nucleic acid chain elongated from the extension oligonucleotide.
  • the kit comprises one or more oligonucleotide primer pairs, a polymerizing agent, chain elongating nucleotides, at least one elongation oligonucleotide, and one or more chain terminating nucleotides. Kits optionally include buffers, vials, microtitre plates, and instructions for use. [0056] Forensic matching by microsequencing is further described in Example 1 below.
  • DNA samples are isolated from forensic specimens of, for example, hair, semen, blood or skin cells by conventional methods.
  • a panel of PCR primers based on a number of the sequences of the invention is then utilized according to the methods described herein to amplify DNA of approximately 500 bases in length from the forensic specimen.
  • the alleles present at each of the selected SNP marker sites of the invention are then identified according to the Examples discussed herein.
  • a simple database comparison of the analysis results determines the differences, if any, between the sequences from a subject individual or from a database and those from the forensic sample.
  • the SNPs labeled "set37" are a subset of 37 assays selected because they represent a preferred embodiment in that a) they are polymorphic; b) replicate tests consistently give the same results; c) they had a maximum of one drop out in four repeated tests; and d) they work well in a multiplex mode. It should be noted that the SNPs labeled "new" in Table 5 also meet the criteria described in a) through d) above. Also in Table 5, the "rs SNP ID" corresponds to the SNP reference number (e.g., rs2029490).
  • the chromosome position refers to the position of the SNP within NCBI's Genome build 34, which may be accessed at the following http address: www.ncbi.nlm.nih.gov.
  • the "sqnm.maf" is the minor allele frequency estimated by the Applicants in a pool of 96 CEPH Whites.
  • the "distance" corresponds to the distance (in Mb) between each marker and the preceding marker in the current. No value is reported if adjacent markers are not on the same chromosome.
  • a MassARRAY™ system (mass spectrometry) (Sequenom, Inc.) may be utilized to perform SNP genotyping in a high-throughput fashion.
  • This genotyping platform may be complemented by a homogeneous, single-tube assay method (hME™ or homogeneous MassEXTEND® (Sequenom, Inc.)) in which two genotyping primers anneal to and amplify a genomic target surrounding a polymorphic site of interest.
  • a third primer (the MassEXTEND® primer), which is complementary to the amplified target up to but not including the polymorphism, may then be enzymatically extended one or a few bases through the polymorphic site and then terminated.
  • the MassEXTEND® primers and assays set forth in Tables 6, 7 and 8 have been optimized so that the same terminator mix can be used for all assays. This improves the efficiency of the assays and enables a greater ease of automation and also allows for grouping of assays into a multiplex format.
  • the multiplex format may be in the form of 4-, 6-, 8- or 12- plexes. Tables 6, 7 and 8 provide assay designs for 6-, 8- and 12-plexes, respectively.
  • SpectroDESIGNER™ software (Sequenom, Inc.) may be used to generate a set of PCR primers and a MassEXTEND® primer that may be used to genotype the polymorphism.
  • Other primer design software could be used or one of ordinary skill in the art could manually design primers based on his or her knowledge of the relevant factors and considerations in designing such primers.
  • Tables 6, 7 and 8 show PCR primers and extension primers used for analyzing polymorphisms.
  • the initial PCR amplification reaction may be performed in a 5 µl total volume containing 1X PCR buffer with 1.5 mM MgCl2 (Qiagen), 200 µM each of dATP, dGTP, dCTP, dTTP (Gibco-BRL), 2.5 ng of genomic DNA, 0.1 units of HotStar DNA polymerase (Qiagen), and 200 nM each of forward and reverse PCR primers specific for the polymorphic region of interest.
  • Table 9 illustrates a different individual across each of 16 columns, comprising African Americans (af); Caucasians (ca); Hispanics (hi) and unknown ethnicity (controls), also males (M) and females (F). Reading down each column, the identity of the nucleotide at each SNP position (e.g., 3277, 9566, etc.) is shown. The combination of the 37 SNPs per person provides a unique pattern for each individual. The number of possible combinations is 3 x 10^37.
  • the MassEXTEND® reaction may be performed in a total volume of 9 µl, with the addition of 1X ThermoSequenase buffer, 0.576 units of ThermoSequenase (Amersham Pharmacia), 600 nM MassEXTEND® primer, 2 mM of ddATP and/or ddCTP and/or ddGTP and/or ddTTP, and 2 mM of dATP or dCTP or dGTP or dTTP.
  • the deoxy nucleotide (dNTP) used in the assay normally will be complementary to the nucleotide at the polymorphic site in the amplicon.
  • Samples are incubated at 94°C for 2 minutes, followed by 55 cycles of 5 seconds at 94°C, 5 seconds at 52°C, and 5 seconds at 72°C. [0062] Following incubation, samples are desalted by adding 16 µl of water (total reaction volume was 25 µl) and 3 mg of SpectroCLEAN™ sample cleaning beads (Sequenom, Inc.), and are allowed to incubate for 3 minutes with rotation.
  • Samples are then robotically dispensed using a piezoelectric dispensing device (SpectroJET™ (Sequenom, Inc.)) onto either 96-spot or 384-spot silicon chips containing a matrix that crystallized each sample (SpectroCHIP® (Sequenom, Inc.)).
  • SpectroJETTM Spin-On-Ediode
  • SpectroCHIP® Sequenom, Inc.
  • MALDI-TOF mass spectrometry Biflex and Autoflex MALDI-TOF mass spectrometers (Bruker Daltonics) can be used
  • SpectroTYPER RT™ software Sequenom, Inc.
  • Table 10 illustrates a number of multiplex reactions, meaning that PCR amplification and/or MassEXTEND® reactions can be run simultaneously in the same reaction vessel.
  • A01 is a set of multiplex assays in which 1397421, 1912948, 775709, 934774 are all run together.
  • A02-04 are additional triplicate runs of the same multiplex reactions, which show the reproducibility of the reactions.
  • a total of fourteen four-plexes is set forth.
  • the multiplexes were designed so that the PCR primers do not interfere with each other and share the same formation of the termination mix.
  • Other sets of multiplexes and/or higher multiplexes could be designed by those of ordinary skill in the art.
  • "X" indicates that a polymorphic variant at the designated position was not identified for the particular individual.
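The primer extension steps described in the list above (an extension oligonucleotide hybridized adjacent to the polymorphic site, extended through the site, then terminated) can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: the primer is taken to end directly next to the polymorphic site, exactly one terminating base is added, and sequences are written on the strand on which the alleles are reported, so the primer anneals to the complementary strand. The function names and the example sequence are hypothetical and are not taken from Tables 6, 7 or 8.

```python
# Illustrative sketch only: names and sequences are hypothetical.

def design_extension_primer(five_prime_flank: str, length: int = 17) -> str:
    """Return the last `length` bases of the sequence immediately 5' of the
    polymorphic site, so the primer's 3' end sits directly next to the site."""
    if not 1 <= length <= len(five_prime_flank):
        raise ValueError("primer length must be between 1 and the flank length")
    return five_prime_flank[-length:]


def extension_products(primer: str, alleles: str) -> dict:
    """Map each allele to the product expected after a single-base extension
    that terminates at the polymorphic site."""
    return {allele: primer + allele for allele in alleles}


def call_genotype(observed: list, primer: str, alleles: str):
    """Infer zygosity from the set of extension products that were detected."""
    expected = extension_products(primer, alleles)
    found = sorted(a for a, product in expected.items() if product in observed)
    if not found:
        return None            # no call (drop out)
    if len(found) == 1:
        return found[0] * 2    # homozygous, e.g. "CC"
    return "".join(found)      # heterozygous, e.g. "CT"
```

In a mass spectrometry readout the products would be distinguished by mass rather than by sequence string; the string comparison here only stands in for that detection step.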

Landscapes

  • Chemical & Material Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Analytical Chemistry (AREA)
  • Proteomics, Peptides & Aminoacids (AREA)
  • Organic Chemistry (AREA)
  • Zoology (AREA)
  • Wood Science & Technology (AREA)
  • Health & Medical Sciences (AREA)
  • Engineering & Computer Science (AREA)
  • Microbiology (AREA)
  • Immunology (AREA)
  • Molecular Biology (AREA)
  • Biotechnology (AREA)
  • Biophysics (AREA)
  • Physics & Mathematics (AREA)
  • Biochemistry (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Health & Medical Sciences (AREA)
  • Genetics & Genomics (AREA)
  • Measuring Or Testing Involving Enzymes Or Micro-Organisms (AREA)

Abstract

Provided herein are methods and compositions for identifying an individual organism, matching the genetic profile of an individual with information in, for example, a database, and comparing nucleic acid samples including a human, by analyzing single nucleotide polymorphism patterns in nucleotide sequences within the human genome.

Description

COMPOSITIONS AND METHODS FOR IDENTIFYING AN INDIVIDUAL ORGANISM BASED ON SINGLE NUCLEOTIDE POLYMORPHISMS
Field of the Invention
[0001] The invention relates to genetic methods for identifying an individual organism or matching the genetic profiles of two nucleic acid samples by analyzing the single nucleotide polymorphisms (SNPs) in one or more samples of genetic material, and to compositions including nucleic acid molecules containing single nucleotide polymorphisms useful for such methods.
Background
[0002] Methods and compositions for the identification of individuals based on genetic markers, for example in forensic science and paternity determinations, have become increasingly important. In forensic science, for example, the identification of individuals by genetic polymorphism analysis, such as short tandem repeat (STR) polymorphisms, has become widely accepted by courts as evidence.
[0003] While forensic geneticists have developed many techniques to compare homologous segments of DNA to determine if the segments are identical or if they differ in one or more nucleotides, each technique still has certain disadvantages. In particular, the techniques vary widely in terms of expense of analysis, time required to carry out an analysis and statistical power. Some of the known methods are set forth below.
[0004] The best known and most widespread method in forensic DNA typing is restriction fragment length polymorphism (RFLP) analysis. In RFLP testing, a repetitive DNA sequence referred to as a variable number tandem repeat (VNTR), in which the number of repeats varies between individuals, is analyzed. The core repeat is typically a sequence of about 15 base pairs in length, and highly polymorphic VNTR loci can have an average of about 20 alleles (or types of VNTR). DNA restriction sites located on either side of the VNTR are exploited to create DNA fragments from about 0.5 Kb to less than 10 Kb, which are then separated by electrophoresis, indicating the number of repeats found in the individual at the particular locus. RFLP methods generally consist of (1) extraction and isolation of DNA; (2) restriction endonuclease digestion; (3) separation of DNA fragments by electrophoresis; (4) capillary transfer; (5) hybridization with radiolabelled probes; (6) autoradiography; and (7) interpretation of results (Lee, H. C. et al., Am. J. Forensic. Med. Pathol. 15(4): 269-282 (1994)). 
RFLP methods generally combine analysis at about 5 loci and have much higher discriminatory potential than other available tests due to the highly polymorphic nature of the VNTRs. The probability that any two people, except identical twins, will have the same pattern of VNTRs at 5 or more loci is very low. However, autoradiography is costly and time consuming, and an analysis generally takes weeks or months for turnaround. Additionally, a large amount of sample DNA is required, which is often not available at a crime scene. Furthermore, the reliability of the system and its credibility as evidence are decreased because the analysis of tightly spaced bands on electrophoresis results in a high rate of error.
[0005] PCR-based methods offer an alternative to RFLP methods. In a first method called AmpFLP, DNA fragments containing VNTRs are amplified and then separated electrophoretically, without the restriction step of the RFLP method. While this method allows small quantities of sample DNA to be used, decreases analysis time by avoiding autoradiography, and retains high discriminatory potential, it nevertheless requires electrophoretic separation that takes substantial time and introduces a significant error rate. In another AmpFLP method, short tandem repeats (STRs) of 2 to 8 base pairs are analyzed. STRs are more suitable for analysis of degraded DNA samples since they require smaller amplified fragments, but they have the disadvantage of requiring separation of the amplified fragments. While STRs are far less informative than longer repeats, similar discriminatory potential can be achieved if enough STRs are used in a single analysis.
[0006] Other methods include sequencing of mitochondrial DNA, which is especially suitable for situations where sample DNA is very degraded or in small quantities. 
However, only a small region of 1 Kb of the mitochondrial DNA, referred to as the D-Loop locus, has been found useful for typing because of its polymorphic nature, resulting in lower discriminatory potential than with RFLP or AmpFLP methods. Furthermore, DNA sequencing is expensive to carry out on a large number of samples.
[0007] Further available methods include dot-blot methods, which involve using allele-specific oligonucleotide probes that hybridize sequence-specifically to one allele of a polymorphic site. Systems include the HLA DQ-alpha kit developed by Cetus Corp., which has a discriminatory value of about 1 in 20, and a dot-blot strip referred to as the Polymarker strip, which combines five genetic loci for a discriminatory value of about one in a few thousand (Weedn, V., Clinics in Lab. Med. 16(1): 187-196 (1996)).
Summary
[0008] Featured herein are methods for identifying an individual organism based on one or more single nucleotide polymorphisms, nucleic acid molecules useful in such methods and kits containing reagents useful in such methods. SNP markers, sets of SNP markers, probes, primers, and methods for determining the identity of a nucleotide at the SNP markers are also encompassed and are further described herein, and may encompass any further limitation described in this disclosure, alone or in any combination.
[0009] As compared to published methods, the methods provided herein provide equal or better discriminatory power for identifying an individual while consistently reducing the cost and time of making the determination. One factor that reduces costs and time is multiplexing. The MassEXTEND® primers and assays set forth in Tables 6, 7 and 8 have been optimized so that the same terminator mix can be used for all assays. This improves the efficiency of the assays and enables a greater ease of automation and also allows for grouping of assays into a multiplex format. Another factor that reduces costs and time is performing an identification procedure with informative SNP markers, which minimizes the number of SNP markers required for the determination. 
Thus, in one embodiment there is provided a method for identifying an individual organism based on one or more single nucleotide polymorphisms, which comprises (a) obtaining or possessing a nucleic acid sample known to be from the individual organism; (b) obtaining or possessing a nucleic acid sample which may or may not be from, or which may or may not be derived from, the individual organism; (c) analyzing the nucleic acid sample of step (a) to detect the identity of at least six single nucleotide polymorphisms selected from the group consisting of the polymorphisms set forth in Table 5 (e.g., determining the sequence of a polymorphic variant at a particular position in the nucleic acid); (d) analyzing the nucleic acid sample of step (b) to detect the identity of the at least six single nucleotide polymorphisms selected from the group consisting of the polymorphisms set forth in Table 5; and (e) comparing the results of steps (c) and (d), whereby if the identity of each of the six single nucleotide polymorphisms detected in step (c) is the same as the identity of each single nucleotide polymorphism detected in step (d), then the nucleic acid sample of step (b) can be said to have come from the individual organism in step (a), thereby confirming the identity of the source of the nucleic acid sample from the individual organism. Other embodiments may vary the number of SNPs selected, depending upon the power of discrimination required. Therefore, more or fewer than six SNPs may be used. Also, if the relevant information sought is the degree of relatedness between the nucleic acid sample and the individual or between two nucleic acid samples (e.g., do they come from siblings or cousins), then the degree of identity required between the SNP matches may be lessened. For example, if one half of the SNPs match between samples (e.g., three of the six SNPs), that would be indicative of siblings. 
If one quarter of the SNPs match, that would be indicative of a match for cousins, and so on.
[0010] In another embodiment there is provided a method for identifying a known individual organism based on one or more single nucleotide polymorphisms, which comprises the following steps: (a) obtaining or possessing a nucleic acid sample from the known individual organism; (b) obtaining or possessing a nucleic acid sample which may or may not be from, or which may or may not be derived from, the known individual organism; (c) analyzing the nucleic acid sample of step (a) to detect the identity of at least six single nucleotide polymorphisms selected from the group consisting of the polymorphisms set forth in Table 5; (d) analyzing the nucleic acid sample of step (b) to detect the identity of at least six single nucleotide polymorphisms selected from the group consisting of the polymorphisms set forth in Table 5; and (e) comparing the results of steps (c) and (d), whereby if the identity of each of the six single nucleotide polymorphisms detected in step (c) is the same as the identity of each of the six single nucleotide polymorphisms detected in step (d), then the nucleic acid sample of step (b) can be said to have come from the individual organism in step (a), thereby confirming the identity of that individual organism as the source of the nucleic acid sample in step (a).
[0011] In another embodiment, the nucleic acid sample is obtained by extraction from a biological source that is associated with a non-biological source. In a further embodiment, the non-biological source is selected from the group consisting of fabric, carpeting, currency, leather, cordage, tobacco products, hard-surfaced objects, and a biological specimen other than the biological source that is the source of the nucleic acid. 
In yet another embodiment, the nucleic acid sample is obtained from human tissue, wherein the human tissue is selected from the group of human tissue consisting of epidermis, blood, semen, vaginal cells, hair, saliva, vomit, urine, feces, bone, buccal sample, amniotic fluid containing placental cells or fetal cells, and mixtures of any of the tissues listed above.
[0012] In another embodiment there is provided a method for identifying an individual organism based on single nucleotide polymorphisms, which comprises: (a) obtaining or possessing a nucleic acid sample known to be from the individual organism; (b) obtaining or possessing a nucleic acid sample which may or may not be from, or which may or may not be derived from, the individual organism; (c) analyzing the nucleic acid sample of step (a) to detect the identity of each single nucleotide polymorphism selected from the group consisting of any subset of the polymorphisms set forth in Table 5; (d) analyzing the nucleic acid sample of step (b) to detect the identity of each single nucleotide polymorphism selected from the group consisting of any 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94 or 95 of the polymorphisms set forth in Table 5; and (e) comparing the results of steps (c) and (d), whereby if the identity of each single nucleotide polymorphism detected in step (c) is the same as the identity of each single nucleotide polymorphism detected in step (d), then the nucleic acid sample of step (b) can be said to have come from the individual organism in step (a), thereby identifying that individual organism. 
In another embodiment, there is provided a method for identifying an individual organism based on single nucleotide polymorphisms, wherein the number of single nucleotide polymorphisms selected from the group consisting of the polymorphisms set forth in Table 5 is determined by consulting Figure 1 for the desired discriminatory potential. In yet another embodiment, there is provided a method for identifying an individual organism based on single nucleotide polymorphisms, wherein the identity of the single nucleotide polymorphism selected from the group consisting of any 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94 or 95 of the polymorphisms set forth in Table 5 includes the "sex" SNP on chromosome Y.
[0013] A "genetic fingerprint" or "genetic profile" as disclosed in this application, whether based on SNPs, STRs or any other individualized genetic basis, may also be used as follows. When a biological specimen is taken from a subject (for example, but not limited to, a medical patient or a veterinary patient) for, for example, diagnostic or pharmacogenomic testing, the sample may be simultaneously, or sequentially, analyzed for the presence of the genetic profile of the subject whose sample is being tested. In this way, the testing entity can be assured that the sample tested and the diagnostic or pharmacogenomic result obtained correspond to the correct subject. In effect, the genetic profile (e.g., a result of the SNP assays disclosed herein) acts as a "genetic barcode" to ensure that the test result corresponds to the test subject. 
[0014] In a further embodiment there is provided a kit comprising at least one primer pair for amplifying each nucleic acid single nucleotide polymorphism selected from the group consisting of any two of the polymorphisms set forth in Table 5, and (a) a compartment comprising the primer pairs set forth in Tables 6, 7 or 8; and (b) instructions for use of reagents in the kit.
[0015] In yet another embodiment, there is provided a method for matching a genetic profile of a known individual organism based on one or more single nucleotide polymorphisms, which comprises (a) obtaining or possessing a nucleic acid sample from the known individual organism; (b) analyzing the nucleic acid sample of step (a) to detect the identity of at least six of the single nucleotide polymorphisms selected from the group consisting of the polymorphisms set forth in Table 5, to obtain an individually-specific result; (c) obtaining or accessing a collection of information containing at least one result of previously generated information on the genetic composition of individual organisms, which previously generated genetic information may contain a result comprising the genetic information of the individual organism of step (a); and (d) comparing the individually-specific result of step (b) with the previously generated information of step (c), whereby if the individually-specific result of step (b) matches the result comprising the genetic information of the individual organism of step (c), then the match of the genetic information of the individual organism is confirmed.
[0016] In another embodiment, the above method may be used with a sample from an unknown individual.
[0017] In other embodiments, samples from two or more unknown individuals may be compared. This is useful for determining if, for example, the same individual, albeit unknown, has committed more than one crime, i.e., is a serial criminal.
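The comparison step common to the embodiments above (genotype the same SNPs in two samples, then check whether every SNP agrees, or what fraction agrees when relatedness is the question) can be sketched as follows. This is a minimal illustration, not the method of the disclosure: the function names and SNP identifiers are hypothetical, genotypes are assumed to be stored as two-letter strings, and the "X" placeholder for an SNP with no identified variant is borrowed from the tables and simply skipped.

```python
# Illustrative sketch only: names and SNP identifiers are hypothetical.
MISSING = "X"  # the tables use "X" where no variant was identified


def compare_profiles(reference: dict, questioned: dict):
    """Return (compared SNP IDs, mismatched SNP IDs) for two genotype profiles.

    Each profile maps an SNP ID to a genotype string such as "AG".
    Allele order is ignored, so "AG" and "GA" count as the same genotype.
    """
    compared, mismatched = [], []
    for snp_id in sorted(set(reference) & set(questioned)):
        ref, qry = reference[snp_id], questioned[snp_id]
        if MISSING in (ref, qry):
            continue  # no call in one sample: not informative
        compared.append(snp_id)
        if sorted(ref) != sorted(qry):
            mismatched.append(snp_id)
    return compared, mismatched


def same_source(reference: dict, questioned: dict, min_snps: int = 6) -> bool:
    """True if at least `min_snps` SNPs were compared and all of them match."""
    compared, mismatched = compare_profiles(reference, questioned)
    return len(compared) >= min_snps and not mismatched


def matching_fraction(reference: dict, questioned: dict) -> float:
    """Fraction of compared SNPs with identical genotypes."""
    compared, mismatched = compare_profiles(reference, questioned)
    if not compared:
        raise ValueError("no SNPs in common to compare")
    return (len(compared) - len(mismatched)) / len(compared)
```

Searching a collection of stored profiles, as in the database embodiment, would amount to calling `same_source` against each stored profile in turn; the default of six SNPs mirrors the minimum recited above and can be raised when more discriminatory power is needed.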
Brief Description of the Figure
[0018] Figure 1 is a graph that shows the estimated performance of the candidate panel provided herein. Under the assumptions of Hardy-Weinberg equilibrium at each SNP for a population at equilibrium under evolutionary forces, the median probability that two randomly selected, unrelated individuals would share the same multi-SNP genotype is shown in Figure 1. Curves are given under three inbreeding levels for the population (Theta). At the level of 35 SNPs with the modest inbreeding coefficient of 0.01, the probability of two randomly selected unrelated individuals sharing the same multi-SNP genotype is one in 2.4 x 10^15. This is comparable to the values reported for the 13 STR panel in U.S. Caucasians (Holt et al. 2000. Foren. Sci. Int. 112:91-109).
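To make the Hardy-Weinberg reasoning behind Figure 1 concrete, the following sketch computes the probability that two unrelated individuals share a genotype at one biallelic SNP, and then across a panel. It rests on assumptions that are not taken from the disclosure: the SNPs are treated as independent, and no inbreeding correction (Theta) is applied, so it does not reproduce the curves or the values of Figure 1. The function names are hypothetical.

```python
import math

# Illustrative sketch only: assumes independent biallelic SNPs in
# Hardy-Weinberg equilibrium and no inbreeding correction (Theta = 0).


def snp_match_probability(maf: float) -> float:
    """Probability that two unrelated individuals share a genotype at one SNP.

    With allele frequencies p and q = 1 - p, the genotype frequencies are
    p^2, 2pq and q^2, and two people match if both carry the same genotype.
    """
    if not 0.0 <= maf <= 1.0:
        raise ValueError("allele frequency must be between 0 and 1")
    p, q = maf, 1.0 - maf
    genotype_freqs = (p * p, 2 * p * q, q * q)
    return sum(f * f for f in genotype_freqs)


def panel_match_probability(mafs) -> float:
    """Product of the per-SNP match probabilities across the panel."""
    return math.prod(snp_match_probability(m) for m in mafs)
```

Under these assumptions a SNP with a minor allele frequency near 0.5 contributes the smallest per-SNP match probability, which is one way to read the emphasis on informative markers.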
Detailed Description [0019] There are several applications for DNA typing which require a particularly powerful genotyping system. In a first appHcation, a high powered typing system is advantageous when for example a suspect is identified by searching a DNA profile database such as that maintained by the U.S. Federal Bureau of Investigation (CODIS) or the United Kingdom's National DNA Database (NDNAD). Since databases may contain large numbers of data entries that are expected to increase consistently, currently used forensic systems will need to be modified to prevent matching overlapping DNA profiles. While database searches generally reinforce the evidence by excluding other possible suspects, low powered typing systems resulting in the identification of several individuals may often tend to diminish the overall case against a defendant. [0020] In another application, a target population is systematically tested to identify an individual having the same DNA profile as that of a DNA sample. In such a situation, a defendant is chosen at random based on DNA profile from a large population of innocent individuals. Since the population tested can often be large enough that at least one positive match is identified, and it is usually not possible to exhaustively test a population, the usefulness of the evidence will depend on the level of significance of the forensic test. In order to render such an application useful as a sole or primary source of evidence, DNA typing systems of extremely high discriminatory potential are required. [0021] In yet another appHcation, it is desirable to be able to discrirninate between related individuals or to determine if individuals are related, as in, for example, paternity testing. Because related individuals will be expected to share a large portion of alleles at polymorphic sites, a very high-powered DNA typing assay would be required to discriminate between them. 
This can have important effects if a sample is found to match the defendant's DNA profile and no evidence that the perpetrator is a relative can be found. [0022] Accordingly, provided herein is a rapid, simple, inexpensive and accurate technique having a very high resolution value to determine relationships between individuals and differences in degree of relationships. Also, provided herein is a very accurate genetic relationship test procedure that uses very small amounts of an original DNA sample, yet produces very accurate results. Described herein are methods for the identification of individuals, which comprise determining the identity of the nucleotides at a set of genetic markers in a biological sample, wherein said set of genetic markers comprises at least that number of SNP markers which allows for the useful discrimination between individual organisms, based on their individual pattern of SNPs. The present invention provides an extensive set of SNP markers allowing an effectively high discriminatory potential that is cheaper, faster and more efficient than the genetic markers used in current forensic typing systems. Also, SNP markers can be genotyped in individuals with much higher efficiency and accuracy than the genetic markers used in current forensic typing systems. In preferred embodiments, the invention comprises determining the identity of a nucleotide at a SNP marker by single or multiple nucleotide primer extension, which does not require electrophoresis as in techniques described above and results in a lower rate of experimental error. [0023] As an important application of DNA typing tests is to determine whether a DNA sample (e.g. from a crime scene) originated from an individual suspected of leaving the DNA sample, it is desirable that DNA typing systems have a high discriminatory power. It also is desirable to minimize time consuming laboratory procedures and difficulties in analysis. 
In addition to difficulties in analysis, and time consuming laboratory procedures, it remains desirable for all DNA typing systems to have a higher discriminatory power. Several applications exist in which even the most discriminating tests need improvement in order to remove the considerable remaining doubt resulting from such analyses. Table 1 below lists characteristics of currently available forensic testing systems (Weedn, (1996)) and compares them with the method of the present invention.
TABLE 1
Figure imgf000010_0001
[0024] Any suitable set of genetic markers and SNP markers of the invention may be used, and may be selected according to the discriminatory power desired. SNP markers, primers, and methods for determining the identity of said SNP markers are further described herein. The use of SNPs which segregate independently is preferred. If a conservative estimate of male chromosome sizes is used, 77 unlinked SNPs could be identified. If a more liberal and generally applicable sex averaged estimate is used, unlinked SNPs from the 94 regions provided in Table 2 could be identified. A 100 SNP panel of nearly unlinked SNPs is possible. Additionally, SNP allele frequencies should approach an approximate frequency of 0.500 in most major ethnic groups, and SNPs should not be subject to selection. Additional considerations are set forth below.
TABLE 2
Figure imgf000010_0002
Figure imgf000011_0001
[0025] As noted above, the discriminatory potential of the methods provided herein is equal to or better than the discriminatory potential of reported methods. The discriminatory potential of a SNP marker typing method can be calculated. The discriminatory potential of the forensic test can be determined in terms of the profile frequency, also referred to as the random match probability, by applying the product rule. The product rule involves multiplying the allelic frequencies of all the individual alleles tested, and multiplying by an additional factor of 2 for each heterozygous locus. [0026] In one example discussed below, the discriminatory potential of SNP marker typing can be considered in the context of forensic science. In order to determine the discriminatory potential with respect to the numbers of SNP markers to be used in a genetic typing system, the formulas and calculations below assume that (1) the population under study is sufficiently large (so that we can assume no consanguinity); (2) all markers chosen are not correlated, so that the product rule (Lander and Budowle (1992)) can be applied; and (3) the ceiling rule can be applied or that the allelic frequencies of markers in the population under study are known with sufficient accuracy. [0027] As noted in Weir, B. S., Genetic Data Analysis II: Methods for Discrete Population Genetic Data, Sinauer Assoc, Inc., Sunderland, Mass., USA, 1996, the example assumes a crime has been committed and a sample of DNA from the perpetrator (P) is available for analysis. The genotype of this DNA sample can be determined for several genetic markers, and the profile A of the perpetrator can thereby be determined. [0028] In this example, one suspect (S) is available for typing. The same set of genetic markers, such as the SNP markers of the invention, is typed and the same profile A is obtained for (S) and (P). Two hypotheses are thus presented as follows: (1) either S is P (event $C$); or (2) S is not P (event $\bar{C}$). 
The ratio L of both probabilities can then be calculated using the following equation:
$$L = \frac{\mathrm{pr}(S = A,\ P = A \mid C)}{\mathrm{pr}(S = A,\ P = A \mid \bar{C})}$$
L can then further be calculated by the following equation:
$$L = \frac{1}{\mathrm{pr}(P = A \mid S = A,\ \bar{C})}$$
[0029] These probabilities as well as L can be calculated in several settings, notably for different kinship coefficients between P and S for a genetic marker (see Weir, (1996)). Assuming that all genetic markers chosen are independent of each other, the global ratio L for a set of genetic markers will be the product over each genetic marker of all L. It is further possible to estimate the mean number of SNP markers or VNTRs required to have a ratio L equal to 10^8 or 10^6 by calculating the expectancy of the random variable L using the following equation:
$$E(L) = \prod_{i=1}^{N} E(L_i)$$
where N is the number of loci and
$$E(L_i) = \sum_{j=1}^{G_i} \mathrm{pr}(P = A_{ij} \mid S = A_{ij},\ \bar{C})\, L_{ij}$$
where $A_{ij}$ is the genotype j at the ith marker, $L_{ij}$ the ratio associated with such genotype, $G_i$ being the number of genotypes at locus i. From equation 1, it can easily be derived that the expectancy of $L_i$ is $G_i$, the number of possible genotypes of this marker. [0030] The general expectancy for a set of genetic markers can then be expressed by the following equation:
Figure imgf000012_0001
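As a minimal sketch of the product rule of paragraph [0025] and of the per-locus expectancy derived above, the following Python fragment computes a profile frequency for unrelated individuals under Hardy-Weinberg proportions, takes L as its reciprocal, and checks numerically that the expectancy of L at a biallelic SNP equals its number of genotypes, 3. The function names, the allele labels and the allele frequencies are illustrative assumptions and are not values from the panel described herein.

```python
from math import prod


def genotype_frequency(genotype, freqs):
    """Product rule at one locus: multiply the two allele frequencies and
    apply the extra factor of 2 when the locus is heterozygous."""
    a, b = genotype
    f = freqs[a] * freqs[b]
    return f if a == b else 2 * f


def profile_frequency(profile, panel_freqs):
    """Random match probability of a multi-locus profile, assuming the
    loci are independent (unlinked, uncorrelated markers)."""
    return prod(genotype_frequency(g, f) for g, f in zip(profile, panel_freqs))


def expected_ratio(freqs):
    """E(L_i) = sum over genotypes j of pr(A_ij) * L_ij with L_ij = 1 / pr(A_ij).
    Each term is 1, so the sum is the number of genotypes G_i."""
    alleles = sorted(freqs)
    genotypes = [(a, b) for i, a in enumerate(alleles) for b in alleles[i:]]
    return sum(
        genotype_frequency(g, freqs) * (1.0 / genotype_frequency(g, freqs))
        for g in genotypes
    )


# Hypothetical three-SNP panel and profile (illustrative numbers only).
panel = [{"A": 0.5, "G": 0.5}, {"C": 0.4, "T": 0.6}, {"A": 0.5, "T": 0.5}]
profile = [("A", "G"), ("T", "T"), ("A", "A")]

freq = profile_frequency(profile, panel)  # 0.5 * 0.36 * 0.25 = 0.045
print(f"profile frequency = {freq:.3f}, L = {1 / freq:.1f}")
print(f"E(L_i) for a biallelic SNP = {expected_ratio({'A': 0.3, 'G': 0.7}):.1f}")
```

Because each locus contributes a factor equal to its genotype count, the expectancy for N biallelic SNPs in this sketch is 3 multiplied by itself N times.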
[0031] Using the equations described above, it is possible to select SNP marker-based DNA typing systems having a desired discriminatory potential. Using SNP markers, E(L) can thus be expressed as 3^N. When using VNTR-based DNA typing systems, assuming the VNTRs have 10 alleles, E(L) can be expressed as 55^N. Based on these results, the number of SNP markers or VNTRs needed to obtain, in mean, a ratio of at least 10^6 or 10^8 can be calculated, and are set forth below in Table 3.
TABLE 3
Figure imgf000013_0001
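The marker counts implied by E(L) = 3^N and E(L) = 55^N can be checked with a short calculation. The Python sketch below is an illustration under the stated flat-distribution assumption; the helper names `genotype_count` and `markers_needed` are hypothetical. It finds the smallest N for which the per-locus ratio raised to the power N reaches a target, and it applies the same function to the uneven case taken up in paragraph [0033], where a profile homozygous for a major allele of frequency 0.9 contributes a per-locus ratio of only 1/0.81.

```python
def genotype_count(n_alleles):
    """Number of distinct genotypes at a locus with n_alleles alleles."""
    return n_alleles * (n_alleles + 1) // 2


def markers_needed(per_locus_ratio, target):
    """Smallest N such that per_locus_ratio ** N >= target."""
    if per_locus_ratio <= 1:
        raise ValueError("per-locus ratio must exceed 1")
    n, total = 0, 1
    while total < target:
        total *= per_locus_ratio
        n += 1
    return n


snp = genotype_count(2)    # 3 genotypes for a biallelic SNP
vntr = genotype_count(10)  # 55 genotypes for a 10-allele VNTR

for target in (10**6, 10**8):
    print(target, markers_needed(snp, target), markers_needed(vntr, target))
# 1000000 13 4
# 100000000 17 5

# Worst case of [0033]: homozygous for the major allele (0.9) at every locus.
worst = 1 / 0.9**2
print(markers_needed(worst, 10**6), markers_needed(worst, 10**8))
# 66 88
```

The SNP counts of 13 and 17 and the worst-case counts of 66 and 88 agree with the figures stated in the text; the VNTR counts are simply what this sketch computes from 55^N.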
[0032] Thus, in one embodiment, DNA typing systems and methods of the invention may comprise genotyping a set of at least 13 or at least 17 SNP markers to obtain a ratio of at least 10^6 or 10^8, assuming a flat distribution of L across the SNP markers. In certain embodiments, a greater number of SNP markers is genotyped to obtain a higher L value. In specific embodiments, at least 1, 2, 3, 4, 5, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 37, 40, 50, 56, 60, 70, 80, 90, 100 or more of the SNP markers are genotyped. [0033] In situations where the distribution of L is not flat, such as in the worst case when the perpetrator is homozygous for the major allele at each genetic locus and L thus takes the lowest value, a larger number of SNP markers is required for the same discriminatory potential. Therefore, in preferred embodiments, DNA typing systems and methods of the invention using a larger number of SNP markers allow for uneven distributions of L across the SNP markers. For example, assuming unrelated individuals, a set of independent markers having an allelic frequency of 0.1/0.9, and the genetic profile of a homozygote for the major allele at each genetic locus, 66 SNP markers are required to obtain a ratio of 10^6, and 88 SNP markers are required to obtain a ratio of 10^8. Thus, in preferred embodiments based on the use of markers having a major allele of sufficiently high frequency, this is a first estimation of the upper bound of markers required in a DNA typing system. [0034] In further embodiments, it is also desirable to have the ability to discriminate between relatives. Although unrelated individuals have a low probability of sharing genetic profiles, the probability is greatly increased for relatives. For example, the DNA profile of a suspect matches the DNA profile of a sample at a crime scene, and the probability of obtaining the same DNA profile if left by an untyped relative is required. 
Table 4 below (Weir (1996)) lists probabilities for several different types of relationships, assuming alleles A_i and A_j, and population frequencies p_i and p_j, and lists likelihood ratios assuming genetic loci having allele frequencies of 0.1.
TABLE 4
Figure imgf000014_0001
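Table 4 is available here only as an image, so its entries are not reproduced. As a hedged illustration of the kind of calculation paragraph [0034] describes, the Python sketch below computes the probability that a relative shares a given single-locus genotype from standard identity-by-descent coefficients (k0, k1, k2), and the corresponding likelihood ratio as its reciprocal. The coefficient table, the function names and the choice of relationships are assumptions made for this sketch and are not taken from Table 4.

```python
# Probabilities (k0, k1, k2) that two relatives share 0, 1 or 2 alleles
# identical by descent at a locus (standard values, non-inbred pedigrees).
IBD = {
    "full siblings": (0.25, 0.50, 0.25),
    "parent and child": (0.00, 1.00, 0.00),
    "half siblings": (0.50, 0.50, 0.00),
    "first cousins": (0.75, 0.25, 0.00),
    "unrelated": (1.00, 0.00, 0.00),
}


def match_probability(relationship, p_i, p_j=None):
    """Probability that a relative has the same genotype as the typed person.

    Pass only p_i for a homozygote A_i A_i, or p_i and p_j for a
    heterozygote A_i A_j.
    """
    k0, k1, k2 = IBD[relationship]
    if p_j is None:
        share_none = p_i * p_i       # both alleles drawn from the population
        share_one = p_i              # one allele shared, one drawn
    else:
        share_none = 2 * p_i * p_j
        share_one = (p_i + p_j) / 2
    return k2 + k1 * share_one + k0 * share_none


def likelihood_ratio(relationship, p_i, p_j=None):
    """Single-locus L when the alternative source is the named relative."""
    return 1.0 / match_probability(relationship, p_i, p_j)


for rel in IBD:
    print(f"{rel:17s} heterozygote L = {likelihood_ratio(rel, 0.1, 0.1):5.1f}")
```

With both allele frequencies set to 0.1, the single-locus ratio for a heterozygote falls from 50 for an unrelated person to 10 for a parent or child and to about 3.3 for a full sibling, which is why discriminating between relatives requires more markers.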
[0035] In yet further embodiments, the DNA typing systems and methods of the present invention may further take into account effects of subpopulations on the discriminatory potential. In embodiments described above for example, DNA typing systems consider close familial relationships, but do not take into account membership in the same population. While population membership is expected to have little effect, the invention may further comprise genotyping a larger set of SNP markers to achieve higher discriminatory potential. Alternatively, a larger set of SNP markers may be optimized for typing selected populations; alternatively, the ceiling principle may be used to study allele frequencies from individuals in various populations of interest, taking for any particular genotype the maximum allele frequency found among the populations. [0036] Any markers known in the art may be used with the SNP markers of the present invention in the DNA typing methods and systems described herein, for example in any one of the following web sites offering collections of SNPs and information about those SNPs: The Genetic Annotation Initiative (http address cgap.nci.nih.gov/GAI). An NIH-run site which contains information on candidate SNPs thought to be related to cancer and tumorigenesis generally. dbSNP Polymorphism Repository (http address www.ncbi.nlm.nih.gov/SNP/). A more comprehensive NIH-run database containing information on SNPs with broad applicability in biomedical research. HUGO Mutation Database Initiative (http address ariel.ucs.unimelb.edu.au:80/cotton mdi.htm). A database meant to provide systematic access to information about human mutations including SNPs. This site is maintained by the Human Genome Organization (HUGO). Human SNP Database (http address www-genome.wi.mit.edu/SNP/human/index.html). 
Managed by the Whitehead Institute for Biomedical Research Genome Institute, this site contains information about SNPs resulting from the many Whitehead research projects on mapping and sequencing. SNPs in the Human-Genome SNP database (http address www.ibc.wustl.edu/SNP). This website provides access to SNPs that have been organized by chromosomes and cytogenetic location. The site is run by Washington University. HGBase (http address hgbase.cgr.ki.se/). HGBASE is an attempt to summarize all known sequence variations in the human genome, to facilitate research into how genotypes affect common diseases, drug responses, and other complex phenotypes, and is run by the Karolinska Institute of Sweden. The SNP Consortium Database (http address snp.cshl.org/db/snp/map). A collection of SNPs and related information resulting from the collaborative effort of a number of large pharmaceutical and information processing companies. GeneSNPs (http address www.genome.utah.edu/genesnps/). Run by the University of Utah, this site contains information about SNPs resulting from the U.S. National Institute of Environmental Health's initiative to understand the relationship between genetic variation and response to environmental stimuli and xenobiotics. [0037] In addition, SNP markers provided in the following patents and patent applications may also be used with the SNP markers of the invention in the DNA typing methods and systems described above: PCT/IB00/00184, filed Feb. 11, 2000; PCT/IB98/01193, filed Jul. 17, 1998; PCT Publication No. WO 99/54500, filed Apr. 21, 1999; and PCT/IB00/00403, filed Mar. 24, 2000. [0038] As used herein, the term "nucleic acid" includes DNA molecules (e.g., a complementary DNA (cDNA) and genomic DNA (gDNA)) and RNA molecules (e.g., mRNA, rRNA, and tRNA) and analogs of DNA or RNA, for example, by use of nucleotide analogs. The nucleic acid molecule can be single-stranded and it often is double-stranded. 
The term "isolated or purified nucleic acid" refers to nucleic acids that are separated from other nucleic acids present in the natural source of the nucleic acid. For example, with regard to genomic DNA, the term "isolated" includes nucleic acids which are separated from the chromosome with which the genomic DNA is naturally associated. An "isolated" nucleic acid often is free of sequences which naturally flank the nucleic acid (i.e., sequences located at the 5' and/or 3' ends of the nucleic acid) in the genomic DNA of the organism from which the nucleic acid is derived. For example, in various embodiments, the isolated nucleic acid molecule can contain less than about 5 kb, 4 kb, 3 kb, 2 kb, 1 kb, 0.5 kb or 0.1 kb of 5' and/or 3' nucleotide sequences which flank the nucleic acid molecule in genomic DNA of the cell from which the nucleic acid is derived. Moreover, an "isolated" nucleic acid molecule, such as a cDNA molecule, often is substantially free of other cellular material, or culture medium when produced by recombinant techniques, or substantially free of chemical precursors or other chemicals when chemically synthesized. As used herein, the term "gene" refers to a nucleotide sequence that encodes a polypeptide. [0039] Also included herein are nucleic acid fragments. These fragments often are a nucleotide sequence identical to a nucleotide sequence in Tables 6, 7 or 8, a nucleotide sequence substantially identical to a nucleotide sequence in Tables 6, 7 or 8, or a nucleotide sequence that is complementary to the foregoing. The nucleic acid fragment may be identical, substantially identical or homologous to a nucleotide sequence in Tables 6, 7 or 8. Further, the nucleic acid fragment may encode a full-length or mature polypeptide of the invention, or the nucleic acid fragment may encode a domain or part of a domain of a polypeptide of the invention. [0040] An example of a nucleic acid fragment is an oligonucleotide. 
As used herein, the term "oligonucleotide" refers to a nucleic acid comprising about 8 to about 50 covalently linked nucleotides, often comprising from about 8 to about 35 nucleotides, and more often from about 10 to about 25 nucleotides. The backbone and nucleotides within an oligonucleotide may be the same as those of naturally occurring nucleic acids, or analogs or derivatives of naturally occurring nucleic acids, provided that oligonucleotides having such analogs or derivatives retain the ability to hybridize specifically to a nucleic acid comprising a targeted polymorphism. Oligonucleotides described herein may be used as hybridization probes or as components of assays, for example, as described herein. [0041] Oligonucleotides typically are synthesized using standard methods and equipment, such as the ABI™ 3900 High Throughput DNA Synthesizer and the EXPEDITE™ 8909 Nucleic Acid Synthesizer, both of which are available from Applied Biosystems (Foster City, CA). Analogs and derivatives are exemplified in U.S. Pat. Nos. 4,469,863; 5,536,821; 5,541,306; 5,637,683; 5,637,684; 5,700,922; 5,717,083; 5,719,262; 5,739,308; 5,773,601; 5,886,165; 5,929,226; 5,977,296; 6,140,482; WO 00/56746; WO 01/14398, and related publications. Methods for synthesizing oligonucleotides comprising such analogs or derivatives are disclosed, for example, in the patent publications cited above and in U.S. Pat. Nos. 5,614,622; 5,739,314; 5,955,599; 5,962,674; 6,117,992; in WO 00/75372; and in related publications. [0042] Oligonucleotides also may be linked to a second moiety. The second moiety may be an additional nucleotide sequence such as a tail sequence (e.g., a polyadenosine tail), an adapter sequence (e.g., phage M13 universal tail sequence), and others. Alternatively, the second moiety may be a non-nucleotide moiety such as a moiety which facilitates linkage to a solid support or a label to facilitate detection of the oligonucleotide. 
Such labels include, without limitation, a radioactive label, a fluorescent label, a chemiluminescent label, a paramagnetic label, and the like. The second moiety may be attached to any position of the oligonucleotide, provided the oligonucleotide can hybridize to the nucleic acid comprising the polymorphism. [0043] Nucleic acids substantially identical to those described herein can be utilized. Substantially identical nucleic acids sometimes are 90% or more identical to a reference nucleic acid. Calculations of sequence identity often are performed as follows. Sequences are aligned for optimal comparison purposes (e.g., gaps can be introduced in one or both of a first and a second amino acid or nucleic acid sequence for optimal alignment and non-homologous sequences can be disregarded for comparison purposes). The length of a reference sequence aligned for comparison purposes is sometimes 30% or more, 40% or more, 50% or more, often 60% or more, and more often 70%, 80%, 90%, 100% of the length of the reference sequence. The nucleotides or amino acids at corresponding nucleotide or polypeptide positions, respectively, are then compared among the two sequences. When a position in the first sequence is occupied by the same nucleotide or amino acid as the corresponding position in the second sequence, the nucleotides or amino acids are deemed to be identical at that position. The percent identity between the two sequences is a function of the number of identical positions shared by the sequences, taking into account the number of gaps, and the length of each gap, introduced for optimal alignment of the two sequences. [0044] Comparison of sequences and determination of percent identity between two sequences can be accomplished using a mathematical algorithm. 
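By way of illustration only, the position-by-position comparison of paragraph [0043] can be sketched in Python for two sequences that have already been aligned, with gaps written as "-". The function name and the convention used here, identical positions divided by the number of aligned columns so that every gap position lowers the score, are assumptions made for this sketch; alignment programs may apply other conventions and gap weights.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity of two pre-aligned sequences of equal length.

    Columns in which both sequences carry a gap are ignored; a column with
    a gap in only one sequence counts as a non-identical position.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    columns = [
        (x, y) for x, y in zip(aligned_a, aligned_b)
        if not (x == gap and y == gap)
    ]
    if not columns:
        raise ValueError("no aligned positions to compare")
    identical = sum(1 for x, y in columns if x == y)
    return 100.0 * identical / len(columns)


# Hypothetical aligned fragments: one gap and one mismatch over 9 columns.
print(f"{percent_identity('ACGT-ACGT', 'ACGTTACGA'):.1f}%")  # 77.8%
```

Under this convention a nucleic acid would meet the "90% or more identical" threshold mentioned above only if at least nine of every ten aligned columns carry the same nucleotide.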
Percent identity between two amino acid or nucleotide sequences can be determined using the algorithm of Meyers & Miller, CABIOS 4: 11-17 (1989), which has been incorporated into the ALIGN program (version 2.0), using a PAM120 weight residue table, a gap length penalty of 12 and a gap penalty of 4. Also, percent identity between two amino acid sequences can be determined using the Needleman & Wunsch, J. Mol. Biol. 48: 444-453 (1970) algorithm which has been incorporated into the GAP program in the GCG software package (available at the http address www.gcg.com), using either a BLOSUM 62 matrix or a PAM250 matrix, and a gap weight of 16, 14, 12, 10, 8, 6, or 4 and a length weight of 1, 2, 3, 4, 5, or 6. Percent identity between two nucleotide sequences can be determined using the GAP program in the GCG software package (available at http address www.gcg.com), using a NWSgapdna.CMP matrix and a gap weight of 40, 50, 60, 70, or 80 and a length weight of 1, 2, 3, 4, 5, or 6. A set of parameters often used is a BLOSUM 62 scoring matrix with a gap open penalty of 12, a gap extend penalty of 4, and a frameshift gap penalty of 5. [0045] Another manner for determining if two nucleic acids are substantially identical is to assess whether a polynucleotide homologous to one nucleic acid will hybridize to the other nucleic acid under stringent conditions. As used herein, the term "stringent conditions" refers to conditions for hybridization and washing. Stringent conditions are known to those skilled in the art and can be found in Current Protocols in Molecular Biology, John Wiley & Sons, N.Y., 6.3.1-6.3.6 (1989). Aqueous and non-aqueous methods are described in that reference and either can be used. An example of stringent hybridization conditions is hybridization in 6X sodium chloride/sodium citrate (SSC) at about 45°C, followed by one or more washes in 0.2X SSC, 0.1% SDS at 50°C. 
Another example of stringent hybridization conditions is hybridization in 6X sodium chloride/sodium citrate (SSC) at about 45°C, followed by one or more washes in 0.2X SSC, 0.1% SDS at 55°C. A further example of stringent hybridization conditions is hybridization in 6X sodium chloride/sodium citrate (SSC) at about 45°C, followed by one or more washes in 0.2X SSC, 0.1% SDS at 60°C. Often, stringent hybridization conditions are hybridization in 6X sodium chloride/sodium citrate (SSC) at about 45°C, followed by one or more washes in 0.2X SSC, 0.1% SDS at 65°C. More often, stringency conditions are 0.5M sodium phosphate, 7% SDS at 65°C, followed by one or more washes at 0.2X SSC, 1% SDS at 65°C. [0046] The presence or absence of a polymorphic variant is detected in one or both chromosomal complements represented in the nucleic acid sample. Determining the presence or absence of a polymorphic variant in both chromosomal complements represented in a nucleic acid sample from a subject having a copy of each chromosome is useful for determining the zygosity of an individual for the polymorphic variant (i.e., whether the individual is homozygous or heterozygous for the polymorphic variant). Any detection method known in the art may be utilized to determine whether a sample includes the presence or absence of a polymorphic variant described herein. While many detection methods include a process in which a DNA region carrying the polymorphic site of interest is amplified, ultrasensitive detection methods which do not require amplification may be utilized in the detection method, thereby eliminating the amplification process. Polymorphism detection methods known in the art include, for example, primer extension or microsequencing methods, ligase sequence determination methods (e.g., U.S. Pat. Nos. 5,679,524 and 5,952,174, and WO 01/27326), mismatch sequence determination methods (e.g., U.S. Pat. Nos. 
5,851,770; 5,958,692; 6,110,684; and 6,183,958), microarray sequence determination methods, restriction fragment length polymorphism (RFLP) procedures, PCR-based assays (e.g., TAQMAN® PCR System (Applied Biosystems)), nucleotide sequencing methods, hybridization methods, conventional dot blot analyses, single strand conformational polymorphism analysis (SSCP, e.g., U.S. Patent Nos. 5,891,625 and 6,013,499; Orita et al, Proc. Natl. Acad. Sci. U.S.A. 86: 2766-2770 (1989)), denaturing gradient gel electrophoresis (DGGE), heteroduplex analysis, mismatch cleavage detection, and techniques described in Sheffield et al, Proc. Natl. Acad. Sci. USA 49: 699-706 (1991), White et al, Genomics 12: 301-306 (1992), Grompe et al, Proc. Natl. Acad. Sci. USA 86: 5855-5892 (1989), and Grompe, Nature Genetics 5: 111-117 (1993). Furthermore, those of skill in the art can utilize the determined nucleotide sequence flanking a polymorphic site in a database search to determine where the polymorphic site is located in genomic DNA (e.g., a BLAST search may be utilized to determine genomic orientation). [0047] Primer extension polymorphism detection methods, also referred to herein as "microsequencing" methods, typically are carried out by hybridizing a complementary oligonucleotide to a nucleic acid carrying the polymorphic site. In these methods, the oligonucleotide typically hybridizes adjacent to the polymorphic site. As used herein, the term "adjacent" refers to the 3' end of the extension oligonucleotide being sometimes 1 nucleotide from the 5' end of the polymorphic site, often 2 or 3, and at times 4, 5, 6, 7, 8, 9, or 10 nucleotides from the 5' end of the polymorphic site, in the nucleic acid when the extension oligonucleotide is hybridized to the nucleic acid. 
The extension oligonucleotide then is extended by one or more nucleotides, often 1, 2, or 3 nucleotides, and the number and/or type of nucleotides that are added to the extension oligonucleotide determine which polymorphic variant or variants are present. Oligonucleotide extension methods are disclosed, for example, in U.S. Patent Nos. 4,656,127; 4,851,331; 5,679,524; 5,834,189; 5,876,934; 5,908,755; 5,912,118; 5,976,802; 5,981,186; 6,004,744; 6,013,431; 6,017,702; 6,046,005; 6,087,095; 6,210,891; and WO 01/20039. The extension products can be detected in any manner, such as by fluorescence methods (see, e.g., Chen & Kwok, Nucleic Acids Research 25: 347-353 (1997) and Chen et al, Proc. Natl. Acad. Sci. USA 94/20: 10756-10761 (1997)) and by mass spectrometric methods (e.g., MALDI-TOF mass spectrometry). Oligonucleotide extension methods using mass spectrometry are described, for example, in U.S. Patent Nos. 5,547,835; 5,605,798; 5,691,141; 5,849,542; 5,869,242; 5,928,906; 6,043,031; 6,194,144; and 6,258,538. [0048] Microsequencing detection methods often incorporate an amplification process that precedes the extension step. The amplification process typically amplifies a region from a nucleic acid sample that comprises the polymorphic site. Amplification can be carried out by utilizing a pair of oligonucleotide primers in a polymerase chain reaction (PCR), in which one oligonucleotide primer typically is complementary to a region 3' of the polymorphism and the other typically is complementary to a region 5' of the polymorphism. A PCR primer pair may be used in methods disclosed in U.S. Patent Nos. 4,683,195; 4,683,202; 4,965,188; 5,656,493; 5,998,143; 6,140,054; WO 01/27327; and WO 01/27329 for example. PCR primer pairs may also be used in any commercially available machines that perform PCR, such as any of the GENEAMP® Systems available from Applied Biosystems. 
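The logic of a single-base primer extension call, as described in paragraphs [0046] and [0047], can be sketched in Python as follows. This is a simplified illustration that assumes the 3' end of the extension oligonucleotide sits immediately next to the polymorphic site and that exactly one terminating nucleotide is added; the sequences and function names are hypothetical and do not correspond to any assay in the tables. The base reported is the one added to the extension oligonucleotide, that is, the complement of the template base at the polymorphic site.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of a DNA sequence written 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


def extended_base(template, primer):
    """Nucleotide added to the primer's 3' end in a single-base extension.

    The primer anneals to the template at the reverse complement of its own
    sequence; the polymerase then copies the template base lying just 5' of
    that binding site.
    """
    site = reverse_complement(primer)
    start = template.find(site)
    if start < 1:
        raise ValueError("primer does not anneal downstream of a template base")
    return template[start - 1].translate(COMPLEMENT)


def call_genotype(templates, primer):
    """Genotype and zygosity from the template copies present in a sample."""
    bases = sorted({extended_base(t, primer) for t in templates})
    zygosity = "homozygous" if len(bases) == 1 else "heterozygous"
    return "/".join(bases), zygosity


# Hypothetical template strands differing only at the polymorphic site (G or A).
binding_site = "ATCGGATCCA"
primer = reverse_complement(binding_site)      # TGGATCCGAT
copy_1 = "TTGAC" + "G" + binding_site
copy_2 = "TTGAC" + "A" + binding_site

print(call_genotype([copy_1, copy_2], primer))  # ('C/T', 'heterozygous')
print(call_genotype([copy_1, copy_1], primer))  # ('C', 'homozygous')
```

In a mass spectrometric readout the two possible extension products differ in mass by the difference between the incorporated nucleotides, which is what allows the added base, and hence the allele, to be read without electrophoresis.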
[0049] Mismatch sequence determination methods typically are based upon the specificity of polymerases and ligases. Polymerization reactions place particularly stringent requirements on correct base pairing of the 3' end of the amplification primer and the joining of two oligonucleotides hybridized to a target DNA sequence is quite sensitive to mismatches close to the ligation site, especially at the 3' end. Discrimination between two alleles can be achieved by allele specific amplification, which is a selective strategy, whereby one of the alleles is amplified without amplification of the other allele. This is accomplished by placing a polymorphic base at the 3' end of one of the amplification primers. Because the extension forms from the 3' end of the primer, a mismatch at or near this position has an inhibitory effect on amplification. Therefore, under appropriate amplification conditions, these primers direct amplification only on their complementary allele. Designing the appropriate allele-specific primer and the corresponding assay conditions is well within the ordinary skill in the art. [0050] Oligonucleotide ligation assays (OLA) utilize two oligonucleotides designed to be capable of hybridizing to abutting sequences of a single strand of a target molecule. One of the oligonucleotides may be biotinylated, and the other may be detectably labeled. If the precise complementary sequence is found in a target molecule, the oligonucleotides will hybridize such that their termini abut, and create a ligation substrate that can be captured and detected. OLA is capable of detecting polymorphic sites and may be advantageously combined with PCR as described by Nickerson et al, Proc. Natl. Acad. Sci. U.S.A. 87: 8923-8927 (1990). In this method, PCR is used to achieve the exponential amplification of target DNA, which is then detected using OLA. [0051] Ligase chain reaction (LCR) detection methods utilize two pairs of probes to exponentially amplify a specific target. 
The sequences of each pair of oligonucleotides are selected to permit the pair to hybridize to abutting sequences of the same strand of the target. Such hybridization forms a substrate for a template-dependent ligase. LCR can be performed with oligonucleotides having the proximal and distal sequences of the same strand of a polymorphic site. In one embodiment, either oligonucleotide will be designed to include the polymorphic site. In such an embodiment, the reaction conditions are selected such that the oligonucleotides can be ligated together only if the target molecule contains or lacks the specific nucleotide(s) that is complementary to the polymorphic site on the oligonucleotide. In another embodiment, the oligonucleotides will not include a polymorphic site, such that when they hybridize to the target molecule, a "gap" is created as described for example in WO 90/01069. This embodiment is termed gap LCR (GLCR). This gap is then "filled" with complementary dNTPs (as mediated by DNA polymerase), or by an additional pair of oligonucleotides. Thus at the end of each cycle, each single strand has a complement capable of serving as a target during the next cycle and exponential allele-specific amplification of the desired sequence is obtained. [0052] Another technique, which may be used to analyze polymorphisms, includes multicomponent integrated systems, which miniaturize and compartmentalize processes such as PCR and capillary electrophoresis reactions in a single functional device. An example of such technique is disclosed in U.S. Pat. No. 5,589,136, which describes the integration of PCR amplification and capillary electrophoresis in chips. [0053] A microarray can be utilized for determining whether a polymorphic variant is present or absent in a nucleic acid sample. A microarray may include any oligonucleotides described herein, and methods for making and using oligonucleotide microarrays suitable for prognostic use are disclosed in U.S. Pat. Nos. 
5,492,806; 5,525,464; 5,589,330; 5,695,940; 5,849,483; 6,018,041; 6,045,996; 6,136,541; 6,142,681; 6,156,501; 6,197,506; 6,223,127; 6,225,625; 6,229,911; 6,239,273; WO 00/52625; WO 01/25485; and WO 01/29259. The microarray typically comprises a solid support and the oligonucleotides may be linked to this solid support by covalent bonds or by non-covalent interactions. The oligonucleotides may also be linked to the solid support directly or by a spacer molecule. A microarray may comprise one or more oligonucleotides complementary to a polymorphic site within a nucleotide sequence in Tables 6, 7 or 8. [0054] Polymorphism detection methods can be carried out within an integrated system. An example of an integrated system is a microfluidic system. These systems comprise a pattern of microchannels designed onto a glass, silicon, quartz, or plastic wafer included on a microchip. The movements of the samples are controlled by electric, electroosmotic or hydrostatic forces applied across different areas of the microchip. The microfluidic system may integrate nucleic acid amplification, microsequencing, capillary electrophoresis and a detection method such as laser-induced fluorescence detection. [0055] A kit may also be utilized for determining whether a polymorphic variant is present or absent in a nucleic acid sample. A kit often comprises one or more pairs of oligonucleotide primers useful for amplifying a fragment of a nucleotide sequence in Tables 6, 7 or 8, or a substantially identical sequence thereof, where the fragment includes a polymorphic site. The kit sometimes comprises a polymerizing agent, for example, a thermostable nucleic acid polymerase such as one disclosed in U.S. Pat. Nos. 4,889,818 or 6,077,664. Also, the kit often comprises an elongation oligonucleotide that hybridizes to a nucleotide sequence in a nucleic acid sample adjacent to the polymorphic site. 
Where the kit includes an elongation oligonucleotide, it also often comprises chain elongating nucleotides, such as dATP, dTTP, dGTP, dCTP, and dITP, including analogs of dATP, dTTP, dGTP, dCTP and dITP, provided that such analogs are substrates for a thermostable nucleic acid polymerase and can be incorporated into a nucleic acid chain elongated from the extension oligonucleotide. Along with chain elongating nucleotides would be one or more chain terminating nucleotides such as ddATP, ddTTP, ddGTP, ddCTP, and the like. In an embodiment, the kit comprises one or more oligonucleotide primer pairs, a polymerizing agent, chain elongating nucleotides, at least one elongation oligonucleotide, and one or more chain terminating nucleotides. Kits optionally include buffers, vials, microtitre plates, and instructions for use. [0056] Forensic matching by microsequencing is further described in Example 1 below.
Example 1 Forensic Matching by Genotyping SNPs
[0057] DNA samples are isolated from forensic specimens of, for example, hair, semen, blood or skin cells by conventional methods. A panel of PCR primers based on a number of the sequences of the invention is then utilized according to the methods described herein to amplify DNA of approximately 500 bases in length from the forensic specimen. The alleles present at each of the selected SNP marker sites of the invention are then identified according to the Examples discussed herein. A simple database comparison of the analysis results determines the differences, if any, between the sequences from a subject individual or from a database and those from the forensic sample. In a preferred method, statistically significant differences between the suspect's DNA sequences and those from the sample conclusively prove a lack of identity. This lack of identity can be proven, for example, with only one sequence. Identity, on the other hand, should be demonstrated with a large number of sequences, all matching. Preferably, a minimum of 1, 2, 3, 4, 5, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 37, 40, 50, 56, 60, 70, 80, 90 or 100 SNP markers of Table 5 are used to test identity between the suspect and the sample. [0058] In Table 5, the SNPs labeled "set37" are a subset of 37 assays selected because they represent a preferred embodiment in that a) they are polymorphic; b) replicate tests consistently give the same results; c) they had a maximum of one drop out in four repeated tests; and d) they work well in a multiplex mode. It should be noted that the SNPs labeled "new" in Table 5 also meet the criteria described in a) through d) above. Also in Table 5, the "rs SNP ID" corresponds to the SNP reference number (e.g., rs2029490). 
The chromosome position refers to the position of the SNP within NCBI's Genome build 34, which may be accessed at the following http address: www.ncbi.nlm.nih.gov. The "sqnm.maf" is the minor allele frequency estimated by the Applicants in a pool of 96 CEPH Whites. The "distance" corresponds to the distance (in Mb) between each marker and the preceding marker in the current. No value is reported if adjacent markers are not on the same chromosome.
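The comparison logic of paragraph [0057] (a single discordant marker excludes identity, whereas identity needs many concordant markers) can be sketched as follows. This is only an illustrative sketch and not the Applicants' software: the function names, the marker identifiers in the usage note and the default threshold of six markers are hypothetical choices made for the example.

```python
from typing import Dict, Optional

# Genotype calls keyed by SNP identifier; None marks a site with no call.
Profile = Dict[str, Optional[str]]


def normalize(call: str) -> str:
    """Treat 'AG' and 'GA' as the same unordered genotype."""
    return "".join(sorted(call.upper()))


def compare_profiles(reference: Profile, evidence: Profile,
                     min_markers: int = 6) -> str:
    """Compare two SNP profiles (min_markers is a hypothetical threshold).

    Returns 'excluded' if any marker typed in both samples differs,
    'consistent' if at least min_markers shared markers all match,
    and 'inconclusive' otherwise.
    """
    # Only markers with a call in both samples can be compared.
    shared = [snp for snp in reference
              if reference[snp] and evidence.get(snp)]
    mismatches = [snp for snp in shared
                  if normalize(reference[snp]) != normalize(evidence[snp])]
    if mismatches:
        return "excluded"
    if len(shared) >= min_markers:
        return "consistent"
    return "inconclusive"
```

For instance, with hypothetical markers `snp1` to `snp6`, two profiles that agree at all six sites return `"consistent"`, while changing a single call in one of them returns `"excluded"`.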
TABLE 5: Genomic Distribution of SNP Panel
Figure imgf000023_0001
Figure imgf000024_0001
Figure imgf000025_0001
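The "distance" column described for Table 5 could be derived along the following lines. This is a minimal sketch under the assumption that the markers are already listed in table order with a chromosome label and a base-pair position; the function name and the tuple layout are hypothetical.

```python
from typing import List, Optional, Tuple

# (snp_id, chromosome, position in base pairs), listed in table order.
Marker = Tuple[str, str, int]


def marker_distances(markers: List[Marker]) -> List[Optional[float]]:
    """Distance in Mb from each marker to the preceding one.

    None is reported for the first marker and whenever the preceding
    marker lies on a different chromosome.
    """
    distances: List[Optional[float]] = []
    previous: Optional[Marker] = None
    for marker in markers:
        _, chrom, pos = marker
        if previous is not None and previous[1] == chrom:
            distances.append(abs(pos - previous[2]) / 1_000_000)
        else:
            distances.append(None)
        previous = marker
    return distances
```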
Assay for Verifying and Genotyping SNPs
[0059] A MassARRAY™ system (mass spectrometry) (Sequenom, Inc.) may be utilized to perform SNP genotyping in a high-throughput fashion. This genotyping platform may be complemented by a homogeneous, single-tube assay method (hME™ or homogeneous MassEXTEND® (Sequenom, Inc.)) in which two genotyping primers anneal to and amplify a genomic target surrounding a polymorphic site of interest. A third primer (the MassEXTEND® primer), which is complementary to the amplified target up to but not including the polymorphism, may then be enzymatically extended one or a few bases through the polymorphic site and then terminated. The MassEXTEND® primers and assays set forth in Tables 6, 7 and 8 have been optimized so that the same terminator mix can be used for all assays. This improves the efficiency of the assays, enables greater ease of automation and also allows for grouping of assays into a multiplex format. The multiplex format may be in the form of 4-, 6-, 8- or 12-plexes. Tables 6, 7 and 8 provide assay designs for 6-, 8- and 12-plexes, respectively. However, combinations of SNP assays and/or MassEXTEND® primers using different termination mixes or in different multiplex formats may be used as well. [0060] For each polymorphism, SpectroDESIGNER™ software (Sequenom, Inc.) may be used to generate a set of PCR primers and a MassEXTEND® primer that may be used to genotype the polymorphism. Other primer design software could be used or one of ordinary skill in the art could manually design primers based on his or her knowledge of the relevant factors and considerations in designing such primers. Tables 6, 7 and 8 show PCR primers and extension primers used for analyzing polymorphisms. They also include the termination mix, which is the same (ACT) for each assay, and the "well", which breaks each assay into 6-, 8- or 12-plexes. For example, Table 6 consists of 15 6-plex assay designs. 
The initial PCR amplification reaction may be performed in a 5 μl total volume containing 1X PCR buffer with 1.5 mM MgCl2 (Qiagen), 200 μM each of dATP, dGTP, dCTP, dTTP (Gibco-BRL), 2.5 ng of genomic DNA, 0.1 units of HotStar DNA polymerase (Qiagen), and 200 nM each of forward and reverse PCR primers specific for the polymorphic region of interest.
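The extension-and-termination step of paragraph [0059] can be illustrated with a toy model. The sketch assumes that the "ACT" termination mix means that A, C and T are supplied as chain-terminating dideoxynucleotides while G is the only extendable deoxynucleotide; that reading is an assumption, and the function name and the sequences in the usage note are hypothetical.

```python
def extend_primer(primer: str, incoming: str, terminators: str = "ACT") -> str:
    """Model primer extension through a polymorphic site.

    primer      -- sequence of the extension primer (5' to 3')
    incoming    -- bases that would be added to the primer's 3' end,
                   in order, starting at the polymorphic site
    terminators -- bases supplied as chain terminators (assumed reading
                   of the termination mix)

    Bases are appended until a terminator base has been incorporated,
    or until the incoming sequence runs out.
    """
    product = primer
    for base in incoming:
        product += base
        if base in terminators:
            break  # a dideoxynucleotide stops further elongation
    return product
```

With a hypothetical primer `TTGAC`, an allele that adds `A` at the polymorphic site yields `TTGACA` (one base added), whereas an allele that adds `G` followed by `A` yields `TTGACGA` (two bases added). The two alleles therefore give products of different length and mass, which is what the mass spectrometric readout distinguishes.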
TABLE 6: 6-plex Assay Designs
Figure imgf000027_0001
Figure imgf000028_0001
Figure imgf000029_0001
TABLE 7: 8-plex Assay Designs
Figure imgf000030_0001
Figure imgf000031_0001
Figure imgf000032_0001
TABLE 8: 12-plex Assay Designs
Figure imgf000033_0001
Figure imgf000034_0001
Figure imgf000035_0001
Table 9 illustrates a different individual in each of 16 columns, comprising African Americans (af), Caucasians (ca), Hispanics (hi) and individuals of unknown ethnicity (controls), as well as males (M) and females (F). Reading down each column, the identity of the nucleotide at each SNP position (e.g., 3277, 9566, etc.) is shown. The combination of the 37 SNPs per person provides a unique pattern for each individual. The number of possible combinations is 3 x 10^37.
Figure imgf000037_0001
Figure imgf000038_0001
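Whether a panel gives every individual in a set a unique pattern, as described for Table 9, can be checked by grouping individuals on their full genotype pattern. The following is a minimal sketch; the function name, the individual labels and the genotype calls in the usage note are hypothetical and are not taken from Table 9.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def indistinguishable_groups(
        profiles: Dict[str, Tuple[str, ...]]) -> List[List[str]]:
    """Return groups of individuals that share an identical pattern.

    profiles maps an individual's label to the genotype calls across
    the panel, in a fixed SNP order. An empty result means that every
    individual has a unique pattern.
    """
    by_pattern: Dict[Tuple[str, ...], List[str]] = defaultdict(list)
    for person, calls in profiles.items():
        by_pattern[tuple(calls)].append(person)
    return [sorted(people) for people in by_pattern.values()
            if len(people) > 1]
```

For example, if hypothetical individuals `af_M1` and `hi_M1` carry the same calls at every SNP while `ca_F1` differs at one site, the function returns `[["af_M1", "hi_M1"]]`; adding a marker at which the two differ would resolve them.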
[0061] The MassEXTEND® reaction may be performed in a total volume of 9 μl, with the addition of 1X ThermoSequenase buffer, 0.576 units of ThermoSequenase (Amersham Pharmacia), 600 nM MassEXTEND® primer, 2 mM of ddATP and/or ddCTP and/or ddGTP and/or ddTTP, and 2 mM of dATP or dCTP or dGTP or dTTP. The deoxy nucleotide (dNTP) used in the assay normally will be complementary to the nucleotide at the polymorphic site in the amplicon. Samples are incubated at 94°C for 2 minutes, followed by 55 cycles of 5 seconds at 94°C, 5 seconds at 52°C, and 5 seconds at 72°C. [0062] Following incubation, samples are desalted by adding 16 μl of water (for a total reaction volume of 25 μl) and 3 mg of SpectroCLEAN™ sample cleaning beads (Sequenom, Inc.) and are allowed to incubate for 3 minutes with rotation. Samples are then robotically dispensed using a piezoelectric dispensing device (SpectroJET™ (Sequenom, Inc.)) onto either 96-spot or 384-spot silicon chips containing a matrix that crystallized each sample (SpectroCHIP® (Sequenom, Inc.)). Subsequently, MALDI-TOF mass spectrometry (Biflex and Autoflex MALDI-TOF mass spectrometers (Bruker Daltonics) can be used) and SpectroTYPER RT™ software (Sequenom, Inc.) can be used to analyze and interpret the SNP genotype for each sample. [0063] Table 10 illustrates a number of multiplex reactions, meaning that PCR amplification and/or MassEXTEND® reactions can be run simultaneously in the same reaction vessel. For example, A01 is a set of multiplex assays in which 1397421, 1912948, 775709 and 934774 are all run together. A02-04 are additional triplicate runs of the same multiplex reactions, which show the reproducibility of the reactions. A total of fourteen four-plexes is set forth. The multiplexes were designed so that the PCR primers do not interfere with each other and share the same formation of the termination mix. Other sets of multiplexes and/or higher multiplexes could be designed by those of ordinary skill in the art. 
"X" indicates that a polymorphic variant at the designated position was not identified for the particular individual.
Figure imgf000040_0001
Figure imgf000040_0002
Figure imgf000041_0001
Figure imgf000042_0001
Figure imgf000043_0001
Figure imgf000044_0001
Figure imgf000045_0001
Figure imgf000046_0001
Figure imgf000047_0001
Figure imgf000048_0001
Figure imgf000049_0001
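Replicate runs such as wells A01 to A04 can be screened against criteria of the kind listed in paragraph [0058] (replicate tests give the same result, with at most one drop out in four repeated tests). The sketch below assumes that an "X" entry can be treated as a drop out, i.e. a replicate with no genotype call; that interpretation, the function name and the example calls are assumptions made for illustration.

```python
from typing import List

DROPOUT = "X"  # assumed marker for a replicate with no genotype call


def passes_replicate_criteria(calls: List[str], max_dropouts: int = 1) -> bool:
    """Check one assay's replicate calls for one individual.

    The assay passes if all successful replicates agree on a single
    genotype and no more than max_dropouts replicates failed.
    """
    observed = {call for call in calls if call != DROPOUT}
    dropouts = sum(1 for call in calls if call == DROPOUT)
    return len(observed) == 1 and dropouts <= max_dropouts
```

Under these assumptions, the replicate calls `["AG", "AG", "X", "AG"]` pass, while `["AG", "AA", "AG", "AG"]` (discordant calls) and `["AG", "X", "X", "AG"]` (two drop outs) do not.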
[0064] Modifications may be made to the foregoing without departing from the basic aspects of the invention. Although the invention has been described in substantial detail with reference to one or more specific embodiments, those of skill in the art will recognize that changes may be made to the embodiments specifically disclosed in this application, yet these modifications and improvements are within the scope and spirit of the invention, as set forth in the claims which follow. All publications or patent documents cited in this specification are incorporated herein by reference as if each such publication or document was specifically and individually indicated to be incorporated herein by reference. [0065] Citation of the above publications or documents is not intended as an admission that any of the foregoing is pertinent prior art, nor does it constitute any admission as to the contents or date of these publications or documents.

Claims

What is claimed is: 1. A method for identifying an individual organism based on one or more single nucleotide polymorphisms, which comprises the following steps: (a) obtaining or possessing a nucleic acid sample known to be from the individual organism; (b) obtaining or possessing a nucleic acid sample which may or may not be from, or which may or may not be derived from, the individual organism; (c) analyzing the nucleic acid sample of step (a) to detect the identity of at least six single nucleotide polymorphisms selected from the group consisting of the polymorphisms set forth in Table 5; (d) analyzing the nucleic acid sample of step (b) to detect the identity of at least six nucleotide polymorphisms selected from the group consisting of the polymorphisms set forth in Table 5; and (e) comparing the results of steps (c) and (d), whereby if the identity of each of the at least six single nucleotide polymorphisms detected in step (c) is the same as the identity of each of the at least six single nucleotide polymorphism detected in step (d), then the nucleic acid sample of step (b) can be said to have come from the individual organism in step (a), thereby identifying that individual organism.
2. The method of claim 1 wherein the nucleic acid sample is obtained by extraction from a biological source that is associated with a non-biological source.
3. The method of claim 2 wherein the non-biological source is selected from the group consisting of fabric, carpeting, currency, leather, cordage, tobacco products, hard-surfaced objects, a biological specimen other than the biological source which is the source of the nucleic acid.
4. The method of claim 1 wherein the nucleic acid sample is obtained from human tissue, wherein the human tissue is selected from the group of human tissue consisting of blood, semen, epiderma, vaginal cells, hair, saliva, vomit, urine, feces, bone, buccal sample, amniotic fluid containing placental cells or fetal cells, and mixtures of any of the tissues listed above.
5. The method of claim 1 wherein the individual organism is a human.
6. A method for identifying an individual organism based on single nucleotide polymorphisms, which comprises the following steps:
(a) obtaining or possessing a nucleic acid sample known to be from the individual organism; (b) obtaining or possessing a nucleic acid sample which may or may not be from, or which may or may not be derived from, the individual organism; (c) analyzing the nucleic acid sample of step (a) to detect the identity of at least six single nucleotide polymorphisms selected from the group consisting of the set37 polymorphisms set forth in Table 5; (d) analyzing the nucleic acid sample of step (b) to detect the identity of the at least six single nucleotide polymorphisms selected from the group consisting of the set37 polymorphisms set forth in Table 5; and (e) comparing the results of steps (c) and (d), whereby if the identity of the at least six single nucleotide polymorphisms detected in step (c) is the same as the identity of the at least six single nucleotide polymorphisms detected in step (d), then the nucleic acid sample of step (b) can be said to have come from the individual organism in step (a), thereby identifying that individual organism.
7. A kit comprising at least one primer pair for amplifying each of a nucleic acid containing each of a nucleic acid single nucleotide polymorphism selected from the group consisting of any two of the polymorphisms set forth in Tables 6, 7 or 8, and: (a) a compartment comprising the primer pairs; and (b) instructions for use of reagents in the kit.
8. A method for identifying a known individual organism based on one or more single nucleotide polymorphisms, which comprises the following steps: (a) obtaining or possessing a nucleic acid sample from the known individual organism; (b) obtaining or possessing a nucleic acid sample which may or may not be from, or which may or may not be derived from, the known individual organism; (c) analyzing the nucleic acid sample of step (a) to detect the identity of at least six single nucleotide polymorphisms selected from the group consisting of the polymorphisms set forth in Table 5; (d) analyzing the nucleic acid sample of step (b) to detect the identity of at least six nucleotide polymorphisms selected from the group consisting of the polymorphisms set forth in Table 5; and (e) comparing the results of steps (c) and (d), whereby if the identity of each of the six single nucleotide polymorphisms detected in step (c) is the same as the identity of each of the six single nucleotide polymorphism detected in step (d), then the nucleic acid sample of step (b) can be said to have come from the individual organism in step (a), thereby confirming the identity of that individual organism as the source of the nucleic acid sample in step (a).
9. A method for identifying a known individual organism as a parent of another organism based on one or more single nucleotide polymorphisms, which comprises the following steps: (a) obtaining or possessing a nucleic acid sample from the known individual organism; (b) obtaining or possessing a nucleic acid sample from a possible offspring of the known individual, which possible offspring nucleic acid sample may or may not be derived from the known individual organism; (c) analyzing the nucleic acid sample of step (a) to detect the identity of at least six single nucleotide polymorphisms selected from the group consisting of the polymorphisms set forth in Table 5; (d) analyzing the nucleic acid sample of step (b) to detect the identity of at least six nucleotide polymorphisms selected from the group consisting of the polymorphisms set forth in Table 5; and (e) comparing the results of steps (c) and (d), whereby if the identity of at least one half of each of the six single nucleotide polymorphisms detected in step (c) is the same as at least one half of each of the six single nucleotide polymorphism detected in step (d), then the nucleic acid sample of step (b) can be said to be derived from the individual organism in step (a), thereby confirming the identity of that individual organism in step (b) as the offspring of the individual of step (a).
10. A method for matching a genetic profile of a known individual organism based on one or more single nucleotide polymorphisms, which comprises the following steps:
(a) obtaining or possessing a nucleic acid sample from the known individual organism;
(b) analyzing the nucleic acid sample of step (a) to detect the identity of at least six of the nucleotide polymorphisms selected from the group consisting of the polymorphisms set forth in Table 5, to obtain an individually-specific result; (c) obtaining or accessing a collection of information containing at least one result of previously generated information on the genetic composition of individual organisms, wherein such previously generated genetic information may contain a result comprising the genetic information of the individual organism of step (a); and
(d) comparing the individually-specific result of step (b) with the previously generated information of step (c), whereby if the individually-specific result of step (b) matches the result comprising the genetic information of the individual organism of step (c), then the match of the genetic information of the individual organism is confirmed.
11. The method of claim 10 wherein the collection of information is a computerized database, said database containing at least one relevant additional category of information that is linked to the genetic information.
12. The method of claim 11 wherein the additional category of information is selected from the group consisting of an individual organism's name, an individual organism's identifying number, including a social security number, a driver's license number or a prison identification number, an individual organism's physical characteristic, an individual organism's status, including individuals convicted of committing a crime, and individuals on a terrorist or criminal watch list.
13. A method for matching a genetic profile of an unknown individual organism based on one or more single nucleotide polymorphisms, which comprises the following steps:
(a) obtaining or possessing a nucleic acid sample from an unknown individual organism;
(b) analyzing the nucleic acid sample of step (a) to detect the identity of at least six of the single nucleotide polymorphisms selected from the group consisting of the polymorphisms set forth in Table 5, to obtain an individually-specific result;
(c) obtaining or accessing a collection of information containing at least one result of previously generated information on the genetic composition of individual organisms, wherein such previously generated genetic information may contain a result comprising the genetic information of the individual organism of step (a); and (d) comparing the individually-specific result of step (b) with the previously generated information of step (c), whereby if the individually-specific result of step (b) matches the result comprising the genetic information of the individual organism of step (c), then the match of the genetic information of the individual organism is confirmed.
PCT/US2004/021662 2003-07-07 2004-07-07 Compositions and methods for identifying an individual organism based on single nucleotide polymorphisms WO2005007816A2 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US48556103P 2003-07-07 2003-07-07
US60/485,561 2003-07-07

Publications (2)

Publication Number Publication Date
WO2005007816A2 true WO2005007816A2 (en) 2005-01-27
WO2005007816A3 WO2005007816A3 (en) 2006-03-23

Family

ID=34079143

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2004/021662 WO2005007816A2 (en) 2003-07-07 2004-07-07 Compositions and methods for identifying an individual organism based on single nucleotide polymorphisms

Country Status (1)

Country Link
WO (1) WO2005007816A2 (en)

Non-Patent Citations (7)

* Cited by examiner, † Cited by third party
Title
DATABASE SNP [Online] 02 January 2001 XP002994380 Retrieved from ncbi Database accession no. (rs1874794) *
DATABASE SNP [Online] 02 January 2001 XP002994384 Retrieved from ncbi Database accession no. (rs1912948) *
DATABASE SNP [Online] 13 September 2000 XP002994383 Retrieved from ncbi Database accession no. (rs1053407) *
DATABASE SNP [Online] 26 January 2001 XP002994379 Retrieved from ncbi Database accession no. (rs2029490) *
DATABASE SNP [Online] 26 January 2001 XP002994382 Retrieved from ncbi Database accession no. (rs172982) *
DATABASE SNP [Online] 30 June 2000 XP002994381 Retrieved from ncbi Database accession no. (rs270487) *
HOLT CL ET AL: 'Practical applications of genotypic surveys for forensic STR testing.' FORENSIC SCI INT. vol. 112, no. 2-3, 14 August 2000, pages 91 - 109, XP002994385 *

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2011067765A1 (en) * 2009-12-03 2011-06-09 Yissum Research Development Company Of The Hebrew University Of Jerusalem, Ltd. System and method for analyzing dna mixtures
US20120245052A1 (en) * 2009-12-03 2012-09-27 Yissum Research Development Company Of The Hebrew University Of Jerusalem, Ltd. System and method for analyzing dna mixtures
US9447474B2 (en) 2009-12-03 2016-09-20 Yissum Research Development Company Of The Hebrew University Of Jerusalem, Ltd. System and method for analyzing DNA mixtures

Also Published As

Publication number Publication date
WO2005007816A3 (en) 2006-03-23

Legal Events

Date Code Title Description
AK Designated states

Kind code of ref document: A2

Designated state(s): AE AG AL AM AT AU AZ BA BB BG BR BW BY BZ CA CH CN CO CR CU CZ DE DK DM DZ EC EE EG ES FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MA MD MG MK MN MW MX MZ NA NI NO NZ OM PG PH PL PT RO RU SC SD SE SG SK SL SY TJ TM TN TR TT TZ UA UG US UZ VC VN YU ZA ZM ZW

AL Designated countries for regional patents

Kind code of ref document: A2

Designated state(s): GM KE LS MW MZ NA SD SL SZ TZ UG ZM ZW AM AZ BY KG KZ MD RU TJ TM AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HU IE IT LU MC NL PL PT RO SE SI SK TR BF BJ CF CG CI CM GA GN GQ GW ML MR NE SN TD TG

121 Ep: the epo has been informed by wipo that ep was designated in this application
122 Ep: pct application non-entry in european phase