CA2356536A1 - Mitochondrial dna sequence alleles - Google Patents

Publication number
CA2356536A1
Authority
CA
Canada
Prior art keywords
haplogroup
nucleotide
sample
group
alleles
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Abandoned
Application number
CA 2356536
Other languages
French (fr)
Inventor
Douglas C. Wallace
Seyed H. Hosseini
Dan Mishmar
Marie Lott
Eduardo Ruiz-Pesini
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Emory University
Original Assignee
Individual
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Individual
Priority to US10/488,618 (US20050123913A1)
Priority to CA002459127A (CA2459127A1)
Priority to EP02796465A (EP1432831A4)
Priority to PCT/US2002/028471 (WO2003018775A2)
Priority to JP2003523626A (JP2005525082A)
Publication of CA2356536A1

Landscapes

  • Measuring Or Testing Involving Enzymes Or Micro-Organisms (AREA)

Abstract

This invention provides human mitochondrial DNA polymorphisms that are diagnostic of all the major human haplogroups and methods of diagnosing those haplogroups and selected sub-haplogroups. This invention provides sets of nucleic acid molecules containing human mtDNA polymorphisms. This invention provides sets of nucleic acid molecules containing human mtDNA polymorphisms formatted as nucleic acid arrays. This invention also provides methods for making the nucleic acid arrays, using the arrays to determine the presence or absence of a nucleotide allele in a sample, and methods for identifying a nucleic acid allele associated with a disease phenotype. This invention provides machine-readable storage devices containing data associating haplogroups and nucleic acid alleles and program storage devices for diagnosing a haplogroup.

Description

MITOCHONDRIAL DNA SEQUENCE ALLELES
BACKGROUND
Methods for constructing peptide and nucleotide libraries are well-known to the art, e.g. as described in U.S. Patents 6,156,511 and 6,130,092. Sequencing methods are also known to the art, e.g. as described in U.S. Patent 6,087,095. Arrays of nucleic acid probes on biological chips have been used for sequencing and for identifying exceptional alleles, including disease-associated alleles. Nucleic acid arrays are described, e.g. in U.S. Patent 5,837,832, and PCT Publications WO 99/05324, 99/05591, WO 00/58516, US 5,807,522, US 6,110,426, WO9535505A1, JP10503841T2, GR3030430T3, ES2134481T3, EP913485A1, EP804731B1, EP804731A4, EP804731A1, DK804731T3, DE69509925T2, DE69509925C0, CA2192095AA, AU2862995A1, AU709276B2, and AT180570E, U.S. 6,110,426, 5,807,522, WO 99/42813, EP 1066506, AU 2780499, U.S. 6,007,987, WO 95/11995, U.S. 5,837,832. Such arrays can be incorporated into computerized methods for analyzing hybridization results when the arrays are contacted with labeled sample nucleotides, e.g. as described in PCT Publication WO 99/05574, and U.S. Patents 5,754,524, 6,228,575, 5,593,839, and 5,856,101. Methods for screening for disease markers are also known to the art, e.g. as described in U.S. Patents 6,228,586, 6,160,104, 6,083,698, 6,268,398, 6,228,578, and 6,265,174. Mitochondrial DNA sequences have been associated with pathologies as described in U.S. 5,670,320, 5,296,349, 5,185,244, and 5,494,794.
A review of human haplogroups is provided in Wallace DC et al. (1999), "Mitochondrial DNA Variation in Human Evolution and Disease," Gene 238:211-230. The complete Cambridge mitochondrial sequence may be found at MITOMAP, http://www.gen.emory.edu/cgi-bin/MITOMAP. Genbank accession no. J01415. Also see Andrews et al. (1999), "Reanalysis and Revision of the Cambridge Reference Sequence for Human Mitochondrial DNA," Nature Genetics 23:147. All publications referred to herein are incorporated by reference to the extent not inconsistent herewith.

Introduction
Human mitochondrial DNA (mtDNA) is maternally inherited. Mutations accumulate sequentially in radiating lineages, creating branches on the human evolutionary tree called "haplogroups." Each haplogroup has distinctive polymorphisms, which must be known in order to distinguish normal mitochondrial DNA from disease mutations and to track the mtDNA in lineage and association studies.
This invention provides sequences of mtDNAs diagnostic of all the major human haplogroups.
The polymorphisms associated with the various haplogroups can be incorporated into an mtDNA sequencing array (also referred to herein as a "mitosed chip" or a "resequencing chip"). Such arrays can be used to identify new mtDNA disease mutations; polymorphisms can be linked to disease phenotypes and used to identify diseases and individuals predisposed to such diseases. The polymorphisms can also be used in forensic analysis to help identify individuals.
Description
This invention provides an isolated human mitochondrial polynucleotide molecule (also referred to herein as a "polynucleotide fragment") comprising a sequence with at least one polymorphic locus when compared with the corresponding Cambridge sequence, selected from the group consisting of novel sequences having a nucleotide difference at a mitochondrial nucleotide pair position as listed herein. The molecule can be a DNA or an RNA molecule.
Polypeptide fragments having an amino acid sequence of a translation product of such polynucleotide molecules are also provided herein. Such molecules can be as short as a few nucleotides, e.g. 6, or as long as the number of nucleotides in a complete mitochondrial sequence, such as the Cambridge sequence. Preferably such polynucleotide molecules are about 9 to about nucleotides in length.
This invention also includes a complete set of all such novel polynucleotide molecules.
These polymorphisms are listed on the attachment hereto entitled "Wallace Mitochondrial Polymorphisms."
Subsets of the mitochondrial polynucleotide molecules provided by this invention are made up of molecules comprising alleles associated with each haplogroup defined herein. This invention defines 36 haplogroups, as listed on the attachment hereto entitled "Haplogroup Designations." A table of haplogroups and alleles present in each is included in this application, entitled "Human Haplogroups." Polymorphisms from fifty mitochondrial DNA sequences of individuals representative of human haplogroups are attached hereto entitled "Polymorphisms of Fifty mtDNA Sequences." The first column is the Cambridge mtDNA nucleotide, the second column is the Cambridge sequence at that nucleotide, and the third column is the allele present in the sequenced individual mitochondrial DNA. A title indicating the individual, and sometimes the haplogroup to which the individual belongs, appears at the top center of each page. Data on each allele, the haplogroup(s) in which it occurs, and preferably also its frequency in a particular haplogroup, allows the use of the mitochondrial polynucleotide sequences and molecules of this invention to diagnose a particular individual's haplogroup.
Some or all of the novel mitochondrial polynucleotide molecules of this invention may be used in an array of nucleotide fragments positioned on known locations on a substrate, said nucleotide fragments comprising at least one fragment comprising mitochondrial nucleotides corresponding to Cambridge mitochondrial sequences (or other known mitochondrial sequences used as a reference) and at least one fragment comprising a polymorphic locus when compared with the corresponding reference sequence, selected from the group consisting of sequences having a nucleotide difference at a mitochondrial nucleotide pair position as listed in "Polymorphisms of Fifty mtDNA Sequences" attached hereto. These arrays preferably also include other previously-known mitochondrial alleles. Preferably the array comprises fragments comprising the complete Cambridge (or other reference) mitochondrial sequence.
Preferably it also includes all novel alleles of this invention, as well as all other known mitochondrial alleles.

Other known mitochondrial alleles include polymorphisms from fifty-four mitochondrial DNA sequences of individuals representative of human haplogroups, which are attached hereto entitled "Polymorphisms of Fifty-four mtDNA Sequences." The first column is the Cambridge mtDNA nucleotide, the second column is the Cambridge sequence at that nucleotide, and the third column is the allele present in the sequenced individual mitochondrial DNA. A title indicating the individual appears at the top center of each page. The foregoing arrays may be used to determine if a given sample nucleotide fragment contains identical or different sequences to those on the array.
The array may also comprise at least one polynucleotide molecule comprising an allele associated with a disease phenotype or predisposition. Many such alleles are known to the art as described in publications incorporated by reference herein, and other publications. Previously unknown exceptional alleles associated with a disease can be identified by the methods of this invention as described below. An array containing a polynucleotide molecule comprising an allele associated with a disease phenotype or predisposition can be used in a method of this invention to diagnose an individual with the corresponding predisposition or disease.
This invention also provides a method for determining the presence or absence of an allele at at least one locus of mtDNA in an individual, comprising obtaining a sample of said individual's mtDNA; labeling fragments of said sample mtDNA; providing an array of nucleotide fragments positioned at known locations on a substrate, said nucleotide fragments comprising at least one fragment comprising mitochondrial nucleotides corresponding to Cambridge mitochondrial sequences (or other reference sequences) and at least one fragment comprising a polymorphic locus when compared with the corresponding Cambridge sequence, selected from the group consisting of sequences having a nucleotide difference from the reference sequence at a mitochondrial nucleotide pair position; contacting said substrate with said labeled fragments of said mtDNA, whereby said labeled nucleotide fragments hybridize to bound nucleotide fragments on said substrate; quantitating a hybridization pattern of said labeled nucleotide fragments to said bound nucleotide fragments to produce a data set; and analyzing said data set to determine the presence or absence of said allele. The array may comprise fragments collectively comprising the complete Cambridge mitochondrial sequence, and preferably the array comprises all known polymorphic loci.
This invention also provides a method for determining to which of a set of haplogroups an individual belongs comprising: obtaining a sample of said individual's mtDNA; labeling fragments of said sample mtDNA; providing an array of nucleotide fragments positioned at known locations on a substrate, said nucleotide fragments comprising sufficient alleles of sufficient polymorphic loci to diagnose all haplogroups in said set of haplogroups; contacting said substrate with said labeled fragments of said sample mtDNA, whereby said labeled nucleotide fragments hybridize to bound nucleotide fragments on said substrate; quantitating a hybridization pattern of said labeled nucleotide fragments to said bound nucleotide fragments to produce a first data set comprising positions on the array associated with strength of hybridization at such positions; providing a second data set comprising the data provided in "Human Haplogroups" attached hereto; comparing said first data set with said second data set to determine the haplogroup of said set of haplogroups to which said individual belongs. As will be understood by those skilled in the art, the results will be in terms of probabilities that the individual belongs to a given haplogroup. In this invention the haplogroup having the highest probability of being that to which the individual belongs will be the haplogroup to which that individual is assigned.
Such haplotyping methods may be used for forensic analysis.
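The highest-probability assignment described above can be sketched in Python. This is a minimal illustration under stated assumptions, not the scoring used by the inventors: the function names, the layout of the frequency table, the treatment of loci as independent, and the `floor` value that keeps frequencies away from 0 and 1 are all introduced here for the example.

```python
import math

def haplogroup_probabilities(observed, table, floor=0.01):
    """Score each haplogroup against the alleles seen in a sample.

    observed: set of (locus, allele) pairs detected in the sample.
    table: haplogroup -> {(locus, allele): frequency in that haplogroup}
           (hypothetical layout standing in for a "Human Haplogroups" table).
    floor: assumed clamp so that no frequency is exactly 0 or 1.
    """
    all_markers = set().union(*(m.keys() for m in table.values()))
    log_scores = {}
    for hg, markers in table.items():
        score = 0.0
        for marker in all_markers:
            f = min(max(markers.get(marker, 0.0), floor), 1.0 - floor)
            # Loci are treated as independent, which is a simplification.
            score += math.log(f if marker in observed else 1.0 - f)
        log_scores[hg] = score
    # Normalize the scores so they sum to 1 across the set of haplogroups.
    peak = max(log_scores.values())
    weights = {hg: math.exp(s - peak) for hg, s in log_scores.items()}
    total = sum(weights.values())
    return {hg: w / total for hg, w in weights.items()}

def assign_haplogroup(observed, table):
    """Assign the haplogroup with the highest probability."""
    probs = haplogroup_probabilities(observed, table)
    return max(probs, key=probs.get)
```

The individual is assigned to whichever haplogroup receives the largest normalized score; the per-haplogroup probabilities remain available for reporting how clear-cut the assignment is.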
This invention provides a method for determining to which of a set of haplogroups an individual belongs, said method comprising: sequencing some or all mtDNA from said individual; and comparing the sequence of said mtDNA with reference sequences of mtDNA of reference individuals of said set of haplogroups to identify sufficient mitochondrial alleles of said individual to determine to which haplogroup said individual belongs, said reference sequences being selected from the group consisting of the alleles at the polymorphic loci disclosed herein as being associated with said haplogroups.

A method for determining the complete mtDNA sequence of an individual is also provided, said method comprising: providing a sequencing array of tiled nucleotide fragments on a substrate collectively comprising all possible variants (AGCT) of the mitochondrial genome sequence; labeling sample fragments of mitochondrial DNA (mtDNA); contacting said substrate with said labeled mtDNA, whereby said labeled mtDNA hybridizes to bound nucleotide fragments on said substrate; quantitating hybridization patterns of said labeled mtDNA to said bound nucleotide fragments to produce a data set; and analyzing said data set to determine the sequence of said mtDNA.
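The analysis step of this sequencing method can be illustrated with a short Python sketch. It assumes, for the sake of the example, that the quantitated data set holds one intensity per base (A, C, G, T) at each position and that a base is called only when its signal exceeds the runner-up by a chosen ratio. The function name `call_sequence`, the data layout, and the `min_ratio` threshold are hypothetical and do not come from the specification.

```python
def call_sequence(signals, min_ratio=1.5):
    """Call one base per position from four-probe hybridization signals.

    signals: list with one dict per position, mapping each base
             ("A", "C", "G", "T") to its quantitated intensity (assumed layout).
    min_ratio: assumed minimum ratio of best to second-best signal.
    Returns a lowercase string, with "n" where the call is ambiguous.
    """
    called = []
    for per_base in signals:
        ranked = sorted(per_base, key=per_base.get, reverse=True)
        top, second = per_base[ranked[0]], per_base[ranked[1]]
        confident = second == 0 or top / second >= min_ratio
        called.append(ranked[0].lower() if confident else "n")
    return "".join(called)
```

Positions reported as "n" would need to be resolved by another method, such as conventional sequencing of that region.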
A method for identifying information about an individual useful in forensics is provided using the foregoing sequencing and haplotyping methods and data provided herein.
A method for determining an exceptional allele of a mitochondrial locus in a sample of mtDNA is also provided, said method comprising: sequencing said mtDNA sample, preferably using the sequencing array described in the preceding paragraph; and comparing the sequence of said mtDNA sample with reference sequences of mtDNA comprising the Cambridge mitochondrial DNA sequence or other reference sequence, and also with reference sequences comprising alleles at the polymorphic loci described herein as known, naturally-occurring alleles.
New, previously-unknown (also referred to herein as "exceptional") mitochondrial alleles may be identified in sample DNA by a method of this invention comprising providing a sequencing array of tiled nucleotide fragments on a substrate collectively comprising all possible variants (AGCT) of the mitochondrial genome sequence; labeling sample fragments of mitochondrial DNA (mtDNA); contacting said substrate with said labeled mtDNA, whereby said labeled mtDNA hybridizes to bound nucleotide fragments on said substrate; quantitating hybridization patterns of said labeled mtDNA to said bound nucleotide fragments to produce a data set; analyzing said data set to determine the sequence of said mtDNA; comparing said sequence with sequences comprising the complete Cambridge sequence (or other reference sequence) and the novel sequences of this invention, as well as all other known mitochondrial sequences, to produce a final data set comprising at least one exceptional allele associated with its nucleotide locus. Such exceptional alleles could be alleles identifying new haplogroups or subgroups of existing haplogroups, or could be new polymorphisms characteristic of existing haplogroups. Such exceptional alleles may also be associated with a disease phenotype or predisposition.
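The final comparison step can be sketched as follows. The sketch assumes the sample has already been sequenced and aligned position by position to the reference, so only substitutions are handled; insertions and deletions would need a real alignment. The function names and the representation of known alleles as a set of `(locus, allele)` pairs are assumptions made for this example.

```python
def find_variants(sample, reference):
    """Return {locus: allele} for 1-based loci where the sample differs."""
    if len(sample) != len(reference):
        raise ValueError("sequences must be aligned to the same length")
    return {
        locus: s
        for locus, (s, r) in enumerate(zip(sample, reference), start=1)
        if s != r
    }

def exceptional_alleles(sample, reference, known):
    """Keep only variants that are absent from the set of known alleles.

    known: set of (locus, allele) pairs already catalogued, for example the
           novel alleles of this disclosure plus previously published ones.
    """
    return {
        locus: allele
        for locus, allele in find_variants(sample, reference).items()
        if (locus, allele) not in known
    }
```

Any allele that survives the filter is a candidate exceptional allele, reported together with its nucleotide locus.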
For example, this invention provides a method for identifying a mitochondrial allele associated with a disease phenotype or predisposition in an individual comprising: obtaining a sample of said individual's mtDNA; labeling fragments of said mtDNA; providing an array of nucleotide fragments positioned at known locations on a substrate, said nucleotide fragments collectively comprising all possible variants (AGCT) of the mitochondrial genome sequence; contacting said substrate with said labeled sample fragments of said mtDNA from an individual, whereby said labeled sample mtDNA fragments hybridize to bound nucleotide fragments on said substrate; quantitating a hybridization pattern of said labeled nucleotide fragments to said bound nucleotide fragments to produce a first data set comprising the mtDNA sequence of said individual; providing a second data set comprising the "Wallace Mitochondrial Polymorphisms" attached hereto, preferably associated with their haplogroups as described in "Human Haplogroups" attached hereto; comparing said first data set with said second data set to identify any non-wildtype alleles in said first data set, any said non-wildtype alleles being associated with said disease phenotype or predisposition in said individual. Further applications of this method may show that newly-identified non-wildtype alleles correlate with a disease phenotype or predisposition in many individuals.
The methods of this invention, using an array collectively comprising all possible variants (AGCT) of the mitochondrial genome sequence, can be used to diagnose an individual with a disease or a predisposition to a disease using a data set of all alleles corresponding with diseases, as known to the art or as identified by methods herein.
This invention also provides a method for determining to which of a set of haplogroups an individual belongs comprising: obtaining a sample of said individual's mtDNA; labeling fragments of said sample mtDNA; providing an array of nucleotide fragments comprising all possible variants (AGCT) of mtDNA; contacting said substrate with said labeled fragments of said sample mtDNA, whereby said labeled nucleotide fragments hybridize to bound nucleotide fragments on said substrate; quantitating a hybridization pattern of said labeled nucleotide fragments to said bound nucleotide fragments to produce a first data set comprising positions on the array associated with strength of hybridization at such positions; providing a second data set comprising the data provided in "Human Haplogroups" attached hereto; comparing said first data set with said second data set to determine the haplogroup of said set of haplogroups to which said individual belongs. As will be understood by those skilled in the art, the results will be in terms of probabilities that the individual belongs to a given haplogroup. In this invention the haplogroup having the highest probability of being that to which the individual belongs will be the haplogroup to which that individual is assigned.
Preferably the array used in haplotyping analyses of this invention comprises all known polymorphic loci including those disclosed for the first time herein, also including disease-associated polymorphic loci, and corresponding Cambridge mtDNA sequences.
Computer programs for carrying out the foregoing methods are also provided, said programs comprising code for performing said method steps. Computers programmed to perform the steps of at least one of the above-described methods are also provided.
Computer-readable storage media having stored thereon data structures constructed according to the steps of at least one of the above-described methods are also provided.
A method of creating a data set comprising mitochondrial DNA sequences and frequency of occurrence thereof, which sequences and frequencies are diagnostic of a set of haplogroups, is also provided, said method comprising: providing mitochondrial DNA RFLP data from a population of individuals; performing phylogenetic analysis on said RFLP data to identify said set of haplogroups; assigning a haplogroup identification from said set of haplogroups to each individual in said population; sequencing complete mtDNA from at least one individual within each haplogroup to obtain individual mtDNA sequences; comparing said individual mtDNA sequences with the Cambridge mtDNA sequence to produce a data set comprising sequences of polymorphic alleles selected from the group consisting of those disclosed herein as naturally-occurring alleles; and determining the frequency of each of said polymorphic alleles in each haplogroup of said set of haplogroups. This method may also comprise using said data set to determine the haplogroup of an individual.
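The last step of this method, determining the frequency of each polymorphic allele within each haplogroup, can be sketched in Python. The sketch assumes each sequenced individual has already been assigned a haplogroup and reduced to a set of `(locus, allele)` differences from the Cambridge sequence. The function name and input layout are illustrative assumptions.

```python
from collections import Counter, defaultdict

def allele_frequencies(individuals):
    """Compute per-haplogroup allele frequencies.

    individuals: iterable of (haplogroup, alleles) pairs, where alleles is
                 a set of (locus, allele) differences from the reference.
    Returns haplogroup -> {(locus, allele): fraction of members carrying it}.
    """
    members = Counter()             # number of individuals per haplogroup
    carriers = defaultdict(Counter) # carriers of each allele per haplogroup
    for haplogroup, alleles in individuals:
        members[haplogroup] += 1
        carriers[haplogroup].update(alleles)
    return {
        hg: {marker: n / members[hg] for marker, n in counts.items()}
        for hg, counts in carriers.items()
    }
```

The resulting mapping has the same shape as a haplogroup table that pairs each allele with its frequency, so it can be used directly when assigning a new individual to a haplogroup.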
This invention also provides a computer storage medium storing a data structure produced by the above method of creating a data set.
Definitions
As used herein, a "locus" refers to a single nucleotide location in the human mitochondrial genome, as defined by the Cambridge sequence, denoted by the numbers 1 to 16568.
As used herein, an "allele" refers to the variant, a different nucleotide, or deletion, or insertion, present at a locus compared with the Cambridge sequence nucleotide at that locus.
As used herein, the "Cambridge sequence" refers to the human mtDNA sequence, Genbank Accession #J01415. (Andrews et al. (1999), "Reanalysis and Revision of the Cambridge Reference Sequence for Human Mitochondrial DNA," Nature Genetics 23:147.) This sequence is 16568 nucleotides long. This sequence implies its 16568-nucleotide complementary sequence as well, but only one strand (in Genbank Accession #J01415, the L or light strand, as opposed to the H or heavy strand) is necessary to represent the sequence, unless otherwise specified. In vivo, this sequence is circular dsDNA.
All sequences given herein are meant to encompass the complementary strand, as well as double-stranded polynucleotides comprising the given sequence.

As used herein, a "polymorphism" refers to the existence, in an otherwise-similar DNA sequence, of more than one allele at a locus.
As used herein, "polymorphic locus" refers to a locus at which more than one allele is known to exist in corresponding sequences from different individuals. The allele could comprise a different nucleotide at that position, a deletion at that position, or an insertion at that position.
As used herein, a "nucleotide difference" refers to a polymorphism.
As used herein, a "polynucleotide" refers to a molecule consisting of more than one nucleotide, that may be obtained by purification, amplification, or chemical synthesis, by methods standard to the art.
As used herein, a "polypeptide fragment" refers to a molecule consisting of more than one peptide, that may be obtained by protein purification from cells, in vitro translation, or chemical synthesis, by methods standard to the art.
As used herein, "translation product" refers to the polypeptide fragment that would be produced if a polynucleotide were transcribed and translated in vitro or in vivo, or could be predicted to be produced using standard amino acid codon tables and synthesized, by methods standard to the art.
As used herein, "collectively comprising" refers to a set of items that may contain redundancy and together form a complete representational set. For example, "polynucleotide fragments collectively comprising the complete Cambridge sequence" refers to a set of polynucleotide fragments that might each be 20 nt long, each differing from the previous one by a single-nucleotide shift (a 5' nucleotide removed and a 3' nucleotide added), such that the entire circular Cambridge sequence molecule would be completely represented by 16568 different 20 nt long polynucleotides.
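The tiling in this example can be made concrete with a few lines of Python. The sketch below generates one probe starting at every position of a circular sequence, wrapping around the end so that the number of probes equals the sequence length. The function name `tile_circular` is an assumption for illustration; the 20 nt default comes from the example in the text.

```python
def tile_circular(sequence, probe_length=20):
    """Return one probe starting at every position of a circular sequence.

    Each probe is shifted by one nucleotide relative to the previous one,
    and probes near the end wrap around to the start of the sequence.
    """
    if probe_length > len(sequence):
        raise ValueError("probe longer than sequence")
    # Append the first probe_length - 1 bases so the last probes can wrap.
    wrapped = sequence + sequence[:probe_length - 1]
    return [wrapped[i:i + probe_length] for i in range(len(sequence))]
```

Applied to a circular sequence of 16568 nucleotides with the default probe length, this yields 16568 overlapping 20 nt probes, matching the count given in the definition.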

As used herein, "labeling" refers to covalently linking a reporter atom or molecule onto or within a polynucleotide fragment, preferentially using a fluorescein reporter molecule, but possibly using any reporter atom or molecule known to the art including, but not limited to, 32P, 35S, and digoxygenin.
As used herein, "hybridize" refers to combining potentially complementary polynucleotide fragments, of such length, in such a solution, at such a temperature, for such a period of time, such that these conditions provide adequate stringency such that single nucleotide differences are detectable and quantifiable by utilized methods.
As used herein, "arrays" refer to oligonucleotides bound, preferably microfabricated, in a known arrangement, preferably tiling, on a solid support, preferably a silica chip.
As used herein, "quantitate" an amount of hybridization signal refers to a method of measuring the amount of hybridization signal, preferably scanning for fluorescence, from each component of a hybridized array, all components together making a pattern, independently or relative to a reference.
As used herein, "data set" refers to a collection of data of a type including, but not limited to, data comprising quantitation of hybridization signal from each component of a hybridized array, data comprising the alleles at a locus, and data comprising which alleles at which loci define a haplotype. The data sets of this invention comprise at least two associated types of data, e.g. locus and allele.
As used herein, "determining a sequence" refers to sequencing a polynucleotide fragment by this invention or any method standard to the art, and the sequence data obtained.
As used herein, "all possible variants" of the mtDNA genome sequence refers to, in addition to a set of polynucleotides collectively comprising the mtDNA Cambridge sequence, three other sets of nucleotides collectively comprising the alternative AGCT possible alleles at each nucleotide locus.
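As an illustration of this definition, the following Python sketch takes one reference probe and produces the four versions that differ only at a single interrogated position: the reference base plus the three alternatives. Placing the interrogated base at the center of the probe is an assumption made here, not something stated in the definition, and the function name is hypothetical.

```python
BASES = "acgt"

def variant_probes(probe, interrogated=None):
    """Return {base: probe carrying that base at the interrogated index}.

    interrogated: 0-based index of the position being tested; defaults to
                  the middle of the probe (an assumed convention).
    """
    if interrogated is None:
        interrogated = len(probe) // 2
    return {
        base: probe[:interrogated] + base + probe[interrogated + 1:]
        for base in BASES
    }
```

Running this over every probe of a tiled reference set gives the reference set itself plus three further sets, one for each alternative base at each locus.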
As used herein, "contacting said substrate" refers to combining all components that allow polynucleotide probes to hybridize to the bound polynucleotides on a solid support.
As used herein, "disease phenotype" refers to a description of the manifestation of a disease.
As used herein, "associating exceptional allele with a disease phenotype" refers to performing the methods of this invention to associate an allele with a disease phenotype in one individual and then repeating this method with multiple individuals with and without said disease phenotype in order to statistically correlate said exceptional allele with the phenotype in a population.
As used herein, "disease-associated polymorphic locus" refers to an allele or set of alleles at a polymorphic locus that correlates, in a population, with a disease or a tendency towards a disease, the correlation demonstrated by methods standard to the art.
As used herein, "sample" of an individual's DNA refers to a purified or amplified amount of a portion of the human individual's DNA that could include nuclear as well as mitochondrial DNA and may include all of that individual's nuclear and mitochondrial DNA.
As used herein, "analyzing said data set" to determine the presence or absence of said allele refers to using a data set comprising a quantitated hybridization pattern, to effectively compare an unknown polynucleotide sequence fragment, specifically one locus, to a known polynucleotide sequence fragment, specifically the same locus, having either the allele from the Cambridge sequence or a known polymorphic allele, to determine whether the unknown allele and the known allele are identical. Identity indicates the presence of that allele. Non-identity indicates absence of that allele.
As used herein, "haplogroup" refers to a data set comprising a list of which loci and which alleles at which frequencies identify a group of humans. The group of humans of this type is said to be of this haplogroup, each individual belonging to this haplogroup.
As used herein, "set of haplogroups" refers to a number, more than one, of lists of mitochondrial loci, alleles and frequencies, that identify the corresponding number of groups of humans. Each of these individual humans belongs to one of the haplogroups in this set.
As used herein, "phylogenetic analysis" refers to analyzing a data set comprising presence and absence of alleles at loci, for a number of individuals, by methods standard to the art that elucidate the contemporary relationships of those individuals relative to each other and that predict the likely historical evolutionary relationships of those individuals relative to each other.
All references cited in the present application are incorporated by reference in their entirety herein to the extent that there is no inconsistency with the present disclosure.
The following articles and abstracts by inventor(s) of this invention are attached hereto and form part of this disclosure: Brown et al. (July 2001), "Novel mtDNA Mutations and Oxidative Phosphorylation Dysfunction in Russian LHON Families," Human Genetics 109:33-39; Brown et al. (December 2000), "Functional Analysis of Lymphoblast and Cybrid Mitochondria Containing the 3460, 11778, or 14484 Leber's Hereditary Optic Neuropathy Mitochondrial DNA Mutation," J. Biol. Chem. 275:39831-39836; Lell, JT and Wallace, DC (November 2000), "The Peopling of Europe from the Maternal and Paternal Perspectives," Am. J. Hum. Genet. 67:1376-1381; Su et al. (2000), "Genetic Evidence for an East Asian Contribution to the Second Wave of Migration to the New World," American Society of Human Genetics Conference 2000, Program No. 1291; Jin et al. (2000), "No Independent Origin of Modern Humans in East Asia: A Tale of 12,000 Y Chromosomes," American Society of Human Genetics Conference 2000, Program No. 1283; Brown et al. (2000), "The Molecular Definition of Haplogroup B: Significant Heterogeneity Revealed by Complete Sequence Analysis of Asian and Native American Haplogroup B mtDNAs," American Society of Human Genetics Conference 2000, Program No. 1276; Schurr et al. (2000), "The Ethnic Origins of An Enigmatic South Asian Population, the Kalasha of Northern Pakistan as Revealed by mtDNA Variation," American Society of Human Genetics Conference 2000, Program No. 1172; Wallace et al. (2000), "Origin of Haplogroup M in Ethiopia," American Society of Human Genetics Conference 2000, Program No. 1171.
The following publication comprises alleles included on microarrays of this invention:
Ingman et al. (December 2000), "Mitochondrial Genome Variation and the Origin of Modern Humans," Nature 408:708-713.
The following examples are provided for illustrative purposes, and are not intended to limit the scope of the invention as claimed herein. Any variations in the exemplified articles which occur to the skilled artisan are intended to fall within the scope of the present invention.

SEQUENCE LISTING
(1) GENERAL INFORMATION
(i) APPLICANT: Emory University
(ii) TITLE OF INVENTION: Mitochondrial DNA Sequence Alleles
(iii) NUMBER OF SEQUENCES: 2
(iv) CORRESPONDENCE ADDRESS:
    (A) ADDRESSEE: McKay-Carey & Company
    (B) STREET: 2590 Commerce Place, 10155-102 Street
    (C) CITY: Edmonton
    (D) STATE: Alberta
    (E) COUNTRY: Canada
    (F) ZIP: T6J 4G8
(v) COMPUTER READABLE FORM:
    (A) MEDIUM TYPE: Floppy disc
    (B) COMPUTER: IBM PC Compatible
    (C) OPERATING SYSTEM: PC-DOS/MS-DOS
    (D) SOFTWARE: PatentIn Version #3.1
(vi) CURRENT APPLICATION DATA:
    (A) APPLICATION NUMBER: 2,356,536
    (B) FILING DATE: 2001-08-31
    (C) CLASSIFICATION: C12N-15/11
(vii) PRIOR APPLICATION DATA:
    (A) APPLICATION NUMBER: US 60/316,333
    (B) FILING DATE: 2001-08-30
(viii) ATTORNEY/AGENT INFORMATION:
    (A) NAME: Mary Jane McKay-Carey
    (B) REFERENCE/DOCKET NUMBER: 34092CA0
(ix) TELECOMMUNICATION INFORMATION:
    (A) TELEPHONE: (780) 424-0222
    (B) TELEFAX: (780) 421-0834
(2) INFORMATION FOR SEQ ID NO:1:
(i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 16569 base pairs
    (B) TYPE: nucleic acid
(ii) MOLECULE TYPE: DNA
    (A) DESCRIPTION: /desc = "artificial sequence: SEQ ID NO:1 is a composite sequence of the Cambridge sequence and human mitochondrial DNA sequence alleles"
(iii) HYPOTHETICAL: NO
(ix) FEATURE
    (A) NAME/KEY: misc_feature
    (B) LOCATION: (3106)..(3106)
    (D) OTHER INFORMATION: n at position 3106 is a deletion
(ix) FEATURE
    (A) NAME/KEY: misc_feature
    (B) LOCATION: (3796)..(3796)
    (D) OTHER INFORMATION: n at position 3796 is a, g, c, or t
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:1:

gatcacaggtctatcaccctattaaccactcacgggagctctccatgcatttggtatttt60 cgtytggggggyrtgcacgcgatagcatygcgrgmcgctggagccggagcaccytatgtc120 gcagtatctgtctttgattcctrccycatyyyrttatttatcgcacctacrttcaataty180 ayrgdmgavcatayhtayyraagygtrytraytartyaatgcttrtrrgacataryaata240 acaattraaygyctgcacagccrctttccacacagacatcataacaaaaartttycrcca300 aaccccccctcccccrvttytggcyacagcacttaaacayatctctgccaaaccccraaa360 acaaagaaccctracaccagcctaaccagatttcaaattktatctttwggcggtatgyac420 ttttaacagtcaccccccaactaacacattattttyccctcycaytyccayactactaay480 cycatcaayacarcccccrcccatcctrcccagcacacacacaccgctgctaaccccata540 ccccgaaccaaccaaaccccaaagacaccccccacagtttatgtagcttaccycctyaaa600 gcaatacactgaaaatgtttagacgggctcacatcaccccataaacaaataggtttggtc660 ctrgcctttctattagcycytagtaagattacacatgcaagcatccccrytccagtgagt720 ycaccctctaaatcaccacgatcaaaaggracaagcatcaagcacgcarcaatgcagctc780 aaaacgcttagcctagccacacccccacgggaaacagcagtgatwarcctttagcaataa840 acgaaagttyaactaagctatactaaccccagggttggtcaatttcgtgccagccaccgc900 ggtcacacgattaacccaagycaatagaarccggcgtaaagagtgttttagatcaccccc960 bccccaataaagctaaaactcacctgagttgtaaaaaactccagttgacacaaaatarac1020 tacgaaagtggctttaacatrtctgaayacacaatagctaagacccaaactgggattaga1080 taccccactatgcttagccctaaacctcaacagttaaaycaacaaaactgctcgccagaa1140 cactacgagccacagcttaaaactcaaaggacctggcggtgcttcatayccctctagagg1200 agcctgttctgtaatcgataaaccccgatcaacctcaccaccycttgctcagcctatata1260 ccgccatcttcagcaaaccctgatgaaggytacaaagtaagcgcaagtacccacgtaaag1320 acgttaggtcaaggtgtagcccatgaggtggcaagaaatgggctacattttctaccccag1380 amaactacgatagcccttatgaaacytaagggtcraaggyggatttagcagtaaactrag1440 artagagtgcttagttgaacagggccctgaagcgcgtacacaccgcccgtcaccctcctc1500 aartatacttcaaaggacatttaactaaaacccctacgcatttatatagaggagacaagt1560 cgtaacatggtaagtgtactggaaagtgcacttggacraaccagagtgtagcttaacaca1620 aagcacccaacttacacttaggagatttcaacttaacttgaccgctctgagctaaaccta1680 gccccaaacccactccaccytaytaycaracaacyttarccaaaccatttacccarayaa1740 agtataggcgatagaaattgaaacctggcgcaatagatayagtaccgcaagggaaagatg1800 aaaaattatarccaagcataatatagcaaggactaacccctataccttctgcataatgaa1860 
ttaactagaaataactttgcaaggagarccaaagctaagacccccgaaaccagacgagct1920 acctaaraacagctaaaagagcacacccgtctatgtagcaaaatagtgggaagatttata1980 ggtagaggcgacaaacctaycgagcctggtgatagctggttgtccaagatagaatcttag2040 ttcaactttaaatttgcccrcagaaccctctaaatccccttgtaaatttaaytgttagtc2100 caaagaggaacagctctttggacactaggaaaaaaccttgtagagagagtaaaaaattta2160 acacccatagtaggcctaaaagcagccaccaattaagaaagcgttcaagctcaacaccca2220 ctacctaaaaaatcccaaacatatvactgaactcctcacaccmaattggaccaatctatc2280 accctatagaagaactaatgttagtatragtaacatgaaaacattctcctcygcataagc2340 ctgcgtcagatyaaaacrctgaactgacaattaacagccyaatatctacaatcaaccaac2400 aagtcattattacccycactgtcaacccaacacaggcatgctcataaggaaaggttaaaa2460 aaagtaaaaggaactcggcaaaycttaccccgcctgtttaccaaaaacatcacctctagc2520 atcaccagtattagaggcaccgcctgcccagtgacacatgtttaacggccgcggtaccct2580 raccgtgcaaaggtagcataatcacttgttccttaaatagggacctgtatgaatggctyc2640 acgagggttyagctgtctcttacttttaaccagtgaaattgacctgcccgtgaagaggcg2700 ggcatracacagcaagacgagaagaccctatggagctttaatttattaatgcaarcarta2760 cctaacaracccacaggtcctaaactacyaarcctgcattaaaaatttcggttggggcga2820 cctcggagcagaaycmaacctccgagcagtacatgcyaagacytcaccagtcaaagcgaa2880 ctacyatactcaattgatccaataacttgaccaacggaacaagttaccctagggataaca2940 gcgcaatcctattctagagtccatatcaacaatagggtttacgacctcgatgttggatca3000 ggacatcccratggtgcagccgctattaaaggttcgtttgttcaacgattaaagtcctac3060 gtgatctgagttcagaccggagyaatccaggtcggtttctatctancttcaaattcctcc3120 ctgtacgaaaggacaagagaaataaggcctacttcacaaagcgccttcccccgtaaatga3180 tatcatctcaacttagyatwayaycyacacccacccaagarcagggtttgttaagatggc3240 agagcccggtaatcgcataaaacttaaaactttacagtcagaggttcaaytcctcttctt3300 aacaacayacccatgrccaacctcctactcctcattgtacccattctaatcgcaatggca3360 ttcctaatgctyaccgaacgaaaaattctaggcyatatacaactacgcaaaggccccaac3420 gttgtaggcccctacggrctactacaaccyttcgctgacgccataaaactcttcaccaar3480 gagcccctaaaacccgccacatctrccatcacyctvtacatcaccgccccgaccttrgct3540 ctcaccatygchcttctactatgarcccccctccccatacccaaccccctggtyaacctc3600 aacctaggcctcctatttattctagccacctctagcctagccgyttactcaatcctctga3660 
tcaggrtgagcatcaaactcaaactacgccctratcggygcactgcgagcagtagcccar3720 acaatctcatatgaagtcaccctagccatcattctrctatcaacattactaataagtggc3780 tcctttaacctctccncccttatcacarcacaagarcacctctgattactcctrccatca3840 tgrcccytggccataatatgatttayctccacactagcagagaccaaccgaacccccttc3900 gaccttgccgaaggggartcmgaactrgtctcaggcttcaacatcgaatacgccgcaggc3960 cccttcgccytattcttcatrgccgaatacacaaacattattataataaacaccctcacc4020 actayaatcttcctaggaayaacrtatracgcactctcccctgaactctacacaacatat4080 tttgtyaccaagaccctacttctracctccctgttcytatgrrttcgaacagcatacccc4140 cgattccgctacgaccarctcatacacctcctatgaaaaaacttcctaccactcacccta4200 gcrttacttatatgayatgtytccrtacccaytacaatctccagcatyccccctcaaacc4260 taagaaatatgtctgataaaagagttactttgatagagtaaataataggagyttaaaccc4320 ccttatttctaggacyatgagaatcgaacccatccctgagaatccaaaaytctccgtgcc4380 acctatcrcaccccatcctaaagtaaggtcagctaaataagctatcgggcccataccccg4440 aaaatgttggttawacccttcccgtactaattaatcccctggcccaacccrtcatctact4500 ctaccrtytttrcaggcacactcatcachgcgctaagctcrcactgattttttacctgag4560 taggcctagaaataaacatrctagcytttattccarttctaaccaaaaaaataaaccctc4620 gttccacagaagctgccatcaagtayttcctcacrcaagcaaccgcatccataatccttc4680 taatagcyatcctcytcaacaatatactctccggrcaatgaaccataaccaatactacca4740 aycaatactcatcattaataatcatartrgctatagcaataaaactaggaatagccccct4800 ttcacttctgagtcccagargttrcccaaggcrcccctctracatccggcctgcttcttc4860 tcacatgacaaaaactagccccyatctcaatcatataccaaatctcyccctcactaracg4920 taagccttctcctcactctctcaatcttatccatcatagyaggcagttgaggtggaytaa4980 accaaacccagctrcgcaaaatcytagcatactcctcaattacccayataggatgrataa5040 takcarttctaccgtacaacccyaacataaccattcttaatttaactatttatatyatcc5100 taactacyaccgcattcctactactcaacttaaactccagcaccacraccctrctactat5160 ctcgcacctgaaacaagmtaacatgactaacacccttaattccatccaccctcctctccc5220 taggaggcctrcccccrctaaccggctttttgccyaaatggrycattatcgaagaattca5280 caaaraacaatagcctcatyatccccaccatcatagccaccatcaccctmmttaacctct5340 acttctacct acgcctaatc tactccacct caatcacact actccccatr tcyaacaacg 5400 taaaaataaaatgacartttgaacayacaaaacccaccccaytcctccccacactcatcr5460 
ccctyaccacrctactcctacctatctccccyttyatactaataatcttatagaaattta5520 ggttaaatacagaccaagagccttcaaagccctcagtaagttgcaatacttaatttctgy5580 racagctaaggactgcaaaaycycaytctgcatcaactgaacgcaaatcagcyactttaa5640 ttaagctaagccctyactagaccaatgggacttaaacccacaaacacttagttaacagct5700 aagcaccctartcaactggcttcaatctacttctcccgccgccgggaaaaaaggcgggag5760 aagccccggcagrtttgaagctgcttcttcgaatttgcaattcaatatgaraaycacctc5820 rgagcyggtaaaaagaggcctarcccctgtctttagatttacagtccaatgcttcactca5880 gccattttacctcacccccactgatgttcgccgaccgttgactattctctacaaaccaca5940 aagacattggracactatacctattattcggcgcatgagctggrgtyctaggcacagctc6000 taagcctccttattcgagccgagctrggycagccaggcaaccttytaggtaacgaccaca6060 tctacaacgtyatcgtyacagcccatgcatttgtaataatcttyttcatagtaataccca6120 tcataatcggaggctttggcaactgactartycccctaataatyggygcccccgatatgg6180 crttyccccgcataaacaacataagcttctgactcttaccyccctcyctcctactcctgc6240 tcgcatctgctayagtrgaggccggagcaggaacaggttgaacagtctaccctcccttag6300 cagggaactactcccaccctggarcctccgtagacctaaccatcttctccttacacctag6360 caggtrtctcytctatcttaggggccatcaayttcatcacaacaattatcaatataaaac6420 cccctgccataacccaataccaaacgcccctcttcgtctgatccgtcctaatyacagcag6480 tcctacttctmctatctctcccagtcctagctgctggcatcacyatactactaacagacc6540 gcaacctyaacaccaccttcttcgaccccgccggaggaggagacccyattctataccaac6600 acctatyctgatttttcggtcaccctgaagtttatattcttatcctaccaggcttcggaa6660 taatctcccatattgtaacytactactccggaaaaaaagaaccatttggatayataggya6720 tggtctgagctatratatcaattggcttcctrgggtttatcgtgtgagcrcaccayatat6780 ttacagtaggaatagacgtagacacacgagcatayttcacctccgcyaccataatcatcg6840 ctatccccaccggcgtcaaagtatttagctgactmgccacactccacggaagcaatatga6900 aatgatctgctgcagtgctctgagccctaggattcatytttcttttcaccgtaggtggcc6960 tractggcattgtattagcaaactcatcrctagacatcgtactacacgacacgtactacg7020 ttgtagcycacttccactatgtcctatcaatrggrgcwgtatttgccatcataggrggct7080 tcattcactgatttcccctattctcaggctacaccctagaccaaacctacgccaaaatcc7140 atttcrctatcatrttcatcggcgtaaatctaacyttcttcccacaacactttctmggcc7200 trtccggaatgccccgacgttactcrgactaccccgatgcatacaccacatgaaayrtcc7260 
tatcatctgtrggytcattcatttctctaacagcagtaatattaataattttcatgatyt7320 gagaagccttcgcttcraagcgaaaartcctaatagtagaagaaccctccataaacctgg7380 agtgactayatggatgccccccrccctaccacacattcgaagarcccgtatacataaaat7440 ctaracaaaaaaggaaggaatcgaaccccccaaagytggtttcaagccaaccycatggcc7500 tccatgactttttcaaaaagrtattagaaaaaccatttcataactttgtcaaagttaaat7560 yataggctaartcctatatatcttaatggcacatgcagcrcaagtaggtctacaagacgc7620 tacwtcccctatcatagaagagctyatyacctttcatgaycacrccctcatartyatttt7680 ccttatctgcttyytartcctgtatgcccttttcctaacactcacaacaaaactaactaa7740 tacyaacatctcagacgctcaggaratrgaraccgtctgaactatcctgcccgccatcat7800 cctagtcctcatcgccctcccatccctacgcatcctttacataacagacgaggtcaayga7860 yccytcycttaccatcaaatcaattggccaccaatggtactgaacctacgagtacaccga7920 7g ctacggcggactratcttcaactcctayatacttcccccattattcctagaaccaggcga7980 cctgcgactccttgacgtygacaatcgagtagtrctcccrattgaarcccccattcgtat8040 aataattacatcacaagacgtcttgcactcatgagctgtycccacaytaggcttaaaaac8100 agatgcaattccmggacgtctaaaccaaaccactttcaccgytacacgrccrggrgtata8160 ctacggtcaatgctctgaaatctgyggagcaaaccacagyttcatrcccatcgtcctaga8220 attaattcccctaaaaatctttgaaatrggrcccgtattyaccctatarcaccccctcta8280 cccccyctagarcccacygtaaagctaacttagcattaaccttttaagttaaagattaag8340 agarccaacacctctttacagtgaaatgccccaactaaatactaccrtrtgrcccaccat8400 aatyacccccataytccttacactattyctcatcacccaactaaaaayattaaacacaar8460 ctaccacytacyyccctcaccaaarcccataaaaataaaaaattataacaaaccctgaga8520 accaaaatgaacgaaaatctgttcrcttcattyattgcccccrcartcctaggcctrccc8580 gccrcagtactgatcattctatttccccctctattgayccccacctccaaatatctcatc8640 aacaaccgactaatyaccacccaacaatgactaatcaaactaacctcaaaacaaatrata8700 rcyayacayaacactaaaggrcgaacctgatcycttatactagtatccttaatcattttt8760 attrccacaactaacctcctmggrctcctrccyyactcatttacrccaaccacccaacta8820 tctataaacctagccrtrgccatccccttatgagcrggcrcagtgattataggcytycgc8880 tctaagattaaaaatgccctagcccacttcytrccacaaggcacaccyacaccccttatc8940 ccyatactagttattatcgaarccatcagcctactcattcaaccaatagccctrgccgta9000 cgcctaaccgctaacattactgcaggccacctactcatgcayctaattggaarcrccacc9060 
ctagcaatatcraccaytaaccttccctcyacmcttatcatcytcacaattctrattctr9120 ctractatcctagaartcgctgtcgccttartccargcctacgttttcacactyctagta9180 agcctctacctgcacgacaacacataatgacccaccaatcrcatgcctatcatatartaa9240 arcccagyccatgacccctaacrggggccctytcagccctcctaatgacctccggyctag9300 ccatgtgattycacttccactccayaacgctcctyatactaggcctrctaaccaryacac9360 taaccatataccaatgrtggcgcgatgtaacacgagaaagcmcataccaaggccaccaca9420 caccacctgtccaaaaaggccttcgataygggatartcctatttattacctcagaarttt9480 ttttcttcgcaggatttttctgagccttytaccactccagcctagcccctaccccycaay9540 taggrggrcactgrccccsaacaggcatcaccccrctaaatcccctagaartcccactyc9600 taaacacatccgtattactcgcatcaggagtrtcaatcacctgagcycaccatagtctaa9660 tagaaarcaaccgaaaccaaayaattcaagcactgctyattacaattttactgggtctct9720 attttaccctcctacaagcctcagagtacttcgartctcccttcaccatttccgacggca9780 tctacggctcaacattttttgtagccacaggcttccayggamtwcacgtcattattggct9840 caactttcctcactatctgcttcatccgccaactaatatttcactttacatccaaacatc9900 actttggctt ygaagccgcc gcctgatact grcattttgt agatgtggty tgactayttc 9960 tgtatrtctc catctaytga tgagggtctt actcttttag tataaatagt accgttaact 10020 tccaattaac tagytttgac aacattcaaa aaagagtaat aaacttcgcc ttaattttaa 10080 taatcvacac cctcctagcc ttactactaa taatyatyac attttgacta ccacaactca 10140 ayggctacat rsaaaaatcc accccttacg artgcggctt csaccctata tcccccrccc 10200 gcgtcccttt ctccataaaa ttcttcttag tagctatyac cttcttatta ttygayctag 10260 aaattgccct ccttttaccc ctaccatgag ccctacaaac aactaacctr ccrctaatag 10320 ytatrtcatc cctcttatta atcatcatcc tagccctrag tctggcctay gagtgactac 10380 aaaaaggatt agactgarcy gaattggtay ataktttaaa caaaacraat gatttcgact 10440 cattaaatta tgataatcat atytaccaaa tgcccctcat ttacataaat attatactrg 10500 cattyaccat ctcacttcta ggaatactag tatatcgctc acacctcatr tcctccctac 10560 tatgcctaga aggaataata ctatcrctrt tcattatagc tactctcaya accctcaaca 10620 cccactccct cttagcyaay attgtrccta ttgccatayt agtyttygcc gcctgcgaag 10680 cagcggtrgg cctagcccta ctagtctcaa tctccaacac atatggccta gactaygtac 10740 ataacctaaa cctactccaa tgctaaaact aatcgtccca acaattatay trytaccact 10800 gacrtgacty 
tccaaaaarc acataatytg aatcaacaca accacccaca gcctaattat 10860 tagcatcatc ccyctrctat tttttaacca aatyaacaac aacctattta gctgytcccy 10920 aaccttttcc tccgacccyc taacaacccc cctcctaata ctaacyacct gactcctacc 10980 cctsacaatc atggcaagcc arcgccactt atccarygaa ccrctatcac gaaaaaaact 11040 ctacctctct atactaatct ccctacaaat ctccttartt ataacattca crgccacaga 11100 actaatcata ttttatatct tcttcgaaac cacacttatc cccaccytgr ctatcatcac 11160 ccgatgrggc arccarycag aacgcctgaa cgcaggcaca tacttcctat tctayaccct 11220 agtaggctcc cttcccctac tcatcgcact ratttayact cacaacaccc taggctcact 11280 aaacattcta ctactyacyc tcactgccca agaactatca aactcctgag cyaacaactt 11340 aatatgacta gcttacacaa trgcytttat agtaaarata cctctttacg gactccactt 11400 atgactccct aaagcccatg tcgaagcccc catcgctggg tcaatagtac ttgccgcagt 11460 actcttraaa ctaggyggct atggtataat acgcctcaca ctcattctca accccctgac 11520 aaaacacata gcctayccct tccttgtact atccctatga ggcataatta taacaagctc 11580 catctgcctr cgacaaacag acctaaaatc rctcattgca tactcttcaa tcagccacat 11640 rgccctcgta gtrrcagcca ttctcatcca aacyccctga agcttcaccg gcgcagtcat 11700 yctcataatc gcccacggrc tyacatcctc attactattc tgcctagcaa actcaaacta 11760 cgaacgyact cacagtcgca tcataatcct ctctcaagga cttcaaactc trctcccact 11820 aatagctttt tgatgacttc tagcaagcct cgcyaacctc gccttacccc ccactattaa 11880 cctrctrgga garctctcyg tgctagtarc cacrttctcc tgatcaaata tcactctcct 11940 actyacrgga ctcaacatrc tartcacarc cctatactcc ctctacatat ttaccacaac 12000 acaatgrggc tcactcaccc accacattaa caacataaaa ccctcattya cacgagaaaa 12060 caccctcatr ttcatacacc takcccccat tctcctccta tccctcaacc ccgacatcat 12120 yaccgggttt tccycttgta aatatagttt aaycaaaaca tcagattgtg artcygacaa 12180 cagaggctta cgacccctta tttaccgaga aagctcacaa gaactgctaa ctcrtrccyc 12240 catgtctrac aacatggctt tctcaacttt taaaggataa cagctatcca ttggtcttag 12300 gccccaaraa ttttggtgca actccaaata aaagtaataa ccatgyacac tactatarcc 12360 rccctaaccc trrcttccct aattcccccc atccttrcca ccctcrttaa cccyaacaaa 12420 aaaaactcat acccccatta tgtaaaatcc attgtcgcat ccacctttat tatcagyctc 
12480 ttccccacaa caatattcat rtgcctrgac caagaagtya ttatctcraa ctgacactgr 12540 gccacaaccc aaacaaccca gctctcccta agcttcaaac tagactactt ctccataata 12600 ttcatccctg trgcattgtt cgttacatgr tcyaycatag aattctcact gtgatatata 12660 aactcagayc craacattaa tcagttcttc aartatctac tcatyttcct aattaccatr 12720 ctaatcttag ttaccgcyaa caacctattc caactgttca tcggctgrga rggcgtagga 12780 attatatcct tcttgctcat cagttgatgr tacgcccgag crgatgccaa cacagcagcc 12840 attcaagcar tcctatacaa ccgtatcggc gatatcggyt tyatcctcgc cttagcatga 12900 tttatcctac actccaactc atgagacccw caacaaatar cccttctraa cgctaatcca 12960 agcctcmccc crctactagg cctcctccta gcagcagcrg gcaaatcagc ccaattaggy 13020 ctccacccct gactcccctc agccatagaa ggccccacyc cagtctcrgc cctactccac 13080 tcaagcacta tagttgtagc mggrrtcttc ttactcatcc gcttccaccc cctarcagaa 13140 aayarcccrc taatccaaac tctaacacta tgcttaggcg ctatcaccac tctrttygca 13200 gcagtctgcg cycttacaca raatgacatc aaaaaaatcg tagccttctc cacttcaagt 13260 carctaggac tcatartagt yacaatcggc atcaaccaac cacacctagc attcctgcac 13320 atctgtaccc acgccttctt caaagccata ctatttatgt gctccggrtc catcatccac 13380 aaccttaaca atgaacaaga tattcgaaaa ataggaggac tactcaaaac catacctcts 13440 acttcaacct ccctcaccat tggcagccta gcattarcag gaatrccttt cctyacaggb 13500 ttctaytcca argaccacat catcgaaacc gcaaacatat catacacaaa cgcctgagcc 13560 ctrtctatta ctctcatcgc tacctccctr acargcgcct ayagcactcg rataatyctt 13620 ctcaccctaa caggtcaacc ycgcttcccy rcccttactr acattaacga aaataacccc 13680 accctactaa accccattaa acgcctgrca gccggaagcc trttcgcagg attyctcatt 13740 actaacaaca tttcccccrc atcccccttc caaacaacar tccccctcya cctaaaactc 13800 acrgccctcg cygtcacyyt cctaggrctt ctaacagccc tagacctcaa ctacctaacc 13860 aacaaactta aaataaaatm cccacyatgc acattttatt tctccaacat actmggattc 13920 tacyctwsca tcacacaccg cacaatcccc tatctagscc ttctyrcgag ccaaaacctr 13980 cccctactcc tcctagaccw aacctgacta gaaaarctay trccyaaaac aatytcacag 14040 caccaaatct ccacctccrt catcacctcd acccaaaaag gcataatyaa actytayttc 14100 ctctctttct tcttcccrct catcctarcc ctactcctaa 
tcacatarcc trttcccccg 14160 agcaatytca attacaayat ayacaccaac aaacaatgty carccagtra cyacyactaa 14220 ycaacgccca tartcataca aagcccccgc accaatagga tcctcccgaa tsaaccctga 14280 cccytctcct tcataaatta ttcagctycc yacactayya aagtttacca caaccaccac 14340 cccatcatac tctttcaccc acagcaccaa yccyacctcc atcsctaacc ccactaaaac 14400 actcaccaag acctcaaccc ctgaccccca tgcctcagga tactcctcaa tagcyatcrc 14460 tgtagtatay ccaaagacaa ccaycatycc ccctaaataa aytaaaaaaa ctattaaacc 14520 catataacct cccccaaaat tcagaataat aacacacccr accacrccrc waacaatcar 14580 tactaarccc ccataaatag gagarggctt agaagaaaac cccacaaacc ccattactaa 14640 acccacactc aacagaaaca aagcatayat cattattctc gcacggacta carccacgac 14700 caatgatatg aaaaaccatc gttgtatttc aactacaaga acaccaatga ccccaatacg 14760 caaaaytarc cccctaataa aaytaattaa ccrctcaytc atcgacctcc cyaccccatc 14820 caacatctcc gcatgrtgaa acttcggctc actccttggc ryctgcctga tcctccaaat 14880 caccacagga ctattcctag ccatrcacta ytcaccagac gcctcaaccg ccttttcatc 14940 aatcgcccac atcactcgag acgtaaatta yggstgaayc atccgctacc ttcacgccaa 15000 tggcgcctca atattyttta tctgcctctt cctrcacatc ggrcgaggcc tatattacgg 15060 atcatttctc tactcagaaa cctgaaacat cggcattatc ctcctgcttr carcyatagc 15120 aacagccttc ataggytatg tcctcccgtg aggccaaata tcattctgag grgccacagt 15180 aattacaaac ttactatccg ccaycccata cattggrmca gacctagtyc aatgaatstg 15240 aggrggctac tcagtaraca rtcccaccct cacacgattc tttacctttc acttcatctt 15300 rcccttcatt attgcarycc tarcarcact ccacctccta ttcttrcacg aaacgggrtc 15360 aaacaacccc ctaggaatca cctcccattc cgataaaatc accttccacc cttactacac 15420 aatcaaagac rccctcggct trcttctctt cmttctctcc ttaatracay taacactatt 15480 ctcaccwgac ctcctargcg acccagacaa ttayacccya gccaacccct taaayacccc 15540 tccccacatc aagcccgaat gatatttcct attcgcctac acaattctcc gatccgtccc 15600 taacaarcta ggaggcgtcc ttgccytayt actatccatc ctcatyctag caataatccc 15660 yaycctccay atatccaaac aacaaagcat aatatttcgc ccactaagcc aatcacttta 15720 ttgrctccta rccgcagacc tcctcrttct aacctgaatc ggaggrcaac cagtaagcta 15780 cccytttacc atyattggac 
aartarcatc crtactatac ttcrcaacaa tcytaatcct 15840 aataccaayt atctccctaa ttgaaaacaa aatactcaaa tggscctgtc cttgtagtay 15900 aaaytartac accagtcttg taarccrrar aygaaaacyt yyttccaagg acaaatcaga 15960 gaaaaagyct ttaactccac cattagcacc caaagctaag attctaattt aaactaytct 16020 ctgttctttc atggggargc agatttgggt rccacccaag tattgactya yccaycaaca 16080 accgcyatgt atytcgtaca ttactgcyag ycamcatgaa tatygyacvg taccataaay 16140 actyrayyac ctrtagtaca trmaamyyya ryccryatca ammyyyy~ryc cyyatgctta 16200 caagcargya crryaaycra ccyycarcyr yyayrcatya ryygyarcyc caamryyrcy 16260 yctymycyay yagratayca acarasyyay yyrycytyaa cagyacatrg yacatrwwry 16320 catyyrycgt acatagcaca ttryagtcaa atcyyyycty gycccyaygg atgacccccc 16380 tcagataggr rtcccttgrc caccatcctc cgtgaaatca atatcccgca caagagtrmt 16440 actctcctcg ctccgggccc ataacacttg ggggtagcta aartgaactg tatccgacat 16500 ctggttccta cttcagggyc ataaagycta aatagcccac acgttcccct taaataagac 16560 atcacgatg 16569 (3) INFORMATION FOR SEQ ID N0:2:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 16569 base pairs
(B) TYPE: nucleic acid
(ii) MOLECULE TYPE: DNA
(A) DESCRIPTION: /desc = "human mitochondrial DNA revised Cambridge reference sequence"
(iii) HYPOTHETICAL: NO
(ix) FEATURE
(A) NAME/KEY: misc_feature
(B) LOCATION: (3106)..(3106)
(D) OTHER INFORMATION: n at position 3106 is a deletion
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:2:
gatcacaggtctatcaccctattaaccactcacgggagctctccatgcatttggtatttt 60 cgtctggggggtatgcacgcgatagcattgcgagacgctggagccggagcaccctatgtc 120 gcagtatctgtctttgattcctgcctcatcctattatttatcgcacctacgttcaatatt 180 acaggcgaacatacttactaaagtgtgttaattaattaatgcttgtaggacataataata 240 acaattgaatgtctgcacagccactttccacacagacatcataacaaaaaatttccacca 300 aaccccccctcccccgcttctggccacagcacttaaacacatctctgccaaaccccaaaa 360 acaaagaaccctaacaccagcctaaccagatttcaaattttatcttttggcggtatgcac 420 ttttaacagtcaccccccaactaacacattattttcccctcccactcccatactactaat 480 ctcatcaatacaacccccgcccatcctacccagcacacacacaccgctgctaaccccata 540 ccccgaaccaaccaaaccccaaagacaccccccacagtttatgtagcttacctcctcaaa 600 gcaatacactgaaaatgtttagacgggctcacatcaccccataaacaaataggtttggtc 660 ctagcctttctattagctcttagtaagattacacatgcaagcatccccgttccagtgagt720 tcaccctctaaatcaccacgatcaaaaggaacaagcatcaagcacgcagcaatgcagctc780 aaaacgcttagcctagccacacccccacgggaaacagcagtgattaacctttagcaataa840 acgaaagtttaactaagctatactaaccccagggttggtcaatttcgtgccagccaccgc900 ggtcacacgattaacccaagtcaatagaagccggcgtaaagagtgttttagatcaccccc960 tccccaataaagctaaaactcacctgagttgtaaaaaactccagttgacacaaaatagac1020 tacgaaagtggctttaacatatctgaacacacaatagctaagacccaaactgggattaga1080 taccccactatgcttagccctaaacctcaacagttaaatcaacaaaactgctcgccagaa1140 cactacgagccacagcttaaaactcaaaggacctggcggtgcttcatatccctctagagg1200 agcctgttctgtaatcgataaaccccgatcaacctcaccacctcttgctcagcctatata1260 ccgccatcttcagcaaaccctgatgaaggctacaaagtaagcgcaagtacccacgtaaag1320 acgttaggtcaaggtgtagcccatgaggtggcaagaaatgggctacattttctaccccag1380 aaaactacgatagcccttatgaaacttaagggtcgaaggtggatttagcagtaaactaag1440 agtagagtgcttagttgaacagggccctgaagcgcgtacacaccgcccgtcaccctcctc1500 aagtatacttcaaaggacatttaactaaaacccctacgcatttatatagaggagacaagt1560 cgtaacatggtaagtgtactggaaagtgcacttggacgaaccagagtgtagcttaacaca1620 aagcacccaacttacacttaggagatttcaacttaacttgaccgctctgagctaaaccta1680 gccccaaacccactccaccttactaccagacaaccttagccaaaccatttacccaaataa1740 agtataggcgatagaaattgaaacctggcgcaatagatatagtaccgcaagggaaagatg1800 
aaaaattataaccaagcataatatagcaaggactaacccctataccttctgcataatgaa1860 ttaactagaaataactttgcaaggagagccaaagctaagacccccgaaaccagacgagct1920 acctaagaacagctaaaagagcacacccgtctatgtagcaaaatagtgggaagatttata1980 ggtagaggcgacaaacctaccgagcctggtgatagctggttgtccaagatagaatcttag2040 ttcaactttaaatttgcccacagaaccctctaaatccccttgtaaatttaactgttagtc2100 caaagaggaacagctctttggacactaggaaaaaaccttgtagagagagtaaaaaattta2160 acacccatagtaggcctaaaagcagccaccaattaagaaagcgttcaagctcaacaccca2220 ctacctaaaaaatcccaaacatataactgaactcctcacacccaattggaccaatctatc2280 accctatagaagaactaatgttagtataagtaacatgaaaacattctcctccgcataagc2340 ctgcgtcagattaaaacactgaactgacaattaacagcccaatatctacaatcaaccaac2400 aagtcattattaccctcactgtcaacccaacacaggcatgctcataaggaaaggttaaaa2460 aaagtaaaaggaactcggcaaatcttaccccgcctgtttaccaaaaacatcacctctagc2520 atcaccagtattagaggcaccgcctgcccagtgacacatgtttaacggccgcggtaccct2580 aaccgtgcaaaggtagcataatcacttgttccttaaatagggacctgtatgaatggctcc2640 acgagggttcagctgtctcttacttttaaccagtgaaattgacctgcccgtgaagaggcg2700 ggcataacacagcaagacgagaagaccctatggagctttaatttattaatgcaaacagta2760 cctaacaaacccacaggtcctaaactaccaaacctgcattaaaaatttcggttggggcga2820 cctcggagcagaacccaacctccgagcagtacatgctaagacttcaccagtcaaagcgaa2880 ctactatactcaattgatccaataacttgaccaacggaacaagttaccctagggataaca2940 gcgcaatcctattctagagtccatatcaacaatagggtttacgacctcgatgttggatca3000 ggacatcccgatggtgcagccgctattaaaggttcgtttgttcaacgattaaagtcctac3060 gtgatctgagttcagaccggagtaatccaggtcggtttctatctancttcaaattcctcc3120 ctgtacgaaaggacaagagaaataaggcctacttcacaaagcgccttcccccgtaaatga3180 tatcatctcaacttagtattatacccacacccacccaagaacagggtttgttaagatggc3240 agagcccggtaatcgcataaaacttaaaactttacagtcagaggttcaattcctcttctt3300 aacaacatacccatggccaacctcctactcctcattgtacccattctaatcgcaatggca3360 ttcctaatgcttaccgaacgaaaaattctaggctatatacaactacgcaaaggccccaac3420 gttgtaggcccctacgggctactacaacccttcgctgacgccataaaactcttcaccaaa3480 gagcccctaaaacccgccacatctaccatcaccctctacatcaccgccccgaccttagct3540 ctcaccatcgctcttctactatgaacccccctccccatacccaaccccctggtcaacctc3600 
aacctaggcctcctatttattctagccacctctagcctagccgtttactcaatcctctga3660 tcagggtgagcatcaaactcaaactacgccctgatcggcgcactgcgagcagtagcccaa3720 acaatctcatatgaagtcaccctagccatcattctactatcaacattactaataagtggc3780 tcctttaacctctccacccttatcacaacacaagaacacctctgattactcctgccatca3840 tgacccttggccataatatgatttatctccacactagcagagaccaaccgaacccccttc3900 gaccttgccgaaggggagtccgaactagtctcaggcttcaacatcgaatacgccgcaggc3960 cccttcgccctattcttcatagccgaatacacaaacattattataataaacaccctcacc4020 actacaatcttcctaggaacaacatatgacgcactctcccctgaactctacacaacatat4080 tttgtcaccaagaccctacttctaacctccctgttcttatgaattcgaacagcatacccc4140 cgattccgctacgaccaactcatacacctcctatgaaaaaacttcctaccactcacccta4200 gcattacttatatgatatgtctccatacccattacaatctccagcattccccctcaaacc4260 taagaaatatgtctgataaaagagttactttgatagagtaaataataggagcttaaaccc4320 ccttatttctaggactatgagaatcgaacccatccctgagaatccaaaattctccgtgcc4380 acctatcacaccccatcctaaagtaaggtcagctaaataagctatcgggcccataccccg4440 aaaatgttggttatacccttcccgtactaattaatcccctggcccaacccgtcatctact4500 ctaccatctttgcaggcacactcatcacagcgctaagctcgcactgattttttacctgag4560 taggcctagaaataaacatgctagcttttattccagttctaaccaaaaaaataaaccctc4620 gttccacagaagctgccatcaagtatttcctcacgcaagcaaccgcatccataatccttc4680 taatagctatcctcttcaacaatatactctccggacaatgaaccataaccaatactacca4740 atcaatactcatcattaataatcataatagctatagcaataaaactaggaatagccccct4800 ttcacttctgagtcccagaggttacccaaggcacccctctgacatccggcctgcttcttc4860 tcacatgacaaaaactagcccccatctcaatcatataccaaatctctccctcactaaacg4920 taagccttctcctcactctctcaatcttatccatcatagcaggcagttgaggtggattaa4980 accaaacccagctacgcaaaatcttagcatactcctcaattacccacataggatgaataa5040 tagcagttctaccgtacaaccctaacataaccattcttaatttaactatttatattatcc5100 taactactaccgcattcctactactcaacttaaactccagcaccacgaccctactactat5160 ctcgcacctgaaacaagctaacatgactaacacccttaattccatccaccctcctctccc5220 taggaggcctgcccccgctaaccggctttttgcccaaatgggccattatcgaagaattca5280 caaaaaacaatagcctcatcatccccaccatcatagccaccatcaccctccttaacctct5340 acttctacctacgcctaatctactccacctcaatcacactactccccatatctaacaacg5400 
taaaaataaaatgacagtttgaacatacaaaacccaccccattcctccccacactcatcg5460 cccttaccacgctactcctacctatctccccttttatactaataatcttatagaaattta5520 ggttaaatacagaccaagagccttcaaagccctcagtaagttgcaatacttaatttctgt5580 aacagctaaggactgcaaaaccccactctgcatcaactgaacgcaaatcagccactttaa5640 ttaagctaagcccttactagaccaatgggacttaaacccacaaacacttagttaacagct5700 aagcaccctaatcaactggcttcaatctacttctcccgccgccgggaaaaaaggcgggag5760 aagccccggcaggtttgaagctgcttcttcgaatttgcaattcaatatgaaaatcacctc5820 ggagctggtaaaaagaggcctaacccctgtctttagatttacagtccaatgcttcactca5880 gccattttacctcacccccactgatgttcgccgaccgttgactattctctacaaaccaca5940 aagacattggaacactatacctattattcggcgcatgagctggagtcctaggcacagctc6000 taagcctccttattcgagccgagctgggccagccaggcaaccttctaggtaacgaccaca6060 tctacaacgttatcgtcacagcccatgcatttgtaataatcttcttcatagtaataccca6120 tcataatcggaggctttggcaactgactagttcccctaataatcggtgcccccgatatgg6180 cgtttccccgcataaacaacataagcttctgactcttacctccctctctcctactcctgc6240 tcgcatctgctatagtggaggccggagcaggaacaggttgaacagtctaccctcccttag6300 cagggaactactcccaccctggagcctccgtagacctaaccatcttctccttacacctag6360 caggtgtctcctctatcttaggggccatcaatttcatcacaacaattatcaatataaaac6420 cccctgccataacccaataccaaacgcccctcttcgtctgatccgtcctaatcacagcag6480 tcctacttctcctatctctcccagtcctagctgctggcatcactatactactaacagacc6540 gcaacctcaacaccaccttcttcgaccccgccggaggaggagaccccattctataccaac6600 acctattctgatttttcggtcaccctgaagtttatattcttatcctaccaggcttcggaa6660 taatctcccatattgtaacttactactccggaaaaaaagaaccatttggatacataggta6720 tggtctgagctatgatatcaattggcttcctagggtttatcgtgtgagcacaccatatat6780 ttacagtaggaatagacgtagacacacgagcatatttcacctccgctaccataatcatcg6840 ctatccccaccggcgtcaaagtatttagctgactcgccacactccacggaagcaatatga6900 aatgatctgctgcagtgctctgagccctaggattcatctttcttttcaccgtaggtggcc6960 tgactggcattgtattagcaaactcatcactagacatcgtactacacgacacgtactacg7020 ttgtagcccacttccactatgtcctatcaataggagctgtatttgccatcataggaggct7080 tcattcactgatttcccctattctcaggctacaccctagaccaaacctacgccaaaatcc7140 atttcactatcatattcatcggcgtaaatctaactttcttcccacaacactttctcggcc7200 
tatccggaatgccccgacgttactcggactaccccgatgcatacaccacatgaaacatcc7260 tatcatctgtaggctcattcatttctctaacagcagtaatattaataattttcatgattt7320 gagaagccttcgcttcgaagcgaaaagtcctaatagtagaagaaccctccataaacctgg7380 agtgactatatggatgccccccaccctaccacacattcgaagaacccgtatacataaaat7440 ctagacaaaaaaggaaggaatcgaaccccccaaagctggtttcaagccaaccccatggcc7500 tccatgactttttcaaaaaggtattagaaaaaccatttcataactttgtcaaagttaaat7560 tataggctaaatcctatatatcttaatggcacatgcagcgcaagtaggtctacaagacgc7620 tacttcccctatcatagaagagcttatcacctttcatgatcacgccctcataatcatttt7680 ccttatctgcttcctagtcctgtatgcccttttcctaacactcacaacaaaactaactaa7740 tactaacatctcagacgctcaggaaatagaaaccgtctgaactatcctgcccgccatcat7800 cctagtcctcatcgccctcccatccctacgcatcctttacataacagacgaggtcaacga7860 tccctcccttaccatcaaatcaattggccaccaatggtactgaacctacgagtacaccga7920 ctacggcggactaatcttcaactcctacatacttcccccattattcctagaaccaggcga7980 cctgcgactccttgacgttgacaatcgagtagtactcccgattgaagcccccattcgtat8040 aataattacatcacaagacgtcttgcactcatgagctgtccccacattaggcttaaaaac8100 agatgcaattcccggacgtctaaaccaaaccactttcaccgctacacgaccgggggtata8160 ctacggtcaatgctctgaaatctgtggagcaaaccacagtttcatgcccatcgtcctaga8220 attaattcccctaaaaatctttgaaatagggcccgtatttaccctatagcaccccctcta8280 ccccctctagagcccactgtaaagctaacttagcattaaccttttaagttaaagattaag8340 agaaccaacacctctttacagtgaaatgccccaactaaatactaccgtatggcccaccat8400 aattacccccatactccttacactattcctcatcacccaactaaaaatattaaacacaaa8460 ctaccaccta cctccctcac caaagcccat aaaaataaaa aattataaca aaccctgaga 8520 accaaaatga acgaaaatct gttcgcttca ttcattgccc ccacaatcct aggcctaccc 8580 gccgcagtac tgatcattct atttccccct ctattgatcc ccacctccaa atatctcatc 8640 aacaaccgac taatcaccac ccaacaatga ctaatcaaac taacctcaaa acaaatgata 8700 accatacaca acactaaagg acgaacctga tctcttatac tagtatcctt aatcattttt 8760 attgccacaa ctaacctcct cggactcctg cctcactcat ttacaccaac cacccaacta 8820 tctataaacc tagccatggc catcccctta tgagcgggca cagtgattat aggctttcgc 8880 tctaagatta aaaatgccct agcccacttc ttaccacaag gcacacctac accccttatc 8940 cccatactag ttattatcga aaccatcagc ctactcattc aaccaatagc cctggccgta 
9000 cgcctaaccg ctaacattac tgcaggccac ctactcatgc acctaattgg aagcgccacc 9060 ctagcaatat caaccattaa ccttccctct acacttatca tcttcacaat tctaattcta 9120 ctgactatcc tagaaatcgc tgtcgcctta atccaagcct acgttttcac acttctagta 9180 agcctctacc tgcacgacaa cacataatga cccaccaatc acatgcctat catatagtaa 9240 aacccagccc atgaccccta acaggggccc tctcagccct cctaatgacc tccggcctag 9300 ccatgtgatt tcacttccac tccataacgc tcctcatact aggcctacta accaacacac 9360 taaccatata ccaatgatgg cgcgatgtaa cacgagaaag cacataccaa ggccaccaca 9420 caccacctgt ccaaaaaggc cttcgatacg ggataatcct atttattacc tcagaagttt 9480 ttttcttcgc aggatttttc tgagcctttt accactccag cctagcccct accccccaat 9540 taggagggca ctggccccca acaggcatca ccccgctaaa tcccctagaa gtcccactcc 9600 taaacacatc cgtattactc gcatcaggag tatcaatcac ctgagctcac catagtctaa 9660 tagaaaacaa ccgaaaccaa ataattcaag cactgcttat tacaatttta ctgggtctct 9720 attttaccct cctacaagcc tcagagtact tcgagtctcc cttcaccatt tccgacggca 9780 tctacggctc aacatttttt gtagccacag gcttccacgg acttcacgtc attattggct 9840 caactttcct cactatctgc ttcatccgcc aactaatatt tcactttaca tccaaacatc 9900 actttggctt cgaagccgcc gcctgatact ggcattttgt agatgtggtt tgactatttc 9960 tgtatgtctc catctattga tgagggtctt actcttttag tataaatagt accgttaact 10020 tccaattaac tagttttgac aacattcaaa aaagagtaat aaacttcgcc ttaattttaa 10080 taatcaacac cctcctagcc ttactactaa taattattac attttgacta ccacaactca 10140 acggctacat agaaaaatcc accccttacg agtgcggctt cgaccctata tcccccgccc 10200 gcgtcccttt ctccataaaa ttcttcttag tagctattac cttcttatta tttgatctag 10260 aaattgccct ccttttaccc ctaccatgag ccctacaaac aactaacctg ccactaatag 10320 ttatgtcatc cctcttatta atcatcatcc tagccctaag tctggcctat gagtgactac 10380 aaaaaggatt agartgaacc gaattggtat atagtttaaa caaaacgaat gatttcgact 10440 cattaaatta tgataatcat atttaccaaa tgcccctcat ttacataaat attatactag 10500 catttaccat ctcacttcta ggaatactag tatatcgctc acacctcata tcctccctac 10560 8s tatgcctaga aggaataata ctatcgctgt tcattatagc tactctcata accctcaaca 10620 cccactccct cttagccaat attgtgccta ttgccatact agtctttgcc 
gcctgcgaag 10680 cagcggtggg cctagcccta ctagtctcaa tctccaacac atatggccta gactacgtac 10740 ataacctaaa cctactccaa tgctaaaact aatcgtccca acaattatat tactaccact 10800 gacatgactt tccaaaaaac acataatttg aatcaacaca accacccaca gcctaattat 10860 tagcatcatc cctctactat tttttaacca aatcaacaac aacctattta gctgttcccc 10920 aaccttttcc tccgaccccc taacaacccc cctcctaata ctaactacct gactcctacc 10980 cctcacaatc atggcaagcc aacgccactt atccagtgaa ccactatcac gaaaaaaact 11040 ctacctctct atactaatct ccctacaaat ctccttaatt ataacattca cagccacaga 11100 actaatcata ttttatatct tcttcgaaac cacacttatc cccaccttgg ctatcatcac 11160 ccgatgaggc aaccagccag aacgcctgaa cgcaggcaca tacttcctat tctacaccct 11220 agtaggctcc cttcccctac tcatcgcact aatttacact cacaacaccc taggctcact 11280 aaacattcta ctactcactc tcactgccca agaactatca aactcctgag ccaacaactt 11340 aatatgacta gcttacacaa tagcttttat agtaaagata cctctttacg gactccactt 11400 atgactccct aaagcccatg tcgaagcccc catcgctggg tcaatagtac ttgccgcagt 11460 actcttaaaa ctaggcggct atggtataat acgcctcaca ctcattctca accccctgac 11520 aaaacacata gcctacccct tccttgtact atccctatga ggcataatta taacaagctc 11580 catctgccta cgacaaacag acctaaaatc gctcattgca tactcttcaa tcagccacat 11640 agccctcgta gtaacagcca ttctcatcca aaccccctga agcttcaccg gcgcagtcat 11700 tctcataatc gcccacgggc ttacatcctc attactattc tgcctagcaa actcaaacta 11760 cgaacgcact cacagtcgca tcataatcct ctctcaagga cttcaaactc tactcccact 11820 aatagctttt tgatgacttc tagcaagcct cgctaacctc gccttacccc ccactattaa 11880 cctactggga gaactctctg tgctagtaac cacgttctcc tgatcaaata tcactctcct 11940 acttacagga ctcaacatac tagtcacagc cctatactcc ctctacatat ttaccacaac 12000 acaatggggc tcactcaccc accacattaa caacataaaa ccctcattca cacgagaaaa 12060 caccctcatg ttcatacacc tatcccccat tctcctccta tccctcaacc ccgacatcat 12120 taccgggttt tcctcttgta aatatagttt aaccaaaaca tcagattgtg aatctgacaa 12180 cagaggctta cgacccctta tttaccgaga aagctcacaa gaactgctaa ctcatgcccc 12240 catgtctaac aacatggctt tctcaacttt taaaggataa cagctatcca ttggtcttag 12300 gccccaaaaa ttttggtgca actccaaata 
aaagtaataa ccatgcacac tactataacc 12360 accctaaccc tgacttccct aattcccccc atccttacca ccctcgttaa ccctaacaaa 12420 aaaaactcat acccccatta tgtaaaatcc attgtcgcat ccacctttat tatcagtctc 12480 ttccccacaa caatattcat gtgcctagac caagaagtta ttatctcgaa ctgacactga 12540 gccacaaccc aaacaaccca gctctcccta agcttcaaac tagactactt ctccataata 12600 ttcatccctg tagcattgtt cgttacatgg tccatcatag aattctcact gtgatatata 12660 aactcagacc caaacattaa tcagttcttc aaatatctac tcatcttcct aattaccata 12720 ctaatcttag ttaccgctaa caacctattc caactgttca tcggctgaga gggcgtagga 12780 attatatcct tcttgctcat cagttgatga tacgcccgag cagatgccaa cacagcagcc 12840 attcaagcaa tcctatacaa ccgtatcggc gatatcggtt tcatcctcgc cttagcatga 12900 tttatcctac actccaactc atgagaccca caacaaatag cccttctaaa cgctaatcca 12960 agcctcaccc cactactagg cctcctccta gcagcagcag gcaaatcagc ccaattaggt 13020 ctccacccct gactcccctc agccatagaa ggccccaccc cagtctcagc cctactccac 13080 tcaagcacta tagttgtagc aggaatcttc ttactcatcc gcttccaccc cctagcagaa 13140 aatagcccac taatccaaac tctaacacta tgcttaggcg ctatcaccac tctgttcgca 13200 gcagtctgcg cccttacaca aaatgacatc aaaaaaatcg tagccttctc cacttcaagt 13260 caactaggac tcataatagt tacaatcggc atcaaccaac cacacctagc attcctgcac 13320 atctgtaccc acgccttctt caaagccata ctatttatgt gctccgggtc catcatccac 13380 aaccttaaca atgaacaaga tattcgaaaa ataggaggac tactcaaaac catacctctc 13440 acttcaacct ccctcaccat tggcagccta gcattagcag gaataccttt cctcacaggt 13500 ttctactcca aagaccacat catcgaaacc gcaaacatat catacacaaa cgcctgagcc 13560 ctatctatta ctctcatcgc tacctccctg acaagcgcct atagcactcg aataattctt 13620 ctcaccctaa caggtcaacc tcgcttcccc acccttacta acattaacga aaataacccc 13680 accctactaa accccattaa acgcctggca gccggaagcc tattcgcagg atttctcatt 13740 actaacaaca tttcccccgc atcccccttc caaacaacaa tccccctcta cctaaaactc 13800 acagccctcg ctgtcacttt cctaggactt ctaacagccc tagacctcaa ctacctaacc 13860 aacaaactta aaataaaatc cccactatgc acattttatt tctccaacat actcggattc 13920 taccctagca tcacacaccg cacaatcccc tatctaggcc ttcttacgag ccaaaacctg 13980 cccctactcc 
tcctagacct aacctgacta gaaaagctat tacctaaaac aatttcacag 14040 caccaaatct ccacctccat catcacctca acccaaaaag gcataattaa actttacttc 14100 ctctctttct tcttcccact catcctaacc ctactcctaa tcacataacc tattcccccg 14160 agcaatctca attacaatat atacaccaac aaacaatgtt caaccagtaa ctactactaa 14220 tcaacgccca taatcataca aagcccccgc accaatagga tcctcccgaa tcaaccctga 14280 cccctctcct tcataaatta ttcagcttcc tacactatta aagtttacca caaccaccac 14340 cccatcatac tctttcaccc acagcaccaa tcctacctcc atcgctaacc ccactaaaac 14400 actcaccaag acctcaaccc ctgaccccca tgcctcagga tactcctcaa tagccatcgc 14460 tgtagtatat ccaaagacaa ccatcattcc ccctaaataa attaaaaaaa ctattaaacc 14520 catataacct cccccaaaat tcagaataat aacacacccg accacaccgc taacaatcaa 14580 tactaaaccc ccataaatag gagaaggctt agaagaaaac cccacaaacc ccattactaa 14640 acccacactc aacagaaaca aagcatacat cattattctc gcacggacta caaccacgac 14700 caatgatatg aaaaaccatc gttgtatttc aactacaaga acaccaatga ccccaatacg 14760 caaaactaac cccctaataa aattaattaa ccactcattc atcgacctcc ccaccccatc 14820 caacatctcc gcatgatgaa acttcggctc actccttggc gcctgcctga tcctccaaat 14880 caccacagga ctattcctag ccatgcacta ctcaccagac gcctcaaccg ccttttcatc 14940 aatcgcccac atcactcgag acgtaaatta tggctgaatc atccgctacc ttcacgccaa 15000 tggcgcctca atattcttta tctgcctctt cctacacatc gggcgaggcc tatattacgg 15060 atcatttctc tactcagaaa cctgaaacat cggcattatc ctcctgcttg caactatagc 15120 aacagccttc ataggctatg tcctcccgtg aggccaaata tcattctgag gggccacagt 15180 aattacaaac ttactatccg ccatcccata cattgggaca gacctagttc aatgaatctg 15240 aggaggctac tcagtagaca gtcccaccct cacacgattc tttacctttc acttcatctt 15300 gcccttcatt attgcagccc tagcaacact ccacctccta ttcttgcacg aaacgggatc 25360 aaacaacccc ctaggaatca cctcccattc cgataaaatc accttccacc cttactacac 15420 aatcaaagac gccctcggct tacttctctt ccttctctcc ttaatgacat taacactatt 15480 ctcaccagac ctcctaggcg acccagacaa ttatacccta gccaacccct taaacacccc 15540 tccccacatc aagcccgaat gatatttcct attcgcctac acaattctcc gatccgtccc 15600 taacaaacta ggaggcgtcc ttgccctatt actatccatc ctcatcctag caataatccc 
15660 catcctccat atatccaaac aacaaagcat aatatttcgc ccactaagcc aatcacttta 15720 ttgactccta gccgcagacc tcctcattct aacctgaatc ggaggacaac cagtaagcta 15780 cccttttacc atcattggac aagtagcatc cgtactatac ttcacaacaa tcctaatcct 15840 aataccaact atctccctaa ttgaaaacaa aatactcaaa tgggcctgtc cttgtagtat 15900 aaactaatac accagtcttg taaaccggag atgaaaacct ttttccaagg acaaatcaga 15960 gaaaaagtct ttaactccac cattagcacc caaagctaag attctaattt aaactattct 16020 ctgttctttc atggggaagc agatttgggt accacccaag tattgactca cccatcaaca 16080 accgctatgt atttcgtaca ttactgccag ccaccatgaa tattgtacgg taccataaat 16140 acttgaccac ctgtagtaca taaaaaccca atccacatca aaaccccctc cccatgctta 16200 caagcaagta cagcaatcaa ccctcaacta tcacacatca actgcaactc caaagccacc 16260 cctcacccac taggatacca acaaacctac ccacccttaa cagtacatag tacataaagc 16320 catttaccgt acatagcaca ttacagtcaa atcccttctc gtccccatgg atgacccccc 16380 tcagataggg gtcccttgac caccatcctc cgtgaaatca atatcccgca caagagtgct 16440 actctcctcg ctccgggccc ataacacttg ggggtagcta aagtgaactg tatccgacat 16500 ctggttccta cttcagggtc ataaagccta aatagcccac acgttcccct taaataagac 16560 atcacgatg 16569

Claims (43)

1. A method for diagnosing a haplogroup of a human comprising:
a) providing a sample comprising mitochondrial nucleic acid from said human; and
b) identifying, in said sample, the presence or absence of at least one nucleotide allele diagnostic of a haplogroup.
2. The method of claim 1 wherein said haplogroup is haplogroup L1 and wherein method step b) comprises identifying in said sample at least one nucleotide allele selected from the group consisting of 825A, 2758A, 2885C, 7146G, 8468T, 8655T, 10688A, 10810C, and 13105G.
3. The method of claim 1 wherein said haplogroup is haplogroup L2 and wherein method step b) comprises identifying in said sample at least one nucleotide allele selected from the group consisting of 2416C, 2758G, 8206A, 9221G, 11944C, and 16390G.
4. The method of claim 1 wherein said haplogroup is haplogroup L3 and wherein method step b) comprises identifying in said sample at least one nucleotide allele selected from the group consisting of 10819G, 14212C, 8618C, 10086C, 16362C, 10398A, and 16124C.
5. The method of claim 1 wherein said haplogroup is haplogroup C and wherein method step b) comprises identifying in said sample at least one nucleotide allele selected from the group consisting of 3552C, 4715G, 7196A, 8584A, 9545G, 13263G, 14318C, and 16327T.
6. The method of claim 1 wherein said haplogroup is haplogroup D and wherein method step b) comprises identifying in said sample at least one nucleotide allele selected from the group consisting of 4883T, 5178A, 8414T, 14668T, and 15487T.
7. The method of claim 1 wherein said haplogroup is haplogroup E and wherein method step b) comprises identifying in said sample the nucleotide allele 16227G.
8. The method of claim 1 wherein said haplogroup is haplogroup G and wherein method step b) comprises identifying in said sample at least one nucleotide allele selected from the group consisting of 4833G, 8200C, and 16017C.
9. The method of claim 1 wherein said haplogroup is haplogroup Z and wherein method step b) comprises identifying in said sample at least one nucleotide allele selected from the group consisting of 11078G, 16185T, and 16260T.
10. The method of claim 1 wherein said haplogroup is haplogroup A and wherein method step b) comprises identifying in said sample at least one nucleotide allele selected from the group consisting of 663G, 16290T, and 16319A.
11. The method of claim 1 wherein said haplogroup is haplogroup I and wherein method step b) comprises identifying in said sample at least one nucleotide allele selected from the group consisting of 4529T, 10034C, and 16391A.
12. The method of claim 1 wherein said haplogroup is haplogroup W and wherein method step b) comprises identifying in said sample at least one nucleotide allele selected from the group consisting of 204C, 207A, 1243C, 5046A, 5460A, 8994A, 11947G, 15884C, and 16292T.
13. The method of claim 1 wherein said haplogroup is haplogroup X and wherein method step b) comprises identifying in said sample at least one nucleotide allele selected from the group consisting of 1719A, 3516G, 6221C, and 14470C.
14. The method of claim 1 wherein said haplogroup is haplogroup B and wherein method step b) comprises:
1) identifying in said sample nucleotide allele 16189C;
2) identifying in said sample the absence of a nucleotide allele selected from the group consisting of 1719A, 3516G, 6221C, 14470C, and 16278T; and
3) identifying in said sample the absence of a nucleotide allele selected from the group consisting of 1888A, 4216C, 4917G, 8697A, 10463C, 11251G, 11467G, 12308G, 12372A, 12633T, 13104G, 13368A, 14070G, 14905A, 15452A, 15607G, 15928A, 16126C, 16163C, 16186T, 16249C, and 16294T.
15. The method of claim 1 wherein said haplogroup is haplogroup F and wherein method step b) comprises identifying in said sample at least one nucleotide allele selected from the group consisting of 12406A and 16304C.
16. The method of claim 1 wherein said haplogroup is haplogroup Y and wherein method step b) comprises identifying in said sample at least one nucleotide allele selected from the group consisting of 7933G, 8392A, 16231C, and 16266T.
17. The method of claim 1 wherein said haplogroup is haplogroup U and wherein method step b) comprises identifying in said sample at least one nucleotide allele selected from the group consisting of 3197C, 4646C, 7768G, 9055A, 11332T, 13104G, 14070G, 15907G, 16051G, 16129C, 16172C, 16219G, 16249C, 16270T, 16311T, 16318T, 16343G, and 16356C.
18. The method of claim 1 wherein said haplogroup is haplogroup J and wherein method step b) comprises identifying in said sample at least one nucleotide allele selected from the group consisting of 295T, 12612G, 13708A, and 16069T.
19. The method of claim 1 wherein said haplogroup is haplogroup T and wherein method step b) comprises identifying in said sample at least one nucleotide allele selected from the group consisting of 11812G, 12633T, 14233G, 16163C, 16186T, 1888A, 4917G, 8697A, 10463C, 13368A, 14905A, 15607G, 15928A, and 16294T.
20. The method of claim 1 wherein said haplogroup is haplogroup V and wherein method step b) comprises identifying in said sample at least one nucleotide allele selected from the group consisting of 72C, 4580A, and 15904T.
21. The method of claim 1 wherein said haplogroup is haplogroup H and wherein method step b) comprises identifying in said sample at least one nucleotide allele selected from the group consisting of 2706A and 7028C.
22. The method of claim 1 wherein said haplogroup is haplogroup L0 and wherein method step b) comprises identifying in said sample at least one nucleotide allele selected from the group consisting of 4586C, 9818T, and 8113A.
23. The method of claim 1 wherein said identifying step is performed using an array comprising isolated nucleic acid molecules attached to a substrate at a known location, each molecule having a length of about 9 to about 30 nucleotides, each molecule comprising a sequence identical with a portion of SEQ ID NO:1 containing at least one nucleotide allele at a locus selected from the group of loci consisting of those listed in column 1 of Table 2.
24. A machine readable storage device comprising a data set encoded in machine readable form, said data set comprising a plurality of nucleotide alleles and a haplogroup designation associated with each allele.
25. A program storage device comprising the storage device of claim 24 and also comprising input means for inputting a data set comprising one or more nucleotide alleles, said device also comprising program steps for diagnosing a haplogroup by associating said input nucleotide alleles with an associated haplogroup, and displaying the result.
26. A set of isolated nucleic acid molecules, each molecule having a length of about 9 to about 30 nucleotides, each molecule comprising a sequence identical with a portion of SEQ ID NO:1 containing at least one nucleotide allele at a locus selected from the group of loci consisting of those listed in column 1 of Table 2.
27. The set of claim 26 wherein at least one molecule has a sequence comprising a nucleotide allele selected from the group consisting of non-Cambridge human mtDNA nucleotide alleles of Table 2.
28. The set of claim 26 wherein at least one molecule has a sequence comprising a nucleotide allele selected from the group consisting of non-Cambridge human mtDNA nucleotide alleles of Table 3.
29. The set of claim 26 wherein at least one molecule has a sequence comprising a nucleotide allele selected from the group consisting of nucleotide alleles useful for diagnosing human haplogroups and macro-haplogroups (Table 9).
30. The set of claim 26 comprising all said nucleic acid molecules.
31. A nucleic acid array comprising the set of claim 26, each member of said set attached to a substrate at a defined location.
32. The array of claim 31 wherein at least one molecule has a sequence comprising a nucleotide allele selected from the group consisting of non-Cambridge human mtDNA nucleotide alleles of Table 2.
33. The array of claim 31 wherein at least one molecule has a sequence comprising a nucleotide allele selected from the group consisting of non-Cambridge human mtDNA nucleotide alleles of Table 3.
34. The array of claim 31 wherein at least one molecule has a sequence comprising a nucleotide allele selected from the group consisting of nucleotide alleles useful for diagnosing human haplogroups and macro-haplogroups (Table 9).
35. The array of claim 31 comprising all said nucleic acid molecules.
36. The array of claim 31 on a silica chip.
37. The array of claim 31 wherein said isolated nucleic acid molecules are about 20 nucleotides in length.
38. A method of making a nucleic acid array comprising:
a) providing a prepared substrate; and
b) placing isolated nucleic acid molecules in known positions on said substrate, each molecule having a length of about 9 to about 30 nucleotides, each molecule comprising a sequence identical with a portion of SEQ ID NO:1, and containing at least one nucleotide allele at a locus selected from the group of loci consisting of those listed in column 1 of Table 2.
39. The method of claim 38 wherein at least one molecule has a sequence comprising a nucleotide allele selected from the group consisting of non-Cambridge human mtDNA nucleotide alleles of Table 2.
40. The method of claim 38 wherein at least one molecule has a sequence comprising a nucleotide allele selected from the group consisting of non-Cambridge human mtDNA nucleotide alleles of Table 3.
41. The method of claim 38 wherein said array comprises all said nucleic acid molecules.
42. A method for determining the presence or absence of a nucleotide allele in a sample comprising:
a) providing a human sample comprising mtDNA;
b) providing an array of claim 31;
c) contacting said array with said sample under conditions allowing quantitative hybridization;
d) measuring the pattern of hybridization of said sample to said array; and
e) analyzing said hybridization.
43. A method for identifying an allele associated with a disease phenotype comprising:
a) providing a first data set comprising all of the mtDNA nucleotide alleles of a human having said disease phenotype;
b) providing a second data set comprising the nucleotide alleles in SEQ ID NO:1;
c) comparing said first data set with said second data set; and
d) identifying an allele in said first data set that is not in said second data set;
thereby identifying an allele associated with said disease phenotype.
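The allele-to-haplogroup association of claims 1, 24 and 25, the presence/absence test of claim 14, and the data-set comparison of claim 43 can be illustrated with a minimal Python sketch. It is not part of the application and is not the applicants' implementation. The function names, the choice of haplogroups included, and the "position followed by base" string format (for example "7028C") are assumptions made for illustration; the allele lists themselves are copied from claims 13, 14, 15, 18, 20 and 21. The sketch also assumes that "absence of a nucleotide allele selected from the group" in claim 14 means that no allele of that group is present.

```python
import re

# Diagnostic alleles copied from claims 13, 15, 18, 20 and 21 (a subset, for brevity).
DIAGNOSTIC_ALLELES = {
    "F": {"12406A", "16304C"},
    "J": {"295T", "12612G", "13708A", "16069T"},
    "V": {"72C", "4580A", "15904T"},
    "H": {"2706A", "7028C"},
    "X": {"1719A", "3516G", "6221C", "14470C"},
}

# Claim 14: haplogroup B needs 16189C plus the absence of two allele groups.
B_REQUIRED = "16189C"
B_EXCLUDED_1 = {"1719A", "3516G", "6221C", "14470C", "16278T"}
B_EXCLUDED_2 = {
    "1888A", "4216C", "4917G", "8697A", "10463C", "11251G", "11467G",
    "12308G", "12372A", "12633T", "13104G", "13368A", "14070G", "14905A",
    "15452A", "15607G", "15928A", "16126C", "16163C", "16186T", "16249C",
    "16294T",
}

# Assumed input format: nucleotide position followed by one base, e.g. "7028C".
ALLELE_PATTERN = re.compile(r"\d+[ACGT]")


def _position(allele):
    """Return the numeric nucleotide position of an allele string."""
    return int(allele[:-1])


def diagnose_haplogroups(sample_alleles):
    """Map each matching haplogroup to the diagnostic alleles found in the sample."""
    alleles = set(sample_alleles)
    malformed = sorted(a for a in alleles if not ALLELE_PATTERN.fullmatch(a))
    if malformed:
        raise ValueError(f"malformed allele(s): {malformed}")

    hits = {}
    for haplogroup, diagnostic in DIAGNOSTIC_ALLELES.items():
        found = alleles & diagnostic  # "at least one nucleotide allele" of the group
        if found:
            hits[haplogroup] = sorted(found, key=_position)

    # Claim 14, under the assumption that "absence" means no allele of the group is present.
    if (B_REQUIRED in alleles
            and alleles.isdisjoint(B_EXCLUDED_1)
            and alleles.isdisjoint(B_EXCLUDED_2)):
        hits["B"] = [B_REQUIRED]
    return hits


def alleles_absent_from_reference(sample_alleles, reference_alleles):
    """Claim 43, steps c) and d): alleles in the first data set but not in the second."""
    return sorted(set(sample_alleles) - set(reference_alleles), key=_position)
```

Under these assumptions, calling diagnose_haplogroups(["16189C", "1719A"]) reports haplogroup X rather than B, because 1719A belongs to the first excluded group of claim 14. A sample may match more than one haplogroup, so the function returns every match instead of choosing one.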
CA 2356536 2001-08-30 2001-08-31 Mitochondrial dna sequence alleles Abandoned CA2356536A1 (en)

Priority Applications (5)

Application Number Priority Date Filing Date Title
US10/488,618 US20050123913A1 (en) 2001-08-30 2002-08-30 Human mitochondrial dna polymorphisms, haplogroups, associations with physiological conditions, and genotyping arrays
CA002459127A CA2459127A1 (en) 2001-08-30 2002-08-30 Human mitochondrial dna polymorphisms, haplogroups, associations with physiological conditions, and genotyping arrays
EP02796465A EP1432831A4 (en) 2001-08-30 2002-08-30 Human mitochondrial dna polymorphisms, haplogroups, associations with physiological conditions, and genotyping arrays
PCT/US2002/028471 WO2003018775A2 (en) 2001-08-30 2002-08-30 Human mitochondrial dna polymorphisms, haplogroups, associations with physiological conditions, and genotyping arrays
JP2003523626A JP2005525082A (en) 2001-08-30 2002-08-30 Human mitochondrial DNA polymorphisms, haplogroups, associations with physiological conditions, and genotyping arrays

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US31633301P 2001-08-30 2001-08-30
US60/316,333 2001-08-30

Publications (1)

Publication Number Publication Date
CA2356536A1 (en) 2003-02-28

Family

ID=23228600

Family Applications (1)

Application Number Title Priority Date Filing Date
CA 2356536 Abandoned CA2356536A1 (en) 2001-08-30 2001-08-31 Mitochondrial dna sequence alleles

Country Status (1)

Country Link
CA (1) CA2356536A1 (en)

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
AU2006238390B2 (en) * 2005-04-18 2011-07-28 Mitomics Inc. Mitochondrial mutations and rearrangements as a diagnostic tool for the detection of sun exposure, prostate cancer and other cancers
US8008008B2 (en) 2005-04-18 2011-08-30 Mitomics Inc. Mitochondrial mutations and rearrangements as a diagnostic tool for the detection of sun exposure, prostate cancer and other cancers
US9745632B2 (en) 2005-04-18 2017-08-29 Mdna Life Sciences Inc. Mitochondrial mutations and rearrangements as a diagnostic tool for the detection of sun exposure, prostate cancer and other cancers
US10308987B2 (en) 2005-04-18 2019-06-04 Mdna Life Sciences Inc. 3.4 kb mitochondrial DNA deletion for use in the detection of cancer
US10907213B2 (en) 2005-04-18 2021-02-02 Mdna Life Sciences Inc. Mitochondrial mutations and rearrangements as a diagnostic tool for the detection of sun exposure, prostate cancer and other cancers
US11111546B2 (en) 2005-04-18 2021-09-07 Mdna Life Sciences, Inc. 3.4 KB mitochondrial DNA deletion for use in the detection of cancer

Legal Events

Date Code Title Description
EEER Examination request
FZDC Correction of dead application (reinstatement)
FZDE Dead