WO2003020973A2 - Polymorphic sex testing method - Google Patents

Publication number
WO2003020973A2
Authority
WO
WIPO (PCT)
Prior art keywords
chromosome
str
allele
dxy156xy
alleles
Prior art date
Application number
PCT/GB2002/003920
Other languages
French (fr)
Other versions
WO2003020973A3 (en)
Inventor
Peter Forster
Francesco Cali
Original Assignee
Cambridge University Technical Services Limited
Associazione Oasi Maria Ss Onlus
Priority date
Filing date
Publication date
Priority claimed from GB0120816A (GB0120816D0)
Priority claimed from GB0210129A (GB0210129D0)
Application filed by Cambridge University Technical Services Limited, Associazione Oasi Maria Ss Onlus
Publication of WO2003020973A2
Publication of WO2003020973A3

Classifications

    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q: MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00: Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68: Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6876: Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes
    • C12Q1/6879: Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for sex determination
    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q: MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q2600/00: Oligonucleotides characterized by their use
    • C12Q2600/156: Polymorphic or mutational markers

Definitions

  • This invention relates to methods of genetic analysis, in particular Y chromosomal typing, using the homologous short tandem repeat loci DXYS156Y and DXYS156X. These methods are particularly useful in forensics and legal medicine.
  • STR loci are continuously being discovered with the progress of the Human Genome Project (e.g. White PS et al (1999). Genomics 57:433-437) and are gaining importance not only in studies of human evolution (Forster P. et al.
  • Y typing is indispensable for sex testing (conventionally using the amelogenin system, Mannucci A et al (1994) Int J Legal Med 106:190-3, but see Santos FR et al (1998) Nat Genet 18:103 and Brinkmann B (2002) Int J Legal Med 116:63), for paternity deficiency tests (i.e. when the putative father is dead or unavailable, the patrilineal relatives can be taken as substitutes, e.g. Foster EA et al (1998) Nature 396:27-28, Thomas MG et al (1998) Nature 394:138-40), and for multiple rape investigation (determination of the number and identity of male offenders in mixed sperm stains).
  • DXYS156Y and DXYS156X are homologous microsatellite loci which map to the long arm of chromosome X (DXYS156X) and the short arm of chromosome Y (DXYS156Y) in humans (Chen et al (1994) Human Mutation 4:208-211). Whilst the patterns of variation in these loci in human populations have been partially characterised (Karafet et al (1998) Human Biology 70(6):979-992), the use of DXYS156Y and DXYS156X in forensics has been limited by the overlap in size between alleles of the X and Y chromosome loci.
  • The repeat motifs of the DXYS156 STR are flanked by several hundred nucleotides within the duplicate DXYS156XY locus which are almost identical on both the X and Y chromosomes.
  • the published sequence of the DXYS156 locus has the database accession number X71600.
  • the present invention relates to the realisation that point mutations exist within the X and Y homologues of the DXYS156 locus which are chromosome-specific and can be used to distinguish DXYS156 alleles from the X and Y chromosome.
  • These mutations include an internal point insertion within the repetitive domain and two nucleotide substitutions within the flanking region.
  • The use of these mutations significantly increases both the discrimination and sex testing capacity of the DXYS156XY locus, and allows, for example, a single PCR amplification experiment to provide the data which hitherto needed to be determined in three unrelated PCRs (i.e. DNA fingerprinting through a multi-allelic autosomal STR, Y typing through a Y STR, and sex testing with the amelogenin system).
  • One aspect of the present invention provides a method of analysis of genomic DNA comprising: i) amplifying the DXY156XY locus from a sample of genomic DNA to generate amplification products comprising one or more alleles of said DXY156XY locus; and, ii) identifying an allele of said one or more DXY156XY alleles as a Y chromosome allele or an X chromosome allele, wherein said allele is identified by determining the presence of one or more X chromosome or Y chromosome-specific point mutations in the amplified DXY156XY locus.
  • Suitable X or Y chromosome-specific point mutations in the amplified DXY156XY locus include the following: a) an A insertion within a T(A)4 motif of the DXY156XY STR,
  • The G/A substitution is located 125 nucleotides downstream (i.e. 3') of the final A residue in the last T(A)4 repeat motif of the STR in the published DXYS156XY sequence.
  • The T/C substitution is located 53 nucleotides downstream (i.e. 3') of the final A residue in the last T(A)4 repeat motif of the STR in the published DXYS156XY sequence.
  • An allele of said one or more DXY156XY alleles is identified as a Y chromosome allele by determining the presence of one or more Y chromosome-specific polymorphisms or point mutations in the amplified DXY156XY locus, for example a T(A)5 motif within said STR, a G residue at a position 125 nucleotides downstream of the STR, and/or a T residue at a position 53 nucleotides downstream of the STR.
  • a Y chromosome specific point mutation which is an A residue insertion may equally be referred to as an X chromosome specific point mutation which is an A deletion.
  • A Y chromosome specific point mutation in which an A residue is substituted by a G residue is equivalent to an X chromosome specific point mutation in which a G residue is substituted by an A residue. This also applies to chromosome specific T/C substitutions.
  • A T>C substitution on one strand of nucleic acid is equivalent to an A>G substitution on the complementary strand.
  • a G>A substitution on one strand of nucleic acid is equivalent to a C>T substitution on the complementary strand and an A insertion in one strand is equivalent to a T insertion in the complementary strand.
  • the point mutations described herein may be determined on either strand.
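The strand equivalences stated above can be checked mechanically. The following is a minimal sketch, not part of the published method; the function name `on_complementary_strand` is a hypothetical helper introduced only for illustration.

```python
# Watson-Crick complement of each DNA base.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def on_complementary_strand(ref, alt):
    """Return the (ref, alt) substitution as read on the complementary strand."""
    return COMPLEMENT[ref], COMPLEMENT[alt]


# T>C on one strand corresponds to A>G on the other strand.
print(on_complementary_strand("T", "C"))
# G>A on one strand corresponds to C>T on the other strand.
print(on_complementary_strand("G", "A"))
# An inserted A on one strand corresponds to an inserted T on the other strand.
print(COMPLEMENT["A"])
```

Because the mapping is symmetric, a marker can be scored on whichever strand the assay happens to read.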
  • A method of analysis of genomic DNA may comprise: i) amplifying the DXY156XY short tandem repeat (STR) from a sample of genomic DNA to generate amplification products comprising one or more DXY156XY alleles; and ii) identifying an allele of said one or more DXY156XY alleles as a Y chromosome allele or an X chromosome allele, wherein said allele is identified by determining the presence or absence of a T(A)5 motif within said STR.
  • STR DXY156XY short tandem repeat
  • The presence or absence of a T(A)5 motif at the fourth repeat position from the 5' end of the STR is determined.
  • The presence of a T(A)5 motif is indicative that the allele is a Y chromosome allele.
  • The absence of a T(A)5 motif is indicative that the allele is an X chromosome allele.
  • The absence of a T(A)5 motif at the fourth repeat position may be determined indirectly by determining the presence of a T(A)4 motif at that position.
  • An allele may be identified by determining the presence or absence of a G residue at a position 125 nucleotides downstream of the STR (or a C residue on the complementary strand, see above).
  • the presence of a G residue at this position is indicative that the allele is a Y chromosome allele.
  • absence of a G residue at this position is indicative that the allele is an X chromosome allele.
  • An allele may be identified by determining the presence or absence of a T residue at a position 53 nucleotides downstream of the STR. The presence of a T residue is indicative that the allele is a Y chromosome allele. Conversely, the absence of a T residue at this position is indicative that the allele is an X chromosome allele.
  • a method may include obtaining or providing a sample of genomic DNA from an individual, patient or donor.
  • a sample of genomic DNA may be extracted or taken from a sample of biological tissue from the individual, patient or donor, for example, a blood, bone, semen, hair, saliva or skin sample.
  • Such a sample may be a forensic sample, for example a sample which is taken from a crime scene or archaeological site, and not directly from an individual.
  • The published sequence of the human DXYS156Y locus has the Genbank database accession number X71600 (Chen et al (1994) Hum. Mutat. 4(3):208-211).
  • the STR sequence i.e. the repetitive sequence
  • the STR sequence begins at base 1214 of the published sequence.
  • The repeat sequence ends at base 1274.
  • The number of repeat motifs of an allele and the presence or absence of a T(A)5 motif as described herein will determine the precise location of the 3' boundary of the repetitive region.
  • the term 'locus' refers to a genomic region which comprises both the repeat motifs (STR) and the sequence flanking these repeat motifs.
  • One region of particular interest to the present application is the fourth motif from the 5' end of the STR. This is shown herein to be indicative of the chromosomal origin of an allele of the homologous DXYS156X and DXYS156Y STR loci.
  • DXYS156Y allele sequence (X71600), which contains eleven pentanucleotide repeats and one hexanucleotide T(A)5 motif
  • the fourth repeat from the 5' end begins at base 1229 and ends at base 1234.
  • Another position of interest is 125 nucleotides downstream of the last A residue of the repetitive STR sequence.
  • the present inventors have identified a chromosome specific single nucleotide polymorphism (SNP) at this position which is indicative of the chromosomal origin of an allele of the homologous DXYS156X and DXYS156Y STR loci (i.e. the DXYS156XY locus) .
  • a G residue is indicative of the DXYS156Y locus (i.e. the Y chromosomal locus) and an A residue is indicative of the DXYS156X locus (i.e. the X chromosomal locus).
  • the position 125 nucleotides downstream of the 3' end of the STR is base 1399.
  • Yet another position of interest is 53 residues downstream of the last A residue of the repetitive STR sequence.
  • the present inventors have identified a chromosome-specific single nucleotide polymorphism at this position which is also shown herein to be indicative of the chromosomal origin of an allele of the homologous DXYS156X and DXYS156Y STR loci.
  • a T residue is indicative of the DXYS156Y locus (i.e. the Y chromosomal locus) and a C residue is indicative of the DXYS156X locus (i.e. the X chromosomal locus).
  • the position 53 nucleotides downstream of the 3' end of the STR is base 1327.
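The base numbers quoted for the fourth repeat and for positions +53 and +125 can be cross-checked with simple arithmetic. The sketch below assumes only what the text states for the published Y allele: the STR starts at base 1214, it contains eleven pentanucleotide motifs and one hexanucleotide motif, and the hexanucleotide occupies the fourth repeat position. The helper `motif_span` is hypothetical.

```python
STR_START = 1214   # first base of the repeat region in X71600 (from the text)
PENTA, HEXA = 5, 6  # lengths of the TAAAA and TAAAAA motifs
TOTAL_MOTIFS = 12   # eleven pentanucleotides plus one hexanucleotide
HEXA_POSITION = 4   # the T(A)5 motif sits at the fourth repeat position


def motif_span(position):
    """Return the 1-based inclusive (start, end) bases of the motif at `position`."""
    start = STR_START
    for p in range(1, position):
        start += HEXA if p == HEXA_POSITION else PENTA
    length = HEXA if position == HEXA_POSITION else PENTA
    return start, start + length - 1


fourth_repeat = motif_span(4)          # span of the hexanucleotide motif
str_end = motif_span(TOTAL_MOTIFS)[1]  # last A residue of the STR
plus_53 = str_end + 53                 # T (Y) or C (X) at this base
plus_125 = str_end + 125               # G (Y) or A (X) at this base

print(fourth_repeat, str_end, plus_53, plus_125)
```

For alleles with a different number of repeats, the same arithmetic shifts the 3' boundary and both downstream positions by five bases per repeat.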
  • Use of the DXYS156 STR in DNA analysis, for example in sex testing protocols, allows a positive control to be performed; in an assay which is functioning correctly, amplification of the DXYS156 STR from a genomic sample should result in the amplification of at least one X chromosome allele, as genomic samples from both male and female donors contain such an allele.
  • the presence of an X chromosome allele therefore indicates that the amplification reaction is operating correctly and the absence of such an allele is an indication of incorrect operation.
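The positive-control logic can be expressed as a small decision function. This is an illustrative sketch: the function name, the input format (a list of chromosomal origins, one per detected allele) and the wording of the returned calls are assumptions made for the example.

```python
def interpret_sample(allele_origins):
    """Interpret the chromosomal origins ('X' or 'Y') of the alleles amplified from one sample."""
    if "X" not in allele_origins:
        # Male and female samples alike should yield at least one X allele,
        # so its absence points to a failed amplification.
        return "assay failure: no X chromosome allele amplified"
    if "Y" in allele_origins:
        return "male: X and Y chromosome alleles detected"
    return "female: X chromosome alleles only"


print(interpret_sample(["X", "Y"]))
print(interpret_sample(["X", "X"]))
print(interpret_sample([]))
```

A result with a Y allele but no X allele is reported as a failure rather than as male, since the control condition has not been met.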
  • X chromosome alleles may also be used to estimate the maternal geographic origin of a sample, as they display geographic specificity.
  • X chromosome alleles may be identified by determining the absence of a T(A)5 motif in the STR, the absence of a G at a position 125 nucleotides downstream of the STR (or the presence of A at this position) and/or the absence of a T at a position 53 nucleotides downstream of the STR (or the presence of C at this position).
  • X chromosome alleles may be determined positively, by determining the presence of a T(A)4 motif at the fourth repeat position from the 5' end of the STR, or negatively, by determining the absence of a T(A)5 motif at the fourth repeat position from the 5' end of the STR.
  • Methods of analysis of genomic DNA as described herein may therefore comprise identifying one or more DXYS156XY alleles as a Y chromosome allele and one or more DXYS156XY alleles as an X chromosome allele, wherein said allele is identified by determining the presence of one or more X or Y chromosome-specific polymorphisms.
  • A suitable chromosome-specific polymorphism may be a point mutation selected from the group consisting of: a) an A insertion within a T(A)4 motif of said STR; b) a G/A substitution at a position 125 nucleotides downstream of the STR; and c) a T/C substitution at a position 53 nucleotides downstream of the STR.
  • A method of analysis of genomic DNA may comprise identifying one or more DXYS156XY alleles as a Y chromosome allele and one or more DXYS156XY alleles as an X chromosome allele, wherein said X chromosome allele and said Y chromosome allele are identified by determining the presence or absence of a T(A)5 motif in the STR sequence.
  • The presence or absence of a T(A)5 motif at the fourth repeat position from the 5' end of the STR is determined.
  • The presence of said T(A)5 motif is indicative that the product is a Y chromosome allele and the absence of said T(A)5 motif is indicative that the allele is an X chromosome allele.
  • The X chromosome allele and the Y chromosome allele may be identified by determining the presence or absence of a G residue at a position 125 nucleotides downstream of the STR sequence (+125) and/or a T residue at a position 53 nucleotides downstream of the STR sequence (+53).
  • the absence of a G residue at position +125 may be determined by determining the presence of an A residue at this position and the absence of a T residue at position +53 may be determined by determining the presence of a C residue at this position.
  • the identity of the residue at position +125 and/or position +53 may be determined.
  • The chromosomal origin of an allele may be identified by determining the presence or absence of any one (i.e. a, b or c), any two (ab, ac or bc) or all three (abc) of the point mutations described herein.
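Since any one, two or three of the markers may be scored, a classifier needs to accept a partial set of observations. The following sketch assumes a hypothetical input format (a dictionary keyed by marker name) and returns "ambiguous" when the scored markers disagree or none was scored; the marker states themselves are those given in the text.

```python
# Marker states described in the text for each chromosome.
Y_STATE = {"fourth_motif": "TAAAAA", "plus53": "T", "plus125": "G"}
X_STATE = {"fourth_motif": "TAAAA", "plus53": "C", "plus125": "A"}


def classify_allele(observed):
    """Classify an allele as 'Y', 'X' or 'ambiguous' from any subset of the three markers."""
    calls = set()
    for marker, state in observed.items():
        if state == Y_STATE[marker]:
            calls.add("Y")
        elif state == X_STATE[marker]:
            calls.add("X")
        else:
            calls.add("unknown")  # a state matching neither chromosome
    if calls == {"Y"}:
        return "Y"
    if calls == {"X"}:
        return "X"
    return "ambiguous"


print(classify_allele({"plus125": "G"}))
print(classify_allele({"fourth_motif": "TAAAA", "plus53": "C"}))
print(classify_allele({"plus53": "T", "plus125": "A"}))
```

Scoring more than one marker gives an internal consistency check: discordant markers are flagged rather than silently resolved.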
  • DXY156XY is multi-allelic and the particular alleles of the homologous STR on the Y chromosome (DXY156Y) and X chromosome (DXY156X) may be determined. As different individuals will possess different alleles, this enables sample contamination to be detected and provides a substantial contribution to the DNA fingerprinting of a sample. Alleles of an STR such as DXY156XY are distinguished by the number of repeats of the pentanucleotide repetitive motif. The different alleles are indicated herein by a number which refers to the number of repeat motifs present, e.g. allele 11 has 11 copies of the repeat motif.
  • The repeat motif of DXY156XY is T(A)4. Of course, in Y chromosome alleles, the T(A)4 motif is replaced in the fourth position from the 5' end by the motif T(A)5 as described herein.
  • T(A)4 is used herein to designate the pentanucleotide repeat motif TAAAA and T(A)5 is used herein to designate the hexanucleotide repeat motif TAAAAA.
  • General references to repeat motifs and tandem repeats include both pentanucleotide TAAAA motifs and hexanucleotide TAAAAA motifs unless otherwise stated.
  • Methods of analysis as described herein may comprise characterising a Y chromosome allele and/or an X chromosome allele by determining the number of copies of the repeat motif in the STR sequence, i.e. the number of tandem repeats of said allele.
  • The number of repeat motifs will determine the length of the STR and the molecular weight of products amplified from it. Alleles of the STR are distinguished by the number of repeat motifs.
  • the number of tandem repeats may be determined using conventional methods in the art, such as sequencing and electrophoresis as described herein.
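When the repeat region has been sequenced, counting the motifs is a simple parsing exercise. The sketch below is one possible way to do it, assuming the input is the repeat region alone (no flanking sequence); `parse_str_region` is a hypothetical helper, and the example sequence is built from the motif counts given for the published Y allele rather than copied from the database record.

```python
import re


def parse_str_region(seq):
    """Split a repeat region into its TAAAA / TAAAAA motifs, in 5' to 3' order."""
    seq = seq.upper()
    if not re.fullmatch(r"(?:TA{4,5})+", seq):
        raise ValueError("sequence is not a pure run of T(A)4 / T(A)5 motifs")
    return re.findall(r"TA{4,5}", seq)


# Three T(A)4 motifs, one T(A)5 motif at the fourth position, then eight T(A)4 motifs.
region = "TAAAA" * 3 + "TAAAAA" + "TAAAA" * 8
motifs = parse_str_region(region)

print(len(motifs))   # number of tandem repeats
print(motifs[3])     # motif at the fourth repeat position
print(len(region))   # length of the repeat region in nucleotides
```

The motif count gives the allele designation, and the motif at index 3 (the fourth position) gives the chromosomal origin in the same pass.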
  • A test sample of genomic DNA may be provided, for example, by extracting nucleic acid from cells or biological tissues or fluids, urine, saliva, faeces, a buccal swab, biopsy or preferably blood, or for pre-natal testing from the amnion, placenta or foetus itself.
  • Methods of extracting DNA from biological samples are well known in the art. A sample so extracted may be used in a method described herein.
  • The DXY156XY short tandem repeat may be amplified by subjecting the genomic DNA in a sample to a specific nucleic acid amplification reaction such as the polymerase chain reaction (PCR) (reviewed for instance in "PCR protocols; A Guide to Methods and Applications", Eds. Innis et al, 1990, Academic Press, New York, Mullis et al, Cold Spring Harbor Symp. Quant. Biol., 51:263, (1987), Ehrlich (ed), PCR technology, Stockton Press, NY, 1989, and Ehrlich et al, Science, 252:1643-1650, (1991)).
  • PCR comprises steps of denaturation of template nucleic acid (if double-stranded), annealing of primer to target, and polymerisation.
  • the nucleic acid used as template in the amplification reaction is preferably human genomic DNA.
  • PCR is used herein in contexts where other nucleic acid amplification techniques may be applied by those skilled in the art. Unless the context requires otherwise, reference to PCR should be taken to cover use of any suitable nucleic acid amplification reaction available in the art.
  • Oligonucleotide primers suitable for amplification may be designed using genomic sequences which flank a region of the DXYS156XY locus containing one or more point mutations as described herein, in particular sequence which flanks the STR and/or positions 125 nucleotides or 53 nucleotides downstream of the STR.
  • the published DXYS156Y locus sequence (X71600) may be used to design such primers.
  • A suitable oligonucleotide may be about 30 or fewer nucleotides in length (e.g. 18, 21 or 24).
  • Primers are upwards of 14 nucleotides in length, but need not be longer than 18-20. Those skilled in the art are well versed in the design of primers for use in processes such as PCR.
  • The DXYS156XY locus may be amplified using a pair of oligonucleotide primers, of which the first member of the pair comprises a nucleotide sequence which hybridises to a complementary sequence which is proximal to and 5' of the pentanucleotide repeats, and the second member of the primer pair comprises a nucleotide sequence which hybridises to a complementary sequence which is proximal to and 3' of the pentanucleotide repeats.
  • The DXYS156XY locus may be amplified using a pair of oligonucleotide primers, of which the first member of the pair comprises a nucleotide sequence which hybridises to a complementary sequence which is proximal to and 5' of a position 125 nucleotides downstream of the STR, and the second member of the primer pair comprises a nucleotide sequence which hybridises to a complementary sequence which is proximal to and 3' of a position 125 nucleotides downstream of the STR.
  • The DXYS156XY locus may be amplified using a pair of oligonucleotide primers, of which the first member of the pair comprises a nucleotide sequence which hybridises to a complementary sequence which is proximal to and 5' of a position 53 nucleotides downstream of the STR, and the second member of the primer pair comprises a nucleotide sequence which hybridises to a complementary sequence which is proximal to and 3' of a position 53 nucleotides downstream of the STR.
  • A primer pair consisting of a primer from each of two different primer pairs described above may be used to amplify a region of the DXYS156 locus which comprises more than one point mutation (e.g. two or three).
  • Suitable primers may be designed using genomic sequence flanking the DXYS156XY STR, for example, the sequence of the DXYS156Y locus which has the Genbank database accession number X71600. Primers may, of course, also be designed using genomic sequences flanking the published DXYS156Y locus sequence (X71600), for example neighbouring loci. Examples of suitable primers for amplifying the DXYS156 locus are MVF12 and PF31 described below.
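The relationship between a flanking primer pair and the product it amplifies can be illustrated with a toy in-silico PCR. Everything in this sketch is hypothetical: the flanking sequences and primers are invented for the example and are not the X71600 sequence or the MVF12 and PF31 primers, and exact string matching stands in for primer annealing.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def in_silico_amplicon(template, forward, reverse):
    """Return the product bounded by the two primers on `template`, or None if either fails to bind."""
    start = template.find(forward)
    if start == -1:
        return None
    # The reverse primer anneals to the top strand's complement, so its
    # binding site appears on the top strand as its reverse complement.
    site = reverse_complement(reverse)
    end = template.find(site, start + len(forward))
    if end == -1:
        return None
    return template[start:end + len(site)]


# Invented flanks around a short repeat region with T(A)5 at the fourth position.
template = "GGCATCTGACCTGAGT" + "TAAAA" * 3 + "TAAAAA" + "TAAAA" * 2 + "CCGTAGGCTTACGGAT"
forward = "CATCTGACC"                        # binds 5' of the repeats
reverse = reverse_complement("GGCTTACGG")    # binds 3' of the repeats

amplicon = in_silico_amplicon(template, forward, reverse)
print(amplicon)
print(len(amplicon))
```

Choosing the reverse primer further downstream, for example beyond +53 or +125, would bring those positions into the same product.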
  • DXY156XY locus amplification products may be sequenced and/or tested using known methods to determine the presence or absence of one or more of the T(A)5 motif in the fourth repeat from the 5' end of the DXY156XY STR, a G residue at a position 125 nucleotides downstream of the STR and/or a T residue at a position 53 nucleotides downstream of the STR, and thereby distinguish between Y chromosome and X chromosome alleles.
  • Sequencing of an amplification product may involve precipitation with isopropanol, resuspension and sequencing using a TaqFS Dye terminator sequencing kit. Extension products may be electrophoresed on an ABI 377 DNA sequencer and data analysed using Sequence Navigator software.
  • An amplification product may be treated so as to display on a denaturing polyacrylamide DNA sequencing gel specific bands that are linked to the chromosome specific variant.
  • SSCP heteroduplex analysis may be used to screen genomic DNA for Y or X chromosome DXY156XY STR alleles. This generally involves amplifying labelled 100-300 bp fragments of nucleic acid, diluting these products and denaturing at 95°C. The fragments are quick-cooled on ice so that the DNA remains in single stranded form. These single stranded fragments are run through acrylamide based gels. The insertion or substitution of a single base in alleles from the Y chromosome will cause the single stranded molecules to adopt different conformations in this gel matrix, making their mobility different from X chromosome alleles, which lack the substitution or insertion. This allows detection of Y chromosome alleles in the amplification products being analysed, for example upon exposure of the gel to X-ray film.
  • Any suitable label may be used to label the fragments of nucleic acid, for example a fluorescent label or radiolabel.
  • Alleles with altered mobility/conformations may be directly excised from the gel and directly sequenced to confirm the presence of a DXYS156Y specific polymorphism, such as presence of the T(A)5 motif in the STR, for example at the fourth repeat from the 5' end of the STR.
  • a chromosome specific point mutation such as described herein may generate a chromosome specific endonuclease recognition site which may provide a convenient means of determining the presence of such mutations.
  • GATCC is a Y chromosome specific restriction site recognised by various endonucleases, including AlwI, BspPI and AclWI, which is generated by the substitution at +125 in the DXYS156Y locus as described herein. This is the only site for this class of restriction enzymes within the DXYS156 STR flanking regions, so these enzymes may be used to differentiate the X from the Y chromosome by selectively cleaving the Y amplicon.
  • GATTC is an X chromosome specific restriction site which is recognised by various endonucleases, including HinfI, which is generated by the substitution at +125 in the DXYS156X locus as described herein.
  • Suitable amplification primers may be designed as described herein to produce an amplicon in which this is the only site for this class of restriction enzymes. These enzymes may then be used to differentiate the Y from the X chromosome by selectively cleaving the X amplicon.
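The restriction-site test amounts to asking which of the two five-base sites an amplicon carries. The sketch below is an illustration under stated assumptions: the two toy amplicons are invented, both strands are scanned because the site may be read on either one, and the amplicon is assumed to have been designed so that no other copy of either site is present.

```python
# Chromosome-specific sites created by the +125 substitution (from the text).
SITES = {"Y": "GATCC", "X": "GATTC"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def has_site(amplicon, site):
    """True if `site` occurs on either strand of the amplicon."""
    return site in amplicon or reverse_complement(site) in amplicon


def origin_from_sites(amplicon):
    """Return 'X' or 'Y' if exactly one chromosome-specific site is present, else None."""
    hits = [chrom for chrom, site in SITES.items() if has_site(amplicon, site)]
    return hits[0] if len(hits) == 1 else None


y_amplicon = "ACGTTGCAGATCCTTGCA"  # invented sequence carrying GATCC
x_amplicon = "ACGTTGCAGATTCTTGCA"  # invented sequence carrying GATTC

print(origin_from_sites(y_amplicon))
print(origin_from_sites(x_amplicon))
```

An amplicon carrying neither site, or both, returns None, which in practice would prompt a check of the primer design.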
  • a method of the present invention may comprise treating a sample of genomic DNA with an enzyme which cleaves the DXYS156XY locus in a chromosome specific manner (i.e. is specific for one of the DXYS156Y and DXYS156X loci) prior to amplification.
  • a suitable enzyme will cleave within the DXYS156Y locus (or an amplicon therein) but not within the DXYS156X locus or vice versa.
  • Such an enzyme may be a restriction endonuclease which recognises a site which is present in the DXYS156X locus but not the DXYS156Y locus or vice versa.
  • Molecules which have been cleaved by the restriction enzyme will not form effective templates for amplification, so endonuclease cleavage of the sample prior to amplification may provide selective amplification of the X chromosome allele or the Y chromosome allele (i.e. the locus from one chromosome is preferentially amplified over the other) .
  • A method of the present invention may comprise treating an amplified DXYS156XY locus with an enzyme which cleaves the DXYS156XY locus in a chromosome specific manner (i.e. is specific for either the DXYS156Y or the DXYS156X locus). Cleavage of the amplified locus is indicative that it is a substrate for the chromosome specific enzyme and thus allows the chromosomal origin of the amplified locus to be determined. Treatment with a specific endonuclease as described herein after amplification may thus be used to distinguish between X and Y chromosome amplicons.
  • a DXYS156XY target locus may be treated with both an X and a Y chromosome specific endonuclease, either individually in different reactions, or simultaneously in the same reaction. Such treatment may occur before or after amplification.
  • the reaction is configured such that only one of the enzymes (for example the Y chromosome specific endonuclease) is active under a first set of conditions, for example a first temperature, whilst under a second set of conditions only the other enzyme (for example the X chromosome specific endonuclease) is active.
  • The target locus is then treated in a reaction with both the enzymes under the first set of conditions and the products of the reaction determined, for example by electrophoresis, real-time PCR etc. as described herein; the same reaction is then treated under the second set of conditions and the products determined again.
  • DXYS156XY locus DNA may be treated with HinfI and BspPI simultaneously. At 37°C, only the HinfI is active and only X chromosomes (DXYS156X) will be cleaved. The presence of cleavage products is then determined by any conventional method as described herein. The reaction is then treated at 55°C, where only the BspPI is active and only the Y chromosome (DXYS156Y) is cleaved. The presence of cleavage products is then determined by any conventional method as described herein. If a Y allele is present, cleavage products will be present in the second analysis. If no Y allele is present, no such products will be detected in the second analysis.
  • The presence of an X allele is indicated by the results of the first analysis. As a male sample will contain a single X allele, and a female sample a pair of X alleles, the determination of the presence of an X allele in the first analysis may also be used as a positive control.
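The two-step digest reduces to two yes/no observations, one per temperature. A minimal sketch of how those observations map onto allele presence follows; the function name and boolean inputs are hypothetical, and the second analysis is assumed to score only the products newly generated at 55°C.

```python
def alleles_from_sequential_digest(products_at_37c, products_at_55c):
    """Map cleavage observations from the two analyses onto the alleles present.

    products_at_37c: cleavage products seen after the 37°C step (HinfI, X specific).
    products_at_55c: new cleavage products seen after the 55°C step (BspPI, Y specific).
    """
    origins = []
    if products_at_37c:
        origins.append("X")  # also serves as the positive control
    if products_at_55c:
        origins.append("Y")
    return origins


print(alleles_from_sequential_digest(True, True))    # X and Y alleles present
print(alleles_from_sequential_digest(True, False))   # X alleles only
print(alleles_from_sequential_digest(False, False))  # no X allele: control not met
```

An empty result, or a result lacking "X", indicates that the positive control has not been satisfied and the reaction should be repeated.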
  • An alternative or supplement to looking for the presence of the T(A)5 motif in a test sample is to look for the presence of the T(A)4 motif, e.g. using a suitably specific oligonucleotide probe or primer.
  • An alternative or supplement to looking for the presence of a G residue at position +125 in a test sample is to look for the presence of an A residue at this position, e.g. using a suitably specific oligonucleotide probe or primer.
  • An alternative or supplement to looking for the presence of a T residue at position +53 in a test sample is to look for the presence of a C residue at this position, e.g. using a suitably specific oligonucleotide probe or primer.
  • Amplification products may be screened using a variant-specific probe.
  • A variant-specific probe may correspond in sequence to a region of the DXY156XY STR, or its complement, preferably to a region comprising the fourth repeat from the 5' end, or a region of the DXY156XY locus comprising position 125 nucleotides downstream of the STR (+125) and/or position 53 nucleotides downstream of the STR (+53).
  • Y chromosome specific polymorphism such as a T(A)5 motif in an allele of the DXYS156 STR in the test or sample nucleic acid.
  • Such a motif may replace a T(A)4 motif at the fourth repeat position in the STR.
  • Other Y chromosome specific polymorphisms in the DXYS156 locus include an A>G substitution at a position 125 nucleotides downstream of the STR sequence and a C>T substitution at a position 53 nucleotides downstream of the STR sequence.
  • Hybridization may be performed using any suitable approach or format.
  • A large number of filter and solid support formats are available which are suitable for nucleic acid hybridization analysis (Beltz GA et al. (1985), in Methods In Enzymology, Vol. 100, Part B, R. Wu, L. Grossman, K. Moldave, Eds., Academic Press, New York, Chapter 19, pp. 266-308).
  • One format, the so-called "dot blot" hybridization, involves the non-covalent attachment of target DNAs to a filter, which are subsequently hybridized with radioisotope-labeled probe(s).
  • Nucleic acid hybridization may also be performed on micro-formatted multiplex or matrix devices (e.g., DNA chips) (Barinaga M, (1991) Science 253 1489; Bains W, 10 Bio/Technology, pp. 757-758, 1992). These methods involve the attachment of specific DNA sequences to very small specific areas of a solid support, such as micro-wells of a DNA chip.
  • The micro-array format also provides for sequencing using the 'sequencing by hybridisation' (SBH) approach (Southern et al (1992) Genomics 13 1008, Drmanac et al (1993) Science 260 1649-1652).
  • SBH 'sequencing by hybridisation'
  • Various microarray approaches for the analysis of genomic and STR sequences have been proposed (for example WO9943853) and are suitable for carrying out the methods described herein.
  • Mismatch between a probe and a target sequence may be employed to detect a point mutation as described herein.
  • an oligonucleotide probe will hybridise with a sequence which is not entirely complementary.
  • The degree of base-pairing between the two molecules will be sufficient for them to anneal despite a mis-match.
  • Various approaches are well-known in the art for detecting the presence of a mis-match between two annealing nucleic acid molecules.
  • RNase A cleaves at the site of a mis-match. Cleavage can be detected by electrophoresing test nucleic acid to which the relevant probe or probes have annealed and looking for smaller molecules (i.e. molecules with higher electrophoretic mobility) than the full length probe/test hybrid.
  • An oligonucleotide probe that has the sequence of the DXY156XY STR, the DXY156XY STR flanking region or a portion thereof (either sense or anti-sense strand) which includes a Y chromosome specific polymorphism as described herein may be annealed to test nucleic acid, i.e. products amplified from the DXY156XY STR of a genomic DNA sample, and the presence or absence of a mis-match determined.
  • Detection of the presence of a mis-match may indicate the presence in the test nucleic acid of a Y chromosome specific sequence variant such as a T(A)5 motif at the fourth repeat position, a G at position +125 relative to the STR or a T at position +53 relative to the STR, or the presence of an X chromosome specific sequence variant such as a T(A)4 motif at the fourth repeat position, an A at position +125 3' of the STR or a C at position +53 3' of the STR.
  • the oligonucleotide probe may comprise a label. Binding of the probe may be determined by detecting the presence of the label.
  • Binding of a probe to target nucleic acid may be measured using any of a variety of techniques at the disposal of those skilled in the art as described above.
  • probes may be radioactively, fluorescently or enzymatically labelled.
  • Other methods not employing labelling of probe include variant specific amplification using PCR, RNase cleavage and allele specific oligonucleotide probing.
  • Probing may employ the standard Southern blotting technique. For instance DNA may be extracted from cells and digested with different restriction enzymes. Restriction fragments may then be separated by electrophoresis on an agarose gel, before denaturation and transfer to a nitrocellulose filter. Labelled probe may be hybridised to the DNA fragments on the filter and binding determined.
  • Suitable selective hybridisation conditions for oligonucleotides of 17 to 30 bases include hybridization overnight at 42°C in 6X SSC and washing in 6X SSC at a series of increasing temperatures from 42°C to 65°C. Selective hybridisation using micro-array formats may be performed using conditions of equivalent stringency.
  • Variant-specific oligonucleotide primers may also be used in PCR to specifically amplify sequence from the DXY156XY locus if a Y chromosome specific sequence variant, such as a T(A)5 motif, is present in the STR.
  • sequence may be amplified only if a T(A)5 motif is present at the fourth repeat position, or alternatively, only if a T(A)4 motif is present at the fourth repeat position in a test sample of genomic DNA.
  • the first member of the pair of oligonucleotide primers may comprise a nucleotide sequence which hybridises to a complementary sequence which is proximal to and 5' or 3' of the pentanucleotide repeats
  • the second member of the pair may comprise a nucleotide sequence which hybridises under stringent conditions to a DXYS156 STR sequence which has T(A)5 at the fourth repeat position and not to a DXYS156 STR sequence which has T(A)4 at the fourth repeat position (or vice versa), such that amplification only occurs in the presence of the particular motif at the fourth repeat position.
  • sequence may be amplified only if a G residue is present at a position 125 nucleotides downstream of the final residue of the STR, or alternatively, only if an A residue is present at this position in a test sample of genomic DNA.
  • the first member of the pair of oligonucleotide primers may comprise a nucleotide sequence which hybridises to a complementary sequence which is proximal to and 5' or 3' of position +125 downstream of the STR
  • the second member of the pair may comprise a nucleotide sequence which hybridises under stringent conditions to a DXYS156 locus sequence which has G at position +125 and not to a DXYS156 locus sequence which has A at this position (or vice versa), such that amplification only occurs in the presence of the particular residue at position +125.
  • sequence may be amplified only if a T residue is present at the position 53 nucleotides downstream of the STR, or alternatively, only if a C residue is present at this position in a test sample of genomic DNA.
  • the first member of the pair of oligonucleotide primers may comprise a nucleotide sequence which hybridises to a complementary sequence which is proximal to and 5' or 3' of position +53
  • the second member of the pair may comprise a nucleotide sequence which hybridises under stringent conditions to a DXYS156 locus sequence which has T at position +53 and not to a DXYS156 locus sequence which has C at this position (or vice versa), such that amplification only occurs in the presence of the particular residue at position +53.
  • amplification of more than one X or Y chromosome specific sequence variant or polymorphism may be undertaken simultaneously, for example in a multiplex reaction, or alternatively using a pair of amplification primers which both hybridise selectively to different sequences which each comprise a Y chromosome specific point mutation or which both hybridise selectively to different sequences which each comprise an X chromosome specific point mutation.
  • Alleles from the DXYS156 loci may be amplified in a multiplex reaction along with other microsatellites, for example multi-allelic autosomal STRs for DNA fingerprinting.
  • Protocols suitable for use in amplifying DXYS156XY locus and determining the motif in the fourth repeat position of the STR and/or the residues at positions +53 and +125 relative to the STR may be found in Molecular Cloning: a Laboratory Manual: 3rd edition, Sambrook & Russell, 2001, Cold Spring Harbor Laboratory Press and/or Current Protocols in Molecular Biology, Ausubel et al. eds., John Wiley & Sons, 1992.
  • the identification of a product as a Y chromosome allele is indicative of said sample being derived from a male donor.
  • a method for determining the sex of a donor of a genomic sample may therefore include: i) amplifying the DXYS156XY locus of the sample of genomic DNA provided by a donor to generate amplification products comprising one or more DXYS156XY alleles; and ii) determining the presence or absence of one or more Y chromosome-specific polymorphisms or point mutations in the amplified DXYS156XY locus.
  • a suitable Y chromosome-specific polymorphism, sequence variant or point mutation in the amplified DXYS156XY locus may be selected from the group consisting of: a) a T(A)5 insertion within the DXYS156 STR, b) a G residue at a position 125 nucleotides downstream of the STR, c) a T residue at a position 53 nucleotides downstream of the STR.
  • the presence of one or more of these point mutations is indicative that said allele is a Y chromosome allele and thus that said donor is male.
  • a method for determining the sex of a donor of a genomic sample may include: i) amplifying the DXYS156XY short tandem repeat (STR) of the sample of genomic DNA provided by a donor to generate amplification products comprising one or more DXY156XY alleles; and ii) determining the presence or absence of a T(A)5 motif in an allele of said one or more DXY156XY alleles, the presence of said motif being indicative that said allele is a Y chromosome allele and said donor is male.
  • the presence or absence of a T(A)5 motif at the fourth repeat position from the 5' end of the STR may be determined in such methods.
  • a common problem in genetic analysis is the determination of paternity in circumstances in which genomic DNA from the potential father is not available.
  • the present invention allows for simplified methods for paternity deficiency tests.
  • Such a method may comprise the steps of: i) amplifying the DXYS156XY locus from a first sample of genomic DNA provided by a first donor and a second sample of genomic DNA provided by a second donor to generate first and second amplification products, each said products comprising one or more DXYS156XY alleles, ii) identifying an allele in said first amplification products as a Y chromosome allele of said first donor and an allele in said second amplification products as a Y chromosome allele of said second donor, wherein said Y chromosome alleles are identified by determining the presence of one or more Y chromosome-specific point mutations in the amplified DXY156XY locus, iii) characterising the Y chromosome allele of said first and said second donor by determining the number of tandem repeats of each said allele; and iv) determining the family relationship between said first and second donors by comparing the Y chromosome all
  • a suitable Y chromosome-specific polymorphism, sequence variant or point mutation may be selected from the group consisting of: a) a T(A)5 motif within the DXYS156XY STR, b) a G residue at a position 125 nucleotides downstream of the STR, c) a T residue at a position 53 nucleotides downstream of the STR.
  • such a method may comprise: i) amplifying the DXYS156XY STR from a first sample of genomic
  • the presence or absence of a T(A)5 motif at the fourth repeat position from the 5' end of the allele of the STR may be determined.
  • the second donor is preferably a male of the same patrilineal descent as the absent or deceased father.
  • the presence of the same Y chromosome allele is indicative that the first and second donor are related i.e. are of the same patrilineal descent.
  • the presence of different Y chromosome alleles in the samples from the first and second donors is indicative of there being no family relationship between the individuals.
  • the presence of the same Y chromosome allele in the samples from the first and second donors is consistent with there being a family relationship between the individuals.
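The comparison step of the paternity deficiency test can be sketched as follows. This is a minimal illustration, assuming each typed sample is represented as a list of hypothetical `(repeat_count, chromosome)` pairs in which the chromosome assignment has already been made from the Y chromosome-specific point mutations; the function names and return strings are illustrative and not part of the described method.

```python
def y_allele(alleles):
    """Return the repeat count of the single Y allele in a typed sample, or None."""
    y_counts = [repeats for repeats, chromosome in alleles if chromosome == "Y"]
    # A single male donor is expected to carry exactly one Y allele.
    return y_counts[0] if len(y_counts) == 1 else None


def compare_patrilines(first_sample, second_sample):
    """Compare the Y alleles of two donors by their number of tandem repeats."""
    first, second = y_allele(first_sample), y_allele(second_sample)
    if first is None or second is None:
        return "inconclusive"
    if first == second:
        return "consistent with shared patrilineal descent"
    return "different Y alleles"


# Hypothetical samples: both donors carry a 12-repeat Y allele.
child = [(12, "Y"), (9, "X")]
paternal_uncle = [(12, "Y"), (7, "X")]
print(compare_patrilines(child, paternal_uncle))
```

A match is only consistent with a relationship, since unrelated males may share a common Y allele, and a mismatch could in principle also arise through mutation.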
  • Further forensic aspects of the present invention relate to the determination of the number and/or identity of multiple individuals from a mixed sample, for example male offenders in multiple rape cases.
  • Such a method may include: i) amplifying the DXY156XY locus from a sample of genomic DNA to generate amplification products comprising two or more DXY156XY alleles; and ii) identifying two or more alleles of said DXY156XY alleles as Y chromosome alleles, wherein said Y chromosome alleles are identified by determining the presence of one or more Y chromosome-specific polymorphisms, sequence variants or point mutations in the amplified DXY156XY short tandem repeat (STR) locus.
  • Suitable Y chromosome-specific polymorphisms, sequence variants or point mutations may be selected from the group consisting of: a) a T(A)5 motif within the DXYS156 STR, b) a G residue at a position 125 nucleotides downstream of the STR, c) a T residue at a position 53 nucleotides downstream of the STR.
  • such a method may include: i) amplifying the DXY156XY short tandem repeat (STR) from a sample of genomic DNA to generate amplification products comprising two or more DXY156XY alleles; and ii) identifying two or more alleles of said DXY156XY alleles as Y chromosome alleles, wherein said alleles are identified by determining the presence or absence of a T(A)5 motif within said allele.
  • the presence or absence of a T(A)5 motif may be determined at the fourth repeat position from the 5' end of the allele.
  • Methods may further include characterising said two or more Y chromosome alleles by determining the number of tandem repeats within said alleles.
  • the characterisation of said alleles provides information which contributes to the DNA fingerprinting of the individuals from whom the genomic DNA sample is derived.
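For a mixed sample, counting distinct Y alleles gives a lower bound on the number of male contributors. The sketch below assumes that each male contributes one Y allele and that alleles have already been assigned to X or Y; the `(repeat_count, chromosome)` representation and the function name are hypothetical.

```python
def min_male_contributors(alleles):
    """Lower bound on male donors: the number of distinct Y allele lengths observed."""
    y_repeat_counts = {repeats for repeats, chromosome in alleles if chromosome == "Y"}
    return len(y_repeat_counts)


# Hypothetical mixed stain: two different Y alleles (11 and 12 repeats) are seen.
mixed_stain = [(12, "Y"), (11, "Y"), (9, "X"), (12, "Y")]
print(min_male_contributors(mixed_stain))
```

The result is only a minimum, because two offenders who share the same Y allele length cannot be told apart at this locus.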
  • Another potentially very useful application of the present methods is the determination of the paternal descent of a stain, sample, or deposit: in many police investigations, especially where expensive mass screenings are required, knowledge of the geographic origin, and by extrapolation, the phenotype, of an unknown stain donor could narrow down the list of suspects and substantially reduce time and cost of the screening.
  • Different human populations have different frequencies of X and Y chromosome alleles of DXYS156. Determination of the particular X and Y alleles of DXYS156 may be used to provide estimates of the paternal and/or maternal descent of an individual.
  • a method may comprise: i) amplifying the DXY156XY locus from a genomic DNA sample provided by an individual to generate amplification products comprising one or more DXY156XY STR alleles; and, ii) identifying an allele of said one or more DXY156XY STR alleles as a Y chromosome allele or an X chromosome allele, wherein said allele is identified by determining the presence of one or more X or Y chromosome-specific polymorphisms, sequence variants or point mutations in the amplified DXY156XY locus comprising said allele, iv) characterising said X or Y chromosome STR allele by determining the number of tandem repeats therein, v) providing a database of the distribution of DXY156XY STR chromosome alleles in one or more different human populations, vi) comparing the Y chromosome allele or the X chromosome allele of said
  • a suitable X or Y chromosome specific polymorphism, sequence variant or point mutation may be selected from the group consisting of: a) an A insertion in a T(A)4 motif within said STR, b) a G/A substitution at a position 125 nucleotides downstream of the STR, c) a T/C substitution at a position 53 nucleotides downstream of the STR.
  • such a method may include: i) amplifying the DXY156XY short tandem repeat (STR) from a genomic DNA sample provided by an individual to generate amplification products comprising one or more DXY156XY alleles; and, ii) identifying an allele of said one or more DXY156XY alleles as a Y chromosome allele or an X chromosome allele, wherein said allele is identified by determining the presence or absence of a T(A)5 motif within said allele, iii) characterising said X or Y chromosome allele by determining the number of tandem repeats therein, iv) providing a database of the distribution of DXY156XY STR chromosome alleles in one or more different human populations, v) comparing the Y chromosome allele or the X chromosome allele of said sample with said database; and vi) determining the probability of said individual being a member of one or more of
  • T(A)5 motif may be determined at the fourth repeat position from the 5' end of the allele.
  • Databases showing the distribution of DXY156XY STR alleles in different human populations are shown in Tables 2 and 3.
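One way the database comparison could be turned into a probability is a simple Bayes calculation over candidate populations. The sketch below is an assumption about how such a step might be implemented, not the method of the application: the function name, the equal default priors, and the population names and frequencies are all hypothetical placeholders and are not taken from Tables 2 and 3.

```python
def population_posteriors(allele, freq_table, priors=None):
    """Posterior probability of each population given one observed allele.

    freq_table maps population name -> {repeat_count: allele frequency}.
    priors maps population name -> prior probability (equal priors if omitted).
    """
    populations = list(freq_table)
    if priors is None:
        priors = {p: 1.0 / len(populations) for p in populations}
    # Weight = prior x frequency of the observed allele in that population.
    weights = {p: priors[p] * freq_table[p].get(allele, 0.0) for p in populations}
    total = sum(weights.values())
    if total == 0:
        return {p: 0.0 for p in populations}
    return {p: w / total for p, w in weights.items()}


# Hypothetical Y allele frequencies, for illustration only.
y_frequencies = {
    "population_A": {11: 0.10, 12: 0.80},
    "population_B": {11: 0.30, 12: 0.20},
}
print(population_posteriors(11, y_frequencies))
```

With real data the frequencies would be read from the published tables, and the priors would reflect whatever is known about the case before typing.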
  • Nucleic acid suitable for use in performing the methods described herein may be provided as part of a kit, e.g. in a suitable container such as a vial in which the contents are protected from the external environment.
  • a kit may comprise: a) amplification primers specific for the DXY156XY locus; b) an oligonucleotide probe specific for an X or Y chromosome-specific polymorphism in the amplified DXY156XY short tandem repeat (STR) locus
  • a suitable probe may be specific for one or more of the following: a) a T(A)5 motif within the fourth repeat motif of said STR, b) a G or A residue at a position 125 nucleotides downstream of the STR, c) a T or C residue at a position 53 nucleotides downstream of the STR.
  • kit may, for example, include amplification primers specific for the DXY156XY STR and an oligonucleotide probe specific for Y chromosome alleles of DXY156XY comprising a T(A)5 motif at the fourth repeat position.
  • a probe may comprise the sequence of the fourth repeat motif of the DXY156XY STR or the complement thereof.
  • a kit may also include instructions for use in a method as described herein.
  • a kit may further include one or more reagents required for the amplification reaction, such as polymerase, nucleotides, buffer solution etc and for detection of the hybridised probe.
  • the nucleic acid may be labelled.
  • Figure 1 shows sequence data from Sicilian and Korean individuals which indicates that an inserted adenine is present in the fourth repeat motif of all males.
  • Figure 2 shows an allelic ladder which contains the most common X and Y alleles. Length analysis indicates that the adenine insertion is present in males but not females.
  • Figure 3 shows the global distribution of the Y11 allele of DXYS156.
  • Figure 4 shows the global distribution of the Y13, Y14 and Y15 alleles of DXYS156.
  • Figure 5 shows the global distribution of the X4 allele of DXYS156.
  • Table 1a shows the frequencies of DXYS156X alleles in Sicily and Korea.
  • Table 1b shows the frequencies of DXYS156Y alleles in Sicily and Korea.
  • Table 2 shows the frequencies of DXYS156X alleles worldwide. References are; Karafet T et al (1998) Hum Biol 70:979-992; Kersting et al (2001) Croat Med J. 42:310-314.
  • Table 3 shows the frequencies of DXYS156Y alleles worldwide. References are Rossi et al (1999) Int J Legal Med 112:78-81; Nata et al (1999) Int J Legal Med 112:406-408; Sajantila et al
  • Sicilian blood samples were obtained from healthy blood donors, selected for ancestry of all four grandparents, from the Sicilian towns of Troina (59 women and 48 men) and Sciacca (33 women and 50 men).
  • Korean samples (7 women and 34 men) were obtained from students in Kwangju, South Korea.
  • DNA was extracted from peripheral blood leukocytes by standard procedures.
  • Extracted DNA was amplified by polymerase chain reaction (PCR) using the primers;
  • the PCR was performed in 50 µL containing: 50ng of genomic DNA; 1U Taq DNA-polymerase (Perkin Elmer, USA); 5 µL reaction buffer 10x (20mM Tris-HCl pH 8, 100mM KCl, 0.1mM EDTA, 1mM DTT, 50% glycerol, 0.5% Tween 20, 0.5% Nonidet P40); 1.5mM MgCl2; 0.2mM of each dNTP; 0.2mM of each primer.
  • Primer 1 was modified 5' by addition of a dye label (TAMRA: N,N,N',N'-tetramethyl-6-carboxyrhodamine).
  • 1 µL of the products was diluted in 12 µL of deionised formamide and 1 µL of GeneScan 350 Rox (molecular weight DNA marker).
  • the DNA was denatured at 95°C for 3min, cooled on melting ice, and loaded on an ABI PRISM 310 Genetic Analyzer (Perkin Elmer, USA) for amplicon length determination.
  • the alleles were first cloned using the TOPO-TA Cloning Kit (Invitrogen, BV, Groningen, The Netherlands).
  • the same primers were used to sequence both strands of DNA, using ABI PRISM BigDye Terminator Cycle Sequencing Ready Reaction Kit (PE, Applied Biosystems, Milano, Italy).
  • DXYS156 Locus Amplification The DXYS156 locus was amplified by PCR using primers MVF12 and
  • Each PCR reaction contained: 1 µl MVF12 (10 pmoles)
  • Thermocycling was performed without oil as follows: lid preheated at 105°C, initial denaturing at 94°C for 2 minutes then 35 cycles of 94°C for 15 seconds, 58°C for 15 seconds, 72°C for 15 seconds with a final extension at 72°C for 10 minutes.
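The thermocycling programme above can be written down as data and its nominal hold time added up. This is a rough sketch that ignores ramp times between temperatures, which are not stated; the variable names are illustrative.

```python
# Programme as stated: (temperature in deg C, hold time in seconds).
initial_denaturation = (94, 2 * 60)
cycle_steps = [(94, 15), (58, 15), (72, 15)]  # denature, anneal, extend
cycles = 35
final_extension = (72, 10 * 60)

# Nominal hold time, excluding ramping between temperatures.
cycle_seconds = sum(seconds for _, seconds in cycle_steps)
total_seconds = initial_denaturation[1] + cycles * cycle_seconds + final_extension[1]

print(f"{total_seconds} s = {total_seconds / 60:.2f} min")
```

The holds add up to 2295 seconds, a little over 38 minutes, so the actual run time will be somewhat longer once ramping is included.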
  • PCR products were separated by electrophoresis on a 5% agarose gel until the X and the Y band were separated.
  • the X and the Y band were then excised with a scalpel from the gel and purified separately with QIAQUICK gel purification kit (QIAGEN, Germany).
  • the purified X and Y amplicons were submitted to DNA fluorescence sequencing using the Big Dye Terminator version 2 kit (ABI, USA) and the DNA sequence determined by capillary electrophoresis (ABI, USA) .
  • DNA sequencing is inconvenient for routine applications so an allelic ladder was generated which contains the most common alleles on the X and on the Y (Fig. 2).
  • Length analysis using this ladder confirmed the presence of the insertion in all males (98 out of 98 Sicilians and 34 out of 34 Koreans), but never in females (0 out of 92 Sicilians and 0 out of 7 Koreans).
  • Sex testing by amelogenin (Sullivan et al. 1993 supra) confirmed the sex recorded on the sample labels. We conclude that the point insertion is specific to the Y chromosome.
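The donor counts quoted for the length analysis can be cross-checked against the sampling figures given for Troina, Sciacca and Kwangju. The short tally below uses only those stated numbers; the dictionary layout is an illustrative choice.

```python
# Donor numbers as stated for each sampling location.
donors = {
    "Troina": {"women": 59, "men": 48},
    "Sciacca": {"women": 33, "men": 50},
    "Kwangju": {"women": 7, "men": 34},
}

sicilian_men = donors["Troina"]["men"] + donors["Sciacca"]["men"]
sicilian_women = donors["Troina"]["women"] + donors["Sciacca"]["women"]
total_men = sicilian_men + donors["Kwangju"]["men"]
total_women = sicilian_women + donors["Kwangju"]["women"]

print(sicilian_men, sicilian_women, total_men, total_women)
```

The Sicilian totals of 98 men and 92 women agree with the figures reported for the insertion, giving 132 males with the insertion and 99 females without it overall.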
  • the two X specific nucleotides have been found in DXYS156 allele 4 and in allele 7 in our samples.
  • these two X nucleotides occur in at least 44%, 80% and 91% of African, European, and other X chromosomes respectively, and more probably in close to 100% of X chromosomes.
  • DXYS156 generally has allele lengths ranging from 4 to 12 repeats on the X chromosome (Table 2), and 8 to 15 repeats on the Y chromosome (Table 3).
  • Although the X locus generally has allele lengths of 10 repeats or shorter, and the Y locus generally has 11 repeats or longer, there is an overlap of X and Y allele ranges amounting to several percent in many populations (Tables 2 and 3). This overlap has previously hindered the secure chromosomal assignment of alleles, despite efforts at statistical separation (Karafet T et al (1998) Hum Biol 70:979-992).
  • Y alleles of the STR can be unambiguously distinguished from the X alleles.
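The overlap between the X and Y length ranges can be made explicit with a few lines of arithmetic. The sketch below uses the general ranges quoted above (4 to 12 repeats on the X, 8 to 15 on the Y); the function name is hypothetical.

```python
X_REPEATS = range(4, 13)   # 4 to 12 repeats inclusive on the X chromosome
Y_REPEATS = range(8, 16)   # 8 to 15 repeats inclusive on the Y chromosome

# Allele lengths that occur on both chromosomes cannot be assigned by length alone.
ambiguous_lengths = sorted(set(X_REPEATS) & set(Y_REPEATS))


def assignable_by_length(repeats):
    """True if the repeat count falls outside the X/Y overlap."""
    return repeats not in ambiguous_lengths


print(ambiguous_lengths)
```

Alleles of 8 to 12 repeats fall in the overlap, which is where the chromosome-specific point mutations are needed to make the assignment.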
  • the predominant Y allele in the Sicilian samples is allele length 12, which is by far the most common European allele (Table 3).
  • the allele 11 in the Troina sample is, at 10 out of 48 males, unusually common for a European population (Fig. 3).
  • the family names of the Troina donors were investigated and six of the ten males with allele 11 were found to share either of two family names.
  • Evidently some of the increased percentage of allele 11 in Troina is due to these two paternal founders, who must have lived some time after the introduction of family names in Italy (13th to 14th century AD), but before three generations ago, which was our limit for tracing family relationships between donors.
  • mtDNA localisation is possible with a geographic precision of 0km to 2000km in two thirds of cases: if the mtDNA type is specific to the same part of the world as one of the X alleles, then it may be reasonable to equate the geographic specificity of the other X allele with the paternal origin of the sample.
  • DXYS156 offers a positive and negative control for sex testing (the X homologue should always appear), except that DXYS156 is multiallelic and therefore additionally may warn the user of the presence of contaminant alleles in the sample.
  • the unambiguously distinguishable X and Y alleles display geographic specificities, allowing separate estimates of the maternal and paternal geographic origins of a given sample. Typing of the DXYS156 STR as described herein therefore offers significant advantages over less informative STRs for standard multiplexing kits.
  • Japan 44 Forster 2000 (3) 0 0 0 0.43 0.34 0.20 0.(

Abstract

This invention relates to methods of Y chromosomal typing, using Y chromosome specific polymorphisms in the homologous short tandem repeat loci DXYS156Y and DXYS156X. These methods are particularly useful in forensics and legal medicine.

Description

Polymorphic Sex Testing Method
This invention relates to methods of genetic analysis, in particular Y chromosomal typing, using the homologous short tandem repeat loci DXYS156Y and DXYS156X. These methods are particularly useful in forensics and legal medicine.
New Y short tandem repeat (STR) loci are continuously being discovered with the progress of the Human Genome Project (e.g. White PS et al (1999). Genomics 57:433-437) and are gaining importance not only in studies of human evolution (Forster P. et al. (1998) Mol Biol Evol 15:1108-1114, Forster P et al (2000) Am J Hum Genet 67:182-196) but also in genealogical research, in forensic casework and in paternity testing (Jobling MA et al (1997) Int J Legal Med 110:118-124, Kayser M et al (1997) Int J Legal Med 110:125-133, 141-149, Honda K et al (1999) J Forensic Sci 44:868-72). Due to their non-recombining nature, Y STRs are inherited uniparentally in male lineages and persist as Y haplotypes over generations of male descendency, unless a mutation occurs. This feature of paternal inheritance makes Y STRs a powerful tool for family name studies (Sykes B, Irven C (2000) Am J Hum Genet 66:1417-1419).
In legal medical applications, Y typing is indispensable for sex testing (conventionally using the amelogenin system, Mannucci A et al (1994) Int J Legal Med 106:190-3, but see Santos FR et al (1998) Nat Genet 18:103 and Brinkmann B (2002) Int J Legal Med 116: 63), for paternity deficiency tests (i.e. when the putative father is dead or unavailable, the patrilineal relatives can be taken as substitutes, e.g. Foster EA et al (1998) Nature 396:27-28, Thomas MG et al (1998). Nature 394:138-40), and for multiple rape investigation (determination of the number and identity of male offenders in mixed sperm stains).
However, Y haplotypes alone do not have a discrimination capacity which is high enough for court convictions; many suspects, for example, will have male relatives with the same Y chromosome. The typing of autosomal loci which, because they inherently recombine in every generation, yield potentially unique genetic profiles (except in cases of identical twins), is therefore still indispensable. This often leads to a dilemma for the investigator as to whether to sacrifice valuable material for the sake of extensive Y chromosomal investigation, as stain DNA is often very limited and typically allows only few PCR amplifications.
DXYS156Y and DXYS156X are homologous microsatellite loci which map to the long arm of chromosome X (DXYS156X) and the short arm of chromosome Y (DXYS156Y) in humans (Chen et al (1994) Human Mutation 4: 208-211). Whilst the patterns of variation in these loci in human populations have been partially characterised (Karafet et al (1998) Human Biology 70 (6) 979-992), the use of DXYS156Y and DXYS156X in forensics has been limited by the overlap in size between alleles of the X and Y chromosome loci. This prevents unambiguous designation of an allele to the X or Y chromosome. The repeat motifs of the DXYS156 STR are flanked by several hundred nucleotides within the duplicate DXYS156XY locus which are almost identical on both the X and Y chromosome. The published sequence of the DXYS156 locus has the database accession number X71600.
The present invention relates to the realisation that point mutations exist within the X and Y homologues of the DXYS156 locus which are chromosome-specific and can be used to distinguish DXYS156 alleles from the X and Y chromosome. These mutations include an internal point insertion within the repetitive domain and two nucleotide substitutions within the flanking region. The use of these mutations significantly increases both the discrimination and sex testing capacity of the DXYS156XY locus, and allows, for example, a single PCR amplification experiment to provide the data which hitherto needed to be determined in three unrelated PCRs (i.e. DNA fingerprinting through a multi-allelic autosomal STR, Y typing through a Y STR, and sex testing with the amelogenin system).
One aspect of the present invention provides a method of analysis of genomic DNA comprising: i) amplifying the DXY156XY locus from a sample of genomic DNA to generate amplification products comprising one or more alleles of said DXY156XY locus; and, ii) identifying an allele of said one or more DXY156XY alleles as a Y chromosome allele or an X chromosome allele, wherein said allele is identified by determining the presence of one or more X chromosome or Y chromosome-specific point mutations in the amplified DXY156XY locus.
Suitable X or Y chromosome-specific point mutations in the amplified DXY156XY locus include the following: a) an A insertion within a T(A)4 motif of the DXY156XY STR,
b) a G/A substitution at a position 125 nucleotides downstream of the STR, c) a T/C substitution at a position 53 nucleotides downstream of the STR.
The G/A substitution is located 125 nucleotides downstream (i.e. 3') of the final A residue in the last T(A)4 repeat motif of the STR in the published DXYS156XY sequence. The T/C substitution is located 53 nucleotides downstream (i.e. 3') of the final A residue in the last T(A)4 repeat motif of the STR in the published DXYS156XY sequence.
Preferably, an allele of said one or more DXY156XY alleles is identified as a Y chromosome allele by determining the presence of one or more Y chromosome-specific polymorphisms or point mutations in the amplified DXY156XY locus, for example a T(A)5 motif within said STR, a G residue at a position 125 nucleotides downstream of the STR, and/or a T residue at a position 53 nucleotides downstream of the STR.
Of course, the skilled reader will understand that a Y chromosome specific point mutation which is an A residue insertion may equally be referred to as an X chromosome specific point mutation which is an A deletion. Similarly, a Y chromosome specific point mutation in which an A residue is substituted by a G residue is equivalent to an X chromosome specific point mutation in which a G residue is substituted by an A residue. This also applies to chromosome specific T/C substitutions.
Furthermore, it is evident to a skilled reader that a T>C substitution on one strand of nucleic acid is equivalent to an A>G substitution on the complementary strand. Similarly, a G>A substitution on one strand of nucleic acid is equivalent to a C>T substitution on the complementary strand and an A insertion in one strand is equivalent to a T insertion in the complementary strand. The point mutations described herein may be determined on either strand.
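The strand equivalences stated above follow directly from Watson-Crick base pairing and can be checked with a small sketch. The function names are illustrative only.

```python
# Watson-Crick complement for each DNA base.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def complement_substitution(ref, alt):
    """Return the (ref, alt) pair as it would be read on the complementary strand."""
    return COMPLEMENT[ref], COMPLEMENT[alt]


def reverse_complement(seq):
    """Return the complementary strand read in its own 5' to 3' direction."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


print(complement_substitution("T", "C"))   # T>C on one strand
print(complement_substitution("G", "A"))   # G>A on one strand
print(reverse_complement("TAAAAA"))        # the T(A)5 motif on the other strand
```

On the complementary strand the T(A)5 motif reads as five T residues followed by an A, so a probe or primer for the insertion can be designed against either strand.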
In some embodiments, a method of analysis of genomic DNA may comprise: i) amplifying the DXY156XY short tandem repeat (STR) from a sample of genomic DNA to generate amplification products comprising one or more DXY156XY alleles; and ii) identifying an allele of said one or more DXY156XY alleles as a Y chromosome allele or an X chromosome allele, wherein said allele is identified by determining the presence or absence of a T(A)5 motif within said STR.
Preferably, the presence or absence of a T(A)5 motif at the fourth repeat position from the 5' end of the STR is determined. The presence of a T(A)5 motif is indicative that the allele is a Y chromosome allele. Conversely, the absence of a T(A)5 motif is indicative that the allele is an X chromosome allele.
It will, of course, be appreciated that the absence of a T(A)5 motif at the fourth repeat position may be determined indirectly by determining the presence of a T(A)4 motif at that position.
In some embodiments of the present methods, an allele may be identified by determining the presence or absence of a G residue at position 125 nucleotides downstream of the STR (or a C residue on the complementary strand, see above). The presence of a G residue at this position is indicative that the allele is a Y chromosome allele. Conversely, absence of a G residue at this position is indicative that the allele is an X chromosome allele.
It will, of course, be appreciated that the absence of a G residue at this position may be determined indirectly by determining the presence of an A residue at this position. In some embodiments, an allele may be identified by determining the presence or absence of a T residue at a position 53 nucleotides downstream of the STR. The presence of a T residue is indicative that the allele is a Y chromosome allele. Conversely, the absence of a T residue at this position is indicative that the allele is an X chromosome allele.
It will, of course, be appreciated that the absence of a T residue at the position 53 nucleotides downstream of the STR may be determined indirectly by determining the presence of a C residue at that position.
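The three indicators described above can be combined into a simple assignment rule. The sketch below is one possible way to encode them, assuming the observed fourth repeat motif and the residues at +53 and +125 are supplied as plain strings read on the published strand; the function name, argument names and the "undetermined" outcome are hypothetical.

```python
# Marker states described in the text, read on the published strand.
Y_MARKERS = {"fourth_repeat": "TAAAAA", "plus53": "T", "plus125": "G"}  # T(A)5, T, G
X_MARKERS = {"fourth_repeat": "TAAAA", "plus53": "C", "plus125": "A"}   # T(A)4, C, A


def assign_chromosome(fourth_repeat=None, plus53=None, plus125=None):
    """Assign an allele to X or Y from whichever markers were typed."""
    observed = {"fourth_repeat": fourth_repeat, "plus53": plus53, "plus125": plus125}
    calls = set()
    for name, value in observed.items():
        if value is None:
            continue  # marker not typed
        if value == Y_MARKERS[name]:
            calls.add("Y")
        elif value == X_MARKERS[name]:
            calls.add("X")
        else:
            calls.add("unexpected")
    # Only report a chromosome when every typed marker agrees.
    if calls == {"Y"}:
        return "Y"
    if calls == {"X"}:
        return "X"
    return "undetermined"


print(assign_chromosome(fourth_repeat="TAAAAA", plus53="T", plus125="G"))
print(assign_chromosome(fourth_repeat="TAAAA"))
```

Requiring agreement between all typed markers is a cautious design choice for this sketch; conflicting calls might point to a mixed sample, a typing error or a rare variant and would need a closer look.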
A method may include obtaining or providing a sample of genomic DNA from an individual, patient or donor. A sample of genomic DNA may be extracted or taken from a sample of biological tissue from the individual, patient or donor, for example, a blood, bone, semen, hair, saliva or skin sample. Such a sample may be a forensic sample, for example a sample which is taken from a crime scene or archaeological site, and not directly from an individual.
The published sequence of the human DXYS156Y locus has the Genbank database accession number X71600 (Chen et al (1994) Hum. Mutat. 4 (3) 208-211). The STR sequence (i.e. the repetitive sequence) begins at base 1214 of the published sequence. For the published 165 bp allele sequence, which contains twelve penta-nucleotide repeats, the repeat sequence ends at base 1264. Clearly however, the number of repeat motifs of an allele and the presence or absence of a T(A)5 motif as described herein will determine the precise location of the 3' boundary of the repetitive region. As used herein, the term 'locus' refers to a genomic region which comprises both the repeat motifs (STR) and the sequence flanking these repeat motifs.
One region of particular interest to the present application is the fourth motif from the 5' end of the STR. This is shown herein to be indicative of the chromosomal origin of an allele of the homologous DXYS156X and DXYS156Y STR loci. In the published DXYS156Y allele sequence (X71600), which contains eleven pentanucleotide repeats and one hexanucleotide T(A)5 motif, the fourth repeat from the 5' end begins at base 1229 and ends at base 1234.
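The coordinates of the fourth repeat follow from the start of the STR and the lengths of the preceding repeats. The following is a minimal sketch of that arithmetic only; the function name `repeat_coordinates` and the list of repeat lengths are assumptions made for illustration and do not form part of the published sequence record.

```python
def repeat_coordinates(str_start, repeat_lengths, index):
    """Return the 1-based (first, last) base numbers of the repeat at 1-based `index`."""
    first = str_start + sum(repeat_lengths[: index - 1])
    last = first + repeat_lengths[index - 1] - 1
    return first, last

# Published Y allele: pentanucleotide repeats with a hexanucleotide
# T(A)5 motif at the fourth position; the STR begins at base 1214.
lengths = [5, 5, 5, 6] + [5] * 8
fourth = repeat_coordinates(1214, lengths, 4)
print(fourth)
```

In an X chromosome allele the fourth entry of the list would be 5 rather than 6, so the fourth repeat would end one base earlier.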
Another position of interest is 125 nucleotides downstream of the last A residue of the repetitive STR sequence. The present inventors have identified a chromosome specific single nucleotide polymorphism (SNP) at this position which is indicative of the chromosomal origin of an allele of the homologous DXYS156X and DXYS156Y STR loci (i.e. the DXYS156XY locus). A G residue is indicative of the DXYS156Y locus (i.e. the Y chromosomal locus) and an A residue is indicative of the DXYS156X locus (i.e. the X chromosomal locus). In the published DXYS156Y allele sequence (X71600), the position 125 nucleotides downstream of the 3' end of the STR is base 1399.
Yet another position of interest is 53 residues downstream of the last A residue of the repetitive STR sequence. The present inventors have identified a chromosome-specific single nucleotide polymorphism at this position which is also shown herein to be indicative of the chromosomal origin of an allele of the homologous DXYS156X and DXYS156Y STR loci. A T residue is indicative of the DXYS156Y locus (i.e. the Y chromosomal locus) and a C residue is indicative of the DXYS156X locus (i.e. the X chromosomal locus). In the published DXYS156Y allele sequence (X71600), the position 53 nucleotides downstream of the 3' end of the STR is base 1327.
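By way of illustration, the residues at the two downstream positions may be read from a sequence once the position of the last A residue of the STR is known. The sketch below assumes a plain string representation of one strand and 1-based numbering; the function names and the toy sequence (three repeats followed by an artificial flank) are hypothetical and are not taken from X71600.

```python
# Chromosome-specific residues as described above (offset -> residue).
Y_STATE = {125: "G", 53: "T"}
X_STATE = {125: "A", 53: "C"}

def downstream_base(seq, str_end, offset):
    """Residue `offset` nucleotides 3' of `str_end` (1-based last A of the STR)."""
    return seq[str_end + offset - 1].upper()

def snp_origin(seq, str_end, offset):
    """Return 'Y', 'X' or 'unknown' for the residue at the given offset."""
    base = downstream_base(seq, str_end, offset)
    if base == Y_STATE[offset]:
        return "Y"
    if base == X_STATE[offset]:
        return "X"
    return "unknown"

# Toy locus (hypothetical): 15 bases of repeat, then a 130 base flank
# carrying the Y-specific residues at +53 and +125.
flank = ["N"] * 130
flank[52] = "T"
flank[124] = "G"
toy = "TAAAA" * 3 + "".join(flank)
toy_calls = (snp_origin(toy, 15, 53), snp_origin(toy, 15, 125))
print(toy_calls)
```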
The use of the DXYS156 STR in DNA analysis, for example in sex testing protocols, allows a positive control to be performed; in an assay which is functioning correctly, amplification of the DXYS156 STR from a genomic sample should result in the amplification of at least one X chromosome allele, as genomic samples from both male and female donors contain such an allele. The presence of an X chromosome allele therefore indicates that the amplification reaction is operating correctly and the absence of such an allele is an indication of incorrect operation.
X chromosome alleles may also be used to estimate the maternal geographic origin of a sample, as they display geographic specificity.
In methods of the present invention, it may be desirable to identify and characterise X chromosome alleles of DXYS156 in a sample as well as Y chromosome alleles. As described above, X chromosome alleles may be identified by determining the absence of a T(A)5 motif in the STR, the absence of a G at a position 125 nucleotides downstream of the STR (or the presence of A at this position) and/or the absence of a T at position 53 nucleotides downstream of the STR (or the presence of C at this position).
For example, X chromosome alleles may be determined positively, by determining the presence of a T(A)4 motif at the fourth repeat position from the 5' end of the STR or negatively, by determining the absence of a T(A)5 motif at the fourth repeat position from the 5' end of the STR. Methods of analysis of genomic DNA as described herein may therefore comprise identifying one or more DXYS156XY alleles as a Y chromosome allele and one or more DXYS156XY alleles as an X chromosome allele, wherein said allele is identified by determining the presence of one or more X or Y chromosome-specific polymorphisms.
A suitable chromosome-specific polymorphism may be a point mutation selected from the group consisting of: a) an A insertion within a T(A)4 motif of said STR, b) a G/A substitution at a position 125 nucleotides downstream of the STR, and c) a T/C substitution at a position 53 nucleotides downstream of the STR.
For example a method of analysis of genomic DNA may comprise identifying one or more DXYS156XY alleles as a Y chromosome allele and one or more DXYS156XY alleles as an X chromosome allele, wherein said X chromosome allele and said Y chromosome allele are identified by determining the presence or absence of a T(A)5 motif in the STR sequence.
Preferably the presence or absence of a T(A)5 motif at the fourth repeat position from the 5' end of the STR is determined. The presence of said T(A)5 motif is indicative that the product is a Y chromosome allele and the absence of said T(A)5 motif is indicative that the allele is an X chromosome allele.
Alternatively and/or additionally, the X chromosome allele and the Y chromosome allele may be identified by determining the presence or absence of a G residue at a position 125 nucleotides downstream of the STR sequence (+125) and/or a T residue at position 53 residues downstream of the STR sequence (+53). Of course, the absence of a G residue at position +125 may be determined by determining the presence of an A residue at this position and the absence of a T residue at position +53 may be determined by determining the presence of a C residue at this position.
In some embodiments, the identity of the residue at position +125 and/or position +53 may be determined.
Of course, in methods of the present invention, the chromosomal origin of an allele may be identified by determining the presence or absence of any one (i.e. a, b or c), any two (ab, ac or bc) or all three (abc) of the point mutations described herein.
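The possible selections of point mutations can be enumerated mechanically. The sketch below is illustrative only: the labels a, b and c refer to the three point mutations, while the names `marker_sets` and `is_y` and the rule that any one Y-specific state suffices are assumptions made for the example.

```python
from itertools import combinations

MARKERS = ("a", "b", "c")

# Every non-empty selection of the three point mutations.
marker_sets = ["".join(c) for r in (1, 2, 3) for c in combinations(MARKERS, r)]

def is_y(observed, chosen):
    """True if any chosen marker shows its Y-specific state.

    observed: dict mapping marker label to True (Y state) or False (X state).
    chosen: iterable of marker labels, e.g. "ab".
    """
    return any(observed[m] for m in chosen)

print(marker_sets)
```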
DXY156XY is multi-allelic and the particular alleles of the homologous STR on the Y chromosome (DXY156Y) and X chromosome (DXY156X) may be determined. As different individuals will possess different alleles, this enables sample contamination to be detected and provides a substantial contribution to the DNA fingerprinting of a sample. Alleles of an STR such as DXY156XY are distinguished by the number of repeats of the pentanucleotide repetitive motif. The different alleles are indicated herein by a number which refers to the number of repeat motifs present e.g. allele 11 has 11 copies of the repeat motif. The repeat motif of DXY156XY is T(A)4. Of course, in Y chromosome alleles, the T(A)4 motif is replaced in the fourth position from the 5' end by the motif T(A)5 as described herein.
For the avoidance of doubt, T(A)4 is used herein to designate the pentanucleotide repeat motif TAAAA and T(A)5 is used herein to designate the hexanucleotide repeat motif TAAAAA. General references to repeat motifs and tandem repeats include both pentanucleotide TAAAA motifs and hexanucleotide TAAAAA motifs unless otherwise stated.
Methods of analysis as described herein may comprise characterising a Y chromosome allele and/or an X chromosome allele by determining the number of copies of the repeat motif in the STR sequence i.e. the number of tandem repeats of said allele. The number of repeat motifs will determine the length of the STR and molecular weight of products amplified from it. Alleles of the STR are distinguished by the number of repeat motifs.
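Where the STR sequence itself is available, for example from sequencing, the number of repeat motifs and the motif at the fourth position may be read off directly. The following sketch assumes the STR has already been isolated from its flanking sequence and is supplied as a string; the function name `characterise_str` is hypothetical.

```python
import re

def characterise_str(str_seq):
    """Return (number of repeat motifs, motif at the fourth position or None)."""
    s = str_seq.upper()
    # The STR must consist solely of TAAAA and/or TAAAAA motifs.
    if not re.fullmatch(r"(?:TA{4,5})+", s):
        raise ValueError("sequence is not a run of TAAAA/TAAAAA motifs")
    motifs = re.findall(r"TA{4,5}", s)
    fourth = motifs[3] if len(motifs) >= 4 else None
    return len(motifs), fourth

# Synthetic examples: a 12-repeat allele with T(A)5 at the fourth
# position and an 11-repeat allele with T(A)4 throughout.
y_like = "TAAAA" * 3 + "TAAAAA" + "TAAAA" * 8
x_like = "TAAAA" * 11
print(characterise_str(y_like), characterise_str(x_like))
```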
The number of tandem repeats may be determined using conventional methods in the art, such as sequencing and electrophoresis as described herein.
As described above, a test sample of genomic DNA may be provided for example by extracting nucleic acid from cells or biological tissues or fluids, urine, saliva, faeces, a buccal swab, biopsy or preferably blood, or for pre-natal testing from the amnion, placenta or foetus itself. Methods of extracting DNA from biological samples are well known in the art. A sample so extracted may be used in a method described herein.
The DXY156XY short tandem repeat (STR) may be amplified by subjecting the genomic DNA in a sample to a specific nucleic acid amplification reaction such as the polymerase chain reaction (PCR) (reviewed for instance in "PCR protocols; A Guide to Methods and Applications", Eds. Innis et al, 1990, Academic Press, New York, Mullis et al, Cold Spring Harbor Symp. Quant. Biol., 51:263, (1987), Ehrlich (ed), PCR technology, Stockton Press, NY, 1989, and Ehrlich et al, Science, 252:1643-1650, (1991)). PCR comprises steps of denaturation of template nucleic acid (if double-stranded), annealing of primer to target, and polymerisation. The nucleic acid used as template in the amplification reaction is preferably human genomic DNA.
Other specific nucleic acid amplification techniques include strand displacement activation, the Qβ replicase system, the repair chain reaction, the ligase chain reaction and ligation activated transcription. For convenience, and because it is generally preferred, the term PCR is used herein in contexts where other nucleic acid amplification techniques may be applied by those skilled in the art. Unless the context requires otherwise, reference to PCR should be taken to cover use of any suitable nucleic acid amplification reaction available in the art.
Oligonucleotide primers suitable for amplification may be designed using genomic sequences which flank a region of the DXYS156XY locus containing one or more point mutations as described herein, in particular sequence which flanks the STR and/or positions 125 nucleotides or 53 nucleotides downstream of the STR. The published DXYS156Y locus sequence (X71600) may be used to design such primers. A suitable oligonucleotide may be about 30 or fewer nucleotides in length (e.g. 18, 21 or 24). Generally specific primers are upwards of 14 nucleotides in length, but need not be longer than 18-20. Those skilled in the art are well versed in the design of primers for use in processes such as PCR.
Various techniques for synthesizing oligonucleotide primers are well known in the art, including phosphotriester and phosphodiester synthesis methods. In some embodiments, the DXYS156XY locus may be amplified using a pair of oligonucleotide primers, of which the first member of the pair comprises a nucleotide sequence which hybridises to a complementary sequence which is proximal to and 5' of the pentanucleotide repeats, and the second member of the primer pair comprises a nucleotide sequence which hybridises to a complementary sequence which is proximal to and 3' of the pentanucleotide repeats.
In other embodiments, the DXYS156XY locus may be amplified using a pair of oligonucleotide primers, of which the first member of the pair comprises a nucleotide sequence which hybridises to a complementary sequence which is proximal to and 5' of a position 125 nucleotides downstream of the STR, and the second member of the primer pair comprises a nucleotide sequence which hybridises to a complementary sequence which is proximal to and 3' of position 125 nucleotides downstream of the STR.
In other embodiments, the DXYS156XY locus may be amplified using a pair of oligonucleotide primers, of which the first member of the pair comprises a nucleotide sequence which hybridises to a complementary sequence which is proximal to and 5' of a position 53 nucleotides downstream of the STR, and the second member of the primer pair comprises a nucleotide sequence which hybridises to a complementary sequence which is proximal to and 3' of a position 53 nucleotides downstream of the STR.
In other embodiments, a primer pair consisting of a primer from each of two different primer pairs described above may be used to amplify a region of the DXYS156 locus which comprises more than one point mutation (e.g. two or three). As described above, suitable primers may be designed using genomic sequence flanking the DXYS156XY STR, for example, the sequence of the DXYS156Y locus which has the Genbank database accession number X71600. Primers may, of course, also be designed using genomic sequences flanking the published DXYS156Y locus sequence (X71600), for example neighbouring loci. Examples of suitable primers for amplifying the DXYS156 locus are MVF12 and PF31 described below.
Analysis of DXY156XY STR amplification products to distinguish between Y chromosome and X chromosome alleles as described herein may be carried out in a number of ways familiar to those skilled in the art.
DXY156XY locus amplification products may be sequenced and/or tested using known methods to determine the presence or absence of one or more of the T(A)5 motif in the fourth repeat from the 5' end of the DXY156XY STR, a G residue at a position 125 nucleotides downstream of the STR and/or a T residue at a position 53 nucleotides downstream of the STR, and thereby distinguish between Y chromosome and X chromosome alleles.
Sequencing of an amplification product may involve precipitation with isopropanol, resuspension and sequencing using a TaqFS Dye terminator sequencing kit. Extension products may be electrophoresed on an ABI 377 DNA sequencer and data analysed using Sequence Navigator software.
An amplification product may be treated so as to display on a denaturing polyacrylamide DNA sequencing gel specific bands that are linked to the chromosome specific variant.
Alternatively, SSCP heteroduplex analysis may be used to screen genomic DNA for Y or X chromosome DXY156XY STR alleles. This generally involves amplifying labelled 100-300 bp fragments of nucleic acid, diluting these products and denaturing at 95°C. The fragments are quick-cooled on ice so that the DNA remains in single stranded form. These single stranded fragments are run through acrylamide based gels. The insertion or substitution of a single base in alleles from the Y chromosome will cause the single stranded molecules to adopt different conformations in this gel matrix, making their mobility different from X chromosome alleles, which lack the substitution or insertion. This allows detection of Y chromosome alleles in the amplification products being analysed, for example upon exposure of the gel to X-ray film.
Any suitable label may be used to label the fragments of nucleic acid, for example a fluorescent label or radiolabel.
Alleles with altered mobility/conformations may be directly excised from the gel and directly sequenced to confirm the presence of a DXYS156Y specific polymorphism, such as presence of the T(A)5 motif in the STR, for example at the fourth repeat from the 5' end of the STR.
A chromosome specific point mutation such as described herein may generate a chromosome specific endonuclease recognition site which may provide a convenient means of determining the presence of such mutations.
For example, GATCC is a Y chromosome specific restriction site recognised by various endonucleases, including AlwI, BspPI, AclWI, which is generated by the substitution at +125 in the DXYS156Y locus as described herein. This is the only site for this class of restriction enzymes within the DXYS156 STR flanking regions, so these enzymes may be used to differentiate the X from the Y chromosome by selectively cleaving the Y amplicon. GATTC is an X-chromosome specific restriction site which is recognised by various endonucleases, including HinfI, which is generated by the substitution at +125 in the DXYS156X locus as described herein. Suitable amplification primers may be designed as described herein to produce an amplicon in which this is the only site for this class of restriction enzymes. These enzymes may then be used to differentiate the Y from the X chromosome by selectively cleaving the X amplicon.
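The presence of either site in an amplicon sequence may be checked in silico before choosing an enzyme. The sketch below searches one strand only for the two sequences named above and does not model enzyme cut positions; the function names and the short toy amplicons are assumptions made for illustration and are not taken from the locus sequence.

```python
Y_SITE = "GATCC"  # Y chromosome specific site described above
X_SITE = "GATTC"  # X chromosome specific site described above

def site_positions(amplicon, site):
    """0-based start positions of every occurrence of `site` in `amplicon`."""
    amplicon = amplicon.upper()
    return [i for i in range(len(amplicon) - len(site) + 1)
            if amplicon.startswith(site, i)]

def origin_from_sites(amplicon):
    """Return 'Y' or 'X' when exactly one of the two sites is present."""
    has_y = bool(site_positions(amplicon, Y_SITE))
    has_x = bool(site_positions(amplicon, X_SITE))
    if has_y and not has_x:
        return "Y"
    if has_x and not has_y:
        return "X"
    return "undetermined"

# Hypothetical toy amplicons differing at a single base.
toy_y = "ACGTGATCCTTAG"
toy_x = "ACGTGATTCTTAG"
print(origin_from_sites(toy_y), origin_from_sites(toy_x))
```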
A method of the present invention may comprise treating a sample of genomic DNA with an enzyme which cleaves the DXYS156XY locus in a chromosome specific manner (i.e. is specific for one of the DXYS156Y and DXYS156X loci) prior to amplification. A suitable enzyme will cleave within the DXYS156Y locus (or an amplicon therein) but not within the DXYS156X locus or vice versa. Such an enzyme may be a restriction endonuclease which recognises a site which is present in the DXYS156X locus but not the DXYS156Y locus or vice versa.
Molecules which have been cleaved by the restriction enzyme will not form effective templates for amplification, so endonuclease cleavage of the sample prior to amplification may provide selective amplification of the X chromosome allele or the Y chromosome allele (i.e. the locus from one chromosome is preferentially amplified over the other).
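The effect of cleavage prior to amplification can be pictured as a filter on the available templates. The following sketch is a simplification which assumes complete digestion and treats any template containing the site as unavailable for amplification; the function name and the toy template sequences are hypothetical.

```python
def surviving_templates(templates, site):
    """Templates lacking `site`, i.e. those left intact by the digestion."""
    return {name: seq for name, seq in templates.items()
            if site not in seq.upper()}

# Hypothetical toy templates carrying the X and Y specific sites.
templates = {"X": "ACGTGATTCTTAG", "Y": "ACGTGATCCTTAG"}

after_y_cut = surviving_templates(templates, "GATCC")  # Y-specific digestion
after_x_cut = surviving_templates(templates, "GATTC")  # X-specific digestion
print(list(after_y_cut), list(after_x_cut))
```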
In other embodiments, a method of the present invention may comprise treating an amplified DXYS156XY locus with an enzyme which cleaves the DXYS156XY locus in a chromosome specific manner (i.e. is specific for either the DXYS156Y or the DXYS156X locus). Cleavage of the amplified locus is indicative that it is a substrate for the chromosome specific enzyme and thus allows the chromosomal origin of the amplified locus to be determined. Treatment with a specific endonuclease as described herein after amplification may thus be used to distinguish between X and Y chromosome amplicons.
A DXYS156XY target locus may be treated with both an X and a Y chromosome specific endonuclease, either individually in different reactions, or simultaneously in the same reaction. Such treatment may occur before or after amplification. For simultaneous treatment, the reaction is configured such that only one of the enzymes (for example the Y chromosome specific endonuclease) is active under a first set of conditions, for example a first temperature, whilst under a second set of conditions only the other enzyme (for example the X chromosome specific endonuclease) is active. The target locus is then treated in a reaction with both the enzymes under the first set of conditions and the products of the reaction determined, for example by electrophoresis, real-time PCR etc as described herein, then the same reaction is treated under the second set of conditions and the products determined again.
For example, DXYS156XY locus DNA may be treated with HinfI and BspPI simultaneously. At 37°C, only the HinfI is active and only X chromosomes (DXYS156X) will be cleaved. The presence of cleavage products is then determined by any conventional method as described herein. The reaction is then treated at 55°C, where only the BspPI is active and only the Y chromosome (DXYS156Y) is cleaved. The presence of cleavage products is then determined by any conventional method as described herein. If a Y allele is present, cleavage products will be present in the second analysis. If no Y allele is present, no such products will be detected in the second analysis. The presence of an X allele is indicated by the results of the first analysis. As a male sample will contain a single X allele, and a female sample a pair of X alleles, the determination of the presence of an X allele in the first analysis may also be used as a positive control.
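The interpretation of the two analyses may be summarised as a simple decision rule. The sketch below is illustrative only; the function name and the result strings are assumptions, and the two inputs stand for whether cleavage products were observed after the 37°C and 55°C steps respectively.

```python
def interpret_two_step_digest(cleaved_at_37, cleaved_at_55):
    """Interpret the first (X-specific) and second (Y-specific) analyses."""
    if not cleaved_at_37:
        # No X allele seen: the positive control has failed.
        return "invalid"
    return "Y allele present" if cleaved_at_55 else "no Y allele detected"

print(interpret_two_step_digest(True, True))
print(interpret_two_step_digest(True, False))
```

A result of "invalid" in this sketch corresponds to an assay in which the expected X chromosome allele was not detected.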
An alternative or supplement to looking for the presence of the T(A)5 motif in a test sample is to look for the presence of the T(A)4 motif, e.g. using a suitably specific oligonucleotide probe or primer.
An alternative or supplement to looking for the presence of a G residue at position +125 in a test sample is to look for the presence of an A residue at this position, e.g. using a suitably specific oligonucleotide probe or primer.
An alternative or supplement to looking for the presence of a T residue at position +53 in a test sample is to look for the presence of a C residue at this position, e.g. using a suitably specific oligonucleotide probe or primer.
Amplification products may be screened using a variant-specific probe. Such a probe may correspond in sequence to a region of the DXY156XY STR, or its complement, preferably to a region comprising the fourth repeat from the 5' end, or a region of the DXY156XY locus comprising position 125 nucleotides downstream of the STR (+125) and/or position 53 downstream of the STR (+53).
Under suitably stringent conditions, specific hybridisation of such a probe to test nucleic acid is indicative of the presence of a Y chromosome specific polymorphism such as a T(A)5 motif in an allele of the DXYS156 STR in the test or sample nucleic acid. Such a motif may replace a T(A)4 at the fourth repeat position in the STR. Other Y chromosome specific polymorphisms in the DXYS156 locus include an A>G substitution at a position 125 nucleotides downstream of the STR sequence and a C>T substitution at a position 53 downstream of the STR sequence.
Hybridization may be performed using any suitable approach or format. A large number of filter and solid support formats are available which are suitable for nucleic acid hybridization analysis (Beltz GA et al. (1985), in Methods In Enzymology, Vol. 100, Part B, R. Wu, L. Grossman, K. Moldave, Eds., Academic Press, New York, Chapter 19, pp. 266-308). One format, the so-called "dot blot" hybridization, involves the non-covalent attachment of target DNAs to a filter, which are subsequently hybridized with a radioisotope labeled probe(s). Many versions of "Dot blot" hybridization have been developed (Anderson MLM and Young BD, in Nucleic Acid Hybridization - A Practical Approach, B. D. Hames and S. J. Higgins, Eds., IRL Press, Washington, D.C. Chapter 4, pp. 73-111, 1985) and the technique has been used for multiple analysis of genomic mutations, the detection of overlapping clones and the construction of genomic maps.
Nucleic acid hybridization may also be performed on micro-formatted multiplex or matrix devices (e.g., DNA chips) (Barinaga M, (1991) Science 253 1489; Bains W, 10 Bio/Technology, pp. 757-758, 1992). These methods involve the attachment of specific DNA sequences to very small specific areas of a solid support, such as micro-wells of a DNA chip. The micro-array format also provides for sequencing using the 'sequencing by hybridisation' (SBH) approach (Southern et al (1992) Genomics 13 1008, Drmanac et al (1993) Science 260 1649-1652). Various microarray approaches for the analysis of genomic and STR sequences have been proposed (for example WO9943853) and are suitable for carrying out the methods described herein.
Mismatch between a probe and a target sequence may be employed to detect a point mutation as described herein. Under appropriate conditions (temperature, pH etc.), an oligonucleotide probe will hybridise with a sequence which is not entirely complementary. The degree of base-pairing between the two molecules will be sufficient for them to anneal despite a mis-match. Various approaches are well-known in the art for detecting the presence of a mis-match between two annealing nucleic acid molecules.
For instance, RNase A cleaves at the site of a mis-match. Cleavage can be detected by electrophoresing test nucleic acid to which the relevant probe or probes have annealed and looking for smaller molecules (i.e. molecules with higher electrophoretic mobility) than the full length probe/test hybrid.
Thus, an oligonucleotide probe that has the sequence of the DXY156XY STR, the DXY156XY STR flanking region or a portion thereof (either sense or anti-sense strand) which includes a Y chromosome specific polymorphism as described herein may be annealed to test nucleic acid, i.e. products amplified from the DXY156XY STR of a genomic DNA sample, and the presence or absence of a mis-match determined. Detection of the presence of a mis-match may indicate the presence in the test nucleic acid of a Y chromosome specific sequence variant such as a T(A)5 motif at the fourth repeat position, a G at position +125 relative to the STR or a T at position +53 relative to the STR or the presence of an X chromosome specific sequence variant such as a T(A)4 motif at the fourth repeat position, an A at position +125 3' of the STR or a C at position +53 3' of the STR.
The oligonucleotide probe may comprise a label. Binding of the probe may be determined by detecting the presence of the label.
Binding of a probe to target nucleic acid (e.g. amplified genomic DNA) may be measured using any of a variety of techniques at the disposal of those skilled in the art as described above. For instance, probes may be radioactively, fluorescently or enzymatically labelled. Other methods not employing labelling of probe include variant specific amplification using PCR, RNase cleavage and allele specific oligonucleotide probing. Probing may employ the standard Southern blotting technique. For instance DNA may be extracted from cells and digested with different restriction enzymes. Restriction fragments may then be separated by electrophoresis on an agarose gel, before denaturation and transfer to a nitrocellulose filter. Labelled probe may be hybridised to the DNA fragments on the filter and binding determined.
Those skilled in the art are well able to employ suitable conditions of the desired stringency for selective hybridisation, taking into account factors such as oligonucleotide length and base composition, temperature and so on.
Suitable selective hybridisation conditions for oligonucleotides of 17 to 30 bases include hybridization overnight at 42°C in 6X SSC and washing in 6X SSC at a series of increasing temperatures from 42°C to 65°C. Selective hybridisation using micro-array formats may be performed using conditions of equivalent stringency.
Variant-specific oligonucleotide primers may also be used in PCR to specifically amplify sequence from the DXY156XY locus if a Y chromosome specific sequence variant, such as a T(A)5 motif, is present in the STR. In some embodiments, sequence may be amplified only if a T(A)5 motif is present at the fourth repeat position, or alternatively, only if a T(A)4 motif is present at the fourth repeat position in a test sample of genomic DNA.
In such reactions, the first member of the pair of oligonucleotide primers may comprise a nucleotide sequence which hybridises to a complementary sequence which is proximal to and 5' or 3' of the pentanucleotide repeats, and the second member of the pair may comprise a nucleotide sequence which hybridises under stringent conditions to a DXYS156 STR sequence which has T(A)5 at the fourth repeat position and not to a DXYS156 STR sequence which has T(A)4 at the fourth repeat position (or vice versa), such that amplification only occurs in the presence of the particular motif at the fourth repeat position.
In other embodiments, sequence may be amplified only if a G residue is present at a position 125 nucleotides downstream of the final residue of the STR, or alternatively, only if an A residue is present at this position in a test sample of genomic DNA.
In such reactions, the first member of the pair of oligonucleotide primers may comprise a nucleotide sequence which hybridises to a complementary sequence which is proximal to and 5' or 3' of position +125 downstream of the STR, and the second member of the pair may comprise a nucleotide sequence which hybridises under stringent conditions to a DXYS156 locus sequence which has G at position +125 and not to a DXYS156 locus sequence which has A at this position (or vice versa), such that amplification only occurs in the presence of the particular residue at position +125.
In yet other embodiments, sequence may be amplified only if a T residue is present at the position 53 nucleotides downstream of the STR, or alternatively, only if a C residue is present at this position in a test sample of genomic DNA.
In such reactions, the first member of the pair of oligonucleotide primers may comprise a nucleotide sequence which hybridises to a complementary sequence which is proximal to and 5' or 3' of position +53, and the second member of the pair may comprise a nucleotide sequence which hybridises under stringent conditions to a DXYS156 locus sequence which has T at position +53 and not to a DXYS156 locus sequence which has C at this position (or vice versa), such that amplification only occurs in the presence of the particular residue at position +53.
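The principle of variant-specific priming can be pictured with a deliberately simplified model in which a primer is taken to anneal only where it matches the template exactly, so that a primer spanning the polymorphic residue is specific for one variant. The function name, the primer and both template sequences below are hypothetical, and real primer behaviour depends on the stringency conditions used.

```python
def primer_anneals(primer, template):
    """Simplified model: annealing requires an exact match on the given strand."""
    return primer.upper() in template.upper()

# Hypothetical templates differing at one residue (G versus A).
template_y = "CCGGATTACAGTTGCA"
template_x = "CCGGATTACAATTGCA"

# Hypothetical primer whose 3' terminal residue lies on the polymorphic site.
y_specific_primer = "GGATTACAG"

print(primer_anneals(y_specific_primer, template_y))
print(primer_anneals(y_specific_primer, template_x))
```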
Of course, amplification of more than one X or Y chromosome specific sequence variant or polymorphism may be undertaken simultaneously, for example in a multiplex reaction, or alternatively using a pair of amplification primers which both hybridise selectively to different sequences which each comprise a Y chromosome specific point mutation or which both hybridise selectively to different sequences which each comprise an X chromosome specific point mutation. Alleles from the DXYS156 loci may be amplified in a multiplex reaction along with other microsatellites, for example multi-allelic autosomal STRs for DNA fingerprinting.
Protocols suitable for use in amplifying DXYS156XY locus and determining the motif in the fourth repeat position of the STR and/or the residues at positions +53 and +125 relative to the STR may be found in Molecular Cloning: a Laboratory Manual: 3rd edition, Sambrook & Russell, 2001, Cold Spring Harbor Laboratory Press and/or Current Protocols in Molecular Biology, Ausubel et al. eds., John Wiley & Sons, 1992.
The identification of a product as a Y chromosome allele is indicative of said sample being derived from a male donor.
A method for determining the sex of a donor of a genomic sample may therefore include: i) amplifying the DXYS156XY locus of the sample of genomic DNA provided by a donor to generate amplification products comprising one or more DXYS156XY alleles; and ii) determining the presence or absence of one or more Y chromosome-specific polymorphisms or point mutations in the amplified DXYS156XY locus.
A suitable Y chromosome-specific polymorphism, sequence variant or point mutation in the amplified DXYS156XY locus may be selected from the group consisting of: a) a T(A)5 insertion within the DXYS156 STR, b) a G residue at a position 125 nucleotides downstream of the STR, c) a T residue at a position 53 nucleotides downstream of the STR. The presence of one or more of these point mutations is indicative that said allele is a Y chromosome allele and thus that said donor is male.
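The determination in step ii) may be sketched as follows, taking as input the marker states already observed for each amplified allele. The dictionary keys, the function names and the example samples are assumptions made for illustration; the Y-specific states are those listed above, with TAAAAA standing for the T(A)5 motif.

```python
# Y chromosome specific states of the three markers.
Y_STATES = {"fourth_repeat": "TAAAAA", "plus125": "G", "plus53": "T"}

def is_y_allele(allele):
    """True if one or more observed markers show the Y-specific state."""
    return any(allele.get(marker) == state for marker, state in Y_STATES.items())

def donor_is_male(alleles):
    """True if any amplified allele is identified as a Y chromosome allele."""
    return any(is_y_allele(a) for a in alleles)

# Hypothetical observations for two samples.
male_sample = [
    {"fourth_repeat": "TAAAA", "plus125": "A", "plus53": "C"},
    {"fourth_repeat": "TAAAAA", "plus125": "G", "plus53": "T"},
]
female_sample = [{"fourth_repeat": "TAAAA", "plus125": "A", "plus53": "C"}]

print(donor_is_male(male_sample), donor_is_male(female_sample))
```

Because `allele.get` is used, an allele for which only one or two of the markers were determined can still be assessed on the markers that are available.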
In some embodiments, a method for determining the sex of a donor of a genomic sample may include: i) amplifying the DXYS156XY short tandem repeat (STR) of the sample of genomic DNA provided by a donor to generate amplification products comprising one or more DXY156XY alleles; and ii) determining the presence or absence of a T(A)5 motif in an allele of said one or more DXY156XY alleles, the presence of said motif being indicative that said allele is a Y chromosome allele and said donor is male.
The presence or absence of a T(A)5 motif at the fourth repeat position from the 5' end of the STR may be determined in such methods.
A common problem in genetic analysis is the determination of paternity in circumstances in which genomic DNA from the potential father is not available. By providing an unambiguous distinction between X and Y chromosome alleles, the present invention allows for simplified methods for paternity deficiency tests.
Such a method may comprise the steps of: i) amplifying the DXYS156XY locus from a first sample of genomic DNA provided by a first donor and a second sample of genomic DNA provided by a second donor to generate first and second amplification products, each said products comprising one or more DXYS156XY alleles, ii) identifying an allele in said first amplification products as a Y chromosome allele of said first donor and an allele in said second amplification products as a Y chromosome allele of said second donor, wherein said Y chromosome alleles are identified by determining the presence of one or more Y chromosome-specific point mutations in the amplified DXY156XY locus, iii) characterising the Y chromosome allele of said first and said second donor by determining the number of tandem repeats of each said allele; and iv) determining the family relationship between said first and second donors by comparing the Y chromosome allele of said first donor with the Y chromosome allele of said second donor.
A suitable Y chromosome-specific polymorphism, sequence variant or point mutation may be selected from the group consisting of: a) a T(A)5 motif within the DXYS156XY STR, b) a G residue at a position 125 nucleotides downstream of the STR, c) a T residue at a position 53 nucleotides downstream of the STR.
In some embodiments, such a method may comprise: i) amplifying the DXYS156XY STR from a first sample of genomic DNA provided by a first donor and a second sample of genomic DNA provided by a second donor to generate first and second amplification products, each said products comprising one or more DXY156XY alleles, ii) identifying an allele in said first amplification products as a Y chromosome allele of said first donor and an allele in said second amplification products as a Y chromosome allele of said second donor, wherein said Y chromosome alleles are identified by determining the presence of a T(A)5 motif in said allele, the presence of said T(A)5 motif being indicative that said allele is a Y chromosome allele, iii) characterising the Y chromosome allele of said first and said second donor by determining the number of tandem repeats of each said allele; and iv) determining the family relationship between said first and second donors by comparing the Y chromosome allele of said first donor with the Y chromosome allele of said second donor.
The presence or absence of a T(A)5 motif at the fourth repeat position from the 5' end of the allele of the STR may be determined.
In a method for determining or confirming the paternity of a first donor where the father is absent or deceased, the second donor is preferably a male of the same patrilineal descent as the absent or deceased father. The presence of the same Y chromosome allele is indicative that the first and second donor are related i.e. are of the same patrilineal descent.
The presence of different Y chromosome alleles in the samples from the first and second donors is indicative of there being no family relationship between the individuals. The presence of the same Y chromosome allele in the samples from the first and second donors is consistent with there being a family relationship between the individuals.
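The comparison in step iv) can be sketched in Python. This is only an illustrative sketch: the `Allele` record, the function names and the result strings are hypothetical and do not appear in the specification, and each allele is assumed to have already been typed for a Y-specific marker and for its repeat number.

```python
from typing import List, NamedTuple, Optional


class Allele(NamedTuple):
    repeats: int      # number of tandem repeats in the STR
    y_specific: bool  # True if a Y-specific marker (e.g. T(A)5) was detected


def y_allele(alleles: List[Allele]) -> Optional[int]:
    """Return the repeat number of the donor's Y allele, or None if there is none."""
    found = sorted({a.repeats for a in alleles if a.y_specific})
    if len(found) > 1:
        raise ValueError("more than one Y allele: possible mixed sample")
    return found[0] if found else None


def patrilineal_result(first: List[Allele], second: List[Allele]) -> str:
    """Compare the Y alleles of two donors as in step iv)."""
    y1, y2 = y_allele(first), y_allele(second)
    if y1 is None or y2 is None:
        return "not applicable: a donor lacks a Y allele"
    if y1 == y2:
        return "consistent with shared patrilineal descent"
    return "indicative of no patrilineal relationship"


# Hypothetical typing results for three male donors.
child = [Allele(7, False), Allele(12, True)]
uncle = [Allele(8, False), Allele(12, True)]
stranger = [Allele(7, False), Allele(11, True)]
print(patrilineal_result(child, uncle))
print(patrilineal_result(child, stranger))
```

A shared allele is reported only as "consistent with" a relationship, since unrelated males frequently carry the same Y allele.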
Further forensic aspects of the present invention relate to the determination of the number and/or identity of multiple individuals from a mixed sample, for example male offenders in multiple rape cases.
Such a method may include: i) amplifying the DXY156XY locus from a sample of genomic DNA to generate amplification products comprising two or more DXY156XY alleles; and ii) identifying two or more alleles of said DXY156XY alleles as Y chromosome alleles, wherein said Y chromosome alleles are identified by determining the presence of one or more Y chromosome-specific polymorphisms, sequence variants or point mutations in the amplified DXY156XY short tandem repeat (STR) locus.
Suitable Y chromosome-specific polymorphisms, sequence variants or point mutations may be selected from the group consisting of: a) a T(A)5 motif within the DXYS156 STR, b) a G residue at a position 125 nucleotides downstream of the STR, c) a T residue at a position 53 nucleotides downstream of the STR.
In some embodiments, such a method may include: i) amplifying the DXY156XY short tandem repeat (STR) from a sample of genomic DNA to generate amplification products comprising two or more DXY156XY alleles; and ii) identifying two or more alleles of said DXY156XY alleles as Y chromosome alleles, wherein said alleles are identified by determining the presence or absence of a T(A)5 motif within said allele. The presence or absence of a T(A)5 motif may be determined at the fourth repeat position from the 5' end of the allele.
The presence of said two or more Y chromosome alleles is indicative of said genomic DNA sample being derived from two or more males. Methods may further include characterising said two or more Y chromosome alleles by determining the number of tandem repeats within said alleles.
As a number of different alleles exist, the characterisation of said alleles provides information which contributes to the DNA fingerprinting of the individuals from whom the genomic DNA sample is derived.
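As a rough illustration of the mixed-sample analysis, the following Python sketch counts the distinct Y alleles in a sample. The function name and the input format (pairs of repeat number and Y-marker status) are assumptions made for this example. The count is only a lower bound on the number of male contributors, because two males may carry the same Y allele.

```python
def min_male_contributors(alleles):
    """Lower bound on the number of males contributing to a mixed sample.

    `alleles` is an iterable of (repeat_number, is_y_specific) pairs,
    one pair per amplified allele (hypothetical input format).
    """
    distinct_y = {repeats for repeats, is_y in alleles if is_y}
    return len(distinct_y)


# Hypothetical mixed stain: one X allele and three different Y alleles.
mixed_sample = [(7, False), (11, True), (12, True), (13, True)]
print(min_male_contributors(mixed_sample))
```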
Another potentially very useful application of the present methods is the determination of the paternal descent of a stain, sample, or deposit: in many police investigations, especially where expensive mass screenings are required, knowledge of the geographic origin, and by extrapolation, the phenotype, of an unknown stain donor could narrow down the list of suspects and substantially reduce time and cost of the screening.
Different human populations have different frequencies of X and Y chromosome alleles of DXYS156. Determination of the particular X and Y alleles of DXYS156 may be used to provide estimates of the paternal and/or maternal descent of an individual.
A method may comprise: i) amplifying the DXY156XY locus from a genomic DNA sample provided by an individual to generate amplification products comprising one or more DXY156XY STR alleles; and, ii) identifying an allele of said one or more DXY156XY STR alleles as a Y chromosome allele or an X chromosome allele, wherein said allele is identified by determining the presence of one or more X or Y chromosome-specific polymorphisms, sequence variants or point mutations in the amplified DXY156XY locus comprising said allele, iii) characterising said X or Y chromosome STR allele by determining the number of tandem repeats therein, iv) providing a database of the distribution of DXY156XY STR chromosome alleles in one or more different human populations, v) comparing the Y chromosome allele or the X chromosome allele of said sample with said database; and vi) determining the probability of said individual being a member of one or more of said human populations.
A suitable X or Y chromosome-specific polymorphism, sequence variant or point mutation may be selected from the group consisting of: a) an A insertion in a T(A)4 motif within said STR, b) a G/A substitution at a position 125 nucleotides downstream of the STR, c) a T/C substitution at a position 53 nucleotides downstream of the STR.
In some embodiments, such a method may include: i) amplifying the DXY156XY short tandem repeat (STR) from a genomic DNA sample provided by an individual to generate amplification products comprising one or more DXY156XY alleles; and, ii) identifying an allele of said one or more DXY156XY alleles as a Y chromosome allele or an X chromosome allele, wherein said allele is identified by determining the presence or absence of a T(A)5 motif within said allele, iii) characterising said X or Y chromosome allele by determining the number of tandem repeats therein, iv) providing a database of the distribution of DXY156XY STR chromosome alleles in one or more different human populations, v) comparing the Y chromosome allele or the X chromosome allele of said sample with said database; and vi) determining the probability of said individual being a member of one or more of said human populations.
The presence or absence of a T(A)5 motif may be determined at the fourth repeat position from the 5' end of the allele. Databases showing the distribution of DXY156XY STR alleles in different human populations are shown in Tables 2 and 3.
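The specification does not state how the probability in the final step is to be computed. One possible reading, sketched below in Python, is a Bayesian posterior over candidate populations given the observed Y allele. The frequencies are the Karafet 1998 rows of Table 3 for alleles 11 to 13; the uniform prior, the restriction to three populations and all names in the code are assumptions made purely for illustration.

```python
# Y allele frequencies (alleles 11, 12, 13) from the Karafet 1998 rows of Table 3.
Y_FREQ = {
    "Africa (incl. Egypt)": {11: 0.72, 12: 0.22, 13: 0.01},
    "Europe": {11: 0.04, 12: 0.94, 13: 0.02},
    "E. Asia": {11: 0.42, 12: 0.30, 13: 0.17},
}


def population_posterior(allele, freq=Y_FREQ, prior=None):
    """Posterior probability of each population given one observed allele.

    With no prior supplied, every population is treated as equally likely
    beforehand (an assumption, not part of the described method).
    """
    populations = list(freq)
    if prior is None:
        prior = {p: 1.0 / len(populations) for p in populations}
    weights = {p: prior[p] * freq[p].get(allele, 0.0) for p in populations}
    total = sum(weights.values())
    if total == 0:
        raise ValueError("allele not observed in any listed population")
    return {p: w / total for p, w in weights.items()}


for allele in (11, 12, 13):
    posterior = population_posterior(allele)
    best = max(posterior, key=posterior.get)
    print(allele, best, round(posterior[best], 2))
```

In practice the prior would reflect the circumstances of the case, and the result depends strongly on which populations are included in the database.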
Nucleic acid suitable for use in performing the methods described herein may be provided as part of a kit, e.g. in a suitable container such as a vial in which the contents are protected from the external environment.
A kit may comprise: a) amplification primers specific for the DXY156XY locus; and b) an oligonucleotide probe specific for an X or Y chromosome-specific polymorphism in the amplified DXY156XY short tandem repeat (STR) locus.
A suitable probe may be specific for one or more of the following: a) a T(A)5 motif within the fourth repeat motif of said STR, b) a G or A residue at a position 125 nucleotides downstream of the STR, c) a T or C residue at a position 53 nucleotides downstream of the STR.
Such a kit may, for example, include amplification primers specific for the DXY156XY STR and an oligonucleotide probe specific for Y chromosome alleles of DXY156XY comprising a T(A)5 motif at the fourth repeat position. Such a probe may comprise the sequence of the fourth repeat motif of the DXY156XY STR or the complement thereof. A kit may also include instructions for use in a method as described herein. A kit may further include one or more reagents required for the amplification reaction, such as polymerase, nucleotides, buffer solution etc., and for detection of the hybridised probe. The nucleic acid may be labelled.
Various further aspects and embodiments of the present invention will be apparent to those skilled in the art in view of the present disclosure.
Certain aspects and embodiments of the invention will now be illustrated by way of example and with reference to the figures described below.
Figure 1 shows sequence data from Sicilian and Korean individuals which indicates that an inserted adenine is present in the fourth repeat motif of all males.
Figure 2 shows an allelic ladder which contains the most common X and Y alleles. Length analysis indicates that the adenine insertion is present in males but not females.
Figure 3 shows the global distribution of the Y11 allele of DXYS156.
Figure 4 shows the global distribution of the Y13, Y14 and Y15 alleles of DXYS156.
Figure 5 shows the global distribution of the X4 allele of DXYS156.
Table 1a shows the frequencies of DXYS156X alleles in Sicily and Korea. Table 1b shows the frequencies of DXYS156Y alleles in Sicily and Korea.
Table 2 shows the frequencies of DXYS156X alleles worldwide. References are: Karafet T et al (1998) Hum Biol 70:979-992; Kersting et al (2001) Croat Med J 42:310-314.
Table 3 shows the frequencies of DXYS156Y alleles worldwide. References are: Rossi et al (1999) Int J Legal Med 112:78-81; Nata et al (1999) Int J Legal Med 112:406-408; Sajantila et al (1996) Proc Natl Acad Sci USA 93:12035-12039; Kayser et al (1997) Int J Legal Med 110:125-133, 141-149; De Knijff et al (1997) Int J Legal Med 110:134-149; Forster et al (2000) Am J Hum Genet 67:182-196; Brinkmann et al (1999) Int J Legal Med 112:181-183; Salem et al (1996) Am J Hum Genet 59:741-743; Horst et al (1999) Int J Legal Med 112:211-212; Forster et al (1998) Mol Biol Evol 15:1108-1114.
Experimental Subjects and Methods
Sicilian blood samples were obtained from healthy blood donors, selected for ancestry of all four grandparents, from the Sicilian towns of Troina (59 women and 48 men) and Sciacca (33 women and 50 men). Korean samples (7 women and 34 men) were obtained from students in Kwangju, South Korea. DNA was extracted from peripheral blood leukocytes by standard procedures.
DXYS156 STR Amplification
Extracted DNA was amplified by polymerase chain reaction (PCR) using the primers:
5'-cagataccaaggtgagaatc-3'
5'-gtagtggtcttttgcctcc-3'
under the following thermocycling conditions: 94°C for 6 minutes, followed by 30 cycles of 94°C for 30s, 58°C for 30s, 72°C for 30s (Chen H et al (1994) Hum Mut 4:208-211).
The PCR was performed in 50μL containing: 50ng of genomic DNA; 1U Taq DNA polymerase (Perkin Elmer, USA); 5μL 10x reaction buffer (20mM Tris-HCl pH 8, 100mM KCl, 0.1mM EDTA, 1mM DTT, 50% glycerol, 0.5% Tween 20, 0.5% Nonidet P40); 1.5mM MgCl2; 0.2mM of each dNTP; 0.2mM of each primer. Primer 1 was modified at the 5' end by addition of a dye label (TAMRA: N,N,N',N'-tetramethyl-6-carboxyrhodamine). After PCR amplification, 1μL of the products was diluted in 12μL of deionised formamide and 1μL of GeneScan 350 ROX (molecular weight DNA marker). The DNA was denatured at 95°C for 3min, cooled on melting ice, and loaded on an ABI PRISM 310 Genetic Analyzer (Perkin Elmer, USA) for amplicon length determination.
For those alleles that were chosen for sequencing, the alleles were first cloned using the TOPO-TA Cloning Kit (Invitrogen BV, Groningen, The Netherlands). The same primers were used to sequence both strands of DNA, using the ABI PRISM BigDye Terminator Cycle Sequencing Ready Reaction Kit (PE Applied Biosystems, Milano, Italy).
In order to exclude the possibility of sample confusion or contamination, the recorded sex was confirmed for each sequenced sample by the amelogenin test (Sullivan KM et al (1993) Biotechniques 15:636-641; Mannucci A et al (1994) Int J Legal Med 106:190-3) using the kit AmpFlSTR Green (PE Applied Biosystems, Milano, Italy).
DXYS156 Locus Amplification
The DXYS156 locus was amplified by PCR using primers MVF12 and PF31:
MVF12: 5'-GGTGGTATTTTGATGGGAATTACATTTAAT-3'
PF31: 5'-CAGATACCAAGGTGAGAATC-3'
Each PCR reaction contained:
1 μl MVF12 (10 pmoles)
1 μl PF31 (10 pmoles)
5 μl DNA extract
2.5 μl Yellow Sub PCR Enhancer (Geneo BioProducts, Germany)
2 μl dNTP mix (10mM dATP, dCTP, dGTP, dTTP)
0.2 μl DNA polymerase (5 u per μl)
1.5 μl MgCl2 (25mM)
2.5 μl PCR reaction buffer (Geneo BioProducts, Germany)
10.3 μl ultrapure water
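The component volumes listed above add up to 26 μl per reaction. The following Python sketch, which is not part of the described protocol, scales the recipe to a master mix for several reactions and estimates the final MgCl2 concentration contributed by the 25mM stock. The 10% pipetting overage and the exclusion of the DNA extract from the master mix are assumptions of this example, as is the premise that the reaction buffer adds no further MgCl2.

```python
# Per-reaction volumes in microlitres, as listed in the text.
REACTION_UL = {
    "MVF12 primer": 1.0,
    "PF31 primer": 1.0,
    "DNA extract": 5.0,
    "Yellow Sub PCR Enhancer": 2.5,
    "dNTP mix": 2.0,
    "DNA polymerase": 0.2,
    "MgCl2 (25 mM)": 1.5,
    "PCR reaction buffer": 2.5,
    "ultrapure water": 10.3,
}


def master_mix(n_reactions, overage=0.10):
    """Volumes for a shared master mix; the template DNA is added per tube."""
    scale = n_reactions * (1 + overage)
    return {
        name: round(volume * scale, 2)
        for name, volume in REACTION_UL.items()
        if name != "DNA extract"
    }


total_ul = sum(REACTION_UL.values())
# Final MgCl2 concentration from the added stock alone (dilution of 25 mM).
mgcl2_final_mM = REACTION_UL["MgCl2 (25 mM)"] * 25 / total_ul

print(round(total_ul, 1), round(mgcl2_final_mM, 2))
print(master_mix(10))
```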
Thermocycling was performed without oil as follows: lid preheated at 105°C, initial denaturing at 94°C for 2 minutes then 35 cycles of 94°C for 15 seconds, 58°C for 15 seconds, 72°C for 15 seconds with a final extension at 72°C for 10 minutes.
PCR products were separated by electrophoresis on a 5% agarose gel until the X and the Y band were separated. The X and the Y band were then excised with a scalpel from the gel and purified separately with QIAQUICK gel purification kit (QIAGEN, Germany). The purified X and Y amplicons were submitted to DNA fluorescence sequencing using the Big Dye Terminator version 2 kit (ABI, USA) and the DNA sequence determined by capillary electrophoresis (ABI, USA).
Results
Adenine insertion in the Y homologue of DXYS156
Initial sequencing in Sicilian and Korean males and females of the whole range (4 to 14 repeats) of DXYS156 alleles consistently revealed an inserted adenine in the fourth repeat motif from the 5' end (Fig. 1) in all males, see Table 1.
DNA sequencing is inconvenient for routine applications so an allelic ladder was generated which contains the most common alleles on the X and on the Y (Fig. 2). Length analysis using this ladder confirmed the presence of the insertion in all males (98 out of 98 Sicilians and 34 out of 34 Koreans), but never in females (0 out of 92 Sicilians and 0 out of 7 Koreans). Sex testing by amelogenin (Sullivan et al. 1993 supra) confirmed the sex recorded on the sample labels. We conclude that the point insertion is specific to the Y chromosome.
Nucleotide Substitutions in the Y homologue of the DXYS156 Locus
Two Y-specific nucleotides were found in DXYS156 allele 11 and allele 12. Worldwide, >99% of non-African men and 95% of African men have Y allele 11, 12 or the derived alleles 13, 14, or 15. Given the negligible mutation speed of point mutations, it appears that these two Y nucleotides occur in almost 100% of Y chromosomes.
DXYS156Y:
1211 taataaaata aaataaaata aaaataaaat aaaataaaat aaaataaaat aaaataaaat
1271 aaaatactta ggaatatacc tagccaagga ggcaaaagac cactacaagg aaattataaa
1331 acactgctga aataaatcat agacaaaaca aacaagtaga aacacatccc atgctcatgg
1391 ataggtagga tcaatattgt gaaaatgaca
The position of the point mutations in the complementary strand to the published DXYS156XY sequence (X71600) is shown below.
acaatattga tCctacctat ccatgagcat gggatgtgtt tctacttgtt tgttttgtct atgatttatt tcagcagtgt tttataattt ccttgtagtg gtcttttgcc tccttggcta ggtatattcc taagtatttt attttatttt attttatttt attttatttt attttatttt tattttattt tattttatta tttttg
Furthermore, the two X-specific nucleotides have been found in DXYS156 allele 4 and in allele 7 in our samples. Worldwide, 44% of African, 80% of European, and >91% of other X chromosomes have X alleles between 4 and 7 and all others have longer, derived X alleles. Given the negligible mutation speed of point mutations, these two X nucleotides occur in at least 44%, 80% and 91% of African, European, and other X chromosomes respectively, and more probably in close to 100% of X chromosomes.
The positions of these nucleotides in the published DXYS156XY sequence are shown below:
DXYS156X :
1211 taataaaata aaataaaat aaaataaaat aaaataaaat aaaataaaat aaaataaaat
1271 aaaatactta ggaatatacc tagccaagga ggcaaaagac cactacaagg aaattaCaaa
1331 acactgctga aataaatcat agacaaaaca aacaagtaga aacacatccc atgctcatgg
1391 ataggtagaa tcaatattgt gaaaatgaca
The position of the point mutations in the complementary strand to the published DXYS156XY sequence (X71600) is shown below.
acaatattga ttctacctat ccatgagcat gggatgtgtt tctacttgtt tgttttgtct atgatttatt tcagcagtgt tttgtaattt ccttgtagtg gtcttttgcc tccttggcta ggtatattcc taagtatttt attttatttt attttatttt attttatttt attttatttt attttattt tattttatta tttttg
The typing of these X and Y specific nucleotides will yield the sex of the human individual sample in close to 100% of applications.
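The three markers can be checked directly against the DXYS156Y and DXYS156X sequences printed above. The Python sketch below is an illustration under stated assumptions rather than the typing procedure of the invention: the STR is located with a regular expression for consecutive T(A)4 or T(A)5 units, "downstream" positions are counted from the first base after the last repeat unit, and the function names and the three-way "Y", "X" or "ambiguous" call are choices made for this example.

```python
import re


def _clean(printed: str) -> str:
    """Remove the spacing of a printed sequence and lower-case it."""
    return "".join(printed.split()).lower()


# Sequences as printed in the text, without the position labels.
DXYS156Y = _clean("""
    taataaaata aaataaaata aaaataaaat aaaataaaat aaaataaaat aaaataaaat
    aaaatactta ggaatatacc tagccaagga ggcaaaagac cactacaagg aaattataaa
    acactgctga aataaatcat agacaaaaca aacaagtaga aacacatccc atgctcatgg
    ataggtagga tcaatattgt gaaaatgaca
""")
DXYS156X = _clean("""
    taataaaata aaataaaat aaaataaaat aaaataaaat aaaataaaat aaaataaaat
    aaaatactta ggaatatacc tagccaagga ggcaaaagac cactacaagg aaattaCaaa
    acactgctga aataaatcat agacaaaaca aacaagtaga aacacatccc atgctcatgg
    ataggtagaa tcaatattgt gaaaatgaca
""")

# A run of at least four consecutive T(A)4 or T(A)5 units.
STR_RUN = re.compile(r"(?:ta{4,5}){4,}")


def y_markers(seq: str) -> dict:
    """Report the repeat number and the state of the three Y-specific markers."""
    run = STR_RUN.search(seq)
    if run is None:
        raise ValueError("no TAAAA repeat run found")
    units = re.findall(r"ta{4,5}", run.group())
    end = run.end()  # index of the first base after the STR
    return {
        "repeats": len(units),
        "T(A)5 at 4th repeat": units[3] == "taaaaa",
        "T at +53": seq[end + 52] == "t",
        "G at +125": seq[end + 124] == "g",
    }


def call_chromosome(seq: str) -> str:
    """Call 'Y' if all three markers are present, 'X' if none, else 'ambiguous'."""
    m = y_markers(seq)
    votes = [m["T(A)5 at 4th repeat"], m["T at +53"], m["G at +125"]]
    if all(votes):
        return "Y"
    if not any(votes):
        return "X"
    return "ambiguous"


print(call_chromosome(DXYS156Y), y_markers(DXYS156Y))
print(call_chromosome(DXYS156X), y_markers(DXYS156X))
```

Requiring agreement of all three markers is deliberately conservative; the methods described above treat the presence of any one of them as indicative of a Y chromosome allele.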
DXYS156 Alleles
Worldwide surveys have shown that DXYS156 generally has allele lengths ranging from 4 to 12 repeats on the X chromosome (Table 2), and 8 to 15 repeats on the Y chromosome (Table 3). Although the X locus generally has allele lengths of 10 repeats or shorter, and the Y locus generally has 11 repeats or longer, there is an overlap of X and Y allele ranges amounting to several percent in many populations (Tables 2 and 3). This overlap has previously hindered the secure chromosomal assignment of alleles, despite efforts at statistical separation (Karafet T et al (1998) Hum Biol 70:979-992). By typing the adenine insertion which we have shown to be Y-specific and/or one or both of the X or Y specific nucleotides, Y alleles of the STR can be unambiguously distinguished from the X alleles.
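The ambiguity of assignment by length alone can be made concrete with a short Python sketch. The ranges are those quoted in the paragraph above (4 to 12 repeats on the X, 8 to 15 on the Y); the function name and return strings are illustrative only.

```python
X_RANGE = range(4, 13)  # 4 to 12 repeats reported on the X chromosome
Y_RANGE = range(8, 16)  # 8 to 15 repeats reported on the Y chromosome


def assign_by_length(repeats: int) -> str:
    """Chromosomal assignment using the repeat number alone."""
    in_x, in_y = repeats in X_RANGE, repeats in Y_RANGE
    if in_x and in_y:
        return "ambiguous"
    if in_x:
        return "X"
    if in_y:
        return "Y"
    return "outside reported ranges"


overlap = sorted(set(X_RANGE) & set(Y_RANGE))
print(overlap)
print([assign_by_length(r) for r in (7, 11, 14)])
```

Alleles in the overlapping range cannot be assigned by length, which is where typing of the Y-specific insertion or nucleotides resolves the call.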
Geographic Specificities of Alleles
The predominant Y allele in the Sicilian samples (inhabitants of Troina and Sciacca) is allele length 12, which is by far the most common European allele (Table 3). However, allele 11 in the Troina sample is, at 10 out of 48 males, unusually common for a European population (Fig. 3). The family names of the Troina donors were investigated and six of the ten males with allele 11 were found to share either of two family names. Evidently some of the increased percentage of allele 11 in Troina is due to these two paternal founders, who must have lived some time after the introduction of family names in Italy (13th to 14th century AD), but before three generations ago, which was our limit for tracing family relationships between donors.
While European males predominantly have Y allele 12 and Africans generally have Y allele 11, east Asians are distinctive in having the longest Y alleles at high frequency, namely Y alleles 13, 14 and 15 (Fig. 4). On the X chromosome, X allele 4 is found at high frequency only in Africans (Fig. 5). Hence, there are several X and Y alleles which are diagnostic for different regions of the world, which can assist the geographical assignment of the maternal and paternal line of a male sample of unknown origin. For female samples, in ideal cases paternal and maternal X alleles may be distinguished by inference if the abundant, maternally inherited, mitochondrial DNA (mtDNA) of a sample is additionally typed and located geographically (mtDNA localisation is possible with a geographic precision of 0km to 2000km in two thirds of cases): if the mtDNA type is specific to the same part of the world as one of the X alleles, then it may be reasonable to equate the geographic specificity of the other X allele with the paternal origin of the sample.
Conclusions
The ability to unambiguously distinguish the X and Y homologues via the point mutations described herein greatly enhances the discrimination capacity and sex testing capacity of DXYS156, thus combining into a single PCR amplification experiment what hitherto needed to be typed in three unrelated PCRs (i.e. DNA fingerprinting through a multi-allelic autosomal STR, Y typing through a Y STR, and sex testing with the amelogenin system). Like the amelogenin system, DXYS156 offers a positive and negative control for sex testing (the X homologue should always appear), except that DXYS156 is multiallelic and therefore additionally may warn the user of the presence of contaminant alleles in the sample. As a further bonus, the unambiguously distinguishable X and Y alleles display geographic specificities, allowing separate estimates of the maternal and paternal geographic origins of a given sample. Typing of the DXYS156 STR as described herein therefore offers significant advantages over less informative STRs for standard multiplexing kits.
Table 1a. DXYS156X alleles in Sicily and Korea
location          n    4     5     6     7     8     9     10    11
Sicily (Sciacca)  116  0.01  0     0     0.87  0.03  0.05  0.04  0
Sicily (Troina)   166  0     0     0     0.78  0.08  0.13  0.01  0.01
Korea (Kwangju)   48   0     0     0     0.96  0.04  0     0     0
Table 1b. DXYS156Y alleles in Sicily and Korea
location          n    10.1  11.1  12.1  13.1  14.1
Sicily (Sciacca)  50   0.02  0.14  0.84  0     0
Sicily (Troina)   48   0.02  0.21  0.77  0     0
Korea (Kwangju)   34   0     0.12  0.50  0.32  0.06
Table 2. DXYS156X alleles worldwide
location              n    reference           4     5     6     7     8     9     10    11
Sicily (Sciacca)      116  this study          0.01  0     0     0.87  0.03  0.05  0.04  0
Sicily (Troina)       166  this study          0     0     0     0.78  0.08  0.13  0.01  0.(
Germany               116  Kersting 2001 (17)  0     0     0     0.81  0.08  0.07  0.03  0
Italy                 100  Kersting 2001 (17)  0     0     0     0.81  0.06  0.12  0.01  0
Turkey                194  Kersting 2001 (17)  0.01  0     0.00  0.82  0.02  0.11  0.01  0
Morocco               129  Kersting 2001 (17)  0.00  0.00  0     0.75  0.03  0.17  0.01  0.(
E. Africa             21   Kersting 2001 (17)  0.14  0     0.04  0.33  0.28  0.14  0.04  0
W. Africa             361  Kersting 2001 (17)  0.32  0.00  0     0.20  0.26  0.14  0.03  0.(
Namibia (Ovambo)      179  Kersting 2001 (17)  0.31  0     0     0.21  0.30  0.13  0.02  0.(
Mongolia              20   Kersting 2001 (17)  0     0     0     1     0     0     0     0
China                 94   Kersting 2001 (17)  0     0.01  0     0.93  0.05  0     0     0
Korea                 48   this study          0     0     0     0.96  0.04  0     0     0
Japan                 138  Kersting 2001 (17)  0     0.00  0.00  0.92  0.04  0.02  0     0
PNG highlands         114  Kersting 2001 (17)  0     0.00  0     0.96  0.03  0     0     0
Africa (incl. Egypt)  223  Karafet 1998 (15)   0.19  0     0.01  0.24  0.27  0.18  0.08  0.(
Europe                312  Karafet 1998 (15)   0     0     0     0.80  0.07  0.10  0.03  0
N. Asia               657  Karafet 1998 (15)   0     0     0     0.91  0.06  0.02  0     0
E. Asia               552  Karafet 1998 (15)   0     0     0     0.93  0.05  0.02  0     0
Australasia           104  Karafet 1998 (15)   0     0     0     0.96  0.03  0.01  0     0
Americas              442  Karafet 1998 (15)   0     0     0     0.91  0.02  0.07  0     0
Table 3. DXYS156Y alleles worldwide
location              n    reference                 8     9     10    11    12    13    14
Sicily (Sciacca)      50   this study                0     0     0.02  0.14  0.84  0
Sicily (Troina)       48   this study                0     0     0.02  0.20  0.77  0
Italy (Modena)        99   Rossi 1999 (18)           0     0     0     0     0.99  0.01
Germany               179  Nata 1999 (19)            0     0     0.01  0.03  0.97  0
Finland               54   Sajantila 1996 (20)       0     0     0     0     0.98  0.02
Estonia               20   Sajantila 1996 (20)       0     0     0     0     1.00  0
Saami                 28   Sajantila 1996 (20)       0     0     0     0     1.00  0
Sweden                40   Sajantila 1996 (20)       0     0     0     0     0.98  0.02
Basques               25   Sajantila 1996 (20)       0     0     0     0.08  0.88  0.04
Switzerland           51   Sajantila 1996 (20)       0     0     0     0     1.00  0
Switzerland           99   Kayser/Knijff 1997 (5,6)  0     0     0     0     0.97  0.02
Netherlands           89   Kayser/Knijff 1997 (5,6)  0     0     0     0.05  0.94  0.01
Turkey                39   Forster 2000 (3)          0     0     0     0.13  0.82  0.05
Kurdistan             101  Brinkmann 1999 (21)       0     0.02  0     0.13  0.86  0
Sinai                 67   Salem 1996 (22)           0     0     0     0     1.00  0
Egypt                 153  Salem 1996 (22)           0     0     0     0.46  0.54  0
Morocco               44   Kersting 2001 (17)        0     0     0     0.66  0.32  0.02
E. Africa             21   Kersting 2001 (17)        0.05  0.05  0     0.90  0     0
W. Africa             181  Kersting 2001 (17)        0     0     0.01  0.90  0.09  0.01
Namibia (Ovambo)      28   Forster 2000 (3)          0     0     0     0.96  0.04  0
Thailand              50   Horst 1999 (23)           0     0     0     0.14  0.56  0.14  0.'
Mongolia              20   Forster 2000 (3)          0     0     0     0.45  0.40  0.10  0.(
China                 35   Forster 2000 (3)          0     0     0     0.31  0.23  0.46
Korea                 34   this study                0     0     0     0.12  0.50  0.32  0.(
Japan                 44   Forster 2000 (3)          0     0     0     0.43  0.34  0.20  0.(
Australia             32   Forster 1998 (2)          0     0     0     0.56  0.44  0     0
PNG highlands         47   Forster 1998 (2)          0     0     0     0.09  0.91  0     0
Africa (incl. Egypt)  204  Karafet 1998 (15)         0.02  0.03  0     0.72  0.22  0.01  0
Europe                231  Karafet 1998 (15)         0     0     0     0.04  0.94  0.02  0
N. Asia               644  Karafet 1998 (15)         0     0     0     0.34  0.64  0.02  0
E. Asia               497  Karafet 1998 (15)         0     0     0     0.42  0.30  0.17  0.1
Australasia           68   Karafet 1998 (15)         0     0     0     0.27  0.69  0.04  0
Americas              362  Karafet 1998 (15)         0     0     0.01  0.06  0.92  0.01  0

Claims

1. A method of analysis of genomic DNA comprising: i) amplifying the DXY156XY locus from a sample of genomic DNA to generate amplification products comprising one or more alleles of said DXY156XY STR; and ii) identifying an allele of said one or more DXY156XY STR alleles as a Y chromosome allele or an X chromosome allele, wherein said allele is identified by determining the presence of one or more X or Y chromosome-specific point mutations in the amplified DXY156XY locus comprising said allele.
2. A method according to claim 1 wherein the presence of one or more of the following X or Y chromosome-specific point mutations in the amplified DXY156XY locus is determined: a) an A insertion within a T(A)4 motif of the DXY156XY STR, b) a G/A substitution at a position 125 nucleotides downstream of the STR, c) a T/C substitution at a position 53 nucleotides downstream of the STR.
3. A method according to claim 2 wherein the presence or absence of a T(A)5 motif at the fourth repeat position from the 5' end of the STR is determined.
4. A method according to claim 2 wherein the presence or absence of a G residue at a position 125 nucleotides downstream of the STR is determined.
5. A method according to claim 2 wherein the presence or absence of a T residue at a position 53 nucleotides downstream of the STR is determined.
6. A method for determining the sex of a donor of a genomic sample including: i) amplifying the DXY156XY locus of the sample of genomic DNA provided by a donor to generate amplification products comprising one or more DXY156XY STR alleles; and ii) determining the presence or absence of one or more Y chromosome-specific point mutations in the amplified DXY156XY locus, the presence of said one or more point mutations being indicative that said allele is a Y chromosome allele and said donor is male.
7. A method according to claim 6 wherein said one or more Y chromosome-specific point mutations are selected from the group consisting of: a) a T(A)5 insertion within said STR, b) a G residue at a position 125 nucleotides downstream of the STR, c) a T residue at a position 53 nucleotides downstream of the STR.
8. A method comprising the steps of: i) amplifying the DXY156XY locus from a first sample of genomic DNA provided by a first donor and a second sample of genomic DNA provided by a second donor to generate first and second amplification products, each said products comprising one or more DXY156XY STR alleles, ii) identifying an allele in said first amplification products as a Y chromosome allele of said first donor and an allele in said second amplification products as a Y chromosome allele of said second donor, wherein said Y chromosome alleles are identified by determining the presence of one or more Y chromosome-specific point mutations in the amplified DXY156XY locus, iii) characterising the Y chromosome allele of said first and said second donor by determining the number of tandem repeats of each said allele; and iv) determining the family relationship between said first and second donors by comparing the Y chromosome allele of said first donor with the Y chromosome allele of said second donor.
9. A method comprising: i) amplifying the DXY156XY locus from a sample of genomic DNA to generate amplification products comprising two or more DXY156XY STR alleles; and ii) identifying two or more alleles of said DXY156XY STR alleles as Y chromosome alleles, wherein a Y chromosome allele is identified by determining the presence of one or more Y chromosome-specific point mutations in the amplified DXY156XY locus comprising said allele.
10. A method according to claim 8 or claim 9 wherein said Y chromosome-specific point mutations are selected from the group consisting of: a) a T(A)5 motif within said STR, b) a G residue at a position 125 nucleotides downstream of the STR, c) a T residue at a position 53 nucleotides downstream of the STR.
11. A method comprising: i) amplifying the DXY156XY locus from a genomic DNA sample provided by an individual to generate amplification products comprising one or more DXY156XY STR alleles; and ii) identifying an allele of said one or more DXY156XY alleles as a Y chromosome allele or an X chromosome allele, wherein said allele is identified by determining the presence of one or more X or Y chromosome-specific point mutations in the amplified DXY156XY locus comprising said allele, iii) characterising said X or Y chromosome allele by determining the number of tandem repeats therein, iv) providing a database of the distribution of DXY156XY STR chromosome alleles in one or more different human populations, v) comparing the Y chromosome allele or the X chromosome allele of said sample with said database; and vi) determining the probability of said individual being a member of one or more of said human populations.
12. A method according to claim 11 wherein said one or more X or Y chromosome-specific point mutations are selected from the group consisting of: a) an A insertion in a T(A)4 motif within said STR, b) a G/A substitution at a position 125 nucleotides downstream of the STR, c) a T/C substitution at a position 53 nucleotides downstream of the STR.
13. A kit comprising: a) amplification primers specific for the DXY156XY locus; and b) an oligonucleotide probe specific for an X or Y chromosome-specific polymorphism in the amplified DXY156XY locus.
14. A kit according to claim 13 wherein said probe is specific for one or more of the following: a) a T(A)5 motif within the fourth repeat motif of said STR, b) a G or A residue at a position 125 nucleotides downstream of the STR, c) a T or C residue at a position 53 nucleotides downstream of the STR.
PCT/GB2002/003920 2001-08-28 2002-08-27 Polymorphic sex testing method WO2003020973A2 (en)

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
GB0120816A GB0120816D0 (en) 2001-08-28 2001-08-28 Method of DNA analysis
GB0120816.4 2001-08-28
GB0210129.3 2002-05-02
GB0210129A GB0210129D0 (en) 2002-05-02 2002-05-02 Method of DNA analysis

Publications (2)

Publication Number Publication Date
WO2003020973A2 true WO2003020973A2 (en) 2003-03-13
WO2003020973A3 WO2003020973A3 (en) 2003-09-25

Family

ID=26246481

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/GB2002/003920 WO2003020973A2 (en) 2001-08-28 2002-08-27 Polymorphic sex testing method

Country Status (1)

Country Link
WO (1) WO2003020973A2 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2006004659A1 (en) * 2004-06-30 2006-01-12 Applera Corporation Methods for analyzing short tandem repeats and single nucleotide polymorphisms

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
CALI FRANCESCO ET AL: "DXYS156: A multi-purpose short tandem repeat locus for determination of sex, paternal and maternal geographic origins and DNA fingerprinting." INTERNATIONAL JOURNAL OF LEGAL MEDICINE, vol. 116, no. 3, June 2002 (2002-06), pages 133-138, XP002249224 ISSN: 0937-9827 *
CHEN HAIMING ET AL: "Homologous loci DXYS156X and DXYS156Y contain a polymorphic pentanucleotide repeat (TAAAA)-n and map to human X and Y chromosomes." HUMAN MUTATION, vol. 4, no. 3, 1994, pages 208-211, XP009013757 ISSN: 1059-7794 cited in the application *
KARAFET TATIANA ET AL: "Different patterns of variation at the X- and Y-chromosome-linked microsatellite loci DXYS156X and DXYS156Y in human populations." HUMAN BIOLOGY, vol. 70, no. 6, December 1998 (1998-12), pages 979-992, XP009013756 ISSN: 0018-7143 cited in the application *
KAYSER M ET AL: "Evaluation of Y-chromosomal STRs: A multicenter study." INTERNATIONAL JOURNAL OF LEGAL MEDICINE, vol. 110, no. 3, 1997, pages 125-133, XP002249225 ISSN: 0937-9827 cited in the application *

Also Published As

Publication number Publication date
WO2003020973A3 (en) 2003-09-25

Similar Documents

Publication Publication Date Title
JP7322063B2 (en) Novel primers and uses thereof
Divne et al. A DNA microarray system for forensic SNP analysis
White et al. The polymerase chain reaction
US5468610A (en) Three highly informative microsatellite repeat polymorphic DNA markers
Onofri et al. Development of multiplex PCRs for evolutionary and forensic applications of 37 human Y chromosome SNPs
US8999644B2 (en) Method for detecting the presence of a DNA minor contributor in a DNA mixture
JP2011501967A (en) Methods and kits for multiplex amplification of short tandem repeat loci
JP2007530026A (en) Nucleic acid sequencing
Pena DNA fingerprinting: state of the science
AU773854B2 (en) Method for the detection and/or analysis, by means of primer extension techniques, of single nucleotide polymorphisms in restriction fragments, in particular in amplified restriction fragments generated using AFLP
US7820385B2 (en) Method for retaining methylation pattern in globally amplified DNA
US9260757B2 (en) Human single nucleotide polymorphisms
US20090286235A1 (en) Mdr1 Snp in Acute Rejection
EP1003917A1 (en) Method and kit for hla class i typing dna
US20090286234A1 (en) Il10 snp associated with acute rejection
WO2003020973A2 (en) Polymorphic sex testing method
DiZinno et al. Typing of DNA derived from hairs
Asamura et al. Population data on 10 non-CODIS STR loci in Japanese population using a newly developed multiplex PCR system
US20100092947A1 (en) Impdh2 snp associated with acute rejection
WO2004063390A2 (en) Compositions and methods for determining canine gender
WO1999015701A1 (en) SUSCEPTIBILITY MUTATION 6495delGC OF BRCA2
KR101341943B1 (en) Kit for detecting STRs and method for detecting STRs using the same
Elmrghni Role of Amelogenin in Human Identification
WO1997013876A1 (en) Microsatellite sequences for canine genotyping
Hildebrandt et al. Polymerase Chain Reaction

Legal Events

Date Code Title Description
AK Designated states

Kind code of ref document: A2

Designated state(s): AE AG AL AM AT AU AZ BA BB BG BY BZ CA CH CN CO CR CU CZ DE DM DZ EC EE ES FI GB GD GE GH HR HU ID IL IN IS JP KE KG KP KR LC LK LR LS LT LU LV MA MD MG MN MW MX MZ NO NZ OM PH PL PT RU SD SE SG SI SK SL TJ TM TN TR TZ UA UG US UZ VC VN YU ZA ZM

Kind code of ref document: A2

Designated state(s): AE AG AL AM AT AU AZ BA BB BG BR BY BZ CA CH CN CO CR CU CZ DE DK DM DZ EC EE ES FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MA MD MG MK MN MW MX MZ NO NZ OM PH PL PT RO RU SD SE SG SI SK SL TJ TM TN TR TT TZ UA UG US UZ VC VN YU ZA ZM ZW

AL Designated countries for regional patents

Kind code of ref document: A2

Designated state(s): GH GM KE LS MW MZ SD SL SZ UG ZM ZW AM AZ BY KG KZ RU TJ TM AT BE BG CH CY CZ DK EE ES FI FR GB GR IE IT LU MC PT SE SK TR BF BJ CF CG CI GA GN GQ GW ML MR NE SN TD TG

Kind code of ref document: A2

Designated state(s): GH GM KE LS MW MZ SD SL SZ TZ UG ZM ZW AM AZ BY KG KZ MD RU TJ TM AT BE BG CH CY CZ DE DK EE ES FI FR GB GR IE IT LU MC NL PT SE SK TR BF BJ CF CG CI CM GA GN GQ GW ML MR NE SN TD TG

121 EP: The EPO has been informed by WIPO that EP was designated in this application
122 EP: PCT application non-entry in European phase
NENP Non-entry into the national phase in:

Ref country code: JP

WWW WIPO information: withdrawn in national office

Country of ref document: JP