CN118159667A

CN118159667A - Method for detecting repeat expansion sequences

Info

Publication number: CN118159667A
Application number: CN202280053833.7A
Authority: CN
Inventors: 庄松船; 李月丽; 王思佳; 刘题明; 谢秀凤; 连雅慧
Original assignee: National University of Singapore; National University Hospital Singapore Pte Ltd
Current assignee: National University of Singapore; National University Hospital Singapore Pte Ltd
Priority date: 2021-06-02
Filing date: 2022-06-02
Publication date: 2024-06-07
Also published as: WO2022255952A3; WO2022255952A2

Abstract

The present disclosure relates, inter alia, to a method of detecting the presence or absence of a repeat expansion sequence in two or more genes in a nucleic acid sample obtained from a subject, the method comprising: i) contacting the nucleic acid sample, for each of the two or more genes, with: a) a gene-specific primer that binds to a different target sequence of each gene, wherein the genes comprise nucleotide repeats, and wherein the different target sequences are upstream or downstream of the nucleotide repeats, and b) a universal primer that binds to a common target sequence shared by the two or more genes, wherein the common target sequence is located within the nucleotide repeats and on the strand opposite to that bound by the gene-specific primer; and ii) analyzing the amplification product.

Description

Method for detecting repeat expansion sequences

The present disclosure relates to the field of molecular biology. In particular, the specification teaches methods of detecting the presence or absence of repeat expansion sequences in two or more genes in a nucleic acid sample, and methods of screening for multiple repeat expansion diseases in a subject.

Background

The expansion of simple sequence repeats scattered throughout the human genome is now known to directly cause more than 35 human diseases. Most repeat expansion diseases are caused by trinucleotide repeats, although tetranucleotide, pentanucleotide, hexanucleotide and even dodecanucleotide repeat expansions have also been identified as the underlying mutations in other diseases.

Repeat expansion diseases often exhibit markedly variable phenotypes and are often difficult to distinguish by signs and symptoms alone, owing to extensive clinical overlap and other concomitant phenotypes. Molecular genetic testing is therefore necessary to identify the pathogenic mutation and confirm disease status in symptomatic individuals. Several diseases are caused by expansion of the same repeat at different loci. For example, CGG or GCG repeat expansions cause Fragile X Syndrome (FXS), several types of oculopharyngodistal myopathy, oculopharyngeal muscular dystrophy and developmental epileptic encephalopathy; CCG repeat expansions cause one type of intellectual disability and oculopharyngeal myopathy with leukoencephalopathy; CAG repeat expansions cause Huntington's disease and several types of spinocerebellar ataxia (SCA); CTG repeat expansions cause myotonic dystrophy type 1, Huntington's disease-like 2 and Fuchs corneal endothelial dystrophy; and TTTCA pentanucleotide repeat expansions cause several types of familial adult myoclonic epilepsy and spinocerebellar ataxia. Multiple rounds of genetic testing may be required to reach a correct diagnosis for an affected individual, resulting in additional cost and time.

Accordingly, there is a need to overcome or at least alleviate one or more of the problems mentioned above.

Disclosure of Invention

Disclosed herein is a method of detecting the presence or absence of a repeat expansion sequence in two or more genes in a nucleic acid sample obtained from a subject, the method comprising: i) contacting the nucleic acid sample under amplification conditions, for each of the two or more genes, with: a) a gene-specific primer that specifically binds to a different target sequence of each gene, wherein the genes comprise nucleotide repeats, and wherein the different target sequences are upstream or downstream of the nucleotide repeats, and b) a universal primer that binds to a common target sequence shared by the two or more genes, wherein the common target sequence is located within the nucleotide repeats and on the strand opposite to that bound by the gene-specific primers; wherein the gene-specific primers and the universal primer are capable of producing one or more amplification products from each gene; and ii) analyzing the amplification products.

Disclosed herein is a kit for detecting the presence or absence of a repeat expansion sequence in two or more genes in a nucleic acid sample obtained from a subject, the kit comprising: a) a gene-specific primer that specifically binds to a different target sequence of each of the two or more genes, wherein each gene comprises a nucleotide repeat sequence, and wherein the different target sequence is upstream or downstream of the nucleotide repeat sequence in each gene; and b) a universal primer that binds to a common target sequence shared by the two or more genes, wherein the common target sequence is located within the nucleotide repeat sequence and on the strand opposite to that bound by the gene-specific primer of each gene; wherein the gene-specific primers and the universal primer are capable of producing one or more amplification products from each gene.

Disclosed herein are compositions comprising a nucleic acid sample obtained from a subject, the compositions comprising: a) a gene-specific primer that specifically binds to a different target sequence of each of two or more genes, wherein each gene comprises a nucleotide repeat sequence, and wherein the different target sequences are upstream or downstream of the nucleotide repeat sequence in each gene; and b) a universal primer that binds to a common target sequence shared by the two or more genes, wherein the common target sequence is located within the nucleotide repeat sequence and on the strand opposite to that bound by the gene-specific primer of each gene; wherein the gene-specific primers and the universal primer are capable of producing one or more amplification products from each gene.

Disclosed herein is a method of screening a subject for one or more of multiple repeat expansion diseases, the method comprising: i) contacting a nucleic acid sample from the subject under amplification conditions with: a) a gene-specific primer that specifically binds to a different target sequence of each of two or more genes, wherein each gene comprises a nucleotide repeat sequence, and wherein the different target sequence is upstream or downstream of the nucleotide repeat sequence in each gene; and b) a universal primer that binds to a common target sequence shared by the two or more genes, wherein the common target sequence is located within the nucleotide repeat sequence and on the strand opposite to that bound by the gene-specific primer of each gene; wherein the gene-specific primers and the universal primer are capable of producing one or more amplification products from each gene; and ii) analyzing the amplification products to screen the subject for one or more of the repeat expansion diseases.

Disclosed herein is a method of screening a subject for one or more of multiple repeat expansion diseases and treating the subject, the method comprising: i) contacting a nucleic acid sample from the subject under amplification conditions with: a) a gene-specific primer that specifically binds to a different target sequence of each of two or more genes, wherein each gene comprises a nucleotide repeat sequence, and wherein the different target sequence is upstream or downstream of the nucleotide repeat sequence in each gene; and b) a universal primer that binds to a common target sequence shared by the two or more genes, wherein the common target sequence is located within the nucleotide repeat sequence and on the strand opposite to that bound by the gene-specific primer of each gene; wherein the gene-specific primers and the universal primer are capable of producing one or more amplification products from each gene; ii) analyzing the amplification products to screen the subject for one or more of the repeat expansion diseases; and iii) treating the subject found to have at least one repeat expansion disease.

Drawings

Some embodiments of the invention will now be described, by way of non-limiting example only, with reference to the accompanying drawings, in which:

FIG. 1. Schematic representation of the FMR1 and AFF2 duplex TP-PCR reaction. Locus-specific flanking primer annealing positions are indicated (black and grey arrows, with the ends marked with star symbols), as are TP primer annealing positions within the repeat (black and grey arrows). Grey arrows indicate poor amplification, either because the TP primer is farther away from the locus-specific primer or because of an interruption-mediated mismatch. The expected electropherogram patterns are shown for samples carrying only normal FMR1 and normal AFF2 alleles (A), an expanded FMR1 allele (B), and an expanded AFF2 allele (C). Black rectangles represent FMR1 CGG or AFF2 CCG trinucleotides. Grey rectangles represent non-CGG or non-CCG trinucleotides, as shown above each rectangle.

FIG. 2. Electropherograms of duplex TP-PCR products of FMR1 (FAM/dark gray) and AFF2 (HEX/light gray) from normal, FMR1 premutation and FMR1 full mutation male and female DNA samples. The electropherograms show both the FAM and HEX fluorescence channels (left), the FAM channel only (middle) and the HEX channel only (right). Dark gray peaks indicate FAM-labeled FMR1 TP-PCR products, while light gray peaks indicate HEX-labeled AFF2 TP-PCR products. The threshold repeat size separating normal alleles from expanded alleles is indicated by the vertical dashed line.

FIG. 3. Electropherograms of duplex TP-PCR products of FMR1 (FAM/dark grey) and AFF2 (HEX/light grey) from AFF2 premutation and full mutation DNA samples. The electropherograms show both the FAM and HEX fluorescence channels (left), the FAM channel only (middle) and the HEX channel only (right). Dark gray peaks indicate FAM-labeled FMR1 TP-PCR products, while light gray peaks indicate HEX-labeled AFF2 TP-PCR products. The threshold repeat size separating normal alleles from expanded alleles is indicated by the vertical dashed line.

FIG. 4. AFF2 CCG repeat size and structure distribution. A, distribution of AFF2 CCG repeat size (x-axis) and frequency (y-axis) among African American (gray filled), Caucasian (unfilled), Chinese (black filled), Indian (dark grey thick bars) and Malay (gray forward slash) populations. B, comparison of the allele frequency distribution in Zhong et al. (grey) and in this study (black). C, comparison of the Caucasian allele frequency distribution in Zhong et al. (light grey) and in the present study (grey). D, heat map of the population distribution of X chromosomes with different FMR1 CGG and AFF2 CCG repeat size combinations (top), and population repeat size distribution and abundance of common and variant AFF2 alleles (bottom).

FIG. 5. Spectrum of AFF2 CCG repeat structures and the corresponding TP-PCR electropherogram patterns. A, common normal allele with the rs868914124 (T) reference nucleotide. B, variant normal allele with the rs868914124 (C) variant nucleotide. C, variant normal allele with the rs868914124 (T) reference nucleotide and the rs1389911365 (T) variant nucleotide. D, full mutation allele with the rs868914124 (C) variant nucleotide and three consecutive non-CCG interruptions. "+" indicates a non-CCG interruption, and "+++" indicates three consecutive non-CCG interruptions. The leftmost isolated 111 bp peak is generated by annealing of the TP primer upstream of the repeat segment. The TP-PCR product generated from the first repeat of the rs868914124 (T) common allele migrates as a 138 bp fragment. The TP-PCR product generated from the first repeat of the rs868914124 (C) rare allele migrates as a 132 bp fragment. Interruptions within the repeat cause TP primer mismatches, which produce a gap without peaks within the peak cluster.

FIG. 6. AFF2 CCG repeat structures of (A) full mutation male FX0229 and (B) his full mutation maternal uncle FX0230, with the corresponding TP-PCR electropherogram patterns. A, full mutation allele carrying the rs868914124 (C) rare variant and three consecutive non-CCG interruptions at repeats 8-10. B, full mutation allele carrying the rs868914124 (C) rare variant and multiple non-CCG interruptions. "+++" indicates three consecutive non-CCG interruptions.

FIG. 7. AFF2 CCG repeat structures of (A) full mutation male DNA_25926 and (B) his premutation offspring DNA_3802, with the corresponding TP-PCR electropherogram patterns. A, full mutation allele carrying the rs868914124 (C) rare variant and three consecutive non-CCG interruptions at repeats 6-8. B, a normal allele of 22 CCG repeats carrying the rs868914124 (T) common reference nucleotide, and a premutation allele of about 137 repeats carrying the rs868914124 (C) rare variant and three consecutive non-CCG interruptions at repeats 6-8. "+++" indicates three consecutive non-CCG interruptions.

FIG. 8. Multiplex triplet-primed PCR for seven common SCA repeat loci. Schematic representation of the repeat structures of SCA1, SCA2, SCA3, SCA6, SCA7, SCA12 and DRPLA with the primer annealing positions (A), and the expected TP-PCR electropherogram for each repeat locus (B). Asterisks indicate fluorophore labels. The upper limit of the normal allele repeat size for each locus is indicated by the vertical dashed line.

FIG. 9. Seven-plex TP-PCR capillary electropherograms generated from a DNA sample negative for expansion at all seven SCA repeat loci. The upper panel shows the electrophoretic peaks representing TP-PCR products from all seven repeat loci, with all four fluorophore channels (FAM, VIC, NED and PET) open. The four lower panels show the capillary electropherograms of the same sample with one fluorophore channel open at a time. The upper limit of the normal allele repeat size for each locus is indicated by the vertical dashed line.

FIG. 10. Seven-plex TP-PCR capillary electropherograms of DNA samples from individuals affected by SCA. For each sample, only the electropherogram of the fluorophore channel showing repeat expansion at the relevant SCA locus is shown. Samples were positive for CAG repeat expansion in ATXN1 (top row), ATXN2 (second row), ATXN3 (third row), CACNA1A (fourth row), ATXN7 (fifth row), PPP2R2B (sixth row) and ATN1 (bottom row). The upper limit of the normal allele repeat size for each locus is indicated by the vertical dashed line.

Detailed Description

The present specification teaches a method of detecting the presence or absence of a repeat expansion sequence in two or more genes in a nucleic acid sample obtained from a subject, the method comprising: i) contacting the nucleic acid sample under amplification conditions, for each of the two or more genes, with: a) a gene-specific primer that specifically binds to a different target sequence of each gene, wherein the genes comprise nucleotide repeats, and wherein the different target sequences are upstream or downstream of the nucleotide repeats, and b) a universal primer that binds to a common target sequence shared by the two or more genes, wherein the common target sequence is located within the nucleotide repeats and on the strand opposite to that bound by the gene-specific primers; wherein the gene-specific primers and the universal primer are capable of producing one or more amplification products from each gene; and ii) analyzing the amplification products.

In one embodiment, the method is a method of simultaneously detecting the presence or absence of repeat expansion sequences in two or more genes in a nucleic acid sample obtained from a subject.

Without being bound by theory, the inventors developed a strategy to screen simultaneously for the presence of repeat expansion mutations in two or more suspected genes of a patient, and two examples are used in the present disclosure to illustrate its use. This strategy employs a single-tube assay to screen multiple genetic loci responsible for different repeat expansion diseases caused by expansion of the same type of repeat. The specific assays shown herein employ multiplex triplet-primed PCR (TP-PCR) to detect expansion mutations involving trinucleotide repeats present, for example, in different genes, by using differentially labeled locus-specific flanking primers and a universal triplet-primed (TP) primer. Expansions at any locus that show a repeat size in the pathogenic size range, or beyond the maximum normal repeat size, can be rapidly identified and sized by capillary electrophoresis. A single amplification reaction provides reliable one-step mutation screening of multiple disease genes, thereby greatly shortening the lengthy diagnostic process for affected individuals. The strategy can be applied to simultaneously screen any group of disease genes sharing the same repeat sequence.
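The reason a repeat-anchored universal primer yields a ladder of products, and why a longer repeat tract yields a longer ladder, can be illustrated with a minimal Python sketch. The function name `tp_pcr_ladder`, the 100 bp flank length and the primer footprint of five repeat units are illustrative assumptions rather than values taken from the assays described herein, and the model ignores primer tails, interruptions and dye-dependent mobility shifts.

```python
def tp_pcr_ladder(repeat_count, unit_len=3, primer_units=5, flank_len=100):
    """Return the expected product sizes (bp) for one allele.

    Assumptions (hypothetical, for illustration only):
    - the universal primer covers `primer_units` whole repeat units and can
      anneal at every in-register position of an uninterrupted repeat tract;
    - `flank_len` is the distance in bp from the 5' end of the labeled
      gene-specific primer to the first base of the repeat tract.
    """
    if repeat_count < primer_units:
        return []  # tract too short for the primer to anneal over its full length
    sizes = []
    # The far end of the primer can sit on repeat unit primer_units..repeat_count.
    for last_unit in range(primer_units, repeat_count + 1):
        sizes.append(flank_len + last_unit * unit_len)
    return sizes


normal = tp_pcr_ladder(30)     # hypothetical normal allele
expanded = tp_pcr_ladder(200)  # hypothetical expanded allele
print(len(normal), normal[0], normal[-1])
print(len(expanded), expanded[0], expanded[-1])
```

Under these assumptions the ladder of the expanded allele extends well beyond that of the normal allele, which is the feature that a size threshold on the electropherogram exploits.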

In one embodiment, the amplification products are separated according to size. In one embodiment, a change in the size of the amplification product of a gene as compared to a reference indicates the presence of a repeat expansion sequence in the gene. In one embodiment, no change in the size of the amplification product of a gene as compared to the reference indicates the absence of a repeat expansion sequence in the gene.
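One simple way to express the size comparison is a threshold test on the largest product observed for a locus. The sketch below is a hedged illustration only: the function name `call_allele`, the input format (a list of product sizes in bp) and the example threshold are assumptions, and a real analysis would also account for peak height, stutter and sizing error.

```python
def call_allele(peak_sizes_bp, threshold_bp):
    """Classify one locus from its product sizes.

    `threshold_bp` is a hypothetical reference size corresponding to the
    largest product expected from a normal allele at this locus.
    """
    if not peak_sizes_bp:
        return "no product"
    if max(peak_sizes_bp) > threshold_bp:
        return "expanded"
    return "normal"


print(call_allele([115, 118, 121, 190], threshold_bp=265))
print(call_allele([115, 118, 121, 400], threshold_bp=265))
```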

The terms "detect," "determine," "measure," "evaluate," and "assess" are used interchangeably herein to refer to any form of measurement and include determining the presence or absence of an element. These terms include quantitative and/or qualitative determinations. Assessment may be relative or absolute.

The term "nucleic acid" refers to a deoxyribonucleotide or ribonucleotide polymer in either single-or double-stranded form, and unless otherwise limited, encompasses known analogs of natural nucleotides that hybridize to nucleic acids in a manner similar to naturally occurring nucleotides.

As used herein, the term "nucleic acid" and equivalent terms, such as "polynucleotide", refer to a polymeric form of nucleotides of any length, such as ribonucleotides, deoxyribonucleotides or Peptide Nucleic Acids (PNAs), that comprise purine and pyrimidine bases, or other natural, chemically or biochemically modified, non-natural, or derivatized nucleotide bases. The nucleic acid may be double-stranded or single-stranded. Reference to single stranded nucleic acids includes reference to sense or antisense strands. The backbone of the polynucleotide may include sugar and phosphate groups, as commonly found in RNA or DNA, or modified or substituted sugar or phosphate groups. Polynucleotides may include modified nucleotides, such as methylated nucleotides and nucleotide analogs. The sequence of nucleotides may be interrupted by non-nucleotide components. The terms nucleoside, nucleotide, deoxynucleoside and deoxynucleotide generally include complements, fragments and variants of nucleosides, nucleotides, deoxynucleosides and deoxynucleotides or analogs thereof.

The term "primer" is used herein to refer to any single stranded oligonucleotide sequence that can be used as a primer in, for example, PCR techniques. Thus, a "primer" according to the present disclosure refers to a single stranded oligonucleotide sequence that is capable of acting as a starting point for the synthesis of a primer extension product that is substantially identical to a nucleic acid strand to be replicated (for a forward primer) or substantially identical to the reverse complement of a nucleic acid strand to be replicated (for a reverse primer). The primers may be suitable for use in, for example, PCR techniques. Single strands include, for example, hairpin structures formed from single-stranded nucleotide sequences.

The design of a primer, e.g., its length and specific sequence, depends on the nature of the target nucleotide sequence and the conditions under which the primer is used, e.g., temperature and ionic strength.

Primers may consist of the nucleotide sequences described herein, or may be 10, 15, 20, 25, 30, 35, 40, 45, 50, 75, 100 or more nucleotides comprising or falling within the sequences described herein, provided they are suitable for specifically binding to a target sequence under stringent conditions. In one embodiment, the primer sequence is less than 35 nucleotides in length, e.g., the primer sequence is less than 34, 33, 32, 31, 30, 29, 28, 27, 26, 25, 24, 23, 22, or 21 nucleotides in length.

The length or sequence of the primers or probes may be slightly modified to maintain the desired specificity and sensitivity in a given situation. In one embodiment of the present disclosure, the probes and/or primers described herein may be extended by 1, 2, 3, 4, or 5 nucleotides, or shortened by 1, 2, 3, 4, or 5 nucleotides, for example, in either direction.

The primer sequences may be synthesized using any method known in the art.

The term "complementary" refers to base pairing between nucleotides or nucleic acids, for example, between two strands of a double-stranded DNA molecule, or between an oligonucleotide primer and a primer binding site on a single-stranded nucleic acid to be sequenced or amplified. The complementary nucleotides are typically A and T (or A and U), or C and G. Two single stranded RNA or DNA molecules are considered complementary when the nucleotides of one strand, optimally aligned and compared, with appropriate nucleotide insertions or deletions, pair with at least about 80% of the nucleotides of the other strand (typically at least about 90% to 95% of the nucleotides of the other strand, and more preferably about 98% to 100% of the nucleotides of the other strand). Alternatively, complementarity exists when an RNA or DNA strand will hybridize to its complement under selective hybridization conditions. Typically, selective hybridization will occur when there is at least about 65% complementarity over a stretch of at least 14 to 25 nucleotides, preferably at least about 75%, and more preferably at least about 90% complementarity.
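The percentage criterion above can be made concrete with a small calculation. The following Python sketch is a simplification under stated assumptions: it handles only two equal-length DNA strands aligned antiparallel without gaps, whereas the definition above also allows insertions and deletions. The function names are illustrative.

```python
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def reverse_complement(seq):
    """Reverse complement of a DNA sequence written 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def percent_complementarity(strand_a, strand_b):
    """Percentage of positions in strand_a that base-pair with strand_b.

    Both strands are written 5' to 3' and must have equal length; gaps are
    not modeled in this sketch.
    """
    if len(strand_a) != len(strand_b):
        raise ValueError("this sketch only handles equal-length strands")
    target = reverse_complement(strand_b)
    matches = sum(x == y for x, y in zip(strand_a.upper(), target))
    return 100.0 * matches / len(strand_a)


print(percent_complementarity("CGGCGG", "CCGCCG"))
print(percent_complementarity("CGGCGA", "CCGCCG"))
```

In the second call one of six positions fails to pair, so the result falls between the 80% and 90% levels mentioned above.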

As used herein, the term "hybridization" refers to the process by which two single-stranded polynucleotides non-covalently bind to form a stable double-stranded polynucleotide. The resulting (typically) double-stranded polynucleotide is a "hybrid". The proportion of the population of polynucleotides that form stable hybrids is referred to herein as the "degree of hybridization".

Typically, hybridization conditions will include salt concentrations of less than about 1 M, more typically less than about 500 mM, or less than about 200 mM. Hybridization temperatures can be as low as 5 °C, but are typically greater than 22 °C, more typically greater than about 30 °C, and preferably greater than about 37 °C. Hybridization is usually carried out under stringent conditions (i.e., conditions under which the probe will hybridize to its target sequence). Stringent conditions are sequence-dependent and will differ in different circumstances. Longer fragments may require higher hybridization temperatures for specific hybridization. Because other factors may affect the stringency of hybridization, including the base composition and length of the complementary strands, the presence of organic solvents, and the degree of base mismatch, the combination of parameters is more important than the absolute measure of any one alone. Typically, stringent conditions are selected to be about 5 °C lower than the thermal melting point (Tm) for a particular sequence at a defined ionic strength and pH. Tm is the temperature (under defined ionic strength, pH and nucleic acid composition) at which 50% of the probes complementary to the target sequence hybridize to the target sequence at equilibrium. Typically, stringent conditions include a salt concentration of at least 0.01 M to no more than 1 M Na ion concentration (or other salt) at a pH of 7.0 to 8.3 and a temperature of at least 25 °C. For example, conditions of 5× SSPE (750 mM NaCl, 50 mM Na phosphate, 5 mM EDTA, pH 7.4) and a temperature of 25-30 °C are suitable for allele-specific probe hybridization.
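As a rough illustration of choosing a temperature about 5 °C below Tm, the sketch below estimates Tm with the Wallace rule (2 °C per A or T, 4 °C per G or C). The Wallace rule is an approximation introduced here for illustration and is not part of the present disclosure; it is only a crude guide for short oligonucleotides and ignores salt concentration, pH and mismatches. The function names are hypothetical.

```python
def wallace_tm_celsius(oligo):
    """Crude Tm estimate (Wallace rule): 2 * (A + T) + 4 * (G + C)."""
    seq = oligo.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


def stringent_temperature_celsius(oligo, offset=5):
    """Temperature `offset` degrees below the estimated Tm."""
    return wallace_tm_celsius(oligo) - offset


probe = "ATGCATGCATGCAT"  # hypothetical 14-nucleotide probe
print(wallace_tm_celsius(probe), stringent_temperature_celsius(probe))
```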

By "specific binding" or "specific hybridization" of a primer to a target sequence is meant that under the experimental conditions used, e.g., under stringent hybridization conditions, the primer forms a duplex (double-stranded nucleotide sequence) with a portion of the target sequence region or with the entire target sequence (as desired), and under these conditions the primer does not form a duplex with other regions of the nucleotide sequence present in the sample to be analyzed.

The term "repeat expansion sequence" may refer to a nucleotide repeat sequence that has undergone an expansion mutation, resulting in the nucleotide repeat sequence expanding beyond the normal repeat size.

In one embodiment, the repeat expansion sequence is a trinucleotide repeat expansion sequence. In other embodiments, the repeat expansion is a tetranucleotide or pentanucleotide repeat expansion, or an expansion of a repeat unit of another length, such as a 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, or 20 nucleotide repeat unit.

In one embodiment, the nucleotide repeat is a trinucleotide repeat. In one embodiment, the trinucleotide repeat sequence is selected from (CGG)ₙ, (CCG)ₙ, (CAG)ₙ, (CTG)ₙ, (GCC)ₙ, (GGC)ₙ, (GAA)ₙ or (TTC)ₙ, where n is 2 to 200 or greater. (CGG)ₙ repeats include (GCG)ₙ and (GGC)ₙ. Similarly, (CCG)ₙ sequences include (CGC)ₙ and (GCC)ₙ, and (CTG)ₙ sequences include (TGC)ₙ and (GCT)ₙ.
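The equivalences listed above follow from the fact that a tandem repeat read in a different register gives a cyclic rotation of the same unit. A minimal Python sketch of this relationship is given below; the function names are illustrative and not part of the disclosure.

```python
def motif_rotations(motif):
    """All cyclic rotations of a repeat unit, e.g. CGG -> {CGG, GGC, GCG}."""
    unit = motif.upper()
    return {unit[i:] + unit[:i] for i in range(len(unit))}


def same_repeat_family(motif_a, motif_b):
    """True if two repeat units describe the same tandem repeat in different registers."""
    return motif_b.upper() in motif_rotations(motif_a)


print(sorted(motif_rotations("CGG")))
print(same_repeat_family("CTG", "GCT"))
print(same_repeat_family("CGG", "CCG"))
```

Note that this sketch considers a single strand only; it does not treat a unit and its reverse complement as equivalent.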

A "universal primer" as referred to herein may bind to a "common target sequence" shared by two or more genes, wherein the "common target sequence" is located within the nucleotide repeat sequence and on the opposite strand bound by the gene-specific primer.

"Gene-specific primers" as referred to herein may refer to primers that specifically bind to a particular gene but not to other genes.

The term "gene" is used broadly to refer to any nucleic acid associated with a biological function. Genes typically include coding sequences and/or regulatory sequences required for expression of such coding sequences. The term gene may apply to a particular genomic sequence, as well as to the cDNA or mRNA encoded by that genomic sequence. Genes also include non-expressed nucleic acid segments that, for example, form recognition sequences for other proteins. Non-expressed regulatory sequences include promoters and enhancers to which regulatory proteins (e.g., transcription factors) bind, resulting in transcription of adjacent or nearby sequences.

In one embodiment, the universal primer binds to a common target sequence comprising or consisting of 5, 6, 7, 8, 9, 10 or more consecutive trinucleotide repeats. In one embodiment, the universal primer binds to (CGG)₅ (SEQ ID NO: 15). In one embodiment, the universal primer binds to (CTG)₅ (SEQ ID NO: 16).
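The effect of requiring five consecutive repeat units can be sketched by listing the positions in a repeat tract where such a stretch exists. In the hypothetical example below, a single non-CGG unit (written here as AGG purely for illustration) removes every annealing position whose five-unit window overlaps it, which corresponds to the gap without peaks described for interrupted alleles. The function name and the requirement of a perfect match are assumptions; real primers may tolerate some mismatch.

```python
def annealing_positions(units, motif="CGG", primer_units=5):
    """0-based start indices of windows of `primer_units` consecutive perfect motif copies."""
    hits = []
    for start in range(len(units) - primer_units + 1):
        window = units[start:start + primer_units]
        if all(unit == motif for unit in window):
            hits.append(start)
    return hits


# Hypothetical tract: 9 CGG units, one interruption, 9 more CGG units.
tract = ["CGG"] * 9 + ["AGG"] + ["CGG"] * 9
print(annealing_positions(tract))
```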

In one embodiment, the universal primer binds to a common target sequence comprising or consisting of 2, 3, 4, 5, 6, 7, 8, 9, 10 or more consecutive tetranucleotide, pentanucleotide or longer repeat units. In one embodiment, the pentanucleotide repeat sequence is (TTTCA)ₙ, which includes (CATTT)ₙ and (ATTTC)ₙ.

In one embodiment, the universal primer comprises a unique tail sequence. In one embodiment, the universal primer comprises a unique 5' tail sequence. The term "unique tail sequence" or "unique 5' tail sequence" refers to a sequence that does not hybridize under stringent conditions to any gene or any intergenic region in the nucleic acid sample in which the presence or absence of a repeat expansion sequence is to be detected.

In one embodiment, the method includes providing a tail primer that specifically binds to a unique 5' tail sequence of a universal primer. Adding a unique 5 'tail sequence to the universal primer and providing a tail primer that specifically binds to the unique 5' tail sequence can improve the accuracy of repeat sizing.

In one embodiment, the gene-specific primers are labeled. Each gene-specific primer may be labeled, for example, with a different fluorophore, so that the amplification products from each gene can be distinguished from one another. For example, FAM or HEX labels may be used.
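Because each gene-specific primer carries its own fluorophore, products from a single reaction can be assigned to their gene of origin by detection channel. The sketch below assumes the FAM/FMR1 and HEX/AFF2 pairing used in the duplex assay of FIGS. 2 and 3; the input format (a list of dye and size pairs) and the function name are hypothetical.

```python
from collections import defaultdict

# Dye-to-gene pairing as in the duplex assay of FIGS. 2 and 3.
DYE_TO_GENE = {"FAM": "FMR1", "HEX": "AFF2"}


def peaks_by_gene(peaks, dye_to_gene=DYE_TO_GENE):
    """Group (dye, size_bp) peaks by the gene whose primer carries that dye."""
    grouped = defaultdict(list)
    for dye, size_bp in peaks:
        gene = dye_to_gene.get(dye)
        if gene is not None:  # ignore channels not assigned to a gene
            grouped[gene].append(size_bp)
    return {gene: sorted(sizes) for gene, sizes in grouped.items()}


peaks = [("FAM", 130), ("HEX", 138), ("FAM", 127), ("ROX", 150), ("HEX", 141)]
print(peaks_by_gene(peaks))
```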

In one embodiment, the fluorescent label may be active in the blue, yellow, green, and far red regions of the spectrum. In preferred embodiments, non-limiting examples of fluorescent labels useful in the methods of the present disclosure include fluorescent labels or reporter dyes such as 6-carboxyfluorescein (6-FAM™), NED™ (Applera Corporation), HEX™ or VIC™ (Applied Biosystems); TAMRA™ labels (Applied Biosystems, CA, USA); and ROX. One skilled in the art will appreciate that other alternative fluorescent labels may also be used in the methods according to the present disclosure.

In another embodiment, chemiluminescent labels, such as ruthenium probes, or radiolabels, such as tritium in the form of tritiated thymidine, may be used. Phosphorus-32 can also be used as a radiolabel.

Alternatively, the label may be selected from an electroluminescent label, a magnetic label, an affinity or binding label, a nucleotide sequence label, a position specific label and/or a label having specific physical properties (e.g. different dimensions, masses, gyrations, ionic strength, dielectric properties, polarization or impedance).

In one embodiment, the detectable label may be directly or indirectly attached to the primer. In one embodiment, the labeled primer is a reverse primer. In one embodiment, the detectable label comprises a fluorescent moiety attached to the 5' end of the probe. In a most preferred embodiment, the label is selected from the group consisting of 6-FAM and NED.

In an alternative embodiment, the nucleic acid is detected with a nucleic acid intercalating fluorophore. Preferably, the intercalating fluorophore is SYBR Green or EvaGreen, or the like. Those skilled in the art will appreciate that other intercalating fluorophores active in the blue, yellow, green, and far-red regions of the spectrum may be used. It will be further understood that other intercalating fluorophores may be used in accordance with the present disclosure.

The term "sample" as referred to herein may originate from a biological fluid, cell, tissue, organ or organism comprising a nucleic acid or mixture of nucleic acids having at least one nucleic acid sequence to be screened for copy number variation. In certain embodiments, the sample has at least one nucleic acid sequence, suspected of having had its copy number changed. Such samples include, but are not limited to, sputum/oral fluid, amniotic fluid, blood fractions or fine needle biopsy samples, urine, peritoneal fluid, pleural fluid, and the like. Although the sample is typically taken from a human subject (e.g., a patient), the sample may be taken from any mammal, including but not limited to dogs, cats, horses, goats, sheep, cattle, pigs, and the like. The sample may be used as obtained from biological sources directly or after a pretreatment that alters the properties of the sample. For example, such pretreatment may include preparing plasma from blood, diluting viscous fluids, and the like. The method of pretreatment may also involve, but is not limited to, filtration, precipitation, dilution, distillation, mixing, centrifugation, freezing, lyophilization, concentration, amplification, nucleic acid fragmentation, inactivation of interfering components, addition of reagents, lysis, and the like. Where such pretreatment methods are employed with respect to a sample, such pretreatment methods typically leave the nucleic acid of interest in the test sample, sometimes in a concentration proportional to the concentration in the untreated test sample (e.g., a sample that has not been subjected to any such pretreatment methods).

The control sample may be a negative or positive control sample. A "negative control sample" or "unaffected sample" refers to a sample comprising nucleic acids known or expected to have repeat sequences with a number of repeats in the non-pathogenic range. A "positive control sample" or "affected sample" is known or expected to have repeat sequences with a number of repeats in the pathogenic range. The repeat sequence in the negative control sample typically has not expanded beyond the normal range, whereas the repeat sequence in the positive control sample typically has expanded beyond the normal range. Thus, the nucleic acids in the test sample can be compared to one or more control samples.

The term "biological fluid" herein refers to a liquid taken from a biological source and includes, for example, blood, serum, plasma, sputum, lavage, cerebrospinal fluid, urine, semen, sweat, tears, saliva, and the like. As used herein, the terms "blood," "plasma," and "serum" expressly encompass fractions or processed portions thereof. Also, where the sample is taken from a biopsy, swab, smear, or the like, the "sample" expressly encompasses a processed fraction or portion derived from a biopsy, swab, smear, or the like.

"Label" refers to a reporter molecule (e.g., a fluorophore) capable of producing a measurable signal and covalently or non-covalently linked to a polynucleotide.

In one embodiment, the method comprises analyzing the amplification products using a size separation technique or a sequencing technique. The size separation technique may be an electrophoresis-based technique (e.g., capillary electrophoresis). For example, capillary electrophoresis through a size-separating polymer can be used: the fluorescently labeled PCR product (labeled via the use of 5'-labeled primers) is detected by laser excitation as it migrates and resolves through the polymer-filled capillary. Another size separation technique is slab gel PAGE (polyacrylamide gel electrophoresis). The PCR products can be fluorescently labeled with 5' end-labeled primers and detected by laser excitation as they migrate and resolve through the slab gel. The PCR products can also be radiolabeled via 5' end-labeled primers or via incorporation of radioisotope-labeled nucleotides during PCR, and detected by exposing the slab gel to X-ray film. The sequencing technique may be next-generation sequencing. By using long-read next-generation sequencing, PCR products can be sequenced and all reads belonging to the gene of interest can be arranged from shortest to longest, or vice versa.

Two or more genes as referred to herein may be adjacent to each other or in proximity to each other on the same chromosome. Alternatively, they may be on different chromosomes.

In one embodiment, the two or more genes consist of or comprise FMR1 and AFF2. FMR1 and AFF2 are located on adjacent chromosome bands on the long arm of the same chromosome (X). This provides a simple method of simultaneously testing two or more genes including FMR1 and AFF2, either of which can cause the disease exhibited by a patient.

In one embodiment, the method comprises contacting a nucleic acid sample with: a) a gene-specific primer comprising or consisting of a nucleic acid having at least 90% sequence identity to the nucleic acid sequence of SEQ ID NO: 1; b) a gene-specific primer comprising or consisting of a nucleic acid having at least 90% sequence identity to the nucleic acid sequence of SEQ ID NO: 2; and c) a universal primer comprising or consisting of a nucleic acid having at least 90% sequence identity to the nucleic acid sequence of SEQ ID NO: 3 or (CGG)₅ (SEQ ID NO: 15).

In one embodiment, the two or more genes comprise or are selected from the group consisting of SCA1, SCA2, SCA3, SCA6, SCA7, SCA12 and DRPLA. In one embodiment, the two or more genes consist of two, three, four, five, six or seven or more genes comprising or selected from SCA1, SCA2, SCA3, SCA6, SCA7, SCA12 and DRPLA.

In one embodiment, the two or more genes are HTT and JPH3.

In one embodiment, the two or more genes consist of ATXN1, ATXN2, ATXN3, CACNA1A, ATXN7, PPP2R2B and ATN1.

In one embodiment, the two or more genes comprise or are selected from ATXN1, ATXN2, ATXN3, CACNA1A, ATXN7, PPP2R2B, ATN1, TBP, ATXN8OS, AR, HTT, JPH3, TCF4 and DMPK, and consist of two, three, four, five, six, seven, eight, nine, ten or more or all of these genes.

In one embodiment, the two or more genes comprise or are selected from FMR1, AFF2, NUTM2B-AS1, LRP12, GIPC1, NOTCH2NLC, RILPL1, PABPN1 and ARX, and consist of two, three, four, five, six, seven, eight, or all of these genes.

In one embodiment, the two or more genes comprise or are selected from DAB1, SAMD12, STARD7, TNRC6A and RAPGEF2 and consist of two, three, four or all of these genes.

In one embodiment, the method comprises contacting a nucleic acid sample with: a) a gene-specific primer comprising or consisting of a nucleic acid having at least 90% sequence identity to the nucleic acid sequence of SEQ ID NO: 7; b) a gene-specific primer comprising or consisting of a nucleic acid having at least 90% sequence identity to the nucleic acid sequence of SEQ ID NO: 8; c) a gene-specific primer comprising or consisting of a nucleic acid having at least 90% sequence identity to the nucleic acid sequence of SEQ ID NO: 9; d) a gene-specific primer comprising or consisting of a nucleic acid having at least 90% sequence identity to the nucleic acid sequence of SEQ ID NO: 10; e) a gene-specific primer comprising or consisting of a nucleic acid having at least 90% sequence identity to the nucleic acid sequence of SEQ ID NO: 11; f) a gene-specific primer comprising or consisting of a nucleic acid having at least 90% sequence identity to the nucleic acid sequence of SEQ ID NO: 12; and/or g) a gene-specific primer comprising or consisting of a nucleic acid having at least 90% sequence identity to the nucleic acid sequence of SEQ ID NO: 13; and h) a universal primer comprising or consisting of a nucleic acid having at least 90% sequence identity to the nucleic acid sequence of SEQ ID NO: 14 or (CTG)₅ (SEQ ID NO: 16).

Disclosed herein is a kit for detecting the presence or absence of a repeat extension sequence in two or more genes in a nucleic acid sample obtained from a subject, the kit comprising: a) A gene-specific primer that specifically binds to a different target sequence of each of two or more genes, wherein each gene comprises a nucleotide repeat sequence, and wherein the different target sequence is upstream or downstream of the nucleotide repeat sequence in each gene; and b) a universal primer that binds to a common target sequence shared by two or more genes, wherein the common target sequence is located within the nucleotide repeat sequence and on the opposite strand bound by the gene-specific primer of each gene; wherein the gene specific primers and universal primers are capable of producing one or more amplification products from each gene.

Disclosed herein are compositions comprising a nucleic acid sample obtained from a subject, the compositions comprising a) a gene-specific primer that specifically binds to a different target sequence of each of two or more genes, wherein each gene comprises a nucleotide repeat sequence, and wherein the different target sequences are upstream or downstream of the nucleotide repeat sequence in each gene; and b) a universal primer that binds to a common target sequence shared by two or more genes, wherein the common target sequence is located within the nucleotide repeat sequence and on the opposite strand bound by the gene-specific primer of each gene; wherein the gene specific primers and universal primers are capable of producing one or more amplification products from each gene.

The term "repeat expansion disease" refers to a genetic disease caused by an increase in the number of DNA repeat motifs (e.g., trinucleotide repeat motifs) in certain genes beyond a normal, stable threshold (which differs for each gene). The term is intended to include all diseases of this nature and includes the diseases listed in Table 1.

Table 1. Diseases caused by the expansion of different repeat motifs


In one embodiment, the one or more repeat expansion diseases comprise or consist of one or more of the diseases listed in Table 1.

In one embodiment, the one or more repeat expansion diseases comprise or consist of a disease selected from the group consisting of: fragile X syndrome (FXS), fragile X-related primary ovarian insufficiency (FXPOI), fragile X-related tremor/ataxia syndrome (FXTAS), and fragile XE non-syndromic intellectual disability (FRAXE NSID).

In one embodiment, the method distinguishes FXS, FXPOI and FXTAS from FRAXE NSID.

In one embodiment, the one or more repeat expansion diseases are spinocerebellar ataxia (SCA) and/or dentatorubral-pallidoluysian atrophy (DRPLA).

In one embodiment, the method distinguishes between SCA and DRPLA. In one embodiment, the method distinguishes between different SCA types and DRPLA.

In one embodiment, the one or more repeat expansion diseases are Huntington's disease (HD) and Huntington's disease-like 2 (HDL2).

In one embodiment, the method distinguishes between Huntington's disease (HD) and Huntington's disease-like 2 (HDL2).

In one embodiment, the method distinguishes between different types of oculopharyngodistal myopathy (OPDM), oculopharyngeal muscular dystrophy (OPMD), oculopharyngeal myopathy with leukoencephalopathy (OPML), and developmental and epileptic encephalopathy (DEE).

In one embodiment, the method distinguishes between different types of familial adult myoclonic epilepsy (FAME) and spinocerebellar ataxia (SCA) type 37.

"Subject" or "patient" refers to any individual subject in need of treatment, including humans, cattle, horses, pigs, goats, sheep, dogs, cats, guinea pigs, rabbits, chickens, insects, and the like. The subject is also intended to include any subject that is involved in a clinical study trial but does not exhibit any clinical signs of disease or a subject that is involved in an epidemiological study, or a subject that is used as a control.

The term "treating" or the like includes alleviating, reducing, ameliorating or otherwise inhibiting the effects of a condition for at least a period of time. It is also to be understood that the term "treating" or the like does not mean that the condition or its symptoms are permanently alleviated, reduced, ameliorated or otherwise inhibited, and thus temporary relief, reduction, amelioration or other inhibition of the condition or its symptoms is also contemplated.

Methods of treating a subject may include administering a drug to the subject, or may include providing early intervention management to the subject.

The term "administering" refers to contacting a subject with, or applying, injecting, infusing or otherwise providing to a subject, an inhibitor as referred to herein.

Throughout this specification and the claims which follow, unless the context requires otherwise, the word "comprise", and variations such as "comprises" and "comprising", will be understood to imply the inclusion of a stated integer or step or group of integers or steps but not the exclusion of any other integer or step or group of integers or steps.

As used in this specification, the singular forms "a", "an" and "the" include plural referents unless the context clearly dictates otherwise. Thus, for example, reference to "a method" includes a single method as well as two or more methods; reference to "an agent" includes a single agent as well as two or more agents; reference to "the present disclosure" includes single and multiple aspects of the teachings of the present disclosure; and so on. Aspects taught and enabled herein are encompassed by the term "invention". Any variant and derivative is contemplated.

The reference in this specification to any prior publication (or information derived from it), or to any matter which is known, is not, and should not be taken as, an acknowledgment or admission or any form of suggestion that the prior publication (or information derived from it) or known matter forms part of the common general knowledge in the field of endeavour to which this specification relates.

Examples

Certain embodiments of the present invention will now be described with reference to the following examples, which are intended for illustrative purposes only and are not intended to limit the scope of the subject matter described above.

Example 1: FRAXA/FRAXE

Materials and methods

Biological sample

Genomic DNA was extracted from 408 unrelated and anonymized cord blood samples from 161 Chinese, 158 Malay and 89 Indian male infants born at National University Hospital, Singapore. Another 44 Caucasian male DNA samples and 14 African American male DNA samples from the Human Variation Panels (HD100CAU and HD100AA-2) were purchased from Coriell Cell Repositories (Camden, New Jersey, USA). For assay validation, archived and previously characterized genomic DNA samples were used, consisting of 40 normal, 17 FMR1 premutation-positive, 23 FMR1 full mutation-positive and 6 AFF2 expansion-positive samples. De-identified normal and FMR1 CGG repeat expansion-positive samples were obtained from KK Women's and Children's Hospital. AFF2 CCG repeat expansion-positive samples were de-identified archived samples from Baylor College of Medicine (Houston, TX, USA) and The University of Adelaide (Adelaide, South Australia, Australia). Four of the six AFF2 CCG repeat expansion-positive samples are from related individuals: FX0230 and FX0229 are maternal uncle and nephew, respectively, and DNA_25926 and DNA_3802 are father and daughter.

Dual TP-PCR of FMR1 and AFF2 trinucleotide repeats

Duplex screening for AFF2 and FMR1 triplet repeat expansions utilized a triplet-primed (TP) PCR method involving four primers: Fam-labeled FMR1-R (5'-AGCCCCGCACTTCCACCACCAGCTCCTCCA-3' (SEQ ID NO: 1)), Hex-labeled AFF2-F (5'-CCATGTCGCGGCTTCTAGCTGTCCAGGCTCC-3' (SEQ ID NO: 2)), the shared primer TP (5'-TGCTCTGGACCCTGAAGTGTGCCGTTGATACGGCGGCGGCGGCGG-3' (SEQ ID NO: 3)) and the Tail primer (5'-TGCTCTGGACCCTGAAGTGTGCCGTTGATA-3' (SEQ ID NO: 4)). Each 15 μl PCR reaction contained 100 ng of genomic DNA, 2.5x Q-Solution (Qiagen, Hilden, Germany), 1x PCR buffer (Qiagen) containing 1.5 mM MgCl₂, 2 mM dNTPs (dGTP/dCTP to dATP/dTTP ratio of 5:1) (Roche Applied Science, Mannheim, Germany), 0.6 μM each of the AFF2-F, FMR1-R and Tail primers, 0.0006 μM TP primer and 5 U HotStarTaq DNA polymerase (Qiagen). Initial enzyme activation was performed at 95°C for 15 minutes, followed by 40 cycles of 99°C for 45 seconds, 55°C for 45 seconds and 70°C for 8 minutes (with the extension time increased by 15 seconds per cycle), and a final extension at 72°C for 10 minutes.
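As a planning aid, the programmed hold times of the cycling profile above can be totaled. The following is a minimal Python sketch and not part of the described method: the function name and parameters are illustrative, ramp times between temperatures are ignored, and it is assumed that the 15-second increment is first applied in the second cycle.

```python
def total_program_seconds(cycles=40, denature_s=45, anneal_s=45,
                          extend_s=8 * 60, extend_increment_s=15,
                          activation_s=15 * 60, final_extension_s=10 * 60):
    """Sum the programmed hold times of the dual TP-PCR cycling profile.

    Assumption: cycle 1 uses `extend_s` unchanged and every later cycle
    adds `extend_increment_s` to the previous extension hold.
    Ramp times between temperatures are not included.
    """
    cycling = 0
    for cycle_index in range(cycles):
        extension = extend_s + extend_increment_s * cycle_index
        cycling += denature_s + anneal_s + extension
    return activation_s + cycling + final_extension_s


seconds = total_program_seconds()
print(f"{seconds / 3600:.1f} h of programmed hold time")
```

Under these assumptions the holds total 36,000 seconds, i.e. 10 hours, before any ramping time is added.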

Standard PCR spanning FMR1 and AFF2 trinucleotide repeats

Standard/conventional PCR across the FMR1 CGG repeat utilized 0.6 μM each of primer 5' -F1 and Fam-labeled FMR1-R, while PCR across the AFF2 CCG repeat utilized Hex-labeled AFF2-F and AFF2-R (5'-CGCTGCGGGCTCAGGCGGGCT-3' (SEQ ID NO: 5)). The PCR reaction and cycling conditions were similar to those of the dual TP-PCR described above, except that 50 ng of genomic DNA was used in each reaction.

Capillary electrophoresis

A 2 μl aliquot of the dual TP-PCR product was mixed with 9 μl of Hi-Di™ formamide and 0.5 μl of GeneScan™ 500 ROX™ dye size standard (Applied Biosystems, Foster City, California, USA), denatured at 95°C for 5 minutes, cooled to 4°C and resolved on a 3130xl Genetic Analyzer (Applied Biosystems) using a 36 cm capillary filled with POP-7™ polymer. The mixture was electrokinetically injected at 1 kV for 5 seconds and electrophoresed at 60°C for 40 minutes. Analysis was performed using GeneMapper™ software v4.0 (Applied Biosystems). If an expanded allele was detected after the initial capillary electrophoresis (CE) run, a second CE run was performed using electrokinetic injection at 10 kV for 5 seconds and electrophoresis at 60°C for 40 minutes.

A 1 μl aliquot of the 1:20 diluted standard PCR product was mixed with 9 μl of Hi-Di™ formamide (Applied Biosystems) and 0.3 μl of GeneScan™ 500 ROX™ dye size standard (Applied Biosystems), denatured at 95°C for 5 minutes, cooled to 4°C and resolved on a 3130xl Genetic Analyzer (Applied Biosystems) using a 36 cm capillary filled with POP-7™ polymer. The mixture was electrokinetically injected at 1.2 kV for 23 seconds and electrophoresed at 60°C for 20 minutes.

Post-CE analysis was performed using GeneMapper™ software v4.0 (Applied Biosystems).

Data interpretation

Primer AFF2-F anneals to the AFF2 sequence immediately upstream of the AFF2 CCG repeat and produces Hex-labeled TP-PCR amplicons, while primer FMR1-R anneals to the FMR1 sequence immediately downstream of the FMR1 CGG repeat and produces Fam-labeled TP-PCR products. Both TP-PCR reactions use the common triplet-primed (TP) and Tail primers, and their products can be analyzed separately or together using different fluorescence detection channels. The TP primer was designed to anneal optimally to any five consecutive CCG trinucleotides on the AFF2 sense strand or the FMR1 antisense strand. When there is a continuous, uninterrupted stretch of more than five repeats, the electropherogram exhibits a series of consecutive peaks spaced 3 bp apart. If there is a non-CCG interruption, the electropherogram exhibits a gap of about 18 bp between fluorescent peaks, corresponding to the absence of 5 fluorescent peaks (FIG. 1). The repeat size of an uninterrupted stretch is the total number of consecutive fluorescent peaks plus four (the first fluorescent peak is generated by the TP primer annealing to the first 5 repeats). For alleles with a non-CCG interruption, the repeat size is the total number of fluorescent peaks plus the total number of missing peaks, plus four. In heterozygous females, a decrease in fluorescence intensity is seen beyond a certain peak size; the repeat size of the smaller allele can be derived from the number of peaks before the drop, while the repeat size of the larger allele can be derived from the total number of peaks (data not shown). For the AFF2 TP-PCR reaction, the leftmost isolated peak is not generated from within the repeat sequence, but rather by annealing of the TP primer to the (CCG)₄CCT (SEQ ID NO: 6) sequence immediately upstream of the CCG repeat, which is defined as beginning after the CTG trinucleotide (FIG. 1). The sizes and structures of AFF2 alleles are further detailed in the Results.
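The peak-counting rule described above can be illustrated with a small simulation. The following Python sketch is illustrative only and rests on simplifying assumptions: the allele is modeled as a list of trinucleotide units, a peak is produced only where the TP primer finds five consecutive perfect CCG units, and missing peaks are counted only between the first and last observed peaks. The function names are hypothetical and do not correspond to any analysis software.

```python
def tp_pcr_peaks(units, motif="CCG", window=5):
    """Return one boolean per TP primer position along the repeat stretch.

    True means the primer finds `window` consecutive perfect motif units
    at that position and therefore yields a fluorescent peak.
    """
    positions = len(units) - window + 1
    return [all(unit == motif for unit in units[i:i + window])
            for i in range(max(positions, 0))]


def repeat_size_from_peaks(peaks, window=5):
    """Repeat size = peaks present + peaks missing in gaps + (window - 1)."""
    if not any(peaks):
        return None  # no peak generated from within the repeat stretch
    first = peaks.index(True)
    last = len(peaks) - 1 - peaks[::-1].index(True)
    span = peaks[first:last + 1]
    present = sum(span)
    missing = len(span) - present
    return present + missing + (window - 1)


# Uninterrupted allele of 18 repeats: 14 peaks, 14 + 4 = 18.
print(repeat_size_from_peaks(tp_pcr_peaks(["CCG"] * 18)))

# Hypothetical 20-repeat allele with one CTG at the 10th position:
# five primer positions overlap the interruption, so five peaks are missing.
interrupted = ["CCG"] * 20
interrupted[9] = "CTG"
peaks = tp_pcr_peaks(interrupted)
print(peaks.count(False), repeat_size_from_peaks(peaks))
```

Five missing peaks between two flanking peaks correspond to six 3 bp steps, which is consistent with the gap of about 18 bp described above. An interruption within the first or last four repeat units would not be bracketed by peaks in this simplified model and would therefore not be counted.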

Sequencing of AFF2 CCG repeats

AFF2 TP-PCR and standard/conventional PCR were performed as described above, except that all primers were unlabeled. TP-PCR and standard/conventional PCR products were purified using beads (Agencourt Bioscience, Beverly, Massachusetts, USA) according to the manufacturer's instructions and quantified using a NanoDrop™ 1000 spectrophotometer (Thermo Scientific™, Waltham, Massachusetts, USA). Each 20 μl sequencing reaction contained 10-50 ng of purified standard PCR or TP-PCR product, 1x Terminator Ready Reaction Mix (Applied Biosystems), 2.5x Q-Solution (Qiagen) and 3.2 pmol of AFF2-F primer. Initial denaturation was carried out at 96°C for 1 minute, followed by 25 cycles of 98°C for 10 seconds, 60°C for 5 seconds and 60°C for 4 minutes. The extension products were purified using an Oligo Clean & Concentrator™ column (Zymo Research, Irvine, California, USA) according to the manufacturer's instructions. The eluted, purified extension products were vacuum-dried for 5 minutes in a Savant concentrator (Thermo Scientific), resuspended in 12 μl of Hi-Di™ formamide (Applied Biosystems) and resolved on a 3130xl Genetic Analyzer (Applied Biosystems) using a 36 cm capillary filled with POP-7™ polymer. The mixture was electrokinetically injected at 1.2 kV for 16 seconds and electrophoresed at 60°C for 20 minutes. Post-CE analysis was performed using Sequencing Analysis Software v6.0 (Applied Biosystems).

Results

Evaluation of dual TP-PCR on normal samples and on FMR1 and AFF2 triplet repeat expansion-positive samples

The FRAXE folate-sensitive fragile site on the X chromosome is associated with several medical conditions, including mental retardation, obsessive-compulsive disorder and premature ovarian failure. FRAXE fragility is caused by hyperexpansion of the CCG trinucleotide repeat in the 5' untranslated region (UTR) in exon 1 of the AFF2 (formerly FMR2) gene (ClinVar: VCV000010526.1). Hyperexpansion from normal alleles of 6-30 repeats to >200 repeats is accompanied by CpG methylation of the repeat. This in turn silences expression of AFF2, a subunit of SEC-L2, which regulates transcription of several genes. AFF2 CCG repeat hyperexpansion is the genetic mutation that results in fragile XE non-syndromic intellectual disability (FRAXE NSID; OMIM 309548), a mild (IQ 50-70) to borderline (IQ 70-85) intellectual disability estimated to affect one in every 50,000 to 100,000 males, as well as other cognitive/behavioral abnormalities, including obsessive-compulsive disorder. Paradoxically, alleles with fewer than 11 repeats or with microdeletions within or near the repeat are enriched in patients with premature ovarian failure. Studies correlating intermediate (31-60) repeat sizes with Parkinson's disease, a neurodegenerative disease of the motor system, have been inconclusive.

FRAXE shares similar genetic features with FRAXA, a well-studied fragile site on the X chromosome with more clearly demonstrated clinical involvement. The FRAXA site contains a CGG repeat within the 5' UTR in exon 1 of the FMR1 gene. Hyperexpansion of the FMR1 CGG repeat to >200 repeats is accompanied by CpG methylation and FMR1 gene silencing, which results in fragile X syndrome (FXS; OMIM 309550) (ClinVar: VCV000009972.1). FXS is the most common inherited monogenic cause of mental retardation, affecting about one in every 5000 males and one in every 4000 to 8000 females. Premutation alleles are associated with a number of behavioral characteristics, including anxiety, obsessive-compulsive disorder and depression. In addition, about one fifth of women carrying a premutation allele (55-200 CGG repeats) develop fragile X-related primary ovarian insufficiency (FXPOI). FMR1 premutations have been identified in 2% of sporadic POI and 14% of familial POI patients, making FXPOI the most common genetic cause of POI in euploid females. Furthermore, a subset of FMR1 premutation carriers will eventually develop fragile X-related tremor/ataxia syndrome (FXTAS), a neurodegenerative disorder characterized by ataxia, tremor and parkinsonism that affects about 1 in 4000 men and 1 in 7800 women over the age of 55. The FMR1 gene has also been associated with Parkinson's disease.

Although both the FRAXA and FRAXE fragile sites are closely associated with mental disorders, the mild to borderline phenotype of FRAXE NSID compared with FXS can lead to under-ascertainment and under-diagnosis. The under-diagnosis may be due in part to the lack of a rapid, simple and inexpensive assay to screen for repeat expansion at the FRAXE locus. Although standard PCR methods have been described for detecting repeat expansion at the FRAXE locus, either as stand-alone or multiplexed assays, they fail to detect large premutation and full mutation alleles, detection of which still relies on Southern blot analysis. Because of the non-syndromic nature of FRAXE (i.e., the lack of characteristic and consistent clinical manifestations, in contrast to, for example, the facial dysmorphism (prominent ears, chin and forehead, and long face) and macroorchidism present in FXS patients), molecular diagnosis is critical for confirming FRAXE.

The dual TP-PCR assay was initially evaluated using 12 DNA samples. FIGS. 2 and 3 show the TP-PCR electropherogram patterns generated from normal males and females and from individuals with FMR1 or AFF2 premutations (PM) or full mutations (FM). The leftmost peak generated by the AFF2 TP-PCR reaction is not produced by annealing of the TP primer within the AFF2 CCG repeat, which begins after the last CTG trinucleotide of the 5' flanking sequence. Instead, this isolated peak, which migrates as a distinct 111 bp fragment, results from annealing of the TP primer to the (CCG)₄CCT (SEQ ID NO: 6) sequence immediately upstream of the repeat (FIG. 1). The first true peak, resulting from annealing of the TP primer to CCG repeats 1-5 of AFF2, appears as a distinct 138 bp fragment on the electropherogram. The TP primer also anneals at every other position in the repeat stretch that contains five consecutive CCGs (CCGs 2-6, 3-7, etc.), with the final fluorescent peak resulting from annealing of the TP primer to the last five CCGs of the repeat stretch. Thus, the repeat size of an AFF2 allele can be quickly and easily determined by counting the number of peaks generated from within the repeat stretch (i.e., excluding the leftmost isolated peak) and adding four to that number.
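Because the first in-repeat AFF2 peak appears at 138 bp and successive peaks are 3 bp apart, the peak count can also be derived from the size of the last peak. The following Python sketch illustrates this arithmetic under stated assumptions: fragment sizes are treated as exact nominal values (measured CE sizes would first need rounding to the nearest 3 bp bin), the allele is uninterrupted, and it carries the common 5' flanking sequence. The constant and function names are hypothetical.

```python
FIRST_REPEAT_PEAK_BP = 138  # peak from TP primer annealed to CCG repeats 1-5
MOTIF_BP = 3                # spacing between successive TP-PCR peaks


def aff2_repeats_from_last_peak(last_peak_bp):
    """Estimate AFF2 repeat size from the nominal size of the last peak.

    The number of in-repeat peaks is the number of 3 bp steps from the
    138 bp peak plus one; the repeat size is that peak count plus four.
    """
    offset = last_peak_bp - FIRST_REPEAT_PEAK_BP
    if offset < 0 or offset % MOTIF_BP != 0:
        raise ValueError("size is not on the nominal 3 bp peak ladder")
    peak_count = offset // MOTIF_BP + 1
    return peak_count + 4


# A last peak at 177 bp is 13 steps past 138 bp: 14 peaks, 18 repeats.
print(aff2_repeats_from_last_peak(177))
```

An allele whose only in-repeat peak is the 138 bp peak would be sized at five repeats, in line with the counting rule above.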

The absence of fluorescent peaks beyond 55 repeats indicates no expansion, whereas consecutive fluorescent peaks extending beyond 55 and up to 200 repeats indicate the presence of a premutation (PM) allele, and fluorescent peaks extending beyond 200 repeats indicate the presence of a full mutation (FM). These results indicate that the dual TP-PCR assay successfully detected and accurately identified premutations and full mutations in both the FMR1 CGG and AFF2 CCG repeats in male and female DNA samples (FIGS. 2 and 3).
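The interpretation thresholds above can be written as a simple decision rule. The following Python sketch is illustrative only; the function name is hypothetical, and the treatment of an allele of exactly 55 repeats as not expanded follows the wording "beyond 55" and is an assumption at that boundary.

```python
def classify_expansion(repeats, pm_threshold=55, fm_threshold=200):
    """Classify the largest detected repeat size of a sample.

    Peaks extending beyond `fm_threshold` repeats indicate a full mutation,
    beyond `pm_threshold` and up to `fm_threshold` a premutation,
    and otherwise no expansion.
    """
    if repeats > fm_threshold:
        return "full mutation"
    if repeats > pm_threshold:
        return "premutation"
    return "no expansion"


for size in (29, 55, 56, 200, 201):
    print(size, classify_expansion(size))
```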

Identification of population allele distribution and repeat structure

A total of 466 male DNA samples (including DNA from 161 Chinese, 158 Malays, 89 Indians, 44 Caucasians and 14 African Americans) were genotyped using dual TP-PCR to determine the allele size distribution (FIG. 4 and Table 2). Different allele distributions were observed in the various populations, with a modal AFF2 repeat size of 18 for Chinese and Malays (32.3% and 27.8%, respectively), but 15 for Caucasians (40.9%), African Americans (50.0%) and Indians (32.6%). The ranges of allele sizes for Chinese (5-31) and Malays (6-37) were wider than those for Indians (10-27), Caucasians (9-26) and African Americans (14-28), which may be due in part to the larger sample sizes for Chinese and Malays. Using the conventional classification, 24 samples had minimal alleles (<11 repeats), 440 samples had normal alleles (11-30 repeats), and the remaining 2 samples had intermediate alleles (31-60 repeats). No premutation or full mutation alleles were observed. These results are similar to those of earlier studies of AFF2 allele distribution in Han Chinese (mode 18, range 9-26) and in Caucasians from New York City (mode 16, range 8-34) (FIGS. 4A-C).
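The summary statistics reported above (mode, range and size class counts) can be derived from a list of genotyped repeat sizes as sketched below in Python. The sketch is illustrative only: the function names are hypothetical, the example sizes are invented for demonstration and are not study data, and sizes above 60 repeats are deliberately left unclassified.

```python
from collections import Counter


def aff2_size_class(repeats):
    """Assign an AFF2 allele to the conventional size classes used above."""
    if repeats < 11:
        return "minimal"
    if repeats <= 30:
        return "normal"
    if repeats <= 60:
        return "intermediate"
    return "above intermediate range"


def summarize_alleles(sizes):
    """Return sample count, modal size, its percentage, range and class counts."""
    counts = Counter(sizes)
    mode, mode_count = counts.most_common(1)[0]
    return {
        "n": len(sizes),
        "mode": mode,
        "mode_pct": round(100 * mode_count / len(sizes), 1),
        "range": (min(sizes), max(sizes)),
        "classes": Counter(aff2_size_class(size) for size in sizes),
    }


# Invented example sizes, for illustration only.
example_sizes = [15, 18, 18, 18, 9, 31, 22, 18, 15, 27]
print(summarize_alleles(example_sizes))
```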

Chromosomes carrying a combination of 29 FMR1 CGG repeats and 18 AFF2 CCG repeats were the most common (FIG. 4D, top panel). Three AFF2 TP-PCR electropherogram patterns were observed among the 466 male population DNA samples screened. The most common pattern, consistent with the sequence annotated in Genome Reference Consortium Human Build 38 (GRCh38), was found in 457 samples (98.1%) and represented most of the normal and all of the intermediate alleles, while the other two patterns were shown by Sanger sequencing to be caused by the presence of two single nucleotide polymorphism (SNP) variants (FIG. 4D, bottom panel).

One of the SNP variants, a T>C substitution at chrX:148,500,637 (rs868914124, GRCh38), converts the CTG trinucleotide at the 5' start of the AFF2 CCG repeat to CCG. Whereas the AFF2 trinucleotide repeat in the rs868914124 (T) common allele starts after the last CTG trinucleotide of the 5' flanking sequence (FIG. 5A), this CTG trinucleotide becomes a CCG trinucleotide in the rs868914124 (C) variant allele, extending the AFF2 trinucleotide repeat stretch upstream by 2 CCGs, or 6 bp, so that it starts after the last CAG trinucleotide of the 5' flanking sequence (FIG. 5B). Thus, the first TP-PCR product generated from within the repeat stretch of the rs868914124 (C) variant allele appears as a distinct 132 bp fluorescent peak (FIG. 5B), in contrast to the 138 bp first fluorescent peak from the rs868914124 (T) common allele (FIG. 5A). In the TP-PCR electropherogram, this size difference appears as a narrower gap of 20 bp between the leftmost isolated peak and the first peak of the AFF2 CCG repeat for the rs868914124 (C) variant allele (FIG. 5B), compared with the wider 26 bp gap for the rs868914124 (T) common allele (FIGS. 5A, C).

The rs868914124 (C) variant was observed in 8 of 466 AFF2 alleles (1.72%) (FIG. 4D, bottom panel, line B), 6 of which were Malay alleles. Three of the 8 samples with the rs868914124 (C) variant carried a minimal allele (<11 repeats), compared with 21 of the 458 samples carrying rs868914124 (T). Using Fisher's exact test (two-tailed), this rare variant was observed to be enriched in the Malay group (odds ratio = 6.01; 95% confidence interval 1.06-61.7; p = 0.021) and among minimal alleles (odds ratio = 12.3; confidence interval 1.79-68.4; p = 0.006). Interestingly, all 6 AFF2 expanded alleles (5 full mutations and 1 premutation) carried the rare rs868914124 (C) variant, although the expanded alleles of the affected male and his maternal uncle, as well as those of the father-daughter pair, were considered identical by descent (FIGS. 5D, 6 and 7).
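The two-tailed Fisher's exact tests above can be approximately reproduced from the counts given in the text. The following Python sketch is illustrative only and does not represent the software actually used. The 2x2 tables are derived from the stated totals (466 samples, 158 Malay, 8 carriers of the C variant of which 6 are Malay; 24 minimal alleles of which 3 are in C carriers). The two-sided p-value sums all tables that are no more probable than the observed one, and the odds ratio printed is the simple cross-product ratio, which differs slightly from the reported values (presumably because a conditional estimate was reported; this is an assumption).

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denominator = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell,
        # given fixed row and column totals.
        return comb(row1, x) * comb(row2, col1 - x) / denominator

    low, high = max(0, col1 - row2), min(row1, col1)
    p_observed = prob(a)
    # Sum every table at most as probable as the observed one
    # (small relative tolerance guards against floating-point ties).
    return sum(prob(x) for x in range(low, high + 1)
               if prob(x) <= p_observed * (1 + 1e-9))


# Rows: C variant, T allele. Columns: Malay, non-Malay.
p_malay = fisher_exact_two_sided(6, 2, 152, 306)
# Rows: C variant, T allele. Columns: minimal (<11 repeats), other.
p_minimal = fisher_exact_two_sided(3, 5, 21, 437)

print(f"Malay enrichment: OR = {6 * 306 / (2 * 152):.2f}, p = {p_malay:.3f}")
print(f"Minimal alleles:  OR = {3 * 437 / (5 * 21):.2f}, p = {p_minimal:.3f}")
```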

Unlike the AGG interruptions commonly observed within the CGG repeats of normal FMR1 alleles, we observed only one normal AFF2 allele (0.21%) that contained a non-CCG interruption in its repeat stretch, namely a CTG interruption at the fifth repeat position (FIG. 4D, bottom panel, line C and FIG. 5C). This interruption is caused by a C>T substitution at chrX:148,500,652 (rs1389911365, GRCh38). Annealing of the TP primer at a position that includes such an interruption results in mismatched pairing and failure to generate a PCR product from that position. In the TP-PCR electropherogram, failed PCR at the primer mismatch positions appears as a gap without peaks (FIG. 5C).

Interestingly, novel non-CCG interruptions were also observed in all 6 AFF2 expansion-positive samples. Interruptions in the repeat were initially suspected from the TP-PCR electropherogram patterns of the four AFF2 FM males, which showed the presence of gaps between peak clusters/missing peaks (FIG. 3). Sanger sequencing revealed one or more non-CCG interruptions at the 5' end of the repeat, each carrying the sequence CCTGTGCAG, which is identical to the 9-nucleotide stretch immediately upstream of the repeat. The number of interruptions varied from one (FIG. 6A) to more than four (FIG. 6B). Their positions also varied, for example from the 8th to 10th repeat positions (FIG. 6A) or from the 6th to 8th positions (FIG. 7).

We observed that in the father-daughter pair (FIG. 7), the AFF2 CCG repeat expansion was a full mutation in the father (DNA_25926) and a premutation in the daughter (DNA_3802), consistent with previously documented contraction upon transmission. Although the AFF2 full mutation alleles (FIG. 6) of the nephew (FX0229) and maternal uncle (FX0230) are considered identical by descent, their repeat structures differ due to the presence of additional non-CCG interruptions in the maternal uncle. The exact cause or mechanism of this difference has not been studied.

Blind validation of dual TP-PCR assays on previously characterized samples

Blind validation of the dual TP-PCR assay included 82 archived and previously characterized genomic DNA samples. AFF2 CCG repeats were sized as described in Materials and methods. The dual TP-PCR assay accurately classified all 40 normal, 17 FMR1 PM, 23 FMR1 FM and 2 AFF2 FM samples included in the test (Table 3).

Table 2. AFF2 CCG repeat allele frequencies in five populations.

CH, Chinese; ML, Malay; IN, Indian; CAU, Caucasian; AA, African American

Table 3. Consistency analysis of samples with or without FMR1 or AFF2 repeat expansion.


Example 2: Spinocerebellar ataxia (SCA)

Materials and methods

Biological sample

Genomic DNA and cell lines were obtained from Coriell Cell Repositories (CCR; Coriell Institute for Medical Research, Camden, NJ, USA). NA06926, NA13536 and NA13537 each carry an expanded ATXN1 CAG repeat, NA14982 carries an expanded ATXN2 CAG repeat, GM06151 carries an expanded ATXN3 CAG repeat, NA03562 carries an expanded ATXN7 CAG repeat, and NA13716 and NA13717 each carry an expanded ATN1 CAG repeat. GM16243 carries an expanded FXN GAA repeat, GM06907 carries an expanded FMR1 CGG repeat, and GM05164 and GM06075 each carry an expanded DMPK CTG repeat. Two clinical DNA samples, 160920 and 183254, carrying expanded CACNA1A CAG and PPP2R2B CAG repeats, respectively, were obtained from KK Women's and Children's Hospital. Sixty archived DNA samples of known genotype were obtained from Siriraj Hospital-Mahidol University and included in the blind evaluation of assay accuracy.

Heptaplex TP-PCR

Heptaplex TP-PCR was performed in a 25-μl reaction containing 10 ng of genomic DNA, 1.5× Q-Solution (Qiagen, Hilden, Germany), 1× PCR buffer (Qiagen) containing 1.5 mmol/L MgCl₂, a deoxynucleotide triphosphate (dNTP) mixture consisting of dATP, dTTP, dCTP and dGTP at 0.2 mmol/L each (Roche Applied Science, Penzberg, Germany) and 2 units of HotStarTaq DNA polymerase (Qiagen). Eight primers (seven fluorescently labeled locus-specific primers and one universal TP primer, TP-R) were included at their respective optimal working concentrations. Primer sequences, fluorophore labels and primer concentrations are shown in tables 4 and 5. Thermocycling was performed on a SimpliAmp thermal cycler (Applied Biosystems, Thermo Fisher Scientific, Foster City, CA, USA) and consisted of initial polymerase activation at 95 °C for 15 minutes, followed by 35 cycles of 98 °C for 45 seconds, 60 °C for 1 minute and 72 °C for 2 minutes, and a final extension at 72 °C for 5 minutes.
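As a rough check on the run time implied by the cycling program above, the hold times can be summed in a short Python sketch. The constant names and the `total_hold_minutes` helper are illustrative assumptions and not part of the described method; ramp times between temperatures are ignored, so an actual run would take somewhat longer.

```python
# Hold times transcribed from the cycling program described above.
# All names are illustrative; ramp times between steps are ignored.
INITIAL_ACTIVATION = (95, 15 * 60)             # (deg C, seconds)
CYCLE_STEPS = [(98, 45), (60, 60), (72, 120)]  # denature, anneal, extend
N_CYCLES = 35
FINAL_EXTENSION = (72, 5 * 60)


def total_hold_minutes():
    """Sum of all programmed hold times, in minutes."""
    per_cycle = sum(seconds for _, seconds in CYCLE_STEPS)
    total = INITIAL_ACTIVATION[1] + N_CYCLES * per_cycle + FINAL_EXTENSION[1]
    return total / 60


print(total_hold_minutes())
```

Under these assumptions the programmed holds add up to roughly two and a half hours.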

Capillary Electrophoresis (CE)

A 1-μl aliquot of the fluorescently labeled TP-PCR product was mixed with 9 μl of Hi-Di™ Formamide (Applied Biosystems) and 0.5 μl of GeneScan™ 500 LIZ™ dye Size Standard (Applied Biosystems), denatured at 95 °C for 5 minutes, rapidly cooled to 4 °C and resolved on a 3130xl Genetic Analyzer (Applied Biosystems) using 36-cm capillaries filled with POP-7™ polymer. The samples were electrokinetically injected at 1 kV for 15 seconds and electrophoresed at 60 °C for 40 minutes. GeneScan analysis was performed using GeneMapper 5.0 software (Applied Biosystems). Amplified products from each locus can be identified by their product size range and peak color. Heptaplex TP-PCR products produce electrophoretic peaks in four different colors and can be analyzed either with all four fluorescence detection channels displayed together or with one channel displayed at a time while the others are turned off.

Results

Detection of expanded CAG repeats by single-tube heptaplex TP-PCR

Spinocerebellar ataxias (SCAs) are neurodegenerative diseases that cause degeneration of the cerebellum, and sometimes the spinal cord, and are mainly characterized by gait ataxia, poor hand-eye coordination and dysarthria. These autosomal dominant diseases are genetically heterogeneous and can be caused by conventional mutations as well as repeat expansion mutations. The global prevalence of SCA averages 2.7 cases per 100,000 individuals, with SCA3 being the most common. There are more than 40 genetically distinct SCAs, of which at least 12 are caused by repeat expansions. Among them, SCA1, SCA2, SCA3, SCA6, SCA7, SCA12 and SCA17 are caused by abnormal expansion of a CAG trinucleotide repeat. Another disease caused by CAG repeat expansion, dentatorubral-pallidoluysian atrophy (DRPLA), is also classified as an SCA because of its heterogeneity and because its clinical phenotypes overlap with those of other SCAs.

SCA1 is caused by expansion of the CAG repeat in exon 8 of the ATXN1 gene on chromosome 6p22.3. Non-expanded (normal) ATXN1 alleles contain 6 to 44 CAGs with CAT interruptions. Alleles containing 36 to 38 CAGs are mutable normal alleles that may expand into the pathogenic size range when transmitted to the next generation but are not themselves associated with clinical symptoms, whereas those containing ≥39 CAGs are pathogenic. SCA2 is caused by expansion of the CAG repeat in exon 1 of the ATXN2 gene on chromosome 12q24.12. Normal ATXN2 alleles contain 14 to 31 pure or CAA-interrupted CAGs, and those containing 32 CAGs are intermediate alleles of uncertain clinical significance. Alleles containing 33 to 500 CAGs are pathogenic. SCA3, or Machado-Joseph disease, is caused by expansion of the CAG repeat in exon 10 of the ATXN3 gene on chromosome 14q32.12. Normal ATXN3 alleles contain 12 to 44 CAGs, those containing 45 to 59 CAGs are intermediate alleles that are prone to expansion, and those with 60 to 87 CAGs are full-penetrance alleles. SCA6 is caused by expansion of the CAG repeat in exon 47 of the CACNA1A gene on chromosome 19p13.13. Normal CACNA1A alleles contain ≤18 CAGs and full-penetrance alleles contain 20 to 33 CAGs. CACNA1A alleles containing 19 CAGs are of uncertain clinical significance. SCA7 is caused by expansion of the CAG repeat in exon 3 of the ATXN7 gene on chromosome 3p14.1. Normal ATXN7 alleles contain 7 to 27 CAGs, those containing 28 to 33 CAGs are mutable normal alleles, and alleles with 34 or more CAGs are pathogenic. SCA12 is caused by expansion of the CAG repeat in the 5' region of the PPP2R2B gene on chromosome 5q32. Normal PPP2R2B alleles contain 7 to 32 CAGs, while those containing 51 to 78 CAGs are full-penetrance alleles. DRPLA is caused by expansion of the CAG repeat in exon 5 of the ATN1 gene on chromosome 12p13.31. Normal ATN1 alleles contain 6 to 35 CAGs, while those containing >48 CAGs are full-penetrance alleles.
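To illustrate how the size ranges above translate into an allele call, the following Python sketch classifies a CAG count for three of the loci (ATXN2, ATXN3 and ATXN7) whose ranges are stated without gaps. The `RANGES` table, the category labels and the `classify` helper are this sketch's own shorthand and assumptions, not part of the described assay; counts outside the stated ranges are reported as such rather than guessed.

```python
# Repeat-size categories for three loci, transcribed from the ranges above.
# Each entry is (lowest count, highest count, label); None = no upper limit.
RANGES = {
    "ATXN2": [(14, 31, "normal"), (32, 32, "intermediate"),
              (33, 500, "pathogenic")],
    "ATXN3": [(12, 44, "normal"), (45, 59, "intermediate"),
              (60, 87, "full penetrance")],
    "ATXN7": [(7, 27, "normal"), (28, 33, "mutable normal"),
              (34, None, "pathogenic")],
}


def classify(gene, cag_count):
    """Return the category of a CAG repeat count at the given locus."""
    for low, high, label in RANGES[gene]:
        if cag_count >= low and (high is None or cag_count <= high):
            return label
    return "outside stated ranges"


print(classify("ATXN2", 32))
print(classify("ATXN3", 72))
```

A table-driven lookup keeps the thresholds in one place, so extending the sketch to the remaining loci would only require adding rows.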

The various SCA types rarely exhibit clearly distinct phenotypes and are often difficult to distinguish by signs and symptoms alone because of extensive clinical overlap and other accompanying non-ataxia phenotypes. Molecular genetic testing is therefore necessary, and is the only way to identify the pathogenic mutation and confirm disease status in symptomatic individuals. Identifying the pathogenic gene can be time consuming and expensive because, without a known family history or disease-specific symptoms, it is difficult to decide which genes to test. The European Molecular Genetics Quality Network recommends that all laboratories should, at a minimum, offer testing for the five most common types of SCA (SCA1, SCA2, SCA3, SCA6 and SCA7), while testing for other types will depend on local prevalence.

Detection and sizing of CAG repeats in the different SCAs relies on standard PCR or triplet-primed PCR (TP-PCR) followed by capillary electrophoresis. When standard PCR detects only a single fragment from a normal allele, repeat sizing must typically be accompanied by low-throughput and labor-intensive Southern blot analysis to confirm the presence of a very large allele and estimate its repeat size, or by enzymatic digestion to determine the presence of CAT interruptions in an expanded ATXN1 allele. TP-PCR, first described by Warner et al. [24], has been widely used to detect the repeat expansions that cause many repeat expansion diseases, regardless of the size of the expansion. It uses a locus-specific flanking primer, a triplet-primed (TP) primer and a tail primer, and is able to detect very large expansions, accurately size all normal alleles and moderate expansions, and identify interruptions in the repeat tract from distinctive TP-PCR electropherogram patterns.

Since multiple rounds of genetic testing may be required before the pathogenic gene in an SCA patient is identified, simultaneous screening of the trinucleotide repeats of different disease genes may identify the pathogenic gene faster, in addition to saving the cost of multiple tests. We report the development of a new single-tube multiplex TP-PCR assay that screens simultaneously for expansion mutations at seven loci responsible for some of the most common SCAs (SCA1, SCA2, SCA3, SCA6, SCA7, SCA12 and DRPLA). The heptaplex assay uses locus-specific flanking primers that are differentially labeled for each repeat locus, together with a universal primer that anneals within the CAG repeats.

The SCA heptaplex TP-PCR assay uses fluorescently labeled locus-specific flanking primers located upstream of the CAG repeat of each disease gene and a universal TP primer that anneals within all CAG repeats, so that CAG repeats from seven different SCA loci can be co-amplified in a single reaction tube. Each flanking primer is labeled with one of four fluorophores (NED, VIC, FAM or PET). In addition, each flanking primer is positioned at a different distance from its respective repeat tract so that the electrophoresis peaks of normal alleles from one repeat locus do not overlap with those of normal alleles from another repeat locus labeled with the same fluorophore (fig. 8). The combined effect of differential fluorophore labeling and positioning of the seven flanking primers is that the TP-PCR products generated from the normal alleles of each repeat locus do not overlap and can be distinguished after capillary electrophoresis. The number of repeats is determined by counting the electrophoresis peaks from the left side of the electropherogram, with the first peak representing the first five pure CAGs of the repeat tract (fig. 8).
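The design rule described above (loci that share a fluorophore must occupy non-overlapping size windows) can be checked mechanically. The following Python sketch is a minimal illustration only: the locus names, dye assignments and base-pair windows in `LOCI` are hypothetical placeholders, not the values of tables 4 and 5, and `overlapping_pairs` is an assumed helper name.

```python
from itertools import combinations

# HYPOTHETICAL example values, not the assay's actual design:
# (fluorophore, smallest, largest) expected normal-allele product size in bp.
LOCI = {
    "locus_A": ("FAM", 100, 190),
    "locus_B": ("FAM", 220, 310),
    "locus_C": ("VIC", 100, 190),
    "locus_D": ("VIC", 150, 240),
}


def overlapping_pairs(loci):
    """List pairs of loci that share a dye and whose size windows overlap."""
    clashes = []
    for (name1, (dye1, lo1, hi1)), (name2, (dye2, lo2, hi2)) in combinations(
        loci.items(), 2
    ):
        # Two closed intervals overlap when each starts before the other ends.
        if dye1 == dye2 and lo1 <= hi2 and lo2 <= hi1:
            clashes.append((name1, name2))
    return clashes


print(overlapping_pairs(LOCI))
```

In this made-up layout, locus_A and locus_C may share a size window because they carry different dyes, whereas locus_C and locus_D would be flagged and one of their flanking primers would need to be repositioned.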

The heptaplex TP-PCR assay was first tested on samples of known genotype to confirm locus-specific annealing of each flanking primer and detection of expanded CAG repeats at each repeat locus. Because the TP primer can anneal to any five consecutive CAGs, TP-PCR produces a mixture of fragments that differ from one another by one triplet. The smallest amplification product at every locus contains five CAG triplets, except for ATXN3, where it contains 11 triplets because the wild-type ATXN3 allele has CAA at positions 3 and 6 and AAG at position 4 (fig. 8A). The electropherograms show a series of consecutive ladder peaks (fig. 8B) because the TP primer anneals at multiple positions within uninterrupted CAG repeats (fig. 8A). Each successive electrophoresis peak represents a TP-PCR product that is one repeat, or three base pairs, larger, and the size of the repeat in the allele can be derived by counting the peaks. The presence of non-CAG interruptions in the middle of the repeat tract, which may occur in ATXN1 and ATXN2 alleles, prevents the TP primer from annealing effectively at these positions, resulting in the absence of fluorescent peaks between peak clusters. Expansion-negative samples produced TP-PCR products from two non-expanded (normal) alleles that lie within the normal repeat size range (fig. 9), while expansion-positive samples additionally produced longer TP-PCR products from the expanded alleles (fig. 10). A sample with a repeat size exceeding the upper limit of the normal allele size range at a particular repeat locus is indicative of expansion at that repeat locus (fig. 10).
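The peak-counting rule described above can be written out as simple arithmetic. The following Python sketch assumes an uninterrupted repeat tract (interrupted alleles produce gaps in the ladder and are not handled); the function and constant names are illustrative assumptions.

```python
# Number of triplets represented by the first (smallest) peak, as stated above:
# five for every locus except ATXN3, where it is 11.
FIRST_PEAK_TRIPLETS = {"ATXN3": 11}
DEFAULT_FIRST_PEAK_TRIPLETS = 5
BP_PER_TRIPLET = 3


def repeats_from_peak_count(gene, n_peaks):
    """Repeat size of an uninterrupted allele from its number of ladder peaks."""
    if n_peaks < 1:
        raise ValueError("at least one peak is required")
    first = FIRST_PEAK_TRIPLETS.get(gene, DEFAULT_FIRST_PEAK_TRIPLETS)
    # Every peak after the first corresponds to one additional triplet.
    return first + (n_peaks - 1)


def ladder_span_bp(n_peaks):
    """Size difference in bp between the first and last peak of the ladder."""
    return (n_peaks - 1) * BP_PER_TRIPLET


print(repeats_from_peak_count("ATXN1", 26), ladder_span_bp(26))
```

For example, a ladder of 26 peaks at a locus whose first peak represents five CAGs corresponds to 30 repeats, with the last peak 75 bp larger than the first.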

Blinded validation on clinical samples

To assess the ability of the heptaplex TP-PCR assay to accurately identify expansions at the seven repeat loci in affected patient samples, blind analysis was performed on 60 archived clinical DNA samples of known genotype. After TP-PCR and capillary electrophoresis, one fluorescence detection channel was displayed at a time so that amplification products labeled with different fluorophores could be analyzed separately. This allowed samples showing electrophoresis peaks beyond the upper limit of the normal allele size range at any of the seven SCA loci to be clearly identified. For all 31 DNA samples positive for one of the seven SCA repeat expansions (5 SCA1 positive, 7 SCA2 positive, 12 SCA3 positive, 5 SCA6 positive, 1 SCA7 positive and 1 DRPLA positive), the repeat expansion at the expected SCA repeat locus was correctly detected by the heptaplex TP-PCR (table 6). Multiplex PCR involving flanking primers and universal TP primers for the relevant repeat loci was performed on all screen-positive samples to confirm the results (data not shown). No expansion was detected in any of the 29 expansion-negative DNA samples.
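The counts reported above can be cross-checked and summarized as sensitivity and specificity. The following Python sketch uses only the numbers stated in the paragraph; the variable names are illustrative, and treating the known genotypes as the reference standard is an assumption of the sketch.

```python
# Counts reported above for the blinded evaluation of 60 archived samples.
POSITIVES_BY_LOCUS = {"SCA1": 5, "SCA2": 7, "SCA3": 12,
                      "SCA6": 5, "SCA7": 1, "DRPLA": 1}
NEGATIVES = 29
TOTAL_SAMPLES = 60

true_positives = sum(POSITIVES_BY_LOCUS.values())  # all correctly detected
false_negatives = 0
true_negatives = NEGATIVES   # no expansion called in any negative sample
false_positives = 0

sensitivity = true_positives / (true_positives + false_negatives)
specificity = true_negatives / (true_negatives + false_positives)

print(true_positives, sensitivity, specificity)
```

The per-locus counts sum to the 31 positives stated in the text and, together with the 29 negatives, account for all 60 samples.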

Table 4. Heptaplex TP-PCR primers and expected TP-PCR product sizes

Table 5. Expected TP-PCR product sizes for the seven SCA loci

Table 6. Disease status and CAG repeat sizes for 60 archived DNA samples of known genotype


References

Zhong, N. et al., A survey of FRAXE allele sizes in three populations. Am J Med Genet, 1996. 64(2): p. 415-9.

Claims

1. A method of detecting the presence or absence of a repeat expansion sequence in two or more genes in a nucleic acid sample obtained from a subject, the method comprising:

i) Contacting the nucleic acid sample under amplification conditions for each of the two or more genes with: a) A gene-specific primer that specifically binds to a different target sequence of each gene, wherein the genes comprise nucleotide repeats, and wherein the different target sequences are upstream or downstream of the nucleotide repeats, and b) a universal primer that binds to a common target sequence shared by the two or more genes, wherein the common target sequence is located within the nucleotide repeats and on opposite strands bound by the gene-specific primers; wherein the gene-specific primers and the universal primers are capable of producing one or more amplification products from each gene; and

ii) analyzing the amplification product.

2. The method of claim 1, wherein the method comprises analyzing the amplification product using a size separation technique or a sequencing technique.

3. The method of claim 2, wherein the size separation technique is an electrophoresis-based technique.

4. A method according to any one of claims 1 to 3, wherein the amplification products are separated according to size.

5. The method of any one of claims 1 to 4, wherein a change in the size of a gene amplification product as compared to a reference is indicative of the presence of a repeat expansion sequence in the gene.

6. The method of any one of claims 1 to 5, wherein the repeat expansion sequence is a trinucleotide repeat expansion sequence.

7. The method of any one of claims 1 to 6, wherein the trinucleotide repeat sequence is selected from (CGG)n, (CCG)n, (CAG)n, (CTG)n, (GCC)n, (GGC)n, (GAA)n or (TTC)n, wherein n is 2 to 200 or more.

8. The method of any one of claims 1 to 7, wherein the universal primer comprises a unique 5' tail sequence.

9. The method of claim 8, wherein the method comprises providing a tail primer that specifically binds to a unique 5' tail sequence of the universal primer.

10. The method of any one of claims 1 to 9, wherein the universal primer binds to a common target sequence comprising or consisting of 5, 6, 7, 8, 9, 10 or more consecutive trinucleotide repeats.

11. The method of any one of claims 1 to 10, wherein the gene-specific primer is labeled.

12. The method of any one of claims 1 to 11, wherein the two or more genes consist of FMR1 and AFF2 or comprise FMR1 and AFF2.

13. The method of claim 12, wherein the method comprises contacting the nucleic acid sample with:

a) A gene specific primer comprising or consisting of a nucleic acid having at least 90% sequence identity to the nucleic acid sequence of SEQ ID NO. 1;

b) A gene specific primer comprising or consisting of a nucleic acid having at least 90% sequence identity to the nucleic acid sequence of SEQ ID NO. 2; and

c) a universal primer comprising or consisting of a nucleic acid sequence having at least 90% sequence identity to the nucleic acid sequence of (CGG)₅ (SEQ ID NO: 15) or SEQ ID NO: 3.

14. The method according to any one of claims 1 to 11, wherein the two or more genes are selected from the group consisting of SCA1, SCA2, SCA3, SCA6, SCA7, SCA12 and DRPLA.

15. The method of claim 14, wherein the method comprises contacting the nucleic acid sample with:

a) A gene specific primer comprising or consisting of a nucleic acid having at least 90% sequence identity to the nucleic acid sequence of SEQ ID NO. 7;

b) A gene specific primer comprising or consisting of a nucleic acid having at least 90% sequence identity to the nucleic acid sequence of SEQ ID NO. 8;

c) A gene specific primer comprising or consisting of a nucleic acid having at least 90% sequence identity to the nucleic acid sequence of SEQ ID NO. 9;

d) A gene specific primer comprising or consisting of a nucleic acid having at least 90% sequence identity to the nucleic acid sequence of SEQ ID NO. 10;

e) A gene-specific primer comprising or consisting of a nucleic acid having at least 90% sequence identity to the nucleic acid sequence of SEQ ID NO. 11;

f) A gene specific primer comprising or consisting of a nucleic acid having at least 90% sequence identity to the nucleic acid sequence of SEQ ID NO. 12; and/or

g) A gene-specific primer comprising or consisting of a nucleic acid having at least 90% sequence identity to the nucleic acid sequence of SEQ ID NO. 13, and

h) a universal primer comprising or consisting of a nucleic acid sequence having at least 90% sequence identity to the nucleic acid sequence of (CTG)₅ (SEQ ID NO: 16) or SEQ ID NO: 14.

16. A kit for detecting the presence or absence of a repeat expansion sequence in two or more genes in a nucleic acid sample obtained from a subject, the kit comprising:

a) A gene-specific primer that specifically binds to a different target sequence of each of two or more genes, wherein each gene comprises a nucleotide repeat sequence, and wherein the different target sequence is upstream or downstream of the nucleotide repeat sequence in each gene; and

b) A universal primer that binds to a common target sequence shared by the two or more genes, wherein the common target sequence is located within the nucleotide repeat sequence and on the opposite strand bound by the gene-specific primer of each gene;

wherein the gene-specific primers and the universal primers are capable of producing one or more amplification products from each gene.

17. A composition comprising a nucleic acid sample obtained from a subject, the composition comprising a) a gene-specific primer that specifically binds to a different target sequence of each of two or more genes, wherein each gene comprises a nucleotide repeat sequence, and wherein the different target sequence is upstream or downstream of the nucleotide repeat sequence in each gene; and b) a universal primer that binds to a common target sequence shared by the two or more genes, wherein the common target sequence is located within the nucleotide repeat sequence and on the opposite strand bound by the gene-specific primer of each gene.

18. A method of screening a subject for one or more repeat expansion diseases, the method comprising:

i) Contacting a nucleic acid sample from the subject under amplification conditions with: a) A gene-specific primer that specifically binds to a different target sequence of each of two or more genes, wherein each gene comprises a nucleotide repeat sequence, and wherein the different target sequence is upstream or downstream of the nucleotide repeat sequence in each gene; and b) a universal primer that binds to a common target sequence shared by the two or more genes, wherein the common target sequence is located within the nucleotide repeat sequence and on the opposite strand bound by the gene-specific primer of each gene; wherein the gene-specific primers and the universal primers are capable of producing one or more amplification products from each gene; and

ii) analyzing the amplification products to screen the subject for the one or more repeat expansion diseases.

19. The method of claim 18, wherein the one or more repeat expansion diseases comprise or consist of fragile X syndrome (FXS), fragile X-associated primary ovarian insufficiency (FXPOI), fragile X-associated tremor/ataxia syndrome (FXTAS), and fragile XE nonsyndromic intellectual disability (FRAXE NSID).

20. The method of claim 18, wherein the one or more repeat expansion diseases is spinocerebellar ataxia (SCA) and/or dentatorubral-pallidoluysian atrophy (DRPLA).

21. A method of screening a subject for one or more repeat expansion diseases and treating the subject, the method comprising:

i) Contacting a nucleic acid sample from the subject under amplification conditions with: a) A gene-specific primer that specifically binds to a different target sequence of each gene of two or more genes, wherein each gene comprises a nucleotide repeat sequence, and wherein the different target sequence is upstream or downstream of the nucleotide repeat sequence in each gene; and b) a universal primer that binds to a common target sequence shared by the two or more genes, wherein the common target sequence is located within the nucleotide repeat sequence and on the opposite strand bound by the gene-specific primer of each gene; wherein the gene-specific primers and the universal primers are capable of producing one or more amplification products from each gene;

ii) analyzing the amplification products to screen the subject for the one or more repeat expansion diseases; and

iii) treating a subject found to have at least one repeat expansion disease.