CN118159667A - Method for detecting repeat expansion sequences
- Publication number
- CN118159667A (application number CN202280053833.7A)
- Authority
- CN
- China
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12Q—MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
- C12Q1/00—Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
- C12Q1/68—Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
- C12Q1/6806—Preparing nucleic acids for analysis, e.g. for polymerase chain reaction [PCR] assay
- C12Q1/6844—Nucleic acid amplification reactions
Abstract
The present disclosure relates, inter alia, to a method of detecting the presence or absence of a repeat expansion sequence in two or more genes in a nucleic acid sample obtained from a subject, the method comprising: i) contacting the nucleic acid sample, for each of the two or more genes, with: a) a gene-specific primer that binds to a different target sequence of each gene, wherein the genes comprise nucleotide repeats, and wherein the different target sequences are upstream or downstream of the nucleotide repeats, and b) a universal primer that binds to a common target sequence shared by the two or more genes, wherein the common target sequence is located within the nucleotide repeats and on the strand opposite to that bound by the gene-specific primer; and ii) analyzing the amplification product.
Description
The present disclosure relates to the field of molecular biology. In particular, the specification teaches methods of detecting the presence or absence of repeat expansion sequences in two or more genes in a nucleic acid sample, and methods of screening for multiple repeat expansion diseases in a subject.
Background
The expansion of simple sequence repeats scattered throughout the human genome is now known to directly cause more than 35 human diseases. The most common repeat expansion diseases are caused by trinucleotide repeats, although tetranucleotide, pentanucleotide, hexanucleotide and even dodecanucleotide repeat expansions have also been identified as the underlying mutations in other diseases.
Repeat expansion diseases often present with markedly variable phenotypes and are often difficult to distinguish by signs and symptoms alone because of extensive clinical overlap and other concomitant phenotypes. Molecular genetic testing is therefore necessary to identify the pathogenic mutation and confirm disease status in symptomatic individuals. Several diseases are caused by the same repeat expansion motif located at different loci. For example, CGG or GCG repeat expansions cause fragile X syndrome (FXS), several types of oculopharyngodistal myopathy, oculopharyngeal muscular dystrophy, and developmental and epileptic encephalopathy; CCG repeat expansions cause one type of intellectual disability and oculopharyngeal myopathy with leukoencephalopathy; CAG repeat expansions cause Huntington's disease and several types of spinocerebellar ataxia (SCA); CTG repeat expansions cause myotonic dystrophy type 1, Huntington's disease-like 2, and Fuchs endothelial corneal dystrophy; and TTTCA pentanucleotide repeat expansions cause several types of familial adult myoclonic epilepsy and spinocerebellar ataxia. Multiple rounds of genetic testing may therefore be required to reach a correct diagnosis for an affected individual, adding cost and time.
Accordingly, there is a need to overcome or at least alleviate one or more of the problems mentioned above.
Disclosure of Invention
Disclosed herein is a method of detecting the presence or absence of a repeat expansion sequence in two or more genes in a nucleic acid sample obtained from a subject, the method comprising: i) contacting the nucleic acid sample under amplification conditions, for each of the two or more genes, with: a) a gene-specific primer that specifically binds to a different target sequence of each gene, wherein the genes comprise nucleotide repeats, and wherein the different target sequences are upstream or downstream of the nucleotide repeats, and b) a universal primer that binds to a common target sequence shared by the two or more genes, wherein the common target sequence is located within the nucleotide repeats and on the strand opposite to that bound by the gene-specific primers; wherein the gene-specific primers and the universal primer are capable of producing one or more amplification products from each gene; and ii) analyzing the amplified product.
Disclosed herein is a kit for detecting the presence or absence of repeat expansion sequences in two or more genes in a nucleic acid sample obtained from a subject, the kit comprising: a) a gene-specific primer that specifically binds to a different target sequence of each of two or more genes, wherein each gene comprises a nucleotide repeat sequence, and wherein the different target sequence is upstream or downstream of the nucleotide repeat sequence in each gene; and b) a universal primer that binds to a common target sequence shared by the two or more genes, wherein the common target sequence is located within the nucleotide repeat sequence and on the strand opposite to that bound by the gene-specific primer of each gene; wherein the gene-specific primers and the universal primer are capable of producing one or more amplification products from each gene.
Disclosed herein are compositions comprising a nucleic acid sample obtained from a subject, the compositions comprising a) a gene-specific primer that specifically binds to a different target sequence of each of two or more genes, wherein each gene comprises a nucleotide repeat sequence, and wherein the different target sequences are upstream or downstream of the nucleotide repeat sequence in each gene; and b) a universal primer that binds to a common target sequence shared by two or more genes, wherein the common target sequence is located within the nucleotide repeat sequence and on the opposite strand bound by the gene-specific primer of each gene; wherein the gene-specific primers and universal primers are capable of producing one or more amplification products from each gene.
Disclosed herein is a method of screening a subject for one or more of multiple repeat expansion diseases, the method comprising: i) contacting a nucleic acid sample from the subject under amplification conditions with: a) a gene-specific primer that specifically binds to a different target sequence of each of two or more genes, wherein each gene comprises a nucleotide repeat sequence, and wherein the different target sequence is upstream or downstream of the nucleotide repeat sequence in each gene; and b) a universal primer that binds to a common target sequence shared by the two or more genes, wherein the common target sequence is located within the nucleotide repeat sequence and on the strand opposite to that bound by the gene-specific primer of each gene; wherein the gene-specific primers and the universal primer are capable of producing one or more amplification products from each gene; and ii) analyzing the amplified product to screen the subject for one or more of the multiple repeat expansion diseases.
Disclosed herein is a method of screening a subject for one or more of multiple repeat expansion diseases and treating the subject, the method comprising: i) contacting a nucleic acid sample from the subject under amplification conditions with: a) a gene-specific primer that specifically binds to a different target sequence of each of two or more genes, wherein each gene comprises a nucleotide repeat sequence, and wherein the different target sequence is upstream or downstream of the nucleotide repeat sequence in each gene; and b) a universal primer that binds to a common target sequence shared by the two or more genes, wherein the common target sequence is located within the nucleotide repeat sequence and on the strand opposite to that bound by the gene-specific primer of each gene; wherein the gene-specific primers and the universal primer are capable of producing one or more amplification products from each gene; ii) analyzing the amplified product to screen the subject for one or more of the multiple repeat expansion diseases; and iii) treating the subject found to have at least one of the multiple repeat expansion diseases.
Drawings
Some embodiments of the invention will now be described, by way of non-limiting example only, with reference to the accompanying drawings, in which:
FIG. 1. Schematic representation of the FMR1 and AFF2 duplex TP-PCR reactions. Locus-specific flanking primer annealing positions are indicated (black and grey arrows, with the ends marked with star symbols), as are TP primer annealing positions in the repeat (black and grey arrows). Grey arrows indicate poor amplification, either because the TP primer is farther away from the locus-specific primer, or because of an interruption-mediated mismatch. The expected electropherogram patterns for samples carrying only normal FMR1 and normal AFF2 alleles (A), expanded FMR1 alleles (B) and expanded AFF2 alleles (C) are shown. The black rectangles represent FMR1 CGG or AFF2 CCG trinucleotides. Grey rectangles represent non-CGG or non-CCG trinucleotides, as shown above each rectangle.
FIG. 2. Electropherograms of FMR1 (FAM/dark gray) and AFF2 (HEX/light gray) duplex TP-PCR products from normal, FMR1 premutation and FMR1 full mutation male and female DNA samples. The electropherograms show both the FAM and HEX (left), FAM only (middle) and HEX only (right) fluorescence channels. Dark gray peaks indicate FAM-labeled FMR1 TP-PCR products, while light gray peaks indicate HEX-labeled AFF2 TP-PCR products. The threshold repeat size separating normal alleles from expanded alleles is indicated by the vertical dashed line.
FIG. 3. Electropherograms of FMR1 (FAM/dark gray) and AFF2 (HEX/light gray) duplex TP-PCR products from AFF2 premutation and full mutation DNA samples. The electropherograms show both the FAM and HEX (left), FAM only (middle) and HEX only (right) fluorescence channels. Dark gray peaks indicate FAM-labeled FMR1 TP-PCR products, while light gray peaks indicate HEX-labeled AFF2 TP-PCR products. The threshold repeat size separating normal alleles from expanded alleles is indicated by the vertical dashed line.
FIG. 4. AFF2 CCG repeat size and structure distribution. A, AFF2 CCG repeat size (x-axis) and frequency (y-axis) distribution among African American (gray filled), Caucasian (unfilled), Chinese (black filled), Indian (dark gray thick bars) and Malay (gray forward slash) populations. B, comparison of the allele frequency distributions in Zhong et al. (gray) and in this study (black). C, comparison of the Caucasian allele frequency distributions in Zhong et al. (light gray) and in this study (gray). D, heat map of the population distribution (top) of X chromosomes with different FMR1 CGG and AFF2 CCG repeat size combinations, and population repeat size distribution and abundance (bottom) of common and variant AFF2 alleles.
FIG. 5. Spectrum of AFF2 CCG repeat structures and corresponding TP-PCR electropherogram patterns. A, common normal allele with the rs868914124 (T) reference nucleotide. B, variant normal allele with the rs868914124 (C) variant nucleotide. C, variant normal allele with the rs868914124 (T) reference nucleotide and the rs1389911365 (T) variant nucleotide. D, full mutation allele with the rs868914124 (C) variant nucleotide and three consecutive non-CCG interruptions. "+" indicates a non-CCG interruption, and "+++" indicates three consecutive non-CCG interruptions. The leftmost isolated 111 bp peak is generated by annealing of the TP primer upstream of the repeat tract. The TP-PCR product generated from the first repeat of the rs868914124 (T) common allele migrates as a 138 bp fragment. The TP-PCR product generated from the first repeat of the rs868914124 (C) rare allele migrates as a 132 bp fragment. Interruptions within the repeat cause TP primer mismatches, which result in a peak-free gap within the peak cluster.
FIG. 6. AFF2 CCG repeat structure and corresponding TP-PCR electropherogram pattern of (A) full mutation male FX0229 and (B) his full mutation maternal uncle FX0230. A, full mutation allele carrying the rs868914124 (C) rare variant and three consecutive non-CCG interruptions at repeats 8-10. B, full mutation allele with the rs868914124 (C) rare variant and multiple non-CCG interruptions. "+++" indicates three consecutive non-CCG interruptions.
FIG. 7. AFF2 CCG repeat structure and corresponding TP-PCR electropherogram patterns of (A) full mutation male DNA_25926 and (B) his premutation offspring DNA_3802. A, full mutation allele carrying the rs868914124 (C) rare variant and three consecutive non-CCG interruptions at repeats 6-8. B, a normal allele of 22 CCG repeats carrying the rs868914124 (T) common reference nucleotide, and a premutation allele of about 137 repeats with the rs868914124 (C) rare variant and three consecutive non-CCG interruptions at repeats 6-8. "+++" indicates three consecutive non-CCG interruptions.
FIG. 8. Multiplex triplet-primed PCR for seven common SCA repeat loci. Schematic representation of the repeat structures of SCA1, SCA2, SCA3, SCA6, SCA7, SCA12 and DRPLA and the primer annealing positions (A), and the expected TP-PCR electropherograms for each repeat locus (B). Asterisks indicate fluorophore labels. The upper limit of the normal allele repeat size for each locus is indicated by the vertical dashed line.
FIG. 9. Seven-plex TP-PCR capillary electropherograms generated from a DNA sample negative for expansion at any of the seven SCA repeat loci. The upper panel shows the electrophoretic peaks representing TP-PCR products from all seven repeat loci, seen with all four fluorophore channels (FAM, VIC, NED and PET) open. The four lower panels show the electropherograms for the same sample, but with one fluorophore channel open at a time. The upper limit of the normal allele repeat size for each locus is indicated by the vertical dashed line.
FIG. 10. Seven-plex TP-PCR capillary electropherograms of SCA-affected DNA samples. For each sample, only the fluorophore channel showing repeat expansion at the relevant SCA locus is shown. Samples were positive for CAG repeat expansion in ATXN1 (top row), ATXN2 (second row), ATXN3 (third row), CACNA1A (fourth row), ATXN7 (fifth row), PPP2R2B (sixth row) and ATN1 (bottom row). The upper limit of the normal allele repeat size for each locus is indicated by the vertical dashed line.
Detailed Description
The present specification teaches a method of detecting the presence or absence of a repeat expansion sequence in two or more genes in a nucleic acid sample obtained from a subject, the method comprising: i) contacting the nucleic acid sample under amplification conditions, for each of the two or more genes, with: a) a gene-specific primer that specifically binds to a different target sequence of each gene, wherein the genes comprise nucleotide repeats, and wherein the different target sequences are upstream or downstream of the nucleotide repeats, and b) a universal primer that binds to a common target sequence shared by the two or more genes, wherein the common target sequence is located within the nucleotide repeats and on the strand opposite to that bound by the gene-specific primers; wherein the gene-specific primers and the universal primer are capable of producing one or more amplification products from each gene; and ii) analyzing the amplified product.
In one embodiment, the method is a method of simultaneously detecting the presence or absence of repeat expansion sequences in two or more genes in a nucleic acid sample obtained from a subject.
Without being bound by theory, the inventors developed a strategy to screen simultaneously for the presence of repeat expansion mutations in two or more suspected genes of a patient, and two examples are used in the present disclosure to describe its use. This strategy employs a single-tube assay to screen multiple genetic loci responsible for different repeat expansion diseases caused by expansion of the same type of repeat. The specific assays shown herein employ multiplex triplet-primed PCR (TP-PCR) to detect expansion mutations involving trinucleotide repeats present, for example, in different genes, by using differentially labeled locus-specific flanking primers and a universal triplet-primed (TP) primer. Expansions at any locus that show a repeat size in the pathogenic size range, or that exceed the maximum normal repeat size, can be rapidly identified and sized by capillary electrophoresis. A single amplification reaction provides reliable one-step mutation screening of multiple disease genes, thereby greatly shortening the lengthy diagnostic process for affected individuals. The strategy can be applied to screen simultaneously any group of disease genes sharing the same repeat sequence.
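The screening logic described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Peak` record, the `flag_expansions` helper, and the threshold values are hypothetical and do not come from the disclosure; only the dye-to-locus pairing (FAM for FMR1, HEX for AFF2) follows the figure descriptions.

```python
from dataclasses import dataclass


@dataclass
class Peak:
    dye: str      # fluorophore channel of the labeled gene-specific primer
    repeats: int  # repeat number assigned to the peak after sizing


def flag_expansions(peaks, locus_by_dye, max_normal):
    """Return {locus: largest repeat number} for loci exceeding their normal limit.

    locus_by_dye maps a dye channel to the locus it reports on.
    max_normal maps a locus to its largest normal repeat size
    (placeholder values here, not clinical cut-offs).
    """
    largest = {}
    for peak in peaks:
        locus = locus_by_dye[peak.dye]
        largest[locus] = max(largest.get(locus, 0), peak.repeats)
    return {locus: n for locus, n in largest.items() if n > max_normal[locus]}


# Hypothetical single-tube result: one channel per locus.
locus_by_dye = {"FAM": "FMR1", "HEX": "AFF2"}
max_normal = {"FMR1": 44, "AFF2": 30}  # illustrative placeholders only
peaks = [Peak("FAM", 30), Peak("FAM", 120), Peak("HEX", 15)]
print(flag_expansions(peaks, locus_by_dye, max_normal))
```

Because each locus is read from its own dye channel, one reaction can report on every locus independently; a locus absent from the returned mapping showed no peak beyond its normal limit.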
In one embodiment, the amplification products are separated according to size. In one embodiment, a change in the size of the amplified product of the gene as compared to a reference is indicative of the presence of a repeat expansion sequence in the gene. In one embodiment, no change in the size of the amplified product of the gene as compared to the reference indicates the absence of a repeat expansion sequence in the gene.
The terms "detect," "determine," "measure," "evaluate," "assess," and "assay" are used interchangeably herein to refer to any form of measurement and include determining the presence or absence of an element. These terms include quantitative and/or qualitative determinations. Assessment may be relative or absolute.
The term "nucleic acid" refers to a deoxyribonucleotide or ribonucleotide polymer in either single-or double-stranded form, and unless otherwise limited, encompasses known analogs of natural nucleotides that hybridize to nucleic acids in a manner similar to naturally occurring nucleotides.
As used herein, the term "nucleic acid" and equivalent terms, such as "polynucleotide", refer to a polymeric form of nucleotides of any length, such as ribonucleotides, deoxyribonucleotides or Peptide Nucleic Acids (PNAs), that comprise purine and pyrimidine bases, or other natural, chemically or biochemically modified, non-natural, or derivatized nucleotide bases. The nucleic acid may be double-stranded or single-stranded. Reference to single stranded nucleic acids includes reference to sense or antisense strands. The backbone of the polynucleotide may include sugar and phosphate groups, as commonly found in RNA or DNA, or modified or substituted sugar or phosphate groups. Polynucleotides may include modified nucleotides, such as methylated nucleotides and nucleotide analogs. The sequence of nucleotides may be interrupted by non-nucleotide components. The terms nucleoside, nucleotide, deoxynucleoside and deoxynucleotide generally include complements, fragments and variants of nucleosides, nucleotides, deoxynucleosides and deoxynucleotides or analogs thereof.
The term "primer" is used herein to refer to any single stranded oligonucleotide sequence that can be used as a primer in, for example, PCR techniques. Thus, a "primer" according to the present disclosure refers to a single stranded oligonucleotide sequence that is capable of acting as a starting point for the synthesis of a primer extension product that is substantially identical to a nucleic acid strand to be replicated (for a forward primer) or substantially identical to the reverse complement of a nucleic acid strand to be replicated (for a reverse primer). The primers may be suitable for use in, for example, PCR techniques. Single strands include, for example, hairpin structures formed from single-stranded nucleotide sequences.
The design of a primer, e.g., its length and specific sequence, depends on the nature of the target nucleotide sequence and the conditions under which the primer is used, e.g., temperature and ionic strength.
Primers may consist of the nucleotide sequences described herein, or may be 10, 15, 20, 25, 30, 35, 40, 45, 50, 75, 100 or more nucleotides comprising or falling within the sequences described herein, provided they are suitable for specifically binding to a target sequence under stringent conditions. In one embodiment, the primer sequence is less than 35 nucleotides in length, e.g., the primer sequence is less than 34, 33, 32, 31, 30, 29, 28, 27, 26, 25, 24, 23, 22, or 21 nucleotides in length.
The length or sequence of the primers or probes may be slightly modified to maintain the desired specificity and sensitivity in a given situation. In one embodiment of the present disclosure, the probes and/or primers described herein may extend 1, 2, 3, 4, or 5 nucleotides in length, or decrease 1, 2, 3, 4, or 5 nucleotides in length, for example, in any direction.
The primer sequences may be synthesized using any method known in the art.
The term "complementary" refers to base pairing between nucleotides or nucleic acids, for example, between two strands of a double-stranded DNA molecule, or between an oligonucleotide primer and a primer binding site on a single-stranded nucleic acid to be sequenced or amplified. The complementary nucleotides are typically A and T (or A and U), or C and G. Two single stranded RNA or DNA molecules are considered complementary when the nucleotides of one strand, optimally aligned and compared, with appropriate nucleotide insertions or deletions, pair with at least about 80% of the nucleotides of the other strand (typically at least about 90% to 95% of the nucleotides of the other strand, and more preferably about 98% to 100% of the nucleotides of the other strand). Alternatively, complementarity exists when an RNA or DNA strand will hybridize to its complement under selective hybridization conditions. Typically, selective hybridization will occur when there is at least about 65% complementarity over a stretch of at least 14 to 25 nucleotides, preferably at least about 75%, and more preferably at least about 90% complementarity.
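As a rough illustration of the percentage thresholds above, the following Python sketch scores antiparallel base pairing between two equal-length strands. It is a simplification: it does not perform the optimal alignment with insertions or deletions that the definition allows, and the function name is hypothetical.

```python
# Watson-Crick pairs, including A-U for RNA strands.
_PAIRS = {("A", "T"), ("T", "A"), ("A", "U"), ("U", "A"), ("C", "G"), ("G", "C")}


def percent_complementarity(strand: str, other: str) -> float:
    """Percent of positions that pair when the strands are aligned antiparallel.

    Both strands are written 5'->3' and must have equal length; gaps are
    not modeled in this sketch.
    """
    if len(strand) != len(other) or not strand:
        raise ValueError("strands must be non-empty and of equal length")
    a, b = strand.upper(), other.upper()
    paired = sum((x, y) in _PAIRS for x, y in zip(a, reversed(b)))
    return 100.0 * paired / len(a)


print(percent_complementarity("CGG" * 5, "CCG" * 5))
print(percent_complementarity("AAAAAAAAAA", "TTTTTTTTTC"))
```

The first call pairs every position; the second has one mismatch in ten positions and therefore sits at the 90% level mentioned in the text.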
As used herein, the term "hybridization" refers to the process by which two single-stranded polynucleotides non-covalently bind to form a stable double-stranded polynucleotide. The resulting (typically) double-stranded polynucleotide is a "hybrid". The proportion of the population of polynucleotides that form stable hybrids is referred to herein as the "degree of hybridization".
Typically, hybridization conditions will include salt concentrations of less than about 1 M, more typically less than about 500 mM and less than about 200 mM. Hybridization temperatures can be as low as 5 °C, but are typically greater than 22 °C, more typically greater than about 30 °C, and preferably greater than about 37 °C. The hybridization is usually carried out under stringent conditions (i.e., conditions under which the probe will hybridize to its target sequence). Stringent conditions are sequence-dependent and will be different in different circumstances. Longer fragments may require higher hybridization temperatures for specific hybridization. The combination of parameters is more important than the absolute measure of either alone, as other factors may affect the stringency of hybridization, including the base composition and length of the complementary strands, the presence of organic solvents, and the degree of base mismatch. Typically, stringent conditions are selected to be about 5 °C lower than the thermodynamic melting point (Tm) for a particular sequence at a defined ionic strength and pH. Tm is the temperature (under defined ionic strength, pH and nucleic acid composition) at which 50% of the probes complementary to the target sequence hybridize to the target sequence at equilibrium. Typically, stringent conditions include salt concentrations of at least 0.01 M to no more than 1 M Na ion concentration (or other salt) at a pH of 7.0 to 8.3 and a temperature of at least 25 °C. For example, conditions of 5× SSPE (750 mM NaCl, 50 mM Na phosphate, 5 mM EDTA, pH 7.4) and temperatures of 25-30 °C are suitable for allele-specific probe hybridization.
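The relationship between Tm and a stringent temperature can be illustrated with a small Python sketch. The disclosure does not say how Tm is calculated; the sketch substitutes the Wallace rule of thumb (2 °C per A/T and 4 °C per G/C), which is a coarse estimate for short oligonucleotides only, and the example oligonucleotide is hypothetical.

```python
def wallace_tm(oligo: str) -> float:
    """Estimate Tm in degrees C with the Wallace rule: 2*(A+T) + 4*(G+C).

    A swapped-in approximation for short DNA oligonucleotides; it ignores
    ionic strength, pH and strand concentration.
    """
    seq = oligo.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    if at + gc != len(seq):
        raise ValueError("oligo must contain only A, C, G and T")
    return 2.0 * at + 4.0 * gc


def stringent_temperature(tm: float, offset: float = 5.0) -> float:
    """Temperature about `offset` degrees C below Tm, as described in the text."""
    return tm - offset


tm = wallace_tm("ATGCATGCATGCAT")  # hypothetical 14-mer
print(tm, stringent_temperature(tm))
```

In practice a nearest-neighbor thermodynamic model that accounts for salt and oligonucleotide concentration would normally replace the rule of thumb used here.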
By "specific binding" or "specific hybridization" of a primer to a target sequence is meant that under the experimental conditions used, e.g., under stringent hybridization conditions, the primer forms a duplex (double-stranded nucleotide sequence) with a portion of the target sequence region or with the entire target sequence (as desired), and under these conditions the primer does not form a duplex with other regions of the nucleotide sequence present in the sample to be analyzed.
The term "repeat expansion sequence" may refer to a nucleotide repeat sequence that has undergone an expansion mutation, resulting in the nucleotide repeat sequence expanding beyond the normal repeat size.
In one embodiment, the repeat expansion sequence is a trinucleotide repeat expansion sequence. In other embodiments, the repeat expansion is a tetranucleotide or pentanucleotide repeat expansion, or a repeat expansion of another unit length, such as a 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, or 20 nucleotide repeat sequence.
In one embodiment, the nucleotide repeat is a trinucleotide repeat. In one embodiment, the trinucleotide repeat sequence is selected from (CGG)n, (CCG)n, (CAG)n, (CTG)n, (GCC)n, (GGC)n, (GAA)n or (TTC)n, where n is 2 to 200 or greater. (CGG)n repeats include (GCG)n and (GGC)n. Similarly, (CCG)n sequences include (CGC)n and (GCC)n, and (CTG)n sequences include (TGC)n and (GCT)n.
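The equivalences listed above are cyclic rotations of the repeat unit: shifting the reading frame of a long tandem repeat by one or two bases yields the same tract described with a different motif. A minimal Python sketch (the helper names are hypothetical):

```python
def rotations(motif: str) -> set[str]:
    """All cyclic rotations of a repeat unit, e.g. CGG -> {CGG, GGC, GCG}."""
    return {motif[i:] + motif[:i] for i in range(len(motif))}


def same_repeat_family(a: str, b: str) -> bool:
    """True when two motifs describe the same tandem repeat in shifted frames."""
    return len(a) == len(b) and b in rotations(a)


print(sorted(rotations("CGG")))
print(sorted(rotations("CTG")))
print(same_repeat_family("CCG", "GCC"))
```

The same check applies to longer units; for instance, CATTT and ATTTC are both rotations of TTTCA.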
A "universal primer" as referred to herein may bind to a "common target sequence" shared by two or more genes, wherein the "common target sequence" is located within the nucleotide repeat sequence and on the opposite strand bound by the gene-specific primer.
"Gene-specific primers" as referred to herein may refer to primers that specifically bind to a particular gene but not to other genes.
The term "gene" is used broadly to refer to any nucleic acid associated with a biological function. Genes typically include coding sequences and/or regulatory sequences required for expression of such coding sequences. The term gene may apply to a particular genomic sequence, as well as to the cDNA or mRNA encoded by that genomic sequence. Genes also include non-expressed nucleic acid segments that, for example, form recognition sequences for other proteins. Non-expressed regulatory sequences include promoters and enhancers to which regulatory proteins (e.g., transcription factors) bind, resulting in transcription of adjacent or nearby sequences.
In one embodiment, the universal primer binds to a common target sequence comprising or consisting of 5, 6, 7, 8, 9, 10 or more consecutive trinucleotide repeats. In one embodiment, the universal primer binds to (CGG)5 (SEQ ID NO: 15). In one embodiment, the universal primer binds to (CTG)5 (SEQ ID NO: 16).
In one embodiment, the universal primer binds to a common target sequence comprising or consisting of 2, 3, 4, 5, 6, 7, 8, 9, 10 or more consecutive repeat units of four nucleotides, five nucleotides or more nucleotides. In one embodiment, the pentanucleotide repeat sequence is (TTTCA)n, which includes (CATTT)n and (ATTTC)n.
In one embodiment, the universal primer comprises a unique tail sequence. In one embodiment, the universal primer comprises a unique 5' tail sequence. The term "unique tail sequence" or "unique 5' tail sequence" refers to a sequence that does not hybridize under stringent conditions to any gene or intergenic region in the nucleic acid sample that is used to detect the presence or absence of a repeat expansion sequence.
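Because the common target lies on the strand opposite the gene-specific primer, it helps to see how a repeat tract reads on each strand. The Python sketch below only demonstrates base-pairing arithmetic; it does not reproduce the actual primer sequences of the disclosure (SEQ ID NO: 3 and 14 are not shown here), and the helper name is hypothetical.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Sequence of the opposite DNA strand, written 5'->3'."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


# A (CGG)5 tract on one strand reads as (CCG)5 on the opposite strand,
# and a (CTG)5 tract reads as (CAG)5.
print(reverse_complement("CGG" * 5))
print(reverse_complement("CTG" * 5))
```

By this arithmetic, an oligonucleotide that anneals to a (CTG)5 target would itself read (CAG)5 in the 5'→3' direction, which is consistent with one universal primer serving several CAG-repeat loci.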
In one embodiment, the method includes providing a tail primer that specifically binds to a unique 5' tail sequence of a universal primer. Adding a unique 5 'tail sequence to the universal primer and providing a tail primer that specifically binds to the unique 5' tail sequence can improve the accuracy of repeat sizing.
In one embodiment, the gene-specific primers are labeled. Each gene-specific primer may be labeled, for example, with a different fluorophore, to enable the amplified products from each gene to be distinguished from each other. For example, FAM or HEX labels may be used.
In one embodiment, the fluorescent label may be active in the blue, yellow, green, and far-red regions of the spectrum. In preferred embodiments, non-limiting examples of fluorescent labels useful in the methods of the present disclosure include fluorescent labels or reporter dyes such as 6-carboxyfluorescein (6-FAM™), NED™ (Applera Corporation), HEX™ or VIC™ (Applied Biosystems); TAMRA™ labels (Applied Biosystems, CA, USA); and ROX. One skilled in the art will appreciate that other alternative fluorescent labels may also be used in the methods according to the present disclosure.
In another embodiment, chemiluminescent labels, such as ruthenium probes, may be used; and radiolabels, such as tritium in the form of tritiated thymidine. 32-phosphorus can also be used as a radiolabel.
Alternatively, the label may be selected from an electroluminescent label, a magnetic label, an affinity or binding label, a nucleotide sequence label, a position specific label and/or a label having specific physical properties (e.g. different dimensions, masses, gyrations, ionic strength, dielectric properties, polarization or impedance).
In one embodiment, the detectable label may be directly or indirectly attached to the primer. In one embodiment, the labeled primer is a reverse primer. In one embodiment, the detectable label comprises a fluorescent moiety attached to the 5' end of the probe. In a most preferred embodiment, the label is selected from the group consisting of 6-FAM and NED.
In an alternative embodiment, the nucleic acid is detected with a nucleic acid intercalating fluorophore. Preferably, the intercalating fluorophore is SYBR Green or EvaGreen, or the like. Those skilled in the art will appreciate that other intercalating fluorophores active in the blue, yellow, green, and far-red regions of the spectrum may be used. It will be further understood that other intercalating fluorophores may be used in accordance with the present disclosure.
The term "sample" as referred to herein may originate from a biological fluid, cell, tissue, organ or organism comprising a nucleic acid or mixture of nucleic acids having at least one nucleic acid sequence to be screened for copy number variation. In certain embodiments, the sample has at least one nucleic acid sequence, suspected of having had its copy number changed. Such samples include, but are not limited to, sputum/oral fluid, amniotic fluid, blood fractions or fine needle biopsy samples, urine, peritoneal fluid, pleural fluid, and the like. Although the sample is typically taken from a human subject (e.g., a patient), the sample may be taken from any mammal, including but not limited to dogs, cats, horses, goats, sheep, cattle, pigs, and the like. The sample may be used as obtained from biological sources directly or after a pretreatment that alters the properties of the sample. For example, such pretreatment may include preparing plasma from blood, diluting viscous fluids, and the like. The method of pretreatment may also involve, but is not limited to, filtration, precipitation, dilution, distillation, mixing, centrifugation, freezing, lyophilization, concentration, amplification, nucleic acid fragmentation, inactivation of interfering components, addition of reagents, lysis, and the like. Where such pretreatment methods are employed with respect to a sample, such pretreatment methods typically leave the nucleic acid of interest in the test sample, sometimes in a concentration proportional to the concentration in the untreated test sample (e.g., a sample that has not been subjected to any such pretreatment methods).
The control sample may be a negative or positive control sample. "negative control sample" or "unaffected sample" refers to a sample comprising nucleic acids known or expected to have repeat sequences with a number of repeats in a non-pathogenic range. "positive control samples" or "affected samples" are known or expected to have repeat sequences with a number of repeats that are within the pathogenic range. The repeat sequence in the negative control sample typically does not extend beyond the normal range, whereas the repeat sequence in the positive control sample typically has extended beyond the normal range. Thus, the nucleic acids in the test sample can be compared to one or more control samples.
The term "biological fluid" herein refers to a liquid taken from a biological source and includes, for example, blood, serum, plasma, sputum, lavage, cerebrospinal fluid, urine, semen, sweat, tears, saliva, and the like. As used herein, the terms "blood," "plasma," and "serum" expressly encompass fractions or processed portions thereof. Also, where the sample is taken from a biopsy, swab, smear, or the like, the "sample" expressly encompasses a processed fraction or portion derived from a biopsy, swab, smear, or the like.
"Label" refers to a reporter molecule (e.g., a fluorophore) capable of producing a measurable signal and covalently or non-covalently linked to a polynucleotide.
In one embodiment, the method comprises analyzing the amplified products using a size separation technique or a sequencing technique. The size separation technique may be an electrophoresis-based technique (e.g., capillary electrophoresis). For example, capillary electrophoresis through a size-separating polymer can be used. The fluorescently labeled PCR products (labeled via the use of 5'-labeled primers) are detected by laser excitation as they migrate and resolve through the polymer-filled capillary. Another size separation technique is slab gel PAGE (polyacrylamide gel electrophoresis). The PCR products can be fluorescently labeled with 5' end-labeled primers and detected via laser excitation as they migrate and resolve through the slab gel. The PCR products can also be radiolabeled via 5' end-labeled primers or via incorporation of radioisotope-labeled nucleotides during the PCR process and detected by exposing the slab gel to X-ray film. The sequencing technique may be next-generation sequencing. By using long-read next-generation sequencing, PCR products can be sequenced and all reads belonging to the gene of interest can be ordered from shortest to longest, or vice versa.
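Once fragments are resolved by size, a peak position can be converted to a repeat number. The Python sketch below assumes ideal migration, in which each additional repeat unit adds exactly one unit length to the fragment; the function name is hypothetical. The 138 bp anchor is the size given in the description of FIG. 5 for the product generated from the first repeat of the common AFF2 allele and is used purely as an illustration.

```python
def repeat_number(fragment_bp: float, first_repeat_bp: float, unit_len: int = 3) -> int:
    """Repeat count for a peak, given the size of the first-repeat product.

    Assumes each further repeat adds `unit_len` bp and rounds to the nearest
    whole repeat to absorb small sizing errors.
    """
    if fragment_bp < first_repeat_bp:
        raise ValueError("fragment is smaller than the first-repeat product")
    return round((fragment_bp - first_repeat_bp) / unit_len) + 1


print(repeat_number(138, 138))    # first-repeat product
print(repeat_number(201, 138))    # 21 further repeats beyond the first
print(repeat_number(200.6, 138))  # a slightly mis-sized peak
```

Real electrophoretic mobility may deviate from this ideal, so a calibration against samples of known repeat size would normally stand in for the fixed anchor used here.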
Two or more genes as referred to herein may be adjacent to each other or in proximity to each other on the same chromosome. Alternatively, they may be on different chromosomes.
In one embodiment, the two or more genes consist of or comprise FMR1 and AFF2. FMR1 and AFF2 are located on the same chromosome (X), in adjacent chromosome bands on the long arm. This provides a simple method of simultaneously testing two or more genes including FMR1 and AFF2, an expansion in either of which can cause the disease exhibited by a patient.
In one embodiment, the method comprises contacting a nucleic acid sample with: a) a gene-specific primer comprising or consisting of a nucleic acid having at least 90% sequence identity to the nucleic acid sequence of SEQ ID NO: 1; b) a gene-specific primer comprising or consisting of a nucleic acid having at least 90% sequence identity to the nucleic acid sequence of SEQ ID NO: 2; and c) a universal primer comprising or consisting of a nucleic acid sequence having at least 90% sequence identity to the nucleic acid sequence of SEQ ID NO: 3 or (CGG)5 (SEQ ID NO: 15).
In one embodiment, the two or more genes comprise or are selected from the group consisting of SCA1, SCA2, SCA3, SCA6, SCA7, SCA12 and DRPLA. In one embodiment, the two or more genes consist of two, three, four, five, six or seven or more genes comprising or selected from SCA1, SCA2, SCA3, SCA6, SCA7, SCA12 and DRPLA.
In one embodiment, the two or more genes are HTT and JPH3.
In one embodiment, the two or more genes consist of ATXN1, ATXN2, ATXN3, CACNA1A, ATXN7, PPP2R2B and ATN1.
In one embodiment, the two or more genes comprise or are selected from ATXN1, ATXN2, ATXN3, CACNA1A, ATXN7, PPP2R2B, ATN1, TBP, ATXN8OS, AR, HTT, JPH3, TCF4 and DMPK, and consist of two, three, four, five, six, seven, eight, nine, ten or more or all of these genes.
In one embodiment, the two or more genes comprise or are selected from FMR1, AFF2, NUTM2B-AS1, LRP12, GIPC1, NOTCH2NLC, RILPL1, PABPN1, and ARX, and consist of two, three, four, five, six, seven, eight, or all of these genes.
In one embodiment, the two or more genes comprise or are selected from DAB1, SAMD12, STARD7, TNRC6A and RAPGEF2 and consist of two, three, four or all of these genes.
In one embodiment, the method comprises contacting a nucleic acid sample with: a) A gene-specific primer comprising or consisting of a nucleic acid having at least 90% sequence identity with the nucleic acid sequence of SEQ ID No. 7; b) A gene-specific primer comprising or consisting of a nucleic acid having at least 90% sequence identity with the nucleic acid sequence of SEQ ID No. 8; c) A gene-specific primer comprising or consisting of a nucleic acid having at least 90% sequence identity with the nucleic acid sequence of SEQ ID No. 9; d) A gene specific primer comprising or consisting of a nucleic acid having at least 90% sequence identity to the nucleic acid sequence of SEQ ID No. 10; e) A gene-specific primer comprising or consisting of a nucleic acid having at least 90% sequence identity with the nucleic acid sequence of SEQ ID No. 11; f) A gene-specific primer comprising or consisting of a nucleic acid having at least 90% sequence identity with the nucleic acid sequence of SEQ ID No. 12; and/or g) a gene-specific primer comprising or consisting of a nucleic acid having at least 90% sequence identity with the nucleic acid sequence of SEQ ID NO. 13, and h) a universal primer comprising or consisting of a nucleic acid sequence having at least 90% sequence identity with the nucleic acid sequence of SEQ ID NO. 14 or (CTG) 5 (SEQ ID NO. 16).
Disclosed herein is a kit for detecting the presence or absence of a repeat extension sequence in two or more genes in a nucleic acid sample obtained from a subject, the kit comprising: a) A gene-specific primer that specifically binds to a different target sequence of each of two or more genes, wherein each gene comprises a nucleotide repeat sequence, and wherein the different target sequence is upstream or downstream of the nucleotide repeat sequence in each gene; and b) a universal primer that binds to a common target sequence shared by two or more genes, wherein the common target sequence is located within the nucleotide repeat sequence and on the opposite strand bound by the gene-specific primer of each gene; wherein the gene specific primers and universal primers are capable of producing one or more amplification products from each gene.
Disclosed herein are compositions comprising a nucleic acid sample obtained from a subject, the compositions comprising a) a gene-specific primer that specifically binds to a different target sequence of each of two or more genes, wherein each gene comprises a nucleotide repeat sequence, and wherein the different target sequences are upstream or downstream of the nucleotide repeat sequence in each gene; and b) a universal primer that binds to a common target sequence shared by two or more genes, wherein the common target sequence is located within the nucleotide repeat sequence and on the opposite strand bound by the gene-specific primer of each gene; wherein the gene specific primers and universal primers are capable of producing one or more amplification products from each gene.
Disclosed herein is a method of screening a subject for one or more multiple repeat expansion diseases, the method comprising: i) Contacting a nucleic acid sample from a subject under amplification conditions with: a) A gene-specific primer that specifically binds to a different target sequence of each of two or more genes, wherein each gene comprises a nucleotide repeat sequence, and wherein the different target sequence is upstream or downstream of the nucleotide repeat sequence in each gene; and b) a universal primer that binds to a common target sequence shared by two or more genes, wherein the common target sequence is located within the nucleotide repeat sequence and on the opposite strand bound by the gene-specific primer of each gene; wherein the gene-specific primers and the universal primers are capable of producing one or more amplification products from each gene; and ii) analyzing the amplified product to screen the subject for one or more multiple repeat expansion diseases.
The term "multiple repeat expansion disease" refers to a genetic disease caused by an increase in the number of DNA repeat motifs (e.g., trinucleotide repeat motifs) in certain genes beyond a normal, stable threshold (the threshold differs for each gene). The term is intended to include all diseases of this nature and includes the diseases listed in Table 1.
Table 1. Diseases caused by the expansion of different repeat motifs
In one embodiment, the one or more multiple repeat expansion diseases comprise or consist of one or more of the diseases listed in Table 1.
In one embodiment, the one or more multiple repeat expansion diseases comprise or consist of a disease comprising or selected from the group consisting of: fragile X syndrome (FXS), fragile X-related primary ovarian insufficiency (FXPOI), fragile X-related tremor/ataxia syndrome (FXTAS), and fragile XE nonsyndromic intellectual disability (FRAXE NSID).
In one embodiment, the method distinguishes FXS, FXPOI, FXTAS from FRAXE NSID.
In one embodiment, the one or more multiple repeat expansion diseases are spinocerebellar ataxia (SCA) and/or dentatorubral-pallidoluysian atrophy (DRPLA).
In one embodiment, the method distinguishes between SCA and DRPLA. In one embodiment, the method distinguishes between different SCA types and DRPLA.
In one embodiment, the one or more multiple repeat expansion diseases are Huntington's disease (HD) and Huntington's disease-like 2 (HDL2).
In one embodiment, the method distinguishes between Huntington's disease (HD) and Huntington's disease-like 2 (HDL2).
In one embodiment, the method distinguishes between different types of oculopharyngodistal myopathy (OPDM), oculopharyngeal muscular dystrophy (OPMD), oculopharyngeal myopathy with leukoencephalopathy (OPML), and developmental and epileptic encephalopathy (DEE).
In one embodiment, the method distinguishes between different familial adult myoclonic epilepsy (FAME) types and spinocerebellar ataxia (SCA) type 37.
"Subject" or "patient" refers to any individual subject in need of treatment, including humans, cattle, horses, pigs, goats, sheep, dogs, cats, guinea pigs, rabbits, chickens, insects, and the like. The subject is also intended to include any subject that is involved in a clinical study trial but does not exhibit any clinical signs of disease or a subject that is involved in an epidemiological study, or a subject that is used as a control.
Disclosed herein is a method of screening a subject for one or more multiple repeat expansion diseases and treating the subject, the method comprising: i) Contacting a nucleic acid sample from a subject under amplification conditions with: a) A gene-specific primer that specifically binds to a different target sequence of each of two or more genes, wherein each gene comprises a nucleotide repeat sequence, and wherein the different target sequence is upstream or downstream of the nucleotide repeat sequence in each gene; and b) a universal primer that binds to a common target sequence shared by two or more genes, wherein the common target sequence is located within the nucleotide repeat sequence and on the opposite strand bound by the gene-specific primer of each gene; wherein the gene-specific primers and the universal primers are capable of producing one or more amplification products from each gene; ii) analyzing the amplified product to screen the subject for one or more multiple repeat expansion diseases; and iii) treating the subject found to have at least one multiple repeat expansion disease.
The term "treating" or the like also includes alleviating, reducing, ameliorating or otherwise inhibiting the effects of a condition for at least a period of time. It is also to be understood that the term "treating" or the like does not mean that the condition or symptoms thereof are permanently alleviated, reduced, ameliorated, or otherwise inhibited, and thus temporary alleviation, reduction, amelioration, or other inhibition of the condition or symptoms thereof is also contemplated.
Methods of treating a subject may include administering a drug to the subject, or may include providing early intervention management to the subject.
The term "administering" refers to contacting a subject with an inhibitor as referred to herein, or applying, injecting, infusing, or otherwise providing the inhibitor to the subject.
Throughout this specification and the claims which follow, unless the context requires otherwise, the word "comprise", and variations such as "comprises" and "comprising", will be understood to imply the inclusion of a stated integer or step or group of integers or steps but not the exclusion of any other integer or step or group of integers or steps.
As used in this specification, the singular forms "a", "an" and "the" include plural referents unless the context clearly dictates otherwise. Thus, for example, reference to "a method" includes a single method as well as two or more methods; reference to "an agent" includes a single agent as well as two or more agents; reference to "the present disclosure" includes single and multiple aspects of the teachings of the present disclosure; and so on. Aspects taught and enabled herein are encompassed by the term "invention". Any variants and derivatives are contemplated.
The reference in this specification to any prior publication (or information derived from it), or to any matter which is known, is not, and should not be taken as, an acknowledgment or admission or any form of suggestion that the prior publication (or information derived from it) or known matter forms part of the common general knowledge in the field of endeavour to which this specification relates.
Examples
Certain embodiments of the present invention will now be described with reference to the following examples, which are intended for illustrative purposes only and are not intended to limit the scope of the subject matter described above.
Example 1: FRAXA/FRAXE
Materials and methods
Biological sample
Genomic DNA was extracted from 408 unrelated and anonymized cord blood samples from 161 Chinese, 158 Malay and 89 Indian male infants born at National University Hospital, Singapore. A further 44 Caucasian male DNA samples and 14 African American male DNA samples from the Human Variation DNA panels (HD100CAU and HD100AA-2) were purchased from Coriell Cell Repositories (Camden, New Jersey, USA). For assay validation, archived and previously characterized genomic DNA consisting of 40 normal, 17 FMR1 premutation-positive, 23 FMR1 full mutation-positive and 6 AFF2 expansion-positive samples was used. De-identified normal and FMR1 CGG repeat expansion-positive samples were obtained from KK Women's and Children's Hospital. AFF2 CCG repeat expansion-positive samples were de-identified archived samples from Baylor College of Medicine (Houston, TX, USA) and The University of Adelaide (Adelaide, South Australia, Australia). Four of the six AFF2 CCG repeat expansion-positive samples are related: FX0230 and FX0229 are maternal uncle and nephew, respectively, and DNA_25926 and DNA_3802 are father and daughter.
Duplex TP-PCR of FMR1 and AFF2 trinucleotide repeats
Duplex screening for AFF2 and FMR1 triplet repeat expansions utilized a triplet-primed (TP) PCR method involving four primers: Fam-labeled FMR1-R (5'-AGCCCCGCACTTCCACCACCAGCTCCTCCA-3' (SEQ ID NO: 1)), Hex-labeled AFF2-F (5'-CCATGTCGCGGCTTCTAGCTGTCCAGGCTCC-3' (SEQ ID NO: 2)), the shared primer TP (5'-TGCTCTGGACCCTGAAGTGTGCCGTTGATACGGCGGCGGCGGCGG-3' (SEQ ID NO: 3)) and tail (5'-TGCTCTGGACCCTGAAGTGTGCCGTTGATA-3' (SEQ ID NO: 4)). Each 15 μl PCR reaction contained 100 ng of genomic DNA, 2.5x Q-Solution (Qiagen, Hilden, Germany), 1x PCR buffer (Qiagen) containing 1.5 mM MgCl2, 2 mM dNTPs (dGTP/dCTP to dATP/dTTP ratio of 5:1) (Roche Applied Science, Mannheim, Germany), 0.6 μM each of AFF2-F, FMR1-R and tail primers, 0.0006 μM TP primer and 5 U HotStarTaq DNA polymerase (Qiagen). An initial 15-minute enzyme activation at 95℃ was followed by 40 cycles of 99℃ for 45 seconds, 55℃ for 45 seconds and 70℃ for 8 minutes (with the extension lengthened by 15 seconds per cycle), and a final extension at 72℃ for 10 minutes.
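As a rough planning aid, the cycling program described above can be written out and its programmed hold time summed. The following is a minimal sketch only: the function name is hypothetical, ramp times between temperatures are ignored, and it assumes that the first cycle extends for 8 minutes and that each later cycle adds a further 15 seconds.

```python
def duplex_tp_pcr_hold_seconds(cycles=40, extension_increment_s=15):
    """Sum the programmed hold times (in seconds) of the duplex TP-PCR program.

    Assumption: cycle 1 extends for 8 min and every later cycle adds 15 s.
    Ramp times between temperature steps are not included.
    """
    activation = 15 * 60        # 95 C enzyme activation
    final_extension = 10 * 60   # 72 C final extension
    total = activation + final_extension
    for cycle in range(cycles):
        denature = 45           # 99 C
        anneal = 45             # 55 C
        extend = 8 * 60 + extension_increment_s * cycle  # 70 C, lengthening
        total += denature + anneal + extend
    return total


if __name__ == "__main__":
    seconds = duplex_tp_pcr_hold_seconds()
    print(f"programmed hold time: {seconds / 3600:.1f} h")
```

Under these assumptions the programmed holds sum to 36,000 seconds (10 hours), most of which is the lengthening extension step.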
Standard PCR spanning FMR1 and AFF2 trinucleotide repeats
Standard/conventional PCR across the FMR1 CGG repeats utilized 0.6 μM each of primer 5' -F1 and Fam-labeled FMR1-R, while PCR across the AFF2 CCG repeats utilized Hex-labeled AFF2-F and AFF2-R (5'-CGCTGCGGGCTCAGGCGGGCT-3' (SEQ ID NO: 5)). The PCR reaction and cycling conditions were similar to those of the duplex TP-PCR described above, except that 50 ng of genomic DNA was used in each reaction.
Capillary electrophoresis
An aliquot of 2 μl of the duplex TP-PCR product was mixed with 9 μl of Hi-Di™ formamide and 0.5 μl of GeneScan™ 500 ROX™ dye size standard (Applied Biosystems, Foster City, California, USA), denatured at 95℃ for 5 minutes, cooled to 4℃ and resolved in a 3130xl Genetic Analyzer (Applied Biosystems) using a 36 cm capillary filled with POP-7™ polymer. The mixture was electrokinetically injected at 1 kV for 5 seconds and electrophoresed at 60℃ for 40 minutes. Analysis was performed using GeneMapper™ software v4.0 (Applied Biosystems). If an expanded allele was detected after the initial capillary electrophoresis (CE) run, a second CE run was performed using electrokinetic injection at 10 kV for 5 seconds and electrophoresis at 60℃ for 40 minutes.
An aliquot of 1 μl of 20-fold diluted standard PCR product was mixed with 9 μl of Hi-Di™ formamide (Applied Biosystems) and 0.3 μl of GeneScan™ 500 ROX™ dye size standard (Applied Biosystems), denatured at 95℃ for 5 minutes, cooled to 4℃ and resolved in a 3130xl Genetic Analyzer (Applied Biosystems) using a 36 cm capillary filled with POP-7™ polymer. The mixture was electrokinetically injected at 1.2 kV for 23 seconds and electrophoresed at 60℃ for 20 minutes.
Post-CE analysis was performed using GeneMapper™ software v4.0 (Applied Biosystems).
Data interpretation
Primer AFF2-F anneals to the AFF2 sequence immediately upstream of the AFF2 CCG repeat and produces Hex-labeled TP-PCR amplicons, while primer FMR1-R anneals to the FMR1 sequence immediately downstream of the FMR1 CGG repeat and produces Fam-labeled TP-PCR products. Both TP-PCR reactions use the common triplet-primed (TP) and tail primers, and their products can be analyzed separately or together using different fluorescence detection channels. The TP primer was designed to anneal optimally to any five consecutive CCG trinucleotides on the AFF2 sense strand or the FMR1 antisense strand. When there is a continuous, uninterrupted stretch of more than five repeats, the electropherogram will exhibit a series of consecutive peaks 3 bp apart. If there is a non-CCG interruption, the electropherogram will exhibit a gap of about 18 bp between fluorescent peaks, corresponding to the absence of 5 fluorescent peaks (FIG. 1). The repeat size of an uninterrupted stretch is the total number of consecutive fluorescent peaks plus four (the first fluorescent peak is generated from the TP primer annealed to the first 5 repeats). For alleles with a non-CCG interruption, the repeat size is the total number of consecutive fluorescent peaks plus the total number of missing peaks, plus four. In heterozygous females, a drop in fluorescence intensity will be seen beyond a peak of a certain size; the repeat size of the smaller allele can be derived from the number of peaks before the drop, while the repeat size of the larger allele can be derived from the total number of peaks (data not shown). For the AFF2 TP-PCR reaction, the leftmost isolated peak is not generated from within the repeat sequence, but rather by annealing of the TP primer to the (CCG) 4 CCT (SEQ ID NO: 6) sequence immediately upstream of the CCG repeat, which is defined as beginning after the CTG trinucleotide (FIG. 1). The size and structure of the AFF2 alleles are further detailed in the Results.
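The peak-counting rule above can be sketched in a few lines. This is an illustrative helper, not part of the described assay software: the function name and the input format (a list of peak sizes in bp, with the leftmost isolated AFF2 peak already excluded by the caller) are assumptions. Gaps wider than one 3 bp step are counted as missing peaks.

```python
def repeat_size_from_peaks(peak_sizes_bp, spacing_bp=3):
    """Return (repeat_size, missing_peaks) for one allele's TP-PCR peak ladder.

    repeat_size = observed peaks + missing peaks + 4, as described in the text.
    The caller is assumed to have removed the leftmost isolated AFF2 peak.
    """
    peaks = sorted(peak_sizes_bp)
    if not peaks:
        raise ValueError("at least one peak is required")
    missing = 0
    for previous, current in zip(peaks, peaks[1:]):
        steps = round((current - previous) / spacing_bp)
        if steps < 1:
            raise ValueError("peaks are closer together than one repeat unit")
        missing += steps - 1  # an 18 bp gap is 6 steps, i.e. 5 missing peaks
    return len(peaks) + missing + 4, missing


if __name__ == "__main__":
    uninterrupted = [138 + 3 * i for i in range(14)]
    print(repeat_size_from_peaks(uninterrupted))
    interrupted = [138, 141, 159, 162, 165]
    print(repeat_size_from_peaks(interrupted))
```

Real capillary electrophoresis sizes are not exact integers, which is why the sketch rounds each gap to the nearest whole number of repeat units.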
Sequencing of AFF2 CCG repeats
AFF2 TP-PCR and standard/conventional PCR were performed as described above, except that all primers were unlabeled. TP-PCR and standard/conventional PCR products were purified using beads (Agencourt Bioscience, Beverly, Massachusetts, USA) according to the manufacturer's instructions and quantified using a Nanodrop™ 1000 spectrophotometer (Thermo Scientific™, Waltham, Massachusetts, USA). Each 20 μl sequencing reaction contained 10-50 ng of purified standard PCR or TP-PCR product, 1x Terminator Ready Reaction Mix (Applied Biosystems), 2.5x Q-Solution (Qiagen) and 3.2 pmol of AFF2-F primer. Initial denaturation was carried out at 96℃ for 1 minute, followed by 25 cycles of 98℃ for 10 seconds, 60℃ for 5 seconds and 60℃ for 4 minutes. The extension product was purified using an Oligo Clean & Concentrator™ column (Zymo Research, Irvine, California, USA) according to the manufacturer's instructions. The eluted purified extension product was vacuum dried for 5 minutes in a Savant concentrator (Thermo Scientific), resuspended in 12 μl Hi-Di™ formamide (Applied Biosystems) and resolved in a 3130xl Genetic Analyzer (Applied Biosystems) using a 36 cm capillary filled with POP-7™ polymer. The mixture was electrokinetically injected at 1.2 kV for 16 seconds and electrophoresed at 60℃ for 20 minutes. Post-CE analysis was performed using Sequencing Analysis software v6.0 (Applied Biosystems).
Results
Evaluation of duplex TP-PCR on normal samples and on FMR1 and AFF2 triplet repeat expansion-positive samples
The FRAXE folate-sensitive fragile site on the X chromosome is associated with several medical conditions including mental retardation, obsessive-compulsive disorder and premature ovarian failure. FRAXE fragility is caused by hyperexpansion of the CCG trinucleotide repeat in the 5' untranslated region (UTR) in exon 1 of the AFF2 (formerly FMR2) gene (ClinVar: VCV000010526.1). Hyperexpansion from the normal range of 6-30 repeats to >200 repeats is accompanied by CpG methylation of the repeat. This in turn silences expression of AFF2, a subunit of SEC-L2, which regulates transcription of several genes. AFF2 CCG repeat hyperexpansion is the genetic mutation that results in fragile XE nonsyndromic intellectual disability (FRAXE NSID; OMIM 309548), a mild (IQ 50-70) to borderline (IQ 70-85) intellectual disability estimated to affect one in every 50,000 to 100,000 males, as well as other cognitive/behavioral abnormalities, including obsessive-compulsive disorder. Paradoxically, alleles with fewer than 11 repeats or with microdeletions within or near the repeats are enriched in premature ovarian failure patients. Studies correlating intermediate (31-60) repeat sizes with Parkinson's disease, a neurodegenerative disease of the motor system, have been inconclusive.
FRAXE shares similar genetic features with FRAXA, a well-studied fragile site on the X chromosome whose clinical involvement is more clearly established. The FRAXA site contains CGG repeats within the 5' UTR in exon 1 of the FMR1 gene. Hyperexpansion of the FMR1 CGG repeat to >200 repeats is accompanied by CpG methylation and FMR1 gene silencing, which results in fragile X syndrome (FXS; OMIM 309550) (ClinVar: VCV000009972.1). FXS is the most common inherited monogenic cause of mental retardation, affecting about one in every 5000 males and one in every 4000 to 8000 females. Premutation alleles are associated with a number of behavioral characteristics including anxiety, obsessive-compulsive disorder, and depression. In addition, about one fifth of women carrying a premutation allele (55-200 CGG repeats) suffer from fragile X-related primary ovarian insufficiency (FXPOI). FMR1 premutations have been identified in 2% of sporadic POI and 14% of familial POI patients, making FXPOI the most common genetic cause of POI in euploid females. Furthermore, a subset of FMR1 premutation carriers will eventually develop fragile X-related tremor/ataxia syndrome (FXTAS), a neurodegenerative disorder characterized by ataxia, tremor and parkinsonism that affects about 1 in every 4000 men and 1 in every 7800 women over the age of 55. The FMR1 gene has also been associated with Parkinson's disease.
Although both the FRAXA and FRAXE fragile sites are closely related to mental disorders, the mild to borderline phenotype of FRAXE NSID compared to FXS can lead to under-ascertainment and under-diagnosis. The under-diagnosis may be due in part to the lack of a rapid, simple and inexpensive assay to screen for repeat expansion at the FRAXE site. Although standard PCR methods have been described for repeat expansion at the FRAXE locus, either as independent assays or as multiplexed assays, they fail to detect large premutation and full mutation alleles, detection of which still relies on Southern blot analysis. Because of the nonsyndromic nature of FRAXE (i.e., the lack of characteristic and consistent clinical manifestations, in contrast to, for example, the facial dysmorphism (prominent ears, chin and forehead, and long face) and macroorchidism present in FXS patients), molecular diagnostics are critical for confirming FRAXE.
The duplex TP-PCR assay was initially evaluated using 12 DNA samples. FIGS. 2 and 3 show the TP-PCR electropherogram patterns generated from normal males and females and from individuals with FMR1 and AFF2 premutations (PM) or full mutations (FM). The leftmost peak generated by the AFF2 TP-PCR reaction does not result from annealing of the TP primer within the AFF2 CCG repeat, which begins after the last CTG trinucleotide of the 5' flanking sequence. Instead, this isolated peak, which migrates as a distinct 111 bp fragment, results from annealing of the TP primer to the (CCG) 4 CCT (SEQ ID NO: 6) sequence immediately upstream of the repeat (FIG. 1). The first true peak, resulting from annealing of the TP primer to CCG repeats 1-5 of AFF2, appears as a distinct 138 bp fragment on the electropherogram. TP primer annealing also occurs at every other position in the repeat segment that contains five consecutive CCGs (CCGs 2-6, 3-7, etc.), with the final fluorescent peak resulting from annealing of the TP primer to the last five CCGs of the repeat segment. Thus, the repeat size of an AFF2 allele can be quickly and easily determined by counting the number of peaks generated from within the repeat segment (i.e., excluding the leftmost isolated peak) and adding four to that number.
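To make the peak layout concrete, the sketch below lists the nominal fragment sizes one would expect for an uninterrupted AFF2 allele, using the 111 bp isolated peak and the 138 bp first true peak quoted above. The function name is hypothetical, and the sizes are nominal; measured capillary electrophoresis sizes will deviate somewhat.

```python
def expected_aff2_peaks(n_repeats, first_peak_bp=138, isolated_peak_bp=111):
    """Nominal TP-PCR peak sizes (bp) for an uninterrupted AFF2 allele.

    The TP primer needs five consecutive CCGs, so an allele with n repeats
    yields n - 4 peaks from within the repeat, spaced 3 bp apart, preceded
    by the isolated peak from the upstream (CCG)4CCT sequence.
    """
    if n_repeats < 5:
        raise ValueError("fewer than five repeats gives no peak within the repeat")
    ladder = [first_peak_bp + 3 * k for k in range(n_repeats - 4)]
    return [isolated_peak_bp] + ladder


if __name__ == "__main__":
    peaks = expected_aff2_peaks(18)
    print(len(peaks) - 1, "peaks within the repeat; largest at", peaks[-1], "bp")
```

Counting the peaks after the isolated one and adding four recovers the repeat size, which is the rule stated in the text.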
The absence of fluorescent peaks beyond 55 repeats indicates no expansion, whereas consecutive fluorescent peaks extending beyond 55 and up to 200 repeats indicate the presence of a premutation (PM) allele, and fluorescent peaks extending beyond 200 repeats indicate the presence of a full mutation (FM). These results indicate that the duplex TP-PCR assay successfully detected and accurately identified premutations and full mutations of both the FMR1 CGG and AFF2 CCG repeats in male and female DNA samples (FIGS. 2 and 3).
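The interpretation rule in the preceding paragraph amounts to a three-way threshold on the largest repeat count reached by the peak ladder. A minimal sketch follows; the function name and return labels are illustrative assumptions, and only the 55 and 200 repeat cut-offs come from the text.

```python
def classify_expansion(largest_repeat_count):
    """Classify an allele from the largest repeat count reached by its peaks."""
    if largest_repeat_count < 0:
        raise ValueError("repeat count cannot be negative")
    if largest_repeat_count > 200:
        return "full mutation (FM)"
    if largest_repeat_count > 55:
        return "premutation (PM)"
    return "no expansion"


if __name__ == "__main__":
    for count in (29, 55, 56, 200, 201):
        print(count, classify_expansion(count))
```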
Characterization of population allele distribution and repeat structure
A total of 466 male DNA samples (including DNA from 161 Chinese, 158 Malays, 89 Indians, 44 Caucasians and 14 African Americans) were genotyped using duplex TP-PCR to determine the allele size distribution (FIG. 4 and Table 2). Different allele distributions were observed in the various ethnic groups, with a modal AFF2 repeat size of 18 for Chinese and Malays (32.3% and 27.8%, respectively), but 15 for Caucasians (40.9%), African Americans (50.0%), and Indians (32.6%). The range of allele sizes was wider for Chinese (5-31) and Malays (6-37) than for Indians (10-27), Caucasians (9-26) and African Americans (14-28), which may be due in part to the larger sample sizes for Chinese and Malays. Using the conventional classification, 24 samples had minimal alleles (<11 repeats), 440 samples had normal alleles (11-30 repeats), and the remaining 2 samples had intermediate alleles (31-60 repeats). No premutation or full mutation alleles were observed. These results are similar to earlier studies of AFF2 allele distribution in Han Chinese (mode 18, range 9-26) and New York City Caucasian (mode 16, range 8-34) populations (FIGS. 4A-C).
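The size classes and modal repeat size reported above are straightforward to tabulate from a list of genotyped repeat sizes. The sketch below uses the three size bins stated in the text; the function names, the catch-all label for sizes above 60 and the example input are assumptions made for illustration and are not data from the study.

```python
from collections import Counter


def aff2_size_class(repeats):
    """Bin an AFF2 CCG repeat size using the conventional classes in the text."""
    if repeats < 11:
        return "minimal"
    if repeats <= 30:
        return "normal"
    if repeats <= 60:
        return "intermediate"
    return "above intermediate"  # assumption: not subdivided in this sketch


def summarize_alleles(repeat_sizes):
    """Return (class counts, modal repeat size, modal frequency)."""
    if not repeat_sizes:
        raise ValueError("no alleles supplied")
    classes = Counter(aff2_size_class(n) for n in repeat_sizes)
    mode, count = Counter(repeat_sizes).most_common(1)[0]
    return classes, mode, count / len(repeat_sizes)


if __name__ == "__main__":
    # Made-up repeat sizes for illustration only.
    classes, mode, freq = summarize_alleles([18, 18, 15, 9, 31])
    print(dict(classes), mode, round(freq, 2))
```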
Chromosomes carrying a combination of 29 FMR1 CGG repeats and 18 AFF2 CCG repeats were the most common (FIG. 4D, top panel). Three AFF2 TP-PCR electropherogram patterns were observed among the 466 male population DNA samples screened. The most common pattern, which matches the sequence annotated in Genome Reference Consortium Human Build 38 (GRCh38), was found in 459 samples (98.1%) and represented most of the normal alleles and all of the intermediate alleles, while the other two patterns were shown by Sanger sequencing to be caused by the presence of two single nucleotide polymorphism (SNP) variants (FIG. 4D, bottom panel).
One of the SNP variants, a T>C substitution at chrX:148,500,637 (rs868914124, GRCh38), converts the CTG trinucleotide at the 5' start of the AFF2 CCG repeat to CCG. Whereas the AFF2 trinucleotide repeat in the rs868914124 (T) common allele starts after the last CTG trinucleotide of the 5' flanking sequence (FIG. 5A), this CTG trinucleotide becomes a CCG trinucleotide in the rs868914124 (C) variant allele, extending the AFF2 trinucleotide repeat stretch by 2 CCGs, or 6 bp, upstream so that it starts after the last CAG trinucleotide of the 5' flanking sequence (FIG. 5B). Thus, the first TP-PCR product generated from within the repeat stretch of the rs868914124 (C) variant allele appears as a distinct 132 bp fluorescent peak (FIG. 5B), in contrast to the 138 bp first fluorescent peak from the rs868914124 (T) common allele (FIG. 5A). In the TP-PCR electropherogram, this size difference appears as a narrower gap of 20 bp between the leftmost isolated peak and the first peak of the AFF2 CCG repeat for the rs868914124 (C) variant allele (FIG. 5B), compared to the wider gap of 26 bp for the rs868914124 (T) common allele (FIGS. 5A, C).
The rs868914124 (C) variant was observed in 8 of 466 AFF2 alleles (1.72%) (FIG. 4D, bottom panel, line B), 6 of which were Malay alleles. Three of the 8 samples with the rs868914124 (C) variant carried a minimal allele (<11 repeats), compared to 21 of the 458 samples carrying rs868914124 (T). Using Fisher's exact test (two-tailed), this rare variant was observed to be enriched in the Malay group (odds ratio=6.01; 95% confidence interval 1.06-61.7; p=0.021) and among minimal alleles (odds ratio=12.3; confidence interval 1.79-68.4; p=0.006). Interestingly, all 6 AFF2 expanded alleles (5 full mutations and 1 premutation) carried the rare rs868914124 (C) variant, although the expanded alleles of the affected male and his maternal uncle, as well as those of the father and daughter pair, are considered identical by descent (FIGS. 5D, 6 and 7).
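The two-tailed Fisher's exact test on the minimal-allele counts can be reproduced from the numbers in the text (3 of 8 variant carriers versus 21 of 458 non-carriers). The sketch below is a plain standard-library implementation with hypothetical function names, not the software used in the study. It reports the simple sample odds ratio (ad/bc), which can differ slightly from a reported odds ratio if a different estimator was used.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables with the same
    margins that are no more likely than the observed table.
    """
    row1, col1, n = a + b, a + c, a + b + c + d
    denom = comb(n, row1)

    def prob(x):
        return comb(col1, x) * comb(n - col1, row1 - x) / denom

    lowest = max(0, row1 - (n - col1))
    highest = min(row1, col1)
    p_observed = prob(a)
    p_value = sum(
        prob(x)
        for x in range(lowest, highest + 1)
        if prob(x) <= p_observed * (1 + 1e-9)
    )
    return min(p_value, 1.0)


def sample_odds_ratio(a, b, c, d):
    """Unconditional (sample) odds ratio ad/bc."""
    return (a * d) / (b * c)


if __name__ == "__main__":
    # Rows: rs868914124 (C) vs (T); columns: minimal allele vs other.
    table = (3, 5, 21, 437)
    print(f"p = {fisher_exact_two_sided(*table):.4f}")
    print(f"sample odds ratio = {sample_odds_ratio(*table):.2f}")
```

The same function applies to the Malay enrichment, using 6 of 8 variant carriers versus 152 of 458 non-carriers (158 Malay samples in total).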
Unlike the AGG interruptions within the CGG repeat that are commonly observed in normal FMR1 alleles, we observed only one normal AFF2 allele (0.21%) that contained a non-CCG interruption in its repeat segment, namely a CTG interruption at the fifth repeat position (FIG. 4D, bottom panel, line C and FIG. 5C). This interruption is caused by a C>T substitution at chrX:148,500,652 (rs1389911365, GRCh38). Annealing of the TP primer at a location that includes such an interruption results in mismatched pairing and failure to generate a PCR product from that location. In the TP-PCR electropherogram, failed PCR at the primer mismatch positions appears as a gap without peaks (FIG. 5C).
Interestingly, novel non-CCG interruptions were also observed in all 6 AFF2 expansion-positive samples. An interruption in the repeat was initially suspected on the basis of the TP-PCR electropherogram patterns of four AFF2 FM males, which showed gaps between peak clusters or missing peaks (FIG. 3). Sanger sequencing revealed one or more non-CCG interruptions at the 5' end of the repeat, each carrying the sequence CCTGTGCAG, identical to the 9-nucleotide stretch immediately upstream of the repeat. The number of interruptions varied from one (FIG. 6A) to more than four (FIG. 6B). Their positions also varied, for example from the 8th to the 10th repeat position (FIG. 6A) or from the 6th to the 8th position (FIG. 7).
We observed that in the father and daughter pair (FIG. 7), the AFF2 CCG repeat expansion was a full mutation in the father (DNA_25926) and a premutation in the daughter (DNA_3802), consistent with previously documented contraction upon transmission. Although the AFF2 full mutation alleles (FIG. 6) of the nephew (FX0229) and maternal uncle (FX0230) are considered identical by descent, their repeat structures differ owing to the presence of additional non-CCG interruptions in the maternal uncle. The exact cause or mechanism of this difference has not been studied.
Blind validation of the duplex TP-PCR assay on previously characterized samples
The blind validation of the duplex TP-PCR assay included 82 archived and previously characterized genomic DNA samples. AFF2 CCG repeats were sized as described in Materials and Methods. The duplex TP-PCR assay accurately classified all 40 normal, 17 FMR1 PM, 23 FMR1 FM and 2 AFF2 FM samples included in the test (Table 3).
Table 2. AFF2 CCG repeat allele frequencies in five populations.
CH, Chinese; ML, Malay; IN, Indian; CAU, Caucasian; AA, African American
Table 3. Consistency analysis of samples with or without FMR1 or AFF2 repeat expansion.
Example 2: spinocerebellar ataxia (SCA)
Materials and methods
Biological sample
Genomic DNA and cell lines were obtained from Coriell Cell Repositories (CCR; Coriell Institute for Medical Research, Camden, NJ, USA). NA06926, NA13536 and NA13537 each carry an expanded ATXN1 CAG repeat, NA14982 carries an expanded ATXN2 CAG repeat, GM06151 carries an expanded ATXN3 CAG repeat, NA03562 carries an expanded ATXN7 CAG repeat, and NA13716 and NA13717 each carry an expanded ATN1 CAG repeat. GM16243 carries an expanded FXN GAA repeat, GM06907 carries an expanded FMR1 CGG repeat, and GM05164 and GM06075 each carry an expanded DMPK CTG repeat. Two clinical DNA samples, 160920 and 183254, carrying expanded CACNA1A CAG repeats and PPP2R2B CAG repeats, respectively, were obtained from KK Women's and Children's Hospital. Sixty archived DNA samples of known genotype were obtained from Siriraj Hospital-Mahidol University and included in the blind evaluation of assay accuracy.
Heptaplex TP-PCR
Heptaplex TP-PCR was performed in a 25 μl reaction containing: 10 ng of genomic DNA, 1.5x Q-Solution (Qiagen, Hilden, Germany), 1x PCR buffer (Qiagen) containing 1.5 mmol/L MgCl2, a deoxyribonucleotide triphosphate (dNTP) mixture consisting of dATP, dTTP, dCTP and dGTP at 0.2 mmol/L each (Roche Applied Science, Penzberg, Germany) and 2 units of HotStarTaq DNA polymerase (Qiagen). Eight primers (seven fluorescently labeled locus-specific primers and one universal TP primer, TP-R) were included at their respective optimal working concentrations. Primer sequences, fluorophore labels and primer concentrations are shown in Tables 4 and 5. Thermocycling was performed on a SimpliAmp thermal cycler (Applied Biosystems-Thermo Fisher Scientific, Foster City, CA, USA) and consisted of an initial polymerase activation at 95℃ for 15 minutes, followed by 35 cycles of 98℃ for 45 seconds, 60℃ for 1 minute and 72℃ for 2 minutes, and a final extension at 72℃ for 5 minutes.
Capillary Electrophoresis (CE)
A 1-μl aliquot of the fluorescently labeled TP-PCR product was mixed with 9 μl of Hi-Di™ Formamide (Applied Biosystems) and 0.5 μl of GeneScan™ 500 LIZ™ dye size standard (Applied Biosystems), denatured at 95 °C for 5 minutes, rapidly cooled to 4 °C and resolved on a 3130xl Genetic Analyzer (Applied Biosystems) using 36-cm capillaries filled with POP-7™ polymer. Samples were electrokinetically injected at 1 kV for 15 seconds and electrophoresed at 60 °C for 40 minutes. GeneScan analysis was performed using GeneMapper 5.0 software (Applied Biosystems). Amplified products from each locus can be identified by their product size range and peak color. Heptaplex TP-PCR products exhibit electrophoretic peaks in four different colors and can be analyzed either with all four fluorescence detection channels open together, or one channel at a time with the other channels closed.
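Identifying a locus from peak color and product size range amounts to a simple lookup. The following Python sketch shows one way this could be done; the locus names, colors and size windows are hypothetical placeholders, since the real values are those of Tables 4 and 5, which are not reproduced here.

```python
from typing import Dict, Optional, Tuple

# Hypothetical (color, min bp, max bp) windows; placeholders only,
# not the assay's actual design values.
LOCUS_WINDOWS: Dict[str, Tuple[str, float, float]] = {
    "locus_A": ("blue", 120.0, 185.0),
    "locus_B": ("blue", 235.0, 335.0),
    "locus_C": ("green", 115.0, 210.0),
}


def assign_locus(color: str, size_bp: float) -> Optional[str]:
    """Return the locus whose color and size window match the peak, if any."""
    for locus, (locus_color, low, high) in LOCUS_WINDOWS.items():
        if color == locus_color and low <= size_bp <= high:
            return locus
    return None


print(assign_locus("blue", 150.0))   # locus_A
print(assign_locus("green", 150.0))  # locus_C
print(assign_locus("blue", 200.0))   # None
```

A peak that falls outside every window returns `None` in this sketch, which makes it easy to flag for manual review.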
Results
Detection of expanded CAG repeats by single-tube heptaplex TP-PCR
Spinocerebellar ataxias (SCAs) are neurodegenerative diseases that cause degeneration of the cerebellum, and sometimes the spinal cord, and are mainly characterized by gait ataxia, poor hand-eye coordination and dysarthria. These autosomal dominant diseases are genetically heterogeneous and can be caused by conventional mutations as well as by repeat expansion mutations. The global prevalence of SCA averages 2.7 cases per 100,000 individuals, with SCA3 being the most common. There are more than 40 genetically distinct SCAs, of which at least 12 are caused by repeat expansions. Among these, SCA1, SCA2, SCA3, SCA6, SCA7, SCA12 and SCA17 are caused by abnormal expansion of a CAG trinucleotide repeat. Another disease, dentatorubral-pallidoluysian atrophy (DRPLA), is also caused by a CAG repeat expansion and is classified with the SCAs because of its heterogeneity and its clinical overlap with other SCAs.
SCA1 is caused by a CAG repeat expansion in exon 8 of the ATXN1 gene on chromosome 6p22.3. Non-expanded (normal) ATXN1 alleles contain 6 to 44 CAGs with CAT interruptions. Alleles containing 36 to 38 CAGs are mutable normal alleles, which may expand into the pathogenic size range when transmitted to the next generation but are not themselves associated with clinical symptoms, whereas those containing ≥39 CAGs are pathogenic. SCA2 is caused by a CAG repeat expansion in exon 1 of the ATXN2 gene on chromosome 12q24.12. Normal ATXN2 alleles contain 14 to 31 pure or CAA-interrupted CAGs, and those containing 32 CAGs are intermediate alleles of uncertain clinical significance. Alleles containing 33 to 500 CAGs are pathogenic. SCA3, or Machado-Joseph disease, is caused by a CAG repeat expansion in exon 10 of the ATXN3 gene on chromosome 14q32.12. Normal ATXN3 alleles contain 12 to 44 CAGs, those containing 45 to 59 CAGs are expansion-prone intermediate alleles, and those with 60 to 87 CAGs are full-penetrance alleles. SCA6 is caused by a CAG repeat expansion in exon 47 of the CACNA1A gene on chromosome 19p13.13. Normal CACNA1A alleles contain ≤18 CAGs and full-penetrance alleles contain 20 to 33 CAGs. CACNA1A alleles containing 19 CAGs are of uncertain clinical significance. SCA7 is caused by a CAG repeat expansion in exon 3 of the ATXN7 gene on chromosome 3p14.1. Normal ATXN7 alleles contain 7 to 27 CAGs, those containing 28 to 33 CAGs are mutable normal alleles, and alleles with 34 or more CAGs are pathogenic. SCA12 is caused by a CAG repeat expansion in the 5' region of the PPP2R2B gene on chromosome 5q32. Normal PPP2R2B alleles contain 7 to 32 CAGs, while those containing 51 to 78 CAGs are full-penetrance alleles. DRPLA is caused by a CAG repeat expansion in exon 5 of the ATN1 gene on chromosome 12p13.31. Normal ATN1 alleles contain 6 to 35 CAGs, while those containing >48 CAGs are full-penetrance alleles.
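The size thresholds above lend themselves to a simple lookup. The following Python sketch is illustrative only: `CAG_RANGES` and `classify_cag` are hypothetical names, the ranges are transcribed from the paragraph above, ATXN1 is left out because its classification also depends on CAT interruptions, the normal CACNA1A range is read as up to 18 CAGs (an assumption), and repeat counts that fall between the stated ranges are reported as unclassified rather than guessed.

```python
from typing import Dict, List, Optional, Tuple

# (lowest count, highest count or None for open-ended, label).
# Transcribed from the text; ATXN1 is omitted (interruption-dependent).
Range = Tuple[int, Optional[int], str]

CAG_RANGES: Dict[str, List[Range]] = {
    "ATXN2": [(14, 31, "normal"), (32, 32, "intermediate"),
              (33, 500, "pathogenic")],
    "ATXN3": [(12, 44, "normal"), (45, 59, "intermediate"),
              (60, 87, "full-penetrance")],
    # Assumption: the normal CACNA1A range is read as "up to 18 CAGs".
    "CACNA1A": [(0, 18, "normal"), (19, 19, "uncertain significance"),
                (20, 33, "full-penetrance")],
    "ATXN7": [(7, 27, "normal"), (28, 33, "mutable normal"),
              (34, None, "pathogenic")],
    "PPP2R2B": [(7, 32, "normal"), (51, 78, "full-penetrance")],
    # The text gives "> 48" for ATN1, encoded here as 49 and above.
    "ATN1": [(6, 35, "normal"), (49, None, "full-penetrance")],
}

UNCLASSIFIED = "not classified in the text"


def classify_cag(gene: str, repeats: int) -> str:
    """Return the label of the stated range that contains `repeats`."""
    for low, high, label in CAG_RANGES[gene]:
        if repeats >= low and (high is None or repeats <= high):
            return label
    return UNCLASSIFIED


print(classify_cag("ATXN3", 70))    # full-penetrance
print(classify_cag("PPP2R2B", 40))  # not classified in the text
```

Returning an explicit "unclassified" label for gaps such as 33 to 50 CAGs at PPP2R2B keeps the sketch from asserting anything the text does not state.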
The various SCA types often show highly variable phenotypes and are difficult to distinguish by signs and symptoms alone because of extensive clinical overlap and accompanying non-ataxia phenotypes. Molecular genetic testing is therefore necessary, and is the only way to identify the pathogenic mutation and confirm disease status in symptomatic individuals. Identifying the pathogenic gene can be time-consuming and expensive, because without a known family history or disease-specific symptoms it is difficult to decide which gene to test. The European Molecular Genetics Quality Network (EMQN) recommends that all laboratories offer, at a minimum, testing for the five most common SCA types (SCA1, SCA2, SCA3, SCA6 and SCA7), with testing for other types depending on local prevalence.
Detection and sizing of CAG repeats in the different SCAs rely on standard PCR or triplet-primed PCR (TP-PCR) followed by capillary electrophoresis. When only a single fragment from a normal allele is detected, repeat sizing by standard PCR must typically be supplemented by low-throughput, labor-intensive Southern blot analysis to confirm the presence of a very large allele and estimate its repeat size, or by enzymatic digestion to determine whether CAT interruptions are present in an expanded ATXN1 allele. TP-PCR, first described by Warner et al. [24], has been widely used to detect the repeat expansions underlying many repeat expansion diseases, regardless of expansion size. It uses a locus-specific flanking primer, a triplet-primed (TP) primer and a tail primer, and can detect very large expansions, accurately size all normal alleles and moderate expansions, and identify interruptions within the repeat tract from distinct TP-PCR electropherogram patterns.
Because several rounds of genetic testing may be needed before the pathogenic gene is identified in an SCA patient, screening the trinucleotide repeats of different disease genes simultaneously can speed up identification of the pathogenic gene and save the cost of multiple tests. We report the development of a new single-tube multiplex TP-PCR assay that simultaneously screens for expansion mutations at seven loci responsible for some of the most common SCAs (SCA1, SCA2, SCA3, SCA6, SCA7, SCA12 and DRPLA). The heptaplex assay uses locus-specific flanking primers that are differentially labeled for each repeat locus, together with a universal primer that anneals within the CAG repeat.
The SCA heptaplex TP-PCR assay uses fluorescently labeled locus-specific flanking primers located upstream of the CAG repeat of each disease gene and a universal TP primer that anneals within all CAG repeats, so that CAG repeats from seven different SCA loci can be co-amplified in a single reaction tube. Each flanking primer is labeled with one of four fluorophores (NED, VIC, FAM or PET). In addition, each flanking primer lies at a different distance from its repeat tract, so that the electrophoresis peaks of normal alleles from one repeat locus do not overlap with those of normal alleles from another repeat locus labeled with the same fluorophore (fig. 8). The combined effect of the differential fluorophore labeling and positioning of the seven flanking primers is that the TP-PCR products generated from the normal alleles of each repeat locus do not overlap and can be distinguished after capillary electrophoresis. The number of repeats is determined by counting the electrophoresis peaks from the left of the electropherogram, with the first peak representing the first five pure CAGs of the repeat tract (fig. 8).
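The non-overlap requirement for loci that share a fluorophore can be checked numerically at design time. The Python sketch below is a minimal illustration under stated assumptions: the locus names, dye assignments, offsets and repeat ranges are hypothetical placeholders rather than the assay's actual design, and product size is modeled simply as a fixed offset plus 3 bp per repeat.

```python
from itertools import combinations
from typing import Dict, List, Tuple

# All values below are hypothetical placeholders, not the real design.
# offset_bp: product length contributed by everything outside the repeat.
LOCI: Dict[str, dict] = {
    "locus_A": {"dye": "FAM", "offset_bp": 100, "normal_repeats": (7, 27)},
    "locus_B": {"dye": "FAM", "offset_bp": 200, "normal_repeats": (12, 44)},
    "locus_C": {"dye": "VIC", "offset_bp": 100, "normal_repeats": (6, 35)},
}


def normal_window(locus: dict) -> Tuple[int, int]:
    """Size range (bp) of normal-allele products, at 3 bp per repeat."""
    low, high = locus["normal_repeats"]
    return locus["offset_bp"] + 3 * low, locus["offset_bp"] + 3 * high


def same_dye_clashes(loci: Dict[str, dict]) -> List[Tuple[str, str]]:
    """List pairs of loci that share a dye and whose windows overlap."""
    clashes = []
    for (name_a, a), (name_b, b) in combinations(loci.items(), 2):
        if a["dye"] != b["dye"]:
            continue  # different dyes are separated by detection channel
        low_a, high_a = normal_window(a)
        low_b, high_b = normal_window(b)
        if low_a <= high_b and low_b <= high_a:
            clashes.append((name_a, name_b))
    return clashes


print(same_dye_clashes(LOCI))  # []
```

In this toy layout, moving the flanking primer of `locus_B` closer to its repeat (for example an offset of 120 bp) would make its normal window overlap that of `locus_A`, which is exactly the situation the primer positioning is meant to avoid.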
The heptaplex TP-PCR assay was first tested on samples of known genotype to confirm locus-specific annealing of each flanking primer and detection of expanded CAG repeats at each repeat locus. Because the TP primer can anneal to any five consecutive CAGs within the repeat, TP-PCR produces a mixture of fragments, each differing from the next by one triplet. The smallest amplification product at every locus contains five CAG triplets, except at ATXN3, where it contains 11 triplets because the wild-type ATXN3 allele has CAA at positions 3 and 6 and AAG at position 4 (fig. 8A). The electropherogram shows a series of consecutive ladder peaks (fig. 8B) because the TP primer anneals at multiple positions within the uninterrupted CAG repeat (fig. 8A). Each successive electrophoresis peak represents a TP-PCR product that is one repeat, or three base pairs, larger, so the repeat size of an allele can be derived by counting peaks. Non-CAG interruptions in the middle of the repeat tract, which can occur in ATXN1 and ATXN2 alleles, prevent the TP primer from annealing efficiently at those positions, leaving a gap without fluorescent peaks between clusters of peaks. Expansion-negative samples produce TP-PCR products from two non-expanded (normal) alleles that fall within the normal repeat size range (fig. 9), whereas expansion-positive samples additionally produce longer TP-PCR products from the expanded allele (fig. 10). A sample with a repeat size above the upper limit of the normal allele size range at a given repeat locus is indicative of an expansion at that locus (fig. 10).
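The peak-counting logic described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the analysis software's method: the function names are hypothetical, the smallest detected peak is assumed to be the true first ladder peak (five repeats, or 11 for ATXN3), and repeat counts are derived from 3-bp spacing rather than raw peak counting so that gaps in the ladder do not shift the count.

```python
from typing import List, Sequence

BP_PER_REPEAT = 3


def repeats_from_peaks(peak_sizes_bp: Sequence[float],
                       first_peak_repeats: int = 5) -> List[int]:
    """Map ladder peak sizes to repeat counts, anchored on the smallest peak.

    first_peak_repeats is 5 for most loci and 11 for ATXN3, per the text.
    """
    sizes = sorted(peak_sizes_bp)
    first = sizes[0]
    return [first_peak_repeats + round((s - first) / BP_PER_REPEAT)
            for s in sizes]


def ladder_gaps(repeat_counts: Sequence[int]) -> List[int]:
    """Repeat counts with no peak between the smallest and largest peak."""
    present = set(repeat_counts)
    return [n for n in range(min(present), max(present) + 1)
            if n not in present]


def exceeds_normal_range(repeat_counts: Sequence[int],
                         normal_upper_limit: int) -> bool:
    """True if the largest peak lies above the normal upper limit."""
    return max(repeat_counts) > normal_upper_limit


# Made-up peak sizes (bp) with a gap in the ladder.
peaks = [100.0, 103.1, 105.9, 121.0, 124.1]
counts = repeats_from_peaks(peaks)
print(counts)                            # [5, 6, 7, 12, 13]
print(ladder_gaps(counts))               # [8, 9, 10, 11]
print(exceeds_normal_range(counts, 44))  # False
```

Real capillary electrophoresis sizes deviate from ideal 3-bp steps, so any production analysis would need calibrated sizing; the rounding here only absorbs small deviations.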
Blinded validation on clinical samples
To assess the ability of the heptaplex TP-PCR assay to accurately identify expansions at the seven repeat loci in samples from affected patients, a blinded analysis was performed on 60 archived clinical DNA samples of known genotype. After TP-PCR and capillary electrophoresis, one fluorescence detection channel was opened at a time so that amplification products labeled with different fluorophores could be analyzed separately. This allowed samples showing electrophoresis peaks beyond the upper limit of the normal allele size range at any of the seven SCA loci to be clearly identified. In all 31 DNA samples positive for one of the seven SCA repeat expansions (5 SCA1-positive, 7 SCA2-positive, 12 SCA3-positive, 5 SCA6-positive, 1 SCA7-positive and 1 DRPLA-positive), the repeat expansion at the expected SCA repeat locus was correctly detected by heptaplex TP-PCR (table 6). Multiplex PCR using the flanking primers and the universal TP primer for the relevant repeat loci was performed on all screen-positive samples to confirm the results (data not shown). No expansion was detected in any of the 29 expansion-negative DNA samples.
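The counts reported above can be tallied and turned into the usual concordance measures. The Python sketch below is a derived illustration rather than a result stated in the original: it takes the per-disease counts from the paragraph above, treats all 31 positives as detected and all 29 negatives as expansion-free (as the text reports), and computes sensitivity and specificity from those figures; the variable and function names are hypothetical.

```python
# Counts of expansion-positive samples by disease, as reported in the text.
EXPANSION_POSITIVE = {
    "SCA1": 5, "SCA2": 7, "SCA3": 12, "SCA6": 5, "SCA7": 1, "DRPLA": 1,
}
EXPANSION_NEGATIVE = 29


def sensitivity(true_pos: int, false_neg: int) -> float:
    """Fraction of expansion-positive samples that were detected."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg: int, false_pos: int) -> float:
    """Fraction of expansion-negative samples with no expansion called."""
    return true_neg / (true_neg + false_pos)


true_pos = sum(EXPANSION_POSITIVE.values())  # all positives were detected
false_neg = 0
true_neg = EXPANSION_NEGATIVE                # no expansion called in negatives
false_pos = 0

print(true_pos, true_pos + true_neg)         # 31 60
print(sensitivity(true_pos, false_neg))      # 1.0
print(specificity(true_neg, false_pos))      # 1.0
```

With only 60 samples, and a single positive each for SCA7 and DRPLA, these point estimates carry wide uncertainty.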
Table 4. Heptaplex TP-PCR primers and expected TP-PCR product sizes.
Table 5. Expected TP-PCR product sizes for the seven SCA loci.
Table 6. Disease status and CAG repeat sizes for 60 archived DNA samples of known genotype.
References
Zhong, N. et al., A survey of FRAXE allele sizes in three populations. Am J Med Genet, 1996. 64(2): p. 415-9.
Claims (21)
1. A method of detecting the presence or absence of a repeat expansion sequence in two or more genes in a nucleic acid sample obtained from a subject, the method comprising:
i) contacting the nucleic acid sample under amplification conditions for each of the two or more genes with: a) a gene-specific primer that specifically binds to a different target sequence of each gene, wherein the genes comprise nucleotide repeats, and wherein the different target sequences are upstream or downstream of the nucleotide repeats, and b) a universal primer that binds to a common target sequence shared by the two or more genes, wherein the common target sequence is located within the nucleotide repeats and on the strand opposite to that bound by the gene-specific primers; wherein the gene-specific primers and the universal primers are capable of producing one or more amplification products from each gene; and
ii) analyzing the amplification product.
2. The method of claim 1, wherein the method comprises analyzing the amplification product using a size separation technique or a sequencing technique.
3. The method of claim 2, wherein the size separation technique is an electrophoresis-based technique.
4. A method according to any one of claims 1 to 3, wherein the amplification products are separated according to size.
5. The method of any one of claims 1 to 4, wherein a change in the size of a gene amplification product as compared to a reference is indicative of the presence of a repeat expansion sequence in the gene.
6. The method of any one of claims 1 to 5, wherein the repeat expansion sequence is a trinucleotide repeat expansion sequence.
7. The method of any one of claims 1 to 6, wherein the trinucleotide repeat sequence is selected from (CGG)n, (CCG)n, (CAG)n, (CTG)n, (GCC)n, (GGC)n, (GAA)n or (TTC)n, wherein n is 2 to 200 or more.
8. The method of any one of claims 1 to 7, wherein the universal primer comprises a unique 5' tail sequence.
9. The method of claim 8, wherein the method comprises providing a tail primer that specifically binds to a unique 5' tail sequence of the universal primer.
10. The method of any one of claims 1 to 9, wherein the universal primer binds to a common target sequence comprising or consisting of 5, 6, 7, 8, 9, 10 or more consecutive trinucleotide repeats.
11. The method of any one of claims 1 to 10, wherein the gene-specific primer is labeled.
12. The method of any one of claims 1 to 11, wherein the two or more genes consist of FMR1 and AFF2 or comprise FMR1 and AFF2.
13. The method of claim 12, wherein the method comprises contacting the nucleic acid sample with:
a) a gene-specific primer comprising or consisting of a nucleic acid having at least 90% sequence identity to the nucleic acid sequence of SEQ ID NO: 1;
b) a gene-specific primer comprising or consisting of a nucleic acid having at least 90% sequence identity to the nucleic acid sequence of SEQ ID NO: 2; and
c) a universal primer comprising or consisting of a nucleic acid sequence having at least 90% sequence identity to the nucleic acid sequence of (CGG)5 (SEQ ID NO: 15) or SEQ ID NO: 3.
14. The method according to any one of claims 1 to 11, wherein the two or more genes are selected from the group consisting of SCA1, SCA2, SCA3, SCA6, SCA7, SCA12 and DRPLA.
15. The method of claim 14, wherein the method comprises contacting the nucleic acid sample with:
a) a gene-specific primer comprising or consisting of a nucleic acid having at least 90% sequence identity to the nucleic acid sequence of SEQ ID NO: 7;
b) a gene-specific primer comprising or consisting of a nucleic acid having at least 90% sequence identity to the nucleic acid sequence of SEQ ID NO: 8;
c) a gene-specific primer comprising or consisting of a nucleic acid having at least 90% sequence identity to the nucleic acid sequence of SEQ ID NO: 9;
d) a gene-specific primer comprising or consisting of a nucleic acid having at least 90% sequence identity to the nucleic acid sequence of SEQ ID NO: 10;
e) a gene-specific primer comprising or consisting of a nucleic acid having at least 90% sequence identity to the nucleic acid sequence of SEQ ID NO: 11;
f) a gene-specific primer comprising or consisting of a nucleic acid having at least 90% sequence identity to the nucleic acid sequence of SEQ ID NO: 12; and/or
g) a gene-specific primer comprising or consisting of a nucleic acid having at least 90% sequence identity to the nucleic acid sequence of SEQ ID NO: 13, and
h) a universal primer comprising or consisting of a nucleic acid sequence having at least 90% sequence identity to the nucleic acid sequence of (CTG)5 (SEQ ID NO: 16) or SEQ ID NO: 14.
16. A kit for detecting the presence or absence of a repeat expansion sequence in two or more genes in a nucleic acid sample obtained from a subject, the kit comprising:
a) a gene-specific primer that specifically binds to a different target sequence of each of two or more genes, wherein each gene comprises a nucleotide repeat sequence, and wherein the different target sequence is upstream or downstream of the nucleotide repeat sequence in each gene; and
b) a universal primer that binds to a common target sequence shared by the two or more genes, wherein the common target sequence is located within the nucleotide repeat sequence and on the strand opposite to that bound by the gene-specific primer of each gene;
wherein the gene-specific primers and the universal primers are capable of producing one or more amplification products from each gene.
17. A composition comprising a nucleic acid sample obtained from a subject, the composition comprising a) a gene-specific primer that specifically binds to a different target sequence of each of two or more genes, wherein each gene comprises a nucleotide repeat sequence, and wherein the different target sequence is upstream or downstream of the nucleotide repeat sequence in each gene; and b) a universal primer that binds to a common target sequence shared by the two or more genes, wherein the common target sequence is located within the nucleotide repeat sequence and on the strand opposite to that bound by the gene-specific primer of each gene;
wherein the gene-specific primers and the universal primers are capable of producing one or more amplification products from each gene.
18. A method of screening a subject for one or more repeat expansion diseases, the method comprising:
i) contacting a nucleic acid sample from the subject under amplification conditions with: a) a gene-specific primer that specifically binds to a different target sequence of each of two or more genes, wherein each gene comprises a nucleotide repeat sequence, and wherein the different target sequence is upstream or downstream of the nucleotide repeat sequence in each gene; and b) a universal primer that binds to a common target sequence shared by the two or more genes, wherein the common target sequence is located within the nucleotide repeat sequence and on the strand opposite to that bound by the gene-specific primer of each gene; wherein the gene-specific primers and the universal primers are capable of producing one or more amplification products from each gene; and
ii) analyzing the amplification products to screen the subject for the one or more repeat expansion diseases.
19. The method of claim 18, wherein the one or more repeat expansion diseases comprise or consist of fragile X syndrome (FXS), fragile X-associated primary ovarian insufficiency (FXPOI), fragile X-associated tremor/ataxia syndrome (FXTAS), and fragile XE nonsyndromic intellectual disability (FRAXE NSID).
20. The method of claim 18, wherein the one or more repeat expansion diseases is spinocerebellar ataxia (SCA) and/or dentatorubral-pallidoluysian atrophy (DRPLA).
21. A method of screening a subject for one or more repeat expansion diseases and treating the subject, the method comprising:
i) contacting a nucleic acid sample from the subject under amplification conditions with: a) a gene-specific primer that specifically binds to a different target sequence of each gene of two or more genes, wherein each gene comprises a nucleotide repeat sequence, and wherein the different target sequence is upstream or downstream of the nucleotide repeat sequence in each gene; and b) a universal primer that binds to a common target sequence shared by the two or more genes, wherein the common target sequence is located within the nucleotide repeat sequence and on the strand opposite to that bound by the gene-specific primer of each gene; wherein the gene-specific primers and the universal primers are capable of producing one or more amplification products from each gene;
ii) analyzing the amplification products to screen the subject for the one or more repeat expansion diseases; and
iii) treating a subject found to have at least one repeat expansion disease.
Applications Claiming Priority (3)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
SG10202105851T | 2021-06-02 | ||
PCT/SG2022/050380 WO2022255952A2 (en) | 2021-06-02 | 2022-06-02 | Method of detecting a repeat expansion sequence |
Publications (1)
Publication Number | Publication Date |
---|---|
CN118159667A true CN118159667A (en) | 2024-06-07 |
Family
ID=84324630
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202280053833.7A Pending CN118159667A (en) | 2021-06-02 | 2022-06-02 | Method for detecting repeated spreading sequences |
Country Status (2)
Country | Link |
---|---|
CN (1) | CN118159667A (en) |
WO (1) | WO2022255952A2 (en) |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN116103388B (en) * | 2022-12-23 | 2024-01-23 | 北京大学第三医院(北京大学第三临床医学院) | Kit, system and method for detecting ATXN3 gene of embryo before implantation |
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
EP2488665B1 (en) * | 2009-10-16 | 2017-12-13 | National University of Singapore | Screening method for trinucleotide repeat sequences |
-
2022
- 2022-06-02 CN CN202280053833.7A patent/CN118159667A/en active Pending
- 2022-06-02 WO PCT/SG2022/050380 patent/WO2022255952A2/en active Application Filing
Also Published As
Publication number | Publication date |
---|---|
WO2022255952A3 (en) | 2023-01-12 |
WO2022255952A2 (en) | 2022-12-08 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
EP3378954B1 (en) | Quantification of a minority nucleic acid species | |
JP2853864B2 (en) | Methods for detecting nucleotide sequences | |
EP3755813B1 (en) | Improved detection of microsatellite instability | |
US20130022973A1 (en) | Multiplex Amplification for the Detection of Nucleic Acid Variations | |
EP2488665B1 (en) | Screening method for trinucleotide repeat sequences | |
KR20100063050A (en) | Analysis of nucleic acids of varying lengths by digital pcr | |
WO2012114075A1 (en) | Method for processing maternal and fetal dna | |
US10590468B2 (en) | Method for methylation analysis | |
WO2017004189A1 (en) | Single nucleotide polymorphism in hla-b*15:02 and use thereof | |
US20090042195A1 (en) | Methods and systems for screening for and diagnosing dna methylation associated abnormalities and sex chromosome aneuploidies | |
Almomani et al. | Rapid and cost effective detection of small mutations in the DMD gene by high resolution melting curve analysis | |
US5550020A (en) | Method, reagents and kit for diagnosis and targeted screening for retinoblastoma | |
CN118159667A (en) | Method for detecting repeated spreading sequences | |
US10894978B2 (en) | Genetic test for detecting congenital adrenal hyperplasia | |
AU2006226873B2 (en) | Nucleic acid detection | |
WO2011090154A1 (en) | Target sequence amplification method, polymorphism detection method, and reagents for use in the methods | |
US20070196849A1 (en) | Double-ligation Method for Haplotype and Large-scale Polymorphism Detection | |
CN102424833A (en) | Chip and method for real-time PCR (polymerase chain reaction) gene detection at polygenic mutation site | |
WO2014114922A1 (en) | Methods for estimating the size of disease-associated polynucleotide repeat expansions in genes | |
JPWO2006070666A1 (en) | Simultaneous detection method of gene polymorphism | |
KR101899235B1 (en) | Primer set for diagnosing adhd in korean, kit for diagnosing comprising the same, and method of predicting adhd risk in korean using thereof | |
RU2412247C2 (en) | Method of genetic polymorphism analysis for carrying out postnatal dna-diagnostics of mucoviscidosis | |
Yau | Repeat Expansions in Movement Disorders: Disease Modification and New Horizon | |
Essop | Molecular Aspects of X-Linked Mental Retardation Loci | |
KR20130135414A (en) | Method for multiplex determination of copy number variation using modified mlpa |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination |