WO2022082199A1 - Method for detecting amyotrophic lateral sclerosis - Google Patents
Method for detecting amyotrophic lateral sclerosis
- Publication number
- WO2022082199A1 PCT/US2021/071865
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- als
- mutations
- subject
- genes
- lateral sclerosis
- Prior art date
Classifications
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12Q—MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
- C12Q1/00—Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
- C12Q1/68—Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
- C12Q1/6876—Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes
- C12Q1/6883—Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for diseases caused by alterations of genetic material
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12Q—MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
- C12Q2600/00—Oligonucleotides characterized by their use
- C12Q2600/156—Polymorphic or mutational markers
Definitions
- The present invention relates to methods for detecting amyotrophic lateral sclerosis.
- Each method comprises sequencing 16 target genes or 23 target genomic loci from a biological sample of a subject, and identifying one or more mutations, such as single nucleotide polymorphisms or insertions/deletions, if present, in the 16 target genes or 23 target genomic loci.
- ALS Amyotrophic lateral sclerosis
- ALS cases fall into two categories: familial ALS (fALS), where the patient has a genetically related family member who is also affected, and sporadic ALS (sALS), where the patient has no family history of ALS [9]. Historically, 5-10% of cases are fALS and the other 90-95% are sALS. In the past ten years, the C9ORF72 hexanucleotide repeat expansion has been identified as the most prevalent genomic mutation in the ALS disease population [13]. C9ORF72 repeat expansions can be found in up to 34% of fALS and 5% of sALS cases.
- FIG. 1 Distribution of SNPs present only in the 338 ALS sample. rs767982303 and rs760890146 (SNP IDs) are each in greater than 25% of the ALS population. The dots represent the percentage of the ALS sample for each selected SNP with the 68.5% Confidence Level (CL) Clopper-Pearson interval on the true binomial proportion. The grey area represents the range of the possible percentage in the healthy population, with a 95% CL Clopper-Pearson interval.
- CL Confidence Level
- FIG. 2 SNPs that are not mutated in the control sample. The number of ALS cases out of the overall 338 patient cohort, percent of total ALS cases with the 99% CL Clopper-Pearson interval, and p-value.
- FIG. 3 Distribution of mutated genes found only in the 338 ALS sample. Dots represent percentage of the ALS group for each selected gene. The grey area represents an upper-bound on the potential false-positive percentage in the healthy population. This upper bound is set via the 99% CL Clopper-Pearson interval on the binomial proportion. MIR7155 mutations are detected in 51% of the ALS cohort.
- FIG. 4 16 genes that are not mutated in the control sample. The number of ALS cases out of the 338-patient cohort, number of unique SNPs, percent of total ALS cases, and p-value with the 99% CL Clopper-Pearson interval are shown, respectively.
- FIGs. 5A-5B Classifier Analysis using candidate ALS-only mutated genes. Selecting patients with three or more genes mutated of the 16 candidate genes yields a false-positive rate less than 0.1% and false-negative rate less than 59% at 99% CL. 52% of the ALS cases have at least three of the 16 candidate genes mutated.
- (5B) The percentage of ALS cases with at least the given number of genes mutated from the candidate list (light). The maximum false positive rate at 99% CL (dark).
- FIG. 6 Distribution of candidate ALS-only mutated genes and probability of having ALS based on number of mutations. The distribution of the number of genes out of the top 22 candidates found in each of the 713 ALS cases is shown in grey. The probability of having ALS and the probability of not having ALS are shown.
DETAILED DESCRIPTION OF THE INVENTION
- A “locus” is a specific, fixed position on a chromosome where a particular gene or genetic marker is located.
- A “single nucleotide polymorphism” (SNP) is a germline substitution of a single nucleotide at a specific position in the genome. For example, at a specific base position in the human genome, the G nucleotide may appear in most individuals, but in a minority of individuals, the position is occupied by an A. This means that there is a SNP at this specific position, and the two possible nucleotide variations, G or A, are the alleles for this specific position.
- The present invention identifies a set of mutations in genomic-coding regions that are present in ALS patients but not in healthy control samples.
- The present invention provides methods to detect and diagnose ALS before clinical and pathological onset, which is imperative to prolonging patient lifespan, understanding the pathobiology, and designing therapies for early intervention.
- The inventors compute and analyze large datasets of genomes from over 1,500 ALS patients and healthy controls.
- The inventors uncover mutations, such as single nucleotide polymorphisms (SNPs) and indels (insertions and deletions), in gene-coding and inter-genic regions that are associated with ALS diagnosis and are always absent in healthy controls.
- SNPs single nucleotide polymorphisms
- Indels insertions and deletions
- The inventors have analyzed next-generation genomic sequencing data from two cohorts of ALS patients and healthy controls from the Answer ALS Consortium. In doing so, the inventors discovered mutations in protein-coding genes that have not previously been associated with ALS.
- The present invention is directed to methods for detecting amyotrophic lateral sclerosis in a subject by detecting one or more mutations in specific genes or gene loci.
- The inventors have discovered that 16 target genes of the human genome, MIR7155, NPM1P49, RP11-20B24.3, HNRNPA1P44, OXR1, H2AFZP1, TAB3P1, RPL5P35, ZNF92P2, CIR1P3, GNAI2, CCDC42, RP11-370110.6, ADIPOR1P1, KIAA1841, and AC008074.4, are important for detecting ALS.
- The invention provides a method for detecting ALS in a subject, comprising obtaining a biological sample from a subject, and from the sample, detecting one or more mutations in 16 target genes selected from the group consisting of: MIR7155, NPM1P49, RP11-20B24.3, HNRNPA1P44, OXR1, H2AFZP1, TAB3P1, RPL5P35, ZNF92P2, CIR1P3, GNAI2, CCDC42, RP11-370110.6, ADIPOR1P1, KIAA1841, and AC008074.4.
- The method comprises the steps of: (a) sequencing 16 target genes from a biological sample of a human subject, wherein the target genes are MIR7155, NPM1P49, RP11-20B24.3, HNRNPA1P44, OXR1, H2AFZP1, TAB3P1, RPL5P35, ZNF92P2, CIR1P3, GNAI2, CCDC42, RP11-370110.6, ADIPOR1P1, KIAA1841, and AC008074.4, (b) comparing each of the DNA sequences of the 16 target genes with its corresponding normal gene, (c) identifying one or more mutations such as SNPs, if present, in each of the DNA sequences of the 16 target genes, and (d) detecting amyotrophic lateral sclerosis in the subject if at least one of the 16 target genes has one or more mutations. With at least one target gene found mutated, 67% to 80%, at 99% CL (C-P), of ALS can be detected, with at least one target
- ALS is detected in the subject if at least two of the 16 target genes have one or more mutations. With at least two target genes mutated, 50% to 64%, at 99% CL, of ALS can be detected, with a false positive rate less than 0.9% at 99% CL.
- ALS is detected in the subject if at least three of the 16 target genes have one or more mutations. With at least three target genes mutated, 45% to 59%, at 99% CL, of ALS can be detected, with a false positive rate less than 0.09% at 99% CL.
- The DNA is first extracted from a biological sample of a human subject.
- The biological sample is blood (such as peripheral whole blood), a tissue sample (such as a fibroblast (skin) biopsy or a mucosal sample), or any cell derived from the human subject.
- Methods for extracting DNA from a biological sample are well known to a person skilled in the art. For example, see the protocols for extracting DNA from blood in the Thermo Fisher product sheet, catalog CS11040.
- The DNA extracted from the biological sample of the human subject then undergoes target-specific amplification and target-specific sequencing to sequence each of the 16 target genes: MIR7155, NPM1P49, RP11-20B24.3, HNRNPA1P44, OXR1, H2AFZP1, TAB3P1, RPL5P35, ZNF92P2, CIR1P3, GNAI2, CCDC42, RP11-370110.6, ADIPOR1P1, KIAA1841, and AC008074.4.
- Whole genome sequencing, which is a genomic technique for sequencing all the protein-coding regions of genes in a genome, is not performed in this method.
- Each of the DNA sequences of the 16 specific target genes is compared with its corresponding reference gene sequence.
- Targeted gene data are processed through an automated pipeline to perform read alignment and mutation analysis including variants such as SNPs, indels and substitutions in either introns, exons or both.
- Paired-end 150 bp reads are aligned to the GRCh38 human reference using the Burrows-Wheeler Aligner (BWA-MEM) and processed using the GATK best-practices workflow, which includes marking of duplicate reads with Picard tools, local realignment around indels, and base quality score recalibration (BQSR) via the Genome Analysis Toolkit (GATK).
- BWA-MEM Burrows-Wheeler Aligner
- GATK Genome Analysis Toolkit
- In step (c), single nucleotide variant analysis is performed to identify one or more mutations, if present, in each of the DNA sequences of the 16 target genes.
- Variant discovery is a two-step process. HaplotypeCaller is run on each sample separately in gVCF mode (GATK v3.5). This produces an intermediate file format called gVCF (genomic VCF). For projects with a large number of samples, gVCFs are combined in batches into merged gVCFs. The gVCFs are then run through a joint genotyping step (GATK v3.5) to produce a multi-sample VCF. Variant filtration is performed using Variant Quality Score Recalibration (VQSR), which identifies annotation profiles of variants that are likely to be real and assigns a score (VQSLOD) to each variant.
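- As a rough illustration of the filtration step described above, the sketch below keeps only the records of a multi-sample VCF whose `VQSLOD` annotation in the INFO column meets a cutoff. The cutoff value, the helper names `parse_info` and `filter_by_vqslod`, and any example records are assumptions for illustration only; the source does not state the threshold it used.

```python
def parse_info(info: str) -> dict:
    """Parse a VCF INFO column ('KEY=VAL;FLAG;...') into a dict."""
    out = {}
    for field in info.split(";"):
        key, _, value = field.partition("=")
        out[key] = value if value else True  # flags carry no value
    return out


def filter_by_vqslod(vcf_lines, min_vqslod: float):
    """Yield VCF data lines whose INFO/VQSLOD is at least min_vqslod.

    min_vqslod is a hypothetical cutoff, not one taken from the source.
    """
    for line in vcf_lines:
        if line.startswith("#"):
            continue  # skip meta-information and header lines
        cols = line.rstrip("\n").split("\t")
        info = parse_info(cols[7])  # INFO is the 8th VCF column
        if "VQSLOD" in info and float(info["VQSLOD"]) >= min_vqslod:
            yield line
```

In practice GATK applies a sensitivity tranche rather than a raw score cutoff, so this is only a simplified view of what the VQSLOD score is used for.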
- VQSR Variant Quality Score Recalibration
- Variant effects annotation is performed using SnpEff (PMID: 22728672), bcftools (http://github.com/samtools/bcftools) and in-house software.
- Other functional annotations include variant frequencies in different populations from 1000 Genomes project (PMID:20981092), Exome Aggregation Consortium - ExAC(http://biorxiv.org/content/early/2015/10/30/030338), dbSNP147 (PMID: 11125122); cross-species conservation scores from PhyloP (PMID: 15965027), Genomic Evolutionary Rate Profiling (GERP; PMID: 21152010), PhastCons (PMID: 21278375); functional prediction scores from Polyphen2 (PMID: 20354512) and SIFT (PMID: 19561590); Clinvar(http://www.ncbi.
- Variant discovery, for example, is described in the following references: “A framework for variation discovery and genotyping using next-generation DNA sequencing data,” DePristo M, et al., Nature Genetics. 2011; 43:491-498; and “From FastQ data to high-confidence variant calls: the genome analysis toolkit best practices pipeline,” Van der Auwera G, et al., Curr Protoc Bioinformatics. 2013; 43:1-33.
- In step (d), ALS is detected in the subject if at least one of the 16 target genes has one or more mutations, preferably if at least two of the 16 target genes have one or more mutations, and more preferably if at least three of the 16 target genes have one or more mutations.
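- The decision rule in step (d) can be sketched as a simple count against a threshold. The gene names are those listed in the text; the function names and the idea of passing in a set of mutated gene symbols per subject are assumptions made for illustration, not the inventors' implementation.

```python
# The 16 target genes as listed in the text.
TARGET_GENES = frozenset({
    "MIR7155", "NPM1P49", "RP11-20B24.3", "HNRNPA1P44", "OXR1", "H2AFZP1",
    "TAB3P1", "RPL5P35", "ZNF92P2", "CIR1P3", "GNAI2", "CCDC42",
    "RP11-370110.6", "ADIPOR1P1", "KIAA1841", "AC008074.4",
})


def count_mutated_targets(mutated_genes) -> int:
    """Number of target genes carrying at least one mutation in this subject."""
    return len(TARGET_GENES & set(mutated_genes))


def detect_als(mutated_genes, min_genes: int = 1) -> bool:
    """Apply the step (d) rule: flag the subject if at least min_genes
    of the 16 target genes are mutated (1, 2, or 3 in the text)."""
    return count_mutated_targets(mutated_genes) >= min_genes
```

Raising `min_genes` from 1 to 3 mirrors the trade-off described in the text: fewer ALS cases are detected, but the bound on the false-positive rate becomes tighter.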
- The present invention provides a method to detect and diagnose ALS before clinical and pathological onset, which is imperative to prolonging patient lifespan, understanding the pathobiology, and designing therapies for early intervention.
- ALS is a devastating neurodegenerative disorder, with no cures or genetic diagnostics.
- the present method detects 45%-59% of the ALS-only population, at 99% CL, with the 16 target genomic signatures, when at least 3 of the 16 target genes contain a mutation.
- the present method provides use of genetic screening in early ALS diagnosis and therapeutic intervention.
- This application shows that the detection of single mutations can identify up to 59% of the ALS population with genes that are never found mutated in the healthy control sample.
- This application illustrates two novel mutations in gene-coding regions of the genome that are never present in the healthy group yet are found in over 25% of the ALS cohort.
- The inventors have discovered that 22 target genes of the human genome, AL033528.3, THRAP3, AC106707.1, LIPH, AC007690.1, FAM184B, AC096747.1-NDUFB5P1, NDUFS4, RPL5P16-AC008885.1, SLF1, TNRC18, AC023095.1, TRPM3, AL161629.1, NCS1, TXNP1-INPP5F, CCDC59, ATP10A, COX5A, RN7SL33P, TOP2A, and ZC3H7B, which are not found mutated in normal subjects, are important for detecting ALS.
- The invention provides a method for detecting ALS in a subject, comprising obtaining a biological sample from a subject, and from the sample, detecting one or more mutations in 22 target genes selected from the group consisting of: AL033528.3, THRAP3, AC106707.1, LIPH, AC007690.1, FAM184B, AC096747.1-NDUFB5P1, NDUFS4, RPL5P16-AC008885.1, SLF1, TNRC18, AC023095.1, TRPM3, AL161629.1, NCS1, TXNP1-INPP5F, CCDC59, ATP10A, COX5A, RN7SL33P, TOP2A, and ZC3H7B, and detecting amyotrophic lateral sclerosis in the subject if at least one of the 22 genes has one or more mutations.
- The invention also provides a method for detecting ALS in a subject, comprising obtaining a biological sample from a subject, and from the sample, detecting one or more mutations in 23 genomic loci selected from the group consisting of: chr1:25854953 (chromosome 1 at nucleotide position 25854953), chr1:3624870, chr3:158557839, chr3:185543848, chr3:186923875, chr4:17685198, chr4:180358067, chr5:53655366, chr5:82813472, chr5:94666955, chr7:5338617, chr8:62196626, chr9:71428255, chr9:89866631, chr9:130224292, chr10:119712877, chr10:119712899, chr12
- The method comprises the steps of: (a) amplifying DNA extracted from a biological sample of a subject by target-specific polymerase chain reaction to amplify specific genomic loci comprising 23 specific chromosome positions of chr1:25854953, chr1:3624870, chr3:158557839, chr3:185543848, chr3:186923875, chr4:17685198, chr4:180358067, chr5:53655366, chr5:82813472, chr5:94666955, chr7:5338617, chr8:62196626, chr9:71428255, chr9:89866631, chr9:130224292, chr10:119712877, chr10:119712899, chr12:82295320, chr15:25687571,
- Whole genome sequencing, which is a genomic technique for sequencing all the protein-coding regions of genes in a genome, is not performed in this method.
- In step (a) of the method, the DNA is first extracted from a biological sample of a human subject, as described in the first method.
- The DNA extracted from the biological sample of the human subject then undergoes target-specific amplification to amplify the 23 loci of the 22 genes.
- Table 1 shows the 22 genes that frequently have at least one mutation in ALS patients and the position of each mutation as a nucleotide position on a chromosome.
- Gene TXNP1-INPP5F has two mutated loci, chr10:119712877 and chr10:119712899, in ALS patients.
- PCR polymerase chain reaction
- The forward and reverse primers are designed to be 30-400 bases away from the target site, e.g., 40-250 bases, 40-200 bases, 40-150 bases, or 40-100 bases.
- Table 1 illustrates one design of the forward primer and reverse primer for each of the 23 target loci.
- The primer design shown in Table 1 is an example, and the present invention is not limited to such specific primer sequences.
- The two loci chr10:119712877 and chr10:119712899 of gene TXNP1-INPP5F are only 22 bases apart from each other, and therefore one set of forward and reverse primers can conveniently amplify both loci.
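- The observation that nearby loci can share one primer pair can be sketched as a grouping step over `chrN:position` strings. The function name `group_loci` and the default `max_gap` of 22 bases are assumptions chosen to match the example in the text; a real design would derive the limit from the intended amplicon length.

```python
def group_loci(loci, max_gap: int = 22):
    """Group 'chrN:pos' loci that sit on the same chromosome within max_gap
    bases of the previous locus, so each group could share one primer pair.

    max_gap is an illustrative assumption, not a value fixed by the source.
    """
    parsed = sorted(
        (chrom, int(pos)) for chrom, pos in (locus.split(":") for locus in loci)
    )
    groups = []
    for chrom, pos in parsed:
        last = groups[-1][-1] if groups else None
        if last and last[0] == chrom and pos - last[1] <= max_gap:
            groups[-1].append((chrom, pos))  # close enough: same amplicon
        else:
            groups.append([(chrom, pos)])    # start a new amplicon
    return groups
```

For instance, `group_loci(["chr10:119712877", "chr10:119712899", "chr9:71428255"])` places the two chr10 loci in one group and the chr9 locus in another, which is why 23 loci need fewer than 23 primer pairs.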
- In step (b), the amplified DNA is purified and sequenced according to methods known to a person skilled in the art.
- DNA purification removes everything that is not the amplicon from the PCR product, including unused primers, nucleotides, enzymes, and other impurities.
- Sequencing includes library preparation and the act of DNA sequencing itself, done by a sequencing system. Library preparation typically consists of fragmenting the DNA sample and adding sequencing adapters to the fragments that are needed for the sequencing step (next generation sequencing). The act of sequencing itself includes reading the nucleotides in the DNA sample and saving them sequentially into a digital file.
- The specific protocol for DNA purification and DNA sequencing may differ depending on a number of factors, including the method used for DNA amplification and the sequencing system used.
- The amplified DNA can be purified and sequenced using the QIAquick PCR Purification Kit for DNA purification, following the QIAquick® Spin Handbook protocol; the TruSeq DNA LT kit (see the product sheet for TruSeq® DNA Library Prep Kits, Illumina) for library preparation, following the TruSeq® DNA Sample Preparation Guide available on Illumina's website; and the Illumina MiSeq system for sequencing (see the MiSeq™ System specification sheet, Illumina).
- In step (c), the amplified DNA sequences of (b) are analyzed and compared with the corresponding DNA sequences of the normal genomic loci. See the description in the first method.
- In step (d), single nucleotide variant analysis is performed to identify one or more mutations, if present, in each of the DNA sequences of the 23 target loci. See the description in the first method.
- In step (e), ALS is detected in the subject if at least one of the 23 target loci has a mutation, preferably if at least two of the 23 target loci have mutations, and more preferably if at least three of the 23 target loci have mutations.
- The present method detects over 30% of the ALS-only population, at 99% CL, with the 23 target genomic signatures, when at least 1 of the 23 target loci contains a mutation.
- The present method provides use of genetic screening in early ALS diagnosis and therapeutic intervention.
- The inventors show that at least two genes must be mutated in the list of 23 top candidates to achieve 35.7-44.9% accuracy in the detection of ALS.
- The following examples further illustrate the present invention. These examples are intended merely to be illustrative of the present invention and are not to be construed as being limiting.
- Clopper-Pearson Interval: Bounds are set on the true fraction of either population with a given feature or features. The number of people within a sample who are positive for the feature-of-interest has a binomial distribution. Clopper-Pearson intervals on the binomial proportion are calculated for the true population proportions (ref. 11).
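The Clopper-Pearson interval can be sketched with the standard library alone by inverting the binomial distribution with bisection. The function names and the two-sided convention (alpha/2 in each tail) are assumptions for illustration; the source does not state which software or convention was used.

```python
from math import comb


def binom_cdf(k: int, n: int, p: float) -> float:
    """P(X <= k) for X ~ Binomial(n, p). Adequate for n up to about 1000."""
    return sum(comb(n, i) * p**i * (1.0 - p) ** (n - i) for i in range(k + 1))


def _solve_decreasing(f, target: float) -> float:
    """Find p in [0, 1] with f(p) == target, where f decreases as p grows."""
    lo, hi = 0.0, 1.0
    for _ in range(100):
        mid = (lo + hi) / 2.0
        if f(mid) > target:
            lo = mid  # f is still too large, so the root lies at a larger p
        else:
            hi = mid
    return (lo + hi) / 2.0


def clopper_pearson(k: int, n: int, cl: float = 0.99) -> tuple:
    """Two-sided exact (Clopper-Pearson) interval for k successes out of n."""
    alpha = 1.0 - cl
    # Lower bound: P(X >= k | p) = alpha/2, i.e. CDF(k - 1; p) = 1 - alpha/2.
    lower = 0.0 if k == 0 else _solve_decreasing(
        lambda p: binom_cdf(k - 1, n, p), 1.0 - alpha / 2.0
    )
    # Upper bound: P(X <= k | p) = alpha/2.
    upper = 1.0 if k == n else _solve_decreasing(
        lambda p: binom_cdf(k, n, p), alpha / 2.0
    )
    return lower, upper


# A feature seen in 0 of 53 control subjects, at 99% CL.
low, high = clopper_pearson(0, 53, cl=0.99)
print(f"{low:.4f} to {high:.4f}")  # the upper bound is about 0.095, i.e. below 10%
```

For zero observed events the upper bound has the closed form 1 - (alpha/2)^(1/n), which gives a quick check on the numerical result.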
- Fisher's Exact Test: The p-value is computed under the null hypothesis that a mutation is present in the ALS population in the same proportion as in the control population (ref. 12). This test is well suited to this study because its p-value is exact rather than approximate and can still be calculated in a reasonable amount of time given the sample sizes. T-tests are approximations of this probability which converge to the exact value in the limit of large sample size.
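As a sketch of how such a p-value can be computed, the following standard-library implementation sums the hypergeometric probabilities of all 2x2 tables that are no more likely than the observed one (the usual two-sided convention, assumed here). The function name and the ALS counts in the usage example are illustrative and not taken from the source; only the control group size of 53 appears in the text.

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]].

    Rows are groups (e.g. ALS, control); columns are mutated / not mutated.
    """
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x: int) -> float:
        # Hypergeometric probability of x mutated subjects in the first row.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    x_min, x_max = max(0, col1 - row2), min(row1, col1)
    p_observed = prob(a)
    # Sum every table at least as unlikely as the observed one
    # (small relative tolerance guards against floating-point ties).
    total = sum(
        p for p in (prob(x) for x in range(x_min, x_max + 1))
        if p <= p_observed * (1.0 + 1e-9)
    )
    return min(1.0, total)


# Illustrative counts only: mutation in 100 of 400 cases and 0 of 53 controls.
p_value = fisher_exact_two_sided(100, 300, 0, 53)
print(f"p = {p_value:.3g}")
```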
- Row-wise Conditional Percentage: To quantify how often a pair of our top 16 genes is mutated in the same patient, we calculated a conditional probability considering the independent probabilities of each mutation occurring on its own. For every possible ordered pairing of two genes (16 × 15 = 240 combinations), we counted the number of cases that have both gene mutations and divided by the total number of cases in which the first gene was mutated. This metric is visually represented as a matrix, with each row and column representing a particular mutated gene from the set of 16, and each element representing the conditional probability of the column and row mutations occurring in the same patient, adjusted for the baseline prevalence of the row mutation. The probability is converted into a percentage and can provide insight into how often two gene mutations co-occur in a patient.
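The row-wise metric described above can be sketched as follows, assuming each patient is represented by the set of genes found mutated in that patient. The function name, the gene labels `G1` to `G3`, and the toy cohort are hypothetical.

```python
from typing import Dict, List, Set


def rowwise_conditional_percentage(
    patients: List[Set[str]], genes: List[str]
) -> Dict[str, Dict[str, float]]:
    """matrix[row][col] = percentage of patients with `row` mutated who also have `col` mutated."""
    matrix: Dict[str, Dict[str, float]] = {}
    for row in genes:
        carriers = [p for p in patients if row in p]
        matrix[row] = {}
        for col in genes:
            if col == row or not carriers:
                continue  # skip the diagonal and rows with no carriers
            both = sum(1 for p in carriers if col in p)
            matrix[row][col] = 100.0 * both / len(carriers)
    return matrix


# Hypothetical toy cohort of four patients and three genes.
patients = [{"G1", "G2"}, {"G1"}, {"G2", "G3"}, {"G1", "G2", "G3"}]
genes = ["G1", "G2", "G3"]

matrix = rowwise_conditional_percentage(patients, genes)
for row in genes:
    print(row, {col: round(pct, 1) for col, pct in matrix[row].items()})
```

The matrix is not symmetric in general, because each row is normalized by the number of carriers of its own gene; this is why ordered pairs are counted.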
- Answer ALS Data were provided by the Answer ALS consortium.
- C9ORF72 hexanucleotide repeats are the most prevalent ALS mutation known to date, affecting 5-10% of all cases and up to 34% of familial ALS (fALS) cases (ref. 13).
- fALS: familial ALS
- rs767982303 and rs760890146 were each found in 25% of the total ALS population yet are absent in controls (FIGs. 1 and 2). rs767982303 (located on the OXR1 gene) and rs760890146 (located on the NPM1P49 gene) both lead to an acceptor variant mutation. Other top SNPs-of-interest and their significance are illustrated.
- FIGs. 5A and 5B: We propose a simple classifier that requires at least three of the 16 genes to be mutated. A conservative upper limit on the rate of mutation in the healthy population for each of these top 16 genes is estimated to be less than 10% (at 99% CL) using the Clopper-Pearson interval, since no mutation in any of these genes was found in the 53 control patients (ref. 11).
- A classifier requiring at least three of the 16 mutations has a false-positive rate of less than 0.1% (1/1000), meaning the specificity is greater than 99.9% at 99% CL.
- The sensitivity of this classifier is 52% ± 7% at 99% CL, identifying just over half of the ALS sample.
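A threshold classifier of this kind, and the way its sensitivity and specificity are measured on a labeled cohort, can be sketched as below. The gene labels, the toy cohorts, and the function names are hypothetical; the figures reported in the text come from the authors' data and are not reproduced by this example.

```python
from typing import List, Set, Tuple


def classify(mutated: Set[str], panel: Set[str], threshold: int = 3) -> bool:
    """Positive call when at least `threshold` panel genes are mutated."""
    return len(mutated & panel) >= threshold


def sensitivity_specificity(
    cases: List[Set[str]], controls: List[Set[str]], panel: Set[str], threshold: int = 3
) -> Tuple[float, float]:
    """Fraction of cases called positive and fraction of controls called negative."""
    true_pos = sum(classify(c, panel, threshold) for c in cases)
    true_neg = sum(not classify(c, panel, threshold) for c in controls)
    return true_pos / len(cases), true_neg / len(controls)


# Hypothetical panel of 16 gene labels and toy cohorts (not the source data).
panel = {f"GENE{i}" for i in range(1, 17)}
cases = [
    {"GENE1", "GENE2", "GENE3"},
    {"GENE4", "GENE5", "GENE6", "GENE7"},
    {"GENE1"},
    set(),
]
controls = [set(), {"OTHER1"}, set(), {"OTHER2", "OTHER3", "OTHER4"}]

sens, spec = sensitivity_specificity(cases, controls, panel, threshold=3)
print(f"sensitivity = {sens:.2f}, specificity = {spec:.2f}")
```

Raising the threshold trades sensitivity for specificity: fewer cases meet the criterion, but a control subject is also less likely to meet it by chance.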
- Example 1 demonstrates SNPs in coding regions or entire genes that are associated with a majority of the ALS population.
- The Answer ALS consortium utilized the latest next-generation sequencing technology and annotation with the highest quality control and protocols to allow us to perform unbiased genetic analyses on protein-coding genes and other genomic areas of interest. We are the first to report on this novel genomic database using these statistical and computational methods.
- OXR1 is an essential member of the antioxidant defense mechanisms in the cell.
- microRNA MIR7155
- Answer ALS Data were provided by the Answer ALS consortium and the Alzheimer's Disease Neuroimaging Initiative.
- C9ORF72 hexanucleotide repeats are the most prevalent ALS mutation known to date, affecting 5-10% of all cases and up to 34% of familial ALS (fALS) cases.
- Table 2 shows the 22 genes that are not mutated in the control sample. The gene names, the number of ALS cases out of the 713-patient cohort, the percent of total ALS cases with the 99% CL Clopper-Pearson interval, and the p-value are shown, respectively.
- Table 2 shows the sensitivity and specificity of combined loci in detecting ALS. The sensitivity and specificity for any number of combined mutations are shown.
- Diagnostic testing based on novel gene sequence identification could serve as an early disease detection tool.
- FIG. 6 illustrates the distribution of candidate ALS-only mutated genes and the probability of having ALS or not having ALS based on the number of positive or negative results for mutations.
- The distribution of the number of variants found among the 23 genomic loci in the 713 ALS cases is shown in grey.
- The diamond plus represents the probability of having ALS, which increases with an increasing number of positive variants.
- The star represents the probability of not having ALS, which decreases with an increasing number of positive variants.
- ALS A clinical and comprehensive multi-omics signature for ALS employing induced pluripotent stem cell derived motor neurons from 1000 sporadic and familial ALS patients nationwide. Annals of Neurology 80, S243-S243 (2016).
Landscapes
- Chemical & Material Sciences (AREA)
- Life Sciences & Earth Sciences (AREA)
- Proteomics, Peptides & Aminoacids (AREA)
- Health & Medical Sciences (AREA)
- Organic Chemistry (AREA)
- Wood Science & Technology (AREA)
- Analytical Chemistry (AREA)
- Zoology (AREA)
- Genetics & Genomics (AREA)
- Engineering & Computer Science (AREA)
- Pathology (AREA)
- Immunology (AREA)
- Microbiology (AREA)
- Molecular Biology (AREA)
- Biotechnology (AREA)
- Biophysics (AREA)
- Physics & Mathematics (AREA)
- Biochemistry (AREA)
- Bioinformatics & Cheminformatics (AREA)
- General Engineering & Computer Science (AREA)
- General Health & Medical Sciences (AREA)
- Measuring Or Testing Involving Enzymes Or Micro-Organisms (AREA)
Abstract
The inventors have identified 23 genomic loci and established that a majority of patients with amyotrophic lateral sclerosis (ALS) have mutations in at least one of the target loci. The present invention relates to a method for detecting ALS in a subject, comprising the following steps: (a) amplifying DNA extracted from a biological sample of a subject by target-specific polymerase chain reaction to amplify specific genomic loci comprising 23 specific chromosome positions of chr1:25854953, chr1:3624870, chr3:158557839, chr3:185543848, chr3:186923875, chr4:17685198, chr4:180358067, chr5:53655366, chr5:82813472, chr5:94666955, chr7:5338617, chr8:62196626, chr9:71428255, chr9:89866631, chr9:130224292, chr10:119712877, chr10:119712899, chr12:82295320, chr15:25687571, chr15:74926032, chr17:2562894, chr17:40390624, and chr22:41330858; (b) purifying and sequencing the amplified DNA; (c) analyzing each of the amplified DNA sequences and comparing it with its corresponding DNA sequence of the normal genomic loci; (d) identifying one or more mutations, if present, at the 23 chromosome positions; and (e) detecting ALS in the subject if the 23 chromosome positions have one or more mutations.
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US202063093049P | 2020-10-16 | 2020-10-16 | |
US63/093,049 | 2020-10-16 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2022082199A1 (fr) | 2022-04-21 |
Family
ID=81209431
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/US2021/071865 | WO2022082199A1 (fr): Method for detecting amyotrophic lateral sclerosis | 2020-10-16 | 2021-10-14 |
Country Status (1)
Country | Link |
---|---|
WO (1) | WO2022082199A1 (fr) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2024077282A3 (fr) * | 2022-10-07 | 2024-06-13 | Neu Bio, Inc. | Biomarkers for the diagnosis of amyotrophic lateral sclerosis |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20040137450A1 (en) * | 2001-04-16 | 2004-07-15 | Hadano Shinji | Als2 gene and amyotrophic lateral sclerosis type 2 |
WO2013041577A1 (fr) * | 2011-09-20 | 2013-03-28 | Vib Vzw | Methods for diagnosing amyotrophic lateral sclerosis and frontotemporal lobar degeneration |
US20130109589A1 (en) * | 2006-11-30 | 2013-05-02 | Translational Genomics Research Institute | Single nucleotide polymorphisms associated with amyotrophic lateral sclerosis |
US20160177389A1 (en) * | 2008-07-22 | 2016-06-23 | The General Hospital Corporation D/B/A Massachusetts General Hospital | Fus/tls-based compounds and methods for diagnosis, treatment and prevention of amyotrophic lateral sclerosis and related motor neuron diseases |
US20160338328A1 (en) * | 2009-08-25 | 2016-11-24 | Hiroshima University | Animal model and cell model developing amyotrophic lateral sclerosis |
- 2021-10-14 WO PCT/US2021/071865 patent/WO2022082199A1/fr active Application Filing
Non-Patent Citations (2)
Title |
---|
CALINI DANIELA, CORRADO LUCIA, DEL BO ROBERTO, GAGLIARDI STELLA, PENSATO VIVIANA, VERDE FEDERICO, CORTI STEFANIA, MAZZINI LETIZIA,: "Analysis of hnRNPA1, A2/B1, and A3 genes in patients with amyotrophic lateral sclerosis", NEUROBIOLOGY OF AGING, vol. 34, no. 11, November 2013 (2013-11-01) - 2 July 2013 (2013-07-02), pages 1 - 4, XP028691762, DOI: 10.1016/j.neurobiolaging. 2013.05.02 5 * |
LIU KEVIN X., EDWARDS BENJAMIN, LEE SHEENA, FINELLI MATTÉA J., DAVIES BEN, DAVIES KAY E., OLIVER PETER L.: "Neuron-specific antioxidant OXR1 extends survival of a mouse model of amyotrophic lateral sclerosis", BRAIN, vol. 138, no. 5, May 2015 (2015-05-01), pages 1167 - 1168, XP055933313, DOI: 10.1093/brain/awv039 * |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
De Roeck et al. | NanoSatellite: accurate characterization of expanded tandem repeat length and sequence through whole genome long-read sequencing on PromethION | |
US20200335178A1 (en) | Detecting repeat expansions with short read sequencing data | |
Cooper et al. | A copy number variation morbidity map of developmental delay | |
AU2014281635B2 (en) | Method for determining copy number variations in sex chromosomes | |
KR101718940B1 (ko) | Composition for early epigenetic diagnosis of Alzheimer's dementia or mild cognitive impairment | |
HUE030510T2 (hu) | Diagnosing fetal chromosomal aneuploidy using genomic sequencing | |
CN111292804B (zh) | Method and system for detecting SMN1 gene mutations by means of high-throughput sequencing | |
CN105555970B (zh) | Method and system for simultaneous haplotype analysis and chromosomal aneuploidy detection | |
AU2014346680A1 (en) | Targeted screening for mutations | |
CN111534602A (zh) | Method for analyzing human blood group genotypes based on high-throughput sequencing and application thereof | |
WO2022082199A1 (fr) | Method for detecting amyotrophic lateral sclerosis | |
CN116083562B (zh) | SNP marker combination and primer set related to auxiliary diagnosis of aspirin resistance, and application thereof | |
Yadav et al. | Next-Generation sequencing transforming clinical practice and precision medicine | |
CN108570496A (zh) | Molecular diagnostic method and kit for hereditary bone disease | |
US20240209446A1 (en) | Circulating noncoding rnas as a signature of autism spectrum disorder symptomatology | |
Seo et al. | Quality threshold evaluation of Sanger confirmation for results of whole exome sequencing in clinically diagnostic setting | |
US11920198B2 (en) | Method and kit for identifying gene mutations | |
CN115579056B (zh) | Gene set for evaluating the molecular subtyping of schizophrenia, and diagnostic product and application thereof | |
JP2020517304A (ja) | Use of off-target sequences for DNA analysis | |
Soucy et al. | Molecular Genetic Testing Approaches for Retinitis Pigmentosa | |
CN112442527B (zh) | Autism diagnostic kit, gene chip, gene target screening method and application | |
Yilmaz | Structural Variants in Health and Disease | |
CN110144403B (zh) | Novel mutant SNP site of the breast cancer pathogenic gene RBM12B and application thereof | |
Quinones-Valdez et al. | Long-read RNA-seq demarcates cis-and trans-directed alternative RNA splicing | |
Vecoli | Next-generation sequencing technology in the genetics of cardiovascular disease |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
 | 121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 21881316; Country of ref document: EP; Kind code of ref document: A1 |
 | NENP | Non-entry into the national phase | Ref country code: DE |
 | 122 | Ep: pct application non-entry in european phase | Ref document number: 21881316; Country of ref document: EP; Kind code of ref document: A1 |