CN116287198A - Primer group, kit and method for detecting mutation types of genes related to various diseases - Google Patents
- Publication number: CN116287198A
- Application number: CN202310100120.8A
- Authority: CN (China)
- Legal status: Pending (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Classifications
- C12Q1/6883: Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for diseases caused by alterations of genetic material
- C12Q1/686: Polymerase chain reaction [PCR]
- C12Q1/6869: Methods for sequencing
- G16B30/10: Sequence alignment; Homology search
- C12Q2600/156: Polymorphic or mutational markers
- C12Q2600/16: Primer sets for multiplex assays
- Y02A50/30: Against vector-borne diseases, e.g. mosquito-borne, fly-borne, tick-borne or waterborne diseases whose impact is exacerbated by climate change
Abstract
The invention relates to a primer set, a kit and a method for detecting mutation types of genes related to multiple diseases. The invention detects gene mutation types related to hereditary hearing loss, thalassemia and spinal muscular atrophy simultaneously in a single tube: clinically common SNPs, indels and copy number variations of known pathogenicity are detected in one experiment, which markedly improves the detection efficiency for the three diseases and the degree of standardization of clinical testing. The primer set comprises: 1) primers for detecting gene mutation types of hereditary hearing loss, shown as SEQ ID NO. 1-30; 2) primers for detecting gene mutation types of thalassemia, shown as SEQ ID NO. 31-340; 3) primers for detecting gene mutation types of spinal muscular atrophy, shown as SEQ ID NO. 341-350; 4) primers for detecting the beta-actin housekeeping gene, shown as SEQ ID NO. 351-352.
Description
Technical Field
The invention belongs to the field of biotechnology and medical research application, and particularly relates to a primer set, a kit and a method for detecting genetic mutation types related to hereditary hearing loss, thalassemia and spinal muscular atrophy.
Background
To effectively reduce hereditary birth defects, the National Medical Products Administration (NMPA) has in recent years approved a number of products for genetic disease detection, among which hereditary hearing loss, thalassemia and spinal muscular atrophy have attracted particular attention.
Hereditary hearing loss is a typical monogenic disease with strong genetic heterogeneity. Current technical platforms for hereditary deafness gene screening include fluorescence quantitative PCR, PCR plus flow-through hybridization, microarray technology and high-throughput sequencing, each with a different detection principle; kits that detect the loci by PCR amplification alone, or by PCR amplification followed by probe hybridization, are the most common. These products differ in the pathogenic sites they cover, but most of them include detection of mitochondrial DNA (mtDNA) mutations. Hereditary hearing loss caused by mtDNA mutations includes syndromic and non-syndromic forms; the m.1555A>G and m.1494C>T mutations in the mtDNA 12S rRNA gene are susceptibility sites for aminoglycoside-induced hearing loss. Other mtDNA mutations can also cause syndromic hereditary deafness, such as maternally inherited diabetes with deafness. However, unlike other germline mutations, mutant mtDNA is distributed randomly during cell replication, so the proportion of mutant mtDNA can differ markedly between mother and daughter cells. mtDNA mutations also show a threshold effect: the corresponding phenotype appears only when the mutant mtDNA load, or the resulting mtDNA functional defect, reaches a certain level. For years, many researchers have therefore recommended adding detection of mitochondrial heteroplasmy in order to assess the role of mitochondrial mutations in disease more accurately. Among currently known detection methods, high-throughput sequencing can detect autosomal hereditary deafness mutations and analyze the mitochondrial heteroplasmy ratio at the same time.
However, using high-throughput sequencing based on amplicon library construction for mitochondrial mutation detection raises another problem: mitochondrial heteroplasmy varies from person to person, and the copy number of the mitochondrial genome differs greatly from that of the chromosomes. As a result, when multiplex PCR amplifies mitochondrial DNA and genomic DNA together, the two templates start at different amounts, that is, at different initial copy numbers, which complicates both PCR condition optimization and the subsequent bioinformatic analysis.
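To make the heteroplasmy ratio concrete, the following Python sketch computes, for a single mitochondrial site, the fraction of reads that carry the mutant base. The function name and the read counts are illustrative assumptions and are not taken from the invention's analysis pipeline.

```python
def heteroplasmy_fraction(ref_reads: int, alt_reads: int) -> float:
    """Fraction of reads at one mtDNA site that carry the mutant (alt) base.

    Hypothetical helper: inputs are read counts at a single site,
    for example m.1555A>G.
    """
    total = ref_reads + alt_reads
    if total == 0:
        raise ValueError("no reads cover this site")
    return alt_reads / total


# Example with made-up counts: 100 mutant reads out of 1000 total.
fraction = heteroplasmy_fraction(ref_reads=900, alt_reads=100)
print(f"heteroplasmy: {fraction:.1%}")
```

Whether a given fraction is clinically meaningful depends on the threshold effect described above, which this sketch does not model.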
The thalassemias currently found in China are mainly alpha thalassemia and beta thalassemia, whose pathogenic mechanisms differ somewhat: alpha thalassemia is mostly caused by deletion of alpha-globin genes and in a minority of cases by point mutations, whereas beta thalassemia is mainly caused by point mutations of the beta-globin gene, with deletions in a minority of cases. This difference calls for different detection techniques. In clinical diagnosis, PCR-based methods such as Gap-PCR can effectively distinguish the different types of thalassemia, but methodological limitations lead to varying degrees of false positives and false negatives. In recent years researchers have tried NGS-based methods, but most use liquid-phase capture; amplicon library construction for thalassemia detection has focused mainly on beta thalassemia, which is dominated by point mutations (CN105331694B), and applying conventional amplicon library construction to alpha thalassemia, which is dominated by large-fragment deletions, remains technically difficult. The difficulty arises mainly because the coding regions of the two key thalassemia genes HBA1 and HBA2 are identical and because the deletion breakpoints fall mostly in intron regions. Multiplex PCR typically works best at a GC content between 40% and 60%, optimally 50%, yet the GC content of these intron regions is high, exceeding 80% in places. This severely reduces primer amplification efficiency, makes amplification uneven, and in turn compromises the calling of thalassemia deletion fragments.
Therefore, when amplicon library construction is used, balancing the amplification efficiency among primers and keeping amplification as uniform as possible are particularly important for calling large-fragment deletions in thalassemia.
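The GC-content constraint can be checked with a few lines of Python. This is a minimal sketch: the function names, the default 40%-60% window applied to a primer or amplicon sequence, and the example sequences are assumptions made for illustration; none of the sequences come from SEQ ID NO. 1-352.

```python
def gc_content(seq: str) -> float:
    """Return the fraction of G and C bases in a DNA sequence."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    return (seq.count("G") + seq.count("C")) / len(seq)


def in_preferred_window(seq: str, low: float = 0.40, high: float = 0.60) -> bool:
    """True if the GC content lies inside the preferred multiplex-PCR window."""
    return low <= gc_content(seq) <= high


# Made-up sequences: one balanced, one GC-rich like the regions described above.
for name, seq in [("balanced", "ATGCGCAT"), ("gc_rich", "GCGCGCGCAT")]:
    print(name, f"{gc_content(seq):.0%}", in_preferred_window(seq))
```

Regions that fail such a check are the ones where amplification enhancers or adjusted primer placement would be considered.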
Spinal muscular atrophy (SMA) is a neuromuscular disease characterized by progressive muscle weakness and muscular atrophy caused by degeneration of the anterior horn motor neurons of the spinal cord, and is the second most lethal autosomal recessive genetic disease seen clinically. The disease is caused by a defect in the survival motor neuron 1 gene (SMN1). In a normal individual each of the two copies of chromosome 5 carries at least one copy of SMN1, whereas in SMA patients SMN1 is deleted (in particular, homozygous deletion of exon 7) or carries a mutation. Studies have shown that about 95% of SMA patients have a homozygous deletion of both SMN1 alleles, and the remaining 5% carry SMN1 mutations (that is, one allele is deleted and the other carries a subtle pathogenic variant). The carrier rate in the population is as high as 1/40 and the incidence is about 1/10,000 to 1/6,000, and SMA ranks first among lethal genetic diseases in children under 2 years of age. Attempts have therefore been made to establish a three-tier prevention and control system to reduce the incidence of SMA. The NMPA has also approved three PCR-based detection kits, all of which target the SMN deletions that account for 95% of cases and can be used for screening or auxiliary diagnosis of patients, carriers and normal individuals. SMN1 is the causative gene of SMA and determines whether the disease occurs, while SMN2 is a phenotype-modifying gene that affects the severity and progression of the disease. SMN1 and SMN2 are highly homologous, differing by only 5 bases. CN112048548A provides a method for detecting SMN gene copy number using SMNP as a control: one primer pair amplifies SMNP and SMN1 simultaneously, another pair amplifies SMNP and SMN2 simultaneously, and the products are distinguished by capillary electrophoresis.
The present invention amplifies SMN1 and SMN2 simultaneously with the same primer pair and distinguishes them by high-throughput sequencing, which is simpler and more accurate. The products developed so far have two limitations: 1) they cannot detect the subtle variants that account for 5% of cases; 2) they cannot determine SMN2 copy number and therefore cannot be used to assess the severity and progression of the disease. However, because SMN1 and SMN2 are so highly homologous, primer design and analysis become very difficult when second-generation sequencing is used. For this reason one domestic company uses long-read third-generation sequencing for detection, but because third-generation sequencing is expensive, clinical adoption will still take time.
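As a hedged illustration of telling SMN1 from SMN2 by sequencing when both are amplified by the same primer pair, the sketch below counts reads by the base observed at one paralog-discriminating position. It assumes the widely reported exon 7 difference (C in SMN1, T in SMN2); the function name, the inputs and the use of a single site are assumptions for illustration, not the invention's stated algorithm. The result is a read ratio, not an absolute copy number.

```python
from collections import Counter
from typing import Iterable


def smn1_read_fraction(bases_at_site: Iterable[str],
                       smn1_base: str = "C",
                       smn2_base: str = "T") -> float:
    """Share of informative reads whose base at the site matches SMN1.

    bases_at_site: one base call per read at the discriminating position.
    Reads showing any other base are ignored as uninformative.
    """
    counts = Counter(b.upper() for b in bases_at_site)
    n1, n2 = counts[smn1_base], counts[smn2_base]
    if n1 + n2 == 0:
        raise ValueError("no informative reads at this site")
    return n1 / (n1 + n2)


# Made-up pileup: 3 SMN1-like reads and 1 SMN2-like read.
print(smn1_read_fraction("CCCT"))
```

Turning such a ratio into SMN1 and SMN2 copy numbers would additionally require normalizing read depth against a reference amplicon, which is outside this sketch.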
Disclosure of Invention
At present, hereditary hearing loss, thalassemia and spinal muscular atrophy are the three most common genetic diseases whose incidence can be effectively reduced through genetic screening. The key requirements of genetic disease screening are: 1. high sample throughput; 2. detection of multiple diseases, genes or loci in a single test; 3. low cost, to meet the demand for low-cost procurement; 4. a relatively short turnaround time. High-throughput sequencing is undoubtedly an ideal tool for genetic disease screening. The most common approaches for screening based on high-throughput sequencing are liquid-phase capture and amplicon library construction. Compared with amplicon library construction, liquid-phase capture has the following characteristics: 1. a longer turnaround time: even with rapid hybridization, going from DNA to loading on the sequencer takes 1-2 days, whereas amplicon library construction can be completed in 3 hours; 2. a much higher cost; 3. a relatively high requirement for the amount of sample DNA; 4. capture probe design is more tolerant of mismatches than amplicon PCR primer design, which also means that amplicon PCR primers are much harder to design than capture probes. Comparing the two second-generation library construction methods, detecting the three diseases by amplicon library construction combined with second-generation sequencing better fits the requirements of genetic disease screening but faces more technical challenges.
In view of the above, through extensive experiments the inventors have developed, based on multiplex PCR amplification combined with high-throughput sequencing, a method and a kit that detect the gene mutation types of three genetic diseases (hereditary hearing loss, thalassemia and spinal muscular atrophy) simultaneously with a single primer set.
Specifically, according to a first aspect of the present invention, there is provided a primer set for detecting mutation types of a plurality of disease-associated genes, comprising:
1) primers for detecting gene mutation types of hereditary hearing loss, shown as SEQ ID NO. 1-30;
2) primers for detecting gene mutation types of thalassemia, shown as SEQ ID NO. 31-340;
3) primers for detecting gene mutation types of spinal muscular atrophy, shown as SEQ ID NO. 341-350;
4) primers for detecting the beta-actin housekeeping gene, shown as SEQ ID NO. 351-352.
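The four SEQ ID NO. ranges listed above can be captured in a small lookup table, which is convenient when annotating amplicons by target. The following Python sketch is illustrative only: the dictionary name, the helper function and the label strings are hypothetical, and only the numeric ranges come from the text.

```python
# SEQ ID NO. ranges as listed in the primer set (upper bounds are inclusive
# in the text, so range() ends one past them).
PRIMER_GROUPS = {
    "hereditary hearing loss": range(1, 31),        # SEQ ID NO. 1-30
    "thalassemia": range(31, 341),                  # SEQ ID NO. 31-340
    "spinal muscular atrophy": range(341, 351),     # SEQ ID NO. 341-350
    "beta-actin housekeeping gene": range(351, 353),  # SEQ ID NO. 351-352
}


def target_for_seq_id(seq_id: int) -> str:
    """Return the detection target that a given SEQ ID NO. belongs to."""
    for target, ids in PRIMER_GROUPS.items():
        if seq_id in ids:
            return target
    raise KeyError(f"SEQ ID NO. {seq_id} is not part of the primer set")


print(target_for_seq_id(200))
print(sum(len(ids) for ids in PRIMER_GROUPS.values()))
```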
In one embodiment, the gene mutation types of hereditary hearing loss are point mutations of the genes GJB2, GJB3, SLC26A4 and 12S rRNA.
In one embodiment, the gene mutation types of thalassemia are point mutations, short-fragment indels and large-fragment indels of the genes HBA1, HBA2 and HBB.
In one embodiment, the gene mutation types of spinal muscular atrophy are SMN1 point mutations and copy number variation of exons 7 and 8 of the SMN1/SMN2 genes.
According to a second aspect of the present invention, there is provided a kit for simultaneously detecting gene mutation types related to hereditary deafness, thalassemia and spinal muscular atrophy, comprising the primer set of the first aspect of the present invention.
In one embodiment, the kit further comprises an amplification enzyme mix, a primer digestion reagent, an amplification enhancer, index primers and purification magnetic beads.
According to a third aspect of the present invention, there is provided a method for simultaneously detecting a plurality of disease-associated gene mutation types, comprising the steps of:
S1: using whole-genome DNA of a sample as a template, performing targeted ultra-multiplex PCR amplification in a single reaction tube with the primer set of the first aspect of the invention, purifying the amplification products with magnetic beads, and removing primers and primer dimers;
S2: constructing a library from the amplicons obtained in S1;
S3: sequencing the library constructed in S2 by a high-throughput sequencing method, analyzing the sequencing results by bioinformatics methods to obtain the genetic variation information for the diseases concerned, and thereby determining whether the sample carries the corresponding gene mutation types;
wherein the disease is hereditary hearing loss, thalassemia and spinal muscular atrophy.
In one embodiment, the gene mutation type of hereditary hearing loss is a point mutation of the genes GJB2, GJB3, SLC26A4 and 12S rRNA.
In one embodiment, the gene mutation types of thalassemia are point mutations, short-fragment indels and large-fragment indels of the genes HBA1, HBA2 and HBB.
In one embodiment, the gene mutation types of spinal muscular atrophy are SMN1 point mutations and copy number variations of exons 7 and 8 of the SMN1/SMN2 genes.
In one embodiment, the step S3 further comprises the steps of:
S31: sequencing the library on a high-throughput sequencer to obtain raw sequencing data;
S32: performing basic quality control on the raw sequencing data obtained in step S31, including filtering out sequencing adapters and low-quality bases to obtain valid data, and aligning the valid data to the human reference genome with software based on the BWT (Burrows-Wheeler transform) algorithm to obtain an alignment file with base positions;
S33: splitting the alignment file obtained in step S32 into two parts, namely autosomal alignment information and mitochondrial genome alignment information; detecting SNP and INDEL mutations from the autosomal alignment information using a Bayesian algorithm; and obtaining mitochondrial SNP mutation information by scoring, sorting and filtering the mitochondrial genome alignment results.
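The split described in step S33 can be illustrated with a minimal sketch. It assumes the alignment is available as SAM-format text lines and that the mitochondrial contig is named "chrM" or "MT"; both the function name `split_sam_records` and these contig names are illustrative assumptions, not part of the described method, and every non-mitochondrial record is simply treated as nuclear.

```python
from typing import Iterable, List, Tuple

# Assumption: the mitochondrial contig is called "chrM" or "MT" in the reference used.
MITO_NAMES = {"chrM", "MT"}


def split_sam_records(lines: Iterable[str]) -> Tuple[List[str], List[str]]:
    """Split SAM lines into (nuclear, mitochondrial) record lists.

    Header lines (starting with '@') are copied to both outputs so that each
    part can still be written out as a SAM file. Unmapped reads, whose
    reference name is '*', are dropped.
    """
    nuclear: List[str] = []
    mito: List[str] = []
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue
        if line.startswith("@"):
            nuclear.append(line)
            mito.append(line)
            continue
        rname = line.split("\t")[2]  # third SAM column is the reference name
        if rname == "*":
            continue
        if rname in MITO_NAMES:
            mito.append(line)
        else:
            nuclear.append(line)
    return nuclear, mito
```

The two lists could then be passed to separate variant-calling steps, one for nuclear SNP/INDEL detection and one for mitochondrial SNP scoring and filtering.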
In one embodiment, the gene mutation types associated with hereditary deafness, thalassemia and spinal muscular atrophy are determined as follows:
1) Autosomal SNP and INDEL mutations of hereditary deafness, thalassemia and spinal muscular atrophy: a variant allele frequency (VAF) > 95% indicates a homozygous mutation, and 15% ≤ VAF ≤ 95% indicates a heterozygous mutation.
2) Mitochondrial SNP mutations of hereditary hearing loss: a mutation is present if the VAF is > 3%; otherwise the sample is wild-type.
3) Method for determining spinal muscular atrophy caused by SMN1 gene deletion:
a) Extracting from the alignment file the reads aligned to the β-actin, SMN1 exon 7 and exon 8, and SMN2 exon 7 and exon 8 regions, and filtering out reads with a mapping quality below 60 and reads with more than 3 mismatched bases, to obtain a filtered alignment file;
b) Based on the filtered alignment file from step a), calculating the average read coverage depth of each target region, and dividing the average coverage depth of exon 7 and exon 8 of the SMN1 and SMN2 genes by the average coverage depth of β-actin to obtain the copy-number values S1-7, S1-8, S2-7 and S2-8;
c) Obtaining a data set of S1-7, S1-8, S2-7 and S2-8 values from known, verified SMA-negative samples, and calculating the medians of S1-7 and S1-8;
d) Calculating the ratios of S1-7 and S1-8 of the sample to be tested to those of the SMA-negative sample set, and determining the deletion type of the sample;
e) Determining whether the sample to be tested is from a spinal muscular atrophy patient or a carrier; the specific criteria are as follows:
S1-7 = average coverage depth of SMN1 exon 7 / average coverage depth of β-actin;
S1-8 = average coverage depth of SMN1 exon 8 / average coverage depth of β-actin;
S2-7 = average coverage depth of SMN2 exon 7 / average coverage depth of β-actin;
S2-8 = average coverage depth of SMN2 exon 8 / average coverage depth of β-actin;
Ratio(S1-7) = S1-7 / median S1-7 of the negative control set;
Ratio(S1-8) = S1-8 / median S1-8 of the negative control set;
if Ratio(S1-7) ≤ 0.2 and Ratio(S1-8) ≤ 0.2, the sample to be tested has a homozygous deletion, i.e. the SMN1 copy number is 0;
if Ratio(S1-7) ≤ 0.2 and 0.2 < Ratio(S1-8) ≤ 0.7, the sample to be tested has an exon 7 homozygous deletion and an exon 8 heterozygous deletion;
if 0.2 < Ratio(S1-7) ≤ 0.7 and Ratio(S1-8) ≤ 0.2, the sample to be tested has an exon 7 heterozygous deletion and an exon 8 homozygous deletion;
if 0.2 < Ratio(S1-7) ≤ 0.7 and 0.2 < Ratio(S1-8) ≤ 0.7, the sample to be tested has an SMN1 heterozygous deletion;
if S2-7 or S2-8 ≤ 0.3, the corresponding SMN2 exon 7 or exon 8 has 0 copies;
if 0.3 < S2-7 or S2-8 ≤ 0.7, the corresponding SMN2 exon 7 or exon 8 has 1 copy;
if 0.7 < S2-7 or S2-8 ≤ 1.2, the corresponding SMN2 exon 7 or exon 8 has 2 copies;
if 1.2 < S2-7 or S2-8 ≤ 1.7, the corresponding SMN2 exon 7 or exon 8 has 3 copies;
if 1.7 < S2-7 or S2-8 ≤ 2.2, the corresponding SMN2 exon 7 or exon 8 has 4 copies;
if S2-7 or S2-8 > 2.2, the corresponding SMN2 exon 7 or exon 8 has 5 or more copies;
4) Method for determining thalassemia caused by deletion of the HBA2, HBA1 and HBB genes:
a) Calculating the read coverage depth of each target region, and normalizing it by the average sequencing depth of the β-actin gene to obtain a Z value, where Z value = sequencing depth of the target region / average sequencing depth of β-actin × 2000;
b) Calculating the ratio of the Z value of the test sample to the median Z value of the negative control sample set, and determining the deletion breakpoints with an HMM algorithm;
c) Determining the deleted copy number of the thalassemia-related genes in the sample to be tested, according to the following criteria:
judgment value B = log2(Z value of the test sample / median Z value of the negative control sample set);
if B ≤ -1.4, the sample to be tested has a homozygous deletion, i.e. the gene copy number is 0;
if -1.4 < B ≤ -0.6, the sample to be tested has a heterozygous deletion, i.e. the gene copy number is 1;
if -0.6 < B ≤ 0.6, the copy number of the sample to be tested is normal, i.e. the gene copy number is 2.
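The thalassemia deletion criteria above can be sketched in a few lines. This is an illustration only, under stated assumptions: the scaling in the Z value is read as multiplication by 2000, the three ranges are read as B ≤ -1.4, -1.4 < B ≤ -0.6 and -0.6 < B ≤ 0.6, and values above 0.6 are reported as outside the stated ranges because the text does not classify them. The function names are hypothetical, and the HMM breakpoint step is not reproduced here.

```python
import math
from statistics import median
from typing import Sequence


def z_value(target_depth: float, actin_mean_depth: float, scale: float = 2000.0) -> float:
    """Normalize a target-region depth by the beta-actin mean depth.

    Assumption: the constant 2000 is a multiplicative scaling factor.
    """
    return target_depth / actin_mean_depth * scale


def judgment_value_b(sample_z: float, control_z: Sequence[float]) -> float:
    """B = log2(sample Z / median Z of the negative control set)."""
    if sample_z <= 0:
        return float("-inf")  # no coverage at all: treat as the lowest possible B
    return math.log2(sample_z / median(control_z))


def copy_number_from_b(b: float) -> str:
    """Map the judgment value B to a deletion call using the stated cut-offs."""
    if b <= -1.4:
        return "homozygous deletion (0 copies)"
    if b <= -0.6:
        return "heterozygous deletion (1 copy)"
    if b <= 0.6:
        return "normal (2 copies)"
    return "outside the stated ranges"  # assumption: not classified in the text
```

Because the same scaling factor appears in both the sample and the control Z values, it cancels in B; it only affects the magnitude of the intermediate Z values.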
According to a fourth aspect of the present invention there is provided a system for detecting a plurality of disease-associated gene mutation types, comprising:
1) Primer amplification module: using whole-genome DNA of a sample as a template, performing targeted ultra-multiplex PCR amplification in a single reaction tube with the primer set of the first aspect of the invention, purifying the amplification products with magnetic beads, and removing primers and primer dimers;
2) Library construction module: constructing a library from the amplicons obtained by module 1);
3) Data processing module: sequencing the library constructed by module 2) by a high-throughput sequencing method, analyzing the sequencing results by bioinformatics methods to obtain the genetic variation information for the diseases concerned, and thereby determining whether the sample carries the corresponding gene mutation types;
wherein the disease is hereditary hearing loss, thalassemia and spinal muscular atrophy.
In one embodiment, the gene mutation type of hereditary hearing loss is a point mutation of the genes GJB2, GJB3, SLC26A4 and 12S rRNA.
In one embodiment, the gene mutation types of thalassemia are point mutations, short-fragment indels and large-fragment indels of the genes HBA1, HBA2 and HBB.
In one embodiment, the gene mutation types of spinal muscular atrophy are SMN1 point mutations and copy number variations of exons 7 and 8 of the SMN1/SMN2 genes.
In one embodiment, the data processing module 3) further performs the following:
3.1: sequencing the library on a high-throughput sequencer to obtain raw sequencing data;
3.2: performing basic quality control on the raw sequencing data obtained in step 3.1, including filtering out sequencing adapters and low-quality bases to obtain valid data, and aligning the valid data to the human reference genome with software based on the BWT (Burrows-Wheeler transform) algorithm to obtain an alignment file with base positions;
3.3: splitting the alignment file obtained in step 3.2 into two parts, namely an autosomal alignment file and a mitochondrial genome alignment file; detecting autosomal SNP and INDEL mutations from the autosomal alignment file using a Bayesian algorithm; and obtaining mitochondrial SNP mutation information by scoring, sorting and filtering the results of the mitochondrial genome alignment file.
In one embodiment, the gene mutation types associated with hereditary deafness, thalassemia and spinal muscular atrophy are determined as follows:
1) Autosomal SNP and INDEL mutations of hereditary deafness, thalassemia and spinal muscular atrophy: a variant allele frequency (VAF) > 95% indicates a homozygous mutation, and 15% ≤ VAF ≤ 95% indicates a heterozygous mutation.
2) Mitochondrial SNP mutations of hereditary deafness: a mutation is present if the VAF is > 3%; otherwise the sample is wild-type.
3) Method for determining spinal muscular atrophy caused by SMN1 gene deletion:
a) Extracting from the BAM file the reads in the β-actin, SMN1 exon 7 and exon 8, and SMN2 exon 7 and exon 8 regions, and filtering out reads with a mapping quality below 60 and reads with more than 3 mismatched bases;
b) Based on the filtered file, calculating the average read coverage depth of each target region, and dividing the average coverage depth of exon 7 and exon 8 of the SMN1 and SMN2 genes by the average coverage depth of β-actin to obtain the copy-number values S1-7, S1-8, S2-7 and S2-8;
c) Obtaining a data set of S1-7, S1-8, S2-7 and S2-8 values from known, verified SMA-negative samples, and calculating the medians of S1-7 and S1-8;
d) Calculating the ratios of S1-7 and S1-8 of the sample to be tested to those of the SMA-negative sample set, and determining the deletion type of the sample;
e) Determining whether the sample to be tested is from a spinal muscular atrophy patient or a carrier; the specific criteria are as follows:
S1-7 = average coverage depth of SMN1 exon 7 / average coverage depth of β-actin;
S1-8 = average coverage depth of SMN1 exon 8 / average coverage depth of β-actin;
S2-7 = average coverage depth of SMN2 exon 7 / average coverage depth of β-actin;
S2-8 = average coverage depth of SMN2 exon 8 / average coverage depth of β-actin;
Ratio(S1-7) = S1-7 / median S1-7 of the negative control set;
Ratio(S1-8) = S1-8 / median S1-8 of the negative control set;
if Ratio(S1-7) ≤ 0.2 and Ratio(S1-8) ≤ 0.2, the sample to be tested has a homozygous deletion, i.e. the SMN1 copy number is 0;
if Ratio(S1-7) ≤ 0.2 and 0.2 < Ratio(S1-8) ≤ 0.7, the sample to be tested has an exon 7 homozygous deletion and an exon 8 heterozygous deletion;
if 0.2 < Ratio(S1-7) ≤ 0.7 and Ratio(S1-8) ≤ 0.2, the sample to be tested has an exon 7 heterozygous deletion and an exon 8 homozygous deletion;
if 0.2 < Ratio(S1-7) ≤ 0.7 and 0.2 < Ratio(S1-8) ≤ 0.7, the sample to be tested has an SMN1 heterozygous deletion;
if S2-7 or S2-8 ≤ 0.3, the corresponding SMN2 exon 7 or exon 8 has 0 copies;
if 0.3 < S2-7 or S2-8 ≤ 0.7, the corresponding SMN2 exon 7 or exon 8 has 1 copy;
if 0.7 < S2-7 or S2-8 ≤ 1.2, the corresponding SMN2 exon 7 or exon 8 has 2 copies;
if 1.2 < S2-7 or S2-8 ≤ 1.7, the corresponding SMN2 exon 7 or exon 8 has 3 copies;
if 1.7 < S2-7 or S2-8 ≤ 2.2, the corresponding SMN2 exon 7 or exon 8 has 4 copies;
if S2-7 or S2-8 > 2.2, the corresponding SMN2 exon 7 or exon 8 has 5 or more copies;
4) Method for determining thalassemia caused by deletion of the HBA2, HBA1 and HBB genes:
a) Calculating the read coverage depth of each target region, and normalizing it by the average sequencing depth of the β-actin gene to obtain a Z value, where Z value = sequencing depth of the target region / average sequencing depth of β-actin × 2000;
b) Calculating the ratio of the Z value of the test sample to the median Z value of the negative control sample set, and determining the deletion breakpoints with an HMM algorithm;
c) Determining the deleted copy number of the thalassemia-related genes in the sample to be tested, according to the following criteria:
judgment value B = log2(Z value of the test sample / median Z value of the negative control sample set);
if B ≤ -1.4, the sample to be tested has a homozygous deletion, i.e. the gene copy number is 0;
if -1.4 < B ≤ -0.6, the sample to be tested has a heterozygous deletion, i.e. the gene copy number is 1;
if -0.6 < B ≤ 0.6, the copy number of the sample to be tested is normal, i.e. the gene copy number is 2.
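The SMN1/SMN2 criteria in item 3) above can likewise be sketched as code. This is a minimal illustration, not the implementation used in the invention. It assumes the SMN1 ranges are read per exon as Ratio ≤ 0.2 (homozygous deletion) and 0.2 < Ratio ≤ 0.7 (heterozygous deletion); the label "no deletion" for ratios above 0.7 is an assumption, since the text does not name that case. All function names are hypothetical.

```python
from statistics import median
from typing import Dict, Sequence


def normalized_depth(exon_mean_depth: float, actin_mean_depth: float) -> float:
    """S value: exon mean coverage depth divided by beta-actin mean coverage depth."""
    return exon_mean_depth / actin_mean_depth


def smn1_exon_state(ratio: float) -> str:
    """Classify one SMN1 exon from its ratio to the negative-control median."""
    if ratio <= 0.2:
        return "homozygous deletion"
    if ratio <= 0.7:
        return "heterozygous deletion"
    return "no deletion"  # assumption: ratios above 0.7 are not named in the text


def smn1_call(s1_7: float, s1_8: float,
              neg_s1_7: Sequence[float], neg_s1_8: Sequence[float]) -> Dict[str, str]:
    """Return per-exon SMN1 calls for exon 7 and exon 8."""
    ratio_7 = s1_7 / median(neg_s1_7)
    ratio_8 = s1_8 / median(neg_s1_8)
    return {"exon7": smn1_exon_state(ratio_7), "exon8": smn1_exon_state(ratio_8)}


def smn2_copies(s2_value: float) -> str:
    """Map an S2-7 or S2-8 value to an SMN2 exon copy number."""
    upper_bounds = [(0.3, "0"), (0.7, "1"), (1.2, "2"), (1.7, "3"), (2.2, "4")]
    for upper, copies in upper_bounds:
        if s2_value <= upper:
            return copies
    return "5 or more"
```

For example, a sample whose exon 7 ratio is 0.1 and exon 8 ratio is 0.5 would be reported as an exon 7 homozygous deletion with an exon 8 heterozygous deletion, matching the second SMN1 rule.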
According to a fifth aspect of the present invention, there is provided the use of a primer set according to the first aspect of the present invention for the preparation of a kit for detecting a variety of disease-related gene mutation types.
In one embodiment, the disease is hereditary hearing loss, thalassemia, and spinal muscular atrophy.
The advantageous technical effects of the primer set, method, system and kit are mainly as follows:
1. Three genetic diseases, namely hereditary deafness gene mutations, thalassemia (α and β) and SMA, are detected simultaneously using a single-tube, two-step PCR amplification library construction technique. Most comparable current techniques either use liquid-phase capture or use separate multiplex PCR amplification assays for individual diseases. Compared with liquid-phase capture library construction, the invention is faster, simpler to operate, more efficient and lower in cost, and is better suited to wide clinical application.
2. The invention detects multiple diseases in a single-tube PCR, which places high demands on primer design, system optimization and subsequent bioinformatic analysis: the compatibility of different primers in the same reaction system and the balance of their amplification efficiencies must be considered. The mutation types in the three genetic diseases are complex, including point mutations, small-fragment insertions or deletions, large-fragment deletions and copy number changes. Of these, large-fragment deletions, which occur in both α-thalassemia and SMA, are the most difficult for multiplex PCR library design. In the prior art there are two conventional approaches. In the first, primers are designed against a defined breakpoint region and the deleted region is inferred from the length of the amplified product; this can only detect fixed deletion types, and a deletion whose breakpoint falls outside the designed region is missed. In the second, PCR amplicons are tiled across the large-fragment deletion region, and the deletion breakpoint is determined by bioinformatic analysis of the change in PCR signal depth across the covered region; this approach detects deletion types flexibly and can detect any deletion breakpoint within the PCR-covered region. In the invention, PCR primer pairs are designed at relatively fixed intervals (generally within 500 bp) to detect deletion mutations according to the second approach. Because the deleted region in α-thalassemia reaches 35 kb, a design that followed conventional principles and also covered the other diseases would require more than 400 pairs of PCR amplification primers.
However, a large number of primers in the same reaction system increases the likelihood of mutual interference among primers and of system instability, and increases cost. In this work, a marker SNP approach was adopted; taking into account common disease gene mutation types, mutual interference among primers and other factors, the number of PCR primer pairs was reduced to 176 through optimization, which ensures the stability of the experimental system while balancing cost and detection performance.
3. The invention can detect both heteroplasmic (heterogeneous) variation of mitochondrial DNA and germline mutations of chromosomes. Each mitochondrion contains about 2-10 copies of mitochondrial DNA, and each normal human cell contains about 10^3-10^4 copies of mitochondrial DNA; the degree of heteroplasmy differs markedly between individuals. Because of mitochondrial DNA heteroplasmy and its difference in copy number from chromosomal DNA, other studies often amplify mitochondrial DNA and chromosomal DNA in separate systems. By optimizing the reaction conditions and adjusting the bioinformatic method, the kit of the invention achieves uniform and stable amplification of mitochondrial DNA and chromosomal DNA in the same tube.
4. The invention detects large-fragment deletion mutations of thalassemia. To cover the deletion breakpoint regions as fully as possible, multiple primer pairs are designed within a 35 kb range of chromosome 16. However, the sequence in this region is highly repetitive and has a high GC content (more than 80% in some regions), and conventional methods detect the different deletion types with primers split across separate tubes to avoid interference between primers or systems. By adjusting the regions targeted by the primers and the ratios among the primers, and by optimizing the amplification system and the bioinformatic analysis method, the present method allows the various thalassemia deletion types to be detected in a one-tube PCR reaction, which reduces experimental complexity and improves detection efficiency.
5. Conventional SMA detection methods target the homozygous deletion of SMN1 exon 7, the most frequent mutation. However, the SMN2 copy number partially compensates for the symptoms, so detecting the SMN2 copy number at the same time helps clinicians diagnose the disease. SMN1 and SMN2 are highly homologous, differing by only 5 bases, so designing primers that can distinguish and detect both is technically difficult. In the present method, the primer design was adjusted so that exon 7 and exon 8 of the SMN1 and SMN2 genes are amplified simultaneously with consistent amplification efficiency, and the β-actin housekeeping gene was introduced to correct for batch effects of the amplification system, so that the copy numbers of SMN1 and SMN2 can be specifically detected.
6. Conventional methods for detecting large-fragment deletions apply data-volume correction and GC-content correction and compare the sample to be tested with a control set to obtain its copy number change. However, the present method uses multiplex PCR amplification, in which differences in amplification efficiency increase batch-to-batch instability and reduce detection accuracy. The method therefore adds β-actin housekeeping gene primers to the amplification system, corrects the amplification efficiency of each sample by the average sequencing depth of this gene to reduce the influence of amplification, and determines large-fragment deletions of SMA and thalassemia by both within-sample and between-sample comparison, which improves detection stability.
Drawings
Fig. 1: schematic diagram of the primer design;
Fig. 2: QC profile of the amplified library fragments; fragment sizes are concentrated at 200-500 bp, with the main peak at about 380 bp;
Fig. 3: technical route of multiplex PCR amplification;
Fig. 4: heat map of the depth distribution before and after adjustment of the mitochondrial primer ratio;
Fig. 5: verification results obtained with the "human motor neuron survival gene 1 (SMN1) detection kit (PCR-melting curve method)", wherein Fig. 5A shows the SMN1 detection result for NA03813 and Fig. 5B shows the SMN1 detection result for NA03814;
Fig. 6: distribution of mitochondrial heteroplasmic mutations in sample PF02-L-14.
Detailed Description
Preferred embodiments of the present application are further described below with reference to the accompanying drawings. They are given by way of illustration and not limitation, and any other similar embodiments fall within the scope of the present application.
Definitions
For a better understanding of the present invention, definitions and explanations of related terms are provided below.
As used herein, "detecting" refers to determining the presence or absence of a specified analyte.
As used herein, "comprising," "including," "having," "containing," or any other variation thereof, are intended to cover a non-exclusive inclusion. For example, a composition, step, method, article, or apparatus that comprises a list of elements is not necessarily limited to only those elements but may include other elements not expressly listed or inherent to such composition, step, method, article, or apparatus.
As used herein, the conjunction "consisting of … …" excludes any unspecified element, step or component. If used in a claim, such phrase will cause the claim to be closed, such that it does not include materials other than those described, except for conventional impurities associated therewith. When the phrase "consisting of … …" appears in a clause of the claim body, rather than immediately following the subject, it is limited to only the elements described in that clause; other elements are not excluded from the stated claims as a whole.
As used herein, when a ratio or other value or parameter is expressed as a range, a preferred range, or a range bounded by a list of upper preferable values and lower preferable values, this is to be understood as specifically disclosing all ranges formed from any pair of any upper range limit or preferred value and any lower range limit or preferred value, regardless of whether ranges are separately disclosed. For example, when ranges of "1 to 5" are disclosed, the described ranges should be construed to include ranges of "1 to 4", "1 to 3", "1 to 2 and 4 to 5", "1 to 3 and 5", and the like. When a numerical range is described herein, unless otherwise indicated, the range is intended to include its endpoints and all integers and fractions within the range.
As used herein, "and/or" means that one or both of the stated cases may occur; e.g., A and/or B includes (A and B) and (A or B).
As used herein, "plurality" refers to two or more.
As used herein, the term "clean data" or "valid data" refers to data available for subsequent analysis, obtained from the raw sequencer output by removing sequences containing sequencing adapters and low-quality bases.
As used herein, the term "read" or "read length" refers to sequence information of a DNA fragment.
As used herein, the term "amplification enzyme Mix" refers to an amplification mixture of DNA polymerase, dNTPs and buffer.
As used herein, the term "primer digest" refers to a reagent that removes residual dNTPs and primers after PCR is complete.
As used herein, the term "amplification enhancer" refers to an additive that improves the PCR amplification performance of a high GC template, containing a concentration of dimethyl sulfoxide (DMSO), betaine, and trehalose.
As used herein, the term "Index primer" refers to a primer that carries a specific base sequence and is capable of amplifying a complete sequencing adapter, wherein "Index" refers to a specific base sequence for sample discrimination when the library is mixed sequenced.
As used herein, the term "purified magnetic bead" refers to a magnetic bead reagent that can separate and purify DNA.
As used herein, in the term "frequency of base mutation", "base mutation" refers to a change in gene structure caused by substitution, addition or deletion of base pairs in a DNA molecule, and "frequency" refers to the proportion of sequences carrying the mutation, calculated as the number of reads carrying the mutation divided by the number of all reads covering the site.
As used herein, the term "HMM algorithm" refers to an algorithm based on a hidden Markov model (HMM).
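As a worked illustration of this definition, the mutation frequency (variant allele frequency, VAF) can be computed from read counts and compared with the cut-offs stated earlier in this description (autosomal: > 95% homozygous, 15% to 95% heterozygous; mitochondrial: > 3% mutant). The sketch below uses hypothetical function names, and the "not called" label for autosomal frequencies below 15% is an assumption, since the description does not name that case.

```python
def variant_allele_frequency(variant_reads: int, total_reads: int) -> float:
    """VAF = reads carrying the mutation / all reads covering the site."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    return variant_reads / total_reads


def autosomal_genotype(vaf: float) -> str:
    """Apply the autosomal SNP/INDEL cut-offs (fractions, not percentages)."""
    if vaf > 0.95:
        return "homozygous"
    if vaf >= 0.15:
        return "heterozygous"
    return "not called"  # assumption: below 15% is not classified in the text


def mitochondrial_call(vaf: float) -> str:
    """Apply the mitochondrial SNP cut-off of 3%."""
    return "mutant" if vaf > 0.03 else "wild-type"
```

For instance, 480 mutant reads out of 1000 covering reads gives a VAF of 0.48, which falls in the heterozygous range for an autosomal site.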
Summary of the sequence Listing
The present application is accompanied by a sequence listing comprising a plurality of nucleic acids. Table A provides all the mutation site information contained, and Table B provides an overview of the primer sequences contained.
Table A
Table B
Example 1 Preparation of a gene mutation detection kit for three genetic diseases
1.1 Design of the primer sets
1.1.1 Round 1 primer set
In the invention, a marker SNP approach is adopted; taking into account common disease gene mutation types, mutual interference among primers and other factors, the number of PCR primers is reduced through optimization, which ensures the stability of the experimental system while balancing cost and detection performance. A schematic of the primer design is shown in FIG. 1.
The detection range of the genes related to the three genetic diseases (hereditary hearing loss, thalassemia and spinal muscular atrophy) was determined, and primers were designed for the mutations within this range. For large-fragment deletion regions, marker SNP sites were selected; the minor allele frequency (MAF) of each marker SNP should be close to 50%, and adjacent marker SNPs are spaced about 500 bp apart. Primer design took into account primer dimers, high GC content, annealing temperature and other factors. All designed primers were compared with each other and with the human reference genome using BLAST to confirm that no primers are complementary to one another and that each is highly specific. Finally, 176 primer pairs were obtained for subsequent experimental evaluation. When the primers were synthesized, universal sequencing primer sequences were added to the specific upstream and downstream primers, giving the first-round PCR primers: round 1 primer F (universal sequence F + specific primer F) and round 1 primer R (universal sequence R + specific primer R).
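The marker SNP selection rule (MAF close to 50%, one marker roughly every 500 bp) can be illustrated with a simple windowed selection. This is only one possible reading of the rule, written as a hypothetical helper `pick_marker_snps`; the actual selection procedure, which also weighs primer dimers, GC content and annealing temperature, is not reproduced here.

```python
from typing import List, Tuple

Snp = Tuple[int, float]  # (genomic position, minor allele frequency)


def pick_marker_snps(snps: List[Snp], start: int, end: int,
                     spacing: int = 500) -> List[Snp]:
    """Pick at most one SNP per `spacing`-bp window of [start, end).

    Within each window the SNP whose MAF is closest to 0.5 is kept.
    Windows that contain no candidate SNP are skipped.
    """
    chosen: List[Snp] = []
    for win_start in range(start, end, spacing):
        win_end = min(win_start + spacing, end)
        in_window = [s for s in snps if win_start <= s[0] < win_end]
        if in_window:
            chosen.append(min(in_window, key=lambda s: abs(s[1] - 0.5)))
    return chosen
```

A primer pair would then be designed around each chosen position, so that read depth at evenly spaced, informative sites can be used to locate deletion breakpoints.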
1.1.2 Round 2 primer set
The round 2 primers are complete sequencing-platform adapter primers comprising an adapter sequence, a universal sequence and an index used to distinguish different samples; the structure of the index-containing complete adapter sequences is shown in Table 1.
Table 1 Index-containing complete adapter sequences
Remarks: XXXXXXXX is an 8 bp index; different samples carry different index combinations.
1.2 Preparation of the kit for detecting gene mutations of three genetic diseases
The composition of the kit for detecting gene mutations of three genetic diseases is shown in Table 2 below:
Table 2 Gene mutation detection kit for three genetic diseases
Reagent name | Storage temperature |
Multiplex primers | -20℃±5℃ |
Amplification enzyme Mix | -20℃±5℃ |
Amplification enhancer 1 | -20℃±5℃ |
Amplification enhancer 2 | -20℃±5℃ |
Index primers (I5/I7) | -20℃±5℃ |
Primer digest | 4℃ |
Purified magnetic beads | 4℃ |
Example 2 Optimization of the multiplex PCR reaction system
During the research, the inventors screened multiple pairs of PCR primers for each target region and finally selected primers with good specificity whose amplification efficiency and multiplex amplification performance meet the experimental requirements. The following describes primer design and system optimization for high-GC templates, balancing of SMN1 and SMN2 amplification efficiency, same-tube detection of mitochondrial DNA and chromosomes, and systematic optimization of the mixing ratios of the 176 primer pairs.
2.1 Primer design and amplification system optimization for high GC content region of chromosome 16
The thalassemia large-fragment deletion mutations lie within a 35 kb range of chromosome 16, and the deletion breakpoints are distributed in many ways. To cover as many possible breakpoint regions as possible, multiple primer pairs were designed; covering more regions aids detection and interpretation. However, these regions have a complex base composition, repetitive sequences and a high GC content (more than 80% in some regions), and conventional amplification methods detect the different deletion types with primers split across separate tubes to avoid interference between primers or systems.
To solve these problems, the invention designs multiple primer pairs to cover the corresponding regions and optimizes the reaction system to accommodate amplification of the high-GC regions. Raising the final concentration of the amplification enhancer from 2% to 6% markedly improved the amplification efficiency of these regions, increasing their average sequencing depth from 100-300× to 2000-3000×, which permits stable detection and supports subsequent analysis. Details before and after adjustment are shown in Table 3.
Table 3 High-GC fragment optimization test for the thalassemia large-fragment deletion region
2.2 SMN1 and SMN2 amplification efficiency balance
The pathogenesis of spinal muscular atrophy (SMA) has been clarified: deletion of exon 7, or of exons 7 and 8, of SMN1 accounts for the disease in more than 95% of patients. SMN1 and SMN2 are homologous genes; SMN2 is a pseudogene of SMN1 whose copy number varies in the population, and a higher SMN2 copy number partially alleviates SMA. Knowing the copy numbers of both SMN1 and SMN2 therefore helps clinicians assess the condition and make treatment decisions. To ensure that the kit stably detects the copy numbers of SMN1 and SMN2 and to reduce the influence of amplification efficiency, the primer positions were adjusted so that a single primer pair amplifies exon 7 and exon 8 of both SMN1 and SMN2 simultaneously, with a coverage depth that meets the analysis requirements, as shown in Table 4.
Table 4 Simultaneous amplification of SMN1 and SMN2 after optimization
2.3 Simultaneous detection of mitochondrial DNA and chromosomal DNA
The mitochondrial genome copy number differs between individuals, whereas the nuclear genome of a normal individual is present in 2 copies. To keep the amplified products uniform when mitochondrial DNA and chromosomal DNA are amplified and detected together, the proportion of mitochondrial primers in the primer mix must be optimized. Before adjustment, the ratio of mitochondrial to autosomal primers was 1:1, and the amplification depth of the mitochondrial region was far higher than the autosomal depth. After adjustment, the ratio was 1:10, and the amplification depths of the mitochondrial and autosomal regions were nearly identical. The results before and after the primer ratio adjustment are shown in FIG. 4.
2.4 overall primer concentration ratio adjustment
Building on the adjustments made for these special regions, and taking into account the amplification efficiency of the primers in the other regions, the input proportion of each primer was balanced and the primers were optimized to improve amplification efficiency. This keeps the sequencing depth uniform across regions, particularly in the CNV detection regions. Sequencing depths of some primers before and after adjustment are shown in Table 5.
Sequencing alignment of primer number 118 of table 5 before and after adjustment:
Example 3 National reference and simulated clinical sample detection
Using the kit obtained in Example 1, libraries were constructed by multiplex PCR with the whole-genome DNA of each sample as template. The libraries were sequenced by high-throughput sequencing, and the sequencing data were analyzed bioinformatically to detect and analyze gene mutations.
3.1 The hereditary hearing loss and thalassemia samples are national reference materials; the SMA samples are genomic DNA purchased from Coriell and verified with the "human survival motor neuron gene 1 (SMN1) detection kit (PCR-melting curve method)" (NMPA registration No. 20213400937), as shown in Table 6.
Table 6 sources of samples used in the study
3.2 detection method
3.2.1 Whole-genome DNA extraction
Whole-genome DNA was extracted from blood or blood cards using the designated nucleic acid extraction kit.
3.2.2 round 1 PCR: targeted sequence amplification
For each sample, the reaction system was prepared in a PCR tube according to Table 7, and the prepared system was placed on a PCR instrument and run under the reaction conditions shown in Table 8.
TABLE 7 reaction system
[1] The input volume is determined from the Qubit DNA quantification result; the starting amount must be 5-100 ng per reaction tube.
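The note on input amount can be illustrated with a small calculation. The sketch below is only an assumption-laden example: the 5-100 ng window comes from the text, while the 50 ng target mass and the 10 μL maximum template volume are hypothetical values chosen for illustration and are not taken from Table 7.

```python
MIN_INPUT_NG = 5.0     # lower bound stated in the text
MAX_INPUT_NG = 100.0   # upper bound stated in the text


def template_volume_ul(conc_ng_per_ul: float,
                       target_ng: float = 50.0,        # hypothetical target mass
                       max_volume_ul: float = 10.0):   # hypothetical volume limit
    """Return the template volume (uL) for a given Qubit concentration (ng/uL)."""
    if conc_ng_per_ul <= 0:
        raise ValueError("concentration must be positive")
    # Aim for the target mass, but never exceed the assumed volume limit.
    volume = min(target_ng / conc_ng_per_ul, max_volume_ul)
    mass = volume * conc_ng_per_ul
    # Reject samples that cannot reach the 5-100 ng window.
    if not MIN_INPUT_NG <= mass <= MAX_INPUT_NG:
        raise ValueError(f"input mass {mass:.2f} ng is outside 5-100 ng")
    return round(volume, 2)
```

A sample that is too dilute to reach 5 ng within the assumed volume limit raises an error, which signals that the DNA should be concentrated or re-extracted.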
TABLE 8 reaction conditions
3.3 magnetic bead purification
3.3.1 After amplification, remove the PCR tubes and add 0.9 volumes of magnetic beads (27 μL) to each 30 μL reaction system. Mix by pipetting, let stand for 5 min, centrifuge briefly, and place the PCR tubes on a magnetic rack for 3 min until the solution is clear.
3.3.2 Remove the supernatant completely, take the PCR tubes off the magnetic rack, add 50 μL of primer digestion solution to each tube, mix by pipetting, and let stand at room temperature for 5 min. Centrifuge briefly and place the PCR tubes on the magnetic rack for 3 min until the solution is clear.
3.3.3 Discard the supernatant, add 180 μL of 80% ethanol, and let stand for 30 s; discard the supernatant, add another 180 μL of 80% ethanol, and let stand for 30 s.
3.3.4 Discard the supernatant, centrifuge briefly, and remove the residual ethanol at the bottom of the tube.
3.3.5 Air-dry the magnetic beads, add 20 μL of ddH2O, take the PCR tubes off the magnetic rack, mix by pipetting, and let stand at room temperature for 2 min. Centrifuge briefly and place the tubes on the magnetic rack for 2 min until the solution is clear.
3.3.6 Transfer the supernatant to new PCR tubes for subsequent library construction.
3.4 Round 2 PCR: addition of linker sequences
Linker sequences were added to the PCR amplification products obtained in step 3.3. The reaction system is shown in Table 9 and the reaction conditions in Table 10.
TABLE 9 Reaction system for adding linker sequences
Reagent | Volume (μL) |
---|---|
ddH2O | 2 |
 | 2.5 |
index primer (I5/I7) (5 μM) | 2 |
PCR product from the previous round | 13.5 |
 | 10 |
Total | 30 |
TABLE 10 reaction conditions
3.5 round 2 magnetic bead purification
Same as step 3.3.
3.6 library quantification and quality control
Take 1 μL of library, measure its concentration with Qubit, and record the result; take another 1 μL of library and determine the library fragment length with the Agilent DNA 1000 kit or an equivalent instrument and reagent.
3.7 high throughput sequencing and data analysis
3.7.1 The library obtained in step 3.5 was sequenced on a second-generation high-throughput platform such as Illumina NovaSeq or NextSeq CN500 to obtain the raw sequencing reads.
3.7.2 Sequencing adapters and low-quality bases (base quality < 10) were removed from the raw sequencing data with fastp (https://github.com/OpenGene/fastp), and reads shorter than 40 bp were filtered out to obtain clean data. The clean data were aligned to the hg38 reference genome with the BWA-MEM algorithm to obtain the mapped position of each read.
3.7.3 After step 3.7.2, nuclear genome single nucleotide variants (SNV) and insertion/deletion variants (INDEL) were detected with GATK (https://github.com/broadinstitute/gatk/releases) HaplotypeCaller, and mitochondrial genome single nucleotide variants were detected with Mutect2, yielding information such as read coverage depth and mutation frequency at each mutation site.
3.7.4 The information obtained in step 3.7.3 was annotated with ANNOVAR (https://annovar.openbioinformatics.org/en/latest/). The zygosity of nuclear gene variants was determined from the mutation frequency of the nuclear genome SNVs and INDELs: a site with mutation frequency ≥ 0.95 is a homozygous mutation, and a site with mutation frequency ≥ 0.15 and < 0.95 is a heterozygous mutation. Mitochondrial genome sites with mutation frequency < 5% were filtered out and treated as wild type.
3.7.5 The gene mutation sites of the sample to be tested for the 3 monogenic genetic diseases (SNP and Indel mutations only) were output to an Excel file.
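The frequency-based calls in step 3.7.4 can be sketched as follows. This is a minimal illustration, not the kit's analysis software: the function names are hypothetical, the thresholds are the ones stated in the step, and a frequency of exactly 0.95 is treated as homozygous as an assumption about the boundary.

```python
from typing import Optional

HOM_MIN_VAF = 0.95    # nuclear: at or above this frequency -> homozygous
HET_MIN_VAF = 0.15    # nuclear: at or above this frequency -> heterozygous
MITO_MIN_VAF = 0.05   # mitochondrial: below this frequency -> wild type


def nuclear_zygosity(alt_reads: int, depth: int) -> Optional[str]:
    """Classify a nuclear SNV/INDEL site from its mutation frequency."""
    if depth <= 0:
        return None
    vaf = alt_reads / depth
    if vaf >= HOM_MIN_VAF:
        return "homozygous"
    if vaf >= HET_MIN_VAF:
        return "heterozygous"
    return None  # frequency too low to report a nuclear variant


def keep_mito_variant(alt_reads: int, depth: int,
                      min_vaf: float = MITO_MIN_VAF) -> bool:
    """Return True when a mitochondrial site passes the frequency filter."""
    return depth > 0 and alt_reads / depth >= min_vaf
```

The mitochondrial cutoff is exposed as a parameter so that a different threshold can be applied without changing the function.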
3.7.6 Reads aligned to β-actin, SMN1 exon 7 and exon 8, and SMN2 exon 7 and exon 8 were extracted from the BAM file, and reads with mapping quality < 60 and reads with more than 3 mismatched bases were filtered out to obtain a filtered alignment file;
3.7.7 Based on this file, the average read coverage depth of each target region was counted, and the average coverage depths of exon 7 and exon 8 of the SMN1 and SMN2 genes were divided by the average coverage depth of β-actin to obtain the copy-number values S1-7, S1-8, S2-7 and S2-8;
3.7.8 A data set of S1-7, S1-8, S2-7 and S2-8 values was obtained from known, validated SMA-negative samples, and the medians of S1-7 and S1-8 were calculated;
3.7.9 The ratios of S1-7 and S1-8 of the sample to be tested to those of the SMA-negative sample set were calculated to judge the deletion type of the sample;
3.7.10 Whether the sample to be tested is from a spinal muscular atrophy patient or a carrier was judged as follows:
S1-7 = average coverage depth of SMN1 exon 7 / average coverage depth of β-actin;
S1-8 = average coverage depth of SMN1 exon 8 / average coverage depth of β-actin;
S2-7 = average coverage depth of SMN2 exon 7 / average coverage depth of β-actin;
S2-8 = average coverage depth of SMN2 exon 8 / average coverage depth of β-actin;
Ratio(S1-7) = S1-7 / median S1-7 of the negative control set;
Ratio(S1-8) = S1-8 / median S1-8 of the negative control set;
if Ratio(S1-7) ≤ 0.2 and Ratio(S1-8) ≤ 0.2, the sample to be tested has a homozygous deletion, i.e. the SMN1 copy number is 0;
if Ratio(S1-7) ≤ 0.2 and 0.2 < Ratio(S1-8) ≤ 0.7, the sample to be tested has an exon 7 homozygous deletion and an exon 8 heterozygous deletion;
if 0.2 < Ratio(S1-7) ≤ 0.7 and Ratio(S1-8) ≤ 0.2, the sample to be tested has an exon 7 heterozygous deletion and an exon 8 homozygous deletion;
if 0.2 < Ratio(S1-7) ≤ 0.7 and 0.2 < Ratio(S1-8) ≤ 0.7, the sample to be tested has an SMN1 heterozygous deletion;
if S2-7 or S2-8 ≤ 0.3, SMN2 exon 7 or exon 8 has 0 copies;
if 0.3 < S2-7 or S2-8 ≤ 0.7, SMN2 exon 7 or exon 8 has 1 copy;
if 0.7 < S2-7 or S2-8 ≤ 1.2, SMN2 exon 7 or exon 8 has 2 copies;
if 1.2 < S2-7 or S2-8 ≤ 1.7, SMN2 exon 7 or exon 8 has 3 copies;
if 1.7 < S2-7 or S2-8 ≤ 2.2, SMN2 exon 7 or exon 8 has 4 copies;
if S2-7 or S2-8 > 2.2, SMN2 exon 7 or exon 8 has 5 or more copies;
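The judging rules of step 3.7.10 can be sketched in a few lines. The code below is an illustrative reading of those rules rather than the kit's software: the function names are hypothetical, and reporting "no deletion" for a ratio above 0.7 is an assumption, since the rules do not cover that range.

```python
from bisect import bisect_left
from statistics import median

# Upper bounds of the SMN2 intervals for 0, 1, 2, 3 and 4 copies.
SMN2_UPPER_BOUNDS = [0.3, 0.7, 1.2, 1.7, 2.2]


def normalized_depth(target_depth: float, actin_depth: float) -> float:
    """S value: average target depth divided by average beta-actin depth."""
    return target_depth / actin_depth


def exon_state(ratio: float) -> str:
    """Map a Ratio(S1-x) value to the deletion state of one SMN1 exon."""
    if ratio <= 0.2:
        return "homozygous deletion"
    if ratio <= 0.7:
        return "heterozygous deletion"
    return "no deletion"  # assumption: ratios above 0.7 are not covered by the rules


def classify_smn1(s1_7, s1_8, negative_s1_7, negative_s1_8):
    """Return the (exon 7, exon 8) states relative to the negative-set medians."""
    ratio_7 = s1_7 / median(negative_s1_7)
    ratio_8 = s1_8 / median(negative_s1_8)
    return exon_state(ratio_7), exon_state(ratio_8)


def smn2_copies(s2_value: float) -> int:
    """SMN2 exon copy number from S2-7 or S2-8; 5 means 5 or more copies."""
    return bisect_left(SMN2_UPPER_BOUNDS, s2_value)
```

`bisect_left` counts how many interval upper bounds lie strictly below the value, which reproduces the "greater than lower bound, less than or equal to upper bound" intervals without a chain of comparisons.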
3.7.11 The read coverage depth of the target regions was counted, and the sequencing depth of the target regions of the HBB, HBA1 and HBA2 genes was normalized by the average sequencing depth of the β-actin gene to obtain the Z value (Z value = target region sequencing depth / β-actin average sequencing depth × 2000).
3.7.12 The ratio of the Z value of the test sample to the median Z value of the negative control sample set was calculated, and the mutation breakpoint was determined based on an HMM algorithm.
3.7.13 The deletion copy number of the thalassemia-related genes of the sample to be tested was judged by the following criteria:
judgment value B = log2(Z value of test sample / median Z value of negative control sample set);
if B ≤ -1.4, the sample to be tested has a homozygous deletion, i.e. the gene copy number is 0;
if -1.4 < B ≤ -0.6, the sample to be tested has a heterozygous deletion, i.e. the gene copy number is 1;
if -0.6 < B ≤ 0.6, the copy number of the sample to be tested is normal, i.e. the gene copy number is 2.
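The normalization and judgment value used for the thalassemia deletions can be sketched as below. This is a minimal example under stated assumptions: the function names are hypothetical, the Z value is read as the depth ratio multiplied by 2000, the HMM breakpoint step is not shown, and values of B above 0.6 return `None` because the criteria do not cover them.

```python
from math import log2
from statistics import median
from typing import Optional, Sequence


def z_value(target_depth: float, actin_depth: float) -> float:
    """Target-region depth normalized by beta-actin depth, scaled by 2000."""
    return target_depth / actin_depth * 2000


def judgment_b(sample_z: float, negative_zs: Sequence[float]) -> float:
    """log2 of the sample Z value over the negative-set median Z value."""
    if sample_z <= 0:
        return float("-inf")  # no coverage at all: treat as the deepest loss
    return log2(sample_z / median(negative_zs))


def copy_number_from_b(b: float) -> Optional[int]:
    """Gene copy number according to the three judgment intervals."""
    if b <= -1.4:
        return 0   # homozygous deletion
    if b <= -0.6:
        return 1   # heterozygous deletion
    if b <= 0.6:
        return 2   # normal copy number
    return None    # above 0.6: not covered by the criteria
```

A heterozygous deletion halves the expected depth, so B is close to log2(0.5) = -1, which falls in the middle of the -1.4 to -0.6 interval.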
3.8 national reference and simulated clinical sample detection results
A total of 41 national reference and simulated clinical samples were tested with the kit of the invention. Overall performance met expectations, and the proportion of the target region covered at 100X or more in the negative samples was close to 100%. Basic sample quality control information is shown in Table 11.
Table 11 sample basic data quality control
Among the 39 national reference samples for hereditary hearing loss and thalassemia, there were 30 positive samples with SNP and Indel mutation types, 3 hereditary hearing loss negative national reference samples, and 2 thalassemia negative national reference samples. All SNP and Indel mutation sites were detected with the kit; the detailed results are shown in Table 12.
TABLE 12 SNP and INDEL mutation site detection
The 11 thalassemia national reference samples containing large fragment deletions and the 2 simulated clinical SMA samples containing exon 7 and exon 8 deletions were tested with the kit of the invention, and all mutations were detected accurately, as shown in Table 13.
TABLE 13 Copy number variation detection results
Sample PF02-L-14 contains a mitochondrial heteroplasmic mutation at about 2.5%. With a sequencing data volume of 0.15 G, the site coverage depth was 517X, 21 mutant reads were detected, and the mutation frequency was 4%. The mutation was successfully detected, as shown in FIG. 6, a capability that currently marketed products do not have.
It should be understood that while the present invention has been described by way of example in terms of its preferred embodiments, it is not limited to the above embodiments, but is capable of numerous modifications and variations by those skilled in the art. The reagents, reaction conditions, etc. involved in the preparation of the kit for detecting the mutation types of the various disease-associated genes and the method for detecting the mutation types of the various disease-associated genes can be adjusted and changed accordingly according to specific needs. It will thus be appreciated that those skilled in the art will be able to devise numerous alternative arrangements which, although not explicitly described herein, embody the principles of the invention and are included within its spirit and scope.
Claims (10)
1. A primer set for detecting mutation types of a plurality of disease-associated genes, comprising:
1) The primer for detecting the gene mutation type of hereditary hearing loss is shown as SEQ ID NO. 1-30, wherein the gene mutation type of hereditary hearing loss is point mutation of the genes GJB2, GJB3, SLC26A4 and 12S rRNA;
2) The primer for detecting the gene mutation type of thalassemia is shown as SEQ ID NO. 31-340, wherein the gene mutation type of thalassemia is point mutation, short fragment insertion deletion and large fragment deletion of genes HBA1, HBA2 and HBB;
3) The primer for detecting the gene mutation type of the spinal muscular atrophy is shown as SEQ ID NO. 341-350, wherein the gene mutation type of the spinal muscular atrophy is SMN1 point mutation and SMN1/SMN2 gene 7 and 8 exon copy number variation;
4) The primer for detecting the beta-actin housekeeping gene is shown as SEQ ID NO. 351-352.
2. A kit for simultaneously detecting genetic deafness, thalassemia and spinal muscular atrophy disease-related gene mutation types, comprising the primer set of claim 1.
3. The kit of claim 2, wherein the kit further comprises an amplification enzyme Mix, primer digest, amplification enhancers, index primers, and purified magnetic beads.
4. A method for simultaneously detecting multiple disease-associated gene mutation types, comprising the steps of:
S1, using whole-genome DNA of a sample as a template, performing targeted ultra-multiplex PCR amplification in a single reaction tube with the primer set of claim 1, purifying the amplified fragments with magnetic beads, and removing primers and primer dimers;
S2, constructing a library from the amplified fragments obtained in S1;
S3, sequencing the library constructed in S2 by a high-throughput sequencing method, analyzing the sequencing result by a bioinformatics method to obtain genetic variation information for the various gene mutation related diseases, and thereby judging whether the sample has a corresponding gene mutation type;
wherein the disease is hereditary hearing loss, thalassemia and spinal muscular atrophy;
wherein the gene mutation type of the hereditary hearing loss is point mutation of the genes GJB2, GJB3, SLC26A4 and 12S rRNA;
wherein the gene mutation types of the thalassemia are point mutation, short fragment insertion deletion and large fragment deletion of genes HBA1, HBA2 and HBB;
wherein the gene mutation type of the spinal muscular atrophy is SMN1 point mutation and copy number variation of exons 7 and 8 of the SMN1/SMN2 gene.
5. The method of claim 4, wherein S3 further comprises the steps of:
S31: sequencing the library with a high-throughput sequencer to obtain raw sequencing data;
S32: performing basic quality control on the raw sequencing data obtained in step S31, including filtering out sequencing adapters and low-quality bases to obtain valid data, and aligning the valid data to the human reference genome with BWT algorithm software to obtain an alignment information file with base positions;
S33: splitting the alignment information file obtained in step S32 into two parts, namely an autosomal alignment information file and a mitochondrial genome alignment information file; detecting autosomal SNP and INDEL mutations from the autosomal alignment information file with a Bayesian algorithm; and obtaining mitochondrial SNP mutation information by scoring, sorting and filtering the results of the mitochondrial genome alignment information file.
6. The method according to claim 5, wherein the genetic deafness, thalassemia and spinal muscular atrophy-related gene mutation type judging method is as follows:
1) Method for judging autosomal SNP and INDEL mutations of genetic deafness, thalassemia and spinal muscular atrophy: base mutation frequency (VAF) > 95% is a homozygous mutation; 15% ≤ base mutation frequency ≤ 95% is a heterozygous mutation;
2) Method for judging mitochondrial SNP mutations of hereditary hearing loss: a mutation is present if the base mutation frequency (VAF) > 3%; otherwise no mutation is present;
3) A method for judging spinal muscular atrophy caused by SMN1 gene copy number variation:
a) Extracting the reads aligned to β-actin, SMN1 exon 7 and exon 8, and SMN2 exon 7 and exon 8 from the alignment information file, and filtering out reads with mapping quality < 60 and reads with more than 3 mismatched bases to obtain a filtered sequence alignment information file;
b) Based on the filtered sequence alignment information file of step a), counting the average read coverage depth of each target region, and dividing the average coverage depths of exon 7 and exon 8 of the SMN1 and SMN2 genes by the average coverage depth of β-actin to obtain the copy-number values S1-7, S1-8, S2-7 and S2-8;
c) Obtaining a data set of S1-7, S1-8, S2-7 and S2-8 values from known, validated SMA-negative samples, and calculating the medians of S1-7 and S1-8;
d) Calculating the ratios of S1-7 and S1-8 of the sample to be tested to those of the SMA-negative sample set, and judging the deletion type of the sample;
e) Judging whether the sample to be tested is from a spinal muscular atrophy patient or a carrier; the specific judging method is as follows:
S1-7 = average coverage depth of SMN1 exon 7 / average coverage depth of β-actin;
S1-8 = average coverage depth of SMN1 exon 8 / average coverage depth of β-actin;
S2-7 = average coverage depth of SMN2 exon 7 / average coverage depth of β-actin;
S2-8 = average coverage depth of SMN2 exon 8 / average coverage depth of β-actin;
Ratio(S1-7) = S1-7 / median S1-7 of the negative control set;
Ratio(S1-8) = S1-8 / median S1-8 of the negative control set;
if Ratio(S1-7) ≤ 0.2 and Ratio(S1-8) ≤ 0.2, the sample to be tested has a homozygous deletion, i.e. the SMN1 copy number is 0;
if Ratio(S1-7) ≤ 0.2 and 0.2 < Ratio(S1-8) ≤ 0.7, the sample to be tested has an exon 7 homozygous deletion and an exon 8 heterozygous deletion;
if 0.2 < Ratio(S1-7) ≤ 0.7 and Ratio(S1-8) ≤ 0.2, the sample to be tested has an exon 7 heterozygous deletion and an exon 8 homozygous deletion;
if 0.2 < Ratio(S1-7) ≤ 0.7 and 0.2 < Ratio(S1-8) ≤ 0.7, the sample to be tested has an SMN1 heterozygous deletion;
if S2-7 or S2-8 ≤ 0.3, SMN2 exon 7 or exon 8 has 0 copies;
if 0.3 < S2-7 or S2-8 ≤ 0.7, SMN2 exon 7 or exon 8 has 1 copy;
if 0.7 < S2-7 or S2-8 ≤ 1.2, SMN2 exon 7 or exon 8 has 2 copies;
if 1.2 < S2-7 or S2-8 ≤ 1.7, SMN2 exon 7 or exon 8 has 3 copies;
if 1.7 < S2-7 or S2-8 ≤ 2.2, SMN2 exon 7 or exon 8 has 4 copies;
if S2-7 or S2-8 > 2.2, SMN2 exon 7 or exon 8 has 5 or more copies;
4) Method for judging thalassemia caused by deletion of HBA2, HBA1 and HBB genes:
a) Counting the read coverage depth of the target region, and normalizing the sequencing depth of the target region by the average sequencing depth of the β-actin gene to obtain a Z value, wherein Z value = target region sequencing depth / β-actin average sequencing depth × 2000;
b) Calculating the ratio of the Z value of the test sample to the median Z value of the negative control sample set, and determining the deletion breakpoint based on an HMM algorithm;
c) Judging the deletion copy number of the thalassemia-related gene of the sample to be tested, wherein the judgment criteria are as follows:
judgment value B = log2(Z value of test sample / median Z value of negative control sample set); if B ≤ -1.4, the sample to be tested has a homozygous deletion, i.e. the gene copy number is 0;
if -1.4 < B ≤ -0.6, the sample to be tested has a heterozygous deletion, i.e. the gene copy number is 1;
if -0.6 < B ≤ 0.6, the copy number of the sample to be tested is normal, i.e. the gene copy number is 2.
7. A system for detecting a plurality of disease-associated gene mutation types, comprising:
1) Primer amplification module: using the whole genome DNA of the sample as a template, performing targeted super-multiplex PCR amplification in the same reaction tube by using the primer group as defined in claim 1, purifying the amplified fragments by using magnetic beads, and removing the primers and primer dimers;
2) Constructing a library module: library construction is carried out on the amplified fragments obtained based on the module 1);
3) And a data processing module: sequencing the library constructed based on the module 2) by using a high-throughput sequencing method, analyzing a sequencing result by using a bioinformatics method to obtain genetic variation information of various gene mutation related diseases, and further judging whether the sample has a corresponding gene mutation type;
wherein the disease is hereditary hearing loss, thalassemia and spinal muscular atrophy;
wherein the gene mutation type of the hereditary hearing loss is point mutation of the genes GJB2, GJB3, SLC26A4 and 12S rRNA;
wherein the gene mutation types of the thalassemia are point mutation, short fragment insertion deletion and large fragment deletion of genes HBA1, HBA2 and HBB;
wherein the gene mutation type of the spinal muscular atrophy is SMN1 point mutation and copy number variation of exons 7 and 8 of the SMN1/SMN2 gene.
8. The system of claim 7, wherein the 3) data processing module further comprises the following:
3.1: sequencing the library by using a high-throughput sequencer to obtain raw sequencing data;
3.2: performing basic quality control on the raw sequencing data obtained in step 3.1, including filtering out sequencing adapters and low-quality bases to obtain valid data, and aligning the valid data to the human reference genome with BWT algorithm software to obtain an alignment information file with base positions;
3.3: based on the comparison information file obtained in the step 3.2, splitting the comparison file into two parts, namely an autosomal comparison information file and a mitochondrial genome comparison information file; detecting autosomal SNP and INDEL mutation by using a Bayesian algorithm according to the autosomal comparison information file; and obtaining mitochondrial SNP mutation information by scoring, sequencing and filtering the results of the mitochondrial genome comparison information file.
9. The system of claim 7, wherein the genetic deafness, thalassemia and spinal muscular atrophy-related gene mutation type judgment method is as follows:
1) Method for judging autosomal SNP and INDEL mutations of genetic deafness, thalassemia and spinal muscular atrophy: base mutation frequency (VAF) > 95% is a homozygous mutation; 15% ≤ base mutation frequency ≤ 95% is a heterozygous mutation;
2) Mitochondrial SNP mutations of genetic deafness: a mutation is present if the base mutation frequency (VAF) > 3%; otherwise no mutation is present;
3) A method for judging spinal muscular atrophy caused by SMN1 gene deletion:
a) Extracting the reads aligned to β-actin, SMN1 exon 7 and exon 8, and SMN2 exon 7 and exon 8 from the alignment information file, and filtering out reads with mapping quality < 60 and reads with more than 3 mismatched bases to obtain a filtered alignment information file;
b) Based on the filtered alignment information file of step a), counting the average read coverage depth of each target region, and dividing the average coverage depths of exon 7 and exon 8 of SMN1 and SMN2 by the average coverage depth of β-actin to obtain the copy-number values S1-7, S1-8, S2-7 and S2-8;
c) Obtaining a data set of S1-7, S1-8, S2-7 and S2-8 values from known, validated SMA-negative samples, and calculating the medians of S1-7 and S1-8;
d) Calculating the ratios of S1-7 and S1-8 of the sample to be tested to those of the SMA-negative sample set, and judging the deletion type of the sample;
e) Judging whether the sample to be tested is from a spinal muscular atrophy patient or a carrier; the specific judging method is as follows:
S1-7 = average coverage depth of SMN1 exon 7 / average coverage depth of β-actin;
S1-8 = average coverage depth of SMN1 exon 8 / average coverage depth of β-actin;
S2-7 = average coverage depth of SMN2 exon 7 / average coverage depth of β-actin;
S2-8 = average coverage depth of SMN2 exon 8 / average coverage depth of β-actin;
Ratio(S1-7) = S1-7 / median S1-7 of the negative control set;
Ratio(S1-8) = S1-8 / median S1-8 of the negative control set;
if Ratio(S1-7) ≤ 0.2 and Ratio(S1-8) ≤ 0.2, the sample to be tested has a homozygous deletion, i.e. the SMN1 copy number is 0;
if Ratio(S1-7) ≤ 0.2 and 0.2 < Ratio(S1-8) ≤ 0.7, the sample to be tested has an exon 7 homozygous deletion and an exon 8 heterozygous deletion;
if 0.2 < Ratio(S1-7) ≤ 0.7 and Ratio(S1-8) ≤ 0.2, the sample to be tested has an exon 7 heterozygous deletion and an exon 8 homozygous deletion;
if 0.2 < Ratio(S1-7) ≤ 0.7 and 0.2 < Ratio(S1-8) ≤ 0.7, the sample to be tested has an SMN1 heterozygous deletion;
if S2-7 or S2-8 ≤ 0.3, SMN2 exon 7 or exon 8 has 0 copies;
if 0.3 < S2-7 or S2-8 ≤ 0.7, SMN2 exon 7 or exon 8 has 1 copy;
if 0.7 < S2-7 or S2-8 ≤ 1.2, SMN2 exon 7 or exon 8 has 2 copies;
if 1.2 < S2-7 or S2-8 ≤ 1.7, SMN2 exon 7 or exon 8 has 3 copies;
if 1.7 < S2-7 or S2-8 ≤ 2.2, SMN2 exon 7 or exon 8 has 4 copies;
if S2-7 or S2-8 > 2.2, SMN2 exon 7 or exon 8 has 5 or more copies;
4) Method for judging thalassemia caused by deletion of HBA2, HBA1 and HBB genes:
a) Counting the read coverage depth of the target region, and normalizing the sequencing depth of the target region by the average sequencing depth of the β-actin gene to obtain a Z value, wherein Z value = target region sequencing depth / β-actin average sequencing depth × 2000;
b) Calculating the ratio of the Z value of the test sample to the median Z value of the negative control sample set, and determining the mutation breakpoint based on an HMM algorithm;
c) Judging the deletion copy number of the thalassemia-related gene of the sample to be tested, wherein the judgment criteria are as follows:
judgment value B = log2(Z value of test sample / median Z value of negative control sample set); if B ≤ -1.4, the sample to be tested has a homozygous deletion, i.e. the gene copy number is 0;
if -1.4 < B ≤ -0.6, the sample to be tested has a heterozygous deletion, i.e. the gene copy number is 1;
if -0.6 < B ≤ 0.6, the copy number of the sample to be tested is normal, i.e. the gene copy number is 2.
10. The use of the primer set according to claim 1 for preparing a kit for detecting mutation types of genes related to various diseases, wherein the diseases are hereditary hearing loss, thalassemia and spinal muscular atrophy.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310100120.8A CN116287198A (en) | 2023-02-09 | 2023-02-09 | Primer group, kit and method for detecting mutation types of genes related to various diseases |
Publications (1)
Publication Number | Publication Date |
---|---|
CN116287198A true CN116287198A (en) | 2023-06-23 |
Family
ID=86798764
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202310100120.8A Pending CN116287198A (en) | 2023-02-09 | 2023-02-09 | Primer group, kit and method for detecting mutation types of genes related to various diseases |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN116287198A (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN116386718A (en) * | 2023-05-30 | 2023-07-04 | 北京华宇亿康生物工程技术有限公司 | Method, apparatus and medium for detecting copy number variation |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
JP7081829B2 (en) | Analysis of tumor DNA in cell-free samples | |
AU2021280311A1 (en) | Methods for detection of donor-derived cell-free DNA | |
EP3564391B1 (en) | Method, device and kit for detecting fetal genetic mutation | |
Henderson et al. | An assessment of the apex microarray technology in genotyping patients with Leber congenital amaurosis and early-onset severe retinal dystrophy | |
CN110468192B (en) | Time-of-flight mass spectrometry nucleic acid analysis method for detecting human spinal muscular atrophy gene mutation | |
CN111218506B (en) | Detection kit for copy numbers of SMN1 and SMN2 genes | |
WO2024027569A1 (en) | Haplotype construction method independent of proband | |
WO2022182878A1 (en) | Methods for detection of donor-derived cell-free dna in transplant recipients of multiple organs | |
CN114317728B (en) | Primer group, kit, method and system for detecting multiple mutations in SMA | |
CN116287198A (en) | Primer group, kit and method for detecting mutation types of genes related to various diseases | |
US20180119210A1 (en) | Fetal haplotype identification | |
CN116479103B (en) | Kit for detecting spinal muscular atrophy related genes | |
WO2020244482A1 (en) | Method for detecting smn gene copy number using smnp as reference | |
KR101890810B1 (en) | Noninvasive prenatal diagnosis method for autosomal recessive disease using picodroplet digital PCR and kit therefor | |
Contu et al. | Sex-related bias and exclusion mapping of the nonrecombinant portion of chromosome Y in human type 1 diabetes in the isolated founder population of Sardinia | |
Lalrohlui et al. | Genomic profiling of mitochondrial DNA reveals novel complex gene mutations in familial type 2 diabetes mellitus individuals from Mizo ethnic population, Northeast India | |
JPWO2007032496A1 (en) | Method for determining the risk of developing type 2 diabetes | |
Gibbs et al. | The application of recombinant DNA technology for genetic probing in epidemiology | |
CN117487907B (en) | KCNH2 gene mutant, mutant protein, reagent, kit and application | |
CN116555417A (en) | Site for detecting congenital deafness before embryo implantation, primer combination and application thereof | |
KR102254341B1 (en) | methods for diagnosing the high risk group of Diabetes based on Genetic Risk Score | |
CN116606924A (en) | Detection kit for small mutation of SMN1-SMN2 fusion gene and application thereof | |
Tanimoto et al. | Development and evaluation of a rapid and cost-efficient NGS-based MHC class I genotyping method for macaques by using a prevalent short-read sequencer | |
Feng et al. | Molecular characterization of a novel 83.9-kb deletion of the α-globin upstream regulatory elements by long-read sequencing | |
CN117448444A (en) | Biomarker and primer combination for detecting genetic family G6PD gene disorder of broad bean disease and application of biomarker and primer combination |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||