CN117925808A - Method, kit and system for determining gene copy number - Google Patents
- Publication number
- CN117925808A (application number CN202211318904.XA)
- Authority
- CN
- China
- Prior art keywords
- seq
- gene
- amplification
- atp7a
- rpp40
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12Q—MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
- C12Q1/00—Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
- C12Q1/68—Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
- C12Q1/6876—Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes
- C12Q1/6883—Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for diseases caused by alterations of genetic material
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12Q—MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
- C12Q1/00—Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
- C12Q1/68—Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
- C12Q1/6869—Methods for sequencing
- C12Q1/6872—Methods for sequencing involving mass spectrometry
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12Q—MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
- C12Q2600/00—Oligonucleotides characterized by their use
- C12Q2600/156—Polymorphic or mutational markers
Abstract
The invention relates to a method for determining gene copy number, and to a kit and its use. The method comprises: (1) amplifying the target gene, the reference genes, and the competing substrates of the target gene and the reference genes with amplification primers to obtain amplification products; (2) performing an extension reaction on the amplification products of step (1) with extension primers to obtain extension products; and (3) detecting the extension products of step (2) by matrix-assisted laser desorption/ionization time-of-flight mass spectrometry (MALDI-TOF MS) and calculating the copy number of the target gene from the peak heights of the detection sites; wherein the reference genes comprise at least three genes without copy number variation (CNV).
Description
Technical Field
The invention relates to a method for gene detection based on time-of-flight mass spectrometry nucleic acid analysis. The method can be used for the diagnosis and screening of rare genetic diseases such as human spinal muscular atrophy and Duchenne muscular dystrophy, and for assessing individual differences in drug metabolism, by detecting copy number variation of genes such as SMN1, SMN2, CYP2D6 and DMD.
Background
The MassARRAY system is based on matrix-assisted laser desorption/ionization time-of-flight mass spectrometry (MALDI-TOF MS) and offers very high specificity and sensitivity. It uses MALDI-TOF mass spectrometry to detect DNA molecules accurately. Genetic variants are distinguished by analyzing the mass of each molecule, which eliminates the need for fluorescence or labeling.
CYP2D6 is an enzyme encoded by the human CYP2D6 gene and is expressed mainly in the liver. It is an important member of the CYP enzyme family and one of the enzymes most relevant to drug metabolism: although it accounts for only 2%-9% of total liver enzymes, it takes part in the metabolism of 20%-25% of drugs, including antidepressants, antiarrhythmics and antipsychotics. CYP2D6 is highly polymorphic, and many allelic variants have been found that increase or decrease its enzymatic activity. Its metabolic phenotypes fall into four types: ultrarapid metabolizers (UMs), extensive metabolizers (EMs), intermediate metabolizers (IMs) and poor metabolizers (PMs). These widespread genetic polymorphisms are responsible for differences in drug metabolism among individuals.
Duchenne muscular dystrophy (DMD) is also known as pseudohypertrophic muscular dystrophy. The DMD gene is located at Xp21.2-3 and is the largest known human gene. Its main role is to produce dystrophin, a cytoskeletal protein expressed predominantly on the inner face of skeletal and cardiac muscle cell membranes, whose main function is to stabilize and protect muscle fibers. DMD is inherited in an X-linked recessive manner. Because the causative gene is located on the X chromosome and men have only one X chromosome, a single mutation is sufficient to cause disease. Women have two X chromosomes and are affected only when both copies of the gene are mutated, which is rare, so men are far more prone to Duchenne muscular dystrophy than women. About 400-500 infants with DMD are born every year in China, and about 7 thousand people have been diagnosed with DMD, making China one of the countries with the most DMD patients in the world.
Spinal muscular atrophy (SMA) is an autosomal recessive, progressive motor neuron disease characterized primarily by progressive degeneration of the anterior horn cells of the spinal cord and the motor nuclei of the brainstem. Clinically it presents as progressive, symmetric muscle weakness and atrophy, with proximal muscles more severely affected than distal muscles and the lower limbs more severely affected than the upper limbs. About 1 in 10,000 newborns is affected, and about 1 in 50 people carries the pathogenic gene. According to age of onset and clinical manifestations, SMA can be classified into 4 types: spinal muscular atrophy type I (SMA1, also known as Werdnig-Hoffmann disease), with onset before 6 months; spinal muscular atrophy type II (SMA2), with onset at 6-18 months; spinal muscular atrophy type III (SMA3, also known as Kugelberg-Welander disease), with onset in childhood or adolescence; and spinal muscular atrophy type IV (SMA4), with onset in adulthood. All 4 subtypes share the same pathogenic gene, survival motor neuron 1 (SMN1) [OMIM 600354]. The SMN1 gene is located on chromosome 5, has a total length of about 20 kb and contains 9 exons. SMN2, located in its immediate vicinity, is highly homologous to SMN1, differing by only 5 nucleotides. SMN2 is a modifier gene whose copy number is inversely correlated with the severity of SMA.
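The incidence (about 1/10000) and carrier frequency (about 1/50) quoted above are consistent with each other under a simple model. The arithmetic below is only a sketch: it assumes random mating and that affected children arise only from two carrier parents, neither of which is stated in the source.

```latex
P(\text{affected}) \approx
\underbrace{\tfrac{1}{50}\times\tfrac{1}{50}}_{\text{both parents carriers}}
\times
\underbrace{\tfrac{1}{4}}_{\text{child inherits both mutant alleles}}
= \tfrac{1}{10000}
```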
CN110468192A provides a time-of-flight mass spectrometry nucleic acid analysis method for detecting SMA gene mutations. It quantitatively detects the copy number of sequences related to the SMN1, SMN2, NAIP, H4F5 and GTF2H2 genes, and calculates the relative gene copy number from the peak area values of two internal reference genes and the target gene in the sample under test and in a control sample. CN111020023A provides a method for accurate quantification of gene copy number that uses MassARRAY detection combined with single-base extension and real-competitive PCR, followed by multi-step correction. CN114107451A provides a method for simultaneously amplifying a gene of interest and its competing substrate using a pair of amplification primers modified with locked nucleic acids.
The invention detects the SMN1 and SMN2 genes directly, without needing to detect changes in the NAIP, H4F5 and GTF2H2 genes. It calculates copy number from peak heights, introduces 3 internal reference genes and 1 external reference, applies locked nucleic acid modification to the amplification primers, and uses a UNG-dUTP anti-contamination system to improve the accuracy of the results. The external reference is purchased normal human gDNA with a confirmed SMN1 copy number of 2, which reduces the influence of factors such as differing degradation rates between samples during extraction and loading, and volume errors during loading. The internal reference genes are typically housekeeping genes that are stably expressed in the human body; calculating the ratio of SMN1 to the reference genes minimizes differences in absolute template quantity caused by sample degradation or loading errors.
Some diagnosis and screening of SMA is currently possible with MLPA (multiplex ligation-dependent probe amplification), RT-PCR (reverse transcription polymerase chain reaction) and NGS (next-generation sequencing), but these technologies have the following limitations:
1. MLPA can detect a patient's SMN2 copy number at the same time, but it places high demands on template quality. Its probes are fluorescent and signal intensity is obtained by collecting fluorescence, which is costly and prone to signal deviation. Adjacent fragments must be considered when interpreting CNV, throughput is limited, and the workflow is complex.
2. RT-PCR has low equipment requirements and a fast detection workflow, but it cannot detect SMN1 point mutations, and an additional assay must be designed to detect the patient's SMN2 copy number. Signals are interpreted from fluorescence intensity, and interference from similar sites has to be excluded with methods such as competitive probes; such interference is difficult to exclude effectively, which causes data deviation and makes interpretation difficult. Furthermore, one RT-PCR reaction tube can detect only one fragment, so multiplex detection requires multiple reaction tubes. This easily introduces well-to-well variation, and although the cost is low, detection throughput is limited.
3. NGS can detect SMA in a targeted manner while also covering other myopathies with related phenotypes, but it can serve only as a primary screen: suspected positive results still need to be verified by MLPA or other methods, and NGS has low accuracy and reliability, and higher cost, in SMN2 copy number analysis.
Disclosure of Invention
In view of the above, the invention provides a method and a kit for determining gene copy number using MassARRAY, covering the detection of genes such as SMN1, SMN2, CYP2D6 and DMD. For SMN1 and SMN2 in particular, the copy numbers of both genes can be detected accurately and simultaneously, the permissible template DNA input range is widened, and operation is simplified. The method supports multiplex detection with high throughput, making it well suited to large-volume commercial testing.
Specifically, the key point of the invention is to design multiple amplification primers and extension primers that amplify and extend the target genes, three or more reference genes, and the competing substrates of the target genes and the reference genes, yielding amplification products and extension products. These are detected by mass spectrometry to obtain the peak heights of the detection sites, from which the copy numbers of SMN1 and SMN2 are calculated.
Conventional methods determine the absolute copy number of the gene under test with one or two reference genes. With a single reference, the copy number cannot be determined accurately when the result for that reference is unreliable; with two references, there is no sound basis for choosing between them when their results disagree. The method of the invention therefore uses 3 internal reference genes without CNV and calculates the absolute copy number interval value of the gene under test from their average. This largely removes the risk of indeterminate or erroneous results associated with one or two references and markedly narrows the gray zone in data interpretation, so the result is more accurate. The template DNA input range is also widened: accurate determination is possible with about 10 ng to about 30 ng of DNA extracted from whole blood (measured with a NanoDrop 2000), which makes the method easy to operate.
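The benefit of a third reference gene can be illustrated with a short sketch. The function below averages per-reference copy number estimates and flags disagreement between them. The function name, the `max_spread` tolerance and the example values are hypothetical illustrations, not values taken from the patent.

```python
from statistics import mean


def summarize_references(estimates, max_spread=0.3):
    """Average per-reference estimates and flag disagreement.

    estimates: dict mapping reference gene name -> copy number estimate
               of the target gene obtained with that reference.
    max_spread: hypothetical tolerance for (max - min); an assumption,
                not a value from the source.
    Returns (average, spread, consistent).
    """
    values = list(estimates.values())
    average = mean(values)
    spread = max(values) - min(values)
    return average, spread, spread <= max_spread


# Illustrative values only: three references that disagree by 0.5 copies.
average, spread, consistent = summarize_references(
    {"RPP40": 2.0, "ATP7A": 2.25, "COL4A5": 1.75}
)
print(average, spread, consistent)
```

With only two references a disagreement cannot be attributed to either one; with three, the average is less sensitive to a single outlying reference and the spread gives a simple consistency check.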
Furthermore, the method adds locked nucleic acid modifications to the PCR primers and sets differential bases on the competing substrates, which reduces the amplification efficiency of the competing substrates so that they can be added and detected within a stable concentration range. A UNG-dUTP anti-contamination system is also used to eliminate sources of contamination and avoid false negative results. Compared with standard MassARRAY, the method adds competing substrates for the CNV detection sites at the amplification stage. By measuring the peak height ratio against the competing substrate, it can accurately distinguish SMN1 copy numbers of 0, 1 and ≥2, giving a finer and more accurate result that suits large-volume commercial testing.
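As a minimal sketch of the final interpretation step, an absolute copy number interval value can be mapped onto the three categories named above (0, 1, ≥2 copies). The cutoffs `zero_max` and `one_max` are hypothetical placeholders; the source does not state the interval boundaries, and real cutoffs would have to be established by validation.

```python
def classify_copy_number(iv, zero_max=0.3, one_max=1.5):
    """Map an interval value (IV) to an SMN1 copy number category.

    zero_max and one_max are hypothetical cutoffs chosen for illustration;
    they are assumptions and do not come from the source.
    """
    if iv < 0:
        raise ValueError("interval value must be non-negative")
    if iv <= zero_max:
        return "0"
    if iv <= one_max:
        return "1"
    return ">=2"


for value in (0.05, 1.0, 2.1):
    print(value, classify_copy_number(value))
```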
The term "CNV" as used in the present invention refers to copy number variation.
The term "QC" as used in the present invention is a competing substrate.
The term "IV" as used in the present invention is an absolute copy number interval value.
Specifically, according to a first aspect of the present invention, there is provided a method for determining the copy number of a gene, comprising: 1) amplifying the target genes, the reference genes, and the competing substrates of the target genes and the reference genes with amplification primers to obtain amplification products; 2) performing an extension reaction on the amplification products of step 1) with extension primers to obtain extension products; 3) detecting the extension products of step 2) with a mass spectrometer by matrix-assisted laser desorption/ionization time-of-flight mass spectrometry, and calculating the copy number of the target gene from the peak height of each detection site; wherein the reference genes comprise at least three genes without CNV.
In one embodiment, the gene of interest comprises SMN1, SMN2, CYP2D6, DMD.
In one embodiment, the reference genes include RPP40, ATP7A, and COL4A5.
In one embodiment, the amplification primer is a locked nucleic acid modified amplification primer.
In one embodiment, the amplification reaction of step 1) employs a UNG-dUTP anti-contamination system.
In one embodiment, the amplification primers for the gene of interest are the amplification primers for Exon 7 of SMN1 and SMN2, as shown in SEQ ID NOs: 1-2.
In one embodiment, the extension primer for the gene of interest is the extension primer for Exon 7 of SMN1 and SMN2, as shown in SEQ ID NO: 9.
In one embodiment, the amplification primers for the reference genes are selected from the group consisting of: i) amplification primers for Intron 6 of RPP40, as shown in SEQ ID NOs: 3-4; ii) amplification primers for Exon 1 of ATP7A, as shown in SEQ ID NOs: 5-6; iii) amplification primers for Exon 41 of COL4A5, as shown in SEQ ID NOs: 7-8.
In one embodiment, the extension primers for the reference genes are selected from the group consisting of: i) the extension primer for Intron 6 of RPP40, as shown in SEQ ID NO: 10; ii) the extension primer for Exon 1 of ATP7A, as shown in SEQ ID NO: 11; iii) the extension primer for Exon 41 of COL4A5, as shown in SEQ ID NO: 12.
In one embodiment, the competing substrate is selected from the group consisting of: i) a competing substrate for SMN1, as shown in SEQ ID NO: 13; ii) a competing substrate for RPP40, as shown in SEQ ID NO: 14; iii) a competing substrate for ATP7A, as shown in SEQ ID NO: 15; iv) a competing substrate for COL4A5, as shown in SEQ ID NO: 16.
In one embodiment, SMN1 and SMN2 copy numbers are calculated by a formula (not reproduced in this text) in which:
- H_{N-SMN} is the peak height of SMN1-2 E7 in the sample under test;
- H_{N_QC-SMN} is the peak height of the QC corresponding to SMN1-2 E7 in the sample under test;
- H_{N-RPP40} is the peak height of RPP40 in the sample under test;
- H_{N_QC-RPP40} is the peak height of the QC corresponding to RPP40 in the sample under test;
- H_{N-ATP7A} is the peak height of ATP7A in the sample under test;
- H_{N_QC-ATP7A} is the peak height of the QC corresponding to ATP7A in the sample under test;
- H_{N-COL4A5} is the peak height of COL4A5 in the sample under test;
- H_{N_QC-COL4A5} is the peak height of the QC corresponding to COL4A5 in the sample under test;
- H_{gDNA-SMN} is the peak height of SMN1-2 E7 in the gDNA;
- H_{gDNA_QC-SMN} is the peak height of the QC corresponding to SMN1-2 E7 in the gDNA;
- H_{gDNA-RPP40} is the peak height of RPP40 in the gDNA;
- H_{gDNA_QC-RPP40} is the peak height of the QC corresponding to RPP40 in the gDNA;
- H_{gDNA-ATP7A} is the peak height of ATP7A in the gDNA;
- H_{gDNA_QC-ATP7A} is the peak height of the QC corresponding to ATP7A in the gDNA;
- H_{gDNA-COL4A5} is the peak height of COL4A5 in the gDNA;
- H_{gDNA_QC-COL4A5} is the peak height of the QC corresponding to COL4A5 in the gDNA.
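Because the formula itself does not appear in this text, the sketch below shows only one plausible way the listed peak heights could be combined; it is an assumption, not the patent's formula. Each locus is first normalized to its competing substrate (QC), the target is then normalized to each reference gene, the result is scaled against the external gDNA reference (confirmed SMN1 copy number of 2, per the description), and the three per-reference estimates are averaged. The dictionary keys and function names are hypothetical.

```python
from statistics import mean

REFERENCES = ("RPP40", "ATP7A", "COL4A5")


def normalized_ratio(peaks, reference):
    """(target / its QC) divided by (reference / its QC) for one well.

    peaks: dict of peak heights with hypothetical keys
           "SMN", "QC-SMN", "<ref>", "QC-<ref>".
    """
    target = peaks["SMN"] / peaks["QC-SMN"]
    ref = peaks[reference] / peaks["QC-" + reference]
    return target / ref


def estimate_copy_number(sample, gdna, gdna_copies=2):
    """Assumed ratio-of-ratios estimate, averaged over three references."""
    per_reference = [
        gdna_copies * normalized_ratio(sample, r) / normalized_ratio(gdna, r)
        for r in REFERENCES
    ]
    return mean(per_reference)


# Made-up peak heights for illustration only.
gdna = {"SMN": 8.0, "QC-SMN": 4.0,
        "RPP40": 6.0, "QC-RPP40": 3.0,
        "ATP7A": 6.0, "QC-ATP7A": 3.0,
        "COL4A5": 6.0, "QC-COL4A5": 3.0}
sample = dict(gdna, SMN=4.0)  # target peak halved relative to the gDNA

print(estimate_copy_number(sample, gdna))
```

In this made-up example the target-to-QC ratio of the sample is half that of the 2-copy gDNA while the reference ratios are unchanged, so the sketch returns an estimate of one copy.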
In one embodiment, the final concentration of the competing substrate for RPP40 is 0.5-8 ng/μL.
According to a preferred embodiment, the final concentration of the competing substrate for RPP40 is 0.6-7.5 ng/μL.
According to a preferred embodiment, the final concentration of the competing substrate for RPP40 is 0.7-7 ng/μL.
According to a preferred embodiment, the final concentration of the competing substrate for RPP40 is 0.8-6.5 ng/μL.
According to a preferred embodiment, the final concentration of the competing substrate for RPP40 is 0.81-6.06 ng/μL.
In one embodiment, the final concentration of the competing substrate for ATP7A is 1.5-18 ng/μL.
According to a preferred embodiment, the final concentration of the competing substrate for ATP7A is 1.5-17.5 ng/μL.
According to a preferred embodiment, the final concentration of the competing substrate for ATP7A is 1.55-17 ng/μL.
According to a preferred embodiment, the final concentration of the competing substrate for ATP7A is 1.6-16.5 ng/μL.
According to a preferred embodiment, the final concentration of the competing substrate for ATP7A is 1.63-16.26 ng/μL.
In one embodiment, the final concentration of the competing substrate for COL4A5 is 1.5-10 ng/μL.
According to a preferred embodiment, the final concentration of the competing substrate for COL4A5 is 1.55-9.5 ng/μL.
According to a preferred embodiment, the final concentration of the competing substrate for COL4A5 is 1.6-9 ng/μL.
According to a preferred embodiment, the final concentration of the competing substrate for COL4A5 is 1.65-8.5 ng/μL.
According to a preferred embodiment, the final concentration of the competing substrate for COL4A5 is 1.73-8.49 ng/μL.
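As an illustration only, the broadest concentration ranges listed in the embodiments above could be kept in a small lookup table and used to check a planned competing-substrate concentration. The dictionary name and the function `in_stated_range` are hypothetical helpers introduced for this sketch; only the numeric limits come from the text.

```python
# Broadest final-concentration ranges (ng/uL) stated in the embodiments above.
# The container and function names are hypothetical; only the limits are from the text.
COMPETING_SUBSTRATE_RANGES = {
    "RPP40": (0.5, 8.0),
    "ATP7A": (1.5, 18.0),
    "COL4A5": (1.5, 10.0),
}


def in_stated_range(gene: str, conc_ng_per_ul: float) -> bool:
    """Return True if the concentration lies inside the broadest stated range."""
    low, high = COMPETING_SUBSTRATE_RANGES[gene]
    return low <= conc_ng_per_ul <= high


if __name__ == "__main__":
    for gene, conc in [("RPP40", 4.0), ("ATP7A", 18.0), ("COL4A5", 12.0)]:
        print(gene, conc, in_stated_range(gene, conc))
```

The narrower preferred ranges could be added as further entries in the same way.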
According to a second aspect of the present invention there is provided a kit for determining the copy number of a gene comprising: amplification primers and extension primers for a gene of interest, and competing substrates for the gene of interest; and amplification primers and extension primers for the reference gene, and a competing substrate for the reference gene, wherein the reference gene comprises at least three genes without CNV variation.
In one embodiment, the gene of interest comprises SMN1, SMN2, CYP2D6, DMD.
In one embodiment, the reference genes include RPP40, ATP7A, and COL4A5.
In one embodiment, the amplification primer is a locked nucleic acid modified amplification primer.
In one embodiment, the amplification primers for the gene of interest are: amplification primers of Exon 7 of SMN1 and SMN2, as set forth in SEQ ID NO: 1-2.
In one embodiment, the extension primer of the gene of interest is: the extension primer for Exon 7 of SMN1 and SMN2, as set forth in SEQ ID NO: 9.
In one embodiment, the amplification primers for the reference gene are selected from the group consisting of: i) the amplification primers for Intron 6 of RPP40, as set forth in SEQ ID NO: 3-4; ii) the amplification primers for Exon 1 of ATP7A, as set forth in SEQ ID NO: 5-6; iii) the amplification primers for Exon 41 of COL4A5, as set forth in SEQ ID NO: 7-8.
In one embodiment, the extension primer of the reference gene is selected from the group consisting of: i) the extension primer for Intron 6 of RPP40, as set forth in SEQ ID NO: 10; ii) the extension primer for Exon 1 of ATP7A, as set forth in SEQ ID NO: 11; iii) the extension primer for Exon 41 of COL4A5, as set forth in SEQ ID NO: 12.
In one embodiment, the competing substrate is selected from the group consisting of: i) a competing substrate for SMN1, as set forth in SEQ ID NO: 13; ii) a competing substrate for RPP40, as set forth in SEQ ID NO: 14; iii) a competing substrate for ATP7A, as set forth in SEQ ID NO: 15; iv) a competing substrate for COL4A5, as set forth in SEQ ID NO: 16.
In one embodiment, the kit is for detecting spinal muscular atrophy in a subject.
In one embodiment, the kit is for detecting Duchenne muscular dystrophy in a subject.
According to a third aspect of the present invention there is provided a system for determining the copy number of a gene comprising: 1) Amplification module: amplifying the target gene, the reference gene and the competing substrates of the target gene and the reference gene by using an amplification primer to obtain an amplification product; 2) Extension module: performing an extension reaction on the amplification product obtained in the step 1) by using an extension primer to obtain an extension product; 3) Mass spectrometry module: mass spectrum detection is carried out on the extension product in the step 2) by using a mass spectrometer through a matrix-assisted laser desorption ionization time-of-flight mass spectrometry technology, and the copy number of the target gene is calculated through the peak height of each detection site; wherein the reference genes comprise at least three genes without CNV variation.
In one embodiment, the gene of interest comprises SMN1, SMN2, CYP2D6, DMD.
In one embodiment, the reference genes include RPP40, ATP7A, and COL4A5.
In one embodiment, the amplification primer is a locked nucleic acid modified amplification primer.
In one embodiment, the amplification reaction of step 1) employs a UNG-dUTP anti-contamination system.
In one embodiment, the amplification primers for the gene of interest are: amplification primers of Exon 7 of SMN1 and SMN2, as set forth in SEQ ID NO: 1-2.
In one embodiment, the extension primer of the gene of interest is: the extension primer for Exon 7 of SMN1 and SMN2, as set forth in SEQ ID NO: 9.
In one embodiment, the amplification primers for the reference gene are selected from the group consisting of: i) the amplification primers for Intron 6 of RPP40, as set forth in SEQ ID NO: 3-4; ii) the amplification primers for Exon 1 of ATP7A, as set forth in SEQ ID NO: 5-6; iii) the amplification primers for Exon 41 of COL4A5, as set forth in SEQ ID NO: 7-8.
In one embodiment, the extension primer of the reference gene is selected from the group consisting of: i) the extension primer for Intron 6 of RPP40, as set forth in SEQ ID NO: 10; ii) the extension primer for Exon 1 of ATP7A, as set forth in SEQ ID NO: 11; iii) the extension primer for Exon 41 of COL4A5, as set forth in SEQ ID NO: 12.
In one embodiment, the competing substrate is selected from the group consisting of: i) a competing substrate for SMN1, as set forth in SEQ ID NO: 13; ii) a competing substrate for RPP40, as set forth in SEQ ID NO: 14; iii) a competing substrate for ATP7A, as set forth in SEQ ID NO: 15; iv) a competing substrate for COL4A5, as set forth in SEQ ID NO: 16.
In one embodiment, the formula for calculating SMN1 and SMN2 copy numbers is:
Wherein: H_{N-SMN} is the peak height of SMN1-2 E7 in the sample to be tested; H_{N_QC-SMN} is the peak height of the QC corresponding to SMN1-2 E7 in the sample to be tested; H_{N-RPP40} is the peak height of RPP40 in the sample to be tested; H_{N_QC-RPP40} is the peak height of the QC corresponding to RPP40 in the sample to be tested; H_{N-ATP7A} is the peak height of ATP7A in the sample to be tested; H_{N_QC-ATP7A} is the peak height of the QC corresponding to ATP7A in the sample to be tested; H_{N-COL4A5} is the peak height of COL4A5 in the sample to be tested; H_{N_QC-COL4A5} is the peak height of the QC corresponding to COL4A5 in the sample to be tested; H_{gDNA-SMN} is the peak height of SMN1-2 E7 in the gDNA sample; H_{gDNA_QC-SMN} is the peak height of the QC corresponding to SMN1-2 E7 in the gDNA sample; H_{gDNA-RPP40} is the peak height of RPP40 in the gDNA sample; H_{gDNA_QC-RPP40} is the peak height of the QC corresponding to RPP40 in the gDNA sample; H_{gDNA-ATP7A} is the peak height of ATP7A in the gDNA sample; H_{gDNA_QC-ATP7A} is the peak height of the QC corresponding to ATP7A in the gDNA sample; H_{gDNA-COL4A5} is the peak height of COL4A5 in the gDNA sample; H_{gDNA_QC-COL4A5} is the peak height of the QC corresponding to COL4A5 in the gDNA sample.
In one embodiment, the final concentration of the competing substrate for RPP40 is 0.5-8 ng/μL.
According to a preferred embodiment, the final concentration of the competing substrate for RPP40 is 0.6-7.5 ng/μL.
According to a preferred embodiment, the final concentration of the competing substrate for RPP40 is 0.7-7 ng/μL.
According to a preferred embodiment, the final concentration of the competing substrate for RPP40 is 0.8-6.5 ng/μL.
According to a preferred embodiment, the final concentration of the competing substrate for RPP40 is 0.81-6.06 ng/μL.
In one embodiment, the final concentration of the competing substrate for ATP7A is 1.5-18 ng/μL.
According to a preferred embodiment, the final concentration of the competing substrate for ATP7A is 1.5-17.5 ng/μL.
According to a preferred embodiment, the final concentration of the competing substrate for ATP7A is 1.55-17 ng/μL.
According to a preferred embodiment, the final concentration of the competing substrate for ATP7A is 1.6-16.5 ng/μL.
According to a preferred embodiment, the final concentration of the competing substrate for ATP7A is 1.63-16.26 ng/μL.
In one embodiment, the final concentration of the competing substrate for COL4A5 is 1.5-10 ng/μL.
According to a preferred embodiment, the final concentration of the competing substrate for COL4A5 is 1.55-9.5 ng/μL.
According to a preferred embodiment, the final concentration of the competing substrate for COL4A5 is 1.6-9 ng/μL.
According to a preferred embodiment, the final concentration of the competing substrate for COL4A5 is 1.65-8.5 ng/μL.
According to a preferred embodiment, the final concentration of the competing substrate for COL4A5 is 1.73-8.49 ng/μL.
According to a fourth aspect of the present invention there is provided the use of the system in the preparation of a kit for detecting rare diseases.
In one embodiment, the rare disease is spinal muscular atrophy or Duchenne muscular dystrophy.
The main technical advantages of the method, system and kit of the invention are as follows:
1. The absolute copy number of the gene under test is calculated from the average of three reference genes that have no CNV. This minimizes the risk that an indeterminate or erroneous result from one or two reference genes compromises the outcome, so the results are more accurate. It also removes the influence of factors other than the sample's copy number, such as differences in absolute template amount caused by sample degradation or pipetting errors. In addition, calculating with the three-reference average significantly reduces the gray zone in data interpretation relative to a single reference, and widens the usable template DNA input range. In a single-reference system, the IV values of DNA samples at different concentrations overlap and produce a gray zone; for example, 1-copy (carrier) and 2+ copy (normal) samples cannot be told apart between a DNA sample at 10 ng/μL and one at 30 ng/μL. With the three-reference calculation, the maximum IV value of the 1-copy samples and the minimum IV value of the 2+ copy samples do not overlap at any of these DNA concentrations, so DNA extracted from whole blood can be tested at an input of about 10-30 ng (measured with a NanoDrop 2000).
2. Because the competing substrate is synthetic, the competing-substrate method does not depend on the state of the subject's own reference genes or on operator-dependent factors in the extraction process, so batch-to-batch variation is very small. In addition, the sequence of the competing substrate is almost identical to that of the gene under test, which makes the results easy to interpret. The method is an internal-standard method: the competing substrate is amplified in the same reaction as the specimen, and its signal peak appears in the same interpretation window as the signal peak of the analyte, so there is no need to look in another window. Because the signal values are obtained simultaneously, the conversion to CNV values is more reliable.
3. The amplification reaction uses a UNG-dUTP anti-contamination system: UNG enzyme is added to the PCR reaction system, and dUTP and dTTP are added together in a defined ratio, which prevents aerosol contamination by PCR products. Because PCR products are present in very high copy numbers, even a trace of carry-over can cause false-positive results. This approach removes the contamination source thoroughly, and because UNG treatment takes place in the same reaction tube as the PCR, it is simple to perform.
4. The introduced competing substrate is a synthetic sequence in the same reaction well as the genomic DNA, and a synthetic sequence has no steric hindrance to amplification. Introducing locked nucleic acid modifications into the amplification primers reduces the amplification efficiency of the synthetic sequence, which improves the stability of the competing substrate when it and the genome are adjusted to the same copy number.
5. For whole-blood specimens with a DNA concentration above 10 ng/μL, SMA can be detected reliably with MassArray technology, which makes the method suitable for general screening. Single-base extension is used, and detection sites are distinguished by the molecular-weight differences between bases, so the base at each detection site is identified directly and accurately, with high specificity. No fluorescent probes are used, which avoids fluorescence interference between similar sites, so the results are accurate and the cost is low.
Brief description of the drawings
FIGS. 1A-C: Mass spectra of the detection site on SMN1-2 Exon 7. FIG. 1A shows the MassArray spectrum of a 0-copy sample (patient; SMN1 = 0, SMN2 = 2), FIG. 1B shows the MassArray spectrum of a 1-copy sample (carrier; SMN1 = 1, SMN2 = 2), and FIG. 1C shows the MassArray spectrum of a 2+ copy sample (normal; SMN1 = 2, SMN2 = 2).
FIGS. 2A-C: Mass spectra of the detection site on the RPP40 gene. FIG. 2A shows the MassArray spectra of a 2+ copy sample (normal; SMN1 = 2, SMN2 = 2) at three RPP40 competing-substrate concentrations: 0.5 ng/μL, 4.0 ng/μL and 8.0 ng/μL. FIG. 2B shows the MassArray spectra of a 1-copy sample (carrier; SMN1 = 1, SMN2 = 2) at the same three concentrations. FIG. 2C shows the MassArray spectra of a 0-copy sample (patient; SMN1 = 0, SMN2 = 2) at the same three concentrations.
FIGS. 3A-C: Mass spectra of the detection site on the ATP7A gene. FIG. 3A shows the MassArray spectra of a 2+ copy sample (normal; SMN1 = 2, SMN2 = 2) at three ATP7A competing-substrate concentrations: 1.5 ng/μL, 9.0 ng/μL and 18.0 ng/μL. FIG. 3B shows the MassArray spectra of a 1-copy sample (carrier; SMN1 = 1, SMN2 = 2) at the same three concentrations. FIG. 3C shows the MassArray spectra of a 0-copy sample (patient; SMN1 = 0, SMN2 = 2) at the same three concentrations.
FIGS. 4A-C: Mass spectra of the detection site on the COL4A5 gene. FIG. 4A shows the MassArray spectra of a 2+ copy sample (normal; SMN1 = 2, SMN2 = 2) at three COL4A5 competing-substrate concentrations: 1.5 ng/μL, 5.0 ng/μL and 10.0 ng/μL. FIG. 4B shows the MassArray spectra of a 1-copy sample (carrier; SMN1 = 1, SMN2 = 2) at the same three concentrations. FIG. 4C shows the MassArray spectra of a 0-copy sample (patient; SMN1 = 0, SMN2 = 2) at the same three concentrations.
FIG. 5: Calculating with the average of three internal references significantly reduces the gray zone in data interpretation relative to a single internal reference.
FIG. 6: ROC curve of the MassArray method compared with the gold-standard MLPA, showing the sensitivity, specificity, and positive and negative judgment rates of the MassArray method relative to the gold standard.
Detailed Description
The following detailed description of the preferred embodiments of the application, taken in conjunction with the accompanying drawings, is given by way of illustration and not limitation, and any other similar situations are intended to fall within the scope of the application.
Example 1: design of primers and competitor substrates
In this example, based on the pathological characteristics of SMA and the sequence information of the SMN1 and SMN2 genes, amplification primers, competing substrates and extension primers were designed for the corresponding fragment of exon 7 of SMN1 and SMN2, and for the RPP40, ATP7A and COL4A5 genes used as internal reference genes.
Specifically, the information of the amplification primers designed and used is shown in Table 1. Wherein [ X ] represents that the base is modified by a locked nucleic acid, and X may be A, T, C or G. The purpose of PCR is to obtain target DNA.
Table 1: amplification primer information table
Information on the extension primers designed and used is shown in Table 2 below. The extension primers were taken from a portion of the PCR amplified sequence, and the purpose of MassArray was to detect copy number variation of SMA.
Table 2: extension primer information table
The information on the competing substrates designed and used is shown in table 3 below.
Table 3: competitive substrate information table
Here, for example, t indicates that the competing substrate differs from the target gene sequence at the position of base t; the differing base that is introduced is a genotype that does not occur in the human gene.
The sequence information of the amplified products of the target gene and the competing substrate is shown in Table 4 below:
Table 4: amplification product sequence information table
Here, bold black text marks the PCR amplification primer sequences, italics mark the extension primer sequence, underlined bold marks the detection site or the extended base, and bold marks the extension products of the sample DNA and of the competing substrate, respectively.
Example 2: determination of copy number of SMN1 and SMN2 Using MassArray
This example uses the primers and competing substrates described in Example 1 to determine the copy number of SMN1 and SMN2 by MassArray.
2.1 Preparation of reaction Mixed solution
The PCR primer mixture, QC MIX reaction mixture, PCR reaction mixture, SAP reaction mixture, extension primer mixture, and extension reaction mixture used in this example were prepared as shown in tables 5 to 10 below.
Table 5: PCR primer mixture preparation
Table 6: preparing QC MIX reaction mixture, and respectively preparing 3 concentrations of competing substrates
Table 6-1:
table 6-2:
table 6-3:
Table 7: PCR reaction mixture preparation
Table 8: Preparation of the SAP reaction mixture
Component | Volume per reaction |
ddH2O | 1.53 μL |
SAP Buffer | 0.17 μL |
SAP Enzyme | 0.3 μL |
Total | 2 μL |
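When many wells are run, the per-reaction volumes in Table 8 are normally scaled into a master mix. The following is a minimal sketch of that arithmetic; the function name `sap_master_mix` and the `overage` parameter (extra volume to cover pipetting loss) are assumptions made for illustration and are not specified in the text.

```python
# Per-reaction SAP mixture volumes from Table 8, in microlitres.
SAP_MIX_PER_WELL_UL = {"ddH2O": 1.53, "SAP Buffer": 0.17, "SAP Enzyme": 0.30}


def sap_master_mix(n_wells: int, overage: float = 0.10) -> dict:
    """Scale the Table 8 volumes to n_wells.

    `overage` is a hypothetical safety margin (0.10 = 10 % extra);
    the source text does not state one.
    """
    if n_wells <= 0:
        raise ValueError("n_wells must be positive")
    if overage < 0:
        raise ValueError("overage must not be negative")
    factor = n_wells * (1.0 + overage)
    return {name: round(vol * factor, 2) for name, vol in SAP_MIX_PER_WELL_UL.items()}


if __name__ == "__main__":
    # Example: a full 384-well plate with no overage.
    for name, vol in sap_master_mix(384, overage=0.0).items():
        print(f"{name}: {vol} uL")
```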
Table 9: extension primer iPLEX Primer MIX configuration table:
Table 10: extension reaction mixture configuration
2.2 Genomic DNA acquisition
Genomic DNA is extracted from fresh or frozen anticoagulated blood samples using a CRICHO magnetic-bead blood genomic DNA extraction kit.
gDNA is the 2-copy control: Human Genomic DNA purchased from Promega (cat# G1471).
The SMN1, SMN2, RPP40, ATP7A and COL4A5 genes are all present at 2 copies.
2.3 Multiplex amplification steps
2.3.1 PCR amplification step
Mix the PCR reaction mixture prepared in 2.1 and dispense 4 μL into each corresponding well of a 384-well plate, then add 1 μL of the sample to be tested. Each run requires a 2-copy control gDNA and a water blank. Seal the plate with film, centrifuge briefly, place it on the thermal cycler, and amplify according to the following PCR program:
2.3.2 SAP purification step
After the PCR in 2.3.1, add 2 μL of the SAP reaction mixture (prepared in 2.1) to each well of PCR product, seal the plate with film, centrifuge briefly, place it on the thermal cycler, and purify according to the following SAP program:
Temperature | Time |
37 ℃ | 40 min |
85 ℃ | 5 min |
8 ℃ | Hold |
2.3.3 Extension reaction step
After the SAP purification in 2.3.2, add 2 μL of the extension reaction mixture (prepared in 2.1) to each well of purified product, seal the plate with film, centrifuge briefly, place it on the thermal cycler, and extend according to the following extension program:
2.4 Desalting and mass spectrometry
After the extension program finishes, centrifuge the plate briefly. Add 16 μL of sterile double-distilled water to each well, seal with film, and centrifuge briefly. Then set up the plate layout, load the 384-well plate and the chip, and run the typing.
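The volumes added to each well in sections 2.3.1 to 2.4 can be tallied to check the working volume at each stage. The sketch below simply sums the additions stated in the text; it ignores evaporation and any resin added during desalting, and the names `WELL_ADDITIONS_UL` and `cumulative_volumes` are hypothetical.

```python
# Additions per well, in microlitres, in the order described in sections 2.3.1 to 2.4.
WELL_ADDITIONS_UL = [
    ("PCR reaction mixture", 4.0),
    ("sample to be tested", 1.0),
    ("SAP reaction mixture", 2.0),
    ("extension reaction mixture", 2.0),
    ("sterile double-distilled water", 16.0),
]


def cumulative_volumes(additions):
    """Return (step name, running total in uL) after each addition."""
    total = 0.0
    running = []
    for name, volume in additions:
        total += volume
        running.append((name, total))
    return running


if __name__ == "__main__":
    for name, total in cumulative_volumes(WELL_ADDITIONS_UL):
        print(f"after {name}: {total} uL")
```

Under these assumptions the nominal volume per well before mass spectrometry is 25 μL.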
2.5 Data analysis and detection result interpretation
2.5.1 Determination of SNP genotype results at detection sites
The raw file exported by the instrument's companion software, Typer 4.0.0, is an .xml file (path: View - Plate Data Pane); the Call column gives the genotype at each SNP detection site.
2.5.2 Analysis of copy number detection results
The raw file exported by the instrument's companion software, Typer 4.0, is an .xml file (path: View - Plate Data Pane);
the copy number is calculated from the peak height (Height) of each detection site.
The calculation formula is as follows:
Wherein:
H_{N-SMN} is the peak height of SMN1-2 E7 in the sample to be tested;
H_{N_QC-SMN} is the peak height of the QC corresponding to SMN1-2 E7 in the sample to be tested;
H_{N-RPP40} is the peak height of RPP40 in the sample to be tested;
H_{N_QC-RPP40} is the peak height of the QC corresponding to RPP40 in the sample to be tested;
H_{N-ATP7A} is the peak height of ATP7A in the sample to be tested;
H_{N_QC-ATP7A} is the peak height of the QC corresponding to ATP7A in the sample to be tested;
H_{N-COL4A5} is the peak height of COL4A5 in the sample to be tested;
H_{N_QC-COL4A5} is the peak height of the QC corresponding to COL4A5 in the sample to be tested;
H_{gDNA-SMN} is the peak height of SMN1-2 E7 in the gDNA sample;
H_{gDNA_QC-SMN} is the peak height of the QC corresponding to SMN1-2 E7 in the gDNA sample;
H_{gDNA-RPP40} is the peak height of RPP40 in the gDNA sample;
H_{gDNA_QC-RPP40} is the peak height of the QC corresponding to RPP40 in the gDNA sample;
H_{gDNA-ATP7A} is the peak height of ATP7A in the gDNA sample;
H_{gDNA_QC-ATP7A} is the peak height of the QC corresponding to ATP7A in the gDNA sample;
H_{gDNA-COL4A5} is the peak height of COL4A5 in the gDNA sample;
H_{gDNA_QC-COL4A5} is the peak height of the QC corresponding to COL4A5 in the gDNA sample.
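The formula itself is not reproduced in this text, so the following is only a sketch of one plausible way the sixteen peak heights defined above could be combined; it should not be read as the patent's actual formula. It assumes that (a) each gene peak is divided by its corresponding QC peak, (b) the SMN ratio is divided by the mean of the three reference-gene ratios, and (c) the result for the test sample is divided by the same quantity for the 2-copy gDNA control to give an IV value. The dictionary keys, function names and peak heights are all hypothetical.

```python
from statistics import mean

# Reference genes named in the text; the key scheme ("RPP40", "QC-RPP40", ...) is hypothetical.
REFERENCES = ("RPP40", "ATP7A", "COL4A5")


def competitor_ratio(heights: dict, gene: str) -> float:
    """Peak height of a gene divided by the height of its QC peak (assumed form)."""
    return heights[gene] / heights[f"QC-{gene}"]


def normalized_target(heights: dict, target: str = "SMN", references=REFERENCES) -> float:
    """Target ratio divided by the mean of the reference-gene ratios (assumed form)."""
    ref_mean = mean(competitor_ratio(heights, g) for g in references)
    return competitor_ratio(heights, target) / ref_mean


def iv_value(sample: dict, gdna: dict, target: str = "SMN", references=REFERENCES) -> float:
    """Sample value relative to the 2-copy gDNA control (assumed form)."""
    return normalized_target(sample, target, references) / normalized_target(gdna, target, references)


if __name__ == "__main__":
    # Invented peak heights, for illustration only.
    gdna = {"SMN": 1000.0, "QC-SMN": 1000.0, "RPP40": 800.0, "QC-RPP40": 800.0,
            "ATP7A": 900.0, "QC-ATP7A": 900.0, "COL4A5": 700.0, "QC-COL4A5": 700.0}
    sample = dict(gdna, SMN=500.0)  # same references, half the SMN signal
    print(iv_value(sample, gdna))
```

Passing `references=("RPP40",)` gives the single-reference variant of the same assumed form. Converting an IV value into a copy number would require the numerical interpretation ranges, which are not reproduced here.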
The numerical interpretation ranges are as follows:
Results: the mass spectra of detection sites on the SMN1-2 Exon7, RPP40 and ATP7A, COL A5 genes are shown in figures 1-4, and as can be seen from the figures, the reaction systems of 3 concentrations of competing substrates can effectively distinguish 0 copy (patient), 1 copy (carrier) and 2+ copy (normal person); the 98 samples SMN1 and SMN2 copy number results related thereto were tested by massaray method and compared with the MLPA gold standard method, as shown in table 11 below.
Table 11: comparison of detection results of Massary method and MLPA gold-labeled method
To evaluate the sensitivity, specificity and yin-yang judging rate of the massaray method of the present invention compared with the gold standard MLPA, the detection results of the copy numbers of the samples SMN1 and SMN2 of 53 cases as shown in table 12 were increased, the sensitivity, specificity and yin-yang judging rate of the massaray method as shown in table 13 were obtained by analyzing the detection results of the samples of 151 cases (including the 98 cases related to table 11 and the 53 cases related to table 12), and the ROC curve of the massaray method and the gold standard MLPA comparison as shown in fig. 6 was further drawn, and as can be seen from table 13 and fig. 6, the sensitivity, specificity and yin-yang judging rate of the massaray method were all close to the gold standard MLPA.
Table 12: comparison of detection results of Massary method and MLPA gold-labeled method
Table 13: sensitivity, specificity and yin-yang judgment rate of Massary method
Index | Value |
Sensitivity | 96% |
Specificity | 100% |
Positive judgment rate | 100% |
Negative judgment rate | 98% |
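For readers who want to reproduce indices of the kind listed in Table 13 from a comparison against a gold standard, the standard definitions can be sketched as follows. The per-sample counts behind Table 13 are not given in this text, so the counts in the example are invented; the sketch also assumes that the "positive judgment rate" and "negative judgment rate" correspond to the usual positive and negative predictive values, which the source does not state explicitly.

```python
def diagnostic_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Standard diagnostic indices from a 2x2 comparison with a gold standard.

    tp/fn: gold-standard positives called positive/negative by the test;
    tn/fp: gold-standard negatives called negative/positive by the test.
    """
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "positive_predictive_value": tp / (tp + fp),
        "negative_predictive_value": tn / (tn + fn),
    }


if __name__ == "__main__":
    # Invented counts for illustration; not the data behind Table 13.
    for name, value in diagnostic_metrics(tp=9, fp=2, tn=18, fn=1).items():
        print(f"{name}: {value:.3f}")
```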
Example 3: comparative experiments with single reference RPP40 and multiple references
In this example, the experimental data for the samples in Table 11 were used to calculate IV values with RPP40 as the single internal reference, and from these the copy number.
The raw file exported by the instrument's companion software, Typer 4.0, is an .xml file (path: View - Plate Data Pane);
the copy number is calculated from the peak height (Height) of each detection site.
The calculation formula is as follows:
Wherein:
H_{N-SMN} is the peak height of SMN1-2 E7 in the sample to be tested;
H_{N_QC-SMN} is the peak height of the QC corresponding to SMN1-2 E7 in the sample to be tested;
H_{N-RPP40} is the peak height of RPP40 in the sample to be tested;
H_{N_QC-RPP40} is the peak height of the QC corresponding to RPP40 in the sample to be tested;
H_{gDNA-SMN} is the peak height of SMN1-2 E7 in the gDNA sample;
H_{gDNA_QC-SMN} is the peak height of the QC corresponding to SMN1-2 E7 in the gDNA sample;
H_{gDNA-RPP40} is the peak height of RPP40 in the gDNA sample;
H_{gDNA_QC-RPP40} is the peak height of the QC corresponding to RPP40 in the gDNA sample.
Results: In FIG. 5, the maximum IV value of the positive samples and the minimum IV value of the negative samples are each marked with a dotted line. The figure shows that, with RPP40 as a single internal reference, the IV values of DNA samples at different concentrations overlap and produce a gray zone; for example, with 1 μL of DNA input, 1-copy (carrier) and 2+ copy (normal) samples cannot be told apart between a DNA sample at 10 ng/μL and one at 30 ng/μL. With the three-internal-reference calculation, the maximum IV value of the 1-copy samples and the minimum IV value of the 2+ copy samples do not overlap, and DNA extracted from whole blood can be tested over an input range of about 10-30 ng (measured with a NanoDrop 2000). The results show that, compared with a single internal reference, calculating with the three-reference average significantly reduces the gray zone in data interpretation, resolves the overlap of IV values across DNA concentrations seen in the single-reference system, and widens the usable template DNA input range.
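The overlap criterion described above (maximum IV value of the 1-copy samples versus minimum IV value of the 2+ copy samples) can be checked with a few lines of code. This is a minimal sketch; the function name `gray_zone` and the IV values in the example are invented for illustration and are not taken from FIG. 5.

```python
def gray_zone(iv_one_copy, iv_two_plus):
    """Return the overlapping IV interval (low, high), or None if the groups separate.

    An overlap exists when the largest 1-copy IV value reaches or exceeds
    the smallest 2+ copy IV value.
    """
    upper_one_copy = max(iv_one_copy)
    lower_two_plus = min(iv_two_plus)
    if upper_one_copy >= lower_two_plus:
        return (lower_two_plus, upper_one_copy)
    return None


if __name__ == "__main__":
    # Invented IV values for illustration only.
    print(gray_zone([0.40, 0.60, 0.80], [0.70, 1.00, 1.10]))  # overlapping groups
    print(gray_zone([0.40, 0.55], [0.80, 1.10]))              # separated groups
```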
Therefore, by introducing three internal reference genes as references, the method of the invention minimizes the effect of differences in absolute template amount caused by sample degradation or pipetting errors, widens the template DNA input range, and is easy to perform. In addition, adding locked nucleic acid modifications to the PCR primers, with differential bases placed on the competing substrate, reduces the amplification efficiency of the competing substrate, so that it can be added and detected within a stable concentration range. Competitive PCR eliminates environmental differences such as temperature and pressure between reaction wells, which makes the results more accurate; MassArray technology allows multiplex PCR products to be detected in a single well, with simple operation and lower cost. Combining the two allows different assay panels to be run in different PCR reaction wells, so that multiple panels can be tested at once with high throughput, which suits commercial large-scale testing. Compared with ordinary MassArray, the method adds a competing substrate for the CNV detection site at the amplification step and measures the peak-height ratio against the competing substrate, so that SMN1 = 0, SMN1 = 1, SMN1 ≥ 2, SMN2 = 0, SMN2 = 1 and SMN2 ≥ 2 copies can be distinguished accurately, giving a finer-grained and more accurate result.
It should be understood that although the invention has been described by way of example in terms of its preferred embodiments, it is not limited to those embodiments, and those skilled in the art can make numerous modifications and variations. The primers, reaction conditions and so on used in the amplification and extension reactions may be adjusted and varied according to particular needs. Those skilled in the art will therefore be able to devise numerous alternative arrangements which, although not explicitly described herein, embody the principles of the invention and fall within its spirit and scope.
Claims (40)
1. A method of determining gene copy number comprising:
1) Amplifying the target gene, the reference gene and the competing substrates of the target gene and the reference gene by using an amplification primer to obtain an amplification product;
2) Carrying out an extension reaction on the amplification product obtained in the step 1) by using an extension primer to obtain an extension product;
3) Mass spectrum detection is carried out on the extension product in the step 2) by using a mass spectrometer through a matrix-assisted laser desorption ionization time-of-flight mass spectrometry technology, and the copy number of the target gene is calculated through the peak height of each detection site;
wherein the reference genes comprise at least three genes without CNV variation.
2. The method of claim 1, wherein the gene of interest comprises SMN1, SMN2, CYP2D6, DMD.
3. The method of claim 1 or 2, wherein the reference genes comprise RPP40, ATP7A and COL4A5.
4. The method of any one of claims 1-3, wherein the amplification primer is a locked nucleic acid modified amplification primer.
5. The method of any one of claims 1-4, wherein the amplification reaction of step 1) employs a UNG-dUTP anti-contamination system.
6. The method of any one of claims 1-5, wherein the amplification primers for the gene of interest are: amplification primers of Exon 7 of SMN1 and SMN2, as set forth in SEQ ID NO: 1-2.
7. The method of any one of claims 1-6, wherein the extension primer of the gene of interest is: the extension primer for Exon 7 of SMN1 and SMN2, as set forth in SEQ ID NO: 9.
8. The method of any one of claims 1-7, wherein the amplification primers of the reference gene are selected from the group consisting of:
i) amplification primers for Intron 6 of RPP40, as set forth in SEQ ID NOs: 3-4;
ii) amplification primers for Exon 1 of ATP7A, as set forth in SEQ ID NOs: 5-6;
iii) amplification primers for Exon 41 of COL4A5, as set forth in SEQ ID NOs: 7-8.
9. The method of any one of claims 1-8, wherein the extension primer of the reference gene is selected from the group consisting of:
i) an extension primer for Intron 6 of RPP40, as set forth in SEQ ID NO: 10;
ii) an extension primer for Exon 1 of ATP7A, as set forth in SEQ ID NO: 11;
iii) an extension primer for Exon 41 of COL4A5, as set forth in SEQ ID NO: 12.
10. The method of any one of claims 1-9, wherein the competing substrate is selected from the group consisting of:
i) a competing substrate for SMN1 and SMN2, as set forth in SEQ ID NO: 13;
ii) a competing substrate for RPP40, as set forth in SEQ ID NO: 14;
iii) a competing substrate for ATP7A, as set forth in SEQ ID NO: 15;
iv) a competing substrate for COL4A5, as set forth in SEQ ID NO: 16.
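The SEQ ID NO assignments recited in claims 6-10 can be collected into one lookup table. The following is a minimal sketch for bookkeeping only; the dictionary layout, key names and helper function are hypothetical, and no sequence content is implied.

```python
# SEQ ID NO assignments as recited in claims 6-10.
# The structure and names below are illustrative assumptions.
SEQ_ID_ASSIGNMENTS = {
    "SMN1/SMN2 Exon 7": {"amplification": (1, 2), "extension": 9, "competing_substrate": 13},
    "RPP40 Intron 6": {"amplification": (3, 4), "extension": 10, "competing_substrate": 14},
    "ATP7A Exon 1": {"amplification": (5, 6), "extension": 11, "competing_substrate": 15},
    "COL4A5 Exon 41": {"amplification": (7, 8), "extension": 12, "competing_substrate": 16},
}


def all_seq_ids(assignments):
    """Return every SEQ ID NO referenced in the table, sorted ascending."""
    ids = []
    for entry in assignments.values():
        ids.extend(entry["amplification"])       # forward and reverse amplification primers
        ids.append(entry["extension"])           # single-base extension primer
        ids.append(entry["competing_substrate"]) # competing substrate (QC) template
    return sorted(ids)
```

A quick consistency check is that the table references SEQ ID NOs 1 through 16 exactly once each.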
11. The method of any of claims 3-10, wherein the formula for calculating SMN1 and SMN2 copy numbers is:
Wherein:
H_{N-SMN} is the peak height of SMN1-2 E7 in the sample to be tested;
H_{N_QC-SMN} is the peak height of the QC corresponding to SMN1-2 E7 in the sample to be tested;
H_{N-RPP40} is the peak height of RPP40 in the sample to be tested;
H_{N_QC-RPP40} is the peak height of the QC corresponding to RPP40 in the sample to be tested;
H_{N-ATP7A} is the peak height of ATP7A in the sample to be tested;
H_{N_QC-ATP7A} is the peak height of the QC corresponding to ATP7A in the sample to be tested;
H_{N-COL4A5} is the peak height of COL4A5 in the sample to be tested;
H_{N_QC-COL4A5} is the peak height of the QC corresponding to COL4A5 in the sample to be tested;
H_{gDNA-SMN} is the peak height of SMN1-2 E7 in the gDNA sample;
H_{gDNA_QC-SMN} is the peak height of the QC corresponding to SMN1-2 E7 in the gDNA sample;
H_{gDNA-RPP40} is the peak height of RPP40 in the gDNA sample;
H_{gDNA_QC-RPP40} is the peak height of the QC corresponding to RPP40 in the gDNA sample;
H_{gDNA-ATP7A} is the peak height of ATP7A in the gDNA sample;
H_{gDNA_QC-ATP7A} is the peak height of the QC corresponding to ATP7A in the gDNA sample;
H_{gDNA-COL4A5} is the peak height of COL4A5 in the gDNA sample;
H_{gDNA_QC-COL4A5} is the peak height of the QC corresponding to COL4A5 in the gDNA sample.
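The formula of claim 11 does not appear in this text, so the following is only a minimal sketch of one plausible way to combine the peak heights defined above. It is not the claimed formula. It assumes that each gene peak is divided by the peak of its competing substrate (QC), that the SMN ratio is normalized by the mean of the three reference-gene ratios, and that the result is scaled against a gDNA calibrator of known copy number (assumed here to be 2). All function names, the data layout and the calibrator copy number are hypothetical.

```python
from statistics import mean

REFERENCE_GENES = ("RPP40", "ATP7A", "COL4A5")


def normalized_ratio(peaks):
    """SMN/QC ratio divided by the mean reference/QC ratio.

    `peaks` maps a locus name to a (gene_peak_height, qc_peak_height) tuple.
    """
    def ratio(locus):
        gene_peak, qc_peak = peaks[locus]
        if qc_peak <= 0:
            raise ValueError(f"QC peak for {locus} must be positive")
        return gene_peak / qc_peak

    # Averaging over three reference genes damps the effect of any one outlier.
    reference = mean(ratio(gene) for gene in REFERENCE_GENES)
    return ratio("SMN") / reference


def estimate_copy_number(sample_peaks, gdna_peaks, gdna_copies=2.0):
    """Scale the sample's normalized ratio by that of the gDNA calibrator.

    `gdna_copies` is an assumed known copy number for the calibrator.
    """
    return gdna_copies * normalized_ratio(sample_peaks) / normalized_ratio(gdna_peaks)
```

Under these assumptions, a sample whose SMN/QC ratio is half that of the calibrator, with identical reference ratios, would be estimated at one copy.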
12. The method of any one of claims 3-11, wherein the final concentration of competing substrate for RPP40 is 0.5-8ng/μl.
13. The method of any one of claims 3-12, wherein the final concentration of the competing substrate for ATP7A is 1.5-18ng/μl.
14. The method of any one of claims 3-13, wherein the final concentration of competing substrate of COL4A5 is 1.5-10ng/μl.
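The final-concentration ranges of claims 12-14 lend themselves to a simple range check before a run. This is a sketch only; the table name and function are hypothetical, the bounds are taken from the claims, and the ranges are assumed to be inclusive.

```python
# Final concentration ranges (ng/μl) of the competing substrates, per claims 12-14.
QC_FINAL_CONC_RANGE = {
    "RPP40": (0.5, 8.0),
    "ATP7A": (1.5, 18.0),
    "COL4A5": (1.5, 10.0),
}


def out_of_range(concentrations):
    """Return the genes whose competing-substrate concentration lies outside the claimed range.

    `concentrations` maps a reference-gene name to a final concentration in ng/μl.
    Bounds are treated as inclusive, which is an assumption.
    """
    failures = {}
    for gene, value in concentrations.items():
        low, high = QC_FINAL_CONC_RANGE[gene]
        if not (low <= value <= high):
            failures[gene] = value
    return failures
```

For example, `out_of_range({"RPP40": 4.0, "ATP7A": 20.0, "COL4A5": 1.5})` flags only ATP7A, since 20.0 ng/μl exceeds the 18 ng/μl upper bound.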
15. A kit for determining the copy number of a gene comprising:
Amplification primers and extension primers for a gene of interest, and competing substrates for the gene of interest; and amplification primers and extension primers for the reference gene, and a competing substrate for the reference gene, wherein the reference gene comprises at least three genes without CNV variation.
16. The kit of claim 15, wherein the gene of interest comprises SMN1, SMN2, CYP2D6, DMD.
17. The kit of claim 15 or 16, wherein the reference genes comprise RPP40, ATP7A and COL4A5.
18. The kit of any one of claims 15-17, wherein the amplification primer is a locked nucleic acid modified amplification primer.
19. The kit of any one of claims 15-18, wherein the amplification primers for the gene of interest are: amplification primers for Exon 7 of SMN1 and SMN2, as set forth in SEQ ID NOs: 1-2.
20. The kit of any one of claims 15-19, wherein the extension primer for the gene of interest is: an extension primer for Exon 7 of SMN1 and SMN2, as set forth in SEQ ID NO: 9.
21. The kit of any one of claims 15-20, wherein the amplification primers of the reference gene are selected from the group consisting of:
i) amplification primers for Intron 6 of RPP40, as set forth in SEQ ID NOs: 3-4;
ii) amplification primers for Exon 1 of ATP7A, as set forth in SEQ ID NOs: 5-6;
iii) amplification primers for Exon 41 of COL4A5, as set forth in SEQ ID NOs: 7-8.
22. The kit of any one of claims 15-21, wherein the extension primer of the reference gene is selected from the group consisting of:
i) an extension primer for Intron 6 of RPP40, as set forth in SEQ ID NO: 10;
ii) an extension primer for Exon 1 of ATP7A, as set forth in SEQ ID NO: 11;
iii) an extension primer for Exon 41 of COL4A5, as set forth in SEQ ID NO: 12.
23. The kit of any one of claims 15-22, wherein the competing substrate is selected from the group consisting of:
i) a competing substrate for SMN1 and SMN2, as set forth in SEQ ID NO: 13;
ii) a competing substrate for RPP40, as set forth in SEQ ID NO: 14;
iii) a competing substrate for ATP7A, as set forth in SEQ ID NO: 15;
iv) a competing substrate for COL4A5, as set forth in SEQ ID NO: 16.
24. The kit of any one of claims 15-23 for use in detecting spinal muscular atrophy in a subject.
25. The kit of any one of claims 15-23 for use in detecting Duchenne muscular dystrophy in a subject.
26. A system for determining gene copy number, comprising:
1) Amplification module: amplifying the target gene, the reference genes and the competing substrates of the target gene and the reference genes by using amplification primers to obtain an amplification product;
2) Extension module: performing an extension reaction on the amplification product obtained in step 1) by using extension primers to obtain an extension product;
3) Mass spectrometry module: detecting the extension product of step 2) with a mass spectrometer by matrix-assisted laser desorption ionization time-of-flight mass spectrometry, and calculating the copy number of the target gene from the peak height of each detection site;
wherein the reference genes comprise at least three genes free of copy number variation (CNV).
27. The system of claim 26, wherein the gene of interest comprises SMN1, SMN2, CYP2D6, DMD.
28. The system of claim 26 or 27, wherein the reference genes comprise RPP40, ATP7A and COL4A5.
29. The system of any one of claims 26-28, wherein the amplification primer is a locked nucleic acid modified amplification primer.
30. The system of any one of claims 26-29, wherein the amplification reaction of step 1) employs a UNG-dUTP anti-contamination system.
31. The system of any one of claims 26-30, wherein the amplification primers for the gene of interest are: amplification primers for Exon 7 of SMN1 and SMN2, as set forth in SEQ ID NOs: 1-2.
32. The system of any one of claims 26-31, wherein the extension primer for the gene of interest is: an extension primer for Exon 7 of SMN1 and SMN2, as set forth in SEQ ID NO: 9.
33. The system of any one of claims 26-32, wherein the amplification primers of the reference gene are selected from the group consisting of:
i) amplification primers for Intron 6 of RPP40, as set forth in SEQ ID NOs: 3-4;
ii) amplification primers for Exon 1 of ATP7A, as set forth in SEQ ID NOs: 5-6;
iii) amplification primers for Exon 41 of COL4A5, as set forth in SEQ ID NOs: 7-8.
34. The system of any one of claims 26-33, wherein the extension primer of the reference gene is selected from the group consisting of:
i) an extension primer for Intron 6 of RPP40, as set forth in SEQ ID NO: 10;
ii) an extension primer for Exon 1 of ATP7A, as set forth in SEQ ID NO: 11;
iii) an extension primer for Exon 41 of COL4A5, as set forth in SEQ ID NO: 12.
35. The system of any one of claims 26-34, wherein the competing substrate is selected from the group consisting of:
i) a competing substrate for SMN1 and SMN2, as set forth in SEQ ID NO: 13;
ii) a competing substrate for RPP40, as set forth in SEQ ID NO: 14;
iii) a competing substrate for ATP7A, as set forth in SEQ ID NO: 15;
iv) a competing substrate for COL4A5, as set forth in SEQ ID NO: 16.
36. The system of any one of claims 28-35, wherein the final concentration of competing substrate for RPP40 is 0.5-8ng/μl.
37. The system of any one of claims 28-36, wherein the final concentration of competing substrate for ATP7A is 1.5-18ng/μl.
38. The system of any one of claims 28-37, wherein the final concentration of competing substrate of COL4A5 is 1.5-10ng/μl.
39. Use of the system of any one of claims 26-38 in the preparation of a kit for detecting a rare disease.
40. The use of claim 39, wherein the rare disease is spinal muscular atrophy or Duchenne muscular dystrophy.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202211318904.XA CN117925808A (en) | 2022-10-26 | 2022-10-26 | Method, kit and system for determining gene copy number |
Publications (1)
Publication Number | Publication Date |
---|---|
CN117925808A (en) | 2024-04-26 |
Family
ID=90763503
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202211318904.XA (Pending) CN117925808A (en) | Method, kit and system for determining gene copy number | 2022-10-26 | 2022-10-26 |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN117925808A (en) |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||