CN112735599A - Evaluation method for judging rare hereditary diseases
- Publication number
- CN112735599A (application CN202110104870.3A)
- Authority
- CN
- China
- Prior art keywords
- phenotype
- sequencing
- genetic
- mutation
- diseases
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16H—HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
- G16H50/00—ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
- G16H50/70—ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for mining of medical data, e.g. analysing previous cases of other patients
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16B—BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
- G16B20/00—ICT specially adapted for functional genomics or proteomics, e.g. genotype-phenotype associations
- G16B20/40—Population genetics; Linkage disequilibrium
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16B—BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
- G16B20/00—ICT specially adapted for functional genomics or proteomics, e.g. genotype-phenotype associations
- G16B20/50—Mutagenesis
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16B—BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
- G16B30/00—ICT specially adapted for sequence analysis involving nucleotides or amino acids
Abstract
The invention discloses an assessment method for judging rare hereditary diseases. The method comprises deep phenotype analysis, whole exome sequencing, identification and filtering of gene mutations, and mutation prioritization in accordance with inheritance pattern and disease phenotype. By combining whole exome sequencing with the Human Phenotype Ontology, the method can help clinicians judge and assess the pathogenic factors of rare hereditary diseases in a timely manner, improve the rate at which such diseases can be assessed, provide a basis for choosing appropriate treatment measures and clinical management strategies, provide genetic counseling and prenatal assessment for families, reduce the birth of similarly affected children, and lighten the economic burden on families and society. The method is therefore of particular significance for judging genetic diseases with high phenotypic heterogeneity and low incidence.
Description
Technical Field
The invention belongs to the field of judgment and evaluation of hereditary diseases and their pathogenic genes, and particularly relates to an evaluation method for judging rare hereditary diseases of unknown cause.
Background
At present, the pathogenesis and genetic mechanisms of about 4600 rare diseases are relatively well understood worldwide, but more than 50% of rare hereditary diseases still cannot be clearly judged. Research on rare and intractable diseases lags significantly behind research on common diseases. Because the understanding of the pathogenesis of rare diseases is limited, both diagnostic and therapeutic means are insufficient. Traditionally, clinical laboratory techniques for judging and assessing rare genetic diseases of unknown cause include neuroimaging, metabolic screening, genetic testing (e.g., karyotyping, chromosomal microarray analysis, medical exome sequencing (MES), and targeted disease gene panels), and invasive testing. With the continuous improvement of next-generation sequencing and the rapid development of bioinformatics, the volume of genomic data and the number of newly discovered gene mutations have grown explosively. However, without an efficient way to correlate genomic information with clinical phenotypes, complex genetic variation and related biomedical problems cannot be explained well.
Most genetic diseases cannot be judged from phenotypic information alone, before genetic testing is performed. Phenotypic data include quantitative data (weight, height), qualitative data (behavior), photographs, clinical profiles, medical history, physical examination, and biochemical examination. The tests that yield phenotypic information comprise invasive tests (such as blood and tissue tests) and non-invasive tests (such as heart rate measurement and X-ray examination). The clinician decides whether to order these tests according to the corresponding clinical criteria. The phenotypic information of the patient's parents should also be included when judging rare diseases. Phenotypic information is crucial to the clinical judgment strategy both before and after genetic testing: before testing, it helps narrow down the region to be studied and select test items; after testing, it can be correlated with the genetic information to explain the disease. Consider, for example, a patient with aortic disease. Variants in multiple genes are associated with the onset of aortic disease, and the clinical manifestations of its different subtypes overlap. Loeys-Dietz syndrome and Marfan syndrome have very similar phenotypes, but the former is caused by mutations in the TGF-beta receptor genes (TGFBR1, TGFBR2) and the latter by mutations in the FBN1 gene. Conversely, ectopia lentis syndrome and Marfan syndrome are both caused by mutations in the FBN1 gene but differ in phenotype. Combining phenotypic and genotypic information, i.e., establishing unambiguous genotype-phenotype associations, is therefore critical to the judgment and assessment of disease.
The Human Phenotype Ontology (HPO) is a standardized vocabulary of phenotypic abnormalities associated with over 7000 diseases, used worldwide by thousands of researchers, clinicians, informaticians, and electronic health record systems. Its detailed descriptions of clinical abnormalities and its computable disease definitions have made the HPO a de facto standard for deep phenotyping in the field of rare diseases. Researchers have combined whole exome sequencing data from patients with congenital myasthenia with phenotypes encoded in the HPO, and found that HPO-based phenotypic analysis helps to discover pathogenic variants in patients with the disease.
Different clinical genetic tests have different advantages and disadvantages. Targeted disease gene panels are suited to diseases whose clinical phenotype is specific but whose genetic heterogeneity is strong and which therefore require differential judgment, such as deafness and cardiomyopathy. However, the clinical utility of a gene panel depends largely on the clinician's grasp of the clinical judgment of the disease and on the proportion of cases that known disease-causing genes can explain. In addition, because new pathogenic genes are continuously discovered, gene panels must be continuously updated. The medical exome sequencing (MES) strategy covers all known causative genes. This strategy has corresponding drawbacks, however: new disease genes are discovered rapidly, and some genes regarded as disease genes in the past turn out not to be, or lack sufficient evidence to be considered, genuine disease genes. Whole genome sequencing (WGS) has begun to be used routinely for the analysis of copy number variation (CNV), at a cost no higher than that of commonly used chromosomal microarrays; however, because detecting sequence variation requires coverage above 30×, cost still keeps WGS from routine clinical use for detecting sequence variants. Whole exome sequencing (WES) covers the vast majority of coding regions. Although the exact content differs somewhat between vendors' product designs, and coverage of some regions is inevitably insufficient during sequencing, these limitations do not prevent the wide application of WES. In addition, WES can detect sequence variants in known genetic disease genes as well as in newly discovered disease-causing genes, and offers the opportunity to discover new candidate disease-causing genes.
Whole exome sequencing can therefore deliver higher clinical value at relatively low cost. It is an ideal means of genetic testing, and an effective strategy for discovering new pathogenic genes, in cases with a complicated clinical phenotype, a non-specific clinical phenotype, an unclear clinical judgment, or no clinical phenotype expressed by the patient.
At the current stage, judging hereditary diseases takes a long time and the rate of successful judgment and evaluation is low. The pathogenesis of disease genes is insufficiently understood, the same gene can cause different diseases through different modes of variation, and shared population databases lack sufficient reference data for the Chinese population, all of which make it harder to analyze the clinical significance of variants. The invention therefore combines abnormal phenotypes standardized with the Human Phenotype Ontology with whole exome sequencing data, prioritizes exome variants using clinical relevance and cross-species phenotype comparison, and screens candidate variants with further calculation and filtering for variant frequency, pathogenicity prediction, and pedigree analysis. This is of clinical significance for discovering new disease genes and for the differential judgment and evaluation of Mendelian diseases.
Disclosure of Invention
To address the deficiencies of the technology described above, the present invention provides an assessment method for judging rare genetic diseases that combines the Human Phenotype Ontology with whole exome sequencing. Phenotypic and genotypic data are combined, statistical evidence is obtained through detailed analysis, and an accurate judgment and assessment is provided for each patient according to disease stratification.
The technical scheme adopted by the invention is as follows: an assessment method for judging a rare genetic disease, comprising the steps of:
step one: deep phenotype analysis, namely collecting phenotype information of all family members, excluding patients who can be judged by traditional genetic testing or metabolic screening so that core families of unknown cause remain, and performing phenotype standardization based on the Human Phenotype Ontology system: first obtaining Chinese phenotype descriptions from the clinical history, then querying and matching them in a Chinese HPO database to obtain HPO numbers, and then obtaining from the HPO database the name of the corresponding phenotype ontology term and the system to which it belongs;
step two: whole exome sequencing, namely performing whole exome sequencing on peripheral blood of the core family members and, following the best-practice guidelines of the genome analysis tool, obtaining a variant file in VCF (Variant Call Format) for each core family;
step three: identification and filtering of gene mutations, namely, taking the family as a unit, using the family phenotype information, the PED file, and the variant VCF file together for phenotype-based matching calculation, and combining the phenotype and whole exome sequencing data to prioritize candidate gene variants that conform to the inheritance pattern, thereby finally obtaining an accurate genetic judgment and evaluation.
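Step three takes a PED (pedigree) file per family as input. As a minimal sketch of what such a file holds, the following assumes the common six-column PED layout (family ID, individual ID, father ID, mother ID, sex, affection status); the family and individual identifiers are hypothetical and the helper names are illustrative, not part of the patent.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class PedRecord:
    family_id: str
    individual_id: str
    father_id: str   # "0" means the parent is not in the pedigree
    mother_id: str   # "0" means the parent is not in the pedigree
    sex: int         # 1 = male, 2 = female, 0 = unknown
    phenotype: int   # 1 = unaffected, 2 = affected, 0 = missing


def parse_ped(text: str) -> List[PedRecord]:
    """Parse whitespace-separated six-column PED text into records."""
    records = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        fam, ind, father, mother, sex, pheno = line.split()[:6]
        records.append(PedRecord(fam, ind, father, mother, int(sex), int(pheno)))
    return records


def probands(records: List[PedRecord]) -> List[PedRecord]:
    """Affected individuals whose two parents are both listed in the file."""
    return [
        r for r in records
        if r.phenotype == 2 and r.father_id != "0" and r.mother_id != "0"
    ]


# Hypothetical core family FAM01: unaffected parents and one affected son.
EXAMPLE_PED = """\
FAM01 FAM01_F 0 0 1 1
FAM01 FAM01_M 0 0 2 1
FAM01 FAM01_P FAM01_F FAM01_M 1 2
"""

family = parse_ped(EXAMPLE_PED)
print([r.individual_id for r in probands(family)])
```

Keeping the pedigree in this structured form lets later steps look up which sample in the family VCF is the proband and which are the parents.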
Preferably, the phenotype standardization in step one is carried out as follows: various clinical records, including medical history, laboratory examinations, and radiology reports, are collected from the hospital electronic medical record system, and a clinician extracts from them the phenotype information corresponding to the symptoms of each proband.
Preferably, the whole exome sequencing in step two is carried out as follows: peripheral blood of the core family members is collected; genomic DNA is extracted from peripheral blood lymphocytes using the QIAamp DNA Blood Mini kit (Qiagen, Germany) and fragmented to 250-300 bp using a focused ultrasonicator (Covaris, Woburn, MA); a sequencing library is constructed using the Agilent SureSelect Human All Exon V6 kit; and whole exome sequencing is performed on the Illumina HiSeq X Ten platform of the Beijing genomics center, yielding 150 bp paired-end sequencing data for the core family at an average sequencing depth of about 30×.
Preferably, the identification and filtering of gene mutations in step three is carried out as follows: whole exome sequencing reads are aligned to the reference genome hg19 using the alignment tool BWA; single nucleotide variants (SNVs) and small insertions and deletions (indels) are identified using the GATK (v4.1) Best Practices workflow and preliminarily filtered using GATK's standard filtering procedure; after mutation identification, the mutations are annotated with InterVar and classified according to the ACMG/AMP guidelines into five classes, namely pathogenic, likely pathogenic, of uncertain significance (VUS), likely benign, and benign; at the same time, the minor allele frequency (MAF) of each mutation in the 1000 Genomes Project (1000G) or the Genome Aggregation Database (gnomAD, v2.0.2) is obtained; and known pathogenic variants associated with neurodevelopmental disorders are identified in databases such as ClinVar, OMIM (omim.org), and HGMD.
Further, the assessment method also includes mutation prioritization in accordance with inheritance pattern and disease phenotype. To determine the potential causative mutations for genetic judgment, mutations are prioritized by integrating phenotypic similarity calculations between known human diseases and mouse models (i.e., cross-species comparison), and are evaluated according to mutation pathogenicity, MAF (< 0.1%), and inheritance pattern, so that genotypic and phenotypic information is fully combined for the judgment and evaluation of genetic diseases of unknown cause.
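The prioritization described above can be sketched as a filter followed by a ranking. The following is only an illustration under stated assumptions: the numeric weights attached to the five ACMG/AMP classes, the product used as the ranking score, and the gene names are all hypothetical; only the MAF cutoff of 0.1% comes from the text.

```python
from dataclasses import dataclass
from typing import List

# Hypothetical numeric weights for the five ACMG/AMP classes (an assumption).
CLASS_WEIGHT = {
    "pathogenic": 1.0,
    "likely_pathogenic": 0.8,
    "uncertain": 0.5,
    "likely_benign": 0.1,
    "benign": 0.0,
}
MAF_CUTOFF = 0.001  # MAF < 0.1%, as stated in the text


@dataclass
class Candidate:
    gene: str
    acmg_class: str
    maf: float
    phenotype_score: float   # phenotype similarity in [0, 1]
    fits_inheritance: bool   # consistent with the family's inheritance pattern


def prioritize(candidates: List[Candidate]) -> List[Candidate]:
    """Drop common or pattern-inconsistent variants, then rank the rest."""
    kept = [c for c in candidates if c.maf < MAF_CUTOFF and c.fits_inheritance]
    return sorted(
        kept,
        key=lambda c: c.phenotype_score * CLASS_WEIGHT[c.acmg_class],
        reverse=True,
    )


# Hypothetical candidates for one family.
candidates = [
    Candidate("GENE_B", "uncertain", 0.0002, 0.95, True),
    Candidate("GENE_A", "likely_pathogenic", 0.0, 0.90, True),
    Candidate("GENE_C", "pathogenic", 0.02, 0.90, True),    # too common
    Candidate("GENE_D", "pathogenic", 0.0, 0.90, False),    # wrong pattern
]

ranked = prioritize(candidates)
print([c.gene for c in ranked])
```

Multiplying the two scores is one simple way to require both phenotypic fit and predicted pathogenicity; other combinations would serve equally well as an illustration.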
The beneficial effects of the invention are as follows. By combining whole exome sequencing with the Human Phenotype Ontology, the invention can help clinicians judge and evaluate the pathogenic factors of rare hereditary diseases in a timely manner, improve the rate at which such diseases can be assessed, provide a basis for choosing appropriate treatment measures and clinical management strategies, provide genetic counseling and prenatal evaluation for families, reduce the birth of similarly affected children, and lighten the economic burden on families and society. The method is therefore of particular significance for judging genetic diseases with high phenotypic heterogeneity and low incidence.
For genetic diseases that have an established treatment, a definite genetic judgment enables early judgment and treatment and can effectively reduce disability and fatality rates. For genetic diseases that lack effective treatment, the method still helps in assessing prognosis and testing new treatments, and helps the parents of pediatric patients make reasonable medical decisions.
Drawings
FIG. 1 is a flow chart of the evaluation method of the present invention.
FIG. 2 is a graph of the distribution of phenotype ontology terms in judged and undetermined families according to the present invention. NS denotes not significant, and * denotes significance with a p value of less than 0.05.
FIG. 3 is a diagram showing the results of genetic judgment and evaluation of the neurodevelopmental disorder core pedigrees in an embodiment of the present invention, wherein (A) 22 pedigrees were judged and their inheritance patterns determined; (B) mutation types and statistics of the distribution of de novo (new) mutations.
Detailed Description
The invention will be further described with reference to specific embodiments and the accompanying drawings.
As shown in fig. 1, an evaluation method for judging a rare genetic disease, comprising the steps of:
step one: deep phenotype analysis, namely collecting phenotype information of all family members, excluding patients who can be judged by traditional genetic testing or metabolic screening so that core families of unknown cause remain, and performing phenotype standardization based on the Human Phenotype Ontology system. Various clinical records, including medical history, laboratory examinations, and radiology reports, are collected from the hospital electronic medical record system, and a clinician extracts the phenotype information corresponding to the symptoms of each proband. Chinese phenotype descriptions are obtained from the clinical history, queried and matched in a Chinese HPO database to obtain HPO numbers, and the name of the corresponding phenotype ontology term and the system to which it belongs are then obtained from the HPO database;
step two: whole exome sequencing, namely performing whole exome sequencing on peripheral blood of the core family members and, following the best-practice guidelines of the genome analysis tool, obtaining a variant file in VCF (Variant Call Format) for each core family. Peripheral blood of the core family members is collected; genomic DNA is extracted from peripheral blood lymphocytes using the QIAamp DNA Blood Mini kit (Qiagen, Germany) and fragmented to 250-300 bp using a focused ultrasonicator (Covaris, Woburn, MA); a sequencing library is constructed using the Agilent SureSelect Human All Exon V6 kit; and whole exome sequencing is performed on the Illumina HiSeq X Ten platform of the Beijing genomics center, yielding 150 bp paired-end sequencing data for the core family at an average sequencing depth of about 30×;
step three: identification and filtering of gene mutations, namely, taking the family as a unit, using the family phenotype information, the PED file, and the variant VCF file together for phenotype-based matching calculation, and combining the phenotype and whole exome sequencing data to prioritize candidate gene variants that conform to the inheritance pattern, thereby finally obtaining an accurate genetic judgment and evaluation. Whole exome sequencing reads are aligned to the reference genome hg19 using the alignment tool BWA; single nucleotide variants (SNVs) and small insertions and deletions (indels) are identified using the GATK (v4.1) Best Practices workflow and preliminarily filtered using GATK's standard filtering procedure. After mutation identification, the mutations are annotated with InterVar and classified according to the ACMG/AMP guidelines into five classes, namely pathogenic, likely pathogenic, of uncertain significance (VUS), likely benign, and benign; at the same time, the minor allele frequency (MAF) of each mutation in the 1000 Genomes Project (1000G) or the Genome Aggregation Database (gnomAD, v2.0.2) is obtained. Known pathogenic variants associated with neurodevelopmental disorders are identified in databases such as ClinVar, OMIM (omim.org), and HGMD.
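The filtering in step three can be illustrated on annotated VCF records. This is a simplified sketch, not the GATK or InterVar procedure itself: it assumes that upstream annotation has already written two hypothetical INFO keys, `MAF` and `ACMG`, into each record, and it applies a 0.1% frequency cutoff. Real annotation tools use their own field names.

```python
from typing import Dict, List, Tuple


def parse_info(info: str) -> Dict[str, object]:
    """Split a VCF INFO column into key/value pairs (flags map to True)."""
    out: Dict[str, object] = {}
    for item in info.split(";"):
        if "=" in item:
            key, value = item.split("=", 1)
            out[key] = value
        elif item and item != ".":
            out[item] = True
    return out


def filter_vcf(lines: List[str], maf_cutoff: float = 0.001) -> List[Tuple[str, int]]:
    """Return (chrom, pos) of rare, PASS-filtered, non-benign records."""
    kept = []
    for line in lines:
        if line.startswith("#"):
            continue  # header lines
        chrom, pos, _id, _ref, _alt, _qual, flt, info = line.rstrip("\n").split("\t")[:8]
        if flt != "PASS":
            continue  # failed the caller's own filters
        ann = parse_info(info)
        maf = float(ann.get("MAF", 0.0))      # hypothetical key; absent = unobserved
        acmg = ann.get("ACMG", "uncertain")   # hypothetical key
        if maf < maf_cutoff and acmg not in ("benign", "likely_benign"):
            kept.append((chrom, int(pos)))
    return kept


# Hypothetical records; columns follow the fixed VCF fields CHROM..INFO.
rows = [
    ["chr1", "12345", ".", "A", "G", "50", "PASS", "MAF=0.0002;ACMG=likely_pathogenic"],
    ["chr2", "67890", ".", "C", "T", "50", "PASS", "MAF=0.15;ACMG=benign"],
    ["chr3", "13579", ".", "G", "GA", "20", "LowQual", "MAF=0.0001;ACMG=uncertain"],
]
vcf_lines = ["#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO"] + ["\t".join(r) for r in rows]

kept = filter_vcf(vcf_lines)
print(kept)
```

In practice a dedicated VCF library would replace the hand-written parsing; the plain-text version is used here only to keep the example self-contained.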
Mutation prioritization in accordance with inheritance pattern and disease phenotype: to determine the potential causative mutations for genetic judgment, mutations are prioritized by integrating phenotypic similarity calculations between known human diseases and mouse models (i.e., cross-species comparison), and are evaluated according to mutation pathogenicity, MAF (< 0.1%), and inheritance pattern, so that genotypic and phenotypic information is fully combined for the judgment and evaluation of genetic diseases of unknown cause.
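The patent does not specify how phenotypic similarity is computed. As a deliberately simple stand-in, the sketch below uses the Jaccard index between the patient's set of HPO terms and the term set associated with each candidate gene. It assumes that mouse-model phenotypes have already been mapped to HPO terms, and the gene names and term sets are hypothetical. Ontology-aware semantic similarity measures would normally be preferred over a plain set overlap.

```python
from typing import Dict, Iterable, List, Set


def jaccard(a: Iterable[str], b: Iterable[str]) -> float:
    """Jaccard index of two term sets; 0.0 when both are empty."""
    sa, sb = set(a), set(b)
    if not sa and not sb:
        return 0.0
    return len(sa & sb) / len(sa | sb)


def rank_genes(patient: Set[str], profiles: Dict[str, Set[str]]) -> List[str]:
    """Order genes by decreasing phenotype similarity to the patient."""
    return sorted(profiles, key=lambda g: jaccard(patient, profiles[g]), reverse=True)


# Hypothetical patient phenotype (HPO numbers) and per-gene phenotype profiles
# pooled from human disease and mouse-model annotations.
patient_terms = {"HP:0001250", "HP:0001263", "HP:0000365"}
gene_profiles = {
    "GENE_A": {"HP:0001250", "HP:0001263", "HP:0001249"},
    "GENE_B": {"HP:0000365"},
    "GENE_C": {"HP:0000252"},
}

scores = {g: jaccard(patient_terms, p) for g, p in gene_profiles.items()}
print(rank_genes(patient_terms, gene_profiles))
```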
Examples
In this example, a total of 45 core families with neurodevelopmental disorders that could not be judged by conventional detection methods were collected, with a PED (pedigree) file prepared for each family, as shown in Table 1.
TABLE 1 data acquisition Table
The deep phenotypes of the 45 core families with neurodevelopmental disorders of unknown cause were standardized. First, Chinese phenotype descriptions were obtained from the clinical history; a Chinese HPO database (http://www.chinahpo.org/) was then queried and matched to obtain the HPO numbers, and the name of the corresponding phenotype ontology term and the system to which it belongs were obtained from the HPO database. We assigned our own abbreviations to these terms for subsequent analysis. Phenotype information was collected for all family members and standardized based on the Human Phenotype Ontology system, with the results shown in Table 2.
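The standardization just described amounts to two lookups: Chinese description to HPO number, then HPO number to term name and organ system. A minimal sketch follows, in which two small in-memory dictionaries stand in for the Chinese HPO database and the HPO itself. The Chinese strings are illustrative translations chosen for this example, and the entries should be checked against the current HPO release before any real use.

```python
from typing import Dict, List, Tuple

# Stand-in for a Chinese HPO database query (illustrative entries only).
CHINESE_TO_HPO: Dict[str, str] = {
    "癫痫发作": "HP:0001250",
    "全面发育迟缓": "HP:0001263",
    "智力障碍": "HP:0001249",
    "听力障碍": "HP:0000365",
}

# Stand-in for the HPO: number -> (term name, top-level system).
HPO_TERMS: Dict[str, Tuple[str, str]] = {
    "HP:0001250": ("Seizure", "Abnormality of the nervous system"),
    "HP:0001263": ("Global developmental delay", "Abnormality of the nervous system"),
    "HP:0001249": ("Intellectual disability", "Abnormality of the nervous system"),
    "HP:0000365": ("Hearing impairment", "Abnormality of the ear"),
}


def normalize(descriptions: List[str]):
    """Map Chinese descriptions to HPO records; collect anything unmatched."""
    matched, unmatched = [], []
    for text in descriptions:
        hpo_id = CHINESE_TO_HPO.get(text.strip())
        if hpo_id is None:
            unmatched.append(text)  # left for manual review by a clinician
            continue
        name, system = HPO_TERMS[hpo_id]
        matched.append({"hpo_id": hpo_id, "name": name, "system": system})
    return matched, unmatched


matched, unmatched = normalize(["癫痫发作", "听力障碍", "未收录的描述"])
for record in matched:
    print(record["hpo_id"], record["name"], "|", record["system"])
```

Returning the unmatched descriptions separately keeps free-text phenotypes from being silently dropped during standardization.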
TABLE 2 phenotypic Standard statistics Table
We performed whole exome sequencing on peripheral blood of the core family members and, following the best-practice guidelines of the genome analysis tool, obtained 45 variant files in VCF (Variant Call Format), one per core family. Then, taking the family as a unit, we used the family phenotype information, the PED files, and the variant VCF files together for phenotype-based matching calculation, with the aim of combining phenotype and whole exome sequencing data to prioritize candidate gene variants that fit the inheritance pattern. In the end, 22 families received an accurate genetic judgment and evaluation, as shown in FIG. 3.
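Checking whether a candidate variant fits an inheritance pattern in a core family can be sketched from the three genotypes at one site. The function below is a simplified illustration that assumes unaffected parents and an autosomal site, and it distinguishes only a few cases; compound heterozygous and X-linked patterns, which need more than one site or the proband's sex, are left out. The category labels are hypothetical.

```python
def alt_count(genotype: str) -> int:
    """Number of non-reference alleles in a VCF genotype such as 0/1 or 1|1."""
    alleles = genotype.replace("|", "/").split("/")
    return sum(1 for a in alleles if a not in ("0", "."))


def classify_trio(child: str, father: str, mother: str) -> str:
    """Classify one autosomal site in a trio with unaffected parents."""
    if any("." in gt for gt in (child, father, mother)):
        return "uncallable"            # a missing genotype; do not guess
    c, f, m = alt_count(child), alt_count(father), alt_count(mother)
    if c == 0:
        return "absent_in_proband"
    if f == 0 and m == 0:
        return "de_novo"               # carried by the child, by neither parent
    if c == 2 and f == 1 and m == 1:
        return "homozygous_recessive"  # one copy inherited from each carrier parent
    return "inherited_other"


print(classify_trio("0/1", "0/0", "0/0"))
print(classify_trio("1/1", "0/1", "0/1"))
```

Treating missing genotypes as uncallable avoids reporting a variant as de novo merely because a parent was not covered at that site.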
In addition, the families that received an accurate judgment and evaluation were validated by first-generation (Sanger) sequencing. Notably, as shown in FIG. 2, the number of non-nervous-system phenotypes involved was significantly higher in judged families (n = 22) than in undetermined families (n = 23) (Wilcoxon test, p = 9 × 10⁻⁴). This result indicates the importance of the phenotype, and also shows that the method is more effective for families with syndromic features.
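The comparison above is reported only as a "Wilcoxon test". Because it compares two independent groups of families, the sketch below assumes the rank-sum (Mann-Whitney) form, implemented with average ranks for ties and a normal approximation without continuity correction. The phenotype counts are invented for illustration and are not the data of this example.

```python
import math
from typing import List, Tuple


def rank_sum_test(x: List[float], y: List[float]) -> Tuple[float, float, float]:
    """Two-sided Wilcoxon rank-sum test; returns (U for x, z, p)."""
    pooled = sorted([(v, 0) for v in x] + [(v, 1) for v in y])
    n1, n2 = len(x), len(y)
    n = n1 + n2
    ranks = [0.0] * n
    tie_term = 0.0
    i = 0
    while i < n:
        j = i
        while j < n and pooled[j][0] == pooled[i][0]:
            j += 1
        avg_rank = (i + 1 + j) / 2.0      # mean of ranks i+1 .. j
        for k in range(i, j):
            ranks[k] = avg_rank
        t = j - i
        tie_term += t ** 3 - t            # tie correction for the variance
        i = j
    r1 = sum(r for r, (_, group) in zip(ranks, pooled) if group == 0)
    u1 = r1 - n1 * (n1 + 1) / 2.0
    mean_u = n1 * n2 / 2.0
    var_u = n1 * n2 / 12.0 * ((n + 1) - tie_term / (n * (n - 1)))
    if var_u == 0:
        return u1, 0.0, 1.0               # all values identical
    z = (u1 - mean_u) / math.sqrt(var_u)
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail probability
    return u1, z, p


# Invented counts of non-nervous-system phenotypes per family.
judged = [3, 4, 5, 4, 6, 5]
undetermined = [1, 2, 1, 0, 2, 3]

u1, z, p = rank_sum_test(judged, undetermined)
print(f"U = {u1}, z = {z:.2f}, p = {p:.4f}")
```

For samples this small an exact test would be more appropriate; the normal approximation is used here only to keep the sketch dependency-free.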
Claims (5)
1. An assessment method for judging rare genetic diseases, characterized by combining a human phenotype database with whole exome sequencing, so that phenotypes become computable and phenotype and genotype are matched more accurately, the method comprising the following steps:
step one: deep phenotype analysis, namely collecting phenotype information of all family members, excluding patients who can be judged by traditional genetic testing or metabolic screening so that core families of unknown cause remain, and performing phenotype standardization based on the Human Phenotype Ontology system: first obtaining Chinese phenotype descriptions from the clinical history, then querying and matching them in a Chinese HPO database to obtain HPO numbers, and then obtaining from the HPO database the name of the corresponding phenotype ontology term and the system to which it belongs;
step two: whole exome sequencing, namely performing whole exome sequencing on peripheral blood of the core family members and, following the best-practice guidelines of the genome analysis tool, obtaining a variant file in VCF (Variant Call Format) for each core family;
step three: identification and filtering of gene mutations, namely, taking the family as a unit, using the family phenotype information, the PED file, and the variant VCF file together for phenotype-based matching calculation, and combining the phenotype and whole exome sequencing data to prioritize candidate gene variants that conform to the inheritance pattern, thereby finally obtaining an accurate genetic judgment and evaluation.
2. The method of claim 1, wherein the phenotype standardization in step one comprises collecting clinical records, including medical history, laboratory examinations, and radiology reports, from the hospital electronic medical record system, having a clinician extract the phenotype information corresponding to the symptoms of each proband, and standardizing the phenotypes using a Chinese HPO database so that they correspond to computer-recognizable standard HPO terms.
3. The evaluation method for judging rare genetic diseases according to claim 1, wherein the whole-exome sequencing in step two comprises collecting peripheral blood from the core family members; extracting genomic DNA from peripheral blood lymphocytes with the QIAamp DNA Blood Mini kit (Qiagen, Germany); fragmenting the genomic DNA to 250-300 bp with an ultrasonicator (Covaris, Woburn, MA); constructing the sequencing library with the Agilent SureSelect Human All Exon V6 kit; and sequencing on an Illumina HiSeq X Ten platform at the Beijing genomics center, to obtain paired-end reads of 150 bp for the core family at an average sequencing depth of about 30×.
4. The evaluation method for judging rare genetic diseases according to claim 1, wherein the identification and filtering of gene mutations in step three comprises: aligning the whole-exome sequencing reads to the reference genome hg19 with the alignment tool BWA; identifying single nucleotide variants (SNVs) and small insertions and deletions (indels) following the GATK (v.4.1) best-practice workflow, with preliminary filtering by the standard GATK filtering procedure; after variant identification, annotating the mutations with InterVar and classifying them according to the ACMG/AMP guidelines into five classes, namely pathogenic, likely pathogenic, uncertain significance (VUS), likely benign and benign; at the same time obtaining the minor allele frequency (MAF) of each mutation in the 1000 Genomes Project (1000G) or the Genome Aggregation Database (gnomAD, v.2.0.2); and identifying known pathogenic variants associated with neurodevelopmental disorders in databases such as ClinVar, OMIM (omim.org) and HGMD.
5. The evaluation method for judging rare genetic diseases according to claim 1, further comprising mutation prioritization to determine the underlying pathogenic mutation and thereby make the genetic judgment, wherein mutations are prioritized by integrating phenotypic similarity calculations between known human diseases and mouse models (i.e., cross-species comparison), and are evaluated according to their pathogenicity, MAF (< 0.1%) and inheritance pattern, so that genotype and phenotype information are fully combined for the assessment of genetic diseases of unknown cause.
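The filtering and ranking logic described in claims 1, 2 and 5 can be illustrated with a minimal Python sketch. It is not the patented implementation: the two-entry `CHINESE_HPO` dictionary, the `Variant` fields, the `ACMG_WEIGHT` values, the product used as a combined score, and the gene names are all assumptions made for illustration, and the inheritance check covers only de novo and homozygous recessive trio genotypes. Only the MAF cutoff of 0.1% and the five-class ACMG/AMP scheme come from the claims.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical stand-in for the Chinese HPO database (claim 2); a real lookup
# would query the full ontology rather than a two-entry dictionary.
CHINESE_HPO = {
    "癫痫发作": ("HP:0001250", "Seizure"),
    "小头畸形": ("HP:0000252", "Microcephaly"),
}

MAF_CUTOFF = 0.001  # 0.1 %, the threshold named in claim 5

# Assumed weights per ACMG/AMP class; benign classes are left out and so dropped.
ACMG_WEIGHT = {
    "Pathogenic": 1.0,
    "Likely pathogenic": 0.8,
    "Uncertain significance": 0.5,
}


@dataclass
class Variant:
    gene: str
    acmg_class: str         # class label, e.g. as assigned by InterVar
    maf: float              # highest MAF seen in 1000G / gnomAD
    proband_gt: str         # VCF-style genotypes: "0/0", "0/1", "1/1"
    father_gt: str
    mother_gt: str
    phenotype_score: float  # hypothetical 0..1 phenotype similarity


def normalize_phenotypes(descriptions):
    """Split Chinese descriptions into matched HPO terms and unmatched text."""
    matched, unmatched = [], []
    for text in descriptions:
        if text in CHINESE_HPO:
            matched.append(CHINESE_HPO[text])
        else:
            unmatched.append(text)
    return matched, unmatched


def inheritance_mode(v: Variant) -> Optional[str]:
    """Return the pattern the trio genotypes fit, or None (simplified)."""
    parents = (v.father_gt, v.mother_gt)
    if v.proband_gt == "0/1" and parents == ("0/0", "0/0"):
        return "de_novo"
    if v.proband_gt == "1/1" and parents == ("0/1", "0/1"):
        return "autosomal_recessive"
    return None


def prioritize(variants):
    """Filter by MAF, ACMG class and inheritance, then rank by combined score."""
    kept = [
        v for v in variants
        if v.maf < MAF_CUTOFF
        and v.acmg_class in ACMG_WEIGHT
        and inheritance_mode(v) is not None
    ]
    return sorted(
        kept,
        key=lambda v: v.phenotype_score * ACMG_WEIGHT[v.acmg_class],
        reverse=True,
    )


# Made-up trio data with placeholder gene names, for illustration only.
trio_variants = [
    Variant("GENE_B", "Uncertain significance", 0.0005, "1/1", "0/1", "0/1", 0.9),
    Variant("GENE_A", "Pathogenic", 0.0, "0/1", "0/0", "0/0", 0.6),
    Variant("GENE_C", "Pathogenic", 0.02, "0/1", "0/0", "0/0", 0.9),
    Variant("GENE_D", "Likely pathogenic", 0.0, "0/1", "0/1", "0/0", 0.7),
]
ranked = prioritize(trio_variants)
print([v.gene for v in ranked])
```

In this example GENE_C is discarded because its MAF exceeds the cutoff, and GENE_D because a heterozygous variant inherited from one parent fits neither of the two simplified patterns. A fuller treatment would also need to handle compound heterozygous, X-linked and dominantly inherited variants.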
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110104870.3A CN112735599A (en) | 2021-01-26 | 2021-01-26 | Evaluation method for judging rare hereditary diseases |
Publications (1)
Publication Number | Publication Date |
---|---|
CN112735599A (en) | 2021-04-30 |
Family
ID=75593559
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110104870.3A Pending CN112735599A (en) | 2021-01-26 | 2021-01-26 | Evaluation method for judging rare hereditary diseases |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN112735599A (en) |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20150310163A1 (en) * | 2012-09-27 | 2015-10-29 | The Children's Mercy Hospital | System for genome analysis and genetic disease diagnosis |
CN106575321A (en) * | 2014-01-14 | 2017-04-19 | 欧米希亚公司 | Methods and systems for genome analysis |
CN108959848A (en) * | 2018-05-30 | 2018-12-07 | 广州普世医学科技有限公司 | Based on genetic mutation and the matched hereditary disease forecasting system of disease phenotype auto-associating |
CN109086571A (en) * | 2018-08-03 | 2018-12-25 | 国家卫生计生委科学技术研究所 | A kind of method and system that monogenic disease hereditary variation is intelligently interpreted and reported |
CN110021364A (en) * | 2017-11-24 | 2019-07-16 | 上海暖闻信息科技有限公司 | Analysis detection system based on patients clinical symptom data and full sequencing of extron group data screening single gene inheritance disease Disease-causing gene |
CN110364226A (en) * | 2019-08-16 | 2019-10-22 | 复旦大学 | It is a kind of for supplementary reproduction for the genetic risk method for early warning and system of smart strategy |
2021-01-26: application CN202110104870.3A filed (published as CN112735599A); status: Pending
Non-Patent Citations (3)
Title |
---|
PETER N. ROBINSON, ET AL: "Improved exome prioritization of disease genes through cross-species phenotype comparison", GENOME RESEARCH, vol. 24, no. 2, pages 340 - 348 * |
SMEDLEY D, JACOBSEN JO, JÄGER M, ET AL: "Next-generation diagnostics and disease-gene discovery with the Exomiser", NATURE PROTOCOLS, vol. 10, no. 12, pages 1 - 27 * |
TOMASZ ZEMOJTEL, ET AL: "Effective diagnosis of genetic disease by computational phenotype analysis of the disease-associated genome", SCIENCE TRANSLATIONAL MEDICINE, vol. 6, no. 252, pages 1 - 9 * |
Cited By (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2022241481A1 (en) * | 2021-05-14 | 2022-11-17 | Tmaccelerator Company, Llc | Precision medicine systems and methods |
CN113628681A (en) * | 2021-07-21 | 2021-11-09 | 哈尔滨星云医学检验所有限公司 | Family denovo mutation-based analysis method and application thereof |
CN113611361A (en) * | 2021-08-10 | 2021-11-05 | 飞科易特(广州)基因科技有限公司 | Matching method of single-gene autosomal recessive genetic disease for marriage and love matching |
CN113611361B (en) * | 2021-08-10 | 2023-08-08 | 飞科易特(广州)基因科技有限公司 | Matching method for single-gene autosomal recessive genetic disease for wedding love matching |
CN114023384A (en) * | 2022-01-06 | 2022-02-08 | 天津金域医学检验实验室有限公司 | Method for automatically generating standardized report of full exome sequencing annotation table |
CN114023384B (en) * | 2022-01-06 | 2022-04-05 | 天津金域医学检验实验室有限公司 | Method for automatically generating standardized report of full exome sequencing annotation table |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
AU2019229273B2 (en) | Ultra-sensitive detection of circulating tumor DNA through genome-wide integration | |
JP7145907B2 (en) | Systems and Methods for Detection and Treatment of Diseases Exhibiting Disease Cell Heterogeneity and Communication Test Results | |
CN112735599A (en) | Evaluation method for judging rare hereditary diseases | |
JP4437050B2 (en) | Diagnosis support system, diagnosis support method, and diagnosis support service providing method | |
CN106319047B (en) | Resolving genome fractions using polymorphic counts | |
CN105143466B (en) | Pass through extensive parallel RNA sequencing analysis mother blood plasma transcript profile | |
KR101693510B1 (en) | Genotype analysis system and methods using genetic variants data of individual whole genome | |
CN111192634A (en) | Method for processing genomic data | |
AU2009279734A1 (en) | Methods for allele calling and ploidy calling | |
KR101693504B1 (en) | Discovery system for disease cause by genetic variants using individual whole genome sequencing data | |
WO2013026411A1 (en) | Single cell classification method, gene screening method and device thereof | |
HUE030510T2 (en) | Diagnosing fetal chromosomal aneuploidy using genomic sequencing | |
JP2003021630A (en) | Method of providing clinical diagnosing service | |
CN110770838A (en) | Method and system for determining clonality of somatic mutations | |
CN105555970B (en) | Method and system for simultaneous haplotyping and chromosomal aneuploidy detection | |
US20210343414A1 (en) | Methods and apparatus for phenotype-driven clinical genomics using a likelihood ratio paradigm | |
CN115244622A (en) | Systems and methods for calling variants using methylation sequencing data | |
JP2023546240A (en) | How to assess your risk of developing a disease | |
CN116209777A (en) | Genetic relationship judging method and device based on noninvasive prenatal gene detection data | |
US11869630B2 (en) | Screening system and method for determining a presence and an assessment score of cell-free DNA fragments | |
EP3635138B1 (en) | Method for analysing cell-free nucleic acids | |
KR20210120782A (en) | Construction method of customized variant-based reference data set | |
US20030170638A1 (en) | Methods to determine genetic risk through analysis of very large families | |
Belmont et al. | Complex phenotypes and complex genetics: an introduction to genetic studies of complex traits | |
CN113039606A (en) | Methods and systems for pedigree enrichment and family-based analysis within pedigrees |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||