CN110544537A - Generation method of single-gene genetic disease gene analysis report and electronic equipment thereof - Google Patents

Generation method of single-gene genetic disease gene analysis report and electronic equipment thereof

Info

Publication number
CN110544537A
Authority
CN
China
Prior art keywords
data
gene
variation
disease
site data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201910688048.9A
Other languages
Chinese (zh)
Inventor
胡菲菲
李明明
李明壮
明泓博
张静艳
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
UNITED ELECTRONICS CO Ltd
Original Assignee
UNITED ELECTRONICS CO Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by UNITED ELECTRONICS CO Ltd filed Critical UNITED ELECTRONICS CO Ltd
Priority to CN201910688048.9A priority Critical patent/CN110544537A/en
Publication of CN110544537A publication Critical patent/CN110544537A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B20/00 ICT specially adapted for functional genomics or proteomics, e.g. genotype-phenotype associations
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H50/00 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
    • G16H50/70 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for mining of medical data, e.g. analysing previous cases of other patients

Landscapes

  • Engineering & Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Medical Informatics (AREA)
  • Data Mining & Analysis (AREA)
  • General Health & Medical Sciences (AREA)
  • Public Health (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Molecular Biology (AREA)
  • Biophysics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Biotechnology (AREA)
  • Evolutionary Biology (AREA)
  • Chemical & Material Sciences (AREA)
  • Genetics & Genomics (AREA)
  • Spectroscopy & Molecular Physics (AREA)
  • Theoretical Computer Science (AREA)
  • Proteomics, Peptides & Aminoacids (AREA)
  • Biomedical Technology (AREA)
  • Databases & Information Systems (AREA)
  • Pathology (AREA)
  • Analytical Chemistry (AREA)
  • Epidemiology (AREA)
  • Primary Health Care (AREA)
  • Measuring Or Testing Involving Enzymes Or Micro-Organisms (AREA)

Abstract

The invention discloses a method for generating a gene analysis report for a monogenic genetic disease, and an electronic device therefor. Specifically, the method comprises the following steps: obtaining a sample comprising at least one variation site data; screening the variation site data according to determination conditions to obtain at least one target variation site data; querying an interpretation database according to the target variation site data to obtain the medical interpretation data corresponding to the target variation site data; and generating the gene analysis report for the monogenic genetic disease according to the sample, the at least one target variation site data and the corresponding medical interpretation data. The technical scheme of the invention improves the efficiency of gene analysis for monogenic genetic diseases, is scientifically sound, and has high accuracy.

Description

Generation method of single-gene genetic disease gene analysis report and electronic equipment thereof
Technical Field
The invention relates to the technical field of bioinformatics, and in particular to a method for generating a gene analysis report for a monogenic genetic disease and an electronic device therefor.
Background
A monogenic genetic disease is a hereditary disease controlled by a single pair of alleles. A rare disease is a disease that affects 0.65‰ to 1‰ of the total population; most rare diseases are chronic and serious, and many are life-threatening. About 80% of monogenic genetic diseases are currently known to be rare diseases. Although the prevalence of each monogenic genetic disease is low, there are many such diseases: about 6,000 to 7,000 worldwide, accounting for about 10% of human diseases. Because the diseases are so varied and each affects only a small number of patients, reaching an accurate diagnosis takes five years on average, which seriously affects the diagnosis and treatment of patients.
With the surge of interest in precision medicine, gene detection technology has been developed and applied rapidly. Applying gene detection to the screening and diagnosis of monogenic genetic diseases provides effective support for screening, controlling and treating them. However, a gene detection result contains a very large amount of variation site data, so clinical researchers must spend a great deal of time and effort analyzing it, and the efficiency is very low.
Disclosure of Invention
In view of this, embodiments of the present invention aim to provide a method for generating a gene analysis report for a monogenic genetic disease and an electronic device therefor, so as to solve the problem that gene analysis for monogenic genetic diseases is time-consuming and inefficient.
To this end, the present invention provides a method for generating a gene analysis report for a monogenic genetic disease, comprising:
obtaining a sample comprising at least one variation site data;
screening the variation site data according to determination conditions to obtain at least one target variation site data;
querying an interpretation database according to the target variation site data to obtain medical interpretation data corresponding to the target variation site data;
and generating a gene analysis report for the monogenic genetic disease according to the sample, the at least one target variation site data and the corresponding medical interpretation data.
Optionally, when there are multiple samples and the samples come from related individuals, the method further includes: comparing the at least one target variation site data and the corresponding medical interpretation data of the different samples to obtain the inheritance pattern of the disease and the disease-causing gene, and recording them in the gene analysis report for the monogenic genetic disease.
Optionally, in the process of screening the variation site data according to the determination conditions, if none of the variation site data meets the determination conditions, a negative gene analysis report for the monogenic genetic disease is generated according to the sample.
Optionally, the method further includes: determining the determination condition from the sample, the determination condition being selected from at least one of genetic pattern, pathogenicity of variation, population frequency, gene combination, type of variation, gene, and sequencing data quality control.
Optionally, when there are multiple determination conditions, the variation site data are screened using each of the determination conditions, and the variation site data that meet all of the determination conditions at the same time are determined as target variation site data.
Optionally, when the sample includes phenotype information, the determination condition further includes a phenotype;
the screening of the variation site data according to the determination conditions to obtain at least one target variation site data includes:
querying a biomedical database for the in-library variation site data corresponding to the phenotype information, wherein variation site data of the sample pass the phenotype screening when they match the in-library variation site data.
Optionally, when the determination condition includes a population frequency;
the screening of the variation site data according to the determination conditions to obtain at least one target variation site data includes:
querying a biomedical database for the population frequency information of the variation site data, and screening the variation site data according to the query result.
Optionally, the determination conditions include pathogenicity of variation, population frequency and sequencing data quality control.
Optionally, the interpretation database comprises a disease-introduction data table, a gene-disease-first reference data table, a gene-risk cue-second reference data table, a disease-guidance advice data table, and a disease-guidance advice-third reference data table.
In another aspect of the embodiments of the present invention, an electronic device is provided, including:
at least one processor; and
a memory coupled to the at least one processor; wherein
the memory stores instructions executable by the at least one processor, and the instructions, when executed by the at least one processor, enable the at least one processor to perform the generation method described above.
As can be seen from the above description, the method for generating a gene analysis report for a monogenic genetic disease and the electronic device provided by the embodiments of the present invention screen the variation site data according to determination conditions. This reduces the amount of variation site data passed to the interpretation database, avoids analyzing unrelated variation site data, and makes the interpretation more targeted and more efficient. The interpretation database is then used to interpret the target variation site data efficiently, and the resulting medical interpretation data are presented in the analysis report. The method improves the efficiency of gene analysis for monogenic genetic diseases, is scientifically sound, and has high accuracy. By reading the analysis report, the subject and clinical staff can understand the possible association between the target variation site data and the disease, which greatly improves the readability and practical value of the variation site data.
Drawings
FIG. 1 is a schematic flow chart showing a method for generating a gene analysis report of a monogenic genetic disease according to an embodiment of the present invention;
FIG. 2 is a schematic flow chart showing the screening steps in an embodiment of the method for generating a single-gene genetic disease gene analysis report according to the present invention;
FIG. 3 is a schematic view showing the flow of the interpretation step in an embodiment of the method for generating a gene analysis report of a monogenic genetic disease according to the present invention;
FIG. 4 is a schematic structural view of an embodiment of an apparatus for generating a single-gene genetic disease gene analysis report according to the present invention;
Fig. 5 is a schematic structural diagram of an embodiment of an electronic device provided in the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is described in further detail below with reference to specific embodiments and the accompanying drawings.
It should be noted that all uses of "first" and "second" in the embodiments of the present invention serve to distinguish two entities or two parameters that share the same name but are not the same. "First" and "second" are used merely for convenience of description and should not be construed as limiting the embodiments of the present invention; this is not repeated in the following embodiments.
In view of the above, the first aspect of the embodiments of the present invention provides a method for generating a single-gene genetic disease gene analysis report with high accuracy and high analysis efficiency. Fig. 1 is a schematic flow chart of a method for generating a monogenic genetic disease analysis report according to an embodiment of the present invention.
The generation method of the single-gene genetic disease analysis report comprises the following steps:
Step 101: obtaining a sample comprising at least one variation site data;
Step 102: screening the variation site data according to determination conditions to obtain at least one target variation site data;
Step 103: querying an interpretation database according to the target variation site data to obtain medical interpretation data corresponding to the target variation site data;
Step 104: generating a gene analysis report for the monogenic genetic disease according to the sample, the at least one target variation site data and the corresponding medical interpretation data.
As the foregoing embodiment shows, the method screens the variation site data according to determination conditions. This reduces the amount of variation site data passed to the interpretation database, avoids analyzing unrelated variation site data, and makes the interpretation more targeted and more efficient. The interpretation database is then used to interpret the target variation site data efficiently, and the resulting medical interpretation data are presented in the analysis report. The method improves the efficiency of gene analysis for monogenic genetic diseases, is scientifically sound, and has high accuracy. By reading the analysis report, the subject and clinical staff can understand the possible association between the target variation site data and the disease, which greatly improves the readability and practical value of the variation site data.
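Steps 101 to 104 can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the implementation of the embodiment: the `Variant` and `Sample` classes, the `generate_report` function, the dictionary standing in for the interpretation database, and the gene and variant names are all hypothetical placeholders.

```python
from dataclasses import dataclass, field


@dataclass
class Variant:
    gene: str            # gene identification (placeholder values below)
    hgvsc: str           # variant name (placeholder values below)
    pathogenicity: str   # e.g. "pathogenic", "benign"


@dataclass
class Sample:
    subject: str
    variants: list = field(default_factory=list)


def generate_report(sample, conditions, interpretation_db):
    """Steps 101-104: screen, interpret, and assemble a report dict."""
    # Step 102: a variant is a target only if it meets every condition.
    targets = [v for v in sample.variants if all(c(v) for c in conditions)]
    if not targets:
        # No variant passed the screening: negative report.
        return {"subject": sample.subject, "result": "negative", "findings": []}
    # Step 103: look up interpretation text by gene identification.
    findings = [
        {
            "gene": v.gene,
            "variant": v.hgvsc,
            "pathogenicity": v.pathogenicity,
            "interpretation": interpretation_db.get(v.gene, "no entry"),
        }
        for v in targets
    ]
    # Step 104: the report combines sample info, targets and interpretation.
    return {"subject": sample.subject, "result": "positive", "findings": findings}


def not_benign(v):
    # Hypothetical determination condition on variant pathogenicity.
    return v.pathogenicity not in {"benign", "suspected benign"}


sample = Sample("subject-1", [
    Variant("GENE_A", "c.100A>G", "pathogenic"),
    Variant("GENE_B", "c.5C>T", "benign"),
])
report = generate_report(sample, [not_benign], {"GENE_A": "placeholder text"})
print(report["result"], [f["gene"] for f in report["findings"]])
```

Passing the conditions as a list of predicates keeps the screening step independent of which determination conditions a given sample calls for.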
Optionally, the variation site data comprises annotation information, sequencing data quality control and variation pathogenicity.
The annotation information can be obtained by using a gene annotation tool to annotate the gene data produced by testing for the monogenic genetic disease. The annotation information includes, but is not limited to, Phred quality value, allelic status, allelic frequency, type of variation, genetic pattern, gene identification (used to identify the gene), transcript identification, HGVSc (the variant name at the coding DNA level under the Human Genome Variation Society nomenclature), HGVSp (the variant name at the protein level under the same nomenclature), exon start position, exon stop position, number of exons, intron start position, intron stop position, number of introns, and father/mother detection status.
The sequencing data quality control is directly derived from a sample detection result, and can be used for evaluating the sequencing reliability of the mutation site.
Wherein the variant pathogenicity is used to assess the pathogenicity of a variant site, such as: pathogenic, suspected pathogenic, unknown clinical significance, suspected benign, benign.
Optionally, the sample further comprises information such as the name, sex, age, clinical diagnosis, treatment history and family history of the subject, the time of sample testing, or the institution that tested the sample. Here, the clinical diagnosis includes phenotype information that has been matched against the Human Phenotype Ontology (HPO) database and can be used directly to query the HPO database.
In some optional embodiments, when there are multiple samples and the samples come from related individuals, the method further comprises: comparing the at least one target variation site data and the corresponding medical interpretation data of the different samples to obtain the inheritance pattern of the disease and the disease-causing gene, and recording them in the gene analysis report for the monogenic genetic disease. Optionally, the genetic model is set based on the proband and father/mother samples, so that target variation site data with clinical significance under that genetic model can be analyzed rapidly and accurately.
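One simple way to compare related samples is to label each target variation site of the proband by which parent also carries it. The sketch below is an assumption about how such a comparison could look; the patent text does not specify the comparison logic, and the function name `classify_origin`, the labels, and the variant keys are hypothetical.

```python
def classify_origin(proband, father, mother):
    """Label each proband variant by which parent(s) also carry it."""
    origins = {}
    for variant in sorted(proband):
        in_father = variant in father
        in_mother = variant in mother
        if in_father and in_mother:
            origins[variant] = "both parents"
        elif in_father:
            origins[variant] = "paternal"
        elif in_mother:
            origins[variant] = "maternal"
        else:
            origins[variant] = "not found in parents"
    return origins


# Placeholder variant keys in "gene:variant" form (not real findings).
proband = {"GENE_A:c.100A>G", "GENE_A:c.250del", "GENE_B:c.5C>T"}
father = {"GENE_A:c.100A>G"}
mother = {"GENE_A:c.250del"}

origins = classify_origin(proband, father, mother)
for variant, origin in origins.items():
    print(variant, "->", origin)
```

In this toy data the two GENE_A variants come from different parents, which is the kind of observation a reviewer would check against an autosomal recessive genetic model.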
In some optional embodiments, in the process of screening the variation site data according to the determination conditions, if none of the variation site data meets the determination conditions, a negative gene analysis report for the monogenic genetic disease is generated according to the sample. As those skilled in the art will appreciate, in this case the variation site data in the sample are not associated with a monogenic genetic disease. This can be confirmed further by performing gene testing again to expand the variation site data in the sample and then analyzing the expanded data.
In some optional embodiments, further comprising: determining the determination condition from the sample, the determination condition being selected from at least one of genetic pattern, pathogenicity of variation, type of variation, combination of genes, population frequency, and sequencing data quality control.
The genetic pattern condition is used to screen by genetic pattern. Specifically, the genetic patterns include autosomal dominant, autosomal recessive, X-linked dominant, X-linked recessive, and the like. For example, methylmalonic aciduria is currently known to be autosomal recessive. When the phenotype of a sample is known to be methylmalonic aciduria, the genetic pattern condition is set to autosomal recessive: variation site data whose genetic pattern is autosomal recessive are kept for analysis and variation site data of other genetic patterns are filtered out, which saves time in the subsequent interpretation and makes the analysis more targeted.
The variation pathogenicity condition is used to screen the variation site data by pathogenicity. Generally, this condition filters out variation site data whose pathogenicity is benign or suspected benign. Such data are clearly irrelevant to the gene analysis of a monogenic genetic disease, so removing them improves the analysis efficiency.
The variation type condition is used to select a variation type so that variation site data of specific types are screened out. Here, the variation types include, but are not limited to, splice acceptor variation, splice donor variation, frameshift variation, stop codon gain, stop codon loss, start codon loss, missense variation, nonsense variation, synonymous variation, in-frame insertion, in-frame deletion, splice region variation, 5' UTR variation, 3' UTR variation, intron variation, variation upstream of the transcription start site, variation downstream of the transcription start site, and the like.
Wherein, the gene combination (Panel) condition is used for determining the gene combination of interest so as to screen the variation site data in the target region.
The gene condition is used to select a gene of interest so that the variation site data on the target gene are screened out.
The sequencing data quality control condition is used to screen for variation site data with reliable sequencing results. For example, variation site data with a high sequencing depth are more reliable. The condition keeps variation site data whose sequencing depth reaches a given value and filters out those that do not (failing to reach the value indicates that the variation site data are limited by the sequencing result and are unreliable), so as to ensure the scientific accuracy of the analysis result.
The determination conditions are chosen according to which information, such as genetic pattern, variation pathogenicity, variation type, gene, and sequencing data quality control, the variation site data actually contain.
In some optional embodiments, when the determination condition is multiple, the variant site data is screened by using the multiple determination conditions, and the variant site data meeting the multiple determination conditions at the same time is determined as target variant site data.
Referring to fig. 2, as an alternative embodiment, when the determination condition includes the first determination condition, the second determination condition … … and the nth determination condition, the process of screening the at least one mutation site data 201 is as follows:
Step 202: judging whether the mutation site data meet a first judgment condition, and if so, carrying out the next step;
Step 203: judging whether the mutation site data meeting the first judgment condition meets a second judgment condition or not, and if so, carrying out the next step;
And so on, until in step 204 the variation site data is judged against the nth determination condition. Variation site data that satisfies the nth determination condition has satisfied all n determination conditions at the same time and is the target variation site data 205.
According to the technical scheme, the mutation site data which does not meet the previous judgment condition does not need to be judged by the next judgment condition, the data processing amount and the processing time of the mutation site data screening process can be effectively reduced, and the whole analysis efficiency is improved.
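The sequential screening of FIG. 2 can be sketched as follows. The function `screen_sequentially`, the record fields, and the two thresholds are illustrative assumptions; the source does not give concrete threshold values. The counter shows that data rejected by an earlier condition never reaches a later one.

```python
def screen_sequentially(variants, conditions):
    """Apply conditions in order; stop as soon as nothing remains."""
    remaining = list(variants)
    for condition in conditions:
        remaining = [v for v in remaining if condition(v)]
        if not remaining:
            break  # later conditions are never evaluated
    return remaining


calls = {"frequency": 0}


def depth_ok(v):
    # Sequencing data quality control; the threshold 20 is an assumption.
    return v["depth"] >= 20


def rare(v):
    # Population frequency; the threshold 0.01 is an assumption.
    calls["frequency"] += 1
    return v["population_af"] < 0.01


variants = [
    {"id": "v1", "depth": 50, "population_af": 0.0001},
    {"id": "v2", "depth": 5, "population_af": 0.0001},
    {"id": "v3", "depth": 60, "population_af": 0.2},
]
targets = screen_sequentially(variants, [depth_ok, rare])
print([v["id"] for v in targets], calls["frequency"])
```

Here `v2` fails the depth condition, so the frequency condition is evaluated only for `v1` and `v3`. Putting the cheapest or most selective condition first reduces the work done by the later ones.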
Optionally, in the screening process, when no mutation site data satisfies a certain determination condition, the whole screening process may be stopped, and a single-gene genetic disease gene analysis report based on the sample is generated.
As another alternative embodiment, when the determination condition includes a first determination condition, a second determination condition, ..., and an nth determination condition, the at least one variation site data is screened with each of the n determination conditions separately. This yields a first set (the variation site data satisfying the first determination condition), a second set (the variation site data satisfying the second determination condition), ..., and an nth set (the variation site data satisfying the nth determination condition). The target variation site data is obtained by taking the intersection of the first set, the second set, ..., and the nth set.
Optionally, if the intersection is empty, a gene analysis report for the monogenic genetic disease is generated based on the sample.
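The set-based alternative can be sketched with Python sets. The function `screen_by_intersection`, the record fields, and the thresholds are hypothetical; each condition is applied to the full data independently and the per-condition sets are then intersected.

```python
def screen_by_intersection(variants, conditions):
    """variants: dict of id -> record. Returns the ids meeting all conditions."""
    per_condition = [
        {vid for vid, record in variants.items() if condition(record)}
        for condition in conditions
    ]
    # Start from all ids so that an empty condition list keeps everything.
    return set(variants).intersection(*per_condition)


variants = {
    "v1": {"depth": 50, "population_af": 0.0001, "pathogenicity": "pathogenic"},
    "v2": {"depth": 5, "population_af": 0.0001, "pathogenicity": "pathogenic"},
    "v3": {"depth": 60, "population_af": 0.2, "pathogenicity": "benign"},
}
conditions = [
    lambda r: r["depth"] >= 20,            # assumed quality threshold
    lambda r: r["population_af"] < 0.01,   # assumed frequency threshold
    lambda r: r["pathogenicity"] not in {"benign", "suspected benign"},
]

target_ids = screen_by_intersection(variants, conditions)
print(sorted(target_ids))

# An empty intersection corresponds to the case where no target data exists.
none_left = screen_by_intersection(variants, [lambda r: r["depth"] > 1000])
print(len(none_left) == 0)
```

Compared with the sequential approach, every condition sees every record, so more work is done, but the per-condition sets can be computed independently of one another.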
In some alternative embodiments, when the sample comprises phenotypic information, the decision condition further comprises a phenotype;
The step 102 of screening the mutation site data according to the determination condition to obtain at least one target mutation site data specifically includes:
querying a biomedical database for the in-library variation site data corresponding to the phenotype information, wherein variation site data of the sample pass the phenotype screening when they match the in-library variation site data.
When a sample contains a large amount of variation site data, phenotype information can be used in combination with biomedical data. On one hand, this quickly reduces the total amount of variation site data and shortens the analysis time; on the other hand, because phenotype information corresponds to variation site data, the gene analysis of the monogenic genetic disease becomes more targeted, more scientific and more accurate. For example, if the phenotype information of the sample is epileptic encephalopathy and infantile spasms, the phenotype condition can be used to select the variation site data associated with epileptic encephalopathy and infantile spasms, so that the target variation site data analyzed subsequently are all related to these phenotypes.
Optionally, the biomedical databases include the Human Phenotype Ontology (HPO) database, public population frequency databases, the dbSNP database, and the ClinVar database. These are all authoritative public databases, so their contents are not described in detail here; only the way the biomedical data are used is illustrated. For example, the HPO database includes phenotype information and the in-library variation site data corresponding to that phenotype information. The in-library variation site data can be queried using the phenotype information of the sample, and when variation site data in the sample match the in-library variation site data, those variation site data meet the phenotype determination condition.
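The phenotype screening can be sketched as a lookup followed by a match. In this sketch a plain dictionary stands in for the biomedical database; the phenotype labels, variant keys, and the function `phenotype_screen` are hypothetical, and no real HPO query interface is used or implied.

```python
def phenotype_screen(sample_variants, sample_phenotypes, library):
    """Keep the sample variants that match in-library data for the phenotypes."""
    in_library = set()
    for term in sample_phenotypes:
        # Unknown phenotype terms contribute no in-library data.
        in_library |= library.get(term, set())
    return [v for v in sample_variants if v in in_library]


# Stand-in for the database: phenotype -> in-library variation site keys.
library = {
    "phenotype_A": {"GENE_A:c.100A>G", "GENE_C:c.7G>A"},
    "phenotype_B": {"GENE_D:c.42del"},
}
sample_variants = ["GENE_A:c.100A>G", "GENE_B:c.5C>T", "GENE_D:c.42del"]

passed = phenotype_screen(sample_variants, ["phenotype_A", "phenotype_B"], library)
print(passed)
```

The variant with no in-library match (`GENE_B:c.5C>T`) is dropped, which is how the phenotype condition narrows the data before interpretation.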
In some alternative embodiments, when the determination condition includes a population frequency, the step 102 of screening the variation site data according to the determination conditions to obtain at least one target variation site data specifically includes: querying a biomedical database for the population frequency information of the variation site data, and screening the variation site data according to the query result. For example, the public population frequency databases include the 1000 Genomes Project (global), the 1000 Genomes Project (East Asian), ESP6500 (global), the ExAC database, gnomAD (global), gnomAD (East Asian), gnomAD (South Asian), and the like. An appropriate population frequency database can be chosen according to the population to which the sample belongs, and the population frequency of the variation site data is queried in that database, so that variations that are common in a control population or the general population are filtered out, thereby screening the variation site data.
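The population frequency screening can be sketched as choosing a frequency table for the sample's population and dropping variations above a cutoff. The tables, their values, the fallback to a global table, the cutoff of 0.01, and the rule that an absent variation counts as rare are all assumptions made for illustration; the source states none of them.

```python
# Stand-in frequency tables keyed by population (values are made up).
FREQUENCY_TABLES = {
    "global": {"v1": 0.0002, "v2": 0.15},
    "east_asian": {"v1": 0.0001, "v2": 0.004, "v3": 0.08},
}


def select_table(population):
    """Pick the table for the sample's population, else fall back to global."""
    return FREQUENCY_TABLES.get(population, FREQUENCY_TABLES["global"])


def filter_common(variant_ids, population, max_af=0.01):
    """Drop variations whose population frequency exceeds max_af."""
    table = select_table(population)
    # A variation absent from the table is treated as frequency 0.0 (assumption).
    return [vid for vid in variant_ids if table.get(vid, 0.0) <= max_af]


ids = ["v1", "v2", "v3", "v4"]
kept_east_asian = filter_common(ids, "east_asian")
kept_global = filter_common(ids, "global")
print(kept_east_asian, kept_global)
```

The same variation can pass in one population and fail in another (`v2` and `v3` here), which is why the choice of database matters.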
In some alternative embodiments, the decision conditions include pathogenicity of variation, population frequency, and sequencing data quality control. The sequencing data quality control is used for screening out variant locus data corresponding to a high-quality sequencing result, filtering variant locus data with unreliable sequencing result, reducing subsequent analysis workload, and effectively avoiding the influence of the unreliable variant locus data on the analysis result, which is the basis for ensuring the scientific accuracy of an analysis report. Regarding the population frequency, as mentioned above, the common variation of the general population or the control population is obviously unrelated to the monogenic genetic disease, so that the specific variation site data of the monogenic genetic disease is screened through the population frequency condition, the pertinence is stronger, and the analysis report result is more reliable. In addition, the data of the variation sites corresponding to the benign are filtered through the variation pathogenicity conditions, and the workload of subsequent analysis is also reduced. Through the three judgment conditions, the purposes of filtering variation locus data and guaranteeing the scientific and accurate analysis result can be simultaneously realized.
In some alternative embodiments, the interpretation database comprises a disease-introduction data table, a gene-disease-first reference data table, a gene-risk cue-second reference data table, a disease-guidance advice data table, and a disease-guidance advice-third reference data table.
Referring to fig. 3, the structure and usage of the interpretation database will be described in detail as follows:
the disease-introduction data table includes disease identification, gene identification, and disease description. The disease identification and disease description in the disease-presentation data sheet can be extracted based on the gene identification in the target mutation site data.
The gene-disease-first reference data table includes a disease identification, a gene identification, a first reference content. The corresponding first reference identity and first reference content may be extracted based on the gene identity and disease identity that the disease-introduction data table has confirmed matching the gene identity and disease identity of the gene-disease-first reference data table.
The gene-risk cue data sheet includes a disease identification, a gene function content identification (abbreviated as a function identification in fig. 3), a gene function content (abbreviated as a gene function in fig. 3), a risk cue content identification (abbreviated as a risk identification in fig. 3), and a risk cue content (abbreviated as a risk cue in fig. 3). Corresponding gene function content and risk cue content can be extracted according to the confirmed disease identification and gene identification of the disease-introduction data sheet and matching with the disease identification and gene identification of the gene-risk cue data sheet. Another precondition for displaying risk prompting content in the single-gene genetic disease detection report is that the result of the ACMG mutation intelligent judgment system for the gene mutation is a pathogenic or suspected pathogenic result.
The gene-risk cue-second reference data table comprises a gene function content identifier, a risk cue content identifier, a second reference identifier and a second reference content. Corresponding second reference identifications and second reference contents may be extracted based on the confirmed gene function content identifications of the gene-risk cue data table matching the gene-risk cue-second reference data table. Corresponding second reference identifications and second reference contents may be extracted based on the confirmed risk cue content identifications of the gene-risk cue data table matching the risk cue content identifications of the gene-risk cue-reference data table.
The disease-guidance suggestion data table includes a disease identification, a disease guidance suggestion content identification (abbreviated as suggestion identification in fig. 3), and a disease guidance suggestion content (abbreviated as guidance suggestion in fig. 3). Corresponding disease guidance suggestion content can be extracted by matching the confirmed disease identification of the disease-introduction data table against the disease identification of the disease-guidance suggestion data table.
The disease-guidance suggestion-third reference data table includes a disease guidance suggestion content identification, a third reference identification, and a third reference content. Corresponding third reference identifications and third reference contents may be extracted by matching the confirmed disease guidance suggestion content identification of the disease-guidance suggestion data table against the disease guidance suggestion content identification of the disease-guidance suggestion-third reference data table.
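The table matching described above can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the in-memory dictionaries stand in for the interpretation database, the field names and placeholder texts are hypothetical, and the accepted pathogenicity spellings are guesses. It only shows the idea of joining tables on shared identifications and showing the risk cue solely for pathogenic or suspected pathogenic results.

```python
# Hypothetical stand-ins for two interpretation-database tables; the keys,
# field names and placeholder texts are assumptions, not the patent's schema.
GENE_RISK_CUE = {
    # (disease identification, gene identification) -> gene-risk cue row
    ("D001", "G001"): {
        "function_id": "F001",
        "gene_function": "placeholder gene function text",
        "risk_id": "R001",
        "risk_cue": "placeholder risk cue text",
    },
}
SECOND_REFERENCES = [
    # rows of the gene-risk cue-second reference data table
    {"function_id": "F001", "risk_id": None, "ref_id": "S1", "content": "placeholder reference A"},
    {"function_id": None, "risk_id": "R001", "ref_id": "S2", "content": "placeholder reference B"},
]
# Pathogenicity labels that allow the risk cue to be shown (assumed spellings).
REPORTABLE = {"pathogenic", "suspected pathogenic", "likely pathogenic"}


def lookup_gene_risk_cue(disease_id, gene_id, pathogenicity):
    """Return gene function, optional risk cue, and matching second references."""
    row = GENE_RISK_CUE.get((disease_id, gene_id))
    if row is None:
        return None
    show_risk = pathogenicity.strip().lower() in REPORTABLE
    # References attached to the gene function are always kept; references
    # attached to the risk cue are kept only when the risk cue is shown.
    references = [
        ref["content"]
        for ref in SECOND_REFERENCES
        if ref["function_id"] == row["function_id"]
        or (show_risk and ref["risk_id"] == row["risk_id"])
    ]
    return {
        "gene_function": row["gene_function"],
        "risk_cue": row["risk_cue"] if show_risk else None,
        "references": references,
    }
```

In a real system the same joins would more likely be expressed as database queries; the dictionaries are used here only to keep the sketch self-contained.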
Based on the sample, the at least one target mutation site data, and the corresponding medical interpretation data, the monogenic genetic disease analysis report generated by embodiments of the present invention may include one or more of the following:
Information such as the name, sex, age, clinical diagnosis, treatment history, family history, sample testing time or sample testing institution of the subject;
Sample quality control: obtained from the sequencing data quality control in the target mutation site data;
Variation statistics: obtained by counting the gene identifications and variation pathogenicity in the at least one target mutation site data;
Gene: obtained from the gene identification in the target mutation site data;
Variation type: obtained from the variation type in the target mutation site data;
Genetic pattern: obtained from the genetic pattern in the target mutation site data;
Gene subregion: obtained from the exon start positions, exon end positions, total number of exons, intron start positions, intron end positions, total number of introns, and the like in the target mutation site data;
Heterozygous/homozygous: obtained from the allelic state in the target mutation site data;
Family verification result: determined from the father/mother detection state in the target mutation site data (if the father/mother was not tested, the family verification result is that the father/mother was not tested; if the father/mother was tested and the variation was not detected, the family verification result is that the father/mother does not carry the variation; if the variation was detected, the family verification result is the heterozygous or homozygous variation state of the father/mother);
Variation pathogenicity: obtained from the variation pathogenicity in the target mutation site data;
Gene function description: obtained from the gene-risk cue data table in the interpretation database;
Gene-related monogenic genetic diseases: obtained from the disease-introduction data table in the interpretation database;
Disease risk cue: obtained from the gene-risk cue data table in the interpretation database;
Guidance suggestions: obtained from the disease-guidance suggestion data table in the interpretation database;
References: obtained from the gene-disease-first reference data table, the gene-risk cue-second reference data table and the disease-guidance suggestion-third reference data table in the interpretation database.
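Two of the report items above, the family verification result and the variation statistics, are simple enough to sketch in Python. The sketch assumes the three family verification cases are: parent not tested, parent tested without the variation, and parent tested with the variation in a heterozygous or homozygous state. The function names, argument names and output strings are hypothetical.

```python
from collections import Counter


def family_verification(relative, tested, genotype=None):
    """Map a parent's detection state to a family verification result.

    relative: "father" or "mother"
    tested:   whether the parent's sample was tested at this site
    genotype: None if the variation was not found, otherwise
              "heterozygous" or "homozygous" (assumed encoding)
    """
    if not tested:
        return f"{relative} not tested"
    if genotype is None:
        return f"{relative} does not carry the variation"
    return f"{relative}: {genotype} variation"


def variation_statistics(targets):
    """Count target variants per (gene, pathogenicity) pair."""
    return Counter((t["gene"], t["pathogenicity"]) for t in targets)
```

For example, `family_verification("mother", tested=True)` yields the "does not carry" result, while `variation_statistics` returns a `Counter` that can be rendered directly as a summary table in the report.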
The generated monogenic genetic disease analysis report comprises sample quality control, genes, gene functions, disease description, disease risk cues, guidance suggestions, references and the like, and can be consulted by the examinees and clinicians. The form and the content of the monogenic genetic disease analysis report can be adjusted according to the sample, the at least one target mutation site data and the related information acquired from the corresponding medical interpretation data, so that the result of analyzing the mutation site data in the sample can be displayed simply and clearly.
Optionally, the monogenic genetic disease analysis report is matched directly to a suitable template, which is faster; presenting the report through a standardized template also lets clinical researchers view the analysis report in a familiar format.
In view of the above-mentioned objects, a second aspect of the embodiments of the present invention provides a device for generating a single-gene genetic disease gene analysis report, which is scientific, accurate, and efficient. Fig. 4 is a schematic structural diagram of an embodiment of a device for generating a single-gene genetic disease analysis report according to the present invention.
The device for generating the monogenic genetic disease gene analysis report comprises:
A sample acquisition module 301, configured to acquire a sample, the sample comprising at least one mutation site data;
A screening module 302, configured to screen the mutation site data according to the determination condition to obtain at least one target mutation site data;
An interpretation module 303, configured to query an interpretation database according to the target mutation site data to obtain medical interpretation data corresponding to the target mutation site data;
A report generation module 304, configured to generate a single-gene genetic disease gene analysis report according to the sample, the at least one target mutation site data, and the corresponding medical interpretation data.
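The division of labor among modules 301 to 304 can be sketched as four small Python functions chained into a pipeline. This is only an illustrative sketch: the `Sample` class, the dictionary-based variant records, and the dictionary standing in for the interpretation database are assumptions, not structures defined by the patent.

```python
from dataclasses import dataclass


@dataclass
class Sample:
    sample_id: str
    variants: list  # each variant is a dict of site data (assumed layout)


def acquire_sample(sample_id, variants):
    """Module 301: wrap the raw site data into a sample."""
    return Sample(sample_id, list(variants))


def screen(sample, condition):
    """Module 302: keep the site data that meets the determination condition."""
    return [v for v in sample.variants if condition(v)]


def interpret(targets, interpretation_db):
    """Module 303: look up medical interpretation data per target gene."""
    return {v["gene"]: interpretation_db.get(v["gene"], {}) for v in targets}


def generate_report(sample, targets, interpretations):
    """Module 304: assemble the report from the three inputs."""
    return {
        "sample_id": sample.sample_id,
        "targets": targets,
        "interpretations": interpretations,
    }


# Usage with placeholder data.
sample = acquire_sample("S1", [
    {"gene": "G1", "pathogenicity": "pathogenic"},
    {"gene": "G2", "pathogenicity": "benign"},
])
targets = screen(sample, lambda v: v["pathogenicity"] == "pathogenic")
report = generate_report(sample, targets, interpret(targets, {"G1": {"disease": "placeholder"}}))
```

Keeping each module a plain function of its inputs makes it straightforward to swap in a different screening condition or interpretation source without touching report generation.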
In some optional embodiments, when there are multiple samples that are related to one another, the device further comprises: a comparison module, configured to compare the at least one target variation site data of the different samples and the corresponding medical interpretation data to obtain the inheritance pattern of the disease and the disease-causing gene, and to record them in the single-gene genetic disease gene analysis report.
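One way such a comparison across related samples might look is sketched below for a proband and two parents. The text does not specify how the comparison is performed, so this is purely an assumption: variants are matched on a hypothetical (chromosome, position, reference allele, alternate allele) key, and each proband variant is labeled by which parent also carries it.

```python
def variant_key(v):
    """Identify a variant by site and alleles (assumed record layout)."""
    return (v["chrom"], v["pos"], v["ref"], v["alt"])


def compare_family(proband_targets, father_targets, mother_targets):
    """Label each proband target variant by the parent(s) sharing it."""
    father_keys = {variant_key(v) for v in father_targets}
    mother_keys = {variant_key(v) for v in mother_targets}
    origins = {}
    for v in proband_targets:
        key = variant_key(v)
        in_father, in_mother = key in father_keys, key in mother_keys
        if in_father and in_mother:
            origins[key] = "both parents"
        elif in_father:
            origins[key] = "paternal"
        elif in_mother:
            origins[key] = "maternal"
        else:
            origins[key] = "not found in either parent"
    return origins
```

The resulting labels could then be combined with the allelic state and the genetic pattern of each gene to describe how the disease is inherited in the family.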
In some optional embodiments, if, in the process of the screening module 302 screening the variation site data according to the determination condition, none of the variation site data meets the determination condition, the report generation module 304 is further configured to generate a negative monogenic genetic disease gene analysis report according to the sample.
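The negative-report branch can be sketched as follows, assuming that a negative report is produced when no site data passes screening. The function name, the report fields, and the "positive"/"negative" labels are hypothetical.

```python
def build_report(sample_id, variants, condition):
    """Return a negative report when no variant passes the screening."""
    targets = [v for v in variants if condition(v)]
    if not targets:
        # No site data met the determination condition: negative report
        # built from the sample information alone.
        return {"sample_id": sample_id, "result": "negative", "targets": []}
    return {"sample_id": sample_id, "result": "positive", "targets": targets}
```

Handling the empty case explicitly means the later interpretation and template steps never have to deal with a report that has no target variants.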
In some optional embodiments, the screening module 302 is further configured to: determining the determination condition from the sample, the determination condition being selected from at least one of genetic pattern, pathogenicity of variation, population frequency, gene combination, type of variation, gene, and sequencing data quality control.
In some optional embodiments, the screening module 302 is further configured to: when there are multiple determination conditions, screen the variation site data with each of the determination conditions separately, and determine the variation site data that satisfies all of the determination conditions at the same time as the target variation site data.
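Screening with each condition separately and keeping only the site data that passes all of them amounts to a set intersection. A minimal sketch, assuming each condition is a Python predicate over a variant record (the record fields used in the usage example are hypothetical):

```python
def screen_all(variants, conditions):
    """Screen with each condition separately, then intersect the results."""
    # One set of passing indices per determination condition.
    passing = [
        {i for i, v in enumerate(variants) if condition(v)}
        for condition in conditions
    ]
    # With no conditions, nothing is filtered out.
    keep = set.intersection(*passing) if passing else set(range(len(variants)))
    return [variants[i] for i in sorted(keep)]


# Usage with two placeholder conditions.
variants = [
    {"af": 0.001, "pathogenicity": "pathogenic"},
    {"af": 0.2, "pathogenicity": "pathogenic"},
    {"af": 0.001, "pathogenicity": "benign"},
]
targets = screen_all(
    variants,
    [lambda v: v["af"] < 0.01, lambda v: v["pathogenicity"] == "pathogenic"],
)
```

Keeping the per-condition sets separate, rather than chaining filters, also makes it possible to report how many variants each individual condition removed.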
In some alternative embodiments, when the sample comprises phenotype information, the determination condition further comprises a phenotype; the screening module 302 is further configured to: query the biomedical database for the in-library variation site data corresponding to the phenotype information, wherein, when the at least one variation site data contains variation site data that matches the in-library variation site data, that variation site data passes the phenotype screening.
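Phenotype screening as described can be sketched as a lookup followed by a membership test. The dictionary standing in for the biomedical database, the phenotype identifiers, and the variant key layout are all assumptions made for illustration.

```python
def phenotype_screen(variants, phenotypes, phenotype_library):
    """Keep variants that match in-library site data for the sample's phenotypes.

    phenotype_library: hypothetical mapping from a phenotype identifier to
    a set of (chrom, pos, ref, alt) keys recorded for that phenotype.
    """
    in_library = set()
    for phenotype in phenotypes:
        in_library |= phenotype_library.get(phenotype, set())
    return [
        v for v in variants
        if (v["chrom"], v["pos"], v["ref"], v["alt"]) in in_library
    ]
```

A production implementation would likely match at the gene or disease level as well as the exact site, but the principle of intersecting sample variants with phenotype-associated entries stays the same.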
In some alternative embodiments, when the determination condition includes a population frequency, the screening module 302 is further configured to: query a biomedical database for the population frequency information of the variation site data, and screen the variation site data according to the query result.
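A population frequency screen might look like the sketch below. The text gives neither a threshold nor a rule for variants missing from the database, so both are assumptions here: the cutoff defaults to a hypothetical 0.01, and variants with no recorded frequency are kept on the assumption that they are rare. The dictionary stands in for the biomedical database query.

```python
def frequency_screen(variants, frequency_db, max_frequency=0.01):
    """Keep variants whose population frequency is at or below a cutoff.

    frequency_db: hypothetical mapping from a variant key to its
    population allele frequency.
    """
    kept = []
    for v in variants:
        frequency = frequency_db.get(v["key"])
        # Assumption: a variant absent from the database is treated as rare.
        if frequency is None or frequency <= max_frequency:
            kept.append(v)
    return kept
```

The appropriate cutoff generally depends on the genetic pattern of the disease in question, which is why it is exposed as a parameter rather than fixed.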
In some alternative embodiments, the determination conditions include variation pathogenicity, population frequency, and sequencing data quality control.
In some alternative embodiments, the interpretation database comprises a disease-introduction data table, a gene-disease-first reference data table, a gene-risk cue data table, a gene-risk cue-second reference data table, a disease-guidance suggestion data table, and a disease-guidance suggestion-third reference data table.
In view of the above-mentioned objects, a third aspect of the embodiments of the present invention provides an embodiment of an apparatus for generating a gene analysis report of a monogenic genetic disease. Fig. 5 is a schematic diagram of a hardware configuration of an embodiment of the apparatus for generating a single-gene genetic disease gene analysis report according to the present invention.
As shown in fig. 5, the apparatus includes:
One or more processors 401 and a memory 402, one processor 401 being exemplified in fig. 5.
The apparatus for the method for generating a single gene genetic disease gene analysis report may further include: an input device 403 and an output device 404.
The processor 401, the memory 402, the input device 403 and the output device 404 may be connected by a bus or other means, and fig. 5 illustrates an example of a connection by a bus.
The memory 402 is a non-volatile computer-readable storage medium, and can be used to store non-volatile software programs, non-volatile computer-executable programs, and modules, such as the program instructions/modules corresponding to the method for generating a single-gene genetic disease gene analysis report in the embodiment of the present application (for example, the sample acquisition module 301, the screening module 302, the interpretation module 303, and the report generation module 304 shown in fig. 4). The processor 401 executes the various functional applications and data processing of the server by running the non-volatile software programs, instructions and modules stored in the memory 402, that is, it implements the method for generating a single-gene genetic disease gene analysis report of the above-described method embodiment.
The memory 402 may include a program storage area and a data storage area, wherein the program storage area may store an operating system and an application program required for at least one function; the data storage area may store data created through use of the apparatus for generating a single-gene genetic disease gene analysis report, and the like. Further, the memory 402 may include high-speed random access memory, and may also include non-volatile memory, such as at least one magnetic disk storage device, flash memory device, or other non-volatile solid state storage device. In some embodiments, the memory 402 may optionally include memory located remotely from the processor 401, which may be connected to the apparatus for generating the analysis report via a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.
The input device 403 may receive input numeric or character information and generate key signal inputs related to user settings and function control of the apparatus for generating the analysis report. The output device 404 may include a display device such as a display screen.
The one or more modules are stored in the memory 402 and, when executed by the one or more processors 401, perform the method of generating a single-gene genetic disease gene analysis report of any of the method embodiments described above. The technical effect of the embodiment of the apparatus for executing the method for generating the monogenic genetic disease gene analysis report is the same as or similar to that of any method embodiment.
Embodiments of the present invention provide a non-transitory computer storage medium storing computer-executable instructions that can execute the method for generating a single-gene genetic disease gene analysis report in any of the above method embodiments. The technical effect of the embodiments of the non-transitory computer storage medium is the same as or similar to that of any of the method embodiments described above.
Those of ordinary skill in the art will understand that: the discussion of any embodiment above is meant to be exemplary only, and is not intended to intimate that the scope of the disclosure, including the claims, is limited to these examples; within the idea of the invention, features in the above embodiments or in different embodiments may also be combined, steps may be implemented in any order, and there are many other variations of the different aspects of the invention as described above, which are not provided in detail for the sake of brevity.
In addition, well known power/ground connections to Integrated Circuit (IC) chips and other components may or may not be shown within the provided figures for simplicity of illustration and discussion, and so as not to obscure the invention. Furthermore, devices may be shown in block diagram form in order to avoid obscuring the invention, and also in view of the fact that specifics with respect to implementation of such block diagram devices are highly dependent upon the platform within which the present invention is to be implemented (i.e., specifics should be well within purview of one skilled in the art). Where specific details (e.g., circuits) are set forth in order to describe example embodiments of the invention, it should be apparent to one skilled in the art that the invention can be practiced without, or with variation of, these specific details. Accordingly, the description is to be regarded as illustrative instead of restrictive.
While the present invention has been described in conjunction with specific embodiments thereof, many alternatives, modifications, and variations of these embodiments will be apparent to those of ordinary skill in the art in light of the foregoing description. For example, other memory architectures (e.g., dynamic RAM (DRAM)) may use the discussed embodiments.
The embodiments of the invention are intended to embrace all such alternatives, modifications and variances that fall within the broad scope of the appended claims. Therefore, any omissions, modifications, substitutions, improvements and the like that may be made without departing from the spirit and principles of the invention are intended to be included within the scope of the invention.

Claims (10)

1. A method for generating a gene analysis report of a monogenic genetic disease, comprising:
Obtaining a sample comprising at least one mutation site data;
Screening the mutation site data according to a determination condition to obtain at least one target mutation site data;
Querying an interpretation database according to the target mutation site data to obtain medical interpretation data corresponding to the target mutation site data;
Generating a gene analysis report of the monogenic genetic disease according to the sample, the at least one target mutation site data and the corresponding medical interpretation data.
2. The generation method according to claim 1, wherein, when there are a plurality of samples that are related to one another, the method further comprises: comparing the at least one target mutation site data of the different samples and the corresponding medical interpretation data to obtain the inheritance pattern of the disease and the disease-causing gene, and recording them in the single-gene genetic disease gene analysis report.
3. The generation method according to claim 1, wherein, in the screening of the mutation site data according to the determination condition, if none of the mutation site data meets the determination condition, a negative monogenic genetic disease gene analysis report is generated according to the sample.
4. The generation method according to claim 1, further comprising: determining the determination condition from the sample, the determination condition being selected from at least one of genetic pattern, pathogenicity of variation, population frequency, gene combination, type of variation, gene, and sequencing data quality control.
5. The generation method according to claim 4, wherein, when there are a plurality of determination conditions, the mutation site data is screened using each of the plurality of determination conditions, and the mutation site data satisfying all of the plurality of determination conditions at the same time is determined as the target mutation site data.
6. The generation method according to claim 4, wherein when the sample includes phenotype information, the determination condition further includes a phenotype;
The screening of the mutation site data according to the determination condition to obtain at least one target mutation site data includes:
Querying the biomedical database for the in-library mutation site data corresponding to the phenotype information, wherein, when the at least one mutation site data contains mutation site data that matches the in-library mutation site data, that mutation site data passes the phenotype screening.
7. The generation method according to claim 4, wherein the determination condition includes a population frequency;
The screening of the mutation site data according to the determination condition to obtain at least one target mutation site data includes:
Querying a biomedical database for the population frequency information of the mutation site data, and screening the mutation site data according to the query result.
8. The generation method of claim 4, wherein the decision conditions include pathogenicity of variation, population frequency, and sequencing data quality control.
9. The generation method of claim 1, wherein the interpretation database comprises a disease-introduction data table, a gene-disease-first reference data table, a gene-risk cue data table, a gene-risk cue-second reference data table, a disease-guidance suggestion data table, and a disease-guidance suggestion-third reference data table.
10. An electronic device, comprising:
At least one processor; and
A memory coupled to the at least one processor; wherein
The memory stores instructions executable by the at least one processor to cause the at least one processor to perform the generation method as claimed in any one of claims 1 to 9.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910688048.9A CN110544537A (en) 2019-07-29 2019-07-29 Generation method of single-gene genetic disease gene analysis report and electronic equipment thereof

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910688048.9A CN110544537A (en) 2019-07-29 2019-07-29 Generation method of single-gene genetic disease gene analysis report and electronic equipment thereof

Publications (1)

Publication Number Publication Date
CN110544537A true CN110544537A (en) 2019-12-06

Family

ID=68709851

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910688048.9A Pending CN110544537A (en) 2019-07-29 2019-07-29 Generation method of single-gene genetic disease gene analysis report and electronic equipment thereof

Country Status (1)

Country Link
CN (1) CN110544537A (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106702018A (en) * 2017-03-21 2017-05-24 为朔医学数据科技(北京)有限公司 Single gene inheritance disease detection method and device
CN109086571A (en) * 2018-08-03 2018-12-25 国家卫生计生委科学技术研究所 A kind of method and system that monogenic disease hereditary variation is intelligently interpreted and reported
CN109754856A (en) * 2018-12-07 2019-05-14 北京荣之联科技股份有限公司 Automatically generate method and device, the electronic equipment of genetic test report
CN109994154A (en) * 2017-12-30 2019-07-09 安诺优达基因科技(北京)有限公司 A kind of screening plant of single-gene recessive genetic disorder candidate disease causing genes

Cited By (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111091867A (en) * 2019-12-18 2020-05-01 中国科学院大学 Gene variation site screening method and system
CN111883223A (en) * 2020-06-11 2020-11-03 国家卫生健康委科学技术研究所 Report interpretation method and system for structural variation in patient sample data
CN111883223B (en) * 2020-06-11 2021-05-25 国家卫生健康委科学技术研究所 Report interpretation method and system for structural variation in patient sample data
CN111798926A (en) * 2020-06-30 2020-10-20 广州金域医学检验中心有限公司 Pathogenic gene locus database and establishment method thereof
CN111798926B (en) * 2020-06-30 2023-09-29 广州金域医学检验中心有限公司 Pathogenic gene locus database and establishment method thereof
CN112908412A (en) * 2021-02-10 2021-06-04 北京贝瑞和康生物技术有限公司 Methods, devices and media for compounding the applicability of heterozygous variant pathogenic evidence
CN114783589A (en) * 2022-04-02 2022-07-22 中国医学科学院阜外医院 Automatic interpretation system for aortic disease genetic mutation (HTAADVar)
CN114783589B (en) * 2022-04-02 2022-10-04 中国医学科学院阜外医院 Automated interpretation system for genetic mutations in aortic disease HTAADVar
CN117373696A (en) * 2023-12-08 2024-01-09 神州医疗科技股份有限公司 Automatic genetic disease interpretation system and method based on literature evidence library
CN117373696B (en) * 2023-12-08 2024-03-01 神州医疗科技股份有限公司 Automatic genetic disease interpretation system and method based on literature evidence library

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
CB02 Change of applicant information
Address after: 1002-1, 10th floor, No.56, Beisihuan West Road, Haidian District, Beijing 100080
Applicant after: Ronglian Technology Group Co.,Ltd.
Address before: 100080, Beijing, Haidian District, No. 56 West Fourth Ring Road, glorious Times Building, 10, 1002-1
Applicant before: UNITED ELECTRONICS Co.,Ltd.
RJ01 Rejection of invention patent application after publication
Application publication date: 20191206