CN114023384A - Method for automatically generating standardized report of full exome sequencing annotation table - Google Patents


Info

Publication number
CN114023384A
Authority
CN
China
Prior art keywords
variation
clinical
information
outputting
output
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202210010414.7A
Other languages
Chinese (zh)
Other versions
CN114023384B (en)
Inventor
刘洪洲
李恪
李冬梅
喻长顺
刘晴晴
王燕霞
赵丽丽
陈建春
贾晓冬
李行
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tianjin Jinyu Medical Laboratory Co ltd
Original Assignee
Tianjin Jinyu Medical Laboratory Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tianjin Jinyu Medical Laboratory Co ltd
Priority to CN202210010414.7A
Publication of CN114023384A
Application granted
Publication of CN114023384B
Legal status: Active
Anticipated expiration

Classifications

    • G: PHYSICS
    • G16: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B: BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B20/00: ICT specially adapted for functional genomics or proteomics, e.g. genotype-phenotype associations
    • G16B20/50: Mutagenesis
    • G: PHYSICS
    • G16: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B: BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B35/00: ICT specially adapted for in silico combinatorial libraries of nucleic acids, proteins or peptides
    • G16B35/20: Screening of libraries
    • G: PHYSICS
    • G16: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B: BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B50/00: ICT programming tools or database systems specially adapted for bioinformatics

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Biotechnology (AREA)
  • Biophysics (AREA)
  • Theoretical Computer Science (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Spectroscopy & Molecular Physics (AREA)
  • Evolutionary Biology (AREA)
  • General Health & Medical Sciences (AREA)
  • Medical Informatics (AREA)
  • Chemical & Material Sciences (AREA)
  • Library & Information Science (AREA)
  • Molecular Biology (AREA)
  • Analytical Chemistry (AREA)
  • Proteomics, Peptides & Aminoacids (AREA)
  • Genetics & Genomics (AREA)
  • Biochemistry (AREA)
  • Bioethics (AREA)
  • Databases & Information Systems (AREA)
  • Measuring Or Testing Involving Enzymes Or Micro-Organisms (AREA)

Abstract

The invention provides a method for automatically generating a standardized report from a whole exome sequencing annotation table. The method comprises: reading the whole exome sequencing annotation table and extracting the variation information and variation site information of a proband and the proband's parents; calling a gene-disease database, performing logic judgment according to a field verification strategy, and, according to the judgment result and a first, second, third, fourth and fifth output condition, outputting the contents of the detection result brief description, gene mutation result, suggestion and explanation, other suggestions, and gene variation explanation parts; if a level 1-4 SNV variation exists, generating the corresponding chart template and filling in the experiment number according to the form file information; if a level 5 SNV variation exists, outputting a level 5 SNV attached table, then judging whether a level 5 CNV variation exists and, if so, appending a level 5 CNV variation table after the level 5 SNV attached table as the final attached table of the report. The invention enables one-key generation of the report after variation screening.

Description

Method for automatically generating standardized report of full exome sequencing annotation table
Technical Field
The invention relates to the technical field of computer data processing, and in particular to a method for automatically generating a standardized report from a whole exome sequencing annotation table.
Background
Currently, the data coming off the sequencer for whole exome sequencing of a single patient sample contains more than 300 million variations. Analysts must screen out the high-risk variations that closely match the patient's clinical symptoms and then write the sequencing result report by hand. Writing the report manually has the following problems: 1. personnel must be trained, and the writing process consumes time and energy; 2. manual writing inevitably introduces omissions and errors; 3. manually written reports are difficult to standardize because different people have different habits.
Disclosure of Invention
In view of the above, the problem to be solved by the present invention is to provide a method for automatically generating a standardized report from a whole exome sequencing annotation table.
To solve the above technical problem, the invention adopts the following technical solution: a method for automatically generating a standardized report from a whole exome sequencing annotation table, comprising the following steps:
reading the whole exome sequencing annotation table, and extracting from it the variation information and variation site information of a proband and the proband's parents;
calling a gene-disease database, performing logic judgment on the variation information and the variation site information according to a field verification strategy, and, according to the judgment result, outputting the corresponding report contents at the corresponding positions of the detection result brief description part, the gene mutation result part, the suggestion and explanation part, the other suggestions part and the gene variation explanation part under a first, second, third, fourth and fifth output condition respectively;
generating a chart template, and filling in the experiment number in the chart template according to the form file information;
reading the column of the whole exome sequencing annotation table in which Clinical_Tier is located, marking each row that has content as a grading information row, traversing the grading information rows and screening out those whose content is 5, and judging whether each such row is an SNV variation; if so, outputting a level 5 SNV attached table after the report content; then judging whether the Fast column or Triof column of the grading information row contains CNV, and if so, appending a level 5 CNV variation table after the level 5 SNV attached table as the final attached table of the report.
In the present invention, preferably, the field verification strategy includes a proband sex verification strategy, a sample category verification strategy and a clinical grading verification strategy. The proband sex verification strategy first calls the sample category verification strategy to determine whether the sample category is a single sample or a family sample. For a single sample, the field information of the whole exome sequencing annotation table is traversed: the proband sex is output as female if female chromosome information appears, and as male if male chromosome information appears. For a family sample, the first field in the table that contains HGVS is examined: the proband is judged to be female if the field ends with female chromosome information, and male if it ends with male chromosome information. The sample category verification strategy examines the field information of the whole exome sequencing annotation table: if a Fast field exists, a single sample is output, and if a Triof field exists, a family sample is output. The clinical grading verification strategy identifies the information in the column in which Clinical_Tier is located and outputs each identified number as the variation level of that number.
In the present invention, preferably, the first output condition calls the clinical grading verification strategy. If no level 1-4 variation exists, the output states that no pathogenic variation related to the clinical indications was detected. Otherwise, it is judged whether a level 1 or level 2 variation exists; if none exists, the first line of the detection result brief description part states that no pathogenic variation clearly explaining the clinical condition was detected. The number of detected variations of each clinical level and the corresponding description are then output in order of level.
In the present invention, preferably, the second output condition determines the variation status of SNV and CNV. If numeric information appears in the identified columns for both SNV and CNV, indicating that SNV and CNV variations are both present, a double variation result is output. If the identified column for one of SNV and CNV contains numeric information and the other is blank, indicating that only an SNV variation or only a CNV variation is present, a single variation result is output. Otherwise, the sample category verification strategy is called: a family sample outputs that whole exome sequencing detected no pathogenic variation, and a single sample outputs that high-throughput sequencing detected no pathogenic variation.
In the present invention, preferably, the third output condition calls the clinical grading verification strategy, which identifies the information in the column in which Clinical_Tier is located and outputs each identified number as the variation level of that number, and judges whether a level 1-4 variation exists. If so, the sample category verification strategy is then called: a single sample outputs that high-throughput sequencing has been performed, and a family sample outputs that whole exome sequencing has been performed; in either case, if the clinical grading verification strategy finds no level 1-2 variation, the output additionally states that no pathogenic variation clearly explaining the clinical condition was detected. Otherwise, a single sample outputs that no pathogenic variation and no copy number variation were detected, and a family sample outputs that whole exome sequencing has been performed and that no pathogenic variation and no copy number variation were detected.
In the present invention, preferably, the third output condition further includes the following, applied when the clinical grading verification strategy (which identifies the information in the column in which Clinical_Tier is located and outputs each identified number as the variation level of that number) determines that a level 1-4 variation exists. If the proband's gene has an SNV variation, the proband gene variation statement is output. If the proband's gene has a CNV variation, it is judged whether the CNV variation contains important known genes: if so, the statement that the CNV contains important known genes is output; otherwise, the statement that it does not contain important known genes is output. The proband sex verification strategy is called: if the proband's sex is output as male, the variation is on the X chromosome, and the father carries the variation, a new mutation is output; otherwise, a blank is output. Finally, the variation degree statement corresponding to each level is output according to the level 1-4 variation results; the output is shown in FIG. 3.
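The two decisions in this part of the third output condition can be sketched as follows. This is only an illustrative sketch: the function names, the string labels and the idea of passing the important known genes as a set are assumptions, not the patent's implementation. The X-linked rule reflects the fact that a male inherits his X chromosome from his mother, so a variation carried by the father cannot have been inherited through it.

```python
from typing import Iterable, Set


def cnv_gene_label(region_genes: Iterable[str], important_genes: Set[str]) -> str:
    """Label a CNV by whether its region contains any important known gene."""
    hits = sorted(set(region_genes) & important_genes)
    if hits:
        return "contains important known genes: " + ", ".join(hits)
    return "does not contain important known genes"


def x_linked_male_label(proband_sex: str, chromosome: str, father_carries: bool) -> str:
    """Return 'new mutation' for a male proband with an X-chromosome variation
    that is also carried by the father; otherwise return a blank."""
    if proband_sex == "male" and chromosome == "X" and father_carries:
        return "new mutation"
    return ""


# GENE_A is a placeholder name used only for this example.
print(cnv_gene_label(["SCN1A", "GENE_A"], {"SCN1A"}))
print(repr(x_linked_male_label("male", "X", True)))
```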
In the present invention, preferably, the fourth output condition calls the clinical grading verification strategy to determine whether the proband has a variation. The clinical grading verification strategy identifies the information in the column in which Clinical_Tier is located and judges whether it is numeric. If it is numeric, it is judged whether samples from the proband's parents were submitted for testing. If they were, it is judged separately whether the father and the mother carry the variation: if the father carries it, a suggestion to examine the father's variant gene is output; if the mother carries it, a suggestion to examine the mother's variant gene is output. If the parents' samples were not submitted, a suggestion to submit parental samples is output. It is judged whether the proband's variation is a new mutation or an autosomal recessive variation, and only the SCN1A gene examination suggestion is output. For a level 4 SNV variation, a consultation suggestion for the gene at the corresponding site is output according to "ACT-73 Actionable variants related genes" in the database. For a level 1-2 CNV variation, a copy number variation examination suggestion is output; for a level 3 CNV variation, a copy number variation tracking suggestion is output; for a level 4 CNV variation, a copy number variation consultation suggestion is output. Otherwise, the proband has no variation, and the statement that no pathogenic variation was detected is output. In all cases the institution recommendation is output.
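The parental and CNV branches of the fourth output condition lend themselves to a small lookup, sketched below. The function names are hypothetical and the returned strings are only labels for the suggestion types named in the text; the actual report wording is not given in this section.

```python
from typing import List, Optional

# Suggestion type per CNV clinical level, as listed in the text.
CNV_SUGGESTION = {
    1: "copy number variation examination suggestion",
    2: "copy number variation examination suggestion",
    3: "copy number variation tracking suggestion",
    4: "copy number variation consultation suggestion",
}


def cnv_suggestion(tier: int) -> Optional[str]:
    """Return the suggestion type for a CNV of the given level, or None."""
    return CNV_SUGGESTION.get(tier)


def parental_suggestions(parents_submitted: bool,
                         father_carries: bool,
                         mother_carries: bool) -> List[str]:
    """Return the parental suggestion types for a proband with a variation."""
    if not parents_submitted:
        return ["parental sample submission suggestion"]
    suggestions = []
    if father_carries:
        suggestions.append("father variant gene examination suggestion")
    if mother_carries:
        suggestions.append("mother variant gene examination suggestion")
    return suggestions


print(cnv_suggestion(3))
print(parental_suggestions(True, False, True))
```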
In the present invention, preferably, the fifth output condition calls the clinical grading verification strategy, which identifies the information in the column in which Clinical_Tier is located and outputs each identified number as the variation level of that number. If a level 1-4 variation exists, then for each variation the HGVS name of the variation is output first, and the form of the variation is explained according to the naming rules; otherwise, the output states that no pathogenic variation was found according to the clinical indications of the subject.
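As an illustration of explaining a variation from its HGVS name, the sketch below handles only three simple coding-DNA forms (substitution, deletion, duplication). The function name and the explanation wording are assumptions; the patent does not list which naming rules it covers, and real HGVS names have many more forms than this sketch recognizes.

```python
import re

# Simple coding-DNA patterns, e.g. c.76A>T, c.76del, c.76_78dup.
_SUBSTITUTION = re.compile(r"^c\.(\d+)([ACGT])>([ACGT])$")
_DEL_OR_DUP = re.compile(r"^c\.(\d+)(?:_(\d+))?(del|dup)$")


def explain_hgvs(name: str) -> str:
    """Output the HGVS name first, then a plain-language form of the variation."""
    match = _SUBSTITUTION.match(name)
    if match:
        pos, ref, alt = match.groups()
        return f"{name}: substitution of {ref} by {alt} at coding position {pos}"
    match = _DEL_OR_DUP.match(name)
    if match:
        start, end, kind = match.groups()
        span = f"positions {start} to {end}" if end else f"position {start}"
        word = "deletion" if kind == "del" else "duplication"
        return f"{name}: {word} of coding {span}"
    return f"{name}: form not covered by this sketch"


print(explain_hgvs("c.76A>T"))
print(explain_hgvs("c.76_78del"))
```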
The advantages and positive effects of the invention are as follows. The method reads a whole exome sequencing annotation table and extracts from it the variation information and variation site information of a proband and the proband's parents. It calls a gene-disease database, performs logic judgment on the variation information and the variation site information according to a field verification strategy, and outputs the corresponding report contents at the corresponding positions of the detection result brief description part, the gene mutation result part, the suggestion and explanation part, the other suggestions part and the gene variation explanation part under a first, second, third, fourth and fifth output condition. The whole exome report is thus generated automatically: the results are consolidated, the manually screened SNV (single nucleotide variation) sites and CNV (copy number variation) sites are presented in the report, and an attached table and a variation table for level 5 variations are generated at the same time. The input file is the screened whole exome sequencing annotation table, and the output is report content in natural language. The report is generated with one key after variation screening, corresponding suggestions and explanations are provided for the detected gene variations, and visualization of the data report is made more convenient.
Drawings
The accompanying drawings, which are included to provide a further understanding of the invention and are incorporated in and constitute a part of this specification, illustrate embodiments of the invention and together with the description serve to explain the principles of the invention and not to limit the invention. In the drawings:
FIG. 1 is a schematic flow diagram of the method for automatically generating a standardized report from a whole exome sequencing annotation table according to the present invention;
FIG. 2 is a partial view of the gene mutation result generated under the second output condition of the method;
FIG. 3 is a schematic diagram of the output of the third output condition of the method;
FIG. 4 is a partial schematic diagram of a level 5 variation attached table of the method;
FIG. 5 is a schematic diagram of the input fields of a family sample of the method;
FIG. 6 is a schematic diagram of a sample input file under the fourth output condition of the method;
FIG. 7 is a schematic diagram of a sample input file under the fifth output condition of the method;
FIG. 8 is a schematic diagram of a whole exome sequencing annotation table of the method;
FIG. 9 is a partial schematic diagram of a level 5 variation attached table of the method.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
It will be understood that when an element is referred to as being "secured to" another element, it can be directly on the other element or intervening elements may also be present. When a component is referred to as being "connected" to another component, it can be directly connected to the other component or intervening components may also be present. When a component is referred to as being "disposed on" another component, it can be directly on the other component or intervening components may also be present. The terms "vertical," "horizontal," "left," "right," and the like as used herein are for illustrative purposes only.
Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. The terminology used in the description of the invention herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the invention. As used herein, the term "and/or" includes any and all combinations of one or more of the associated listed items.
As shown in FIG. 1, the present invention provides a method for automatically generating a standardized report from a whole exome sequencing annotation table, comprising the following steps:
reading the whole exome sequencing annotation table, and extracting from it the variation information and variation site information of a proband and the proband's parents;
calling a gene-disease database, performing logic judgment on the variation information and the variation site information according to a field verification strategy, and, according to the judgment result, outputting the corresponding report contents at the corresponding positions of the detection result brief description part, the gene mutation result part, the suggestion and explanation part, the other suggestions part and the gene variation explanation part under a first, second, third, fourth and fifth output condition respectively;
generating a chart template, and filling in the experiment number in the chart template according to the form file information;
reading the column of the whole exome sequencing annotation table in which Clinical_Tier is located, marking each row that has content as a grading information row, traversing the grading information rows and screening out those whose content is 5, and judging whether each such row is an SNV variation; if so, outputting a level 5 SNV attached table after the report content (see FIG. 4 and FIG. 9); then judging whether the Fast column or Triof column of the grading information row contains CNV, and if so, appending a level 5 CNV variation table after the level 5 SNV attached table as the final attached table of the report. The "Clinical_Tier" (clinical grade) column is originally empty. The analyst combines the phenotype commonly associated with each variation, its population frequency and the scores of various prediction software with the patient's clinical phenotype, selects the significant variations and fills in this column. A typical annotation table contains tens of thousands of variations, of which the analyst selects fewer than 10 significant ones and fills in the column with a number from 1 to 5 representing the clinical grade. An entry in this column that is not empty, that is, one filled with 1-5, therefore means that the analyst has judged the variation to be significant.
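The level 5 screening step can be sketched as follows, with each table row represented as a dictionary keyed by column name. The helper names are hypothetical, and treating every level 5 row without "CNV" in its Fast or Triof cell as an SNV row is an assumption of this sketch; the text does not spell out how an SNV row is recognized. Column names are compared case-insensitively because the text writes both "Triof" and "TrioF".

```python
from typing import Dict, List, Optional, Tuple

Row = Dict[str, str]


def parse_tier(value: Optional[str]) -> Optional[int]:
    """Return the clinical level 1-5, or None for an empty or unrecognized cell."""
    text = (value or "").strip()
    return int(text) if text in {"1", "2", "3", "4", "5"} else None


def is_cnv(row: Row) -> bool:
    """A row counts as CNV when its Fast or Triof cell contains 'CNV'."""
    for name, cell in row.items():
        if name.lower() in {"fast", "triof"} and "CNV" in (cell or ""):
            return True
    return False


def split_tier5(rows: List[Row]) -> Tuple[List[Row], List[Row]]:
    """Return (level 5 SNV rows, level 5 CNV rows) for the attached tables."""
    tier5 = [r for r in rows if parse_tier(r.get("Clinical_Tier")) == 5]
    snv_rows = [r for r in tier5 if not is_cnv(r)]  # assumption: non-CNV means SNV
    cnv_rows = [r for r in tier5 if is_cnv(r)]
    return snv_rows, cnv_rows


rows = [
    {"Clinical_Tier": "5", "TrioF": "inherited"},
    {"Clinical_Tier": "5", "TrioF": "CNV"},
    {"Clinical_Tier": "", "TrioF": ""},
    {"Clinical_Tier": "3", "TrioF": "CNV"},
]
snv_rows, cnv_rows = split_tier5(rows)
print(len(snv_rows), len(cnv_rows))
```

The SNV attached table would be written first and the CNV table appended after it, matching the order given in the text.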
In this embodiment, the field verification strategy includes a proband sex verification strategy, a sample category verification strategy and a clinical grading verification strategy. The proband sex verification strategy first calls the sample category verification strategy to determine whether the sample category is a single sample or a family sample. For a single sample, the field information of the whole exome sequencing annotation table is traversed: the proband sex is output as female if female chromosome information appears, and as male if male chromosome information appears. For a family sample, the first field in the table that contains HGVS is examined: the proband is judged to be female if the field ends with female chromosome information, and male if it ends with male chromosome information. The sample category verification strategy examines the field information of the whole exome sequencing annotation table: if a Fast field exists, a single sample is output, and if a Triof field exists, a family sample is output. The clinical grading verification strategy identifies the information in the column in which Clinical_Tier is located and outputs each identified number as the variation level of that number. Here the female chromosome information is set to ".XX_HGVS" and the male chromosome information is set to ".XY_HGVS".
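The sample category and proband sex checks operate only on the header fields, so they can be sketched without reading any data rows. The function names and the example header names (such as "P1.XY_HGVS") are hypothetical; only the Fast/Triof fields and the ".XX_HGVS"/".XY_HGVS" markers come from the text.

```python
from typing import List, Optional


def sample_category(header: List[str]) -> Optional[str]:
    """Return 'single' if a Fast field exists, 'family' if a Triof field exists."""
    names = {h.strip().lower() for h in header}
    if "fast" in names:
        return "single"
    if "triof" in names:
        return "family"
    return None


def proband_sex(header: List[str]) -> Optional[str]:
    """Infer the proband's sex from the .XX_HGVS / .XY_HGVS field markers."""
    category = sample_category(header)
    if category == "single":
        if any(".XX_HGVS" in h for h in header):
            return "female"
        if any(".XY_HGVS" in h for h in header):
            return "male"
    elif category == "family":
        # Only the first HGVS field is inspected for a family sample.
        first = next((h for h in header if "HGVS" in h), None)
        if first is not None:
            if first.endswith(".XX_HGVS"):
                return "female"
            if first.endswith(".XY_HGVS"):
                return "male"
    return None


single_header = ["Gene", "Fast", "P1.XY_HGVS", "Clinical_Tier"]
family_header = ["Gene", "TrioF", "P1.XX_HGVS", "F1.XY_HGVS", "M1.XX_HGVS"]
print(sample_category(single_header), proband_sex(single_header))
print(sample_category(family_header), proband_sex(family_header))
```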
In this embodiment, the first output condition calls the clinical grading verification strategy, which identifies the information in the column in which Clinical_Tier is located and outputs each identified number as the variation level of that number. If no level 1-4 variation exists, the output states that no pathogenic variation related to the clinical indications was detected. Otherwise, it is judged whether a level 1 or level 2 variation exists; if none exists, the first line of the detection result brief description part states that no pathogenic variation clearly explaining the clinical condition was detected. The number of detected variations of each clinical level and the corresponding description are then output in order of level, where xxx stands for the number of variations:
Level 1 variation output: "Detected xxx variations very likely to explain the clinical condition (or of very strong clinical guiding significance)".
Level 2 variation output: "Detected xxx variations that may help explain the clinical condition (or of potential clinical guiding significance)".
Level 3 variation output: "Detected xxx variations whose clinical significance is uncertain but whose clinical relevance cannot be completely excluded".
Level 4 variation output: "Incidentally found xxx variations of potential clinical intervention guiding significance"; for a family sample in which the variation was not detected in the proband, the output is "Incidentally found xxx variations in family members of potential clinical intervention guiding significance".
For example, if the input file contains two level 3 variations, the output is: "No pathogenic variation clearly explaining the clinical condition was detected. Detected two variations whose clinical significance is uncertain but whose clinical relevance cannot be completely excluded".
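The first output condition amounts to counting variations per clinical level and emitting one line per level that occurs. The sketch below assumes the levels have already been read from the Clinical_Tier column as integers; the function name and the English wording of the sentences are illustrative, and the family-member variant of the level 4 sentence is omitted for brevity.

```python
from collections import Counter
from typing import List

TEMPLATES = {
    1: "Detected {n} variation(s) very likely to explain the clinical condition "
       "(or of very strong clinical guiding significance).",
    2: "Detected {n} variation(s) that may help explain the clinical condition "
       "(or of potential clinical guiding significance).",
    3: "Detected {n} variation(s) whose clinical significance is uncertain but "
       "whose clinical relevance cannot be completely excluded.",
    4: "Incidentally found {n} variation(s) of potential clinical intervention "
       "guiding significance.",
}
NOTHING_RELATED = "No pathogenic variation related to the clinical indications was detected."
NOTHING_CLEAR = "No pathogenic variation clearly explaining the clinical condition was detected."


def result_summary(tiers: List[int]) -> List[str]:
    """Build the lines of the detection result brief description."""
    counts = Counter(t for t in tiers if t in TEMPLATES)  # level 5 is ignored here
    if not counts:
        return [NOTHING_RELATED]
    lines = []
    if counts[1] == 0 and counts[2] == 0:
        lines.append(NOTHING_CLEAR)
    for tier in sorted(counts):
        lines.append(TEMPLATES[tier].format(n=counts[tier]))
    return lines


for line in result_summary([3, 3, 5]):
    print(line)
```

With two level 3 variations the sketch prints the "no clearly explaining variation" line followed by the level 3 line, mirroring the example in the text.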
In this embodiment, the second output condition determines the variation status of SNV and CNV. If both SNV and CNV variations are present, a double variation result is output; if only one of them is present, a single variation result is output; otherwise, the sample category verification strategy is called, and a family sample outputs that whole exome sequencing detected no pathogenic variation while a single sample outputs that high-throughput sequencing detected no pathogenic variation. The outputs are worded as follows:
Double variation result: "The following variations were detected by high-throughput sequencing: Minor variations detected: [SNV table]; Copy number variations detected: [CNV table]".
Single variation result: "The following variations were detected by high-throughput sequencing: [corresponding table]".
Whole exome sequencing detected no pathogenic variation (family sample): "Whole exome sequencing was performed on all samples of the family, and no clear pathogenic variation consistent with the clinical indications was detected."
High-throughput sequencing detected no pathogenic variation (single sample): "High-throughput sequencing did not detect any clear pathogenic variation consistent with the clinical indications."
As shown in FIG. 8, among roughly 15,000 variation rows, 6 rows (rows 628, 2181, 7012, 7472, 15673 and 15676) are marked with a number from 1 to 5 under the header column Clinical_Tier, and two of them (rows 15673 and 15676) contain "CNV" in the TrioF column.
In this embodiment, taking a family sample as an example, the input file fields are shown in FIG. 5, and the result obtained by applying the second output condition to the input of FIG. 5 is shown in FIG. 2.
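The second output condition reduces to a four-way branch, sketched below. The function name is hypothetical, the SNV and CNV tables are represented by bracketed placeholders, and the sentence wording is an English rendering rather than the report's exact text.

```python
HEAD = "The following variations were detected by high-throughput sequencing:"


def mutation_result(has_snv: bool, has_cnv: bool, category: str) -> str:
    """Return the gene mutation result text for the given variation status."""
    if has_snv and has_cnv:  # double variation result
        return (f"{HEAD} Minor variations detected: [SNV table]; "
                "Copy number variations detected: [CNV table]")
    if has_snv or has_cnv:   # single variation result
        return f"{HEAD} [{'SNV' if has_snv else 'CNV'} table]"
    if category == "family":
        return ("Whole exome sequencing was performed on all samples of the family, "
                "and no clear pathogenic variation consistent with the clinical "
                "indications was detected.")
    return ("High-throughput sequencing did not detect any clear pathogenic "
            "variation consistent with the clinical indications.")


print(mutation_result(True, True, "family"))
print(mutation_result(False, False, "single"))
```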
In this embodiment, further, the third output condition calls the clinical grading verification strategy, which identifies the information in the column in which Clinical_Tier is located and outputs each identified number as the variation level of that number, and judges whether a level 1-4 variation exists. If so, the sample category verification strategy is then called: a single sample outputs that high-throughput sequencing has been performed, and a family sample outputs that whole exome sequencing has been performed; in either case, if the clinical grading verification strategy finds no level 1-2 variation, the output additionally states that no pathogenic variation clearly explaining the clinical condition was detected. Otherwise, a single sample outputs that no pathogenic variation and no copy number variation were detected, and a family sample outputs that whole exome sequencing has been performed and that no pathogenic variation and no copy number variation were detected. The outputs are worded as follows:
High-throughput sequencing has been performed: "1. High-throughput sequencing was performed on the submitted sample."
Whole exome sequencing has been performed: "1. Whole exome sequencing was performed on all samples of the family."
No pathogenic variation and no copy number variation detected (single sample): "1. No pathogenic variation clearly able to cause the clinical indications was detected. 2. No pathogenic copy number variation was detected. Note: for detecting copy number variation, this test cannot completely replace gene chips, chromosome microarray analysis or chromosome karyotype analysis."
Whole exome sequencing performed, no pathogenic variation and no copy number variation detected (family sample): "1. No pathogenic variation clearly able to cause the clinical indications was detected. Note: whole exome sequencing was performed on all samples of the family, and no variation of clear clinical guiding significance was found. 2. No pathogenic copy number variation was detected. Note: for detecting copy number variation, this test cannot completely replace gene chips, chromosome microarray analysis or chromosome karyotype analysis."
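The branching of this part of the third output condition can be sketched as below. The function and constant names are hypothetical, the levels are assumed to be already parsed into integers, and the sentences are English renderings of the outputs described in the text.

```python
from typing import List

SINGLE_DONE = "1. High-throughput sequencing was performed on the submitted sample."
FAMILY_DONE = "1. Whole exome sequencing was performed on all samples of the family."
NO_CLEAR = "No pathogenic variation clearly explaining the clinical condition was detected."
NO_VARIATION = ("1. No pathogenic variation clearly able to cause the clinical "
                "indications was detected.")
FAMILY_NOTE = (" Note: whole exome sequencing was performed on all samples of the "
               "family, and no variation of clear clinical guiding significance was found.")
NO_CNV = ("2. No pathogenic copy number variation was detected. Note: for detecting "
          "copy number variation, this test cannot completely replace gene chips, "
          "chromosome microarray analysis or chromosome karyotype analysis.")


def explanation_lines(tiers: List[int], category: str) -> List[str]:
    """Return the opening lines of the suggestion and explanation part."""
    reported = [t for t in tiers if 1 <= t <= 4]
    if reported:
        lines = [FAMILY_DONE if category == "family" else SINGLE_DONE]
        if not any(t <= 2 for t in reported):
            lines.append(NO_CLEAR)
        return lines
    first = NO_VARIATION + (FAMILY_NOTE if category == "family" else "")
    return [first, NO_CNV]


for line in explanation_lines([3], "single"):
    print(line)
```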
If only one variation is found in a gene with autosomal recessive inheritance, the following sentence is added: "Note: carrying one deleterious variation of {gene names, separated by enumeration commas} is usually insufficient to cause the recessive genetic disease related to the gene, but the possibility that the variation modifies the subject's clinical phenotype cannot be excluded; in addition, this result cannot exclude the possibility that other variations exist in undetected regions of the gene, such as deep intronic regions and the promoter region. Comprehensive analysis in combination with the clinical condition is recommended." If the inheritance pattern of the gene is X-linked, the following is additionally output: "Note: this gene is inherited in an X-linked manner. Males with the pathogenic variation usually develop the disease, while females carrying the pathogenic variation usually do not, although individual female carriers show some degree of clinical phenotype, which is usually mild." The gene-disease database is then called, and if the gene is included in the database, the phenotype description of the gene is output. Finally, the following is added: "Note: genetic diseases vary between individuals; not every patient presents the full phenotype, and the degree of expression also varies."
In this embodiment, the third output condition further includes the following. The clinical tier verification strategy reads the information in the Clinical_Tier column and treats each number found there as the tier of the corresponding variant. When it determines that tier 1-4 variants are present: if an SNV is present, the proband gene variant text is output; if a CNV is present, it is determined whether the CNV contains important known genes, and if so, the text for a CNV containing important known genes is output, otherwise the text for a CNV containing no important known genes is output. The proband sex verification strategy is then called: if the proband is male, the variant is on the X chromosome and the father carries the variant, the de novo text is output; otherwise nothing is output. Finally, the significance text of the corresponding tier is output for each tier 1-4 variant. The proband gene variant text is shown as "A {homozygous/heterozygous} {pathogenicity classification} variant was detected in the proband's {gene} gene: {cDNA change}, {inherited from the mother / inherited from the father / not detected in the parental samples and presumed to be de novo}." A CNV containing important known genes is shown as "A {deletion/duplication} of {length} was detected in the proband's {cytoBand} region, containing the functional genes: {gene names, separated by enumeration commas}." A CNV containing no important known genes is shown as "A {deletion/duplication} of {length} was detected in the proband's {cytoBand} region; it contains no important known genes, but pathogenicity cannot be completely excluded, {class[lclass[i]]}". The de novo text is shown as "The variant was detected in the father's sample, but because the patient's X chromosome comes from the mother, it is presumed to be de novo." The significance texts for tiers 1 to 4 are, respectively, "Likely to explain the clinical presentation (or of very strong clinical guiding significance)", "May help explain the clinical presentation (or of potential clinical guiding significance)", "Clinical significance is uncertain, and clinical relevance cannot be completely excluded" and "Of potential significance for guiding clinical intervention".
In this embodiment, the fourth output condition invokes the clinical tier verification strategy to determine whether the proband carries a variant. The strategy reads the information in the Clinical_Tier column: numeric content indicates that the proband carries a variant, and blank content indicates that the proband carries none. If the content is numeric, it is determined whether parental samples were submitted. If they were, it is determined which parent carries the variant: if the father carries it, the paternal variant gene examination recommendation is output; if the mother carries it, the maternal variant gene examination recommendation is output. If no parental sample was submitted, the parental sample submission recommendation is output. It is then determined whether the proband's variant is de novo or autosomal recessive, in which case only the SCN1A gene examination recommendation is output. For a tier 4 SNV, a consultation recommendation for the gene at the corresponding site is output according to "ACT-73 Actionable variants related genes" in the database. For a tier 1-2 CNV, the copy number variation examination recommendation is output; for a tier 3 CNV, the copy number variation tracking recommendation is output; for a tier 4 CNV, the copy number variation consultation recommendation is output. Otherwise the proband has no variant, and the text for no detected pathogenic variant together with the institution recommendation is output. Finally, the institution recommendation is output.
The paternal variant gene examination recommendation is shown as "It is recommended that the proband undergo examinations for xxx gene-related diseases, and that the father also undergo corresponding examinations or be assessed for xxx gene-related diseases; if necessary, site testing for the reported xxx gene variant can be performed on the proband's grandparents, which may help clarify the clinical relevance of the xxx gene variant." The maternal variant gene examination recommendation is shown as "It is recommended that the proband undergo examinations for xxx gene-related diseases, and that the mother also undergo corresponding examinations or be assessed for xxx gene-related diseases; if necessary, site testing for the reported xxx gene variant can be performed on the proband's grandparents, which may help clarify the clinical relevance of the xxx gene variant." The parental sample submission recommendation is shown as "It is recommended that the proband undergo examinations for xxx gene-related diseases; if necessary, site testing for the reported variant site can be performed on the proband's parents to clarify the inheritance of the variant and provide further explanation for the clinic." The SCN1A gene examination recommendation is shown as "It is recommended that the proband undergo examinations for SCN1A gene-related diseases." The consultation recommendation for the gene at the corresponding site is shown as "For i gene-related diseases, consultation with clinical experts in the relevant disease field is recommended." The copy number variation examination recommendation is shown as "Examination is recommended for diseases related to copy number variation in the {cytoBand, separated by enumeration commas} region." The copy number variation tracking recommendation is shown as "Continued tracking of relevant literature reports on copy number variation in the {cytoBand, separated by enumeration commas} region is recommended." The copy number variation consultation recommendation is shown as "For diseases related to copy number variation in the {cytoBand, separated by enumeration commas} region, consultation with clinical experts in the relevant field is recommended." The text for no detected pathogenic variant together with the institution recommendation is shown as "Because of the limitations of genetic testing technology itself, when no pathogenic variant is detected, clinical judgment should rely mainly on the clinical phenotype, other relevant examinations and family history. Genetic counseling at a qualified institution is recommended." The institution recommendation is shown as "Genetic counseling at a qualified institution is recommended."
In this embodiment, the input file fields are shown in fig. 6, and the output obtained by applying the fourth output condition to the input file of fig. 6 is: "1. Whole exome sequencing was performed on all samples of this family.
A heterozygous variant of uncertain significance was detected in the proband's KCNQ4 gene: c.2039C>T, inherited from the mother. Likely to explain the clinical presentation (or of very strong clinical guiding significance).
A heterozygous VUS (leaning pathogenic) variant was detected in the proband's KAT6B gene: c.94A>C, inherited from the father. May help explain the clinical presentation (or of potential clinical guiding significance).
A heterozygous variant of uncertain significance was detected in the proband's PIEZO2 gene: c.790G>A, inherited from the mother. Clinical significance is uncertain, and clinical relevance cannot be completely excluded.
A variant of uncertain significance, c.431C>T, was detected in the father's LDLR gene. Likely to explain the clinical presentation (or of very strong clinical guiding significance).
A 1.4 Mb duplication was detected in the proband's chr20q13.33 region, containing the functional genes: CHRNA4, RTEL1, KCNQ2, SLC17A9, SOX18, PRPF6, DNAJC5 and EEF1A2; the variant was not detected in the parental samples and is presumed to be de novo. Likely to explain the clinical presentation (or of very strong clinical guiding significance).
A 2.3 Mb deletion was detected in the proband's chr22q13.32q13.33 region, containing the functional genes: MLC1, SCO2, SBF1, TYMP, ALG12, SHANK3, ARSA, TUBGCP6 and CHKB; the variant was not detected in the parental samples and is presumed to be de novo. Likely to explain the clinical presentation (or of very strong clinical guiding significance).
Note: carrying one deleterious variant in PIEZO2 is usually insufficient to cause the recessive genetic disease associated with the gene, but a modifying effect of the variant on the clinical phenotype of the subject cannot be excluded; in addition, this result cannot exclude other variants in undetected regions of the gene, such as deep intronic regions and the promoter region, and comprehensive analysis in combination with the clinical situation is recommended.
2. KCNQ4 gene-associated diseases and phenotype spectrum:
Deafness type 2A (MIM600101), autosomal dominant inheritance; associated phenotype spectrum:
Ears: postlingual deafness, high-frequency hearing loss at onset, middle- and low-frequency hearing loss, tinnitus (variable), no vestibular impairment
Other features: onset at 5-15 years of age, progressive disease, severe hearing loss before 50 years of age
3. KAT6B gene-associated diseases and phenotype spectrum:
Genitopatellar syndrome (MIM606170), autosomal dominant inheritance; associated phenotype spectrum:
Head: microcephaly
Face: coarse face, mandibular deformity
Ears: hearing loss
Eyes: downslanting palpebral fissures
Nose: wide nose, large nose, high nasal bridge
Teeth: delayed tooth eruption
Heart: atrial septal defect, ventricular septal defect
Larynx: laryngomalacia
Lungs: dysplasia of the lung
Gastrointestinal tract: anteriorly placed anus, dysphagia
Male external genitalia: dysplastic perineum, hypoplastic scrotum and penis
Female external genitalia: hypertrophy of the clitoris and labia minora
Male internal genitalia: cryptorchidism
Kidneys: polycystic kidney, hydronephrosis
Pelvis: hip contracture, hip flexion deformity, ischial dysplasia, pubic hypoplasia, congenital hip dislocation
Limbs: limb contractures, patellar deformity, knee flexion deformity, patellar dislocation, radioulnar fusion
Hands: brachydactyly, short phalanges
Feet: foot deformity
Skin: fossa on knee
Hair: alopecia (baldness)
Central nervous system: dysgenesis of the corpus callosum, severe psychomotor retardation, decreased muscle tone, enlarged occipital horns, ectopic white matter neurons around the ventricles
Amniotic fluid: polyhydramnios
SBBYSS syndrome (MIM603736), autosomal dominant inheritance; associated phenotype spectrum:
Head: microcephaly, prominent occiput
Ears: low-set ears, ear posterior horn shape
Eyes: narrow palpebral fissures, epicanthus inversus
Nose: flat nasal bridge, bulbous nose
Oral cavity: orofacial cleft, mandibular deformity
Teeth: small tooth and cuspid
Heart: dilated cardiomyopathy, structural heart defect
Male internal genitalia: cryptorchidism
Hands: fifth-finger clinodactyly
Central nervous system: severe mental retardation, delayed motor development milestones, speech disturbance, decreased muscle tone
Endocrine system: hypothyroidism
4. PIEZO2 gene-associated diseases and phenotype spectrum:
Marden-Walker syndrome (MIM248700), autosomal dominant inheritance; associated phenotype spectrum:
Growth and development: prenatal growth deficiency, postnatal growth deficiency
Head: microcephaly, bregma
Face: Woodruff with facial expression, mandibular deformity and senior citizen
Ears: low-set ears
Eyes: strabismus, narrow palpebral fissures, microphthalmia, hypertelorism, epicanthus, ptosis of the upper eyelid
Nose: upturned nasal tip
Oral cavity: cleft palate, high-arched palate, small mouth
Neck: short neck
Heart: dextrocardia
Lungs: dysplasia of the lung
Sternum, clavicles, scapulae and ribs: pectus carinatum or pectus excavatum, clavicle defect
Gastrointestinal tract: Zollinger-Ellison syndrome, pyloric stenosis, duodenal tangle, and yolk vessel patency
Male external genitalia: hypospadias, penile, inguinal hernia
Male internal genitalia: cryptorchidism
Kidneys: microcystic kidney, renal dysplasia
Spine: scoliosis, kyphosis (humpback)
Limbs: congenital joint contractures, radioulnar synostosis
Hands: camptodactyly, slender fingers (toes)
Feet: talipes equinovarus
Muscle and soft tissue: reduced muscle mass
Central nervous system: moderate to severe mental retardation, decreased muscle tone, epilepsy, ventricular dilatation, corpus callosum hypoplasia, cerebellar hypoplasia, Dandy-Walker malformation, absent primitive reflexes, hypoplasia of the inferior cerebellar vermis, brain stem dysplasia
Other features: de novo cases have been reported
Distal arthrogryposis type 3 (MIM114300), autosomal dominant inheritance; associated phenotype spectrum:
Height: short stature
Face: mandibular deformity, facial asymmetry
Eyes: ptosis, epicanthus, ophthalmoplegia (some cases)
Oral cavity: cleft palate, submucous cleft, bifid uvula, and palatine arch
Neck: short neck, mild neck webbing
External features: pectus excavatum, sloping shoulders
Male internal genitalia: cryptorchidism
Spine: lordosis, scoliosis, thoracolumbar curvature and scoliosis
Pelvis: congenital hip dislocation, limited hip abduction
Limbs: knee flexion contracture
Hands: brachydactyly, camptodactyly of the proximal interphalangeal joints, ulnar deviation, cutaneous syndactyly, single transverse palmar crease (some cases), absent interphalangeal creases
Feet: equinovarus, index of refraction, overlapping toes
Skin: single transverse palmar crease (some cases), absent interphalangeal creases
Muscle and soft tissue: reduced muscle mass
Central nervous system: Chiari I malformation (some cases), mild intellectual developmental disorder (some cases), mild psychomotor developmental delay (some cases)
Distal arthrogryposis type 5 (MIM108145), autosomal dominant inheritance; associated phenotype spectrum:
Height: short stature
Face: triangular face, reduced facial expression
Ears: protruding ears
Eyes: ophthalmoplegia, deep-set eyes, epicanthus, abnormal electroretinogram, narrow palpebral fissures, hyperopia (some cases), ptosis, Duane anomaly, retinal pigmentation abnormality, strabismus, astigmatism, macular retinal folds, keratoglobus, keratoconus
Oral cavity: high-arched palate, limited mouth opening (some cases)
Lungs: restrictive lung disease
External features: cocked and forward-tilted shoulders
Sternum, clavicles, scapulae and ribs: pectus excavatum
Spine: spinal stiffness, scoliosis (rare)
Limbs: limited forearm rotation, bilateral absence of the anterior cruciate ligament (some cases)
Hands: congenital finger contractures, long fingers, absent interphalangeal creases, poorly formed palmar creases, limited wrist extension, camptodactyly, inward bending of fingers (toes), hypermobility of the first metacarpophalangeal joint (some cases)
Feet: bilateral clubfoot
Skin: absent interphalangeal creases, poorly formed palmar creases, and large sockets
Muscle and soft tissue: decreased muscle volume and rigidity
Central nervous system: normal intelligence, reduced or absent knee and ankle tendon reflexes (some cases)
Distal arthrogryposis with impaired proprioception and touch (MIM617146), autosomal recessive inheritance; associated phenotype spectrum:
Height: short stature
Head: poor head control
Face: skin disease facial appearance
Nose: long nose, wide nasal bridge
Oral cavity: thin upper lip, high-arched palate
Respiratory system: neonatal respiratory insufficiency
Spine: congenital contracture, scoliosis
Pelvis: congenital hip dysplasia
Hands: finger contractures, hand contractures, slender fingers (toes), camptodactyly, thumb duckbill deformity
Feet: foot deformity, talipes equinovarus, flat foot, and significant deformity of intertoe spaces
Muscle and soft tissue: muscle weakness, lower limbs more affected than upper limbs; muscle atrophy, lower limbs more affected than upper limbs; muscle weakness mainly distal
Central nervous system: bradykinesia, delayed walking, inability to walk, wide-based gait, sensory ataxia, impaired fine motor skills, dysarthria, Romberg sign (difficulty standing with eyes closed)
Peripheral nervous system: reduced or absent vibration sense, reduced or absent light touch, reduced or absent proprioception, absent reflexes, mild sensory axonal neuropathy, reduced sensory nerve action potential amplitude
Other features: onset in the first decade, progressive disease
5. LDLR gene-associated diseases and phenotype spectrum:
LDL cholesterol level (quantitative trait locus 2) (MIM143890), autosomal dominant or recessive inheritance; associated phenotype spectrum:
Eyes: corneal arcus, xanthelasma
Heart: coronary artery disease, appearing after 30 years of age in heterozygotes and in childhood in homozygotes
Skin: tendon xanthomas, appearing after 20 years of age in heterozygotes and after 4 years of age in homozygotes; planar xanthomas in homozygotes
Laboratory examination: hypercholesterolemia, 350-550mg/L in heterozygotes, 650-1000mg/L in homozygotes
6. Duplication of the chr20q13.33 region, associated phenotype spectrum:
No deleted region similar to that of this case has been reported; the clinical relevance is unclear, and the result should be interpreted carefully in combination with the specific clinical situation.
7. Deletion of the chr22q13.32q13.33 region, associated phenotype spectrum:
No deleted region similar to that of this case has been reported; the clinical relevance is unclear, and the result should be interpreted carefully in combination with the specific clinical situation.
8. Note: genetic diseases vary between individuals; not every patient presents the full phenotype, and the degree of expression also varies."
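The SNV lines at the start of the example output combine a fixed sentence template with a closing phrase chosen by tier. A minimal Python sketch of that assembly follows; the function names, the argument order and the English wording of the phrases are illustrative assumptions, not the patent's implementation.

```python
# English renderings of the four tier-dependent closing phrases (assumed wording).
TIER_SIGNIFICANCE = {
    1: "Likely to explain the clinical presentation "
       "(or of very strong clinical guiding significance).",
    2: "May help explain the clinical presentation "
       "(or of potential clinical guiding significance).",
    3: "Clinical significance is uncertain, and clinical relevance "
       "cannot be completely excluded.",
    4: "Of potential significance for guiding clinical intervention.",
}


def parse_tier(cell):
    """Return the tier as an int, or None when the Clinical_Tier cell is blank."""
    text = str(cell).strip()
    return int(text) if text.isdigit() else None


def snv_summary(gene, classification, zygosity, cdna, origin, tier):
    """Fill the proband SNV template and append the tier phrase."""
    sentence = (
        f"A {zygosity} {classification} variant was detected in the "
        f"proband's {gene} gene: {cdna}, {origin}."
    )
    return f"{sentence} {TIER_SIGNIFICANCE[tier]}"
```

For instance, `snv_summary("KCNQ4", "uncertain-significance", "heterozygous", "c.2039C>T", "inherited from the mother", parse_tier("1"))` yields a line analogous to the first variant line of the example.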
In this embodiment, the fifth output condition invokes the clinical tier verification strategy, which reads the information in the Clinical_Tier column and treats each number found there as the tier of the corresponding variant. If tier 1-4 variants are present, then for each variant the HGVS name of the variant is output first, and the form of the variant is explained according to the naming rules; otherwise, the output states that no pathogenic variant consistent with the subject's clinical indication was found. This text is shown as "No pathogenic variant consistent with the subject's clinical indication was found. This means that, within the detectable range of this method, no pathogenic variant consistent with the subject's clinical indication was seen. Please also refer to the description of methodological performance in the 'Methodology description' section, especially when no pathogenic variant is detected in a clinically highly suspected case."
For example: "SAMD11 chr1:876569 NM_152486.4 Exon8 c.752G>A p.(Gly251Asp)
This variant is a missense mutation (expected to replace Gly at position 251 of the encoded protein with Asp)."
The specific logic is as follows:
If the variant occurs in an intron region: if it is at position ±1 or ±2, output "This variant is a splicing mutation, which may cause a gene coding disorder."; otherwise, output "This variant is an intronic variant."
If the variant occurs in a UTR region, output "This variant is a {3'UTR/5'UTR} region variant."
If the variant occurs in an exon region, the following cases are considered:
If the cDNA variant name contains "delins", the variant site, the normal amino acid and the mutated amino acid are extracted by regular expression, and the output is "This variant is an insertion-deletion mutation (expected to delete {xxx} at position {xxx} of the encoded protein and replace it with {xxx})."
If the cDNA variant name does not contain "delins" but contains "del" and "_", the variant site and the deleted amino acids are extracted by regular expression, and the output is "This variant is a deletion mutation (expected to delete the encoded protein from {xxx} at position {xxx} to {xxx} at position {xxx})."
If the cDNA variant name does not contain "delins" or "_" but contains "del", the variant site and the deleted amino acids are extracted by regular expression, and the output is "This variant is a deletion mutation (expected to delete the encoded protein from {xxx} at position {xxx} to {xxx} at position {xxx})."
If the cDNA variant name does not contain "delins" or "_" but contains "ins", the variant site and the inserted amino acids are extracted by regular expression, and the output is "This variant is an insertion mutation (expected to insert {xxx} between {xxx} at position {xxx} and {xxx} at position {xxx} of the encoded protein)."
If the cDNA variant name contains "dup" and "_", the variant site and the repeated amino acids are extracted by regular expression, and the output is "This variant is a repeat mutation (expected to repeat the encoded protein from {xxx} at position {p1} to {xxx} at position {xxx})."
If the cDNA variant name contains only "dup", the variant site and the repeated amino acid are extracted by regular expression, and the output is "This variant is a repeat mutation (expected to repeat {xxx} at position {xxx} of the encoded protein)."
If the cDNA variant name contains "[", the variant site, the repeated amino acid and the number of repeats are extracted by regular expression, and the output is "This variant is a repeat mutation (expected to repeat {xxx} at position {xxx} of the encoded protein {xxx} times)."
If the cDNA variant name contains "fs" and two "+", the output is "This variant is a frameshift mutation (expected to delay translation termination)."
If the cDNA variant name contains "fs", the variant site and the starting amino acid are extracted by regular expression, and the output is "This variant is a frameshift mutation (expected to shift the reading frame of the encoded protein from {xxx} at position {xxx})."; if the cDNA variant name contains one "+", the phrase "and results in premature termination of translation" is added after "shift the reading frame".
If the cDNA variant name does not contain "fs" but contains "*", the variant site and the mutated amino acid are extracted by regular expression, and the output is "This variant is a nonsense mutation (expected to change the amino acid {xxx} at position {xxx} of the encoded protein into a stop codon, so that translation of the protein is expected to terminate prematurely)."
If the cDNA variant name contains "=", the output is "This variant is a synonymous mutation (the amino acid of the encoded protein is not expected to change after the mutation)."
If the cDNA variant name contains "(", the output is "This variant is a stop codon mutation (causing loss of the stop codon, which is expected to extend translation of the protein)."
If the cDNA variant name contains "+", the output is "This variant is a nonsense mutation (expected to change the amino acid {xxx} at position {xxx} of the encoded protein into a stop codon, so that translation of the protein is expected to terminate prematurely)."
If the cDNA variant name matches none of the above rules, the variant site and the mutated amino acid are extracted by regular expression, and the output is "This variant is a missense mutation (expected to replace {xxx} at position {xxx} of the encoded protein with {xxx})."
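The ordered rule list above can be sketched as a small Python classifier. This is a simplified illustration under stated assumptions, not the patent's code: it works on the protein-level HGVS string in three-letter notation, it assumes the stop symbol is written `*`, it omits the stop-loss rule because its trigger token is unclear in the source, and it returns only generic sentences for the deletion, insertion and repeat cases instead of extracting their ranges. The function names are hypothetical.

```python
import re

AA = r"[A-Z][a-z]{2}"  # three-letter amino acid code, e.g. Gly


def describe_intronic(cdna):
    """Splice-site rule: an offset of 1 or 2 from the exon boundary."""
    m = re.search(r"\d([+-])(\d+)", cdna)
    if m and int(m.group(2)) in (1, 2):
        return "This variant is a splicing mutation, which may cause a gene coding disorder."
    return "This variant is an intronic variant."


def describe_protein_change(p_change):
    """Apply the exon rules in order; the first matching rule wins."""
    # Strip the "p." prefix and the optional surrounding parentheses.
    s = re.sub(r"^p\.\(?|\)$", "", p_change.strip())
    if "delins" in s:
        return "This variant is an insertion-deletion mutation."
    if "del" in s:
        return "This variant is a deletion mutation."
    if "ins" in s:
        return "This variant is an insertion mutation."
    if "dup" in s or "[" in s:
        return "This variant is a repeat mutation."
    if "fs" in s:
        stops = s.count("*")
        if stops >= 2:
            return ("This variant is a frameshift mutation "
                    "(expected to delay translation termination).")
        m = re.match(rf"({AA})(\d+)", s)
        where = (f"from {m.group(1)} at position {m.group(2)}"
                 if m else "at an unparsed position")
        tail = (", resulting in premature termination of translation"
                if stops == 1 else "")
        return ("This variant is a frameshift mutation (expected to shift the "
                f"reading frame of the encoded protein {where}{tail}).")
    m = re.fullmatch(rf"({AA})(\d+)\*", s)
    if m:
        return (f"This variant is a nonsense mutation (expected to change "
                f"{m.group(1)} at position {m.group(2)} of the encoded protein "
                "into a stop codon).")
    if "=" in s:
        return ("This variant is a synonymous mutation (the amino acid of the "
                "encoded protein is not expected to change).")
    m = re.fullmatch(rf"({AA})(\d+)({AA})", s)
    if m:
        return (f"This variant is a missense mutation (expected to replace "
                f"{m.group(1)} at position {m.group(2)} of the encoded protein "
                f"with {m.group(3)}).")
    return "This variant could not be classified by this sketch."
```

Checking "delins" before "del" and "ins", and "fs" before the bare stop symbol, mirrors the ordering of the rules: each later rule assumes the earlier tokens are absent.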
If the "HGMD_this-site record status" field is empty, "Not recorded in HGMD;" is output; otherwise, "This variant has been reported in the literature as detected in {disease name from the literature, if not empty} patients ({PMID of the literature});" is output. If the "HGMD_other variant forms at the same site" field lists a variant whose resulting amino acid is the same, the additional output is "The HGMD database records another base change in the same codon that causes the same amino acid change ({cDNA and protein change}), classified as {pathogenicity classification};". If the field lists a variant in the same codon whose resulting amino acid is different, the additional output is "The HGMD database records another base change in the same codon that causes a different amino acid change ({cDNA and protein change}), classified as {pathogenicity classification};". The record of the variant in each database is additionally output, for example "The 1000 Genomes (1000g2015aug_ALL) database frequency is 0.091254; the ESP6500siv2_ALL database frequency is 0.0824; the ExAC_ALL database frequency is 0.048; the gnomAD_genome_ALL database frequency is 0.082; the dbSNP147 database accession number is rs7523549." or, as another example, "The ESP6500siv2_ALL database frequency is 0.1; the gnomAD_genome_ALL database frequency is 0.4; not included in 1000 Genomes (1000g2015aug_ALL) or ExAC_ALL; the dbSNP147 database accession number is rs2311." If ADA_SCORE is greater than 0.4 or RF_SCORE is greater than 0.4, the additional output is "Bioinformatics software predicts that the variant may affect mRNA splicing."; if REVELscore is greater than 0.4, the additional output is "Multiple bioinformatic prediction tools support that the variant tends to be deleterious."; if REVELscore is less than 0.1, the additional output is "Multiple bioinformatic prediction tools support that the variant tends to be benign." Finally, "Taken together, the variant is classified as {ACMG variant classification}." is added.
For a tier 3 CNV, if it contains no known important functional genes, the output is "The region contains no known important functional genes, and whether the xxx in the region is pathogenic has not been confirmed."; if it contains known important functional genes, the output is "The variant range contains the {gene names, separated by enumeration commas} genes, and whether the xxx in the region is pathogenic has not been confirmed."
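The database-frequency, prediction-score and tier 3 CNV rules of the fifth output condition can be sketched in Python as follows. The column spellings (`1000g2015aug_ALL`, `ADA_SCORE`, `REVELscore` and so on) are cleaned-up guesses at the field names in the annotation table, the treatment of blank or non-numeric cells as "not included" is an assumption, and reading the "xxx" placeholder as the CNV kind (deletion or duplication) is likewise an assumption.

```python
# (column name, label used in the report); spellings are assumed.
FREQUENCY_COLUMNS = [
    ("1000g2015aug_ALL", "1000 Genomes (1000g2015aug_ALL)"),
    ("ESP6500siv2_ALL", "ESP6500siv2_ALL"),
    ("ExAC_ALL", "ExAC_ALL"),
    ("gnomAD_genome_ALL", "gnomAD_genome_ALL"),
]


def _number(value):
    """Parse a cell as float; blank or non-numeric cells give None."""
    try:
        return float(value)
    except (TypeError, ValueError):
        return None


def frequency_sentence(row):
    """List recorded frequencies, then the databases without a record."""
    recorded, missing = [], []
    for column, label in FREQUENCY_COLUMNS:
        if _number(row.get(column)) is None:
            missing.append(label)
        else:
            recorded.append(f"{label} database frequency is {row[column]}")
    if missing:
        recorded.append("not included in " + ", ".join(missing))
    return "; ".join(recorded) + "."


def prediction_sentences(row):
    """Apply the 0.4 / 0.1 thresholds stated in the description."""
    out = []
    ada = _number(row.get("ADA_SCORE"))
    rf = _number(row.get("RF_SCORE"))
    revel = _number(row.get("REVELscore"))
    if (ada is not None and ada > 0.4) or (rf is not None and rf > 0.4):
        out.append("Bioinformatics software predicts that the variant "
                   "may affect mRNA splicing.")
    if revel is not None and revel > 0.4:
        out.append("Multiple bioinformatic prediction tools support that "
                   "the variant tends to be deleterious.")
    elif revel is not None and revel < 0.1:
        out.append("Multiple bioinformatic prediction tools support that "
                   "the variant tends to be benign.")
    return out


def cnv_tier3_sentence(kind, genes):
    """kind: 'deletion' or 'duplication' (assumed meaning of 'xxx')."""
    suffix = (f"whether the {kind} in the region is pathogenic "
              "has not been confirmed.")
    if not genes:
        return ("The region contains no known important functional genes, "
                f"and {suffix}")
    return f"The variant range contains the {', '.join(genes)} genes, and {suffix}"
```

A REVEL score between 0.1 and 0.4 produces no sentence, matching the description, which states outputs only above 0.4 and below 0.1.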
In this embodiment, the input file fields are shown in fig. 7, in which the "××" information is a random string of 10 letters or digits used to distinguish and identify the sample; the output obtained by applying the fifth output condition to the field information of fig. 7 is:
KCNQ4 chr1:41304146 NM_004700.4 Exon14 c.2039C>T p.(Ser680Phe)
This variant is a missense mutation (expected to change Ser at position 680 of the encoded protein to Phe). This variant (DM) has been reported in the literature as detected in a sensory nonaural imaging patient, idiopathetic patient; the 1000 Genomes (1000g2015aug_ALL) database frequency is 0; the ESP6500siv2_ALL database frequency is 0; the ExAC_ALL database frequency is 0.00002476; the gnomAD_genome_ALL database frequency is 0.00003232; the dbSNP147 database accession number is rs772135867. Multiple bioinformatic prediction tools support that the variant tends to be deleterious. Taken together, the variant is classified as of uncertain significance.
KAT6B chr10:76602709 NM_012330.4 Exon3 c.94A>C p.(Ile32Leu)
This variant is a missense mutation (expected to change Ile at position 32 of the encoded protein to Leu). Not recorded in HGMD; the 1000 Genomes (1000g2015aug_ALL) database frequency is 0; the ESP6500siv2_ALL database frequency is 0; the ExAC_ALL database frequency is 0; the gnomAD_genome_ALL database frequency is 0; not included in the dbSNP147 database. Multiple bioinformatic prediction tools support that the variant tends to be deleterious. Taken together, the variant is classified as VUS (leaning pathogenic).
PIEZO2 chr18:10855478 NM_022068.4 Exon7 c.790G>A p.(Asp264Asn)
This variant is a missense mutation (expected to change Asp at position 264 of the encoded protein to Asn). Not recorded in HGMD; the 1000 Genomes (1000g2015aug_ALL) database frequency is 0; the ESP6500siv2_ALL database frequency is 0; the ExAC_ALL database frequency is 0; the gnomAD_genome_ALL database frequency is 0; the dbSNP147 database accession number is rs1310236262. Taken together, the variant is classified as of uncertain significance.
LDLR chr19:11216013 NM_000527.5 Exon4 c.431C>T p.(Pro144Leu)
This variant is a missense mutation (expected to change Pro at position 144 of the encoded protein to Leu). Not recorded in HGMD; the 1000 Genomes (1000g2015aug_ALL) database frequency is 0; the ESP6500siv2_ALL database frequency is 0; the ExAC_ALL database frequency is 0; the gnomAD_genome_ALL database frequency is 0; the dbSNP147 database accession number is rs912448894. Taken together, the variant is classified as of uncertain significance.
wes[hg19] chr20q13.33(61537128-62904963)x3, 1.4Mb duplication. The variant range contains the CHRNA4, RTEL1, KCNQ2, SLC17A9, SOX18, PRPF6, DNAJC5 and EEF1A2 genes and has certain pathogenicity.
wes [ hg19] chr22q13.32q13.33 (48885395-.
The method reads a whole exome sequencing annotation table and extracts from it the variant information and variant site information of the proband and the proband's parents. It calls a gene-disease database, performs logical judgment on the variant information and the variant site information according to the field verification strategies, and, according to the first, second, third, fourth and fifth output conditions, outputs the corresponding report content at the corresponding positions of the detection result summary section, the gene variant result section, the recommendation and explanation section, the other recommendations section and the gene variant interpretation section, so that the whole exome report is generated automatically. The manually screened SNV (single nucleotide variant) and CNV (copy number variation) sites are presented in the report, and a report appendix table and a variant map appendix table for tier 5 variants are generated at the same time. The input file is the screened whole exome sequencing annotation table, and the output is report content in natural language. The report content (detection result summary, detected gene variants, recommendations and explanations, other recommendations, interpretation of the detected gene variants, variant map and tier 5 variant appendix table) is thus output from the whole exome sequencing annotation table quickly, automatically and in a standardized manner. This achieves one-click report generation after variant screening, provides corresponding recommendations and explanations for the detected gene variants, and facilitates visualization of the data report.
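The tier 5 appendix step, whose details are given in claim 1 (select rows whose Clinical_Tier content is 5, then separate SNV rows from rows whose Fast or Triof column contains CNV), can be sketched in Python as follows. The column names come from the claims; treating every tier 5 row that is not flagged as CNV as an SNV row, and reading the table as tab-delimited text, are assumptions made for illustration.

```python
import csv
import io


def is_cnv(row):
    """A row counts as CNV when 'CNV' appears in its Fast or Triof column."""
    return "CNV" in (row.get("Fast") or "") or "CNV" in (row.get("Triof") or "")


def tier5_appendix(rows):
    """Return (snv_rows, cnv_rows) for rows whose Clinical_Tier is 5."""
    # Rows with any Clinical_Tier content are the tier information rows.
    tier_rows = [r for r in rows if (r.get("Clinical_Tier") or "").strip()]
    tier5 = [r for r in tier_rows if r["Clinical_Tier"].strip() == "5"]
    snv_rows = [r for r in tier5 if not is_cnv(r)]
    cnv_rows = [r for r in tier5 if is_cnv(r)]
    return snv_rows, cnv_rows


def read_annotation_table(text):
    """Parse a tab-delimited annotation table held in a string."""
    return list(csv.DictReader(io.StringIO(text), delimiter="\t"))
```

The SNV rows would feed the tier 5 SNV appendix table and the CNV rows the tier 5 CNV variant map that follows it.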
The embodiments of the present invention have been described in detail, but the description is only for the preferred embodiments of the present invention and should not be construed as limiting the scope of the present invention. All equivalent changes and modifications made within the scope of the present invention should be covered by the present patent.

Claims (8)

1. A method for automatically generating a standardized report from a whole exome sequencing annotation table, characterized by comprising the following steps:
reading a whole exome sequencing annotation table, and extracting variation information and variation site information of a proband and the proband's parents from the whole exome sequencing annotation table;
calling a gene-disease database, performing logical judgment on the variation information and the variation site information according to a field verification strategy, and, according to the judgment result, outputting corresponding report content at the corresponding positions of a detection result summary section, a gene variation result section, a suggestion and explanation section, an other suggestions section and a gene variation explanation section according to a first output condition, a second output condition, a third output condition, a fourth output condition and a fifth output condition, respectively;
generating a chart template, and filling an experiment number into the chart template according to the information of the table file;
reading the information of the column in which Clinical_Tier of the whole exome sequencing annotation table is located, marking each row that has content as a grading information row, traversing the grading information rows, and then selecting the grading information rows whose content is 5;
and judging whether the selected grading information row is an SNV variation and, if so, outputting a level-5 SNV attached table after the report content; judging whether the Fast column or the Triof column of the grading information row contains CNV and, if so, additionally generating a level-5 CNV variation map after the level-5 SNV attached table, to be output as the final report attached table.
2. The method according to claim 1, wherein the field verification strategy comprises a proband sex verification strategy, a sample category verification strategy and a clinical grading verification strategy; the proband sex verification strategy calls the sample category verification strategy to determine whether the sample category is a single sample or a family sample; when the sample category is a single sample, the field information of the whole exome sequencing annotation table is traversed, and the proband's sex is output as female if female chromosome information appears and as male if male chromosome information appears; when the sample category is a family sample, the first field containing HGVS in the table is traversed, and the proband is judged to be female if the field ends with female chromosome information and male if it ends with male chromosome information; the sample category verification strategy examines the field information of the whole exome sequencing annotation table, outputting a single sample if a Fast field exists and a family sample if a Triof field exists; the clinical grading verification strategy identifies the information in the column in which Clinical_Tier is located, and each identified number is output as the variation level of the corresponding number.
3. The method for automatically generating a standardized report from a whole exome sequencing annotation table according to claim 1, wherein the first output condition is: calling the clinical grading verification strategy, which identifies the information in the column in which Clinical_Tier is located and outputs each identified number as the variation level of the corresponding number; if no level 1-4 variation exists, outputting that no pathogenic variation related to the clinical indication was detected; otherwise, judging whether level 1 and level 2 variations exist and, if so, outputting in the first line of the detection result summary section that no pathogenic variation clearly explaining the clinical condition was detected, and then entering in turn, according to the number of variations at each level, the number of detected variations of each clinical grade and the corresponding description.
4. The method for automatically generating a standardized report from a whole exome sequencing annotation table according to claim 1, wherein the second output condition is judging the SNV and CNV variation: if the columns identifying SNV and CNV both contain numeric information, SNV and CNV variations exist at the same time and a double variation result is output; if one of the columns identifying SNV and CNV contains numeric information and the other is blank, an SNV or a CNV variation exists and a single variation result is output; otherwise, the sample category verification strategy is called, and for a family sample it is output that whole exome sequencing detected no pathogenic variation, while for a single sample it is output that high-throughput sequencing detected no pathogenic variation.
5. The method for automatically generating a standardized report from a whole exome sequencing annotation table according to claim 1, wherein the third output condition is: calling the clinical grading verification strategy, which identifies the information in the column in which Clinical_Tier is located and outputs each identified number as the variation level of the corresponding number; judging whether level 1-4 variation exists and, if so, further calling the sample category verification strategy, outputting high-throughput sequencing for a single sample and, if not, outputting that no pathogenic variation clearly explaining the clinical condition was detected; outputting whole exome sequencing for a family sample, calling the clinical grading verification strategy and, if no level 1-2 variation exists, outputting that no pathogenic variation clearly explaining the clinical condition was detected; otherwise, outputting for a single sample that no pathogenic variation and no copy number variation were detected, and outputting whole exome sequencing for a family sample.
6. The method according to claim 1, wherein the third output condition further comprises: when the clinical grading verification strategy, which identifies the information in the column in which Clinical_Tier is located and outputs each identified number as the variation level of the corresponding number, determines that level 1-4 variation exists, outputting the proband gene number variation if an SNV variation exists in the parents' genes; if a CNV variation exists in the parents' genes, judging whether the CNV variation contains important known genes and, if so, outputting that the CNV variation contains important known genes, otherwise outputting that it contains no important known genes; calling the proband sex verification strategy and, if the proband's sex is output as male, the variation is on the X chromosome and the father carries the variation, outputting a new variation, otherwise outputting a blank; and outputting, according to the level 1-4 variation results, the result for the variation degree of each corresponding level.
7. The method for automatically generating a standardized report from a whole exome sequencing annotation table according to claim 1, wherein the fourth output condition is: calling the clinical grading verification strategy to determine whether the proband has a variation, the clinical grading verification strategy identifying the information in the column in which Clinical_Tier is located and judging whether that information is numeric; if it is numeric, judging whether the parents' samples were submitted for testing; if they were, judging separately whether the father and the mother carry the variation, outputting a paternal variation gene check suggestion if the father carries the variation and a maternal variation gene check suggestion if the mother carries the variation; otherwise, if the parents' samples were not submitted for testing, outputting a parental sample submission suggestion; judging whether the proband's variation is a new variation or an autosomal recessive variation, and outputting only an SCN1A gene check suggestion; if the variation is a level-4 SNV variation, outputting a consultation suggestion for the gene at the corresponding site according to the database; if it is a level 1-2 CNV variation, outputting a copy number variation check suggestion; if it is a level-3 CNV variation, outputting a copy number variation tracking suggestion; if it is a level-4 CNV variation, outputting a copy number variation consultation suggestion; otherwise, the proband has no variation, and it is output that no pathogenic variation was detected together with an institution recommendation; and outputting the institution recommendation.
8. The method for automatically generating a standardized report from a whole exome sequencing annotation table according to claim 1, wherein the fifth output condition is: calling the clinical grading verification strategy, which identifies the information in the column in which Clinical_Tier is located and outputs each identified number as the variation level of the corresponding number; if level 1-4 variation exists, for each variation first outputting the HGVS name of the variation and interpreting the form in which the variation occurs according to the naming rules; otherwise, outputting that no pathogenic variation was found with respect to the examinee's clinical indications.
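The decision logic recited in the claims lends itself to small rule functions. The following Python sketch, offered only as an illustration and not as the claimed implementation, covers the second output condition of claim 4 and the CNV part of the fourth output condition of claim 7. The function names, the argument layout and the returned strings are hypothetical; the strings paraphrase the claim wording and are not the actual report text.

```python
def second_output_condition(snv_cell, cnv_cell, category):
    """Second output condition: decide between a double result, a single
    result, or a 'nothing detected' sentence that depends on sample category.

    snv_cell / cnv_cell: raw text of the cells identifying SNV and CNV.
    category: "single" or "family", as given by the sample category check.
    """
    cells = [(cell or "").strip() for cell in (snv_cell, cnv_cell)]
    numeric = [cell.isdigit() for cell in cells]
    blank = [cell == "" for cell in cells]

    if all(numeric):
        return "double variation result"
    if (numeric[0] and blank[1]) or (numeric[1] and blank[0]):
        return "single variation result"
    if category == "family":
        return "whole exome sequencing detected no pathogenic variation"
    return "high-throughput sequencing detected no pathogenic variation"


def cnv_suggestion(level):
    """CNV branch of the fourth output condition: map a CNV variation level
    to the kind of suggestion to output, or None if no branch applies."""
    if level in (1, 2):
        return "copy number variation check suggestion"
    if level == 3:
        return "copy number variation tracking suggestion"
    if level == 4:
        return "copy number variation consultation suggestion"
    return None


print(second_output_condition("2", "1", "family"))
print(second_output_condition("", "", "single"))
print(cnv_suggestion(3))
```

Keeping each output condition as a pure function of the table cells makes the rules easy to check one by one against the claim text before they are wired into report templates.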
CN202210010414.7A 2022-01-06 2022-01-06 Method for automatically generating standardized report of full exome sequencing annotation table Active CN114023384B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210010414.7A CN114023384B (en) 2022-01-06 2022-01-06 Method for automatically generating standardized report of full exome sequencing annotation table

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210010414.7A CN114023384B (en) 2022-01-06 2022-01-06 Method for automatically generating standardized report of full exome sequencing annotation table

Publications (2)

Publication Number Publication Date
CN114023384A true CN114023384A (en) 2022-02-08
CN114023384B CN114023384B (en) 2022-04-05

Family

ID=80069568

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210010414.7A Active CN114023384B (en) 2022-01-06 2022-01-06 Method for automatically generating standardized report of full exome sequencing annotation table

Country Status (1)

Country Link
CN (1) CN114023384B (en)

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20170009287A1 (en) * 2015-07-08 2017-01-12 Quest Diagnostics Investments Incorporated Detecting genetic copy number variation
CN107391963A (en) * 2017-07-21 2017-11-24 上海桑格信息技术有限公司 Eucaryon based on calculating cloud platform is without ginseng transcript profile interaction analysis system and method
CN107391965A (en) * 2017-08-15 2017-11-24 上海派森诺生物科技股份有限公司 A kind of lung cancer somatic mutation determination method based on high throughput sequencing technologies
CN110770838A (en) * 2017-12-01 2020-02-07 Illumina公司 Method and system for determining clonality of somatic mutations
CN112375815A (en) * 2020-11-11 2021-02-19 上海市儿童医院 Genetic disease high-throughput sequencing pathogenic mutation screening method based on core family
CN112735599A (en) * 2021-01-26 2021-04-30 河南省人民医院 Evaluation method for judging rare hereditary diseases

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114927191A (en) * 2022-04-13 2022-08-19 北京高灵智腾信息科技有限公司 Interpretation method for NGS report of blood system disease
CN116453591A (en) * 2023-05-08 2023-07-18 上海信诺佰世医学检验有限公司 RNA-seq data analysis-based variation rating and report generation system and method
CN117275656A (en) * 2023-11-22 2023-12-22 北斗生命科学(广州)有限公司 Method and system for automatically generating standardized report of clinical test record
CN117275656B (en) * 2023-11-22 2024-04-09 北斗生命科学(广州)有限公司 Method and system for automatically generating standardized report of clinical test record
CN117877578A (en) * 2024-01-16 2024-04-12 广东劢智医疗科技有限公司 Gene variation scoring and sorting method for genetic variation analysis

Also Published As

Publication number Publication date
CN114023384B (en) 2022-04-05

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant