CN111161824A - Automatic report interpretation method and system - Google Patents

Automatic report interpretation method and system

Info

Publication number
CN111161824A
Authority
CN
China
Prior art keywords
data
report
database
evidence
value
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201911328539.9A
Other languages
Chinese (zh)
Inventor
梁萌萌
余伟师
谢欣
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Suzhou Semek Gene Technology Co ltd
Original Assignee
Suzhou Semek Gene Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Suzhou Semek Gene Technology Co ltd filed Critical Suzhou Semek Gene Technology Co ltd
Priority to CN201911328539.9A priority Critical patent/CN111161824A/en
Publication of CN111161824A publication Critical patent/CN111161824A/en
Priority to PCT/CN2020/092902 priority patent/WO2021120528A1/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H15/00 ICT specially adapted for medical reports, e.g. generation or transmission thereof
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B20/00 ICT specially adapted for functional genomics or proteomics, e.g. genotype-phenotype associations
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B30/00 ICT specially adapted for sequence analysis involving nucleotides or amino acids
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H50/00 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
    • G16H50/70 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for mining of medical data, e.g. analysing previous cases of other patients

Landscapes

  • Engineering & Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Medical Informatics (AREA)
  • General Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Physics & Mathematics (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Public Health (AREA)
  • Spectroscopy & Molecular Physics (AREA)
  • Epidemiology (AREA)
  • Biotechnology (AREA)
  • Evolutionary Biology (AREA)
  • Chemical & Material Sciences (AREA)
  • Analytical Chemistry (AREA)
  • Biophysics (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Primary Health Care (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Proteomics, Peptides & Aminoacids (AREA)
  • Pathology (AREA)
  • Databases & Information Systems (AREA)
  • Biomedical Technology (AREA)
  • Genetics & Genomics (AREA)
  • Molecular Biology (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention belongs to the field of bioinformatics detection and relates to an automatic report interpretation method and system. The method comprises the following steps: acquiring the evidence data sources of a bioinformatics analysis; calculating the scores of all evidence data sources, defining the value that represents pathogenicity in the calculation result as value A and the value that represents a benign result as value B; sorting all sites from the bioinformatics analysis in order of their calculated scores; screening pathogenic sites according to an industry gold standard; matching the pathogenic site data, the patient's phenotype data, and the related data in a local relational database with one another and then importing them into a template; and adding a conclusion description to the template to obtain a complete report. The core data obtained by a search is displayed in a multi-dimensional manner together with its peripheral information, the associated data is integrated to the greatest extent, and the bioinformatics analysis report becomes simple and easy to read.

Description

Automatic report interpretation method and system
Technical Field
The invention belongs to the field of bioinformatics detection and relates to an automatic report interpretation method and an automatic report interpretation system.
Background
With the rapid development of sequencing technology and the continuing fall in its cost, more and more patients follow their doctor's advice and undergo molecular diagnostic testing, of which gene sequencing is currently the most popular. However, neither the raw result file obtained by sequencing nor the output file that a bioinformatics engineer produces by analyzing, filtering, and annotating the raw result with various algorithms gives doctors a direct reference. A professional medical interpreter must process the data further to form a clear, readable final report that supports clinical decision-making. While writing the report, the interpreter has to query various public databases to re-screen the thousands of sites output by the bioinformatics analysis and grade the variants at the selected sites according to the industry gold standard, classifying them as pathogenic, likely pathogenic, or of uncertain clinical significance. Finally, the interpreter must complete the report in the document format prescribed by the doctor.
At present, although all the main public databases provide web pages for information retrieval, the databases are poorly linked to one another and form pronounced information silos, so interpreters must switch continually among query pages instead of obtaining a multi-dimensional display of the complete data from a single query. In addition, when interpreters manually screen thousands of sites, there is no automatic ranking mechanism for the pathogenic sites of a specific disease, so this step is time-consuming. Finally, when the report is produced, its standardization and the quality of its layout are also important factors affecting the overall interpretation speed.
Disclosure of Invention
The application provides an automatic report interpretation method and system that display the core data obtained by a search together with its peripheral information in a multi-dimensional manner, integrate the related data to the greatest extent, and make the bioinformatics analysis report simple and easy to read.
In order to achieve the technical purpose, the technical scheme adopted by the application is as follows: an automated report interpretation method, comprising:
acquiring the evidence data sources of a bioinformatics analysis;
calculating the scores of all evidence data sources, defining the value that represents pathogenicity in the calculation result as value A and the value that represents a benign result as value B;
sorting all sites from the bioinformatics analysis in order of their calculated scores;
screening pathogenic sites according to an industry gold standard;
matching the pathogenic site data, the patient's phenotype data, and the related data in the local relational database with one another and then importing them into a template, wherein the related data comprises gene function data, phenotype description data, and grading evidence;
and adding a conclusion description to the template to obtain a complete report.
As an improved technical scheme of the application, A in the value A is a number and B in the value B is a number; the sites from the bioinformatics analysis are sorted numerically according to their calculated scores.
As an improved technical scheme of the application, the method further comprises generating JSON-format report data from the complete report and storing the JSON-format report in a historical report database.
As an improved technical scheme of the application, the local relational database comprises an OMIM database, a CHPO database, an HGMD database, and a historical report database; these databases are associated according to the gene-phenotype relationship, following an ER (entity-relationship) diagram, to form a multi-dimensional data system.
As an improved technical scheme of the application, the method further comprises combining the JSON-format report data generated from the complete report with HTML text to form a PDF report.
As an improved technical scheme of the application, the scores of the evidence data sources are calculated by a weighted average calculation.
As an improved technical scheme of the application, the scores of the evidence data sources are calculated with a logistic regression algorithm.
It is another object of the present application to provide an automated report interpretation system, comprising:
an intelligent analysis module, configured to acquire the evidence data source files of the bioinformatics analysis, perform a weighted average calculation on the data items in the result file, and rank all sites in the calculation result by pathogenicity;
a report writing module, configured to acquire the calculation result of the intelligent analysis module, the patient phenotype data, and the data in the local relational database, and to compose the conclusive descriptive text;
and a generating module, configured to receive the data from the report writing module, combine it with the HTML text from the template editing module, and generate a PDF report.
According to another embodiment of the application, a storage medium is provided in which a computer program is stored, wherein the computer program is arranged to perform, when executed, the steps of any of the above method embodiments.
According to another embodiment of the present application, an electronic device comprises a memory and a processor, wherein the memory stores a computer program, and the processor is configured to execute the computer program to perform the steps of any of the above method embodiments.
Advantageous effects
The advantages of the method are as follows. The weighted average calculation over multiple pathogenicity evidence data sources, performed with a logistic regression algorithm, speeds up the screening of pathogenic sites and makes that screening semi-automatic; combined with continuously accumulated historical data, the accuracy of the ranking keeps improving, which raises the interpreter's confidence in judging a detection result positive and improves efficiency.
By converting HTML to PDF, the typesetting and styling of interpretation reports are managed centrally through HTML style editing. This shortens report editing time and improves the uniformity of report pages; interpreters need to deal only with the content of a report, not its style, which saves about 30% of the time.
The interpretation data written in the report can be stored in the database in structured form, which makes later searching and review convenient.
Through the association and integration of the databases and the gene-phenotype-disease association structure established during data integration, the information silos among the data sources are effectively eliminated, interpreters no longer need to repeat queries to obtain information related to the core query results, and their time is saved.
it should be understood that all combinations of the foregoing concepts and additional concepts described in greater detail below can be considered as part of the subject matter of the present disclosure unless such concepts are mutually inconsistent.
The foregoing and other aspects, embodiments and features of the present teachings can be more fully understood from the following description taken in conjunction with the accompanying drawings. Additional aspects of the present disclosure, such as features and/or advantages of exemplary embodiments, will be apparent from the description which follows, or may be learned by practice of the specific embodiments in accordance with the teachings of the present disclosure.
Drawings
Fig. 1 is a schematic diagram of an overall structure of an automated report interpretation method according to the present application.
Fig. 2 is a graph of ER relationships employed by the local relational database.
Detailed Description
For a better understanding of the technical content of the present application, specific embodiments are described below in conjunction with the appended drawings.
Embodiments of the present disclosure are not necessarily intended to include all aspects of the present application. It should be appreciated that the various concepts and embodiments described above, as well as those described in greater detail below, may be implemented in any of numerous ways, as the concepts and embodiments disclosed herein are not limited to any implementation. In addition, some aspects of the present disclosure may be used alone or in any suitable combination with other aspects of the present disclosure.
In designing the technical scheme of the present application, the drawbacks caused by information silos must be effectively reduced when interpretation reports are prepared: the core data obtained by a search is displayed in a multi-dimensional manner together with its peripheral information, and the related data is integrated to the greatest extent. An easy-to-use automatic ranking model for pathogenic sites that fits the company's interpretation framework must also be established to speed up site screening. When the final PDF report is produced, the problem of uniform report style must be solved and central management strengthened, so that interpreters can concentrate on the report content rather than the layout while writing.
Example 1
The method provided by the embodiment of the application can be executed in a cloud or a local server cluster. The local server cluster may include one or more processors (which may include, but are not limited to, x86 or ARM architecture processing devices) and memory for storing data, and optionally may also include transmission equipment for communication functions and input-output equipment.
The memory may be used to store computer programs, for example, software programs and modules of application software, such as a computer program corresponding to the automatic report interpretation method in the embodiment of the present application, and the processor executes various functional applications and data processing by running the computer programs stored in the memory, that is, implementing the method described above.
The storage can comprise high-speed random access memory, and data redundancy is realized through a RAID1 or RAID5 disk array, so that the safety of data is ensured.
The transmission device is used for receiving or transmitting data via a network. Specific examples of the network may include a wireless network provided by a communication provider of the local server cluster. In one example, the transmission device includes a network interface controller (NIC), which can be connected to other network devices through a base station so as to communicate with the internet.
In this embodiment, an automated report interpretation method operating in the local server cluster or the network architecture is provided, and with reference to fig. 1, the automated report interpretation method includes the following steps:
acquiring the evidence data sources of the bioinformatics analysis, which are shown in Table 1;
Table 1. Evidence data sources in the bioinformatics analysis result file (the table is an image in the original publication; its contents are listed in text form in Example 3)
calculating the scores of all evidence data sources, defining the value that represents pathogenicity in the calculation result as value A and the value that represents a benign result as value B, where A is a number, such as 1.0, and B is a number, such as 0.0;
screening pathogenic sites according to an industry gold standard; the ACMG standards and guidelines for the classification of genetic variants may be used as the industry standard;
matching the pathogenic site data, the patient's phenotype data, and the related data in the local relational database with one another and then importing them into a template; the related data includes gene function data, phenotype description data, grading evidence, and the like, which effectively eliminates the information silos among the data sources;
and adding a conclusion description to the template to obtain a complete report.
The local relational database comprises an OMIM database, a CHPO database, an HGMD database, and a historical report database; these databases are associated according to the gene-phenotype relationship, following an ER (entity-relationship) diagram, to form a multi-dimensional data system. The local relational database is generated in advance and then continuously updated, so the local relational model in this embodiment is a continuously updated model.
JSON-format report data is generated from the complete report and stored in the historical report database, and the JSON-format report data is combined with HTML text to form a PDF report.
Through the above steps, the gene-phenotype-disease association structure created during data integration effectively eliminates the information silos among the data sources; the weighted average calculation over multiple pathogenicity evidence data sources, performed with a logistic regression algorithm, speeds up the screening of pathogenic sites; and HTML style editing handles the typesetting and styling of the interpretation report and reduces report editing time. This addresses three problems: the poor association among databases, which forms pronounced information silos and forces interpreters to switch continually among query pages rather than obtaining a multi-dimensional display of the complete data from a single query; the lack of an automatic ranking mechanism for the pathogenic sites of a specific disease when interpreters manually screen thousands of sites, which makes that step time-consuming; and the standardization and layout quality of the finished report.
Preferably, the scores of the evidence data sources are calculated as a weighted average using a logistic regression algorithm, so as to speed up the screening of pathogenic sites. The sites from the bioinformatics analysis are sorted numerically according to their calculated scores.
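The scoring and ranking step can be sketched as follows. This is a minimal illustration, assuming that "weighted average calculation by logistic regression" means a weighted sum of evidence values passed through the logistic function; the feature names, weights, and bias are hypothetical and are not given in the application.

```python
import math

# Hypothetical feature names and weights; real weights would be fitted from
# historical interpretation data, which the application does not list.
WEIGHTS = {"polyphen2_hvar": 2.0, "sift_damaging": 1.5, "gerp": 0.5, "gnomad_af": -40.0}
BIAS = -2.0


def pathogenicity_score(evidence: dict[str, float]) -> float:
    """Map a weighted sum of evidence values to a score between 0.0 and 1.0."""
    z = BIAS + sum(WEIGHTS[name] * evidence.get(name, 0.0) for name in WEIGHTS)
    return 1.0 / (1.0 + math.exp(-z))


def rank_sites(sites: list[dict]) -> list[dict]:
    """Return the sites sorted from most to least likely pathogenic."""
    scored = [{**s, "score": pathogenicity_score(s["evidence"])} for s in sites]
    return sorted(scored, key=lambda s: s["score"], reverse=True)


# Two made-up sites: the second has strong evidence of pathogenicity.
sites = [
    {"id": "site_1", "evidence": {"polyphen2_hvar": 0.1, "sift_damaging": 0.0, "gerp": 0.5, "gnomad_af": 0.05}},
    {"id": "site_2", "evidence": {"polyphen2_hvar": 0.99, "sift_damaging": 1.0, "gerp": 5.0, "gnomad_af": 0.0}},
]
ranked = rank_sites(sites)
```

Scores close to 1.0 correspond to value A (pathogenic) and scores close to 0.0 to value B (benign), so sorting in descending order places the most suspicious sites first for the interpreter.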
Example 2
In this embodiment, an automatic report interpretation system is further provided. The system is used to implement the foregoing embodiments and preferred embodiments, and what has already been described is not repeated. As used below, the term "module" may be a combination of software and/or hardware that implements a predetermined function. Although the means described in the embodiments below are preferably implemented in software, an implementation in hardware, or in a combination of software and hardware, is also possible and contemplated.
An automated report interpretation system, comprising:
an intelligent analysis module, configured to acquire the evidence data source files of the bioinformatics analysis, perform a weighted average calculation on the data items in the result file, and rank all sites in the calculation result by pathogenicity;
a report writing module, configured to acquire the calculation result of the intelligent analysis module, the patient phenotype data, and the data in the local relational database, and to compose the conclusive descriptive text; in actual operation, the calculation result of the intelligent analysis module, the patient phenotype data, and the data in the local relational database are acquired in one-to-one correspondence;
and a generating module, configured to receive the data from the report writing module, combine it with the HTML text from the template editing module, and generate a PDF report.
Optionally, the intelligent analysis module comprises:
a receiving unit, configured to acquire the evidence data source files of the bioinformatics analysis;
a calculating unit, configured to perform a weighted average calculation on each data item in the result file;
and a sorting unit, configured to rank all sites in the weighted average calculation result by pathogenicity.
Optionally, the report writing module comprises:
a receiving unit, configured to acquire the calculation result of the intelligent analysis module, the patient phenotype data, and the data in the local relational database, and to match them with one another one by one;
and a description unit, configured to compose the conclusive descriptive text for the data.
Optionally, the generating module comprises:
a receiving unit, configured to receive the data from the report writing module;
an integration unit, configured to combine the data obtained by the receiving unit with the HTML text from the template editing module;
and a report generating unit, configured to generate a PDF report from the combined text.
Optionally, a wkhtmltopdf tool is provided in the report generation unit.
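How a report generating unit might call wkhtmltopdf can be sketched as follows. This is only an illustration: it uses the tool's basic command-line form (input HTML file followed by output PDF file), and the function names and file paths are hypothetical.

```python
import shutil
import subprocess


def build_pdf_command(html_path: str, pdf_path: str) -> list[str]:
    """Build the basic wkhtmltopdf invocation: input HTML file, then output PDF file."""
    return ["wkhtmltopdf", html_path, pdf_path]


def render_pdf(html_path: str, pdf_path: str) -> bool:
    """Run wkhtmltopdf if it is installed; return False when the tool is missing."""
    if shutil.which("wkhtmltopdf") is None:
        return False
    # check=True raises CalledProcessError if the conversion fails.
    subprocess.run(build_pdf_command(html_path, pdf_path), check=True)
    return True
```

Keeping the command construction separate from the call makes it possible to check the command without having the tool installed.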
Example 3
An automated report interpretation method comprising the steps of:
After the bioinformatics analysis result file is obtained, it is first imported into the intelligent analysis module. The module performs a weighted average calculation on the scores of the evidence data sources in the file and then ranks the sites by pathogenicity according to the calculation result, where a score of 1.0 represents pathogenic and a score of 0.0 represents benign. The weighted average calculation is performed on the evidence data sources in Table 1 using a logistic regression algorithm, and the sites are then ranked from most to least pathogenic according to the calculation result.
Table 1. Evidence data sources in the bioinformatics analysis result file
Evidence type               Data source
Function prediction         Polyphen2-HVAR
Evolutionary conservation   LRT
Function prediction         SIFT
Evolutionary conservation   phastCons100way
Evolutionary conservation   GERP++
Structural domain           Gene
Population frequency        gnomAD
Structural domain           dbNSFP Interpro
Function prediction         MutationTaster2
Historical rating           Company historical data
Based on this result, the interpreter can make the final screening of pathogenic sites according to the industry gold standard. Each imported file and the interpreter's screening results are also fed into the module's continuously learning model, so that the accuracy of subsequent scoring and ranking keeps improving.
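One way the continuous learning described above could work is an incremental update of the logistic regression weights each time an interpreter confirms or rejects a site. The sketch below is an assumption about the mechanism, not the application's stated procedure; the feature values, learning rate, and function names are hypothetical.

```python
import math


def sigmoid(z: float) -> float:
    """Logistic function mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def update_weights(weights, bias, features, label, learning_rate=0.1):
    """One stochastic gradient step of logistic regression on a single reviewed site.

    label is 1.0 if the interpreter judged the site pathogenic, 0.0 if benign.
    """
    prediction = sigmoid(bias + sum(w * x for w, x in zip(weights, features)))
    error = prediction - label  # gradient of the log loss with respect to z
    new_weights = [w - learning_rate * error * x for w, x in zip(weights, features)]
    new_bias = bias - learning_rate * error
    return new_weights, new_bias


weights, bias = [0.0, 0.0], 0.0
features = [1.0, 0.5]  # hypothetical evidence values for one site
before = sigmoid(bias + sum(w * x for w, x in zip(weights, features)))
for _ in range(50):
    # The interpreter repeatedly labels this kind of site as pathogenic.
    weights, bias = update_weights(weights, bias, features, label=1.0)
after = sigmoid(bias + sum(w * x for w, x in zip(weights, features)))
```

After the updates, the model scores this evidence pattern higher than it did before, which is the sense in which accumulated interpreter decisions can refine the ranking.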
After the screening result is determined, the data is imported into the report writing module, together with the patient's phenotype data and the related data and historical data that were captured from various public databases and integrated in the local relational database, including but not limited to gene function, phenotype description, and grading evidence. The local relational database is generated in advance and continuously updated. The method loads data from the public databases through REST API interfaces and tab-separated (TSV) or comma-separated (CSV) files, and associates the data according to the gene-phenotype relationship to form a multi-dimensional data system. The database is built around the ER diagram shown in Fig. 2, where 1:m denotes a one-to-many relationship and m:1 denotes a many-to-one relationship.
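The gene-phenotype association can be illustrated with a small relational sketch. It uses an in-memory SQLite database as a stand-in for the local relational database; the table names, column names, and TSV contents are hypothetical, since the real OMIM, CHPO, and HGMD exports have their own layouts and Fig. 2 is not reproduced here.

```python
import csv
import io
import sqlite3

# Hypothetical TSV extracts standing in for files downloaded from public databases.
GENES_TSV = "gene_id\tsymbol\tgene_function\nG1\tGENE_A\texample function text\n"
PHENOTYPES_TSV = (
    "phenotype_id\tgene_id\tdescription\n"
    "P1\tG1\tphenotype one\n"
    "P2\tG1\tphenotype two\n"
)

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE gene (gene_id TEXT PRIMARY KEY, symbol TEXT, gene_function TEXT);
    CREATE TABLE phenotype (
        phenotype_id TEXT PRIMARY KEY,
        gene_id TEXT REFERENCES gene(gene_id),  -- many phenotypes to one gene (m:1)
        description TEXT
    );
    """
)


def load_tsv(text: str, table: str, columns: list[str]) -> None:
    """Insert the rows of a tab-separated text into the given table."""
    rows = csv.DictReader(io.StringIO(text), delimiter="\t")
    placeholders = ", ".join("?" for _ in columns)
    conn.executemany(
        f"INSERT INTO {table} ({', '.join(columns)}) VALUES ({placeholders})",
        ([row[c] for c in columns] for row in rows),
    )


load_tsv(GENES_TSV, "gene", ["gene_id", "symbol", "gene_function"])
load_tsv(PHENOTYPES_TSV, "phenotype", ["phenotype_id", "gene_id", "description"])

# A single query returns the gene together with all of its associated phenotypes.
result = conn.execute(
    "SELECT g.symbol, p.description FROM gene g "
    "JOIN phenotype p ON p.gene_id = g.gene_id WHERE g.symbol = ? "
    "ORDER BY p.phenotype_id",
    ("GENE_A",),
).fetchall()
```

Because the sources are linked through the shared gene key, one query replaces separate lookups on each source's own query page.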
Using the automatically acquired data, the interpreter fills conclusive descriptive text into the report writing module, which generates style-free report data in JSON format for the final synthesis of the report and saves it in the historical report database. JSON-format report data is easy to extend and stays compatible with reports of various templates as the report content is continuously refined. Once the JSON-format report is stored in the PostgreSQL relational database, PostgreSQL's ability to process JSON data makes later searching and review convenient. Because the JSON-format report is stored without any style, page content and typesetting are decoupled to the greatest extent, and the report can easily be re-imported when the report template is updated.
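A style-free JSON report and a later search over stored reports might look like the sketch below. The field names are hypothetical because the application does not specify a schema, and a plain Python list stands in for the historical report database.

```python
import json

# Hypothetical report fields; the application does not specify the JSON schema.
report = {
    "report_id": "R-0001",
    "patient_phenotypes": ["phenotype one"],
    "sites": [{"gene": "GENE_A", "classification": "likely pathogenic", "score": 0.98}],
    "conclusion": "Conclusive description written by the interpreter.",
}

# Serialise content only, with no styling information; this string is what gets stored.
stored = json.dumps(report, ensure_ascii=False, sort_keys=True)

history = [stored]  # stand-in for the historical report database


def find_reports_by_gene(records: list[str], gene: str) -> list[dict]:
    """Return the stored reports that mention the given gene in any site."""
    matches = []
    for record in records:
        data = json.loads(record)
        if any(site["gene"] == gene for site in data["sites"]):
            matches.append(data)
    return matches


hits = find_reports_by_gene(history, "GENE_A")
```

In a PostgreSQL deployment the same kind of lookup would be expressed as a query on a JSON column rather than a Python loop.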
The JSON-format report data is then combined with styled HTML text designed in advance in the template editing module. The HTML template is generated in advance and centrally controlled. Its styles are handled with CSS, so a modified template, once published, applies to the many reports edited by different people.
The final PDF report is generated with the open-source wkhtmltopdf tool from the combined JSON content and HTML template. In this process, the user who writes the report does not need to attend to the page style of the report.
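The combination of style-free report data with a centrally managed HTML template can be sketched with Python's standard `string.Template`. The template text, CSS rules, and field names are hypothetical; the application does not name a templating mechanism.

```python
import html
from string import Template

# Hypothetical centrally managed template; only this text carries style.
REPORT_TEMPLATE = Template(
    """<html>
<head><style>h1 { color: #004080; } .conclusion { font-weight: bold; }</style></head>
<body>
<h1>$title</h1>
<p class="conclusion">$conclusion</p>
</body>
</html>"""
)


def render_report(data: dict) -> str:
    """Fill the styled HTML template with escaped values from style-free report data."""
    return REPORT_TEMPLATE.substitute(
        title=html.escape(data["title"]),
        conclusion=html.escape(data["conclusion"]),
    )


page = render_report(
    {"title": "Interpretation report", "conclusion": "No pathogenic site <found>."}
)
```

Changing the CSS in the template restyles every report rendered afterwards, while the stored report data stays untouched.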
Example 4
Embodiments of the present invention also provide an electronic device comprising a memory having a computer program stored therein and a processor arranged to run the computer program to perform the steps of any of the above method embodiments.
Optionally, the electronic apparatus may further include a transmission device and an input/output device, wherein the transmission device is connected to the processor, and the input/output device is connected to the processor.
Optionally, in this embodiment, the processor may be configured to execute the following steps by a computer program:
acquiring the evidence data sources of a bioinformatics analysis;
calculating the scores of all evidence data sources, defining the value that represents pathogenicity in the calculation result as value A and the value that represents a benign result as value B;
sorting all sites from the bioinformatics analysis in order of their calculated scores;
screening pathogenic sites according to an industry gold standard;
matching the pathogenic site data, the patient's phenotype data, and the related data in the local relational database with one another and then importing them into a template, wherein the related data comprises gene function data, phenotype description data, and grading evidence;
and adding a conclusion description to the template to obtain a complete report.
Example 5
Embodiments of the present invention also provide an electronic device comprising a memory having a computer program stored therein and a processor arranged to run the computer program to perform the steps of any of the above method embodiments.
Optionally, the electronic apparatus may further include a transmission device and an input/output device, wherein the transmission device is connected to the processor, and the input/output device is connected to the processor.
Optionally, in this embodiment, the processor may be configured to execute the following steps by a computer program:
acquiring the evidence data sources of a bioinformatics analysis;
calculating the scores of all evidence data sources, defining the value that represents pathogenicity in the calculation result as value A and the value that represents a benign result as value B;
sorting all sites from the bioinformatics analysis in order of their calculated scores;
screening pathogenic sites according to an industry gold standard;
matching the pathogenic site data, the patient's phenotype data, and the related data in the local relational database with one another and then importing them into a template, wherein the related data comprises gene function data, phenotype description data, and grading evidence;
and adding a conclusion description to the template to obtain a complete report.
It will be apparent to those skilled in the art that the modules or steps of the present invention described above may be implemented by a general purpose computing device, they may be centralized on a single computing device or distributed across a network of multiple computing devices, and alternatively, they may be implemented by program code executable by a computing device, such that they may be stored in a storage device and executed by a computing device, and in some cases, the steps shown or described may be performed in an order different than that described herein, or they may be separately fabricated into individual integrated circuit modules, or multiple ones of them may be fabricated into a single integrated circuit module. Thus, the present invention is not limited to any specific combination of hardware and software.
The above description is only a preferred embodiment of the present invention and is not intended to limit the present invention, and various modifications and changes may be made by those skilled in the art. Any modification, equivalent replacement, or improvement made within the principle of the present invention should be included in the protection scope of the present invention.

Claims (10)

1. An automated report interpretation method, comprising:
acquiring the evidence data sources of a bioinformatics analysis;
calculating a score for each evidence data source, defining the value in the calculation result that represents pathogenicity as value A, and defining the value that represents a benign result as value B;
ranking all the sites from the bioinformatics analysis in order of their calculated scores;
screening pathogenic loci according to an industry gold standard;
matching the pathogenic locus data and the phenotype data of the patient with the related data in the local relational database, and then importing the data into a template; wherein the related data comprises gene function data, phenotype description data, and graded evidence;
and adding a conclusion description to the template to obtain a complete report.
2. The automated report interpretation method of claim 1, wherein A in the value A is a number; B in the value B is a number; and the sites from the bioinformatics analysis are sorted numerically according to the calculated scores.
3. The automated report interpretation method of claim 1, further comprising converting the complete report into JSON-formatted report data and storing the JSON-formatted report data in a historical report database.
4. The automated report interpretation method of claim 1, wherein the local relational database comprises an OMIM database, a CHPO database, an HGMD database, and a historical report database; and the OMIM database, the CHPO database, the HGMD database, and the historical report database in the local relational database are associated according to gene-phenotype relations, using an entity-relationship (ER) diagram, to form a multi-dimensional data system.
5. The automated report interpretation method of claim 1, further comprising converting the complete report into JSON-formatted report data and combining that data with HTML text to synthesize a PDF report.
6. The automated report interpretation method of claim 1, wherein the score of each evidence data source is calculated using a weighted average calculation.
7. The automated report interpretation method of claim 1, wherein the score of each evidence data source is calculated using a logistic regression algorithm.
8. An automated report interpretation system, comprising:
an intelligent analysis module, configured to acquire the evidence data source files of a bioinformatics analysis, perform a weighted average calculation on the data in the result file, and rank all the sites in the calculation result according to pathogenicity;
a report writing module, configured to acquire the calculation result of the intelligent analysis module, the patient phenotype data, and the data in the local relational database, and to write the conclusion description text;
and a generating module, configured to receive the data from the report writing module, combine it with the HTML text in a template editing module, and generate a PDF report.
9. A storage medium, in which a computer program is stored, wherein the computer program is arranged to perform the method of any of claims 1 to 7 when executed.
10. An electronic device comprising a memory and a processor, wherein the memory has stored therein a computer program, and wherein the processor is arranged to execute the computer program to perform the method of any of claims 1 to 7.
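The JSON storage and HTML combination recited in claims 3 and 5 can be illustrated with a minimal sketch using only the Python standard library. The table name, report fields, and HTML template below are hypothetical; the claims do not specify them. Conversion of the HTML to PDF is not shown, since it depends on an external tool the claims do not name.

```python
import html
import json
import sqlite3
from string import Template

# Hypothetical HTML template; the layout is an assumption for illustration.
REPORT_TEMPLATE = Template(
    "<html><body><h1>$title</h1><ul>$rows</ul><p>$conclusion</p></body></html>"
)


def store_report(conn, report):
    """Serialize the report to JSON and store it in a history table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS historical_report "
        "(id INTEGER PRIMARY KEY, body TEXT NOT NULL)"
    )
    payload = json.dumps(report, ensure_ascii=False, sort_keys=True)
    cursor = conn.execute(
        "INSERT INTO historical_report (body) VALUES (?)", (payload,)
    )
    conn.commit()
    return cursor.lastrowid


def load_report(conn, report_id):
    """Read a stored report back and parse its JSON body."""
    row = conn.execute(
        "SELECT body FROM historical_report WHERE id = ?", (report_id,)
    ).fetchone()
    return json.loads(row[0])


def render_html(report):
    """Fill the template with escaped report fields."""
    rows = "".join(
        f"<li>{html.escape(s['site'])}: {html.escape(s['gene_function'])}</li>"
        for s in report["sites"]
    )
    return REPORT_TEMPLATE.substitute(
        title=html.escape(report["title"]),
        rows=rows,
        conclusion=html.escape(report["conclusion"]),
    )


# Placeholder report content, invented for the example.
conn = sqlite3.connect(":memory:")
report = {
    "title": "Example report",
    "sites": [{"site": "site_1", "gene_function": "placeholder description"}],
    "conclusion": "Placeholder conclusion text.",
}
report_id = store_report(conn, report)
page = render_html(load_report(conn, report_id))
```

Storing the JSON text rather than the rendered page means a historical report can be re-rendered later with a revised template.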
CN201911328539.9A 2019-12-20 2019-12-20 Automatic report interpretation method and system Pending CN111161824A (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN201911328539.9A CN111161824A (en) 2019-12-20 2019-12-20 Automatic report interpretation method and system
PCT/CN2020/092902 WO2021120528A1 (en) 2019-12-20 2020-05-28 Automatic report interpretation method and system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201911328539.9A CN111161824A (en) 2019-12-20 2019-12-20 Automatic report interpretation method and system

Publications (1)

Publication Number Publication Date
CN111161824A (en) 2020-05-15

Family

ID=70557611

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911328539.9A Pending CN111161824A (en) 2019-12-20 2019-12-20 Automatic report interpretation method and system

Country Status (2)

Country Link
CN (1) CN111161824A (en)
WO (1) WO2021120528A1 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2021120528A1 (en) * 2019-12-20 2021-06-24 苏州赛美科基因科技有限公司 Automatic report interpretation method and system

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101183371A (en) * 2007-12-12 2008-05-21 中兴通讯股份有限公司 Method for quick finishing large data-handling and reporting system
CN108039193A (en) * 2017-11-17 2018-05-15 哈尔滨工大服务机器人有限公司 A kind of method and device for automatically generating physical examination report
CN109686439A (en) * 2018-12-04 2019-04-26 东莞博奥木华基因科技有限公司 Data analysing method, system and the storage medium of hereditary disease genetic test
CN109739869A (en) * 2018-12-29 2019-05-10 北京航天数据股份有限公司 Model running report-generating method and system
CN109754856A (en) * 2018-12-07 2019-05-14 北京荣之联科技股份有限公司 Automatically generate method and device, the electronic equipment of genetic test report
CN109817299A (en) * 2019-02-14 2019-05-28 北京安智因生物技术有限公司 A kind of relevant genetic test report automatic generating method of disease and system
CN109859831A (en) * 2018-12-19 2019-06-07 海南一龄医疗产业发展有限公司 A kind of medical information management system
CN110428127A (en) * 2019-06-19 2019-11-08 深圳壹账通智能科技有限公司 Automated analysis method, user equipment, storage medium and device

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105512508B (en) * 2014-09-22 2018-05-15 深圳华大基因研究院 Automatically generate the method and device of genetic test report
CN109086571B (en) * 2018-08-03 2019-08-23 国家卫生健康委科学技术研究所 A kind of method and system that monogenic disease hereditary variation is intelligently interpreted and reported
CN110544508B (en) * 2019-07-29 2023-03-10 荣联科技集团股份有限公司 Method and device for analyzing monogenic genetic disease genes and electronic equipment
CN111161824A (en) * 2019-12-20 2020-05-15 苏州赛美科基因科技有限公司 Automatic report interpretation method and system

Also Published As

Publication number Publication date
WO2021120528A1 (en) 2021-06-24

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20200515