CN115482926B - Knowledge-driven rare disease visual question-answer type auxiliary differential diagnosis system and method - Google Patents
- Publication number
- CN115482926B (CN202211146285.0A)
- Authority
- CN
- China
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16H—HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
- G16H50/00—ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
- G16H50/20—ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for computer-aided diagnosis, e.g. based on medical expert systems
- G16H50/70—ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for mining of medical data, e.g. analysing previous cases of other patients
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02A—TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
- Y02A90/00—Technologies having an indirect contribution to adaptation to climate change
- Y02A90/10—Information and communication technologies [ICT] supporting adaptation to climate change, e.g. for weather forecasting or climate simulation
Abstract
The invention discloses a visual question-answering auxiliary differential diagnosis system and method for rare diseases. The system comprises: a front-end UI module for collecting and recording the patient's basic information and phenotype information so as to obtain the patient's initial self-report; a dialogue diagnosis service module that provides a phenotype inquiry recommendation service, a disease diagnosis recommendation service and a knowledge base query service through an API, wherein the phenotype inquiry recommendation service recommends a suitable phenotype according to the current dialogue state so as to achieve a higher dialogue benefit, the disease diagnosis recommendation service recommends a suitable disease according to the patient's initial self-report and the new phenotype information generated during the dialogue, and the knowledge base query service queries and retrieves the knowledge stored in the knowledge storage module as needed; and a knowledge storage module for storing the human phenotype ontology knowledge base and the disease-phenotype relation knowledge base. The system can collect additional symptoms beyond the patient's self-report and display auxiliary information for differential diagnosis using visualization techniques.
Description
Technical Field
The invention relates to the technical field of intelligent auxiliary diagnosis, in particular to a knowledge-driven rare disease visual question-answering type auxiliary differential diagnosis system and method.
Background
Rare diseases, also known as orphan diseases, are diseases with a very low incidence. The World Health Organization (WHO) defines rare diseases as diseases with a prevalence of 6.5 to 10 per 10,000 people. The Chinese Rare Disease Definition Research Report 2021 classifies as rare those diseases with a neonatal incidence below 1 in 10,000, a prevalence below 1 in 10,000, and fewer than 140,000 patients. Nearly 7,000 rare diseases are internationally recognized, accounting for about 10% of human diseases.
Rare diseases are hard to manage in three respects: diagnosis is difficult, treatment is costly, and drug research and development is difficult. Rare diseases are highly varied and often have complex phenotypic and molecular characteristics, which makes them harder to diagnose than other types of disease. According to a survey by the European rare disease organization (EURORDIS), a quarter of patients with rare genetic diseases wait 5 to 30 years for a diagnosis, and the misdiagnosis rate at initial diagnosis is as high as 40%. Such misdiagnoses waste medical resources and delay treatment, and can even directly endanger the patient's life. Rapid and effective diagnosis is therefore an important issue in the field of rare diseases.
Many rare disease knowledge bases are currently being built. Orphanet was established in France in 1997 with the aim of collecting rare disease knowledge and improving the diagnosis, care and treatment of patients with rare diseases. To date, Orphanet has become an important reference source for rare disease diagnosis. The Human Phenotype Ontology (HPO) was proposed by Professor Robinson et al. in 2008 as a standardized vocabulary of the phenotypic abnormalities found in human disease, and is widely used in phenotypic and genotypic analysis of rare diseases. Clinicians can use a rare disease knowledge base to screen for a patient's potential diseases based on phenotypic characteristics; however, this approach has considerable limitations in clinical practice: first, the phenotypes collected clinically are often inaccurate or even noisy; second, patient phenotype information is often incomplete because phenotypes are missed; third, the more the phenotypes of rare diseases overlap, the harder the diagnosis becomes. These problems adversely affect the diagnostic outcome in many rare disease diagnosis scenarios.
Conversational chatbots and task-oriented dialogue systems are attracting increasing attention. In the medical field in particular, intelligent dialogue systems aimed at disease diagnosis can actively collect additional symptom information from patients to achieve better diagnostic results. Most existing medical diagnostic dialogue systems rely on data-driven machine learning methods. In the rare disease field, however, truly data-driven learning is difficult because the diseases are numerous and case resources are scarce. In this context, integrating knowledge bases to build knowledge-driven medical diagnostic dialogue systems is a potentially effective approach.
Disclosure of Invention
The invention provides a knowledge-driven rare disease visual question-answering auxiliary differential diagnosis system which can collect, through dialogue, additional symptoms beyond the self-report of the patient or doctor and display auxiliary information for differential diagnosis using visualization techniques, thereby alleviating the difficulty of diagnosing rare diseases.
The technical scheme of the invention is as follows:
a rare disease visual question-answering type auxiliary differential diagnosis system, comprising:
the front-end UI module is used for man-machine interaction: collecting and recording the patient's basic information and phenotype information to obtain the patient's initial self-report, and displaying relevant information in multi-dimensional visual views for differential diagnosis by clinical staff;
the dialogue diagnosis service module is used for providing a phenotype inquiry recommendation service, a disease diagnosis recommendation service and a knowledge base query service through an API (Application Programming Interface); the phenotype inquiry recommendation service recommends a suitable phenotype according to the current dialogue state so as to achieve a higher dialogue benefit, and for each phenotype the system asks about, the user can answer 'present', 'absent' or 'uncertain'; the disease diagnosis recommendation service recommends suitable diseases according to the patient's initial self-report collected by the front-end UI module and the new phenotype information generated during the dialogue; the knowledge base query service queries and retrieves the knowledge stored in the knowledge storage module as required by the phenotype inquiry recommendation service and the disease diagnosis recommendation service;
the knowledge storage module is used for storing the human phenotype ontology knowledge base and the disease-phenotype relation knowledge base.
The human phenotype ontology knowledge base is the Human Phenotype Ontology (HPO), a standardized, structured ontology of abnormal human phenotypes. HPO is organized as a hierarchical directed acyclic graph in which terms are connected by "is_a" relations, and child terms are defined more precisely than their parents. As of February 2022, HPO contained 16,226 current phenotypic abnormality terms (HP:0000118, Phenotypic abnormality, and its descendants), each with a unique HPO code.
The disease-phenotype relation knowledge base is Orphanet, the authoritative rare disease knowledge base, which includes disease prevalence data, average ages of onset and death, disease phenotype frequency annotations and pathogenic gene information.
The knowledge storage module provides knowledge driving for the dialogue diagnosis service module.
The front end UI module includes:
the patient information acquisition sub-module comprises a patient basic information view and a phenotype information input view;
the visual question-answer differential diagnosis sub-module is used for displaying relevant information in multi-dimensional visual views for differential diagnosis by clinical staff, the views comprising a disease evaluation view, a dialogue state view, a patient phenotype state view, a dialogue history view and a question-answer view.
Through the patient basic information view and the phenotype information input view of the patient information acquisition sub-module, a user can enter information such as the patient's age, sex at birth, positive phenotypes and negative phenotypes.
The question-answer differential diagnosis submodule comprises:
a disease evaluation view for displaying disease prevalence, posterior probability, and phenotype matching;
a dialogue state view, which displays in the form of a radar chart the posterior probability of the currently leading disease (labeled P), its phenotype matching (labeled M) and the uncertainty measures of the disease recommendation result (information entropy, labeled E, and Gini index, labeled G);
a patient phenotype status view, which is a sunburst chart reflecting the patient's phenotype features (positive, negative, uncertain), based on the human phenotype ontology and drawn using the annotation propagation rules;
a dialogue history view and a question and answer view for exposing phenotypic information of system queries and providing feedback buttons (present, absent, uncertain) to the user.
The question-answer differential diagnosis sub-module displays relevant information in a multi-dimensional visual view for differential diagnosis by clinical staff.
The dialogue diagnosis service module provides phenotype inquiry recommendation service, disease diagnosis recommendation service and knowledge base inquiry service through API (Application Programming Interface).
The disease recommendation service comprises disease prior probability estimation and disease posterior probability calculation, and the dialogue diagnosis service module recommends a disease with the maximum posterior probability at the end of dialogue;
the method for estimating the prior probability of the disease comprises the following steps:
(a-1) uniformly converting disease prevalence types in an Orphanet knowledge base into point prevalence;
(a-2) calculating the prior probability of the disease from its point prevalence within the rare disease range by formula (1):
$$Prob_{pretest}(D) = \frac{PointPrev_D}{\sum PointPrev} \quad (1)$$
wherein $Prob_{pretest}(D)$ represents the prior (pretest) probability of disease D, $PointPrev_D$ represents the point prevalence of disease D, and $\sum PointPrev$ represents the sum of the point prevalences of all rare diseases;
the method for calculating the posterior probability of the disease comprises the following steps:
(b-1) the conversion relationship between odds (Odds) and probability (Prob) is:
$$Odds = \frac{Prob}{1 - Prob} \quad (2)$$
the disease prior probability is converted into the disease prior odds according to formula (2);
(b-2) calculating the composite likelihood ratio of the phenotype information according to the disease phenotype frequency annotations and the annotation propagation rules, in accordance with formula (3):
$$CompositeLR(P, D) = \prod_{i=1}^{n} \frac{Pr(p_i \mid D)}{Pr(p_i \mid \bar{D})} \quad (3)$$
wherein P represents the collected phenotype set, $(p_1, p_2, ..., p_n) \in P$, $\bar{D}$ represents all rare diseases except rare disease D, and $Pr(p_i \mid D)$ represents the frequency of occurrence of phenotype $p_i$ in disease D; the composite likelihood ratio of phenotype set P is calculated as the product of the likelihood ratios of the individual phenotypes $p_i$;
(b-3) calculating the posterior odds of the disease $Odds_{posterior}(P, D)$ according to formula (4):
$$Odds_{posterior}(P, D) = Odds_{pretest}(D) \cdot CompositeLR(P, D) \quad (4)$$
wherein $Odds_{pretest}(D)$ represents the prior odds of rare disease D;
(b-4) calculating the posterior probability of the disease $Prob_{posterior}(D)$ according to formula (5):
$$Prob_{posterior}(D) = \frac{Odds_{posterior}(P, D)}{1 + Odds_{posterior}(P, D)} \quad (5)$$
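Steps (a-2) and (b-1) to (b-4) can be illustrated with a minimal Python sketch. It is not the patented implementation: the function names, the toy diseases and frequencies, the `eps` floor for zero frequencies, and in particular the way the frequency of a phenotype in "all other rare diseases" is estimated (a prior-weighted mean) are assumptions made for illustration, since the text does not specify them. The sketch also assumes at least two diseases.

```python
def prior_probabilities(point_prev):
    """Formula (1): point prevalence divided by the sum over all rare diseases."""
    total = sum(point_prev.values())
    return {d: v / total for d, v in point_prev.items()}


def prob_to_odds(prob):
    """Formula (2): odds = prob / (1 - prob)."""
    return prob / (1.0 - prob)


def odds_to_prob(odds):
    """Formula (5): prob = odds / (1 + odds)."""
    return odds / (1.0 + odds)


def composite_lr(disease, phenotypes, freq, priors, eps=1e-6):
    """Formula (3): product of per-phenotype likelihood ratios.

    ASSUMPTION: Pr(p | all other diseases) is taken as the prior-weighted
    mean frequency of p over the other diseases; eps avoids division by zero.
    """
    others = [d for d in priors if d != disease]
    other_weight = sum(priors[d] for d in others)
    lr = 1.0
    for p in phenotypes:
        pr_d = max(freq[disease].get(p, 0.0), eps)
        pr_not_d = sum(priors[d] * freq[d].get(p, 0.0) for d in others) / other_weight
        lr *= pr_d / max(pr_not_d, eps)
    return lr


def posterior_probability(disease, phenotypes, freq, priors):
    """Formulas (4) and (5): prior odds times composite LR, back to a probability."""
    odds = prob_to_odds(priors[disease]) * composite_lr(disease, phenotypes, freq, priors)
    return odds_to_prob(odds)


# Toy data (hypothetical): two diseases, one observed positive phenotype.
point_prev = {"disease_A": 1e-5, "disease_B": 3e-5}
freq = {"disease_A": {"p1": 0.8}, "disease_B": {"p1": 0.2}}
priors = prior_probabilities(point_prev)
post_a = posterior_probability("disease_A", ["p1"], freq, priors)
post_b = posterior_probability("disease_B", ["p1"], freq, priors)
print(round(post_a, 3), round(post_b, 3))  # 0.571 0.429
```

In this toy case the rarer disease_A overtakes disease_B because the observed phenotype is four times more frequent in it, which is the effect the likelihood ratio is meant to capture.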
In step (a-1), the method for uniformly converting the prevalence types of a disease into point prevalence comprises the following:
(1) if the prevalence type of the disease is point prevalence, an entry whose geographic area is "worldwide" is preferred, and the numerical point prevalence of that entry, or the median value of its prevalence class, is used; if there is no "worldwide" entry but there are prevalence entries for individual countries or regions, the average point prevalence of those entries is used as the point prevalence of the disease;
(2) if the prevalence type of the disease is annual incidence, the average age of onset and age of death of the disease are obtained from Orphanet, the period from onset to death is taken as the duration of the disease, and the point prevalence is estimated as the annual incidence multiplied by the duration of the disease;
(3) if the prevalence type of the disease is the number of cases, the number of cases divided by the total world population is used directly as the point prevalence of the disease;
(4) if the prevalence type of the disease is prevalence at birth and the age of onset is in the neonatal period, the point prevalence is taken as the prevalence at birth multiplied by the share of the world population that falls within the age range over which the disease persists;
(5) if the prevalence type of the disease is lifetime prevalence, a default value of one per million is used as the point prevalence of the disease.
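The five conversion rules can be sketched as a small dispatcher. This is an illustration only: the record layout (`area`, `value`), the type labels, the world population figure and the one-per-million default are assumptions, and class-based prevalence values are assumed to have been reduced to a single number before the function is called.

```python
WORLD_POPULATION = 8.0e9        # ASSUMPTION: round figure, not given in the text
DEFAULT_LIFETIME_PREV = 1e-6    # ASSUMPTION: reading of the default in rule (5)


def from_point_entries(entries):
    """Rule (1): prefer a worldwide entry, otherwise average the regional ones."""
    worldwide = [e["value"] for e in entries if e["area"] == "Worldwide"]
    if worldwide:
        return worldwide[0]
    return sum(e["value"] for e in entries) / len(entries)


def from_annual_incidence(incidence, onset_age, death_age):
    """Rule (2): annual incidence times disease duration in years."""
    return incidence * (death_age - onset_age)


def from_case_count(cases, population=WORLD_POPULATION):
    """Rule (3): number of cases divided by the world population."""
    return cases / population


def from_birth_prevalence(birth_prev, age_group_share):
    """Rule (4): birth prevalence times the population share of the affected age range."""
    return birth_prev * age_group_share


def to_point_prevalence(kind, **kwargs):
    """Dispatch on the (hypothetical) prevalence type label."""
    if kind == "lifetime":          # Rule (5): fixed default
        return DEFAULT_LIFETIME_PREV
    converters = {
        "point": from_point_entries,
        "annual_incidence": from_annual_incidence,
        "cases": from_case_count,
        "birth": from_birth_prevalence,
    }
    return converters[kind](**kwargs)


regional = [{"area": "Europe", "value": 4e-5}, {"area": "Japan", "value": 2e-5}]
print(to_point_prevalence("point", entries=regional))
print(to_point_prevalence("annual_incidence", incidence=1e-6, onset_age=10, death_age=40))
```

Keeping one function per rule makes each approximation visible and easy to replace, for example if a better duration estimate than onset-to-death becomes available.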
To address the inaccuracy and noise of clinically collected patient phenotypes, the invention applies annotation propagation rules on the human phenotype ontology when calculating phenotype likelihood ratios, so that the disease-phenotype relations are analyzed more efficiently.
Preferably, the annotation propagation rules are as follows: if a phenotype is positive in the patient, its ancestor phenotypes are inferred to be positive in the patient as well; if a phenotype is negative in the patient, its descendant phenotypes are inferred to be negative in the patient.
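The propagation rule amounts to two graph traversals over the ontology: upward from each positive term and downward from each negative term. The sketch below uses a tiny made-up ontology with placeholder term names rather than real HPO codes; the `parents` mapping (child to list of parents) and all function names are assumptions for illustration.

```python
def reachable(term, edges):
    """Return every node reachable from `term` by repeatedly following `edges`."""
    seen, stack = set(), [term]
    while stack:
        for nxt in edges.get(stack.pop(), ()):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen


def invert(parents):
    """Turn a child -> parents mapping into a parent -> children mapping."""
    children = {}
    for child, pars in parents.items():
        for par in pars:
            children.setdefault(par, []).append(child)
    return children


def propagate(positive, negative, parents):
    """Positive terms imply their ancestors; negative terms imply their descendants."""
    children = invert(parents)
    pos, neg = set(positive), set(negative)
    for term in positive:
        pos |= reachable(term, parents)     # walk up the "is_a" edges
    for term in negative:
        neg |= reachable(term, children)    # walk down the "is_a" edges
    return pos, neg


# Toy ontology (hypothetical term names): root -> A, B; A -> A1, A2.
PARENTS = {"A": ["root"], "B": ["root"], "A1": ["A"], "A2": ["A"]}
pos, neg = propagate({"A1"}, {"B"}, PARENTS)
print(sorted(pos), sorted(neg))  # ['A', 'A1', 'root'] ['B']
```

Because the ontology is a directed acyclic graph in which a term may have several parents, the traversal tracks visited nodes so that shared ancestors are processed only once.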
The phenotype recommendation service selects, according to the current posterior probability distribution of the diseases and the disease-phenotype annotation relations, the phenotype with the highest expected benefit for the system to ask about.
The phenotype recommendation service includes:
information gain (Info Gain) calculation for phenotypes, including:
(A-1) using information entropy to represent the uncertainty of the disease recommendation result, the information entropy Entropy(W) with classification weights being defined by formula (6):
$$Entropy(W) = -\sum_{w_i \in W} \frac{w_i}{w_{sum}} \log \frac{w_i}{w_{sum}} \quad (6)$$
wherein W is a set of weights, $w_i$ represents the weight of a single element in the set of weights, and $w_{sum}$ represents the sum of all element weights in W;
the information entropy of the disease posterior probabilities in the current dialogue state, $Entropy_{cur}(D)$, is:
$$Entropy_{cur}(D) = Entropy(\{Prob_{posterior}(d) \mid d \in D\}) \quad (7)$$
wherein D represents the set of all rare diseases and $Prob_{posterior}(d)$ represents the posterior probability of disease d;
(A-2) the information gain under a single phenotype condition is:
$$InfoGain(p, D) = Entropy_{cur}(D) - \gamma \cdot Entropy(\{Prob_{posterior}(d) \cdot Pr(p \mid d) \mid d \in D\}) - (1 - \gamma) \cdot Entropy(\{Prob_{posterior}(d) \cdot Pr(\bar{p} \mid d) \mid d \in D\}) \quad (8)$$
wherein the first term is the information entropy of the disease posterior probabilities in the current dialogue state, the second term is the conditional entropy when phenotype p is positive, and the third term is the conditional entropy when phenotype p is negative; D represents the set of all rare diseases, $Prob_{posterior}(d)$ represents the posterior probability of disease d, $\gamma$ is the branch weight of the positive phenotype, i.e. the sum of the elements of the set $\{Prob_{posterior}(d) \cdot Pr(p \mid d) \mid d \in D\}$, $Pr(p \mid d)$ represents the frequency with which phenotype p occurs in disease d, and $Pr(\bar{p} \mid d)$ represents the frequency with which phenotype p does not occur in disease d;
Gini index calculation for phenotypes, including:
(B-1) using the Gini index to represent the purity of the disease recommendation result, the Gini index Gini(W) with classification weights being defined by formula (9):
$$Gini(W) = 1 - \sum_{w_i \in W} \left( \frac{w_i}{w_{sum}} \right)^2 \quad (9)$$
wherein W is a set of weights, $w_i$ represents the weight of a single element in the set of weights, and $w_{sum}$ represents the sum of all element weights in W;
the Gini index of the disease posterior probabilities in the current dialogue state is:
$$Gini_{cur}(D) = Gini(\{Prob_{posterior}(d) \mid d \in D\}) \quad (10)$$
wherein D represents the set of all rare diseases and $Prob_{posterior}(d)$ represents the posterior probability of disease d;
(B-2) the Gini index gain under a single phenotype condition is:
$$GiniGain(p, D) = Gini_{cur}(D) - \gamma \cdot Gini(\{Prob_{posterior}(d) \cdot Pr(p \mid d) \mid d \in D\}) - (1 - \gamma) \cdot Gini(\{Prob_{posterior}(d) \cdot Pr(\bar{p} \mid d) \mid d \in D\}) \quad (11)$$
wherein the first term is the Gini index of the disease posterior probabilities in the current dialogue state, the second term is the conditional Gini index when phenotype p is positive, the third term is the conditional Gini index when phenotype p is negative, and the other parameters are defined as in formula (8);
the information gain and the Gini index gain are adaptively fused to obtain the fusion metric AIGGI (Adaptive Information Gain and Gini Index):
$$AIGGI(p, D) = \alpha \cdot InfoGain(p, D) + (1 - \alpha) \cdot GiniGain(p, D) \quad (12)$$
wherein D represents the set of all rare diseases, p represents a single phenotype, and $\alpha$ is the uncertainty (degree of disorder) of the disease posterior probability distribution in the current dialogue state, represented by $Gini_{cur}(D)$;
based on the current dialogue state, the system selects the phenotype with the largest fusion metric AIGGI and asks about it, and the user answers 'present', 'absent' or 'uncertain'.
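The phenotype scoring described above can be sketched in a few lines of Python. This is a reading of the text, not the patented code: the logarithm base (2 here), the weighting of the two branches by γ and 1 − γ, the division of γ by the posterior sum (so that unnormalized posteriors still work), and α being taken as the current Gini index are assumptions, and the disease and phenotype names are made up.

```python
from math import log2


def entropy(weights):
    """Weighted information entropy of a list of non-negative weights."""
    total = sum(weights)
    if total <= 0:
        return 0.0
    return -sum((w / total) * log2(w / total) for w in weights if w > 0)


def gini(weights):
    """Weighted Gini index of a list of non-negative weights."""
    total = sum(weights)
    if total <= 0:
        return 0.0
    return 1.0 - sum((w / total) ** 2 for w in weights)


def gain(measure, posterior, freq, p):
    """Current impurity minus the branch-weighted impurity after asking about p."""
    current = list(posterior.values())
    pos = [posterior[d] * freq[d].get(p, 0.0) for d in posterior]
    neg = [posterior[d] * (1.0 - freq[d].get(p, 0.0)) for d in posterior]
    gamma = sum(pos) / sum(current)   # weight of the "present" branch
    return measure(current) - gamma * measure(pos) - (1.0 - gamma) * measure(neg)


def aiggi(posterior, freq, p):
    """Fusion of information gain and Gini gain, weighted by the current Gini index."""
    alpha = gini(list(posterior.values()))
    return alpha * gain(entropy, posterior, freq, p) + (1.0 - alpha) * gain(gini, posterior, freq, p)


def best_phenotype(posterior, freq, candidates):
    """Pick the candidate phenotype with the largest AIGGI score."""
    return max(candidates, key=lambda p: aiggi(posterior, freq, p))


# Toy data (hypothetical): p1 separates the two diseases, p2 does not.
posterior = {"disease_A": 0.5, "disease_B": 0.5}
freq = {"disease_A": {"p1": 1.0, "p2": 0.5}, "disease_B": {"p1": 0.0, "p2": 0.5}}
print(aiggi(posterior, freq, "p1"), aiggi(posterior, freq, "p2"))  # 0.75 0.0
print(best_phenotype(posterior, freq, ["p1", "p2"]))               # p1
```

With two equally likely diseases, a phenotype that is always present in one and never in the other resolves all uncertainty and gets the top score, while a phenotype that is equally frequent in both scores zero.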
Preferably, to make the dialogue more efficient, the annotation propagation rules are applied to filter out some phenotypes when recommending a phenotype to ask about, including:
a phenotype that has already been reported is not asked about in later dialogue turns;
for a reported positive phenotype, its ancestor phenotypes are taken to be positive by default, filtered out and no longer asked about;
for a reported negative phenotype, its descendant phenotypes are taken to be negative by default, filtered out and no longer asked about.
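The three filtering rules above reduce the candidate set before any scoring is done. A minimal sketch, again on a made-up ontology with placeholder term names; the function names and the child-to-parents mapping are assumptions for illustration.

```python
def reachable(term, edges):
    """Return every node reachable from `term` by repeatedly following `edges`."""
    seen, stack = set(), [term]
    while stack:
        for nxt in edges.get(stack.pop(), ()):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen


def candidate_phenotypes(all_terms, positive, negative, uncertain, parents):
    """Drop reported terms, ancestors of positives and descendants of negatives."""
    children = {}
    for child, pars in parents.items():
        for par in pars:
            children.setdefault(par, []).append(child)
    excluded = set(positive) | set(negative) | set(uncertain)   # already reported
    for term in positive:
        excluded |= reachable(term, parents)                    # implied positive
    for term in negative:
        excluded |= reachable(term, children)                   # implied negative
    return [t for t in all_terms if t not in excluded]


# Toy ontology (hypothetical): root -> A, B; A -> A1, A2; B -> B1.
PARENTS = {"A": ["root"], "B": ["root"], "A1": ["A"], "A2": ["A"], "B1": ["B"]}
TERMS = ["root", "A", "A1", "A2", "B", "B1"]
remaining = candidate_phenotypes(TERMS, {"A1"}, {"B"}, set(), PARENTS)
print(remaining)  # ['A2']
```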
The invention also provides a method for auxiliary differential diagnosis using the rare disease visual question-answering auxiliary differential diagnosis system, which comprises the following steps:
(1) The user enters patient information in the patient information acquisition sub-module of the front-end UI module, the patient information comprising the patient's age, sex at birth, positive phenotypes and negative phenotypes;
(2) The dialogue diagnosis service module selects, according to the current dialogue state, the phenotype with the largest fusion metric AIGGI and asks about it; after the user answers 'present', 'absent' or 'uncertain' for that phenotype, the next dialogue round begins;
during the question-and-answer process, the user can obtain various kinds of visual information from the visual question-answer differential diagnosis sub-module of the front-end UI module, including the disease priority assessment, the current dialogue state, the patient phenotype state and the dialogue history;
(3) Step (2) is repeated until the uncertainty of the disease recommendation result falls below a threshold or the maximum number of dialogue turns is reached, whereupon the system gives its disease recommendation according to the disease posterior probabilities and ends the dialogue.
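The control flow of steps (2) and (3) can be sketched as a loop with pluggable components. Everything concrete here is hypothetical: the state layout, the callback names, the threshold of 0.1, the limit of 20 turns and the stub callbacks in the demo are illustration values, since the text does not state them.

```python
def run_dialogue(state, select_phenotype, ask_user, update_posterior,
                 uncertainty, threshold=0.1, max_turns=20):
    """Ask about phenotypes until the result is certain enough or turns run out."""
    for _ in range(max_turns):
        if uncertainty(state["posterior"]) < threshold:
            break                                   # stop: result is certain enough
        phenotype = select_phenotype(state)         # e.g. the phenotype with the largest AIGGI
        if phenotype is None:
            break                                   # stop: nothing left to ask
        answer = ask_user(phenotype)                # "present", "absent" or "uncertain"
        state[answer].add(phenotype)
        state["history"].append((phenotype, answer))
        if answer != "uncertain":
            state["posterior"] = update_posterior(state)
    diagnosis = max(state["posterior"], key=state["posterior"].get)
    return diagnosis, state


# Demo with stub callbacks (all hypothetical).
state = {"posterior": {"disease_A": 0.5, "disease_B": 0.5},
         "present": set(), "absent": set(), "uncertain": set(), "history": []}
questions = iter(["p1", "p2"])
result, final = run_dialogue(
    state,
    select_phenotype=lambda s: next(questions, None),
    ask_user=lambda p: "present",
    update_posterior=lambda s: {"disease_A": 0.95, "disease_B": 0.05},
    uncertainty=lambda post: 1.0 - max(post.values()),
)
print(result, final["history"])  # disease_A [('p1', 'present')]
```

In a fuller version the stubs would be replaced by the phenotype recommendation, the posterior calculation and an entropy- or Gini-based uncertainty measure; passing them in as callables keeps the loop itself independent of those choices.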
The core of the invention is the disease recommendation service; the algorithmic core is how to find the most effective phenotype questions among tens of thousands of phenotypes so as to approach the true diagnosis along an optimal path. One important contribution is applying a Bayesian medical-test method for calculating disease posterior probability to the disease recommendation service in the dialogue diagnosis service module; the method comprises two steps, disease prior probability estimation and disease posterior probability calculation. First, the invention designs a conversion method for the different prevalence types, homogenizing heterogeneous disease prevalence knowledge into point prevalence, from which the disease prior probability is calculated. Then the likelihood ratio is used to measure how much a single phenotype condition contributes to the priority of a disease, and the likelihood ratios are multiplied to obtain the composite likelihood ratio of all phenotype conditions, giving the phenotype-based posterior probability of the disease.
Another core of the invention is a phenotype recommendation method that adaptively fuses information gain and Gini index gain, applied to the phenotype inquiry recommendation service in the dialogue diagnosis service module. The objective of this method is to select, based on the current dialogue state (that is, the posterior probability distribution of the diseases and the disease-phenotype annotation relations), the phenotype with the greatest expected benefit for the system to ask about. The method comprises three parts: the information gain (Info Gain) calculation for phenotypes, the Gini index gain (Gini Gain) calculation, and the calculation of the adaptive fusion metric AIGGI (Adaptive Information Gain and Gini Index) of the two. The invention uses the disease posterior probabilities as classification weights to calculate the information gain and the Gini index gain of a single phenotype, fuses the two adaptively according to the degree of uncertainty of the disease recommendation result, and recommends the phenotype with the largest AIGGI as the one to ask about.
Compared with the prior art, the invention has the beneficial effects that:
(1) Compared with knowledge-base methods that recommend diseases by phenotype keyword retrieval and with common disease recommendation methods based on patient similarity, the invention adopts a phenotype question-and-answer mode and actively collects additional symptom information through man-machine interaction to enrich the patient's phenotype information, which can prompt doctors during the dialogue to notice previously missed patient phenotypes and ultimately leads to more effective disease diagnosis.
(2) To address the inaccuracy and noise of clinically collected patient phenotypes, the invention designs a Bayesian medical-test method for calculating the posterior probability of a disease.
(3) Prevalence is a very important factor in disease diagnosis, yet most disease recommendation methods do not take it into account. To address the heterogeneity of disease prevalence knowledge, the invention designs a set of conversion rules for prevalence types that unifies them into point prevalence, calculates the disease prior probability from the point prevalence, and applies it in the disease recommendation method.
(4) The invention designs a phenotype recommendation method that adaptively fuses information gain and Gini index gain. The information gain and the Gini index gain of a phenotype condition are fused adaptively according to the degree of uncertainty of the disease recommendation result, yielding more efficient phenotype questions.
Drawings
FIG. 1 is a schematic block diagram of a rare disease visual question-answering type auxiliary differential diagnosis system;
FIG. 2 is a visual question-answering type auxiliary differential diagnosis status radar chart for rare diseases;
FIG. 3 is a question-answer flow chart of the rare disease visual question-answering type auxiliary differential diagnosis system.
Detailed Description
The invention provides a knowledge-driven rare disease dialogue diagnosis system that collects additional phenotypes beyond the user's self-report through dialogue with the user (patient, family member or doctor) and, on that basis, automatically performs an assessment and gives a disease diagnosis. The system takes the Orphanet rare disease knowledge base and the Human Phenotype Ontology as prior knowledge. After the initial patient information is entered, the system asks about phenotypes, selecting at each turn the phenotype with the greatest expected benefit according to the real-time dialogue state, and the user gives a judgment; iterating this process yields a multi-round question-and-answer dialogue that enriches the structured phenotype data collected for the patient and improves diagnostic performance.
Rare disease dialogue diagnostic system module introduction
The rare disease question-answer dialogue diagnosis system designed by the invention comprises a front end UI (User Interface) module, a dialogue diagnosis service module and a knowledge storage module, as shown in fig. 1, wherein:
The front-end UI module is used for human-computer interaction and comprises a patient information acquisition sub-module and a visual question-answer differential diagnosis sub-module. The patient information acquisition sub-module comprises a patient basic information view and a phenotype information input view, in which the user can enter information such as the patient's age, gender at birth, positive phenotypes and negative phenotypes. The question-answer differential diagnosis sub-module comprises a disease evaluation view, a dialogue state view, a patient phenotype state view, a dialogue history view and a question-answer view. The disease evaluation view shows disease prevalence, posterior probability and phenotype matching. The dialogue state view shows, in the form of a radar chart, the posterior probability (labeled P), the phenotype matching (labeled M) and the uncertainty measures (information entropy, labeled E, and Gini index, labeled G) of the currently leading disease recommendation result; as shown in FIG. 2, the larger the shaded area of the radar chart, the lower the uncertainty of the disease recommendation result. The patient phenotype state view is a sunburst chart of the patient's phenotype features (positive, negative, uncertain), drawn on the basis of the Human Phenotype Ontology using the annotation propagation rule. The question-answer view is the core interaction component of the question-answer differential diagnosis sub-module; it presents the phenotype queried by the system and provides feedback buttons (present, absent, uncertain) to the user. The question-answer differential diagnosis sub-module displays the relevant information in multi-dimensional visual views for differential diagnosis by clinical staff.
The dialogue diagnosis service module mainly provides a phenotype query recommendation service, a disease diagnosis recommendation service and a knowledge base query service through an API (Application Programming Interface). The phenotype query recommendation service recommends a suitable phenotype according to the current dialogue state so as to achieve a higher dialogue benefit; for each phenotype queried by the system, the user can answer "present", "absent" or "uncertain". The disease diagnosis recommendation service recommends suitable diseases based on the patient's initial self-report and the new phenotype information generated during the dialogue. We designed corresponding algorithms for phenotype query and disease diagnosis, which are two key aspects of the invention.
The knowledge storage module stores information such as disease prevalence, average ages of onset and death, disease phenotype frequency features and pathogenic genes obtained from Orphanet, the authoritative rare disease knowledge base. As of February 2022, the Orphanet knowledge base contained 4257 rare diseases with phenotype features, and these 4257 rare diseases form the target set of diseases recommended by the system. Orphanet provides its knowledge graph in XML file format; the system parses the knowledge graph and stores it in a MySQL relational database.
In addition to the Orphanet knowledge base, the system integrates the Human Phenotype Ontology (HPO), a standardized, structured ontology of human abnormal phenotypes. HPO is built as a hierarchical directed acyclic graph in which terms are connected by "is_a" relationships, and child terms have more specific definitions than their parents. As of February 2022, HPO contained 16226 current abnormal-phenotype terms (HP:0000118, Phenotypic abnormality, and its descendants), each with a unique HPO code.
The Human Phenotype Ontology applies the annotation propagation rule (Annotation Propagation Rule): if a phenotype is positive in a patient, its ancestor phenotypes can be inferred to be positive in that patient as well, and if a phenotype is negative in a patient, its descendant phenotypes can be inferred to be negative. This rule plays an important role in disease posterior probability calculation and in phenotype question-answer recommendation.
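The annotation propagation rule can be illustrated with a minimal sketch on a toy "is_a" graph. The term identifiers below are placeholders rather than real HPO codes, and the function names are hypothetical; the sketch only shows positive annotations flowing up to ancestors and negative annotations flowing down to descendants.

```python
from collections import deque

# Toy "is_a" edges (child -> parents). IDs are placeholders, not real HPO codes.
PARENTS = {
    "HP:A": [],
    "HP:B": ["HP:A"],
    "HP:C": ["HP:A"],
    "HP:D": ["HP:B", "HP:C"],
    "HP:E": ["HP:D"],
}


def build_children(parents):
    """Invert the child -> parents map into a parent -> children map."""
    children = {term: [] for term in parents}
    for child, parent_list in parents.items():
        for parent in parent_list:
            children[parent].append(child)
    return children


def reachable(start, edges):
    """Return every node reachable from start (excluding start) along edges."""
    seen, queue = set(), deque([start])
    while queue:
        node = queue.popleft()
        for nxt in edges.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


def propagate(positive, negative, parents):
    """Positive terms imply positive ancestors; negative terms imply negative descendants."""
    children = build_children(parents)
    pos, neg = set(positive), set(negative)
    for term in positive:
        pos |= reachable(term, parents)
    for term in negative:
        neg |= reachable(term, children)
    return pos, neg


pos, neg = propagate({"HP:B"}, {"HP:D"}, PARENTS)
print(sorted(pos), sorted(neg))
```

Because HPO is a directed acyclic graph rather than a tree, a term may have several parents, which is why the traversal keeps a visited set.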
(II) Method for calculating disease posterior probability based on Bayesian medical testing
One of the key aspects of the invention is a method for calculating the posterior probability of a disease based on Bayesian medical testing, which is applied to the disease recommendation service in the dialogue diagnosis service module and comprises two steps: disease prior probability estimation and disease posterior probability calculation. By designing a prevalence-type conversion method, heterogeneous disease prevalence knowledge is homogenized into point prevalence, and the prior probability of each disease is calculated from its point prevalence. We then use the likelihood ratio (Likelihood Ratio, LR) as a measure of how much a single phenotype contributes to disease prioritization, and multiply the individual LRs to obtain a composite likelihood ratio over all phenotypes, from which the posterior probability of the disease is obtained. The detailed embodiment is as follows:
(1) Disease prior probability estimation.
The disease prevalence knowledge in the Orphanet knowledge base is clearly heterogeneous: a rare disease may be annotated with one or several of five prevalence types (point prevalence, birth prevalence, lifetime prevalence, annual incidence and number of cases). A prevalence-type conversion method is designed to unify the prevalence types into point prevalence so as to achieve homogeneity. The prior probability is then calculated from the point prevalence of the disease within the rare disease range:
$$\mathrm{Prob}_{pretest}(D)=\frac{\mathrm{PointPrev}_{D}}{\sum \mathrm{PointPrev}} \tag{1}$$
where $\mathrm{Prob}_{pretest}(D)$ denotes the prior probability (Pretest Probability) of disease $D$, $\mathrm{PointPrev}_{D}$ denotes the point prevalence (Point Prevalence) of disease $D$, and $\sum \mathrm{PointPrev}$ denotes the sum of the point prevalences of all rare diseases.
The specific method for homogenizing rare disease prevalence information is as follows; a representative point prevalence value is calculated from the knowledge of the five prevalence types:
If the prevalence type of the disease is point prevalence, an entry whose geographic area is "worldwide" is preferred, and the numerical point prevalence of that entry, or the median of its prevalence class when only a class is given, is used. If the disease has no "worldwide" entry but has prevalence entries reported for a country or region, the average point prevalence of these entries is used as the point prevalence of the disease.
If the prevalence type of the disease is annual incidence, the average age of onset and average age of death of the disease are additionally obtained from Orphanet. Since rare diseases are difficult to cure, the period from the age of onset to the age of death is taken as the duration of the disease, and the point prevalence is estimated by multiplying the annual incidence by the disease duration in years.
If the prevalence type of the disease is number of cases, the number of cases divided by the total world population is used directly as the point prevalence of the disease.
If the prevalence type of the disease is birth prevalence and the age of onset is in the neonatal period, the birth prevalence multiplied by the proportion of the world population that falls within the age range spanned by the disease course is used as the point prevalence of the disease.
Finally, if the prevalence type of the disease is lifetime prevalence, which is the proportion of people in the target population who have had the disease at some time in their lives, including those who are ill at the time of measurement and those who were ill previously but are not ill now, the type is difficult to map onto point prevalence, and a default value of one in a million is used as the point prevalence of the disease.
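The conversion rules above, followed by the normalization into prior probabilities, can be sketched as below. This is only an illustration under stated assumptions: the record layout (a dict with a "type" field), the field names, the world population figure and all numeric values are hypothetical placeholders, not Orphanet's schema or data, and the neonatal-onset condition for birth prevalence is assumed to have been checked by the caller.

```python
from statistics import mean

WORLD_POPULATION = 7.9e9    # assumed placeholder, not a figure from the text
DEFAULT_POINT_PREV = 1e-6   # "one in a million" default for lifetime prevalence


def to_point_prevalence(record):
    """Convert one prevalence record (hypothetical dict layout) to point prevalence."""
    kind = record["type"]
    if kind == "point":
        worldwide = [e["value"] for e in record["entries"] if e["area"] == "worldwide"]
        if worldwide:
            return worldwide[0]
        # No worldwide entry: average the country/region entries.
        return mean(e["value"] for e in record["entries"])
    if kind == "annual_incidence":
        # Disease duration is taken as the span from onset age to death age.
        duration_years = record["death_age"] - record["onset_age"]
        return record["value"] * duration_years
    if kind == "cases":
        return record["value"] / WORLD_POPULATION
    if kind == "birth":
        # age_group_share: share of world population within the disease's age range.
        return record["value"] * record["age_group_share"]
    if kind == "lifetime":
        return DEFAULT_POINT_PREV
    raise ValueError(f"unknown prevalence type: {kind}")


def prior_probabilities(point_prev):
    """Normalize point prevalences over all diseases into prior probabilities."""
    total = sum(point_prev.values())
    return {disease: value / total for disease, value in point_prev.items()}


records = {
    "disease_a": {"type": "point", "entries": [
        {"area": "Europe", "value": 2e-5},
        {"area": "worldwide", "value": 1e-5},
    ]},
    "disease_b": {"type": "annual_incidence", "value": 2e-6,
                  "onset_age": 10, "death_age": 40},
    "disease_c": {"type": "lifetime"},
}
point_prev = {name: to_point_prevalence(rec) for name, rec in records.items()}
priors = prior_probabilities(point_prev)
print(priors)
```

Keeping the conversion and the normalization as separate functions makes it easy to swap in a different conversion rule for a single prevalence type without touching the prior calculation.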
(2) Disease posterior probability calculation.
To better measure how much each phenotype observation contributes to disease prioritization, we use a Bayesian likelihood ratio method to estimate the LR of each phenotype and the posterior probability of the disease. First, the prior odds are calculated according to formula (2); the composite likelihood ratio of the phenotype information is then calculated according to formula (3), based on the disease phenotype frequency annotations and the annotation propagation rule; next, the posterior odds are calculated according to formula (4); finally, the posterior probability of the disease is obtained according to formula (5).
The conversion relationship between odds (Odds) and probability (Prob) is:
$$\mathrm{Odds}=\frac{\mathrm{Prob}}{1-\mathrm{Prob}} \tag{2}$$
The composite likelihood ratio (Composite Likelihood Ratio) of multiple phenotypes is calculated as:
$$\mathrm{CompositeLR}(P,D)=\prod_{i=1}^{n}\mathrm{LR}(p_i,D)=\prod_{i=1}^{n}\frac{\Pr(p_i\mid D)}{\Pr(p_i\mid \bar{D})} \tag{3}$$
where $P$ denotes the collected phenotype set, $(p_1,p_2,\ldots,p_n)\in P$, $\bar{D}$ denotes all rare diseases other than rare disease $D$, and $\Pr(p_i\mid D)$ denotes the frequency with which phenotype $p_i$ occurs in disease $D$; the composite likelihood ratio of the phenotype set $P$ is the product of the likelihood ratios of the individual phenotypes $p_i$.
The posterior odds are calculated as:
$$\mathrm{Odds}_{posterior}(P,D)=\mathrm{Odds}_{pretest}(D)\cdot\mathrm{CompositeLR}(P,D) \tag{4}$$
where $\mathrm{Odds}_{pretest}(D)$ denotes the prior odds of rare disease $D$.
Finally, the posterior probability is calculated:
$$\mathrm{Prob}_{posterior}(D)=\frac{\mathrm{Odds}_{posterior}(P,D)}{1+\mathrm{Odds}_{posterior}(P,D)} \tag{5}$$
It is worth mentioning that, because patient phenotypes collected in clinical practice are imprecise and noisy, the method applies the annotation propagation rule on the Human Phenotype Ontology when calculating phenotype likelihood ratios, so that the disease-phenotype relationship is analyzed more effectively. Specifically, if a disease is annotated with a phenotype, the disease can be inferred to be associated with the ancestor phenotypes of that phenotype as well. If an entered positive phenotype is an ancestor of a disease-associated phenotype, its likelihood ratio for that disease should count in favor of the disease, i.e., be greater than 1. If an entered negative phenotype is an ancestor of a disease-associated phenotype, its likelihood ratio should count against the disease, i.e., be less than 1.
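The chain from prior probability to posterior probability (prior odds, composite likelihood ratio, posterior odds, posterior probability) can be sketched as follows. The function names are hypothetical. The likelihood ratio used for an absent phenotype, (1 - Pr(p|D)) / (1 - Pr(p|not D)), is the standard medical-test form and is an assumption here, since the text only gives the frequency ratio for observed phenotypes; frequencies are assumed to lie strictly between 0 and 1.

```python
from math import prod


def to_odds(probability):
    """Convert a probability into odds."""
    return probability / (1.0 - probability)


def to_prob(odds_value):
    """Convert odds back into a probability."""
    return odds_value / (1.0 + odds_value)


def likelihood_ratio(freq_in_disease, freq_in_others, present=True):
    """LR of one phenotype; the 'absent' branch is an assumed standard form."""
    if present:
        return freq_in_disease / freq_in_others
    return (1.0 - freq_in_disease) / (1.0 - freq_in_others)


def posterior_probability(prior, likelihood_ratios):
    """Posterior = to_prob(prior odds * product of the individual LRs)."""
    posterior_odds = to_odds(prior) * prod(likelihood_ratios)
    return to_prob(posterior_odds)


# Illustrative numbers only: one phenotype present, one phenotype absent.
lrs = [
    likelihood_ratio(0.8, 0.1, present=True),    # frequent in D, rare elsewhere
    likelihood_ratio(0.9, 0.05, present=False),  # expected in D but not observed
]
posterior = posterior_probability(0.01, lrs)
print(round(posterior, 6))
```

Working in odds keeps the update multiplicative, so each newly answered phenotype only contributes one extra factor to the product.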
(III) Phenotype recommendation method adaptively fusing information gain and the Gini index
Another key aspect of the invention is a phenotype recommendation method that adaptively fuses information gain and the Gini index, which is applied to the phenotype recommendation service in the dialogue diagnosis service module. The objective of the method is to select, for the system to ask about, the phenotype that yields the greatest expected benefit given the current dialogue state, i.e., the disease posterior probability distribution and the disease-phenotype annotation relationships. The method comprises three parts: calculation of the information gain (Info Gain) of a phenotype, calculation of its Gini index gain (Gini Index Gain), and calculation of the adaptive fusion metric AIGGI (Adaptive Information Gain and Gini Index) of the two. The information gain and the Gini index gain of a single phenotype are calculated using the disease posterior probabilities as classification weights, and are adaptively fused according to the degree of uncertainty of the disease recommendation result; the phenotype with the largest AIGGI is recommended as the phenotype to ask about. In the early phase of question-answer diagnosis, or when the posterior probability of the leading disease is not yet prominent, the system favors phenotypes with a larger expected information gain so as to reduce the uncertainty of the disease diagnosis; in the later phase of question-answer diagnosis, or when the posterior probability of the leading disease is prominent, the system favors phenotypes with a smaller expected Gini index so as to further improve the purity of the disease diagnosis.
(1) Information gain calculation for phenotype question answering.
The invention uses information entropy (Information Entropy) to represent the uncertainty of the disease recommendation result. Information entropy is related to the information content of a system and can express the uncertainty of a sample set; it emphasizes information representation and discriminates well on disordered data, and the classical decision tree algorithm ID3 is built on it as its measure. The information entropy of the current state is calculated using the disease posterior probabilities as classification weights, as shown in formula (7). The more uniform the disease posterior probability distribution, the higher the uncertainty of the disease diagnosis and the larger the information entropy.
First, the basic information entropy Entropy(W) with classification weights is defined:
$$\mathrm{Entropy}(W)=-\sum_{w_i\in W}\frac{w_i}{W_{sum}}\log\frac{w_i}{W_{sum}} \tag{6}$$
where $W$ is a set of weights, $w_i$ denotes the weight of a single element in the set, and $W_{sum}$ denotes the sum of all element weights in $W$.
The information entropy $\mathrm{Entropy}_{cur}(D)$ of the disease posterior probabilities in the current dialogue state is:
$$\mathrm{Entropy}_{cur}(D)=\mathrm{Entropy}(\{\mathrm{Prob}_{posterior}(d)\mid d\in D\}) \tag{7}$$
where $D$ denotes the set of all rare diseases and $\mathrm{Prob}_{posterior}(d)$ denotes the posterior probability of disease $d$.
The information gain under a single phenotype condition is:
$$\mathrm{InfoGain}(p,D)=\mathrm{Entropy}_{cur}(D)-\gamma\,\mathrm{Entropy}(\{\mathrm{Prob}_{posterior}(d)\Pr(p\mid d)\mid d\in D\})-(1-\gamma)\,\mathrm{Entropy}(\{\mathrm{Prob}_{posterior}(d)\Pr(\bar{p}\mid d)\mid d\in D\}) \tag{8}$$
where the first term is the information entropy of the disease posterior probabilities in the current dialogue state, the second term is the conditional entropy when phenotype $p$ is positive, and the third term is the conditional entropy when phenotype $p$ is negative; $D$ denotes the set of all rare diseases, $\mathrm{Prob}_{posterior}(d)$ denotes the posterior probability of disease $d$, $\gamma$ is the branch weight of the positive phenotype, i.e., the sum of the elements of the set $\{\mathrm{Prob}_{posterior}(d)\Pr(p\mid d)\mid d\in D\}$, $\Pr(p\mid d)$ denotes the frequency with which phenotype $p$ occurs in disease $d$, and $\Pr(\bar{p}\mid d)$ denotes the frequency with which phenotype $p$ does not occur in disease $d$.
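A minimal sketch of the weighted entropy and of the information gain of one candidate phenotype is given below. Several points are assumptions rather than statements from the text: the logarithm base (2 is used here), the weight 1 - γ on the negative branch (which presumes that the posterior probabilities sum to 1), and treating a disease with no annotation for the phenotype as having frequency 0. The function and variable names are hypothetical.

```python
from math import log2


def entropy(weights):
    """Entropy of a weight set after normalizing by the sum of the weights."""
    weights = list(weights)
    total = sum(weights)
    if total <= 0:
        return 0.0
    return -sum((w / total) * log2(w / total) for w in weights if w > 0)


def info_gain(posterior, freq):
    """posterior: disease -> posterior probability; freq: disease -> Pr(p | d)."""
    pos_branch = [posterior[d] * freq.get(d, 0.0) for d in posterior]
    neg_branch = [posterior[d] * (1.0 - freq.get(d, 0.0)) for d in posterior]
    gamma = sum(pos_branch)  # branch weight of a positive answer
    return (entropy(posterior.values())
            - gamma * entropy(pos_branch)
            - (1.0 - gamma) * entropy(neg_branch))


posterior = {"d1": 0.5, "d2": 0.5}
# A phenotype that always occurs in d1 and never in d2 separates them fully.
gain_discriminating = info_gain(posterior, {"d1": 1.0, "d2": 0.0})
# A phenotype equally frequent in both diseases carries no information.
gain_uninformative = info_gain(posterior, {"d1": 0.5, "d2": 0.5})
print(gain_discriminating, gain_uninformative)
```

The two toy phenotypes mark the extremes: a perfectly discriminating question removes all of the current entropy, while an equally frequent one removes none.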
(2) Gini index calculation for phenotype question answering.
The invention uses the Gini index (Gini Index) to represent the purity of the disease recommendation result. The Gini index relates to the disorder or purity of data; it emphasizes algebraic representation and is better at classifying purer sets, and the CART algorithm uses it to build classification decision trees. The Gini index of the current state is calculated using the disease posterior probabilities as classification weights, as shown in formula (10). The more the posterior probability distribution is concentrated on a few diseases, the higher the purity of the disease diagnosis result and the smaller the Gini index. First, the basic Gini index Gini(W) with classification weights is defined:
$$\mathrm{Gini}(W)=1-\sum_{w_i\in W}\left(\frac{w_i}{W_{sum}}\right)^{2} \tag{9}$$
where $W$ is a set of weights, $w_i$ denotes the weight of a single element in the set, and $W_{sum}$ denotes the sum of all element weights in $W$.
The Gini index of the disease posterior probabilities in the current dialogue state is:
$$\mathrm{Gini}_{cur}(D)=\mathrm{Gini}(\{\mathrm{Prob}_{posterior}(d)\mid d\in D\}) \tag{10}$$
where $D$ denotes the set of all rare diseases and $\mathrm{Prob}_{posterior}(d)$ denotes the posterior probability of disease $d$.
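The Gini index gain under a single phenotype condition is:
$$\mathrm{GiniGain}(p,D)=\mathrm{Gini}_{cur}(D)-\gamma\,\mathrm{Gini}(\{\mathrm{Prob}_{posterior}(d)\Pr(p\mid d)\mid d\in D\})-(1-\gamma)\,\mathrm{Gini}(\{\mathrm{Prob}_{posterior}(d)\Pr(\bar{p}\mid d)\mid d\in D\}) \tag{11}$$
where the first term is the Gini index of the disease posterior probabilities in the current dialogue state, the second term is the conditional Gini index when phenotype $p$ is positive, and the third term is the conditional Gini index when phenotype $p$ is negative; the remaining parameters are defined as in formula (8).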
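The Gini index gain follows the same branch structure as the information gain, with the Gini index in place of entropy. The sketch below makes the same assumptions as any such illustration must: the weight 1 - γ on the negative branch presumes posteriors that sum to 1, a missing annotation is treated as frequency 0, and all names are hypothetical.

```python
def gini(weights):
    """Gini index of a weight set after normalizing by the sum of the weights."""
    weights = list(weights)
    total = sum(weights)
    if total <= 0:
        return 0.0
    return 1.0 - sum((w / total) ** 2 for w in weights)


def gini_gain(posterior, freq):
    """posterior: disease -> posterior probability; freq: disease -> Pr(p | d)."""
    pos_branch = [posterior[d] * freq.get(d, 0.0) for d in posterior]
    neg_branch = [posterior[d] * (1.0 - freq.get(d, 0.0)) for d in posterior]
    gamma = sum(pos_branch)  # branch weight of a positive answer
    return (gini(posterior.values())
            - gamma * gini(pos_branch)
            - (1.0 - gamma) * gini(neg_branch))


posterior = {"d1": 0.5, "d2": 0.5}
gini_now = gini(posterior.values())
gain_discriminating = gini_gain(posterior, {"d1": 1.0, "d2": 0.0})
gain_uninformative = gini_gain(posterior, {"d1": 0.5, "d2": 0.5})
print(gini_now, gain_discriminating, gain_uninformative)
```

With two equally likely diseases the current Gini index is 0.5, and a perfectly discriminating phenotype recovers all of it as gain.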
(3) Adaptive fusion of information gain and the Gini index.
Information entropy and the Gini index serve similar purposes as uncertainty measures, but differ in definition and emphasis. Information entropy emphasizes information representation and discriminates better on more disordered data; the Gini index emphasizes algebraic representation and is better at classifying purer sets. A fusion metric can combine the advantages of both. In the early phase of question-answer diagnosis, or when the posterior probability of the leading disease is not yet prominent, the system favors phenotypes with a larger expected information gain so as to reduce the uncertainty of the disease diagnosis; in the later phase of question-answer diagnosis, or when the posterior probability of the leading disease is prominent, the system favors phenotypes with a smaller expected Gini index so as to further improve the purity of the disease diagnosis. The invention adaptively weights and integrates the information gain and the Gini index gain to produce a fusion metric, AIGGI (Adaptive Information Gain and Gini Index), that characterizes uncertainty more completely:
$$\mathrm{AIGGI}(p,D)=\alpha\cdot\mathrm{InfoGain}(p,D)+(1-\alpha)\cdot\mathrm{GiniGain}(p,D) \tag{12}$$
where $D$ denotes the set of all rare diseases, $p$ denotes a single phenotype, and $\alpha$ is the degree of uncertainty (disorder) of the disease posterior probability distribution in the current dialogue state. Since the Gini index takes values in $[0,1]$, $\alpha$ can be represented directly by $\mathrm{Gini}_{cur}(D)$.
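The effect of the adaptive weight in formula (12) can be seen in a small sketch that takes precomputed gains for two candidate phenotypes and selects the one with the largest AIGGI. The phenotype identifiers, the gain values and the function names are illustrative placeholders, not data from the system.

```python
def aiggi(info_gain_value, gini_gain_value, alpha):
    """Formula (12): alpha * InfoGain + (1 - alpha) * GiniGain."""
    return alpha * info_gain_value + (1.0 - alpha) * gini_gain_value


def pick_phenotype(gains, alpha):
    """gains: phenotype -> (info gain, Gini gain). Returns the argmax of AIGGI."""
    return max(gains, key=lambda p: aiggi(gains[p][0], gains[p][1], alpha))


# Placeholder candidates: HP:X has the larger information gain,
# HP:Y has the larger Gini index gain.
gains = {"HP:X": (0.9, 0.1), "HP:Y": (0.3, 0.4)}

early_choice = pick_phenotype(gains, alpha=0.8)  # high current Gini index
late_choice = pick_phenotype(gains, alpha=0.1)   # low current Gini index
print(early_choice, late_choice)
```

When the current Gini index is high the information gain dominates the score, and as the posterior concentrates on a few diseases the Gini index gain takes over, which matches the early-phase and late-phase behavior described above.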
It should be noted that, to improve dialogue efficiency, the annotation propagation rule is applied to filter out some phenotypes when recommending phenotypes to ask about. First, a phenotype that has already been reported is not asked about in later dialogue rounds. In addition, for a reported positive phenotype, its ancestor phenotypes are taken to be positive by default and are filtered out; for a reported negative phenotype, its descendant phenotypes are taken to be negative by default and are filtered out, and the system no longer asks about them.
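The filtering step can be sketched as a set difference over the candidate phenotypes. The ancestor and descendant maps are assumed to be precomputed from the ontology; the term identifiers and names are placeholders, and excluding phenotypes answered "uncertain" from later rounds is an assumption that follows from "already reported phenotypes are not asked again".

```python
def candidate_phenotypes(all_terms, positive, negative, uncertain,
                         ancestors, descendants):
    """Return the terms that may still be asked about, in their original order."""
    excluded = set(positive) | set(negative) | set(uncertain)
    for term in positive:
        excluded |= ancestors.get(term, set())    # implied positive
    for term in negative:
        excluded |= descendants.get(term, set())  # implied negative
    return [term for term in all_terms if term not in excluded]


# Toy ontology: B and C are children of A, D is a child of C, F is a child of B.
ALL_TERMS = ["HP:A", "HP:B", "HP:C", "HP:D", "HP:F"]
ANCESTORS = {"HP:B": {"HP:A"}, "HP:C": {"HP:A"},
             "HP:D": {"HP:A", "HP:C"}, "HP:F": {"HP:A", "HP:B"}}
DESCENDANTS = {"HP:A": {"HP:B", "HP:C", "HP:D", "HP:F"},
               "HP:B": {"HP:F"}, "HP:C": {"HP:D"}}

remaining = candidate_phenotypes(
    ALL_TERMS, positive={"HP:B"}, negative={"HP:C"}, uncertain=set(),
    ancestors=ANCESTORS, descendants=DESCENDANTS,
)
print(remaining)
```

Note that a descendant of a positive phenotype (HP:F here) stays in the candidate list, since a more specific phenotype can still refine the diagnosis.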
(IV) Introduction to the dialogue diagnosis flow of the rare disease dialogue diagnosis system
The dialogue diagnosis flow of the rare disease dialogue diagnosis system is shown in FIG. 3. After the patient's initial self-report, human-computer interaction proceeds as multi-round question answering to obtain more patient phenotype information; when the maximum number of dialogue rounds is reached or the uncertainty of the disease recommendation result is low, the system gives its disease diagnosis recommendation and ends the dialogue. The detailed steps of the dialogue diagnosis flow are as follows:
(1) Initial self-report. The user enters patient information, including the patient's age, gender at birth, positive phenotypes and negative phenotypes, in the patient information acquisition sub-module of the front-end UI module.
(2) Phenotype question answering. The system selects the phenotype with the largest fusion metric AIGGI according to the current dialogue state and asks about it; after the user answers "present", "absent" or "uncertain" for that phenotype, the dialogue enters the next round. During question answering, the user can obtain various visual information, including disease prioritization assessment, current dialogue state, patient phenotype state and dialogue history, in the question-answer differential diagnosis sub-module of the front-end UI module.
(3) Disease recommendation. Step (2) is repeated until the uncertainty of the disease recommendation result is low or the maximum number of dialogue rounds is reached; the system then gives its disease recommendation according to the disease posterior probabilities and ends the dialogue.
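The three steps above amount to a loop with two stopping conditions. The sketch below shows only that control flow: the recommendation, answer and update functions are passed in as callables and are replaced by toy stand-ins, and the round limit, the uncertainty threshold, the state layout and every name are hypothetical placeholders.

```python
def run_dialogue(state, recommend, ask_user, update, uncertainty,
                 max_rounds=10, threshold=0.2):
    """Ask phenotypes until uncertainty is low or the round limit is reached."""
    rounds = 0
    while rounds < max_rounds and uncertainty(state) >= threshold:
        phenotype = recommend(state)
        if phenotype is None:          # nothing left to ask
            break
        answer = ask_user(phenotype)   # "present", "absent" or "uncertain"
        state = update(state, phenotype, answer)
        rounds += 1
    best = max(state["posterior"], key=state["posterior"].get)
    return best, rounds


# --- Toy stand-ins for the real services (illustration only) ---
QUEUE = ["HP:X", "HP:Y", "HP:Z"]


def toy_recommend(state):
    remaining = [p for p in QUEUE if p not in state["asked"]]
    return remaining[0] if remaining else None


def toy_user(phenotype):
    return "present"


def toy_update(state, phenotype, answer):
    post = dict(state["posterior"])
    if answer == "present":
        post["d1"] *= 4.0              # pretend the answer favors d1
    total = sum(post.values())
    return {"posterior": {d: v / total for d, v in post.items()},
            "asked": state["asked"] + [phenotype]}


def toy_gini(state):
    return 1.0 - sum(v * v for v in state["posterior"].values())


start = {"posterior": {"d1": 0.5, "d2": 0.5}, "asked": []}
best, rounds = run_dialogue(start, toy_recommend, toy_user, toy_update, toy_gini)
print(best, rounds)
```

In this toy run the Gini index falls from 0.5 to 0.32 after the first answer and to about 0.11 after the second, so the loop stops after two rounds and recommends d1.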
The foregoing embodiments have described the technical solutions and advantages of the present invention in detail, and it should be understood that the foregoing embodiments are merely illustrative of the present invention and are not intended to limit the invention, and any modifications, additions, substitutions and the like that fall within the principles of the present invention should be included in the scope of the invention.
Claims (7)
1. A rare disease visual question-answering type auxiliary differential diagnosis system, characterized by comprising:
the front-end UI module is used for performing man-machine interaction, collecting basic information and phenotype information of a patient, inputting the basic information and the phenotype information of the patient to obtain an initial self-report of the patient, and displaying related information by utilizing a plurality of dimension visual views for differential diagnosis by clinical staff;
the dialogue diagnosis service module provides a phenotype query recommendation service, a disease diagnosis recommendation service and a knowledge base query service through an API; the phenotype query recommendation service recommends a suitable phenotype according to the current dialogue state so as to achieve a higher dialogue benefit, and for each phenotype queried by the system the user can answer "present", "absent" or "uncertain"; the disease diagnosis recommendation service recommends suitable diseases according to the patient's initial self-report collected by the front-end UI module and the new phenotype information generated during the dialogue; the knowledge base query service queries and retrieves the knowledge stored in the knowledge storage module as required by the phenotype query recommendation service and the disease diagnosis recommendation service; the phenotype query recommendation service comprises:
information gain calculation of a phenotype, comprising:
(a-1) using information entropy to represent the uncertainty of the disease recommendation result, the information entropy Entropy(W) with classification weights being defined as formula (6):
$$\mathrm{Entropy}(W)=-\sum_{w_i\in W}\frac{w_i}{W_{sum}}\log\frac{w_i}{W_{sum}} \tag{6}$$
wherein $W$ is a set of weights, $w_i$ denotes the weight of a single element in the set, and $W_{sum}$ denotes the sum of all element weights in $W$;
the information entropy $\mathrm{Entropy}_{cur}(D)$ of the disease posterior probabilities in the current dialogue state is:
$$\mathrm{Entropy}_{cur}(D)=\mathrm{Entropy}(\{\mathrm{Prob}_{posterior}(d)\mid d\in D\}) \tag{7}$$
wherein $D$ denotes the set of all rare diseases and $\mathrm{Prob}_{posterior}(d)$ denotes the posterior probability of disease $d$;
(a-2) the information gain under a single phenotype condition is:
$$\mathrm{InfoGain}(p,D)=\mathrm{Entropy}_{cur}(D)-\gamma\,\mathrm{Entropy}(\{\mathrm{Prob}_{posterior}(d)\Pr(p\mid d)\mid d\in D\})-(1-\gamma)\,\mathrm{Entropy}(\{\mathrm{Prob}_{posterior}(d)\Pr(\bar{p}\mid d)\mid d\in D\}) \tag{8}$$
wherein the first term is the information entropy of the disease posterior probabilities in the current dialogue state, the second term is the conditional entropy when phenotype $p$ is positive, and the third term is the conditional entropy when phenotype $p$ is negative; $D$ denotes the set of all rare diseases, $\mathrm{Prob}_{posterior}(d)$ denotes the posterior probability of disease $d$, $\gamma$ is the branch weight of the positive phenotype, i.e., the sum of the elements of the set $\{\mathrm{Prob}_{posterior}(d)\Pr(p\mid d)\mid d\in D\}$, $\Pr(p\mid d)$ denotes the frequency with which phenotype $p$ occurs in disease $d$, and $\Pr(\bar{p}\mid d)$ denotes the frequency with which phenotype $p$ does not occur in disease $d$;
Gini index calculation of a phenotype, comprising:
(B-1) using the Gini index to represent the purity of the disease recommendation result, the Gini index Gini(W) with classification weights being defined as formula (9):
$$\mathrm{Gini}(W)=1-\sum_{w_i\in W}\left(\frac{w_i}{W_{sum}}\right)^{2} \tag{9}$$
wherein $W$ is a set of weights, $w_i$ denotes the weight of a single element in the set, and $W_{sum}$ denotes the sum of all element weights in $W$;
the Gini index of the disease posterior probabilities in the current dialogue state is:
$$\mathrm{Gini}_{cur}(D)=\mathrm{Gini}(\{\mathrm{Prob}_{posterior}(d)\mid d\in D\}) \tag{10}$$
wherein $D$ denotes the set of all rare diseases and $\mathrm{Prob}_{posterior}(d)$ denotes the posterior probability of disease $d$;
(B-2) the Gini index gain under a single phenotype condition is:
$$\mathrm{GiniGain}(p,D)=\mathrm{Gini}_{cur}(D)-\gamma\,\mathrm{Gini}(\{\mathrm{Prob}_{posterior}(d)\Pr(p\mid d)\mid d\in D\})-(1-\gamma)\,\mathrm{Gini}(\{\mathrm{Prob}_{posterior}(d)\Pr(\bar{p}\mid d)\mid d\in D\}) \tag{11}$$
wherein the first term is the Gini index of the disease posterior probabilities in the current dialogue state, the second term is the conditional Gini index when phenotype $p$ is positive, the third term is the conditional Gini index when phenotype $p$ is negative, and the remaining parameters are defined as in formula (8);
adaptively fusing the information gain and the Gini index gain to obtain the fusion metric AIGGI:
$$\mathrm{AIGGI}(p,D)=\alpha\cdot\mathrm{InfoGain}(p,D)+(1-\alpha)\cdot\mathrm{GiniGain}(p,D) \tag{12}$$
wherein $D$ denotes the set of all rare diseases, $p$ denotes a single phenotype, $\alpha$ is the degree of uncertainty or disorder of the disease posterior probability distribution in the current dialogue state, and $\alpha$ is represented by $\mathrm{Gini}_{cur}(D)$;
the system selects the phenotype with the largest fusion metric AIGGI according to the current dialogue state and asks about it, and the user answers "present", "absent" or "uncertain" for that phenotype;
and the knowledge storage module is used for storing the human phenotype ontology knowledge base and the disease-phenotype relation knowledge base.
2. The rare disease visual question-answering type auxiliary differential diagnosis system according to claim 1, wherein the front-end UI module comprises:
the patient information acquisition sub-module comprises a patient basic information view and a phenotype information input view;
the visual question-answer differential diagnosis sub-module is used for displaying relevant information for differential diagnosis by clinical staff through a plurality of dimension visual views, namely a disease evaluation view, a dialogue state view, a patient phenotype state view, a dialogue history view and a question-answer view.
3. The rare disease visual question-answering type auxiliary differential diagnosis system according to claim 2, wherein the patient information acquisition submodule comprises a patient basic information view and a phenotype information input view, and a user can input information such as age, gender at birth, positive phenotype, negative phenotype and the like of the patient;
the question-answer differential diagnosis submodule comprises:
a disease evaluation view for displaying disease prevalence, posterior probability, and phenotype matching;
a dialogue state view, which displays posterior probability, phenotype matching condition and uncertainty measurement of disease recommendation result of the disease with the greatest current advantage in the form of a radar chart;
a patient phenotype state view, which is a sunburst chart of the patient's phenotype features drawn on the basis of the human phenotype ontology using the annotation propagation rule;
a dialogue history view and a question and answer view for presenting phenotypic information of system queries and providing feedback buttons to the user.
4. The rare disease visual question-answering type auxiliary differential diagnosis system according to claim 1, wherein the disease diagnosis recommendation service includes disease prior probability estimation and disease posterior probability calculation, and the dialogue diagnosis service module recommends a disease with the maximum posterior probability at the end of dialogue;
the method for estimating the prior probability of the disease comprises the following steps:
(a-1) uniformly converting disease prevalence types in an Orphanet knowledge base into point prevalence;
(a-2) calculating the prior probability of the disease from the point prevalence of the disease within the rare disease range, according to the formula:
$$\mathrm{Prob}_{pretest}(D)=\frac{\mathrm{PointPrev}_{D}}{\sum \mathrm{PointPrev}} \tag{1}$$
wherein $\mathrm{Prob}_{pretest}(D)$ denotes the prior probability of disease $D$, $\mathrm{PointPrev}_{D}$ denotes the point prevalence of disease $D$, and $\sum \mathrm{PointPrev}$ denotes the sum of the point prevalences of all rare diseases;
the method for calculating the posterior probability of the disease comprises the following steps:
(b-1) the conversion relationship between odds and probability is:
$$\mathrm{Odds}=\frac{\mathrm{Prob}}{1-\mathrm{Prob}} \tag{2}$$
converting the disease prior probability into the disease prior odds according to formula (2);
(b-2) calculating the composite likelihood ratio of the phenotype information according to formula (3), based on the disease phenotype frequency annotations and the annotation propagation rule:
$$\mathrm{CompositeLR}(P,D)=\prod_{i=1}^{n}\mathrm{LR}(p_i,D)=\prod_{i=1}^{n}\frac{\Pr(p_i\mid D)}{\Pr(p_i\mid \bar{D})} \tag{3}$$
wherein $P$ denotes the collected phenotype set, $(p_1,p_2,\ldots,p_n)\in P$, $\bar{D}$ denotes all rare diseases other than rare disease $D$, $\Pr(p_i\mid D)$ denotes the frequency with which phenotype $p_i$ occurs in disease $D$, and the composite likelihood ratio of the phenotype set $P$ is the product of the likelihood ratios of the individual phenotypes $p_i$;
(b-3) calculating the posterior odds $\mathrm{Odds}_{posterior}(P,D)$ of the disease according to formula (4):
$$\mathrm{Odds}_{posterior}(P,D)=\mathrm{Odds}_{pretest}(D)\cdot\mathrm{CompositeLR}(P,D) \tag{4}$$
wherein $\mathrm{Odds}_{pretest}(D)$ denotes the prior odds of rare disease $D$;
(b-4) calculating the posterior probability $\mathrm{Prob}_{posterior}(D)$ of the disease according to formula (5):
$$\mathrm{Prob}_{posterior}(D)=\frac{\mathrm{Odds}_{posterior}(P,D)}{1+\mathrm{Odds}_{posterior}(P,D)} \tag{5}$$
5. The rare disease visual question-answering type auxiliary differential diagnosis system according to claim 4, wherein in step (a-1), the method for uniformly converting the disease prevalence types into point prevalence comprises:
(1) if the prevalence type of the disease is point prevalence, preferentially selecting an entry whose geographic area is "worldwide", and using the numerical point prevalence of that entry or the median of its prevalence class; if there is no "worldwide" entry but there are prevalence entries reported for a country or region, using the average point prevalence of those entries as the point prevalence of the disease;
(2) if the prevalence type of the disease is annual incidence, obtaining the average age of onset and average age of death of the disease from the Orphanet knowledge base, taking the period from the age of onset to the age of death as the duration of the disease, and estimating the point prevalence of the disease by multiplying the annual incidence by the duration of the disease;
(3) if the prevalence type of the disease is number of cases, directly using the number of cases divided by the world population as the point prevalence of the disease;
(4) if the prevalence type of the disease is birth prevalence and the age of onset of the disease is in the neonatal period, using the birth prevalence multiplied by the proportion of the world population that falls within the age range spanned by the disease course as the point prevalence of the disease;
(5) if the prevalence type of the disease is lifetime prevalence, using a default value of one in a million as the point prevalence of the disease.
6. The rare disease visual question-answering type auxiliary differential diagnosis system according to claim 1, wherein the annotation propagation rule is applied to filter out some phenotypes during phenotype query recommendation, comprising:
a phenotype that has already been reported is not asked about in later dialogue rounds;
for a reported positive phenotype, its ancestor phenotypes are taken to be positive by default, are filtered out, and are no longer asked about;
for a reported negative phenotype, its descendant phenotypes are taken to be negative by default, are filtered out, and are no longer asked about.
7. A rare disease auxiliary differential diagnosis method using the system of any one of claims 1-6, comprising:
(1) the user enters patient information in the patient information acquisition sub-module of the front-end UI module, the patient information comprising the patient's age, gender at birth, positive phenotypes and negative phenotypes;
(2) the dialogue diagnosis service module selects, according to the current dialogue state, the phenotype with the maximum fusion metric AIGGI and queries it; after the user answers 'exist', 'not exist' or 'uncertain' for that phenotype, the dialogue enters the next round;
during the question-answering process, the user can obtain various visual information, including the disease priority assessment, the current dialogue state, the patient phenotype state, the dialogue history and the like, from the visual question-answering differential diagnosis sub-module of the front-end UI module;
(3) step (2) is repeated until the uncertainty of the disease recommendation result falls below a threshold or the maximum number of dialogue rounds is reached, whereupon the system gives disease recommendations according to the posterior probabilities of the diseases and ends the dialogue.
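The loop in steps (2) and (3) can be sketched as follows. This is a sketch under stated assumptions, not the patented method: the fusion metric AIGGI is not defined in this section, so plain expected information gain stands in for it; uncertainty is measured as Shannon entropy of the disease posterior; and the likelihood table, function names, and default threshold are all hypothetical.

```python
import math
from typing import Callable, Dict, List, Tuple

Posterior = Dict[str, float]
Likelihood = Dict[str, Dict[str, float]]  # disease -> phenotype -> P(present | disease)


def entropy(dist: Posterior) -> float:
    """Shannon entropy in bits, used here as the uncertainty measure."""
    return -sum(p * math.log2(p) for p in dist.values() if p > 0)


def update(post: Posterior, lik: Likelihood, phenotype: str, answer: str) -> Posterior:
    """Bayesian update for one answer; 'uncertain' leaves the posterior unchanged."""
    if answer == "uncertain":
        return dict(post)
    weights = {}
    for disease, p in post.items():
        q = lik[disease].get(phenotype, 0.0)
        weights[disease] = p * (q if answer == "exist" else 1.0 - q)
    total = sum(weights.values())
    return {d: w / total for d, w in weights.items()} if total > 0 else dict(post)


def expected_gain(post: Posterior, lik: Likelihood, phenotype: str) -> float:
    """Expected entropy reduction from asking about one phenotype (AIGGI stand-in)."""
    p_yes = sum(p * lik[d].get(phenotype, 0.0) for d, p in post.items())
    gain = entropy(post)
    for answer, p_answer in (("exist", p_yes), ("not exist", 1.0 - p_yes)):
        if p_answer > 0:
            gain -= p_answer * entropy(update(post, lik, phenotype, answer))
    return gain


def run_dialogue(prior: Posterior, lik: Likelihood,
                 answer_fn: Callable[[str], str],
                 threshold: float = 0.5,
                 max_rounds: int = 10) -> Tuple[List[Tuple[str, float]], List[str]]:
    """Ask questions until uncertainty is low enough or the round limit is hit."""
    post = dict(prior)
    asked: List[str] = []
    all_phenotypes = {ph for row in lik.values() for ph in row}
    for _ in range(max_rounds):
        if entropy(post) < threshold:
            break
        remaining = sorted(all_phenotypes - set(asked))
        if not remaining:
            break
        question = max(remaining, key=lambda ph: expected_gain(post, lik, ph))
        asked.append(question)
        post = update(post, lik, question, answer_fn(question))
    # Recommend diseases in descending order of posterior probability.
    ranking = sorted(post.items(), key=lambda item: item[1], reverse=True)
    return ranking, asked
```

Here `answer_fn` plays the role of the user and must return 'exist', 'not exist' or 'uncertain' for the queried phenotype. In a fuller version, the `remaining` list would also be passed through the filtering rules of claim 6 before a question is chosen.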
Priority Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202211146285.0A CN115482926B (en) | 2022-09-20 | 2022-09-20 | Knowledge-driven rare disease visual question-answer type auxiliary differential diagnosis system and method |
PCT/CN2023/077386 WO2024060508A1 (en) | 2022-09-20 | 2023-02-21 | Knowledge-driven system and method for visualized question-and-answer assisted differential diagnosis of rare disease |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202211146285.0A CN115482926B (en) | 2022-09-20 | 2022-09-20 | Knowledge-driven rare disease visual question-answer type auxiliary differential diagnosis system and method |
Publications (2)
Publication Number | Publication Date |
---|---|
CN115482926A (en) | 2022-12-16 |
CN115482926B (en) | 2024-04-09 |
Family
ID=84392464
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202211146285.0A Active CN115482926B (en) | 2022-09-20 | 2022-09-20 | Knowledge-driven rare disease visual question-answer type auxiliary differential diagnosis system and method |
Country Status (2)
Country | Link |
---|---|
CN (1) | CN115482926B (en) |
WO (1) | WO2024060508A1 (en) |
Families Citing this family (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN115482926B (en) * | 2022-09-20 | 2024-04-09 | 浙江大学 | Knowledge-driven rare disease visual question-answer type auxiliary differential diagnosis system and method |
CN118093252B (en) * | 2024-04-28 | 2024-08-09 | 浪潮云信息技术股份公司 | Database diagnosis method and device of cloud platform |
Citations (10)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107408143A (en) * | 2014-12-16 | 2017-11-28 | L.I.A.迪朱塞佩·卡帕索公司 | Medical differential diagnosis device adapted to determine the optimal sequence of diagnostic tests for identifying a pathology using diagnostic appropriateness criteria |
CN111899884A (en) * | 2020-06-23 | 2020-11-06 | 北京左医科技有限公司 | Intelligent auxiliary inquiry method, device and storage medium |
CN111984771A (en) * | 2020-07-17 | 2020-11-24 | 北京欧应信息技术有限公司 | Automatic inquiry system based on intelligent conversation |
CN112135564A (en) * | 2018-05-23 | 2020-12-25 | 松下知识产权经营株式会社 | Method, program, device and system for evaluating ingestion swallowing function |
CN113272912A (en) * | 2018-10-22 | 2021-08-17 | 杰克逊实验室 | Methods and apparatus for phenotype-driven clinical genomics using likelihood ratio paradigm |
WO2021211326A1 (en) * | 2020-04-16 | 2021-10-21 | Ix Layer Inc. | Systems and methods for access management and clustering of genomic, phenotype, and diagnostic data |
CN113707299A (en) * | 2021-08-27 | 2021-11-26 | 平安科技(深圳)有限公司 | Auxiliary diagnosis method and device based on inquiry session and computer equipment |
CN113889259A (en) * | 2021-09-06 | 2022-01-04 | 浙江工业大学 | Automatic diagnosis dialogue system under assistance of knowledge graph |
CN113889265A (en) * | 2021-10-15 | 2022-01-04 | 浙江大学 | Rare disease auxiliary reasoning method and system based on phenotype visualization |
CN114328864A (en) * | 2021-12-14 | 2022-04-12 | 合肥奥比斯科技有限公司 | Ophthalmic question-answering system based on artificial intelligence and knowledge graph |
Family Cites Families (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
DE102011016691A1 (en) * | 2011-04-11 | 2012-10-11 | medPad GmbH | Method and system for assisting in the selection of at least one object from a group of stored objects |
US11404170B2 (en) * | 2016-04-18 | 2022-08-02 | Soap, Inc. | Method and system for patients data collection and analysis |
CN109147930A (en) * | 2017-06-28 | 2019-01-04 | 京东方科技集团股份有限公司 | Divide and examines dialogue method, divides and examine conversational device and system |
CN107577907B (en) * | 2017-09-08 | 2021-04-02 | 成都奇恩生物科技有限公司 | Rare disease auxiliary diagnosis system based on Internet and use method |
KR102147847B1 (en) * | 2018-11-29 | 2020-08-25 | 가천대학교 산학협력단 | Data analysis methods and systems for diagnosis aids |
CN112270988B (en) * | 2020-12-04 | 2022-07-29 | 厦门基源医疗科技有限公司 | Auxiliary diagnosis method for rare diseases |
CN115482926B (en) * | 2022-09-20 | 2024-04-09 | 浙江大学 | Knowledge-driven rare disease visual question-answer type auxiliary differential diagnosis system and method |
- 2022-09-20: CN application CN202211146285.0A, published as CN115482926B, status Active
- 2023-02-21: WO application PCT/CN2023/077386, published as WO2024060508A1, status unknown
Non-Patent Citations (4)
Title |
---|
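A novel phenotype-oriented dialogue system supporting differential diagnosis of rare disease; Jian Yang et al.; Computers in Biology and Medicine; 2024-01-02; vol. 169; pp. 1-9 * |
Design of a knowledge-graph-based decision engine for rare disease care-seeking; Chen Yilong et al.; West China Medical Journal (华西医学); 2021-12-28; vol. 36, no. 12; pp. 1730-1733 * |
Disease auxiliary diagnosis method based on a domain semantic knowledge base; Chen Deyan, Zhao Hong, Zhang Xia; Journal of Software (软件学报); 2020-10-14; no. 10; pp. 3167-3183 * |
Analysis and design of a breast cancer classification system fusing multimodal features; Guo Xingchen; China Master's Theses Full-text Database (Medicine and Health Sciences) (中国优秀硕士学位论文全文数据库 (医药卫生科技辑)); 2021-12-15; no. 12; pp. E072-285 * |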
Also Published As
Publication number | Publication date |
---|---|
WO2024060508A1 (en) | 2024-03-28 |
CN115482926A (en) | 2022-12-16 |
Similar Documents
Publication | Title | Publication Date |
---|---|---|
CN115482926B (en) | Knowledge-driven rare disease visual question-answer type auxiliary differential diagnosis system and method | |
CN109935336B (en) | Intelligent auxiliary diagnosis system for respiratory diseases of children | |
CN111951965B (en) | Panoramic health dynamic monitoring and predicting system based on time sequence knowledge graph | |
Qin et al. | Machine learning models for data-driven prediction of diabetes by lifestyle type | |
CN117649949B (en) | Clinical thinking data generation system and method based on reinforcement learning | |
Choubey et al. | Rule based diagnosis system for diabetes | |
Ravuri et al. | Learning from the experts: From expert systems to machine-learned diagnosis models | |
CN105718726A (en) | Medical auxiliary examination system knowledge acquisition and inference method based on rough set | |
CN111143573A (en) | Method for predicting target node of knowledge graph based on user feedback information | |
Pingle | Evaluation of mental stress using predictive analysis | |
de Andrade et al. | Hybrid model for early identification post-Covid-19 sequelae | |
CN114093506B (en) | System for assisting disease reasoning and storage medium | |
Pal et al. | Generic disease prediction using symptoms with supervised machine learning | |
Chen et al. | A multi-channel convolutional neural network for ICD coding | |
He et al. | Scalable online disease diagnosis via multi-model-fused actor-critic reinforcement learning | |
Goodman et al. | Caregiver assessment using smart gaming technology: A feasibility study | |
Lucas et al. | Development and validation of HEPAR, an expert system for the diagnosis of disorders of the liver and biliary tract | |
Ravindranath | Clinical Decision Support System for heart diseases using Extended sub tree | |
Khalaf et al. | Predicting Acute Respiratory Failure Using Fuzzy Classifier | |
Lucas et al. | An ExplainableFair Framework for Prediction of Substance Use Disorder Treatment Completion | |
Almadni et al. | Comparative analysis of classification models in diagnosis of type 2 diabetes. | |
Mehra et al. | Generating quality IF-THEN rules for diabetes using linguistic summarization | |
Reynolds | Statistical Learning Methods for Electronic Health Record Data | |
Nam | Predicting Diabetes Using Tree-based Methods | |
Laurikkala | Knowledge discovery for female urinary incontinence expert system |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||