WO2021175038A1 - Patient data visualization method and system for assisting decision-making in chronic diseases - Google Patents

Patient data visualization method and system for assisting decision-making in chronic diseases

Info

Publication number
WO2021175038A1
Authority
WO
WIPO (PCT)
Prior art keywords
patient
data
knowledge
feature
chronic disease
Prior art date
Application number
PCT/CN2021/073125
Other languages
English (en)
French (fr)
Inventor
李劲松
朱世强
周天舒
田雨
Original Assignee
之江实验室
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 之江实验室
Publication of WO2021175038A1
Priority to US17/553,832 (US11521751B2)

Classifications

    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H50/00 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H50/00 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
    • G16H50/50 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for simulation or modelling of medical disorders
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/36 Creation of semantic tools, e.g. ontology or thesauri
    • G06F16/367 Ontology
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/245 Classification techniques relating to the decision surface
    • G06F18/2451 Classification techniques relating to the decision surface linear, e.g. hyperplane
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N5/00 Computing arrangements using knowledge-based models
    • G06N5/01 Dynamic search techniques; Heuristics; Dynamic trees; Branch-and-bound
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H10/00 ICT specially adapted for the handling or processing of patient-related medical or healthcare data
    • G16H10/60 ICT specially adapted for the handling or processing of patient-related medical or healthcare data for patient-specific data, e.g. for electronic patient records
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H50/00 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
    • G16H50/20 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for computer-aided diagnosis, e.g. based on medical expert systems
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H50/00 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
    • G16H50/30 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for calculating health indices; for individual health risk assessment
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H70/00 ICT specially adapted for the handling or processing of medical references
    • G16H70/20 ICT specially adapted for the handling or processing of medical references relating to practices or guidelines
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H50/00 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
    • G16H50/70 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for mining of medical data, e.g. analysing previous cases of other patients

Definitions

  • The invention belongs to the technical field of medical treatment and data visualization, and in particular relates to a patient data visualization method and system for assisting decision-making in chronic diseases.
  • Chronic diseases, also known as chronic non-communicable diseases, mainly include cardiovascular and cerebrovascular diseases (hypertension, coronary heart disease, stroke), diabetes, and chronic respiratory diseases. Some of their characteristics have not yet been fully confirmed. With China's rapid economic development and changes in residents' lifestyles, the number of chronic disease cases and deaths continues to increase, and the disease burden on the population is becoming heavier. Chronic disease has become one of the major public health problems that seriously threaten residents' health and affect the country's economic and social development. Statistics show that chronic diseases account for about 86% of disease mortality and 76% of the disease burden in China. Chronic diseases are difficult to cure, and their control relies mainly on patients' long-term self-management of their health.
  • Existing chronic disease data visualization technology mainly presents the data recorded during patients' daily management as charts within an application.
  • The more commonly used chart components are histograms, line charts, circular charts, and the like, which are simple, easy to understand, and readily accepted by users.
  • On a mobile terminal, the patient uses multi-touch technology to zoom in, zoom out, rotate, and pan the picture.
  • Visualization for auxiliary decision support is mainly displayed in forms such as importance rankings and correlation matrices. This is not a user-friendly visualization solution, and the information conveyed is not comprehensive or rich enough. The shortcomings of existing chronic disease data visualization solutions are as follows:
  • The health management plan does not comprehensively consider the various types of data, such as the patient's personal physical condition and exercise and diet habits, so decision-making lacks individualization, and patient data is not fully used when formulating and recommending the content of the management plan.
  • The present invention combines semantic technology, clinical decision support technology, and visualization technology to propose a patient data visualization method and system for assisting decision-making in chronic diseases.
  • The invention constructs a chronic disease knowledge graph, combines the patient's static and dynamic data to construct a model diagram of the patient's management data on a hyperplane, and then projects it onto a two-dimensional plane. The Euclidean distances between features on the two-dimensional plan of the patient information model are compared with the distances between the standard features (taken from the two-dimensional plan of a patient diagnosed by a doctor as having a good management effect), and a management plan is generated and recommended by combining the concepts of the path nodes with the attribute relationships between the concepts.
  • Fusing patient information models with the chronic disease knowledge graph makes full use of the semantic information of each feature to display, comprehensively and systematically, the importance and association of the various risk factors on a two-dimensional plane in the form of position and color. The effectiveness of a patient's chronic disease management is evaluated through geometric positions, and paths are then used to formulate a personalized health management plan that helps the patient improve chronic disease management capability from multiple dimensions.
  • The specific implementation of the patient data visualization method for assisting decision-making in chronic diseases proposed in this application includes the following steps:
  • Hyperplane feature map drawing: the patient information model is transformed into a hyperplane feature map through a distributed representation, which uses a translation-based model between entity vectors and relation vectors;
  • Two-dimensional plane mapping: the position of each node on the two-dimensional plane corresponds to the two-dimensional position of the hyperplane feature map of the patient information model after dimensionality reduction. Node color distinguishes the different information categories in the knowledge graph; the feature importance ranking of the Regularized Gradient Boosted Decision Tree algorithm is used as the ranking of the correlation between each node and disease progression, and the feature weight values are used as the weights in the Euclidean distance calculation;
  • The knowledge content of the knowledge graph covers disease diagnosis, examination items, physical signs, related diseases, therapeutic drugs, living habits, measurement units, and detection quantities.
  • The patient information collected in step (2) includes patient health data entered manually on mobile terminals in daily use or collected by wearable devices, and patient electronic medical record data recorded by the regional chronic disease management center.
  • In step (2), during the RDF conversion of patient data, D2R semantic mapping technology is used to map the data in the relational database to RDF format;
  • D2R includes the D2R Server, the D2RQ Engine, and the D2RQ Mapping language;
  • The D2RQ Mapping language defines the mapping rules for converting relational data into RDF format;
  • The D2RQ Engine uses a customized D2RQ Mapping file to complete the data mapping, which specifically means mapping the tables and fields of the relational database to the classes and attributes in the OWL file; the relationships between classes are derived from the tables that represent relationships.
  • In step (3), drawing the hyperplane feature map specifically includes the following sub-steps:
  • The knowledge in the patient information model is stored as triples (h, r, t), where h is the head entity vector, r is the relation vector, and t is the tail entity vector;
  • The set of triples constitutes a directed graph in which nodes represent entities and edges represent different types of relationships; edges are directed to indicate that relationships are asymmetric;
  • The TransH model is used to construct distributed entity vectors for reflexive, many-to-one, one-to-many, and many-to-many relationships;
  • The objective function to be minimized is the margin-based loss
    $$L = \sum_{(h,r,t)\in S}\ \sum_{(h',r',t')\in S'} \left[ f_r(h,t) + \gamma - f_{r'}(h',t') \right]_+$$
    where $f_r$ is the TransH score function, S is the set of triples in the knowledge base, S′ is the set of negatively sampled triples, and γ is a margin (separation distance) parameter with a value greater than 0;
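  • In step (4), two-dimensional plane mapping specifically includes the following sub-steps:
  • Step 1: Assume that the data set X contains N data points, each data point $x_i$ has dimension D, and the data is reduced to two dimensions, that is, all data is represented on a plane. The conditional probability $P_{j|i}$ of similarity between the high-dimensional data points $x_i$ and $x_j$ is
    $$P_{j|i} = \frac{\exp\left(-\|x_i - x_j\|^2 / 2\sigma_i^2\right)}{\sum_{k \neq i} \exp\left(-\|x_i - x_k\|^2 / 2\sigma_i^2\right)}$$
    where $\sigma_i^2$ is the variance of the Gaussian centered on the data point $x_i$;
  • Step 2: Calculate the conditional probability of similarity between data points in the low-dimensional space. For the low-dimensional points $y_i$, $y_j$ corresponding to the high-dimensional data points $x_i$, $x_j$, the conditional probability $Q_{j|i}$, modeled with a t-distribution, is
    $$Q_{j|i} = \frac{\left(1 + \|y_i - y_j\|^2\right)^{-1}}{\sum_{k \neq i} \left(1 + \|y_i - y_k\|^2\right)^{-1}}$$
  • Step 3: Minimize the difference between the conditional probabilities, that is, make $Q_{j|i}$ approximate $P_{j|i}$. The loss function is
    $$C = \sum_i KL(P_i \,\|\, Q_i) = \sum_i \sum_j P_{j|i} \log \frac{P_{j|i}}{Q_{j|i}}$$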
  • Feature importance ranking: the Regularized Gradient Boosted Decision Tree algorithm is used to rank the importance of each entity in the knowledge graph and to obtain the feature weight values, specifically:
  • The data set consists of patient information models with known chronic disease management effects or outcomes. Each sample contains n-dimensional features, where n is the number of entities in the patient information model;
  • The objective function L of the Regularized Gradient Boosted Decision Tree consists of a loss function and a complexity term, and is defined as:
    $$L = \sum_i l(y_i, \hat{y}_i) + \sum_k \Omega(f_k), \qquad \Omega(f_k) = \gamma T + \frac{1}{2}\lambda \|w\|^2$$
    where i indexes the i-th sample, k indexes the k-th tree, $y_i$ is the label value and $\hat{y}_i$ the predicted value of the i-th sample, T is the number of leaf nodes, w is the vector of leaf weight values, γ is the penalty coefficient on the number of leaves, which has a pruning effect, λ is the penalty coefficient on the leaf weights, which prevents overfitting, and $\sum_k \Omega(f_k)$ is the complexity function of the trees;
  • The feature importance score is obtained by summing, over all trees, the gain a feature contributes each time it is used to split a node; the corresponding feature weight values are obtained by calling the get_score method of the booster.
  • The parameters of the Regularized Gradient Boosted Decision Tree are tuned using grid search and include general parameters, booster parameters, and learning task parameters: the general parameters control the overall (macro) behavior, the booster parameters control the boosting at each step, and the learning task parameters control the performance of the training objective.
  • In step (5), the SPARQL query language and Jena rules are used to reason over the knowledge in the knowledge graph according to the distance information of the features, and a personalized management plan for the patient is generated.
  • A SPARQL query statement contains the information to be queried and the conditions that it should meet.
  • The conditions appear in the form of triples arranged in the order <subject, predicate, object>; the result of the query is in fact the result of matching the condition triples against the RDF triples in the data file.
  • The patient data visualization system for assisting decision-making in chronic diseases proposed in this application includes the following modules:
  • Chronic disease knowledge graph building module: clinical guidelines and knowledge documents related to chronic diseases are used as the knowledge source of the knowledge graph. Data semantics are uniquely identified through SNOMED CT; classes, attributes, and instances are manually constructed; data relationships and attribute relationships are added; and the knowledge graph prototype file is generated;
  • Patient information model building module: patient information is collected, and the data in the patient database is converted into RDF triples that comply with the OWL language specification. The nodes of the patient information model are identified with SNOMED CT, which realizes the semantic extension of patient data to domain knowledge, and the patient information model is constructed by fusing patient information with the chronic disease knowledge graph;
  • Hyperplane feature map drawing module: the patient information model is transformed into a hyperplane feature map through a distributed representation, which uses a translation-based model between entity vectors and relation vectors;
  • Two-dimensional plane mapping module: the position of each node on the two-dimensional plane corresponds to the two-dimensional position of the hyperplane feature map of the patient information model after dimensionality reduction. Node color distinguishes the different information categories in the knowledge graph; the feature importance ranking of the Regularized Gradient Boosted Decision Tree algorithm is used as the ranking of the correlation between each node and disease progression, and the feature weight values are used as the weights in the Euclidean distance calculation;
  • Decision support feedback module: a patient information model marked by domain experts as showing an ideal chronic disease management effect is taken as the standard. Its two-dimensional plane mapping image is drawn through distributed representation and dimensionality-reduction visualization, and the Euclidean distances between the geometric centers of the feature areas in the mapping image, calculated in combination with the feature weight values, are used as the standardized management target. The Euclidean distances between the features in the two-dimensional plane mapping image of the patient who needs decision support feedback are calculated in the same way, combined with the feature weight values, and compared with the standard values to find a path with similar distances; the knowledge in the knowledge graph is then obtained according to the distance information of the features.
  • The beneficial effect of the present invention is that, compared with existing visualization schemes for chronic disease data, it combines the knowledge graph to construct two-dimensional plane mapping images of the various types of patient data, which is a user-friendly visualization scheme.
  • The relationships, characteristics, and importance of patient information are expressed through node distances and colors on a two-dimensional plane, conveying comprehensive and rich information.
  • The present invention realizes a long-term, continuous, cyclic, and progressively improving visual health management service that covers the whole process from health information collection and health evaluation to health promotion.
  • Figure 1 is a flowchart of the patient data visualization method for assisting decision-making in chronic diseases according to the present invention;
  • Figure 2 is a schematic diagram of the TransH model;
  • Figure 3 is a schematic diagram of the two-dimensional plane mapping.
  • The patient data visualization method and system for assisting decision-making in chronic diseases proposed in this application can help patients better understand their personal health and disease intervention status, and help doctors review patients' conditions and formulate health management plans more efficiently.
  • The specific implementation of the method of the present invention includes the following steps:
  • Clinical guidelines and knowledge documents related to chronic diseases serve as the knowledge source of the knowledge graph.
  • The knowledge content covers disease diagnosis, examination items, physical signs, related diseases, therapeutic drugs, living habits, and so on, and also includes auxiliary medical terms such as measurement units and detection quantities.
  • SNOMED CT (Systematized Nomenclature of Medicine - Clinical Terms) is selected to uniquely identify data semantics; classes, attributes, instances, and other information are manually constructed, data relationships and attribute relationships are added, and the prototype file of the knowledge graph is generated.
  • The patient information collected includes patient health data entered manually on mobile terminals in daily use or collected by wearable devices, and patient electronic medical record data recorded by the regional chronic disease management center.
  • During the RDF conversion of patient data, D2R (Database to RDF) semantic mapping technology is used to map the data in the relational database to RDF format.
  • D2R mainly includes the D2R Server, the D2RQ Engine, and the D2RQ Mapping language.
  • The D2RQ Mapping language defines the mapping rules for converting relational data into RDF format.
  • The D2RQ Engine uses a customized D2RQ Mapping file to complete the data mapping. Specifically, the tables and fields of the relational database are mapped to the classes and attributes in the OWL file, and the relationships between classes are derived from the tables that represent relationships.
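  • The table-to-class and field-to-attribute mapping described above can be illustrated with a minimal sketch. It does not use D2RQ itself; it relies only on Python's built-in sqlite3 module and represents triples as plain tuples. The table name `patient`, its columns, and the `http://example.org/` namespace are hypothetical and chosen only for illustration.

```python
import sqlite3

BASE = "http://example.org/"  # hypothetical namespace, not from the source

def rows_to_triples(conn, table, key_column):
    """Map each row of `table` to (subject, predicate, object) triples.

    The table name becomes the class, each column becomes a property,
    and the key column identifies the subject resource.
    """
    cur = conn.execute(f"SELECT * FROM {table}")
    columns = [d[0] for d in cur.description]
    triples = []
    for row in cur:
        record = dict(zip(columns, row))
        subject = f"{BASE}{table}/{record[key_column]}"
        triples.append((subject, "rdf:type", f"{BASE}{table.capitalize()}"))
        for column, value in record.items():
            if column != key_column and value is not None:
                triples.append((subject, f"{BASE}{table}#{column}", value))
    return triples

# Tiny in-memory example database (illustrative data only).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE patient (id INTEGER PRIMARY KEY, systolic_bp INTEGER, smoker TEXT)"
)
conn.execute("INSERT INTO patient VALUES (1, 150, 'yes')")
triples = rows_to_triples(conn, "patient", "id")
for t in triples:
    print(t)
```

  • In a real deployment the same correspondence would be declared in a D2RQ Mapping file rather than written by hand; the sketch only shows the shape of the resulting triples.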
  • SNOMED CT is used to identify the nodes of the patient information model, which realizes the semantic extension of patient data to domain knowledge; the patient information model is constructed by fusing patient information with the chronic disease knowledge graph.
  • The patient information model is transformed into a hyperplane feature map through a distributed representation.
  • The distributed representation uses a translation-based model between entity vectors and relation vectors.
  • Step 1: Use the TransH model to encode triples into distributed vectors in space, as shown in Figure 2.
  • When constructing negative samples, TransH replaces the head and tail entities with different probabilities according to the type of relationship r (one-to-one, one-to-many, many-to-one, or many-to-many). For example, for a one-to-many relationship, replacing the head entity is more likely to yield a valid negative sample than replacing the tail entity, so the head entity is replaced with greater probability.
  • Specifically, TransH first computes the average number of tail entities per head entity, tph, and the average number of head entities per tail entity, hpt, and then defines a Bernoulli distribution: the head entity is replaced with probability $\frac{tph}{tph + hpt}$ and the tail entity with probability $\frac{hpt}{tph + hpt}$.
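  • The Bernoulli negative-sampling rule can be sketched in a few lines of plain Python. This is an illustrative sketch, not the patent's implementation; the entity and relation names are hypothetical, and the sketch does not filter out corrupted triples that happen to exist in the knowledge base.

```python
import random
from collections import defaultdict

def bernoulli_head_prob(triples):
    """Return, per relation, the probability of replacing the head entity.

    tph = average number of tails per head, hpt = average number of heads
    per tail; the head is replaced with probability tph / (tph + hpt).
    """
    tails_of = defaultdict(set)   # (relation, head) -> set of tails
    heads_of = defaultdict(set)   # (relation, tail) -> set of heads
    for h, r, t in triples:
        tails_of[(r, h)].add(t)
        heads_of[(r, t)].add(h)
    probs = {}
    for r in {r for _, r, _ in triples}:
        t_counts = [len(v) for (rel, _), v in tails_of.items() if rel == r]
        h_counts = [len(v) for (rel, _), v in heads_of.items() if rel == r]
        tph = sum(t_counts) / len(t_counts)
        hpt = sum(h_counts) / len(h_counts)
        probs[r] = tph / (tph + hpt)
    return probs

def corrupt(triple, entities, head_prob, rng=random):
    """Build one negative sample by replacing either the head or the tail."""
    h, r, t = triple
    if rng.random() < head_prob[r]:
        h = rng.choice([e for e in entities if e != h])
    else:
        t = rng.choice([e for e in entities if e != t])
    return (h, r, t)

# Hypothetical one-to-many relation: one disease, three symptoms.
kb = [("hypertension", "has_symptom", s) for s in ("headache", "dizziness", "fatigue")]
probs = bernoulli_head_prob(kb)
print(probs)  # tph = 3, hpt = 1, so the head is replaced with probability 0.75
```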
  • The knowledge in the patient information model is stored as triples (h, r, t), where h is the head entity vector, r is the relation vector, and t is the tail entity vector.
  • The set of triples forms a directed graph in which nodes represent entities and edges represent different types of relationships; edges are directed to indicate that relationships are asymmetric.
  • The TransH model can construct distributed entity vectors for reflexive, many-to-one, one-to-many, and many-to-many relationships.
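  • Step 2: Optimize the objective function. For each relation r in the TransH model, assume there is a corresponding hyperplane on which the relation lies. The translation vector of r on the hyperplane is denoted $d_r$, and the normal vector of the hyperplane is denoted $w_r$, with $\|w_r\|_2 = 1$.
  • Let $h_\perp$ and $t_\perp$ denote the projections of h and t onto the hyperplane, respectively; then:
    $$h_\perp = h - w_r^\top h\, w_r, \qquad t_\perp = t - w_r^\top t\, w_r$$
    and the score function of a triple is
    $$f_r(h,t) = \|h_\perp + d_r - t_\perp\|_2^2$$
  • The objective function to be minimized is
    $$L = \sum_{(h,r,t)\in S}\ \sum_{(h',r',t')\in S'} \left[ f_r(h,t) + \gamma - f_{r'}(h',t') \right]_+$$
    where S is the set of triples in the knowledge base, S′ is the set of negatively sampled triples, γ is a margin (interval distance) parameter with a value greater than 0, which is a hyperparameter, and $[x]_+$ is the positive-part function: $[x]_+ = x$ if $x > 0$, and $[x]_+ = 0$ otherwise.
  • A relatively low score function value for two nodes means that they are relatively close, and a relatively high value means that they are far apart.
  • The objective function is optimized using stochastic gradient descent (SGD).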
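  • As a minimal sketch of the TransH projection, score function, and margin loss, the following plain-Python code evaluates them on toy vectors. The vector values are hypothetical, and the loss here pairs each positive triple with one sampled negative, a common training simplification of the double sum over S and S′.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def project(v, w):
    """Project v onto the hyperplane with unit normal w: v - (w . v) w."""
    s = dot(w, v)
    return [vi - s * wi for vi, wi in zip(v, w)]

def score(h, t, d_r, w_r):
    """TransH score f_r(h, t) = || h_perp + d_r - t_perp ||_2^2."""
    hp, tp = project(h, w_r), project(t, w_r)
    return sum((a + d - b) ** 2 for a, d, b in zip(hp, d_r, tp))

def margin_loss(pos_scores, neg_scores, gamma):
    """Sum of [f(pos) + gamma - f(neg)]_+ over paired samples."""
    return sum(max(0.0, p + gamma - n) for p, n in zip(pos_scores, neg_scores))

# Toy 3-D vectors (hypothetical values); w_r is a unit normal.
w_r = [0.0, 0.0, 1.0]
d_r = [1.0, 0.0, 0.0]
h = [0.0, 0.0, 5.0]
t_good = [1.0, 0.0, -2.0]   # h_perp + d_r equals t_perp, so the score is 0
t_bad = [0.0, 3.0, 0.0]

pos = score(h, t_good, d_r, w_r)
neg = score(h, t_bad, d_r, w_r)
print(pos, neg, margin_loss([pos], [neg], gamma=1.0))
```

  • Because the components along the normal $w_r$ are removed before the translation is applied, entities that differ only along the normal receive the same score, which is what lets TransH handle one-to-many and many-to-many relationships.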
  • The position of each node on the two-dimensional plane corresponds to the two-dimensional position of the hyperplane feature map of the patient information model after dimensionality reduction.
  • Node color distinguishes the different information categories in the knowledge graph, and the feature importance ranking of the Regularized Gradient Boosted Decision Tree algorithm is used as the ranking of the correlation between each node and disease progression.
  • The feature weight values are used as the weights in the Euclidean distance calculation.
  • Dimensionality reduction is performed with the t-SNE (t-distributed Stochastic Neighbor Embedding) algorithm.
  • t-SNE is a machine learning method for dimensionality reduction that helps identify associated patterns.
  • The main advantage of t-SNE is its ability to preserve local structure: points that are similar in the high-dimensional data space remain close together after projection to the low-dimensional space. This also makes t-SNE well suited to producing attractive visualizations.
  • The t-SNE algorithm models the distribution of the nearest neighbors of each data point, where the nearest neighbors are the set of data points close to that point.
  • In the high-dimensional space, this distribution is modeled as a Gaussian distribution.
  • In the two-dimensional output space, it is modeled as a t-distribution.
  • The goal of the process is to find a transformation that maps the high-dimensional space to the two-dimensional space while minimizing the gap between these two distributions over all points.
  • Compared with the Gaussian distribution, the t-distribution has a longer tail, which helps the data points spread more evenly in the two-dimensional space.
  • Step 1: Assume that the data set X contains N data points, each data point $x_i$ has dimension D, and the data is reduced to d dimensions, where d = 2, which means that all data is represented on a plane. Calculate the conditional probability of similarity between data points in the high-dimensional space.
  • The high-dimensional Euclidean distance between data points is converted into a conditional probability representing similarity.
  • The conditional probability $P_{j|i}$ of similarity between the high-dimensional data points $x_i$ and $x_j$ is as follows:
    $$P_{j|i} = \frac{\exp\left(-\|x_i - x_j\|^2 / 2\sigma_i^2\right)}{\sum_{k \neq i} \exp\left(-\|x_i - x_k\|^2 / 2\sigma_i^2\right)}$$
    where $\sigma_i^2$ is the variance of the Gaussian centered on the data point $x_i$.
  • Step 2: Calculate the conditional probability of similarity between data points in the low-dimensional space. For the low-dimensional points $y_i$, $y_j$ corresponding to the high-dimensional data points $x_i$, $x_j$, the conditional probability $Q_{j|i}$, modeled with a t-distribution, is as follows:
    $$Q_{j|i} = \frac{\left(1 + \|y_i - y_j\|^2\right)^{-1}}{\sum_{k \neq i} \left(1 + \|y_i - y_k\|^2\right)^{-1}}$$
  • Step 3: Minimize the difference between the conditional probabilities, that is, make $Q_{j|i}$ approximate $P_{j|i}$.
  • This step is achieved by minimizing the Kullback-Leibler divergence (KL divergence) between the two conditional probability distributions:
    $$C = \sum_i KL(P_i \,\|\, Q_i) = \sum_i \sum_j P_{j|i} \log \frac{P_{j|i}}{Q_{j|i}}$$
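  • The three steps can be checked numerically with a small plain-Python sketch. It assumes a single shared bandwidth `sigma` for every point and a Student-t kernel with one degree of freedom for the low-dimensional similarities; both are assumptions made for illustration, and the data points are hypothetical. The sketch evaluates the cost of a given embedding and does not perform the gradient-based optimization.

```python
import math

def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def p_conditional(X, i, sigma):
    """P_{j|i}: Gaussian similarities around X[i], normalised over j != i."""
    w = [0.0 if j == i else math.exp(-sq_dist(X[i], X[j]) / (2 * sigma ** 2))
         for j in range(len(X))]
    total = sum(w)
    return [v / total for v in w]

def q_conditional(Y, i):
    """Q_{j|i}: Student-t (one degree of freedom) similarities around Y[i]."""
    w = [0.0 if j == i else 1.0 / (1.0 + sq_dist(Y[i], Y[j]))
         for j in range(len(Y))]
    total = sum(w)
    return [v / total for v in w]

def kl_cost(X, Y, sigma):
    """C = sum_i sum_j P_{j|i} * log(P_{j|i} / Q_{j|i})."""
    cost = 0.0
    for i in range(len(X)):
        P, Q = p_conditional(X, i, sigma), q_conditional(Y, i)
        cost += sum(p * math.log(p / q) for p, q in zip(P, Q) if p > 0)
    return cost

# Two close points and one distant point in 3-D (hypothetical data).
X = [[0.0, 0.0, 0.0], [0.1, 0.0, 0.0], [5.0, 5.0, 5.0]]
Y_good = [[0.0, 0.0], [0.1, 0.0], [3.0, 3.0]]   # keeps the neighbours together
Y_bad = [[0.0, 0.0], [3.0, 3.0], [0.1, 0.0]]    # separates the neighbours
print(kl_cost(X, Y_good, sigma=1.0), kl_cost(X, Y_bad, sigma=1.0))
```

  • An embedding that keeps high-dimensional neighbors together yields a lower cost than one that separates them, which is the quantity the optimization drives down.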
  • The schematic diagram of the two-dimensional plane mapping is shown in Figure 3.
  • The figure shows the projection points of the entities of two different types of features, the approximate projection area of each type, and the center point of each type. The distance between the center points and the clustering of the projection points indicate the correlation between the features.
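  • The center-point distance can be computed as in the following minimal sketch, which takes the geometric center of each feature area as the mean of its projected points. The feature categories and coordinates are hypothetical.

```python
import math

def centroid(points):
    """Geometric centre of a list of 2-D points."""
    n = len(points)
    return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)

def center_distance(area_a, area_b):
    """Euclidean distance between the centres of two feature areas."""
    (ax, ay), (bx, by) = centroid(area_a), centroid(area_b)
    return math.hypot(ax - bx, ay - by)

# Hypothetical projected entities of two feature categories.
diet = [(0.0, 0.0), (2.0, 0.0), (1.0, 3.0)]
medication = [(4.0, 5.0), (6.0, 5.0), (5.0, 2.0)]
print(centroid(diet), centroid(medication), center_distance(diet, medication))
```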
  • Feature importance ranking uses the Regularized Gradient Boosted Decision Tree algorithm (XGBoost, eXtreme Gradient Boosting) to rank the importance of each entity in the knowledge graph and obtain the feature weight values.
  • The data set consists of patient information models with known chronic disease management effects or outcomes, and each sample contains n-dimensional features, where n is the number of entities in the patient information model.
  • The objective function L of the Regularized Gradient Boosted Decision Tree consists of a loss function and a complexity term, and is defined as:
    $$L = \sum_i l(y_i, \hat{y}_i) + \sum_k \Omega(f_k), \qquad \Omega(f_k) = \gamma T + \frac{1}{2}\lambda \|w\|^2$$
    where i indexes the i-th sample, k indexes the k-th tree, $y_i$ is the label value and $\hat{y}_i$ the predicted value of the i-th sample, T is the number of leaf nodes, w is the vector of leaf weight values, γ is the penalty coefficient on the number of leaves, which has a pruning effect, λ is the penalty coefficient on the leaf weights, which prevents overfitting, and $\sum_k \Omega(f_k)$ is the complexity function of the trees. The lower the complexity, the stronger the generalization ability of the model.
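  • A small numeric sketch makes the two parts of the objective concrete. It assumes a squared-error loss for $l$, which the text does not specify, and represents each tree only by its list of leaf weights; all values are hypothetical.

```python
def complexity(leaf_weights, gamma, lam):
    """Omega(f) = gamma * T + 0.5 * lambda * sum(w_j^2) for one tree."""
    T = len(leaf_weights)
    return gamma * T + 0.5 * lam * sum(w * w for w in leaf_weights)

def objective(y_true, y_pred, trees, gamma, lam):
    """Squared-error loss (an assumed choice of l) plus tree complexity."""
    loss = sum((y - p) ** 2 for y, p in zip(y_true, y_pred))
    return loss + sum(complexity(w, gamma, lam) for w in trees)

# Two hypothetical trees given by their leaf weights.
trees = [[0.5, -0.5], [1.0, 0.0, -1.0]]
print(objective([1.0, 0.0], [0.5, 0.5], trees, gamma=1.0, lam=2.0))
```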
  • The best split point is the split that gives the smallest objective function value after splitting, that is, the largest gain:
    $$Gain = \frac{1}{2}\left[\frac{G_L^2}{H_L+\lambda} + \frac{G_R^2}{H_R+\lambda} - \frac{(G_L+G_R)^2}{H_L+H_R+\lambda}\right] - \gamma$$
  • Gain can be regarded as the objective function value before the split minus the sum of the left and right objective function values after the split; therefore, if Gain < 0, the leaf node is not split. Here γ is the complexity cost introduced by adding a new leaf node, λ is the regularization coefficient on the leaf weights, $G_L$ and $H_L$ are the sums of the first and second derivatives of the loss over the samples in the left subtree, and $G_R$ and $H_R$ are the corresponding sums for the right subtree.
  • With this formula, the structure of a tree can be evaluated.
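  • The split criterion can be evaluated directly from the gradient statistics, as in this minimal sketch. The numeric values are hypothetical.

```python
def split_gain(G_L, H_L, G_R, H_R, lam, gamma):
    """Gain = 1/2 [G_L^2/(H_L+lam) + G_R^2/(H_R+lam)
                   - (G_L+G_R)^2/(H_L+H_R+lam)] - gamma."""
    def leaf_score(G, H):
        return G * G / (H + lam)
    return 0.5 * (leaf_score(G_L, H_L) + leaf_score(G_R, H_R)
                  - leaf_score(G_L + G_R, H_L + H_R)) - gamma

# Hypothetical gradient statistics for one candidate split.
gain = split_gain(G_L=-4.0, H_L=3.0, G_R=6.0, H_R=3.0, lam=1.0, gamma=1.0)
print(gain, "split" if gain > 0 else "do not split")
```

  • When the left and right gradient sums have opposite signs, as here, separating the samples reduces the objective substantially; when a split does not separate the gradients, the γ term makes the gain negative and the node is left as a leaf.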
  • The feature importance score is obtained by summing, over all trees, the gain that a feature contributes each time it is used to split a node, that is, total_gain. This score measures how much the feature improves the construction of the decision trees, so it can be used as the indicator for the feature importance ranking. Finally, the corresponding feature weight values are obtained by calling the get_score method of the booster.
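  • What total_gain aggregates can be shown without the XGBoost library. The sketch below sums per-split gains by feature and normalizes them into weights; the normalization to a sum of 1 is an assumption for illustration, and the feature names and gains are hypothetical. In XGBoost itself, the per-feature totals are typically read with the booster's `get_score(importance_type="total_gain")`.

```python
from collections import defaultdict

def total_gain(trees):
    """Sum the gain of every split, per feature, over all trees."""
    totals = defaultdict(float)
    for splits in trees:
        for feature, gain in splits:
            totals[feature] += gain
    return dict(totals)

def to_weights(totals):
    """Normalise total gains so the feature weights sum to 1 (an assumption)."""
    s = sum(totals.values())
    return {f: g / s for f, g in totals.items()}

# Hypothetical splits: each tree is a list of (feature, gain) pairs.
trees = [
    [("blood_pressure", 6.0), ("bmi", 2.0)],
    [("blood_pressure", 3.0), ("smoking", 1.0)],
]
weights = to_weights(total_gain(trees))
ranking = sorted(weights, key=weights.get, reverse=True)
print(ranking, weights)
```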
  • The parameters of the Regularized Gradient Boosted Decision Tree are tuned using grid search and include general parameters, booster parameters, and learning task parameters.
  • The general parameters control the overall (macro) behavior.
  • The booster parameters control the boosting at each step.
  • The learning task parameters control the performance of the training objective.
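  • Grid search itself is simple to sketch: every combination of candidate values is evaluated and the best one is kept. In the sketch below the grid values are hypothetical, and `toy_score` is a stand-in for a cross-validated model score, not a real training run.

```python
from itertools import product

def grid_search(param_grid, evaluate):
    """Try every parameter combination and keep the best-scoring one."""
    names = sorted(param_grid)
    best_params, best_score = None, float("-inf")
    for values in product(*(param_grid[n] for n in names)):
        params = dict(zip(names, values))
        score = evaluate(params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score

# Hypothetical grid over two booster parameters.
grid = {"max_depth": [3, 5, 7], "learning_rate": [0.01, 0.1, 0.3]}

def toy_score(p):
    # Stand-in for cross-validated model performance; peaks at (5, 0.1).
    return -abs(p["max_depth"] - 5) - abs(p["learning_rate"] - 0.1)

best, score = grid_search(grid, toy_score)
print(best, score)
```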
  • The SPARQL (SPARQL Protocol and RDF Query Language) query language and Jena rule reasoning are used to obtain knowledge from the knowledge graph according to the distance information of the features and to generate personalized management plans for patients, including exercise, diet, medication, examination, and lifestyle recommendations.
  • A SPARQL query statement contains the information to be queried and the conditions that it should meet.
  • The conditions appear in the form of triples arranged in the order <subject, predicate, object>. The query conditions thus form a pattern, and the result of the query is in fact the result of matching the condition triples against the RDF triples in the data file.
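  • The pattern-matching idea behind a SPARQL query can be sketched in plain Python. This is not a SPARQL engine; terms beginning with `?` play the role of query variables, and the triples and predicate names are hypothetical.

```python
def match(pattern, triple, binding):
    """Match one (s, p, o) pattern against a triple; '?x' terms are variables."""
    new = dict(binding)
    for term, value in zip(pattern, triple):
        if isinstance(term, str) and term.startswith("?"):
            if new.setdefault(term, value) != value:
                return None   # variable already bound to something else
        elif term != value:
            return None       # constant term does not match
    return new

def query(patterns, triples):
    """Return all variable bindings that satisfy every pattern."""
    bindings = [{}]
    for pattern in patterns:
        extended = []
        for b in bindings:
            for t in triples:
                b2 = match(pattern, t, b)
                if b2 is not None:
                    extended.append(b2)
        bindings = extended
    return bindings

# Hypothetical data triples.
data = [
    ("hypertension", "recommendedDiet", "low_salt"),
    ("hypertension", "recommendedExercise", "walking"),
    ("patient1", "hasDiagnosis", "hypertension"),
]
# Find the diagnosis of patient1 and the diet recommended for it.
result = query(
    [("patient1", "hasDiagnosis", "?d"), ("?d", "recommendedDiet", "?diet")],
    data,
)
print(result)
```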
  • Jena reasoning is rule-based, and each rule is defined by a Rule object.
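  • Rule-based inference of this kind can be illustrated with a small forward-chaining sketch in plain Python. It does not use the Jena API; the rule format (a list of premise patterns and one conclusion pattern), the predicate names, and the facts are all hypothetical.

```python
def unify(pattern, triple, binding):
    """Extend `binding` so that `pattern` matches `triple`, or return None."""
    new = dict(binding)
    for term, value in zip(pattern, triple):
        if term.startswith("?"):
            if new.setdefault(term, value) != value:
                return None
        elif term != value:
            return None
    return new

def apply_rule(premises, conclusion, facts):
    """Return the new facts that one rule derives from the current facts."""
    bindings = [{}]
    for pattern in premises:
        extended = []
        for b in bindings:
            for fact in facts:
                b2 = unify(pattern, fact, b)
                if b2 is not None:
                    extended.append(b2)
        bindings = extended
    derived = {tuple(b.get(term, term) for term in conclusion) for b in bindings}
    return derived - facts

def forward_chain(rules, facts):
    """Apply all rules repeatedly until no new fact is produced."""
    facts = set(facts)
    while True:
        new = set()
        for premises, conclusion in rules:
            new |= apply_rule(premises, conclusion, facts)
        if not new:
            return facts
        facts |= new

# Hypothetical rule: a patient should follow the diet recommended for a diagnosis.
rules = [(
    [("?p", "hasDiagnosis", "?d"), ("?d", "recommendedDiet", "?x")],
    ("?p", "shouldFollow", "?x"),
)]
facts = {
    ("patient1", "hasDiagnosis", "hypertension"),
    ("hypertension", "recommendedDiet", "low_salt"),
}
inferred = forward_chain(rules, facts)
print(sorted(inferred - facts))
```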
  • This application also proposes a patient data visualization system for assisting decision-making in chronic diseases.
  • The system includes the following modules:
  • Chronic disease knowledge graph building module: clinical guidelines and knowledge documents related to chronic diseases are used as the knowledge source of the knowledge graph. Data semantics are uniquely identified through SNOMED CT; classes, attributes, and instances are manually constructed; data relationships and attribute relationships are added; and the knowledge graph prototype file is generated;
  • Patient information model building module: patient information is collected, and the data in the patient database is converted into RDF triples that comply with the OWL language specification. The nodes of the patient information model are identified with SNOMED CT, which realizes the semantic extension of patient data to domain knowledge, and the patient information model is constructed by fusing patient information with the chronic disease knowledge graph;
  • Hyperplane feature map drawing module: the patient information model is transformed into a hyperplane feature map through a distributed representation, which uses a translation-based model between entity vectors and relation vectors;
  • Two-dimensional plane mapping module: the position of each node on the two-dimensional plane corresponds to the two-dimensional position of the hyperplane feature map of the patient information model after dimensionality reduction. Node color distinguishes the different information categories in the knowledge graph; the feature importance ranking of the Regularized Gradient Boosted Decision Tree algorithm is used as the ranking of the correlation between each node and disease progression, and the feature weight values are used as the weights in the Euclidean distance calculation;
  • Decision support feedback module: a patient information model marked by domain experts as showing an ideal chronic disease management effect is taken as the standard. Its two-dimensional plane mapping image is drawn through distributed representation and dimensionality-reduction visualization, and the Euclidean distances between the geometric centers of the feature areas in the mapping image, calculated in combination with the feature weight values, are used as the standardized management target. The Euclidean distances between the features in the two-dimensional plane mapping image of the patient who needs decision support feedback are calculated in the same way, combined with the feature weight values, and compared with the standard values to find a path with similar distances; the knowledge in the knowledge graph is then obtained according to the distance information of the features.
  • the feature importance ranking can also use the CatBoost (Categorical Boosting) algorithm and the Light GBM algorithm.
  • Distributed representation can also use translation models such as TransG, TransR, and CTransR.
  • Two-dimensional planar mapping can also use dimensionality reduction algorithms such as principal component analysis (PCA), Sammon mapping, and SNE. Therefore, all simple modifications, equivalent changes and modifications made to the above embodiments according to the technical essence of the present invention without departing from the content of the technical solution of the present invention still fall within the protection scope of the technical solution of the present invention.

Landscapes

  • Engineering & Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Public Health (AREA)
  • Medical Informatics (AREA)
  • Data Mining & Analysis (AREA)
  • Primary Health Care (AREA)
  • General Health & Medical Sciences (AREA)
  • Epidemiology (AREA)
  • Biomedical Technology (AREA)
  • Databases & Information Systems (AREA)
  • Pathology (AREA)
  • Theoretical Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Bioethics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Computational Linguistics (AREA)
  • Artificial Intelligence (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Evolutionary Biology (AREA)
  • Software Systems (AREA)
  • Mathematical Physics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Computing Systems (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Animal Behavior & Ethology (AREA)
  • Measuring And Recording Apparatus For Diagnosis (AREA)

Abstract

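Provided are a patient data visualization method and system for assisting decision making in chronic diseases. A chronic disease knowledge graph is constructed and combined with the patient's static and dynamic data to build a management data model map of the patient on a hyperplane, which is then projected onto a two-dimensional plane. The Euclidean distances between features of the patient information model on the two-dimensional map are compared with the distances between the standard features, and a management plan is generated and recommended based on the concepts of the path nodes and the attribute relationships between the concepts. Fusing the patient information model with the chronic disease knowledge graph makes full use of the semantic information of each feature, so that the importance of each risk factor and the associations among risk factors are displayed comprehensively and systematically on the two-dimensional plane by means of position, color, and the like. The effect of the patient's chronic disease management is evaluated by geometric position, and a personalized health management plan is then formulated using the paths, helping the patient improve chronic disease management ability in multiple dimensions.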

Description

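Patient data visualization method and system for assisting decision making in chronic diseases
Technical Field
The present invention belongs to the technical field of medical care and data visualization, and in particular relates to a patient data visualization method and system for assisting decision making in chronic diseases.
Background
Chronic diseases, also known as chronic non-communicable diseases, mainly include cardiovascular and cerebrovascular diseases (hypertension, coronary heart disease, stroke), diabetes, and chronic respiratory diseases. They are characterized by insidious onset, a long and protracted course, and complex causes, some of which have not yet been fully identified. With China's rapid economic development and changes in residents' lifestyles, the number of people suffering and dying from chronic diseases keeps increasing and the disease burden on the public grows heavier; chronic diseases have become one of the major public health problems that seriously threaten residents' health and affect national economic and social development. Data show that chronic diseases account for about 86% of disease deaths and 76% of the disease burden in China. Chronic diseases are difficult to cure and mainly depend on patients' long-term health self-management. Analyzing patients' electronic medical record information together with the data recorded in daily management, such as diet, exercise, and daily vital-sign data, and providing a data visualization method that assists decision making helps patients understand their own health status and adjust their health management plans in time. It also helps doctors formulate and recommend management plans for patients, saving medical resources. Visualization can clearly present the patient's management goals and provide precise health-related help, thereby improving the patient's adherence to management.
Existing chronic disease data visualization techniques mainly present the data recorded in patients' daily management as charts in an application. Commonly used chart components are bar charts, line charts, and ring charts, which are simple, easy to understand, and readily accepted by users. On mobile devices, patients can zoom in, zoom out, rotate, and move the charts through multi-touch. Visualization for decision support is mainly presented as importance rankings, correlation matrices, and similar forms, which are not user-friendly and convey information that is neither comprehensive nor rich. The specific shortcomings of existing chronic disease data visualization schemes are as follows:
(1) They only display data such as vital signs that patients record daily on mobile devices in order to express some trend, and lack a system for evaluating the effect of patients' daily management, so patients and medical staff cannot determine how the patient's health management affects the patient's health status.
(2) Most use basic graphics such as line charts and bar charts, which cannot reflect the associations among chronic disease risk factors. Each type of data is counted and plotted separately, so the patient's multi-dimensional information and semantic information such as the associations and importance of chronic disease risk factors cannot be integrated systematically and comprehensively.
(3) Health management plans do not comprehensively consider the various types of data such as the patient's physical condition and exercise and diet habits, so decisions lack personalization, and patient data are not fully used when the content of the management plan is formulated and recommended.
Summary of the Invention
Based on patients' electronic medical record data and the various types of data recorded daily (exercise, diet, vital signs, medication, laboratory tests, etc.), and combining semantic technology, clinical decision support technology, and visualization technology, the present invention proposes a patient data visualization method and system for assisting decision making in chronic diseases.
The present invention constructs a chronic disease knowledge graph and combines it with the patient's static and dynamic data to build a management data model map of the patient on a hyperplane, which is then projected onto a two-dimensional plane. The Euclidean distances between features of the patient information model on the two-dimensional map are compared with the distances between the standard features (taken from the two-dimensional plane maps of patients whom doctors have diagnosed as having a good management effect), and a management plan is generated and recommended based on the concepts of the path nodes and the attribute relationships between the concepts. Fusing the patient information model with the chronic disease knowledge graph makes full use of the semantic information of each feature, so that the importance of each risk factor and the associations among risk factors are displayed comprehensively and systematically on the two-dimensional plane by means of position, color, and the like. The effect of the patient's chronic disease management is evaluated by geometric position, and a personalized health management plan is then formulated using the paths, helping the patient improve chronic disease management ability in multiple dimensions.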
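The patient data visualization method for assisting decision making in chronic diseases proposed in the present application is implemented through the following steps:
(1) Constructing a chronic disease knowledge graph: clinical guidelines and knowledge literature related to chronic diseases are used as the knowledge sources of the knowledge graph; data semantics are uniquely identified by SNOMED CT; classes, properties, and instances are constructed manually; data relationships and property relationships are added; and a knowledge graph prototype file is generated;
(2) Establishing a patient information model: patient information is collected; RDF conversion is performed on the patient data, converting the data in the patient database into RDF triples that conform to the OWL language specification; the nodes of the patient information model are identified with SNOMED CT, realizing semantic extension of the patient data to domain knowledge; and the patient information is fused with the chronic disease knowledge graph to construct the patient information model;
(3) Drawing a hyperplane feature map: the patient information model is converted into a hyperplane feature map through a distributed representation, the distributed representation using a translation-based model between entity vectors and relation vectors;
(4) Two-dimensional plane mapping: the position of each node on the two-dimensional plane corresponds to the two-dimensional position obtained by reducing the dimensionality of the hyperplane feature map of the patient information model; node colors distinguish the information categories to which nodes belong in the knowledge graph; the feature importance ranking produced by the Regularized Gradient Boosted Decision Tree algorithm is used as the ranking of each node's correlation with disease progression; and the feature weight values are used as the weights in the Euclidean distance calculation;
(5) Decision support feedback: the patient information models that domain experts have labeled as showing an ideal chronic disease management effect are taken as the standard; a two-dimensional plane mapping image of the patient data is drawn through the distributed representation and dimensionality-reduction visualization; the Euclidean distances between the geometric centers of the feature regions in the mapping image are calculated in combination with the feature weight values and serve as the standardized management target; for a patient who needs decision support feedback, the Euclidean distances between the features in the two-dimensional plane mapping image are calculated, combined with the calculated feature weight values, and compared with the standard values to find paths with similar distances; and knowledge in the knowledge graph is obtained according to the distance information of the features.
Further, the knowledge content of the knowledge graph covers disease diagnoses, examination items, vital-sign states, related diseases, therapeutic drugs, living habits, units of measurement, and measured quantities.
Further, the patient information collected in step (2) includes patient health data entered manually on mobile devices in daily life or collected by wearable devices, and patient electronic medical record data recorded by regional chronic disease management centers.
Further, in the RDF conversion of patient data in step (2), the D2R semantic mapping technology is used to map data in a relational database to the RDF format; D2R includes the D2R Server, the D2RQ Engine, and the D2RQ Mapping language; the D2RQ Mapping language defines the mapping rules for converting relational data into the RDF format; the D2RQ Engine uses a customized D2RQ Mapping file to perform the data mapping, specifically mapping the tables and fields of the relational database to classes and properties in the OWL file, respectively, with the relationships between classes derived from the tables that represent relationships.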
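Further, the drawing of the hyperplane feature map in step (3) specifically includes the following sub-steps:
(3.1) The TransH model is used to encode the triples as distributed vectors in space, specifically:
Knowledge in the patient information model is stored as triples (h, r, t), where h denotes the head entity vector, r the relation vector, and t the tail entity vector; the set of triples forms a directed graph in which nodes represent entities and edges represent different types of relations, the edges being directed to indicate that relations are asymmetric; the TransH model is used to construct distributed entity vectors for reflexive, many-to-one, one-to-many, and many-to-many relations;
(3.2) The objective function is optimized, specifically:
For each relation r, the TransH model assumes a corresponding hyperplane; the projection of the relation r on the hyperplane is denoted d_r, and the normal vector of the hyperplane is denoted ω_r, with ||ω_r||_2 = 1; h_⊥ and t_⊥ denote the projections of h and t onto the hyperplane, respectively, so that:
Figure PCTCN2021073125-appb-000001
The scoring function is defined as:
Figure PCTCN2021073125-appb-000002
The objective function is obtained as:
Figure PCTCN2021073125-appb-000003
where S is the set of triples in the knowledge base, S′ is the set of negatively sampled triples, and ε is a margin parameter with a value greater than 0;
In optimizing the objective function L, the scores of positive triples should be made small and the scores of negative triples large, that is, the ranking loss is minimized; stochastic gradient descent is used for training, and once the TransH model is trained, the vector representations of entities and relations are obtained.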
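Further, the two-dimensional plane mapping in step (4) specifically includes the following sub-steps:
(4.1) The t-SNE algorithm is used for dimensionality-reduction visualization, specifically:
Step 1: Given a data set X with N data points, each data point x_i of dimension D, the data are reduced to two dimensions, so that all data are represented on a plane;
The conditional probabilities of similarity between data points in the high-dimensional space are calculated; the high-dimensional Euclidean distances between data points are converted into conditional probabilities that represent similarity, and the conditional probability of similarity P_{j|i} between high-dimensional data points x_i and x_j is as follows:
Figure PCTCN2021073125-appb-000004
where σ_i is the variance of the Gaussian centered on data point x_i;
Step 2: The conditional probabilities of similarity between data points in the low-dimensional space are calculated; for the low-dimensional counterparts y_i and y_j of the high-dimensional data points x_i and x_j, the conditional probability Q_{j|i} is calculated as follows:
Figure PCTCN2021073125-appb-000005
Step 3: The difference between the conditional probabilities is minimized, that is, the conditional probability Q_{j|i} is made to approximate P_{j|i}; this is achieved by minimizing the Kullback-Leibler divergence between the two conditional probability distributions, with iterative updates by gradient descent, and the loss function is as follows:
Figure PCTCN2021073125-appb-000006
(4.2) Feature importance ranking: the Regularized Gradient Boosted Decision Tree algorithm is used to rank the importance of the entities in the knowledge graph and to obtain the feature weight values, specifically:
The data set consists of patient information models with known chronic disease management effects or outcomes, and each sample contains n-dimensional features (n being the number of entities in the patient information model); the objective function L of the Regularized Gradient Boosted Decision Tree comprises a loss function and a complexity term, and is defined as:
Figure PCTCN2021073125-appb-000007
Figure PCTCN2021073125-appb-000008
where i denotes the i-th sample, k denotes the k-th tree,
Figure PCTCN2021073125-appb-000009
is the predicted output, y_i is the label value, T denotes the number of leaf nodes, and ω denotes the leaf weights; γ is the regularization term penalizing the number of leaves, which has a pruning effect; λ is the regularization term penalizing the leaf weights, which prevents overfitting;
Figure PCTCN2021073125-appb-000010
denotes the prediction error of the i-th sample;
Figure PCTCN2021073125-appb-000011
denotes the loss function; Σ_k Ω(f_k) denotes the complexity function of the trees;
As a tree grows, the objective function values before and after each candidate split are compared, and the split with the smallest objective function value after splitting is the best split point;
Figure PCTCN2021073125-appb-000012
where δ is the complexity cost introduced by adding a new leaf node, G_L is the gradient value of the left subtree, and H_L is the second derivative over the sample set of the left subtree; G_R is the gradient value of the right subtree, and H_R is the second derivative over the sample set of the right subtree; if Gain < 0, the leaf node is not split;
The feature importance score is obtained by calculating the total gain that a feature brings each time it is used to split a node, over all trees; the corresponding feature weight values are obtained by calling the get_score method of the booster.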
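Further, in the feature importance ranking of step (4), the parameters of the Regularized Gradient Boosted Decision Tree are tuned by grid search and include general parameters, booster parameters, and learning task parameters; the general parameters control the overall configuration, the booster parameters control each boosting step, and the learning task parameters control the behavior of the training objective.
Further, in step (5), the SPARQL query language and Jena rule reasoning are used to obtain knowledge in the knowledge graph according to the distance information of the features, and a personalized management plan is generated for the patient.
Further, in step (5), the SPARQL query statement includes the information to be queried and the conditions that the names should satisfy; the conditions take the form of triples arranged in <subject, predicate, object> order; and the query result is the result of matching the condition triples against the RDF triples in the data file.
The patient data visualization system for assisting decision making in chronic diseases proposed in the present application includes the following modules:
Chronic disease knowledge graph construction module: clinical guidelines and knowledge literature related to chronic diseases are used as the knowledge sources of the knowledge graph; data semantics are uniquely identified by SNOMED CT; classes, properties, and instances are constructed manually; data relationships and property relationships are added; and a knowledge graph prototype file is generated;
Patient information model construction module: patient information is collected, and the data in the patient database are converted into RDF triples that conform to the OWL language specification; the nodes of the patient information model are identified with SNOMED CT, realizing semantic extension of the patient data to domain knowledge; and the patient information is fused with the chronic disease knowledge graph to construct the patient information model;
Hyperplane feature map drawing module: the patient information model is converted into a hyperplane feature map through a distributed representation, the distributed representation using a translation-based model between entity vectors and relation vectors;
Two-dimensional plane mapping module: the position of each node on the two-dimensional plane corresponds to the two-dimensional position obtained by reducing the dimensionality of the hyperplane feature map of the patient information model; node colors distinguish the information categories to which nodes belong in the knowledge graph; the feature importance ranking produced by the Regularized Gradient Boosted Decision Tree algorithm is used as the ranking of each node's correlation with disease progression; and the feature weight values are used as the weights in the Euclidean distance calculation;
Decision support feedback module: the patient information models that domain experts have labeled as showing an ideal chronic disease management effect are taken as the standard; a two-dimensional plane mapping image of the patient data is drawn through the distributed representation and dimensionality-reduction visualization; the Euclidean distances between the geometric centers of the feature regions in the mapping image are calculated in combination with the feature weight values and serve as the standardized management target; for a patient who needs decision support feedback, the Euclidean distances between the features in the two-dimensional plane mapping image are calculated, combined with the calculated feature weight values, and compared with the standard values to find paths with similar distances; and knowledge in the knowledge graph is obtained according to the distance information of the features.
The beneficial effects of the present invention are as follows. Compared with existing chronic disease data visualization schemes, the present invention incorporates a knowledge graph and can construct, in a visual manner, a two-dimensional plane mapping image of the patient's various types of data, which is a user-friendly visualization scheme. Node distances, colors, and the like on the two-dimensional plane express the associations, characteristics, and importance of the patient's information and convey comprehensive and rich information. The effect of the patient's chronic disease management is evaluated by geometric position, and a personalized health management plan is then formulated using the paths, helping the patient improve chronic disease management ability in multiple dimensions and improving patient adherence. From health information collection and health assessment to health promotion, the present invention realizes a long-term, continuous, cyclical, and spirally improving visual health management service that covers the whole process and all aspects.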
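Brief Description of the Drawings
FIG. 1 is a flow chart of the implementation of the patient data visualization method for assisting decision making in chronic diseases according to the present invention;
FIG. 2 is a schematic diagram of the TransH model;
FIG. 3 is a schematic diagram of the two-dimensional plane mapping.
Detailed Description
To make the above objects, features, and advantages of the present invention more apparent and easier to understand, specific embodiments of the present invention are described in detail below with reference to the accompanying drawings.
Many specific details are set forth in the following description to facilitate a full understanding of the present invention. The present invention can, however, also be implemented in ways other than those described here, and those skilled in the art can make similar generalizations without departing from the spirit of the present invention. The present invention is therefore not limited by the specific embodiments disclosed below.
The patient data visualization method and system for assisting decision making in chronic diseases proposed in the present application can help patients better understand their personal health status and disease interventions, and help doctors review patients' conditions and formulate health management plans more efficiently. As shown in FIG. 1, the method of the present invention is implemented through the following steps: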
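(1) Construction of the chronic disease knowledge graph
Clinical guidelines and knowledge literature related to chronic diseases are used as the knowledge sources of the knowledge graph. The knowledge content covers disease diagnoses, examination items, vital-sign states, related diseases, therapeutic drugs, living habits, and similar aspects, and also includes auxiliary medical terms such as units of measurement and measured quantities. SNOMED CT (Systematized Nomenclature of Medicine--Clinical Terms) is chosen as the standardized coding system. Data semantics are uniquely identified by SNOMED CT; classes, properties, instances, and related information are constructed manually; data relationships and property relationships are added; and a knowledge graph prototype file is generated.
(2) Establishment of the patient information model
(2.1) Patient information collection
Patient information comes from two main sources: patient health data entered manually on mobile devices in daily life or collected by wearable devices, and patient electronic medical record data recorded by regional chronic disease management centers.
(2.2) RDF conversion of patient data
Data in formats such as XML and JSON in the patient database need to be converted into RDF (Resource Description Framework) triples that conform to the OWL (Web Ontology Language) specification. Here, the D2R (Database to RDF) semantic mapping technology is used to map the data in the relational database to the RDF format. D2R mainly includes the D2R Server, the D2RQ Engine, and the D2RQ Mapping language. The D2RQ Mapping language defines the mapping rules for converting relational data into the RDF format. The D2RQ Engine uses a customized D2RQ Mapping file to perform the data mapping, specifically mapping the tables and fields of the relational database to classes and properties in the OWL file, respectively; the relationships between classes can be derived from the tables that represent relationships. As with the chronic disease knowledge graph, SNOMED CT is used to identify the nodes of the patient data model, thereby realizing semantic extension of the patient data to domain knowledge, and the patient information is fused with the chronic disease knowledge graph to construct the patient information model.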
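The table-to-class and field-to-property mapping described above can be illustrated with a minimal sketch. It does not use D2RQ itself; it only mimics the idea in plain Python by emitting N-Triples lines. The namespace, the table name `Observation`, the column `systolic_bp`, and the property IRI are all hypothetical and are not taken from the patent; a real mapping would point at SNOMED CT concept identifiers.

```python
# Hypothetical namespace and table/column names, used only for illustration.
BASE = "http://example.org/patient/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def row_to_triples(table, key, row, column_to_property):
    """Map one relational row to N-Triples lines.

    The table name becomes the class of the subject, and each mapped
    column becomes a property whose value is a plain literal.
    """
    subject = f"<{BASE}{table}/{row[key]}>"
    triples = [f"{subject} <{RDF_TYPE}> <{BASE}{table}> ."]
    for column, prop in column_to_property.items():
        value = row.get(column)
        if value is not None:
            triples.append(f'{subject} <{prop}> "{value}" .')
    return triples


row = {"id": 17, "systolic_bp": 142}
# Hypothetical property IRI; a real mapping would reference a SNOMED CT concept.
mapping = {"systolic_bp": BASE + "property/systolicBloodPressure"}
triples = row_to_triples("Observation", "id", row, mapping)
for line in triples:
    print(line)
```

Each row yields one `rdf:type` triple for the class plus one triple per mapped, non-null column, which is the shape of output that a table-and-field mapping would be expected to produce.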
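(3) Drawing of the hyperplane feature map
The patient information model is converted into a hyperplane feature map through a distributed representation, which uses a translation-based model between entity vectors and relation vectors.
Step 1: The TransH model is used to encode the triples as distributed vectors in space, as shown in FIG. 2.
TransH replaces the head or tail entity with different probabilities according to the type of the relation r (one-to-one, one-to-many, many-to-one, many-to-many). For example, for a one-to-many relation, replacing the head entity is more likely to yield a valid negative sample than replacing the tail entity, so the head entity can be replaced with a higher probability. For the triples of a relation r, TransH first computes the average number of tail entities per head entity, tph, and the average number of head entities per tail entity, hpt, and then defines a Bernoulli distribution: with probability
Figure PCTCN2021073125-appb-000013
the head entity is replaced, and with probability
Figure PCTCN2021073125-appb-000014
the tail entity is replaced.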
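As a rough illustration of the Bernoulli sampling rule, the sketch below computes tph and hpt for one relation and returns the head-replacement probability. It assumes the probability in the equation image is the one used in the original TransH formulation, tph / (tph + hpt), with the tail replaced with the complementary probability hpt / (tph + hpt). The entity and relation names are hypothetical.

```python
from collections import defaultdict


def head_replace_probability(triples, relation):
    """Return tph / (tph + hpt) for one relation.

    tph: average number of tail entities per head entity.
    hpt: average number of head entities per tail entity.
    Assumes the standard TransH Bernoulli sampling rule.
    """
    tails_per_head = defaultdict(set)
    heads_per_tail = defaultdict(set)
    for h, r, t in triples:
        if r == relation:
            tails_per_head[h].add(t)
            heads_per_tail[t].add(h)
    if not tails_per_head:
        raise ValueError(f"no triples found for relation {relation!r}")
    tph = sum(len(v) for v in tails_per_head.values()) / len(tails_per_head)
    hpt = sum(len(v) for v in heads_per_tail.values()) / len(heads_per_tail)
    return tph / (tph + hpt)


# Hypothetical one-to-many relation: one disease, three symptoms.
triples = [
    ("Diabetes", "hasSymptom", "Thirst"),
    ("Diabetes", "hasSymptom", "Fatigue"),
    ("Diabetes", "hasSymptom", "Polyuria"),
]
p_head = head_replace_probability(triples, "hasSymptom")
print(p_head)  # tph = 3, hpt = 1, so the head is replaced with probability 0.75
```

For this one-to-many relation the head is replaced more often than the tail, which matches the reasoning in the text: corrupting the single head is less likely to produce a triple that is actually true.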
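Knowledge in the patient information model is stored as triples (h, r, t), where h denotes the head entity vector, r the relation vector, and t the tail entity vector. The set of triples forms a directed graph in which nodes represent entities and edges represent different types of relations; the edges are directed to indicate that relations are asymmetric. The TransH model can construct distributed entity vectors for reflexive, many-to-one, one-to-many, and many-to-many relations.
Step 2: The objective function is optimized. For each relation r, the TransH model assumes a corresponding hyperplane (the relation r lies in this hyperplane); the projection of the relation r on the hyperplane is denoted d_r, and the normal vector of the hyperplane is denoted ω_r, with ||ω_r||_2 = 1. h_⊥ and t_⊥ denote the projections of h and t onto the hyperplane, respectively, so that:
Figure PCTCN2021073125-appb-000015
The scoring function is defined as:
Figure PCTCN2021073125-appb-000016
The objective function is obtained as:
Figure PCTCN2021073125-appb-000017
where S is the set of triples in the knowledge base, S′ is the set of negatively sampled triples, and ε is a margin parameter with a value greater than 0, which is a hyperparameter. [x]_+ denotes the positive-part function: [x]_+ = x when x > 0, and [x]_+ = 0 when x ≤ 0. A low scoring function value for two nodes indicates that they are close, and a high value indicates that they are far apart. In optimizing the objective function L, the scores of positive triples should be made small and the scores of negative triples large, that is, the ranking loss is minimized. Stochastic gradient descent (SGD) is used for training; once the TransH model is trained, the vector representations of entities and relations are obtained.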
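The equation images are not reproduced in this text, so the following sketch assumes they follow the standard TransH formulation: the projection v_⊥ = v − (ω_r · v) ω_r, the score ||h_⊥ + d_r − t_⊥||² (squared L2 norm), and the margin term [f(h, t) + ε − f(h′, t′)]_+ for one positive and one negative triple. The vectors are toy values chosen so the result can be checked by hand.

```python
def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def project(v, w):
    """Project v onto the hyperplane whose unit normal is w: v - (w . v) w."""
    s = dot(w, v)
    return [vi - s * wi for vi, wi in zip(v, w)]


def transh_score(h, t, d_r, w_r):
    """Squared L2 distance between (h_perp + d_r) and t_perp (assumed TransH score)."""
    h_perp = project(h, w_r)
    t_perp = project(t, w_r)
    return sum((a + d - b) ** 2 for a, d, b in zip(h_perp, d_r, t_perp))


def margin_loss(pos_score, neg_score, margin):
    """Hinge term [pos + margin - neg]_+ for one positive/negative pair."""
    return max(0.0, pos_score + margin - neg_score)


w_r = [0.0, 0.0, 1.0]     # unit normal of the relation hyperplane
d_r = [1.0, 0.0, 0.0]     # translation vector lying in the hyperplane
h = [1.0, 0.0, 5.0]
t = [2.0, 0.0, -3.0]      # true tail: h_perp + d_r lands exactly on t_perp
t_neg = [0.0, 3.0, 7.0]   # corrupted tail

pos = transh_score(h, t, d_r, w_r)
neg = transh_score(h, t_neg, d_r, w_r)
print(pos, neg, margin_loss(pos, neg, margin=1.0))
```

Because the components along ω_r are removed before the translation is applied, h and t may differ arbitrarily along the normal and still score 0, which is what lets one entity take different roles under different relations.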
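(4) Two-dimensional plane mapping
The position of each node on the two-dimensional plane corresponds to the two-dimensional position obtained by reducing the dimensionality of the hyperplane feature map of the patient information model. Node colors distinguish the information categories to which nodes belong in the knowledge graph. The feature importance ranking produced by the Regularized Gradient Boosted Decision Tree algorithm is used as the ranking of each node's correlation with disease progression, and the feature weight values are used as the weights in the Euclidean distance calculation.
(4.1) Dimensionality-reduction visualization
The t-SNE algorithm (t-distributed Stochastic Neighbor Embedding) is used for dimensionality-reduction visualization.
t-SNE is a machine learning method for dimensionality reduction that helps identify associated patterns. Its main advantage is its ability to preserve local structure: points that are close in the high-dimensional data space remain close after projection into the low-dimensional space. t-SNE also produces visually appealing visualizations.
t-SNE models the distribution of each data point's neighbors, where neighbors are the set of data points close to each other. In the original high-dimensional space the neighborhood is modeled as a Gaussian distribution, whereas in the two-dimensional output space it is modeled as a t-distribution. The goal is to find a mapping from the high-dimensional space to the two-dimensional space that minimizes the difference between these two distributions over all points. Compared with the Gaussian distribution, the t-distribution has longer tails, which helps the data points spread more evenly in the two-dimensional space.
Step 1: Given a data set X with N data points, each data point x_i of dimension D, the data are reduced to d dimensions, where d = 2 here, so that all data are represented on a plane. The conditional probabilities of similarity between data points in the high-dimensional space are calculated. The high-dimensional Euclidean distances between data points are converted into conditional probabilities that represent similarity, and the conditional probability of similarity P_{j|i} between high-dimensional data points x_i and x_j is as follows:
Figure PCTCN2021073125-appb-000018
where σ_i is the variance of the Gaussian centered on data point x_i.
Step 2: The conditional probabilities of similarity between data points in the low-dimensional space are calculated; for the low-dimensional counterparts y_i and y_j of the high-dimensional data points x_i and x_j, the conditional probability Q_{j|i} is calculated as follows:
Figure PCTCN2021073125-appb-000019
Step 3: The difference between the conditional probabilities is minimized, that is, the conditional probability Q_{j|i} is made to approximate P_{j|i}. This is achieved by minimizing the Kullback-Leibler divergence (KL divergence) between the two conditional probability distributions. The process uses gradient descent for iterative updates, and the loss function to be minimized is as follows: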
Figure PCTCN2021073125-appb-000020
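The three steps can be sketched in plain Python. Since the equation images are not reproduced in this text, the forms below are assumptions: P_{j|i} uses a Gaussian kernel exp(−||x_i − x_j||² / (2σ_i²)) normalized over k ≠ i, Q_{j|i} uses a Student-t kernel (1 + ||y_i − y_j||²)^(−1) normalized per row to mirror the conditional notation (standard t-SNE normalizes over all pairs instead), and the loss is the sum over i and j of P_{j|i} log(P_{j|i} / Q_{j|i}). The gradient descent update is omitted, and the data points are toy values.

```python
import math


def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def conditional_p(points, sigma):
    """Row-normalized Gaussian similarities P[i][j] = P_{j|i} (assumed form)."""
    n = len(points)
    rows = []
    for i in range(n):
        w = [0.0 if j == i
             else math.exp(-sq_dist(points[i], points[j]) / (2 * sigma[i] ** 2))
             for j in range(n)]
        z = sum(w)
        rows.append([x / z for x in w])
    return rows


def conditional_q(points):
    """Row-normalized Student-t similarities Q[i][j] = Q_{j|i} (assumed form)."""
    n = len(points)
    rows = []
    for i in range(n):
        w = [0.0 if j == i else 1.0 / (1.0 + sq_dist(points[i], points[j]))
             for j in range(n)]
        z = sum(w)
        rows.append([x / z for x in w])
    return rows


def kl_divergence(P, Q):
    """Sum over i, j of P_{j|i} * log(P_{j|i} / Q_{j|i}); zero-probability terms are skipped."""
    return sum(p * math.log(p / q)
               for row_p, row_q in zip(P, Q)
               for p, q in zip(row_p, row_q) if p > 0)


X = [[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.0, 4.0, 0.0]]  # toy high-dimensional points
Y = [[0.0, 0.0], [1.0, 0.0], [0.0, 4.0]]                 # candidate 2-D embedding
P = conditional_p(X, sigma=[1.0, 1.0, 1.0])
Q = conditional_q(Y)
loss = kl_divergence(P, Q)
print(loss)
```

In practice a library implementation such as scikit-learn's `TSNE` would normally be used; the sketch only shows how the two similarity distributions and the loss relate to each other.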
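A schematic diagram of the two-dimensional plane mapping is shown in FIG. 3. The figure shows the projected points of the entities of two different feature categories, which indicate the approximate projection region of each category, and marks the center point of each category. The correlation between features can be judged from the distances between center points and from the clustering of the projected points.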
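The center-point comparison can be sketched as follows. The text does not spell out exactly how the feature weights enter the distance, so this sketch makes one possible assumption: each feature region's center is the weight-averaged centroid of its projected points, and the distance between two regions is the plain Euclidean distance between those centroids. The category names, coordinates, and weights are hypothetical.

```python
import math


def weighted_centroid(points, weights):
    """Weight-averaged center of a set of 2-D points."""
    total = sum(weights)
    x = sum(w * p[0] for p, w in zip(points, weights)) / total
    y = sum(w * p[1] for p, w in zip(points, weights)) / total
    return (x, y)


def euclidean(a, b):
    """Euclidean distance between two 2-D points."""
    return math.hypot(a[0] - b[0], a[1] - b[1])


# Hypothetical projected entities of two feature categories with their weights.
diet_points, diet_weights = [(0.0, 0.0), (2.0, 0.0)], [1.0, 1.0]
exercise_points, exercise_weights = [(1.0, 3.0), (1.0, 5.0)], [3.0, 1.0]

diet_center = weighted_centroid(diet_points, diet_weights)
exercise_center = weighted_centroid(exercise_points, exercise_weights)
distance = euclidean(diet_center, exercise_center)
print(diet_center, exercise_center, distance)
```

The same distance computed on the reference maps of well-managed patients would then give the standard value against which an individual patient's distance is compared.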
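(4.2) Feature importance ranking
The Regularized Gradient Boosted Decision Tree algorithm (eXtreme Gradient Boosting) is used to rank the importance of the entities in the knowledge graph and to obtain the feature weight values. The data set consists of patient information models with known chronic disease management effects or outcomes, and each sample contains n-dimensional features (n being the number of entities in the patient information model). The objective function L of the Regularized Gradient Boosted Decision Tree comprises a loss function and a complexity term, and is defined as:
Figure PCTCN2021073125-appb-000021
Figure PCTCN2021073125-appb-000022
where i denotes the i-th sample, k denotes the k-th tree,
Figure PCTCN2021073125-appb-000023
is the predicted output, y_i is the label value, T denotes the number of leaf nodes, and ω denotes the leaf weights; γ is the regularization term penalizing the number of leaves, which has a pruning effect; λ is the regularization term penalizing the leaf weights, which prevents overfitting;
Figure PCTCN2021073125-appb-000024
denotes the prediction error of the i-th sample, and the smaller this error, the better;
Figure PCTCN2021073125-appb-000025
denotes the loss function; Σ_k Ω(f_k) denotes the complexity function of the trees, and the lower this complexity, the stronger the generalization ability of the model.
As a tree grows, the objective function values before and after each candidate split are compared, and the split with the smallest objective function value after splitting is the best split point. Gain here can be regarded as the objective function value before splitting minus the objective function values of the left and right parts after splitting; therefore, if Gain < 0, the leaf node is not split. δ is the complexity cost introduced by adding a new leaf node, G_L is the gradient value of the left subtree, and H_L is the second derivative over the sample set of the left subtree; G_R is the gradient value of the right subtree, and H_R is the second derivative over the sample set of the right subtree;
Figure PCTCN2021073125-appb-000026
can be used to evaluate the structure of a tree.
Figure PCTCN2021073125-appb-000027
The feature importance score is obtained by calculating the total gain (total_gain) that a feature brings each time it is used to split a node, over all trees. This score measures the value of the feature in constructing the boosted decision trees and can therefore serve as the indicator for ranking feature importance. Finally, the corresponding feature weight values are obtained by calling the get_score method of the booster.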
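The split criterion and the total-gain importance can be illustrated with a small sketch. It assumes the Gain image follows the usual XGBoost form, Gain = ½ [G_L² / (H_L + λ) + G_R² / (H_R + λ) − (G_L + G_R)² / (H_L + H_R + λ)] − δ, using the symbols defined in the text. The feature names and gain values fed to the ranking are hypothetical. With the real library, the equivalent importance is typically read with `Booster.get_score(importance_type="total_gain")`.

```python
from collections import defaultdict


def split_gain(g_left, h_left, g_right, h_right, lam, delta):
    """Gain of a candidate split (assumed XGBoost-style formula).

    g_*: sums of first-order gradients, h_*: sums of second-order gradients,
    lam: leaf-weight regularization, delta: cost of adding a new leaf.
    """
    def structure_score(g, h):
        return g * g / (h + lam)

    before = structure_score(g_left + g_right, h_left + h_right)
    after = structure_score(g_left, h_left) + structure_score(g_right, h_right)
    return 0.5 * (after - before) - delta


def total_gain_ranking(splits):
    """Sum the gain of every split per feature and sort in descending order."""
    totals = defaultdict(float)
    for feature, gain in splits:
        totals[feature] += gain
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


gain = split_gain(-4.0, 3.0, 6.0, 3.0, lam=1.0, delta=0.5)
print(gain)  # 0.5 * (4 + 9 - 4/7) - 0.5 = 40/7

# Hypothetical (feature, gain) pairs collected from all splits in all trees.
ranking = total_gain_ranking([("systolic_bp", 2.0), ("bmi", 1.5), ("systolic_bp", 0.75)])
print(ranking)
```

A split is kept only when its gain is positive, so δ acts as the threshold a split must clear, and the per-feature sums give the ranking that the text uses as feature weights.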
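In this step, the parameters of the Regularized Gradient Boosted Decision Tree are tuned by grid search and include general parameters, booster parameters, and learning task parameters. The general parameters control the overall configuration, the booster parameters control each boosting step, and the learning task parameters control the behavior of the training objective.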
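A grid search simply evaluates every combination of candidate parameter values and keeps the best one. The sketch below shows that loop in plain Python. The parameter names `max_depth` and `eta` are real XGBoost booster parameters, but the candidate values and the `toy_evaluate` scoring function are placeholders; in a real setting the function would train a model with the given parameters and return a cross-validated score.

```python
from itertools import product


def grid_search(grid, evaluate):
    """Try every parameter combination and return the best (params, score)."""
    names = sorted(grid)
    best_params, best_score = None, float("-inf")
    for values in product(*(grid[name] for name in names)):
        params = dict(zip(names, values))
        score = evaluate(params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score


def toy_evaluate(params):
    """Placeholder score that peaks at max_depth=4, eta=0.1 (not a real model)."""
    return -(params["max_depth"] - 4) ** 2 - abs(params["eta"] - 0.1)


grid = {"max_depth": [2, 4, 6], "eta": [0.01, 0.1, 0.3]}
best_params, best_score = grid_search(grid, toy_evaluate)
print(best_params, best_score)
```

The cost grows with the product of the candidate list lengths, so the grid is usually kept coarse for the general parameters and refined only for the booster parameters that matter most.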
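(5) Decision support feedback
The patient information models that domain experts have labeled as showing an ideal chronic disease management effect are taken as the standard. A two-dimensional plane mapping image of the patient data is drawn through the distributed representation and dimensionality-reduction visualization described above, and the Euclidean distances between the geometric centers of the feature regions in the mapping image are calculated in combination with the feature weight values and serve as the standardized management target. For a patient who needs decision support feedback, the Euclidean distances between the features in the two-dimensional plane mapping image are calculated, combined with the calculated feature weight values, and compared with the standard values to find paths with similar distances. According to the distance information of the features, the SPARQL (SPARQL Protocol and RDF Query Language) query language and Jena rule reasoning are used to obtain knowledge in the knowledge graph and to generate a personalized management plan for the patient, including exercise, diet, medication, examination, and lifestyle recommendations. The SPARQL query statement includes the information to be queried and the conditions that the names should satisfy. The conditions take the form of triples arranged in <subject, predicate, object> order; the query conditions thus form a pattern, and the query result is the result of matching the condition triples against the RDF triples in the data file. Jena reasoning is based on rules, and rules are defined by Rule objects.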
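The triple-pattern matching that underlies a SPARQL query can be shown with a minimal sketch. This is not a SPARQL engine and does not use Jena; it only matches one <subject, predicate, object> pattern against an in-memory list of triples, with terms that start with `?` acting as variables. The entity and predicate names are hypothetical.

```python
def match(pattern, triples):
    """Return the variable bindings for one (subject, predicate, object) pattern.

    Terms starting with '?' are variables; all other terms must match exactly.
    """
    results = []
    for triple in triples:
        binding = {}
        for term, value in zip(pattern, triple):
            if term.startswith("?"):
                # A repeated variable must bind to the same value each time.
                if binding.setdefault(term, value) != value:
                    break
            elif term != value:
                break
        else:
            results.append(binding)
    return results


# Hypothetical knowledge graph fragment.
graph = [
    ("Hypertension", "recommendsExercise", "BriskWalking"),
    ("Hypertension", "recommendsDiet", "LowSodiumDiet"),
    ("Diabetes", "recommendsDiet", "LowSugarDiet"),
]

# Roughly: SELECT ?plan WHERE { ex:Hypertension ex:recommendsDiet ?plan }
results = match(("Hypertension", "recommendsDiet", "?plan"), graph)
print(results)
```

A full query would combine several such patterns and join their bindings, and the rule engine would then derive further recommendations from the matched facts.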
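The present application also proposes a patient data visualization system for assisting decision making in chronic diseases, which includes the following modules:
Chronic disease knowledge graph construction module: clinical guidelines and knowledge literature related to chronic diseases are used as the knowledge sources of the knowledge graph; data semantics are uniquely identified by SNOMED CT; classes, properties, and instances are constructed manually; data relationships and property relationships are added; and a knowledge graph prototype file is generated;
Patient information model construction module: patient information is collected, and the data in the patient database are converted into RDF triples that conform to the OWL language specification; the nodes of the patient information model are identified with SNOMED CT, realizing semantic extension of the patient data to domain knowledge; and the patient information is fused with the chronic disease knowledge graph to construct the patient information model;
Hyperplane feature map drawing module: the patient information model is converted into a hyperplane feature map through a distributed representation, the distributed representation using a translation-based model between entity vectors and relation vectors;
Two-dimensional plane mapping module: the position of each node on the two-dimensional plane corresponds to the two-dimensional position obtained by reducing the dimensionality of the hyperplane feature map of the patient information model; node colors distinguish the information categories to which nodes belong in the knowledge graph; the feature importance ranking produced by the Regularized Gradient Boosted Decision Tree algorithm is used as the ranking of each node's correlation with disease progression; and the feature weight values are used as the weights in the Euclidean distance calculation;
Decision support feedback module: the patient information models that domain experts have labeled as showing an ideal chronic disease management effect are taken as the standard; a two-dimensional plane mapping image of the patient data is drawn through the distributed representation and dimensionality-reduction visualization; the Euclidean distances between the geometric centers of the feature regions in the mapping image are calculated in combination with the feature weight values and serve as the standardized management target; for a patient who needs decision support feedback, the Euclidean distances between the features in the two-dimensional plane mapping image are calculated, combined with the calculated feature weight values, and compared with the standard values to find paths with similar distances; and knowledge in the knowledge graph is obtained according to the distance information of the features.
The above are only preferred embodiments of the present invention. Although the present invention has been disclosed above by way of preferred embodiments, they are not intended to limit the present invention. Any person skilled in the art may, without departing from the scope of the technical solution of the present invention, use the methods and technical content disclosed above to make many possible changes and modifications to the technical solution of the present invention, or modify it into equivalent embodiments. For example, the feature importance ranking may also use the CatBoost (Categorical Boosting) algorithm or the LightGBM algorithm. The distributed representation may also use translation models such as TransG, TransR, and CTransR. The two-dimensional plane mapping may also use dimensionality-reduction algorithms such as principal component analysis (PCA), Sammon mapping, and SNE. Therefore, any simple modification, equivalent change, or refinement made to the above embodiments according to the technical essence of the present invention, without departing from the content of the technical solution of the present invention, still falls within the protection scope of the technical solution of the present invention.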

Claims (10)

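  1. A patient data visualization method for assisting decision making in chronic diseases, characterized in that the method comprises the following steps:
    (1) Constructing a chronic disease knowledge graph: clinical guidelines and knowledge literature related to chronic diseases are used as the knowledge sources of the knowledge graph; data semantics are uniquely identified by SNOMED CT; classes, properties, and instances are constructed manually; data relationships and property relationships are added; and a knowledge graph prototype file is generated;
    (2) Establishing a patient information model: patient information is collected; RDF conversion is performed on the patient data, converting the data in the patient database into RDF triples that conform to the OWL language specification; the nodes of the patient information model are identified with SNOMED CT, realizing semantic extension of the patient data to domain knowledge; and the patient information is fused with the chronic disease knowledge graph to construct the patient information model;
    (3) Drawing a hyperplane feature map: the patient information model is converted into a hyperplane feature map through a distributed representation, the distributed representation using a translation-based model between entity vectors and relation vectors;
    (4) Two-dimensional plane mapping: the position of each node on the two-dimensional plane corresponds to the two-dimensional position obtained by reducing the dimensionality of the hyperplane feature map of the patient information model; node colors distinguish the information categories to which nodes belong in the knowledge graph; the feature importance ranking produced by the Regularized Gradient Boosted Decision Tree algorithm is used as the ranking of each node's correlation with disease progression; and the feature weight values are used as the weights in the Euclidean distance calculation;
    (5) Decision support feedback: the patient information models that domain experts have labeled as showing an ideal chronic disease management effect are taken as the standard; a two-dimensional plane mapping image of the patient data is drawn through the distributed representation and dimensionality-reduction visualization; the Euclidean distances between the geometric centers of the feature regions in the mapping image are calculated in combination with the feature weight values and serve as the standardized management target; for a patient who needs decision support feedback, the Euclidean distances between the features in the two-dimensional plane mapping image are calculated, combined with the calculated feature weight values, and compared with the standard values to find paths with similar distances; and knowledge in the knowledge graph is obtained according to the distance information of the features.
  2. The patient data visualization method for assisting decision making in chronic diseases according to claim 1, characterized in that the knowledge content of the knowledge graph covers disease diagnoses, examination items, vital-sign states, related diseases, therapeutic drugs, living habits, units of measurement, and measured quantities.
  3. The patient data visualization method for assisting decision making in chronic diseases according to claim 1, characterized in that the patient information collected in step (2) includes patient health data entered manually on mobile devices in daily life or collected by wearable devices, and patient electronic medical record data recorded by regional chronic disease management centers.
  4. The patient data visualization method for assisting decision making in chronic diseases according to claim 1, characterized in that, in the RDF conversion of patient data in step (2), the D2R semantic mapping technology is used to map data in a relational database to the RDF format; D2R includes the D2R Server, the D2RQ Engine, and the D2RQ Mapping language; the D2RQ Mapping language defines the mapping rules for converting relational data into the RDF format; the D2RQ Engine uses a customized D2RQ Mapping file to perform the data mapping, specifically mapping the tables and fields of the relational database to classes and properties in the OWL file, respectively, with the relationships between classes derived from the tables that represent relationships.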
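  5. The patient data visualization method for assisting decision making in chronic diseases according to claim 1, characterized in that the drawing of the hyperplane feature map in step (3) specifically includes the following sub-steps:
    (3.1) The TransH model is used to encode the triples as distributed vectors in space, specifically:
    Knowledge in the patient information model is stored as triples (h, r, t), where h denotes the head entity vector, r the relation vector, and t the tail entity vector; the set of triples forms a directed graph in which nodes represent entities and edges represent different types of relations, the edges being directed to indicate that relations are asymmetric; the TransH model is used to construct distributed entity vectors for reflexive, many-to-one, one-to-many, and many-to-many relations;
    (3.2) The objective function is optimized, specifically:
    For each relation r, the TransH model assumes a corresponding hyperplane; the projection of the relation r on the hyperplane is denoted d_r, and the normal vector of the hyperplane is denoted ω_r, with ||ω_r||_2 = 1; h_⊥ and t_⊥ denote the projections of h and t onto the hyperplane, respectively, so that:
    Figure PCTCN2021073125-appb-100001
    The scoring function is defined as:
    Figure PCTCN2021073125-appb-100002
    The objective function is obtained as:
    Figure PCTCN2021073125-appb-100003
    where S is the set of triples in the knowledge base, S′ is the set of negatively sampled triples, and ε is a margin parameter with a value greater than 0;
    Stochastic gradient descent is used for training, and once the TransH model is trained, the vector representations of entities and relations are obtained.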
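  6. The patient data visualization method for assisting decision making in chronic diseases according to claim 1, characterized in that the two-dimensional plane mapping in step (4) specifically includes the following sub-steps:
    (4.1) The t-SNE algorithm is used for dimensionality-reduction visualization, specifically:
    Step 1: Given a data set X with N data points, each data point x_i of dimension D, the data are reduced to two dimensions, so that all data are represented on a plane;
    The conditional probabilities of similarity between data points in the high-dimensional space are calculated; the high-dimensional Euclidean distances between data points are converted into conditional probabilities that represent similarity, and the conditional probability of similarity P_{j|i} between high-dimensional data points x_i and x_j is as follows:
    Figure PCTCN2021073125-appb-100004
    where σ_i is the variance of the Gaussian centered on data point x_i;
    Step 2: The conditional probabilities of similarity between data points in the low-dimensional space are calculated; for the low-dimensional counterparts y_i and y_j of the high-dimensional data points x_i and x_j, the conditional probability Q_{j|i} is calculated as follows:
    Figure PCTCN2021073125-appb-100005
    Step 3: The difference between the conditional probabilities is minimized, that is, the conditional probability Q_{j|i} is made to approximate P_{j|i}; this is achieved by minimizing the Kullback-Leibler divergence between the two conditional probability distributions, with iterative updates by gradient descent, and the loss function is as follows:
    Figure PCTCN2021073125-appb-100006
    (4.2) Feature importance ranking: the Regularized Gradient Boosted Decision Tree algorithm is used to rank the importance of the entities in the knowledge graph and to obtain the feature weight values, specifically:
    The data set consists of patient information models with known chronic disease management effects or outcomes, and each sample contains n-dimensional features; the objective function L of the Regularized Gradient Boosted Decision Tree comprises a loss function and a complexity term, and is defined as:
    Figure PCTCN2021073125-appb-100007
    Figure PCTCN2021073125-appb-100008
    where i denotes the i-th sample, k denotes the k-th tree,
    Figure PCTCN2021073125-appb-100009
    is the predicted output, y_i is the label value, T denotes the number of leaf nodes, and ω denotes the leaf weights; γ is the regularization term penalizing the number of leaves, which has a pruning effect; λ is the regularization term penalizing the leaf weights, which prevents overfitting;
    Figure PCTCN2021073125-appb-100010
    denotes the prediction error of the i-th sample;
    Figure PCTCN2021073125-appb-100011
    denotes the loss function; Σ_k Ω(f_k) denotes the complexity function of the trees;
    As a tree grows, the objective function values before and after each candidate split are compared, and the split with the smallest objective function value after splitting is the best split point;
    Figure PCTCN2021073125-appb-100012
    where δ is the complexity cost introduced by adding a new leaf node, G_L is the gradient value of the left subtree, and H_L is the second derivative over the sample set of the left subtree; G_R is the gradient value of the right subtree, and H_R is the second derivative over the sample set of the right subtree; if Gain < 0, the leaf node is not split;
    The feature importance score is obtained by calculating the total gain that a feature brings each time it is used to split a node, over all trees; the corresponding feature weight values are obtained by calling the get_score method of the booster.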
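  7. The patient data visualization method for assisting decision making in chronic diseases according to claim 1, characterized in that, in the feature importance ranking of step (4), the parameters of the Regularized Gradient Boosted Decision Tree are tuned by grid search and include general parameters, booster parameters, and learning task parameters; the general parameters control the overall configuration, the booster parameters control each boosting step, and the learning task parameters control the behavior of the training objective.
  8. The patient data visualization method for assisting decision making in chronic diseases according to claim 1, characterized in that, in step (5), the SPARQL query language and Jena rule reasoning are used to obtain knowledge in the knowledge graph according to the distance information of the features, and a personalized management plan is generated for the patient.
  9. The patient data visualization method for assisting decision making in chronic diseases according to claim 8, characterized in that, in step (5), the SPARQL query statement includes the information to be queried and the conditions that the names should satisfy; the conditions take the form of triples arranged in <subject, predicate, object> order; and the query result is the result of matching the condition triples against the RDF triples in the data file.
  10. A patient data visualization system for assisting decision making in chronic diseases, characterized by comprising:
    Chronic disease knowledge graph construction module: clinical guidelines and knowledge literature related to chronic diseases are used as the knowledge sources of the knowledge graph; data semantics are uniquely identified by SNOMED CT; classes, properties, and instances are constructed manually; data relationships and property relationships are added; and a knowledge graph prototype file is generated;
    Patient information model construction module: patient information is collected, and the data in the patient database are converted into RDF triples that conform to the OWL language specification; the nodes of the patient information model are identified with SNOMED CT, realizing semantic extension of the patient data to domain knowledge; and the patient information is fused with the chronic disease knowledge graph to construct the patient information model;
    Hyperplane feature map drawing module: the patient information model is converted into a hyperplane feature map through a distributed representation, the distributed representation using a translation-based model between entity vectors and relation vectors;
    Two-dimensional plane mapping module: the position of each node on the two-dimensional plane corresponds to the two-dimensional position obtained by reducing the dimensionality of the hyperplane feature map of the patient information model; node colors distinguish the information categories to which nodes belong in the knowledge graph; the feature importance ranking produced by the Regularized Gradient Boosted Decision Tree algorithm is used as the ranking of each node's correlation with disease progression; and the feature weight values are used as the weights in the Euclidean distance calculation;
    Decision support feedback module: the patient information models that domain experts have labeled as showing an ideal chronic disease management effect are taken as the standard; a two-dimensional plane mapping image of the patient data is drawn through the distributed representation and dimensionality-reduction visualization; the Euclidean distances between the geometric centers of the feature regions in the mapping image are calculated in combination with the feature weight values and serve as the standardized management target; for a patient who needs decision support feedback, the Euclidean distances between the features in the two-dimensional plane mapping image are calculated, combined with the calculated feature weight values, and compared with the standard values to find paths with similar distances; and knowledge in the knowledge graph is obtained according to the distance information of the features.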
PCT/CN2021/073125 2020-11-13 2021-01-21 Patient data visualization method and system for assisting decision making in chronic diseases WO2021175038A1 (zh)

Priority Applications (1)

Application Number Priority Date Filing Date Title
US17/553,832 US11521751B2 (en) 2020-11-13 2021-12-17 Patient data visualization method and system for assisting decision making in chronic diseases

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202011270972.4A CN112102937B (zh) 2020-11-13 2020-11-13 Patient data visualization method and system for assisting decision making in chronic diseases
CN202011270972.4 2020-11-13

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US17/553,832 Continuation US11521751B2 (en) 2020-11-13 2021-12-17 Patient data visualization method and system for assisting decision making in chronic diseases

Publications (1)

Publication Number Publication Date
WO2021175038A1 true WO2021175038A1 (zh) 2021-09-10

Family

ID=73785227

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/073125 WO2021175038A1 (zh) 2020-11-13 2021-01-21 Patient data visualization method and system for assisting decision making in chronic diseases

Country Status (3)

Country Link
US (1) US11521751B2 (zh)
CN (1) CN112102937B (zh)
WO (1) WO2021175038A1 (zh)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
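CN114117081A (zh) * 2022-01-28 2022-03-01 北京明略软件系统有限公司 Knowledge graph display method and apparatus, electronic device, and readable storage medium
CN114254131A (zh) * 2022-02-28 2022-03-29 南京众智维信息科技有限公司 Entity alignment method for a network security emergency response knowledge graph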

Families Citing this family (17)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112102937B (zh) * 2020-11-13 2021-02-12 之江实验室 一种慢性病辅助决策的患者数据可视化方法及系统
CN113033179B (zh) * 2021-03-24 2024-05-24 北京百度网讯科技有限公司 知识获取方法、装置、电子设备及可读存储介质
CN114005509A (zh) * 2021-10-30 2022-02-01 平安国际智慧城市科技股份有限公司 一种治疗方案推荐系统、方法、装置和存储介质
CN113921141B (zh) 2021-12-14 2022-04-08 之江实验室 一种个体慢病演进风险可视化评估方法及系统
CN114741591A (zh) * 2022-04-02 2022-07-12 西安电子科技大学 一种向学习者推荐学习路径的方法及电子设备
CN114496234B (zh) * 2022-04-18 2022-07-19 浙江大学 一种基于认知图谱的全科患者个性化诊疗方案推荐系统
CN114780083B (zh) 2022-06-17 2022-10-18 之江实验室 一种知识图谱系统的可视化构建方法及装置
CN115019960B (zh) * 2022-08-01 2022-11-29 浙江大学 一种基于个性化状态空间进展模型的疾病辅助决策系统
CN115036034B (zh) * 2022-08-11 2022-11-08 之江实验室 一种基于患者表征图的相似患者识别方法及系统
CN115762698B (zh) * 2022-12-01 2024-02-13 武汉博科国泰信息技术有限公司 一种医疗慢病检查报告数据提取方法及系统
CN115762813B (zh) * 2023-01-09 2023-04-18 之江实验室 一种基于患者个体知识图谱的医患交互方法及系统
CN116092627B (zh) * 2023-04-04 2023-06-27 南京大经中医药信息技术有限公司 中医病机辨证智慧开方系统
CN116994704B (zh) * 2023-09-22 2023-12-15 北斗云方(北京)健康科技有限公司 基于临床多模态数据深度表示学习的合理用药判别方法
CN117219247B (zh) * 2023-11-08 2024-02-23 厦门培邦信息科技有限公司 一种用于患者就诊的智慧管理系统
CN117828002B (zh) * 2024-03-04 2024-05-10 济宁蜗牛软件科技有限公司 一种土地资源信息数据智能管理方法及系统
CN117854732A (zh) * 2024-03-08 2024-04-09 微脉技术有限公司 一种基于大数据分析的慢性病管理方法与系统
CN117936012B (zh) * 2024-03-21 2024-05-17 四川省医学科学院·四川省人民医院 一种基于慢性疼痛的检查项目决策方法、介质及系统

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
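CN109830303A (zh) * 2019-02-01 2019-05-31 上海众恒信息产业股份有限公司 Clinical data mining analysis and decision support method based on an Internet-integrated medical platform
CN110085325A (zh) * 2019-04-30 2019-08-02 王小岗 Method and apparatus for constructing a knowledge graph of traditional Chinese medicine experience data
CN110275959A (zh) * 2019-05-22 2019-09-24 广东工业大学 Fast learning method for large-scale knowledge bases
CN110335676A (zh) * 2019-07-09 2019-10-15 泰康保险集团股份有限公司 Data processing method, apparatus, medium, and electronic device
US20200176113A1 (en) * 2018-12-04 2020-06-04 International Business Machines Corporation Dynamic creation and manipulation of data visualizations
CN111275486A (zh) * 2020-01-17 2020-06-12 北京光速斑马数据科技有限公司 Consumer research method and system
CN111370127A (zh) * 2020-01-14 2020-07-03 之江实验室 Knowledge-graph-based cross-department decision support system for early diagnosis of chronic kidney disease
CN112102937A (zh) * 2020-11-13 2020-12-18 之江实验室 Patient data visualization method and system for assisting decision making in chronic diseases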

Family Cites Families (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7921068B2 (en) * 1998-05-01 2011-04-05 Health Discovery Corporation Data mining platform for knowledge discovery from heterogeneous data types and/or heterogeneous data sources
WO2002103954A2 (en) * 2001-06-15 2002-12-27 Biowulf Technologies, Llc Data mining platform for bioinformatics and other knowledge discovery
US7730063B2 (en) * 2002-12-10 2010-06-01 Asset Trust, Inc. Personalized medicine service
US20150317337A1 (en) * 2014-05-05 2015-11-05 General Electric Company Systems and Methods for Identifying and Driving Actionable Insights from Data
US20190006027A1 (en) * 2017-06-30 2019-01-03 Accenture Global Solutions Limited Automatic identification and extraction of medical conditions and evidences from electronic health records
EP3903241A4 (en) * 2018-12-24 2022-09-14 Roam Analytics, Inc. BUILDING A KNOWLEDGE GRAPH USING MULTIPLE SUB-GRAPHS AND A LINK LAYER INCLUDING MULTIPLE LINK NODES
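CN109918475B (zh) * 2019-01-24 2021-01-19 西安交通大学 Visual query method and query system based on a medical knowledge graph
US20200342954A1 (en) * 2019-04-24 2020-10-29 Accenture Global Solutions Limited Polypharmacy Side Effect Prediction With Relational Representation Learning
CN110866124B (zh) * 2019-11-06 2022-05-31 北京诺道认知医学科技有限公司 Medical knowledge graph fusion method and apparatus based on multiple data sources
CN111191048B (zh) * 2020-01-02 2023-06-02 南京邮电大学 Method for constructing an emergency question answering system based on a knowledge graph
CN111382272B (zh) * 2020-03-09 2022-11-01 西南交通大学 Knowledge-graph-based automatic ICD coding method for electronic medical records
WO2021195133A1 (en) * 2020-03-23 2021-09-30 Sorcero, Inc. Cross-class ontology integration for language modeling
CN111667894A (zh) * 2020-06-03 2020-09-15 长沙瀚云信息科技有限公司 Hospital electronic medical record quality monitoring and management system based on a Rete algorithm rule engine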

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
GYRARD AMÉLIE, GAUR MANAS, SHEKARPOUR SAEEDEH, THIRUNARAYAN KRISHNAPRASAD, SHETH AMIT: "Scholar Commons Scholar Commons Personalized Health Knowledge Graph Personalized Health Knowledge Graph", PUBLISHED IN ISWC 2018 CONTEXTUALIZED KNOWLEDGE GRAPH WORKSHOP, ISWC, 31 October 2018 (2018-10-31), XP055842137, Retrieved from the Internet <URL:https://scholarcommons.sc.edu/cgi/viewcontent.cgi?article=1005&context=aii_fac_pub> [retrieved on 20210917] *
LI LING, DING SHUAI, LI XIAOJIAN, YANG SHANLIN: "Research on Smart Decision-Making Method for Upper Gastrointestinal Diseases Based on Electronic Gastroscopic Video", CHINESE JOURNAL OF MANAGEMENT SCIENCE, vol. 27, no. 11, 30 November 2019 (2019-11-30), pages 211 - 216, XP055842140, ISSN: 1003-207X, ISBN: 978-4-89637-383-7, DOI: 10.16381/j.cnki.issn1003-207x.2019.11.021 *

Also Published As

Publication number Publication date
CN112102937A (zh) 2020-12-18
US20220157468A1 (en) 2022-05-19
CN112102937B (zh) 2021-02-12
US11521751B2 (en) 2022-12-06

Similar Documents

Publication Publication Date Title
WO2021175038A1 (zh) Patient data visualization method and system for assisting decision making in chronic diseases
Lamy et al. Explainable artificial intelligence for breast cancer: A visual case-based reasoning approach
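WO2021143779A1 (zh) Knowledge-graph-based cross-department decision support system for early diagnosis of chronic kidney disease
CN111414393B (zh) Semantic similar case retrieval method and device based on a medical knowledge graph
WO2023202508A1 (zh) Cognitive-graph-based personalized diagnosis and treatment plan recommendation system for general practice patients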
US20190035505A1 (en) Intelligent triage server, terminal and system based on medical knowledge base (mkb)
US11625615B2 (en) Artificial intelligence advisory systems and methods for behavioral pattern matching and language generation
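CN111292848B (zh) Bayesian-estimation-based assisted reasoning method for a medical knowledge graph
CN111191048B (zh) Method for constructing an emergency question answering system based on a knowledge graph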
Yan et al. Comparison of support vector machine, back propagation neural network and extreme learning machine for syndrome element differentiation
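CN113360671B (zh) Knowledge-graph-based method and system for auditing medical insurance documents
CN117271804B (zh) Method, apparatus, device, and medium for generating a comorbidity feature knowledge base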
Datta et al. Development of predictive model of diabetic using supervised machine learning classification algorithm of ensemble voting
CN111429985B (zh) Electronic medical record data processing method and system
RU2720363C2 (ru) Способ формирования математических моделей пациента с использованием технологий искусственного интеллекта
US10832822B1 (en) Methods and systems for locating therapeutic remedies
Geetha et al. Evaluation based approaches for liver disease prediction using machine learning algorithms
JP2018014058A (ja) 医療情報処理システム、医療情報処理装置及び医療情報処理方法
Liu et al. Knowledge-aware deep dual networks for text-based mortality prediction
Leng et al. Bi-level artificial intelligence model for risk classification of acute respiratory diseases based on Chinese clinical data
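WO2022141925A1 (zh) Intelligent medical service system, method, and storage medium
CN114093507A (zh) Contrastive-learning-based intelligent classification method for skin diseases in an edge computing network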
Jia et al. Dkdr: An approach of knowledge graph and deep reinforcement learning for disease diagnosis
Wang A multi-modal knowledge graph platform based on medical data lake
Bi et al. A new graph semi-supervised learning method for medical image automatic annotation

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21763834

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 21763834

Country of ref document: EP

Kind code of ref document: A1

32PN Ep: public notification in the ep bulletin as address of the adressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 02/10/2023)