CN115171871A - Cardiovascular disease prediction method based on knowledge graph and attention mechanism - Google Patents

Info

Publication number
CN115171871A
Authority
CN
China
Prior art keywords
cardiovascular disease
knowledge
data
frequent
attention mechanism
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210485938.1A
Other languages
Chinese (zh)
Inventor
杨鹏
王超余
谢亮亮
马卫东
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Southeast University
Original Assignee
Southeast University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Southeast University
Priority to CN202210485938.1A
Publication of CN115171871A

Classifications

    • G: PHYSICS
    • G16: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H: HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H50/00: ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
    • G16H50/20: ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for computer-aided diagnosis, e.g. based on medical expert systems
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30: Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33: Querying
    • G06F16/3331: Query processing
    • G06F16/334: Query execution
    • G06F16/3344: Query execution using natural language analysis
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30: Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/36: Creation of semantic tools, e.g. ontology or thesauri
    • G06F16/367: Ontology
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90: Details of database functions independent of the retrieved data types
    • G06F16/95: Retrieval from the web
    • G06F16/951: Indexing; Web crawling techniques
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00: Handling natural language data
    • G06F40/20: Natural language analysis
    • G06F40/279: Recognition of textual entities
    • G06F40/289: Phrasal analysis, e.g. finite state techniques or chunking
    • G06F40/295: Named entity recognition
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N5/00: Computing arrangements using knowledge-based models
    • G06N5/02: Knowledge representation; Symbolic representation
    • G06N5/022: Knowledge engineering; Knowledge acquisition
    • G06N5/025: Extracting rules from data
    • G: PHYSICS
    • G16: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H: HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H50/00: ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
    • G16H50/70: ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for mining of medical data, e.g. analysing previous cases of other patients

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Public Health (AREA)
  • Medical Informatics (AREA)
  • Computational Linguistics (AREA)
  • Biomedical Technology (AREA)
  • General Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Primary Health Care (AREA)
  • Pathology (AREA)
  • Epidemiology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Animal Behavior & Ethology (AREA)
  • Evolutionary Computation (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a cardiovascular disease prediction method based on a knowledge graph and an attention mechanism. The method first constructs a cardiovascular disease corpus; it then builds a knowledge graph for the cardiovascular disease field by extracting attribute information of cardiovascular diseases from the original articles in the corpus and constructing a knowledge graph relation network; next, it extracts feature vectors of the cardiovascular disease description text by obtaining the symptom entities in the text according to the relations between cardiovascular diseases and symptoms in the knowledge graph, representing the symptoms as vectors with a TransR knowledge representation model, and extracting the description text feature vectors through an attention-based LSTM (A-LSTM); finally, it identifies cardiovascular diseases with a softmax classifier. Compared with other methods, combining the cardiovascular disease knowledge graph with the attention mechanism allows the method to mine deeper disease features and thereby achieve more accurate prediction.

Description

Cardiovascular disease prediction method based on knowledge graph and attention mechanism
Technical Field
The invention relates to a cardiovascular disease prediction method based on a knowledge graph and an attention mechanism, and belongs to the technical field of internet and artificial intelligence.
Background
Cardiovascular diseases (CVDs) are a leading cause of death worldwide. Of the 57.7 million deaths reported worldwide in 2015, 17.9 million were caused by cardiovascular disease. In addition, cardiovascular disease places a considerable economic burden on patients and can lead to severe lifelong disability. However, it is estimated that 90% of CVDs can be prevented by appropriate measures, so predicting the onset of CVDs in individuals is of great importance in the medical field. Several well-established clinical procedures exist for detecting markers of CVDs, such as the electrocardiogram (ECG) and angiography. These are definitive diagnostic methods in the medical field and are often highly accurate. However, angiography is generally expensive and invasive, and the accuracy of the electrocardiogram, another common method for the diagnosis and prognosis of cardiovascular diseases, depends heavily on the experience and knowledge of the medical personnel or experts. Computer-aided high-risk prediction of CVDs is therefore a promising and significant research topic. The traditional machine-learning-based high-risk prediction task aims to obtain an automated computer system that extracts latent and critical features from the patient's historical Electronic Health Record (EHR). Compared with traditional clinical measures, such a system is easy to operate, non-invasive and low-cost.
A key challenge for EHR-based high-risk prediction tasks is how to obtain an accurate picture of the patient, also known as patient representation learning or feature engineering. An EHR is composed of various information about a patient and can be represented as a time-ordered sequence of hospital visits, each of which contains a number of medical variables such as demographics, diagnoses, medications, procedures, laboratory test results and vital signs. The number of unique medical variables in EHR systems is typically very large, so many existing predictive models handle them as sparse feature representations combined with various dimension reduction techniques. Conventional manual feature engineering often scales and generalizes poorly because it depends heavily on the individual experience of the researcher and on the particular EHR system. In recent years, some simple and extensible automatic feature representation methods have been proposed, such as One-Hot and Bag-of-Words (BoW). However, these approaches typically treat each feature as a discrete and independent word, so they cannot accurately capture the semantic information and dynamic associations hidden between features in EHR data. Therefore, designing an efficient method for the feature representation of sequential, high-dimensional, heterogeneous EHR data is an extremely important problem.
Disclosure of Invention
In view of the problems and shortcomings of the prior art, the invention provides a cardiovascular disease prediction method that fuses a knowledge graph and an attention mechanism, using a prediction model that combines the two to predict the onset of cardiovascular diseases. The method combines the cardiovascular disease knowledge graph with deep learning: it extracts the related entities from a text provided by a user according to the cardiovascular disease entity information in the knowledge graph so as to enrich the cardiovascular disease features, further analyzes the cardiovascular disease information and cardiovascular disease images through a deep neural network model, and finally predicts the cardiovascular diseases.
To achieve this purpose, the invention adopts the following technical scheme:
A cardiovascular disease prediction method based on a knowledge graph and an attention mechanism comprises the following steps:
step 1, constructing a cardiovascular disease corpus: periodically collecting knowledge articles on cardiovascular diseases with a distributed web crawler, and performing preliminary filtering with a wrapper to construct an original corpus;
step 2, constructing a knowledge graph for the cardiovascular disease field: extracting attribute information of cardiovascular diseases from the original articles in the cardiovascular disease corpus using a rule set, named entity recognition and keyword extraction respectively, and constructing a knowledge graph relation network;
step 3, extracting cardiovascular disease description text feature vectors: acquiring the symptom entities in the text according to the relations between cardiovascular diseases and symptoms in the knowledge graph, representing the symptoms as vectors with a TransR knowledge representation model, and extracting the description text feature vectors through an attention-based LSTM;
step 4, identifying the cardiovascular diseases with a softmax classifier.
Further, the step 1 specifically includes the following steps:
periodically acquiring original data from cardiovascular disease related websites with a web crawler; counting the total number of knowledge data items in the basic knowledge base with a data mining technique and calculating the minimum support count; judging in turn whether the count of each knowledge data item meets the minimum support, and outputting the items that meet it to obtain a plurality of frequent 1-itemsets; reading the frequent (k-1)-itemsets, generating the frequent k-itemsets according to a pruning algorithm, and calculating their counts, wherein k ≥ 2; judging whether the count of a frequent k-itemset meets the minimum support, and if so, incrementing k by 1 and returning to the previous step, otherwise outputting the frequent k-itemsets; traversing all frequent 1-itemsets to obtain a plurality of frequent k-itemsets, and filtering part of the noise data with a dictionary-based black-and-white list mechanism; collecting cardiovascular disease related data provided by users; and preliminarily filtering the acquired data with a rule set and storing it in the form of a file library.
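The frequent itemset procedure described in step 1 resembles the classic Apriori algorithm. The following is a minimal sketch under that assumption; the function name `apriori`, the `min_support_ratio` parameter and the sample records are hypothetical and do not come from the patent.

```python
from itertools import combinations


def apriori(transactions, min_support_ratio):
    """Return {itemset: count} for every frequent itemset (Apriori-style sketch)."""
    transactions = [frozenset(t) for t in transactions]
    # Minimum support count derived from the total number of records.
    min_count = min_support_ratio * len(transactions)

    def count_frequent(candidates):
        counts = {c: 0 for c in candidates}
        for t in transactions:
            for c in candidates:
                if c <= t:
                    counts[c] += 1
        return {c: n for c, n in counts.items() if n >= min_count}

    # Frequent 1-itemsets.
    singles = {frozenset([item]) for t in transactions for item in t}
    frequent = count_frequent(singles)
    result = dict(frequent)

    k = 2
    while frequent:
        prev = set(frequent)
        # Join step: unions of two frequent (k-1)-itemsets that have exactly k items.
        candidates = {a | b for a in prev for b in prev if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must itself be frequent.
        candidates = {
            c for c in candidates
            if all(frozenset(s) in prev for s in combinations(c, k - 1))
        }
        frequent = count_frequent(candidates)
        result.update(frequent)
        k += 1
    return result


# Hypothetical records: each one is the set of terms found in a crawled article.
records = [
    {"chest pain", "hypertension", "smoking"},
    {"chest pain", "hypertension"},
    {"chest pain", "dyspnea"},
    {"hypertension", "smoking"},
]
frequent_itemsets = apriori(records, min_support_ratio=0.5)
```

With a support ratio of 0.5 the minimum support count is 2, so the pair {chest pain, smoking}, which occurs only once, is discarded, and the only possible 3-itemset is pruned before counting.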
Further, the step 2 specifically includes the following steps:
extracting the attributes of the original corpus using page attribute information; for complex articles, performing named entity recognition with a BiLSTM-CRF model; for descriptions of cardiovascular disease onset features, extracting cardiovascular disease feature entities with a TF-IDF-based keyword extraction method, wherein the feature weight is calculated as follows:
Figure BDA0003629946740000021
wherein tf_ik is the number of occurrences of feature item t_k in document d_i, n_k is the number of documents containing feature item t_k, and N is the total number of documents; expressing the extracted attributes, attribute names and the relations among them as triples; and storing and managing the knowledge graph with Neo4j.
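The weight formula itself is only available as an image, so its exact form (for example any smoothing or normalization term) cannot be confirmed from the text. The sketch below assumes the textbook form w_ik = tf_ik · log(N / n_k), which uses exactly the three quantities defined above; the function names and the sample documents are hypothetical.

```python
import math
from collections import Counter


def tfidf_weights(documents):
    """documents: list of token lists. Returns one {term: weight} dict per document."""
    n_docs = len(documents)              # N: total number of documents
    doc_freq = Counter()                 # n_k: number of documents containing term t_k
    for tokens in documents:
        doc_freq.update(set(tokens))

    weights = []
    for tokens in documents:
        tf = Counter(tokens)             # tf_ik: occurrences of t_k in document d_i
        # Assumed form: w_ik = tf_ik * log(N / n_k); the patent's formula may differ.
        weights.append({t: tf[t] * math.log(n_docs / doc_freq[t]) for t in tf})
    return weights


def top_keywords(weight_map, n=3):
    """Return the n terms with the highest weight in one document."""
    return sorted(weight_map, key=weight_map.get, reverse=True)[:n]


# Hypothetical tokenized onset descriptions.
docs = [
    ["chest", "pain", "pain", "fatigue"],
    ["fatigue", "cough"],
    ["fatigue", "fever"],
]
doc_weights = tfidf_weights(docs)
keywords = top_keywords(doc_weights[0], n=2)
```

A term that appears in every document ("fatigue" here) receives weight zero under this form, which is why it would not be selected as a feature entity.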
Further, the step 3 specifically includes the following steps:
training on the knowledge graph data with the TransR knowledge representation model, extracting the cardiovascular disease entities of the description text according to the knowledge graph, and obtaining an entity matrix E_{m×k} through the TransR knowledge representation model, wherein k is the dimension of the entity vector and m is the number of entities in the description text; taking the entity matrix E_{m×k} of the description text as the input of the BiLSTM network, performing text feature extraction with the attention-based LSTM, and selecting the output vector of the last LSTM unit
Figure BDA0003629946740000031
as the description text feature vector, wherein
Figure BDA0003629946740000032
is the hidden-layer feature vector of the LSTM, expressed by the following formula:
Figure BDA0003629946740000033
Further, when the TransR knowledge representation model is trained, the optimizer adopts the whale optimization algorithm.
Further, the step 4 specifically includes the following steps:
connecting the final patient representation vector to the softmax layer, and obtaining the cardiovascular disease prediction with the softmax classifier as follows:
Figure BDA0003629946740000034
wherein
Figure BDA0003629946740000036
is the high-risk indicator of cardiovascular disease for the i-th patient, and
Figure BDA0003629946740000035
is the risk score of the i-th patient calculated by the model.
Further, if
Figure BDA0003629946740000037
equals 1, the case is a high-risk case; if
Figure BDA0003629946740000038
equals 0, the case is a normal case.
Beneficial effects:
1. When extracting cardiovascular disease entity information from a cardiovascular disease feature description text, the prediction method provided by the invention uses knowledge graph technology and the TransR knowledge representation model, so that the extracted feature entities are more representative. A whale optimization algorithm is added to the training process of the TransR knowledge representation model to improve the convergence speed of the model, and an attention-based neural network is used to better process the various kinds of high-dimensional information. By combining the knowledge graph, the knowledge representation model and deep learning, the method can mine deeper cardiovascular disease features and thereby achieve a more accurate recognition effect.
2. When high-dimensional, heterogeneous and temporal EHR patient data is input, the model of the invention can automatically mine the latent information in it. Using representation learning, it combines the knowledge graph with deep learning and obtains an accurate, lower-dimensional feature representation of the patient from the disease entities in the knowledge graph, the symptom entities described by the user and the relation information between these entities. By adopting an attention mechanism and a long short-term memory network, the prediction model can better complete the high-risk prediction task.
Drawings
Fig. 1 is a flow chart of the cardiovascular disease prediction method based on a knowledge graph and an attention mechanism provided by the invention.
Fig. 2 is an architecture diagram for implementing the cardiovascular disease prediction method based on a knowledge graph and an attention mechanism provided by the invention.
Fig. 3 is a diagram of the knowledge representation model in the invention.
Detailed Description
The technical solutions provided by the invention are described in detail below with reference to specific examples. It should be understood that the following specific embodiments are only illustrative and do not limit the scope of the invention.
The flow of the cardiovascular disease prediction method based on a knowledge graph and an attention mechanism is shown in Fig. 1 and the model architecture is shown in Fig. 2. The specific implementation steps are as follows:
Step 1, constructing a cardiovascular disease corpus: periodically collecting knowledge articles on cardiovascular diseases with a distributed web crawler, and performing preliminary filtering with a wrapper to construct an original corpus;
periodically acquiring original data from cardiovascular disease related websites with a web crawler, and acquiring cardiovascular disease related data through other channels; preliminarily filtering the acquired data with a rule set to obtain a basic knowledge base, counting the total number of knowledge data items in the basic knowledge base, and calculating the minimum support count; judging in turn whether the count of each knowledge data item meets the minimum support, and outputting the items that meet it to obtain a plurality of frequent 1-itemsets; reading the frequent (k-1)-itemsets, generating the frequent k-itemsets according to a pruning algorithm, and calculating their counts, wherein k ≥ 2; judging whether the count of a frequent k-itemset meets the minimum support, and if so, incrementing k by 1 and returning to the previous step, otherwise outputting the frequent k-itemsets; traversing all frequent 1-itemsets to obtain a plurality of frequent k-itemsets, and filtering the remaining noise data with a dictionary-based black-and-white list mechanism; and obtaining the initialization data and storing it in the form of a file library.
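The text does not say how the black list and the white list are combined when filtering noise. As one possible reading, the sketch below drops any itemset that contains a black-listed term and keeps the rest only if they contain at least one white-listed term; the function name `filter_itemsets`, this policy and the sample dictionaries are all assumptions.

```python
def filter_itemsets(itemsets, whitelist, blacklist):
    """Keep itemsets with no black-listed term and at least one white-listed term.

    The combination policy is an assumption; the patent only names the mechanism.
    """
    kept = []
    for itemset in itemsets:
        terms = set(itemset)
        if terms & blacklist:
            continue                     # contains a known noise term
        if terms & whitelist:
            kept.append(itemset)         # contains a known domain term
    return kept


# Hypothetical dictionaries and mined itemsets.
domain_terms = {"hypertension", "chest pain", "arrhythmia"}
noise_terms = {"advertisement", "login"}
mined = [
    {"hypertension", "smoking"},
    {"chest pain", "advertisement"},
    {"weather", "travel"},
]
clean = filter_itemsets(mined, domain_terms, noise_terms)
```

Here only the first itemset survives: the second contains a black-listed term and the third contains no white-listed term.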
Step 2, constructing a knowledge graph for the cardiovascular disease field: extracting cardiovascular disease attribute information from the original articles in the cardiovascular disease corpus using a rule set, named entity recognition and keyword extraction respectively, to construct a knowledge graph relation network;
extracting the attributes of the original corpus using page attribute information; for complex articles, performing named entity recognition with a BiLSTM-CRF model; for descriptions of cardiovascular disease onset features, extracting cardiovascular disease feature entities with a TF-IDF-based keyword extraction method, wherein the feature weight is calculated as follows:
Figure BDA0003629946740000041
wherein tf_ik is the number of occurrences of feature item t_k in document d_i, n_k is the number of documents containing feature item t_k, and N is the total number of documents. The extracted attributes, attribute names and the relations among them are expressed as triples, and Neo4j is used to store and manage the knowledge graph.
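As an illustration of how an extracted triple could be written to Neo4j, the sketch below builds a parameterized Cypher `MERGE` statement as a string. The node label `Entity`, the property `name`, the helper name `triple_to_cypher` and the sample triple are assumptions, and actually executing the statement would require a Neo4j driver session, which is not shown.

```python
import re


def triple_to_cypher(head, relation, tail):
    """Build a Cypher MERGE statement and its parameters for one (head, relation, tail) triple."""
    # Relationship types cannot be passed as Cypher parameters, so the relation
    # name is normalized to a safe identifier and placed in the query text.
    rel_type = re.sub(r"\W+", "_", relation.strip()).upper()
    if not rel_type or rel_type[0].isdigit():
        raise ValueError(f"cannot derive a relationship type from {relation!r}")
    query = (
        "MERGE (h:Entity {name: $head}) "
        "MERGE (t:Entity {name: $tail}) "
        f"MERGE (h)-[:{rel_type}]->(t)"
    )
    return query, {"head": head, "tail": tail}


# Hypothetical triple extracted from the corpus.
query, params = triple_to_cypher("coronary heart disease", "has symptom", "chest pain")
```

Using `MERGE` rather than `CREATE` keeps the graph free of duplicate nodes and edges when the same triple is extracted from several articles.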
Step 3, extracting cardiovascular disease description text feature vectors: acquiring the symptom entities in the text according to the relations between cardiovascular diseases and symptoms in the knowledge graph, representing the symptoms as vectors with the TransR knowledge representation model, and extracting the description text feature vectors through the attention-based LSTM (A-LSTM);
The knowledge graph data is trained with the TransR knowledge representation model, with the constructed knowledge graph data taken as the input of the representation model. The basic idea of the representation model is
Figure BDA0003629946740000051
that is, entities and relations are mapped into a low-dimensional space, where h denotes the head entity, t denotes the tail entity and r denotes the relation. Further considering the complexity of the knowledge graph, the TransR model not only distinguishes entities from relations but also, for different semantic spaces, projects the entities into the vector space of the corresponding relation, so that many-to-many relations obtain a more accurate vector representation, as shown in Fig. 3. For example, for the relation "symptom"
Figure BDA00036299467400000514
a mapping matrix M_r ∈ R^{k×d} is assigned; the coronary heart disease vector
Figure BDA0003629946740000052
and the vascular sclerosis vector
Figure BDA00036299467400000518
are projected through
Figure BDA0003629946740000053
Figure BDA0003629946740000054
to obtain their projection vectors
Figure BDA0003629946740000055
The vector representation of the coronary heart disease entity is then calculated according to the following formula:
Figure BDA0003629946740000056
wherein
Figure BDA0003629946740000057
Figure BDA0003629946740000058
The optimizer adopts the whale optimization algorithm to improve the convergence speed of the model. Then, from the trained entity representation vectors e_h, the entity matrix E_{m×k} of the description text can be formed, where k is the dimension of the entity vector and m is the number of entities in the description text; specifically, E_{m×k} = [e_1, e_2, ..., e_m].
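The projection and scoring formulas above survive only as image placeholders. The sketch below therefore follows the commonly published TransR formulation, which is an assumption about what the images contain: entities live in a k-dimensional space, a relation r in a d-dimensional space, and with M_r ∈ R^{k×d} the projections are h_r = h·M_r and t_r = t·M_r, scored by the squared distance ||h_r + r - t_r||². All vectors and the matrix are toy values, and training (including the whale optimization algorithm) is not shown.

```python
def project(entity, m_r):
    """Project a k-dimensional entity vector into the d-dimensional relation space."""
    k, d = len(m_r), len(m_r[0])
    return [sum(entity[i] * m_r[i][j] for i in range(k)) for j in range(d)]


def transr_score(h, r, t, m_r):
    """Squared L2 distance ||h_r + r - t_r||^2; a lower score means a more plausible triple."""
    h_r = project(h, m_r)
    t_r = project(t, m_r)
    return sum((hj + rj - tj) ** 2 for hj, rj, tj in zip(h_r, r, t_r))


# Toy example with k = 3 (entity space) and d = 2 (relation space).
M_symptom = [[1, 0], [0, 1], [1, 1]]      # hypothetical mapping matrix M_r
coronary_heart_disease = [1, 0, 0]        # head entity h
vascular_sclerosis = [0, 1, 1]            # tail entity t
r_symptom = [0, 2]                        # relation vector r
unrelated_entity = [1, 1, 1]              # corrupted tail used for comparison

good = transr_score(coronary_heart_disease, r_symptom, vascular_sclerosis, M_symptom)
bad = transr_score(coronary_heart_disease, r_symptom, unrelated_entity, M_symptom)
```

In this toy setup the true triple scores 0 and the corrupted one scores 1, which is the ordering a trained model is meant to produce.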
The entity matrix E_{m×k} of the description text is taken as the input of the BiLSTM network, text feature extraction is performed with the attention-based LSTM (A-LSTM), and the output vector of the last LSTM unit
Figure BDA0003629946740000059
is selected as the description text feature vector, wherein
Figure BDA00036299467400000510
is the hidden-layer feature vector of the LSTM, expressed by the following formula:
Figure BDA00036299467400000511
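The attention formula is not recoverable from the image placeholder, so the sketch below shows only one common form of attention over LSTM hidden states, which may differ from the patent's: each hidden state h_t is scored as w·tanh(h_t), the scores are normalized with softmax, and the states are summed with those weights. The scoring vector `w` stands in for a learned parameter, the hidden states are toy values, and the LSTM itself is not implemented.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(hidden_states, w):
    """Weighted sum of hidden states; weights come from softmax(w . tanh(h_t))."""
    scores = [sum(wi * math.tanh(hi) for wi, hi in zip(w, h)) for h in hidden_states]
    alphas = softmax(scores)
    dim = len(hidden_states[0])
    pooled = [sum(a * h[j] for a, h in zip(alphas, hidden_states)) for j in range(dim)]
    return pooled, alphas


# Toy hidden states for a description with three entities (hidden size 2).
H = [[0.0, 0.0], [2.0, 2.0], [0.5, -0.5]]
w = [1.0, 1.0]                            # hypothetical learned scoring vector
text_vector, weights = attention_pool(H, w)
```

The attention weights always sum to 1, and the second hidden state receives the largest weight here because it has the highest score.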
Step 4, identifying the cardiovascular diseases with a softmax classifier.
The final patient representation vector is connected to the softmax layer. The cardiovascular disease prediction obtained with the softmax classifier is as follows:
Figure BDA00036299467400000512
Here
Figure BDA00036299467400000515
is the high-risk indicator of cardiovascular disease for the i-th patient. If
Figure BDA00036299467400000516
equals 1, the case is a high-risk case; if
Figure BDA00036299467400000517
equals 0, the case is a normal case.
Figure BDA00036299467400000513
is the risk score of the i-th patient calculated by the model.
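The classification step can be illustrated with a two-class softmax layer. This is a minimal sketch assuming a linear layer followed by softmax, with class 0 as "normal" and class 1 as "high risk"; the weights, biases and patient vector are toy values, and the function name `predict_risk` is hypothetical.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def predict_risk(patient_vector, weights, biases):
    """Return (label, risk_score): label 1 = high risk, 0 = normal.

    weights holds one row per class; risk_score is the softmax probability of class 1.
    """
    logits = [
        sum(w * x for w, x in zip(row, patient_vector)) + b
        for row, b in zip(weights, biases)
    ]
    probs = softmax(logits)
    label = max(range(len(probs)), key=probs.__getitem__)
    return label, probs[1]


# Toy parameters: row 0 scores the "normal" class, row 1 the "high risk" class.
W = [[0.0, 0.0], [1.0, 1.0]]
b = [0.0, 0.0]
label, score = predict_risk([1.0, 2.0], W, b)
```

For the toy patient vector the logits are [0, 3], so the high-risk probability is e³ / (1 + e³), roughly 0.95, and the predicted label is 1.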
The technical means disclosed in the scheme of the invention are not limited to those disclosed in the above embodiments, but also include technical schemes formed by any combination of the above technical features. It should be noted that those skilled in the art can make modifications and adaptations without departing from the principles of the invention, and such modifications and adaptations are also within the scope of the invention.

Claims (7)

1. A cardiovascular disease prediction method based on a knowledge graph and an attention mechanism, characterized by comprising the following steps:
step 1, constructing a cardiovascular disease corpus: periodically collecting knowledge articles on cardiovascular diseases with a distributed web crawler, and performing preliminary filtering with a wrapper to construct an original corpus;
step 2, constructing a knowledge graph for the cardiovascular disease field: extracting attribute information of cardiovascular diseases from the original articles in the cardiovascular disease corpus using a rule set, named entity recognition and keyword extraction respectively, and constructing a knowledge graph relation network;
step 3, extracting cardiovascular disease description text feature vectors: acquiring the symptom entities in the text according to the relations between cardiovascular diseases and symptoms in the knowledge graph, representing the symptoms as vectors with a TransR knowledge representation model, and extracting the description text feature vectors through an attention-based LSTM;
step 4, identifying the cardiovascular diseases with a softmax classifier.
2. The method for cardiovascular disease prediction based on knowledge-graph and attention mechanism as claimed in claim 1, wherein the step 1 comprises the following steps:
periodically acquiring original data from cardiovascular disease related websites with a web crawler; counting the total number of knowledge data items in the basic knowledge base with a data mining technique and calculating the minimum support count; judging in turn whether the count of each knowledge data item meets the minimum support, and outputting the items that meet it to obtain a plurality of frequent 1-itemsets; reading the frequent (k-1)-itemsets, generating the frequent k-itemsets according to a pruning algorithm, and calculating their counts, wherein k ≥ 2; judging whether the count of a frequent k-itemset meets the minimum support, and if so, incrementing k by 1 and returning to the previous step, otherwise outputting the frequent k-itemsets; traversing all frequent 1-itemsets to obtain a plurality of frequent k-itemsets, and filtering part of the noise data with a dictionary-based black-and-white list mechanism; collecting cardiovascular disease related data provided by users; and preliminarily filtering the acquired data with a rule set and storing it in the form of a file library.
3. The method for cardiovascular disease prediction based on knowledge-graph and attention mechanism as claimed in claim 1, wherein the step 2 comprises the following steps:
extracting the attributes of the original corpus using page attribute information; for complex articles, performing named entity recognition with a BiLSTM-CRF model; for descriptions of cardiovascular disease onset features, extracting cardiovascular disease feature entities with a TF-IDF-based keyword extraction method, wherein the feature weight is calculated as follows:
Figure FDA0003629946730000011
wherein tf_ik is the number of occurrences of feature item t_k in document d_i, n_k is the number of documents containing feature item t_k, and N is the total number of documents; expressing the extracted attributes, attribute names and the relations among them as triples; and storing and managing the knowledge graph with Neo4j.
4. The method for cardiovascular disease prediction based on knowledge-graph and attention mechanism as claimed in claim 1, wherein the step 3 comprises the following steps:
training on the knowledge graph data with the TransR knowledge representation model, extracting the cardiovascular disease entities of the description text according to the knowledge graph, and obtaining an entity matrix E_{m×k} through the TransR knowledge representation model, wherein k is the dimension of the entity vector and m is the number of entities in the description text; taking the entity matrix E_{m×k} of the description text as the input of the BiLSTM network, performing text feature extraction with the attention-based LSTM, and selecting the output vector of the last LSTM unit
Figure FDA0003629946730000021
as the description text feature vector, wherein
Figure FDA0003629946730000022
is the hidden-layer feature vector of the LSTM, expressed by the following formula:
Figure FDA0003629946730000023
5. The method of claim 4, wherein the optimizer adopts the whale optimization algorithm when training the TransR knowledge representation model.
6. The method of claim 1, wherein step 4 comprises the steps of:
connecting the final patient representation vector to the softmax layer, and obtaining the cardiovascular disease prediction with the softmax classifier as follows:
Figure FDA0003629946730000024
wherein y_i is the high-risk indicator of cardiovascular disease for the i-th patient, and
Figure FDA0003629946730000025
is the risk score of the i-th patient calculated by the model.
7. The method of claim 6, wherein y_i equal to 1 indicates a high-risk case and y_i equal to 0 indicates a normal case.
CN202210485938.1A 2022-05-06 2022-05-06 Cardiovascular disease prediction method based on knowledge graph and attention mechanism Pending CN115171871A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210485938.1A CN115171871A (en) 2022-05-06 2022-05-06 Cardiovascular disease prediction method based on knowledge graph and attention mechanism

Publications (1)

Publication Number Publication Date
CN115171871A true CN115171871A (en) 2022-10-11

Family

ID=83483686

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210485938.1A Pending CN115171871A (en) 2022-05-06 2022-05-06 Cardiovascular disease prediction method based on knowledge graph and attention mechanism

Country Status (1)

Country Link
CN (1) CN115171871A (en)

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115563286A (en) * 2022-11-10 2023-01-03 东北农业大学 Knowledge-driven milk cow disease text classification method
CN115563286B (en) * 2022-11-10 2023-12-01 东北农业大学 Knowledge-driven dairy cow disease text classification method
CN117079815A (en) * 2023-08-21 2023-11-17 哈尔滨工业大学 Cardiovascular disease risk prediction model construction method based on graph neural network
CN117594241A (en) * 2024-01-15 2024-02-23 北京邮电大学 Dialysis hypotension prediction method and device based on time sequence knowledge graph neighborhood reasoning
CN117594241B (en) * 2024-01-15 2024-04-30 北京邮电大学 Dialysis hypotension prediction method and device based on time sequence knowledge graph neighborhood reasoning
CN118072958A (en) * 2024-03-22 2024-05-24 首都医科大学宣武医院 Diabetes risk prediction optimization system based on knowledge graph
CN118072958B (en) * 2024-03-22 2024-07-26 首都医科大学宣武医院 Diabetes risk prediction optimization system based on knowledge graph

Similar Documents

Publication Publication Date Title
RU2703679C2 (en) Method and system for supporting medical decision making using mathematical models of presenting patients
US10929420B2 (en) Structured report data from a medical text report
CN113421652B (en) Method for analyzing medical data, method for training model and analyzer
CN115171871A (en) Cardiovascular disease prediction method based on knowledge graph and attention mechanism
Huddar et al. Predicting complications in critical care using heterogeneous clinical data
JP7464800B2 (en) METHOD AND SYSTEM FOR RECOGNITION OF MEDICAL EVENTS UNDER SMALL SAMPLE WEAKLY LABELING CONDITIONS - Patent application
JP2020518050A (en) Learning and applying contextual similarity between entities
Jiang et al. The Research of Clinical Decision Support System Based on Three‐Layer Knowledge Base Model
CN108231146B (en) Deep learning-based medical record model construction method, system and device
CN109360658A (en) A kind of the disease pattern method for digging and device of word-based vector model
RU2752792C1 (en) System for supporting medical decision-making
Baniasadi et al. Two-step imputation and AdaBoost-based classification for early prediction of sepsis on imbalanced clinical data
Liu et al. Knowledge-aware deep dual networks for text-based mortality prediction
Mansouri et al. Predicting hospital length of stay of neonates admitted to the NICU using data mining techniques
Leng et al. Bi-level artificial intelligence model for risk classification of acute respiratory diseases based on Chinese clinical data
Niu et al. Deep multi-modal intermediate fusion of clinical record and time series data in mortality prediction
Chen et al. Imbalanced prediction of emergency department admission using natural language processing and deep neural network
US11809826B2 (en) Assertion detection in multi-labelled clinical text using scope localization
Chen et al. Automatically structuring on Chinese ultrasound report of cerebrovascular diseases via natural language processing
CN116737945B (en) Mapping method for EMR knowledge map of patient
Feng et al. Can Attention Be Used to Explain EHR-Based Mortality Prediction Tasks: A Case Study on Hemorrhagic Stroke
Gupta et al. An overview of clinical decision support system (cdss) as a computational tool and its applications in public health
CN115312186A (en) Auxiliary screening system for diabetic retinopathy
CN114098638A (en) Interpretable dynamic disease severity prediction method
Dhamala et al. Multivariate time-series similarity assessment via unsupervised representation learning and stratified locality sensitive hashing: Application to early acute hypotensive episode detection

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination