CN117672450A - Personalized medicine recommendation method and system based on knowledge graph - Google Patents
- Publication number
- CN117672450A (CN202311660390.0A)
- Authority
- CN
- China
- Prior art keywords
- knowledge
- data
- patient
- graph
- knowledge graph
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02A—TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
- Y02A90/00—Technologies having an indirect contribution to adaptation to climate change
- Y02A90/10—Information and communication technologies [ICT] supporting adaptation to climate change, e.g. for weather forecasting or climate simulation
Landscapes
- Medical Treatment And Welfare Office Work (AREA)
Abstract
The invention is applicable to the technical field of intelligent medical treatment, and provides a personalized medicine recommendation method and system based on a knowledge graph. The method comprises the following steps: obtaining a disease dataset of a patient, the disease dataset comprising diagnostic data, therapy records, and medication records; for each medical code in the diagnostic data, therapy records, and medication records, using a large model to generate knowledge, in the form of triples, about the concept entity of that medical code; clustering the nodes and edges of the knowledge to form a knowledge graph; and constructing a personalized knowledge subgraph for each patient from the diagnosis and treatment data of different patients, combined with the knowledge recommended by the knowledge graph. By combining data-driven and knowledge-driven approaches and introducing knowledge-graph-based decisions, the invention can mine a patient's hidden disease information and obtain more accurate predictions than the current mainstream drug recommendation based on a single information source, thereby realizing accurate and intelligent auxiliary diagnosis.
Description
Technical Field
The invention belongs to the technical field of intelligent medical treatment, and particularly relates to a personalized medicine recommendation method and system based on a knowledge graph.
Background
With the development of artificial intelligence, increasingly mature AI technology is gradually entering the medical field. Intelligent medical treatment applies artificial intelligence to disease diagnosis and treatment: a computer can not only help doctors with tasks such as compiling statistics on pathological examination reports, but can also independently provide services such as medication assistance, triage guidance, and health consultation. The patient's medical data are analyzed and mined using technologies such as big data and deep mining, and the patient's clinical variables and indexes are identified automatically. By "learning" the relevant professional knowledge, the computer simulates a doctor's thinking and diagnostic reasoning and gives a reliable diagnosis and treatment scheme; that is, the computer becomes a brain with medical knowledge and provides auxiliary decision support for the doctor's diagnosis and treatment.
Deep learning technology has made breakthrough progress in the field of disease diagnosis. It aims to build a neural connection structure modeled on the human brain and, when handling practical problems, processes data through multiple processing layers composed of multiple nonlinear transformations.
However, deep learning relies on data-driven modeling and needs to be trained on large-scale samples to generalize well. Besides data quantity, data quality is also particularly important, especially for medical data, where experienced experts are usually required to manually compile "standard answers" in order to improve the accuracy of deep learning predictions. Moreover, because of its black-box mechanism, deep learning is difficult to deploy in medical applications, which place high demands on interpretability. There is therefore a need for improvement.
Disclosure of Invention
The embodiment of the invention aims to provide a personalized medicine recommendation method and system based on a knowledge graph, which can convert a user portrait into a high-dimensional vector, calculate the similarity between the user portrait vector and the vectors in a user feature vector space, and determine symptoms according to the user feature with the highest similarity, so as to assist diagnosis and treatment for the user. An accurate, quick, and complete diagnosis and treatment scheme is thus produced from the user portrait, so that the user can experience accurate disease diagnosis online, providing a supporting basis for telemedicine and intelligent medical treatment.
The aim of the embodiment of the invention is achieved by a personalized medicine recommendation method based on a knowledge graph, which comprises the following steps:
obtaining a disease dataset of a patient, the disease dataset comprising diagnostic data, therapy records, and medication records;
for each medical code in the diagnostic data, therapy records, and medication records, generating, using a large model, knowledge in the form of triples about the concept entity of that medical code; clustering the nodes and edges of the knowledge to form a knowledge graph;
constructing a personalized knowledge subgraph for each patient according to the diagnosis and treatment data of different patients, combined with the knowledge recommended by the knowledge graph, and labeling the data using the time information in the patient's diagnosis and treatment data;
mapping the knowledge subgraph to a word vector space using a text embedding technology to obtain a feature fusion matrix; and inputting the feature fusion matrix into a bidirectional attention graph neural network for prediction to obtain a predicted drug recommendation result.
Another object of the embodiment of the present invention is to provide a personalized medicine recommendation system based on a knowledge graph, the system comprising: a data acquisition module, a large model module, a personalized knowledge graph construction module, and a bidirectional attention graph neural network module;
the data acquisition module is used for acquiring a disease dataset of a patient, wherein the disease dataset comprises diagnosis data, therapy records, and medication records;
the large model module is used for generating, with the large model, knowledge in the form of triples about the concept entity of each medical code in the diagnosis data, therapy records, and medication records, and for clustering the nodes and edges of the knowledge to form a knowledge graph;
the personalized knowledge graph construction module is used for constructing a personalized knowledge subgraph for each patient according to the diagnosis and treatment data of different patients, combined with the knowledge recommended by the knowledge graph, and for labeling the data using the time information in the patient's diagnosis and treatment data;
the bidirectional attention graph neural network module is used for mapping the knowledge subgraph to a word vector space using a text embedding technology to obtain a feature fusion matrix, and for inputting the feature fusion matrix into a bidirectional attention graph neural network for prediction to obtain a predicted drug recommendation result.
The invention also provides a computer device comprising a memory and a processor, wherein the memory stores a computer program which can be run on the processor, and the processor realizes the steps of the personalized medicine recommendation method based on the knowledge graph when executing the computer program.
Compared with the prior art, the personalized medicine recommendation method based on a knowledge graph provided by the embodiment of the invention has the following beneficial effects. The method combines data-driven and knowledge-driven approaches: by drawing not only on the patient's diagnosis and treatment records but also on knowledge-graph-based decisions, it can mine the patient's hidden disease information and obtain more accurate predictions than the current mainstream drug recommendation based on a single information source, thereby realizing accurate and intelligent auxiliary diagnosis. Compared with traditional knowledge-graph-assisted drug recommendation, it uses a knowledge graph generated by a large model, which carries richer and more comprehensive personalized information about the patient. By means of the data mining capability of deep learning, it can provide medication recommendations that better fit the patient's physical condition, with better interpretability and credibility.
Drawings
Fig. 1 is a block diagram of a personalized medicine recommendation system based on a knowledge graph according to an embodiment of the present invention;
Fig. 2 is a flowchart of a personalized medicine recommendation method based on a knowledge graph according to an embodiment of the present invention;
Fig. 3 is a flowchart of disease dataset preprocessing provided in an embodiment of the present invention;
Fig. 4 is a flowchart of the construction of a personalized knowledge subgraph provided by an embodiment of the present invention;
Fig. 5 is a block diagram showing the internal structure of a computer device according to an embodiment of the invention.
Detailed Description
The present invention will be described in further detail with reference to the drawings and examples, in order to make the objects, technical solutions and advantages of the present invention more apparent. It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the scope of the invention.
It should be noted that the knowledge graph is an important technology in the field of artificial intelligence and a technical basis for constructing a computerized medical knowledge brain; it is therefore one of the key technologies of intelligent medicine. After Ledli et al. first applied data models to the clinical medical field, various forms of medical expert assistance devices emerged. The main workflow of these devices is to first build a medical knowledge base from the clinical experience and knowledge of medical professionals, then have experts formulate reasoning rules, and finally, in practical application, perform diagnostic reasoning on the physical examination data input by a user. However, the mechanical and simplistic rule-based reasoning of such devices has clear limitations both in constructing a knowledge base and in diagnostic reasoning over heterogeneous medical data.
The large language model (that is, the large model) has been one of the important research directions in artificial intelligence in recent years. It has strong language understanding and generation capabilities and broad application potential for natural language processing tasks. Through a deep network structure and a self-attention mechanism, it can effectively capture the context information in text and understand the logical relationships and semantics in sentences. Large language models have achieved remarkable results in tasks such as text classification, sentiment analysis, and entity recognition. Their rich corpus information and huge number of parameters form a huge implicit knowledge graph; extracting this knowledge through prompting techniques to construct a knowledge graph has been verified to be effective.
With the development of computer technology, machine learning, and artificial intelligence, more and more researchers have begun to combine knowledge graphs with deep learning so that algorithms offer both reliable performance and interpretability. However, such methods attend only to the simple hierarchical relationships in the knowledge graph and ignore the comprehensive relationships among medical entities, so it is difficult to give a personalized medicine recommendation scheme for an individual patient.
In summary, existing disease prediction systems mainly use a constructed knowledge graph together with deep learning methods to perform drug recommendation, and achieve good results. However, because they attend only to the simple hierarchical relationships in the knowledge graph and ignore the comprehensive relationships among medical entities, they struggle to give a personalized medicine recommendation scheme for an individual patient. Therefore, the embodiment of the invention exploits the strong reasoning capability of a large language model (LLM, referred to as a large model for short): by introducing a knowledge graph extracted by the large model, it can mine finer information about the individual patient, establish a personalized knowledge graph for each patient, and combine it with deep learning to make personalized medicine recommendations.
Fig. 2 is a flowchart of a personalized medicine recommendation method based on a knowledge graph, which may specifically include steps S101 to S107:
S101, acquiring a disease dataset of a patient, wherein the disease dataset comprises diagnosis data, therapy records, and medication records;
S103, for each medical code in the diagnosis data, therapy records, and medication records, generating, using the large model, knowledge in the form of triples about the concept entity of that medical code; clustering the nodes and edges of the knowledge to form a knowledge graph;
S105, constructing a personalized knowledge subgraph for each patient according to the diagnosis and treatment data of different patients, combined with the knowledge recommended by the knowledge graph, and labeling the data using the time information in the patient's diagnosis and treatment data;
S107, mapping the knowledge subgraph to a word vector space using a text embedding technology to obtain a feature fusion matrix; inputting the feature fusion matrix into a bidirectional attention graph neural network for prediction to obtain a predicted drug recommendation result.
In this embodiment, the method combines data-driven and knowledge-driven approaches: by drawing not only on the patient's diagnosis and treatment records but also on knowledge-graph-based decisions, it can mine the patient's hidden disease information and obtain more accurate predictions than the current mainstream drug recommendation based on a single information source, thereby realizing accurate and intelligent auxiliary diagnosis. Compared with traditional knowledge-graph-assisted drug recommendation, it uses a knowledge graph generated by a large model, which carries richer and more comprehensive personalized information about the patient. By means of the data mining capability of deep learning, it can provide medication recommendations that better fit the patient's physical condition, with better interpretability and credibility.
In step S101, in some scenarios not all of the collected diagnostic data, therapy records, and medication records are usable, so the method further comprises: performing data cleaning after the disease dataset is formed;
after the data are cleaned, extracting usable entity data with a named entity recognition technology and constructing the concept entity of each medical code, so that the medical code can be represented conveniently and the large model can more readily generate knowledge, in the form of triples, about the concept entity of the medical code. In generating the triple-form knowledge, the large model performs operations such as feature extraction, classification, and knowledge structuring on the input data through deep learning; the large model itself is an application of the prior art and is not described in detail here.
In this embodiment, for each medical code in the diagnostic data, therapy records, and medication records, the large model generates knowledge in the form of triples about the concept entity of that medical code. In this step, the large model can deeply mine the comprehensive relationships among the entities (or medical entities) in the diagnostic data, therapy records, and medication records, so that the constructed knowledge graph does not ignore these relationships and a personalized medicine recommendation scheme can be given more readily for an individual patient. Introducing the knowledge graph extracted by the large model makes it possible to mine finer information about the individual patient and give a more reasonable drug recommendation scheme, while also providing reliable performance and interpretability.
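The triple-generation step can be illustrated with a minimal Python sketch. It only shows how a prompt might be assembled and how a reply could be parsed into triples; the prompt wording, the `(head, relation, tail)` line format, and the names `build_prompt` and `parse_triples` are assumptions made for illustration, and the reply is a hard-coded stand-in rather than the output of any real model.

```python
from typing import List, Tuple

Triple = Tuple[str, str, str]


def build_prompt(concept: str, examples: List[Triple]) -> str:
    """Compose an instruction, few-shot examples, and the query concept."""
    lines = [
        "Instruction: list medical knowledge about the concept as triples,",
        "one per line, formatted as (head, relation, tail).",
        "Examples:",
    ]
    lines += [f"({h}, {r}, {t})" for h, r, t in examples]
    lines.append(f"Concept: {concept}")
    return "\n".join(lines)


def parse_triples(response: str) -> List[Triple]:
    """Keep only well-formed '(head, relation, tail)' lines of a model reply."""
    triples: List[Triple] = []
    for line in response.splitlines():
        line = line.strip()
        if not (line.startswith("(") and line.endswith(")")):
            continue
        parts = [p.strip() for p in line[1:-1].split(",")]
        if len(parts) == 3 and all(parts):
            triples.append((parts[0], parts[1], parts[2]))
    return triples


# Stand-in for a model reply; no real model is called in this sketch.
fake_response = "\n".join([
    "(acute gastroenteritis, has symptom, abdominal pain)",
    "(acute gastroenteritis, has symptom, nausea)",
    "this line is not a triple",
])
prompt = build_prompt("acute gastroenteritis",
                      [("influenza", "has symptom", "fever")])
triples = parse_triples(fake_response)
print(prompt)
print(triples)
```

Discarding malformed lines keeps free-form model text out of the graph; a real system would likely need stricter validation of entity and relation names.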
In one example of this embodiment, the patient's disease dataset may be acquired offline or online. Taking online acquisition as an example: a dialogue is held with the patient through a multi-turn dialogue interface, and the patient's utterances are extracted to obtain one or more of the diagnosis data, therapy records, and medication records;
for example, if the user portrait derived from the dialogue is (abdominal pain, diarrhea, nausea), then word segmentation comparison or similarity calculation against the name features in a preset medical dictionary (abdominal pain, diarrhea, nausea, vomiting => acute gastroenteritis) finds that the closest disease is acute gastroenteritis, so acute gastroenteritis can be judged to be highly probable.
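The dictionary lookup in this example can be sketched in Python as follows. The text names only "similarity calculation" without fixing a measure, so the Jaccard similarity used here is an assumption, as are the tiny `MEDICAL_DICTIONARY` and the function names.

```python
from typing import Dict, Set, Tuple

# Hypothetical miniature medical dictionary: disease -> characteristic symptoms.
MEDICAL_DICTIONARY: Dict[str, Set[str]] = {
    "acute gastroenteritis": {"abdominal pain", "diarrhea", "nausea", "vomiting"},
    "migraine": {"headache", "nausea", "light sensitivity"},
}


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Size of the intersection divided by size of the union."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def closest_disease(symptoms: Set[str],
                    dictionary: Dict[str, Set[str]]) -> Tuple[str, float]:
    """Return the dictionary entry most similar to the reported symptoms."""
    scored = ((name, jaccard(symptoms, feats)) for name, feats in dictionary.items())
    return max(scored, key=lambda pair: pair[1])


portrait = {"abdominal pain", "diarrhea", "nausea"}
disease, score = closest_disease(portrait, MEDICAL_DICTIONARY)
print(disease, score)
```

With this toy dictionary the portrait shares three of the four listed symptoms of acute gastroenteritis and only one symptom with the other entry, so acute gastroenteritis is selected.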
In this step, the patient's diagnostic data include, but are not limited to, the patient's basic information, i.e. personal information such as height, weight, disease history, and family history, as well as the patient's disease and symptom information.
In some examples, the patient's diagnostic data may include the patient's own assessment of their symptoms, for example: "My abdomen hurts badly today and I also feel nauseous; it may be acute gastroenteritis". Deep learning can also extract the disease name as a reference, or even as an important basis for the diagnostic output.
As shown in Fig. 3, in one example of the present embodiment, the method further includes:
after acquiring the disease dataset of the patient, performing data preprocessing, which specifically includes:
S202, performing word embedding on the text data and structured data in the disease dataset through a word embedding model, mapping them into a condition, procedure, and medication record representation matrix X;
S204, increasing or decreasing the condition, procedure, and medication record representation matrix X by a certain data magnitude so as to meet the data generation requirement of the large model.
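A minimal sketch of S202 and S204 in Python follows. Two things are assumptions rather than the described method: `toy_embed` is a deterministic hash-based stand-in for a trained word embedding model, and "increasing or decreasing by a certain data magnitude" is read here as zero-padding or truncating the rows of X to a target size, which is only one possible interpretation.

```python
import hashlib
from typing import List

Matrix = List[List[float]]


def toy_embed(token: str, dim: int = 8) -> List[float]:
    """Deterministic stand-in for a trained word embedding model (dim <= 32)."""
    digest = hashlib.sha256(token.encode("utf-8")).digest()
    return [digest[i] / 127.5 - 1.0 for i in range(dim)]


def build_matrix(records: List[str], dim: int = 8) -> Matrix:
    """S202: one embedding row per record of the disease dataset."""
    return [toy_embed(r, dim) for r in records]


def fit_rows(X: Matrix, target_rows: int) -> Matrix:
    """S204 (assumed reading): truncate or zero-pad X to target_rows rows."""
    dim = len(X[0])
    kept = X[:target_rows]
    return kept + [[0.0] * dim for _ in range(target_rows - len(kept))]


records = [
    "condition: acute gastroenteritis",
    "procedure: intravenous rehydration",
    "medication: oral rehydration salts",
]
X = build_matrix(records)
X_padded = fit_rows(X, 5)      # enlarged to 5 rows
X_truncated = fit_rows(X, 2)   # reduced to 2 rows
print(len(X), len(X_padded), len(X_truncated))
```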
As shown in Fig. 4, in one example of the present embodiment, the step of generating, for each medical code in the diagnosis data, therapy records, and medication records, knowledge in the form of triples about the concept entity of that medical code using the large model, and clustering the nodes and edges of the knowledge to form a knowledge graph, specifically comprises the following steps:
word embedding is performed on the text data and structured data in the disease dataset through a word embedding model, mapping them into the condition, procedure, and medication record representation matrix X; then step S401 is performed: based on the representation matrix X, the large model is guided by instructions, examples, and prompts to generate knowledge in the form of triples, and the generated triple-form knowledge is fused into a knowledge graph;
S402, clustering similar nodes and edges in the knowledge graph on the global graph $G$ corresponding to all medical codes using a hierarchical clustering algorithm to obtain a new global graph $G'$;
wherein the global graph $G'$ is used to create new knowledge subgraphs, denoted $G'_e$;
S403, creating a new knowledge subgraph for each medical code; the knowledge subgraphs of all medical codes form the knowledge graph.
Specifically, $G'_e = (V'_e, \mathcal{E}'_e) \subseteq G'$, where $V'_e$ denotes the nodes and $\mathcal{E}'_e$ denotes the edges of the graph;
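The clustering in S402 can be sketched in Python as below. This is an illustration under stated assumptions, not the described implementation: it uses single-linkage agglomerative clustering with a stopping threshold, measures distance with a string-similarity ratio from the standard library instead of embedding similarity, and merges only nodes (relation names could be treated the same way). The names `distance`, `agglomerative`, and `cluster_graph` and the threshold value are hypothetical.

```python
from difflib import SequenceMatcher
from typing import Dict, List, Set, Tuple

Triple = Tuple[str, str, str]


def distance(a: str, b: str) -> float:
    """Toy distance between node names; a real system would compare embeddings."""
    return 1.0 - SequenceMatcher(None, a.lower(), b.lower()).ratio()


def agglomerative(names: List[str], threshold: float) -> List[List[str]]:
    """Single-linkage agglomerative clustering that stops at `threshold`."""
    clusters = [[n] for n in names]
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = min(distance(a, b) for a in clusters[i] for b in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        if best[0] > threshold:
            break  # the closest pair is too far apart; stop merging
        _, i, j = best
        clusters[i] += clusters.pop(j)
    return clusters


def cluster_graph(triples: List[Triple], threshold: float = 0.3) -> Set[Triple]:
    """Map every node to its cluster representative, giving the graph G'."""
    nodes = sorted({x for h, _, t in triples for x in (h, t)})
    rep: Dict[str, str] = {
        n: cluster[0] for cluster in agglomerative(nodes, threshold) for n in cluster
    }
    return {(rep[h], r, rep[t]) for h, r, t in triples}


G = [
    ("acute gastroenteritis", "has symptom", "abdominal pain"),
    ("Acute Gastroenteritis", "has symptom", "nausea"),
    ("acute gastroenteritis", "treated with", "oral rehydration salts"),
]
G_prime = cluster_graph(G)
print(sorted(G_prime))
```

Here the two spellings of the disease name collapse into one node, so all three triples in the clustered graph share a single head.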
In an example of this embodiment, the step of constructing a personalized knowledge subgraph for each patient according to the diagnosis and treatment data of different patients, combined with the knowledge recommended by the knowledge graph, specifically includes:
for each patient, composing the patient's personalized knowledge subgraph by merging the clustered knowledge subgraphs of the patient's medical codes;
creating a patient node $P$ and connecting it to the nodes in the knowledge subgraph that correspond directly to the patient's diagnosis and treatment records;
the personalized knowledge subgraph of the patient is expressed as: [equation not reproduced in the source text]
where $\{e_1, e_2, \ldots, e_\omega\}$ are the medical codes of the patient, and $V'_e$ and $\mathcal{E}'_e$ denote the nodes and the edges in the graph of the patient's diagnosis and treatment records;
if the patient is represented as a sequence of $J$ visits, the knowledge subgraph of patient $i$ is expressed as: [equation not reproduced in the source text]
the visit sequence is $\{x_1, x_2, \ldots, x_J\}$, where the subscript pat denotes the set for the patient's sequence of diagnosis and treatment records, and $1 \le j \le J$.
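The construction above can be sketched in Python as follows: the subgraphs of a patient's medical codes are merged, a patient node is added and linked to the nodes of the recorded codes, and one graph is kept per visit in chronological order. The code identifiers, the `"has record"` relation name, and the dictionaries `CODE_SUBGRAPHS` and `CODE_NODE` are hypothetical illustration data.

```python
from typing import Dict, List, Set, Tuple

Triple = Tuple[str, str, str]
Graph = Tuple[Set[str], Set[Triple]]  # (nodes, edges)

# Hypothetical clustered subgraphs G'_e, keyed by medical code e.
CODE_SUBGRAPHS: Dict[str, Set[Triple]] = {
    "DX:gastroenteritis": {("gastroenteritis", "has symptom", "nausea")},
    "RX:oral rehydration": {("oral rehydration", "treats", "dehydration")},
}
# Hypothetical mapping from each code to the graph node that represents it.
CODE_NODE: Dict[str, str] = {
    "DX:gastroenteritis": "gastroenteritis",
    "RX:oral rehydration": "oral rehydration",
}


def visit_subgraph(patient: str, codes: List[str]) -> Graph:
    """Merge the subgraphs of one visit's codes and attach the patient node."""
    edges: Set[Triple] = set()
    for code in codes:
        edges |= CODE_SUBGRAPHS.get(code, set())
        edges.add((patient, "has record", CODE_NODE[code]))
    nodes = {patient} | {x for h, _, t in edges for x in (h, t)}
    return nodes, edges


def patient_graph(patient: str, visits: List[List[str]]) -> List[Graph]:
    """One (nodes, edges) pair per visit, kept in chronological order."""
    return [visit_subgraph(patient, codes) for codes in visits]


visits = [["DX:gastroenteritis"],
          ["DX:gastroenteritis", "RX:oral rehydration"]]
graphs = patient_graph("P", visits)
for j, (nodes, edges) in enumerate(graphs, start=1):
    print(j, sorted(nodes))
```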
In one example of this embodiment, the step of mapping the knowledge subgraph to a word vector space using a text embedding technology to obtain a feature fusion matrix, and inputting the feature fusion matrix into a bidirectional attention graph neural network for prediction to obtain a predicted drug recommendation result, specifically comprises the following steps:
mapping the nodes and edges in the knowledge subgraph to a word vector space using a text embedding technology to obtain the word embedding vectors corresponding to the nodes and to the edges, respectively;
performing dimension reduction on the obtained word embedding vectors of the nodes and of the edges;
calculating attention weights separately for the knowledge subgraph of each visit of the patient and for the nodes in the knowledge subgraph of each visit;
aggregating neighboring nodes across the knowledge subgraphs of all visits to update the node embeddings;
passing each node through the configured multiple attention layers to obtain the last-layer representation of each node;
configuring the multi-layer perceptron of the bidirectional attention graph neural network and inputting the last-layer representation of each node to obtain the deep feature $z_{\text{joint}}$;
inputting the deep feature $z_{\text{joint}}$ into a sigmoid function to obtain the probability output for each drug and give the corresponding drug code, thereby obtaining the predicted drug recommendation result.
In one embodiment, dimension reduction is performed on the obtained word embedding vectors of the nodes and of the edges, which can improve model performance and alleviate the sparsity problem. In a concrete implementation, the word embedding vectors corresponding to the nodes and to the edges are respectively expressed as: [equation not reproduced in the source text]
and the dimension reduction is: [equation not reproduced in the source text], where $W_v$, $b_v$ are learnable weight vectors (likewise for the similarly named parameters below);
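Since the dimension-reduction formula itself is not legible in the text, the following Python sketch assumes the common linear form $h' = W_v h + b_v$; that form, the dimensions, and the random initial values are all assumptions made for illustration.

```python
import random
from typing import List

Vector = List[float]
Matrix = List[List[float]]


def linear(W: Matrix, b: Vector, h: Vector) -> Vector:
    """Affine projection W h + b (assumed form of the dimension reduction)."""
    return [sum(w * x for w, x in zip(row, h)) + b_i for row, b_i in zip(W, b)]


rng = random.Random(0)
in_dim, out_dim = 6, 3  # hypothetical embedding sizes before and after reduction
W_v = [[rng.uniform(-0.1, 0.1) for _ in range(in_dim)] for _ in range(out_dim)]
b_v = [0.0] * out_dim

h_node = [rng.uniform(-1.0, 1.0) for _ in range(in_dim)]  # a node's word embedding
h_reduced = linear(W_v, b_v, h_node)
print(len(h_node), "->", len(h_reduced))
```

In training, `W_v` and `b_v` would be learned together with the rest of the network rather than fixed at random values.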
the attention weights are then calculated separately for the knowledge subgraph of each visit of the patient and for the nodes in the knowledge subgraph of each visit. For example, the node-level attention weight of the $k$-th node in the knowledge subgraph of the $j$-th visit of patient $i$ is denoted $\alpha_{i,j,k}$, and the visit-level attention weight of the $j$-th visit of patient $i$ is denoted $\beta_{i,j}$. The formulas are as follows:
$\alpha_{i,j,1}, \ldots, \alpha_{i,j,M} = \mathrm{Softmax}(W_a g_{i,j} + b_a)$, where Softmax is the function that maps to the probability values $\alpha_{i,j,1}, \ldots, \alpha_{i,j,M}$;
$\beta_{i,1}, \ldots, \beta_{i,N} = \lambda^{\top} \mathrm{Tanh}(w_\beta G_i + b_\beta)$, where Tanh is the activation function;
where $\lambda = [\lambda_1, \ldots, \lambda_N]$, $g_{i,j} \in \mathbb{R}^M$ is the multi-hot encoding of the visit-level subgraph, and $\lambda \in \mathbb{R}^N$ is the corresponding decay coefficient vector. $G_i$ denotes the graph of patient $i$, and $N$ is the maximum number of visits across all patients.
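The two attention formulas can be traced with a small Python sketch. It assumes that $G_i$ is given as one multi-hot row per visit, that $w_\beta$ is a vector applied to each row, and that the decay vector $\lambda$ scales the result visit by visit; the numeric values of all parameters are hypothetical.

```python
import math
from typing import List

Vector = List[float]
Matrix = List[List[float]]


def softmax(xs: Vector) -> Vector:
    """Numerically stable softmax."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def node_attention(W_a: Matrix, b_a: Vector, g_ij: Vector) -> Vector:
    """alpha_{i,j,1..M} = Softmax(W_a g_{i,j} + b_a)."""
    scores = [sum(w * g for w, g in zip(row, g_ij)) + b for row, b in zip(W_a, b_a)]
    return softmax(scores)


def visit_attention(lam: Vector, w_beta: Vector, b_beta: float, G_i: Matrix) -> Vector:
    """beta_{i,j} = lambda_j * tanh(w_beta . G_i[j] + b_beta) for each visit j."""
    return [l * math.tanh(sum(w * g for w, g in zip(w_beta, row)) + b_beta)
            for l, row in zip(lam, G_i)]


# Two visits (N = 2) over three graph nodes (M = 3), multi-hot encoded.
G_i = [[1.0, 0.0, 1.0],
       [0.0, 1.0, 1.0]]
W_a = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]  # hypothetical weights
b_a = [0.0, 0.0, 0.0]
lam = [0.5, 1.0]             # hypothetical decay: the older visit counts less
w_beta, b_beta = [0.2, 0.2, 0.2], 0.0

alpha = node_attention(W_a, b_a, G_i[0])   # node weights for the first visit
beta = visit_attention(lam, w_beta, b_beta, G_i)
print(alpha, beta)
```

The node weights of a visit sum to one, while the visit weights are scaled by the decay coefficients so that, in this example, the more recent visit receives the larger weight.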
In this example, the node embeddings are updated by aggregating neighboring nodes across the knowledge subgraphs of all visits: [equation not reproduced in the source text]
where ReLU denotes the rectified linear unit activation function, and the Aggregate function captures the contributions of the attention-weighted nodes and edges. The attention layer thus integrates the features of edges and nodes, enabling the model to learn a rich representation of the patient's disease data. After the configured multiple attention layers, the last-layer ($L$) representation of each node is obtained for use in subsequent processing.
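Because the update formula is not legible in the text, the Python sketch below shows only one plausible form consistent with the description: each node sums the attention-weighted embeddings of itself and its neighbors, adds a weighted edge embedding, and applies a linear map followed by ReLU. The exact form, the single-visit simplification with a scalar `beta`, and all parameter names and values are assumptions.

```python
from typing import Dict, List, Tuple

Vector = List[float]


def relu(xs: Vector) -> Vector:
    return [max(0.0, x) for x in xs]


def aggregate(k: int, h: Dict[int, Vector], neighbors: Dict[int, List[int]],
              alpha: Dict[int, float], beta: float,
              edge_h: Dict[Tuple[int, int], Vector], w_r: float) -> Vector:
    """Sum attention-weighted node embeddings plus weighted edge embeddings."""
    dim = len(h[k])
    total = [0.0] * dim
    for k2 in neighbors[k] + [k]:          # neighbors and the node itself
        e = edge_h.get((k, k2), [0.0] * dim)
        for d in range(dim):
            total[d] += alpha[k2] * beta * h[k2][d] + w_r * e[d]
    return total


def update(k, h, neighbors, alpha, beta, edge_h, w_r, W, b) -> Vector:
    """Assumed layer update: ReLU(W * Aggregate(...) + b)."""
    agg = aggregate(k, h, neighbors, alpha, beta, edge_h, w_r)
    return relu([sum(w * a for w, a in zip(row, agg)) + b_i
                 for row, b_i in zip(W, b)])


h = {0: [1.0, 0.0], 1: [0.0, 2.0]}          # current node embeddings
neighbors = {0: [1], 1: [0]}
alpha = {0: 0.5, 1: 0.5}                    # node-level attention weights
beta = 1.0                                  # visit-level attention weight
edge_h = {(0, 1): [1.0, -4.0]}              # embedding of the edge 0 -> 1
W, b = [[1.0, 0.0], [0.0, 1.0]], [0.0, 0.0]

h0_next = update(0, h, neighbors, alpha, beta, edge_h, 0.5, W, b)
print(h0_next)
```

Stacking several such layers and keeping the output of the last one corresponds to the last-layer node representation mentioned in the text.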
In one example of this embodiment, in order to use both the global information of the patient's knowledge graph and the locally accurate diagnostic information corresponding to the electronic medical record, the multi-layer perceptron (MLP) of the bidirectional attention graph neural network is designed to satisfy: [equation not reproduced in the source text]
the multi-layer perceptron integrates the global information of the knowledge graph with the locally accurate diagnostic information, where $J$ is the number of visits of patient $i$ and $K_j$ is the number of nodes in visit $j$.
One term of the formula denotes the patient graph embedding, obtained by averaging the embeddings of all nodes in all visit knowledge subgraphs of patient $i$; another denotes the patient node embedding, calculated by averaging the embeddings of the nodes linked to the patient's corresponding medical codes. $I_{i,j,k} \in \{0, 1\}$ is a binary label indicating whether node $v_{i,j,k}$ corresponds to a medical code of patient $i$. MEAN is the mean function and MLP is the multi-layer perceptron function.
After the deep feature $z_{\text{joint}}$ is obtained from the multi-layer perceptron, it is input into a sigmoid function: [equation not reproduced in the source text]
to obtain the final probability output $P$ for each drug and give the corresponding drug code $e$.
The multi-layer perceptron can utilize global information of the patient knowledge graph and local accurate diagnosis information corresponding to the electronic medical record.
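The final step, from deep feature to recommended drug codes, can be sketched as below. The drug codes are placeholders and the 0.5 decision threshold is an assumption; the text only states that a sigmoid produces a probability for each drug.

```python
import math

def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)

def recommend(z_joint, drug_codes, threshold=0.5):
    """Map each component of z_joint to a probability and keep the drugs
    whose probability reaches the (assumed) threshold."""
    probs = [sigmoid(z) for z in z_joint]
    return [(code, p) for code, p in zip(drug_codes, probs) if p >= threshold]

# One logit per candidate drug; codes are hypothetical placeholders.
z_joint = [2.0, -1.0, 0.0]
drug_codes = ["DRUG_A", "DRUG_B", "DRUG_C"]
picks = recommend(z_joint, drug_codes)
```

Because each drug gets its own independent probability, several drugs can be recommended for the same patient, which matches a multi-label setting.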
In another embodiment, as shown in FIG. 1, a personalized medicine recommendation system based on a knowledge graph comprises a data acquisition module, a large model module, a personalized knowledge graph construction module and a bidirectional attention graph neural network module;
the data acquisition module is used for acquiring a disease data set of a patient, wherein the disease data set comprises diagnosis data, therapy records and medication records;
the large model module is used for generating, by means of the large model, knowledge in the form of triples around the concept entity of each medical code in the diagnosis data, the therapy records and the medication records, and for clustering the nodes and edges of the knowledge to form a knowledge graph;
the personalized knowledge graph construction module is used for constructing a personalized knowledge subgraph for each patient according to the diagnosis and treatment data of different patients, in combination with the knowledge recommended by the knowledge graph, and for marking the data with the time information in the patients' diagnosis and treatment data;
the bidirectional attention graph neural network module is used for mapping the knowledge subgraph to a word vector space by using a text embedding technology to obtain a feature fusion matrix, and for inputting the feature fusion matrix into the bidirectional attention graph neural network for prediction to obtain a predicted drug recommendation result.
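The data flow between the knowledge generation and subgraph construction modules listed above can be sketched as follows. The triples stand in for what a large model might return and are hard-coded here; the medical codes, relation names and the `has_code` edge connecting the patient node are hypothetical.

```python
# Hypothetical triples, keyed by medical code, standing in for large model output.
TRIPLES_BY_CODE = {
    "DX_DIABETES": [("diabetes", "treated_by", "metformin"),
                    ("diabetes", "may_cause", "neuropathy")],
    "RX_METFORMIN": [("metformin", "lowers", "blood glucose")],
}

def build_knowledge_graph(triples_by_code):
    """Merge the per-code triples into one de-duplicated set of edges."""
    graph = set()
    for triples in triples_by_code.values():
        graph.update(triples)
    return graph

def build_patient_subgraph(patient_id, visits, triples_by_code):
    """Collect the knowledge of every code in the patient's record.
    Each edge is tagged with the time of the visit it came from, and a
    patient node is linked to each of the patient's own codes."""
    edges = []
    for visit_time, codes in sorted(visits, key=lambda v: v[0]):
        for code in codes:
            edges.append((patient_id, "has_code", code, visit_time))
            for head, relation, tail in triples_by_code.get(code, []):
                edges.append((head, relation, tail, visit_time))
    return edges

graph = build_knowledge_graph(TRIPLES_BY_CODE)
visits = [("2023-03-10", ["RX_METFORMIN"]), ("2023-01-05", ["DX_DIABETES"])]
edges = build_patient_subgraph("P1", visits, TRIPLES_BY_CODE)
```

Keeping the visit time on every edge is one simple way to realize the time-based data marking; other encodings, such as one subgraph per visit, would serve equally well.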
In one example of this embodiment, the data acquisition module includes a diagnosis record module, a surgical therapy record module, and a medication record module.
The diagnosis record module is used for acquiring the diagnosis data of a patient; the surgical therapy record module is used for acquiring the therapy records of the patient; and the medication record module is used for acquiring the medication records of the patient.
In another embodiment, a computer device comprises a memory and a processor, the memory storing a computer program executable on the processor, the processor implementing steps S101-S107 of the personalized medicine recommendation method based on a knowledge graph as described above when executing the computer program:
S101, acquiring a disease data set of a patient, wherein the disease data set comprises diagnosis data, therapy records and medication records;
S103, generating, by using the large model, knowledge in the form of triples around the concept entity of each medical code in the diagnosis data, the therapy records and the medication records; clustering the nodes and edges of the knowledge to form a knowledge graph;
S105, constructing a personalized knowledge subgraph for each patient according to the diagnosis and treatment data of different patients, in combination with the knowledge recommended by the knowledge graph, and marking the data with the time information in the patients' diagnosis and treatment data;
S107, mapping the knowledge subgraph to a word vector space by using a text embedding technology to obtain a feature fusion matrix; and inputting the feature fusion matrix into a bidirectional attention graph neural network for prediction to obtain a predicted drug recommendation result.
In this step, the text information represented by the knowledge subgraph can be converted into vectors by word2vec and mapped to the word vector space to obtain the feature fusion matrix;
wherein the bidirectional attention graph neural network includes a convolution layer, a pooling layer, a fully connected layer and an output layer;
the convolution layer extracts all local feature vectors from the feature fusion matrix;
the pooling layer applies a max pooling operation to all local feature vectors output by the convolution layer, selecting the maximum value to represent each local feature;
the fully connected layer is replaced by the multi-layer perceptron, which yields the deep feature $z_{joint}$;
the output layer converts the output of the fully connected layer into a probability distribution and determines the drug recommendation result.
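The layer sequence just described (convolution, max pooling, perceptron, output) can be sketched in plain Python as follows. The kernel shapes, the weights and the use of a single dense layer in place of the multi-layer perceptron are assumptions made to keep the example short; the feature fusion matrix is a toy 3-by-2 matrix rather than real word2vec output.

```python
import math

def conv1d(matrix, kernel):
    """Slide a kernel (k rows by d columns) down the rows of the matrix
    and return one scalar response per window position."""
    k = len(kernel)
    out = []
    for start in range(len(matrix) - k + 1):
        window = matrix[start:start + k]
        out.append(sum(w * x
                       for krow, mrow in zip(kernel, window)
                       for w, x in zip(krow, mrow)))
    return out

def max_pool(responses):
    """Keep only the strongest response of one filter."""
    return max(responses)

def forward(matrix, kernels, w_out, b_out):
    """Convolution -> max pooling -> dense layer -> sigmoid output."""
    pooled = [max_pool(conv1d(matrix, kern)) for kern in kernels]
    logits = [sum(w * p for w, p in zip(row, pooled)) + b
              for row, b in zip(w_out, b_out)]
    probs = [1.0 / (1.0 + math.exp(-z)) for z in logits]
    return pooled, probs

# Toy feature fusion matrix: three embedded items, two dimensions each.
matrix = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
kernels = [[[1.0, 0.0], [0.0, 1.0]],
           [[0.0, 1.0], [1.0, 0.0]]]
w_out = [[1.0, -1.0], [0.5, 0.5]]
b_out = [0.0, 0.0]
pooled, probs = forward(matrix, kernels, w_out, b_out)
```

Each entry of `probs` would correspond to one candidate drug; the filter count and output size are free design choices in such a network.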
The embodiments of the invention provide a personalized medicine recommendation method based on a knowledge graph and, on the basis of that method, a personalized medicine recommendation system based on a knowledge graph. The method fuses data-driven and knowledge-driven approaches: by combining the patient's diagnosis and treatment records with knowledge-based decisions drawn from the knowledge graph, it can mine the patient's latent disease information. Compared with current mainstream drug recommendation from a single information source, it can obtain more accurate prediction results and thereby provide accurate, intelligent auxiliary diagnosis. Compared with traditional knowledge-graph-based auxiliary drug recommendation, the knowledge graph generated by the large model carries richer and more comprehensive personalized information about the patient. Drawing on the data mining capability of deep learning, the method can provide medication recommendations that better fit each patient's physical condition, with better interpretability and credibility.
FIG. 5 illustrates an internal block diagram of a computer device in one embodiment. The computer device includes a processor, a memory, a network interface, an input device, and a display screen connected by a system bus. The memory includes a nonvolatile storage medium and an internal memory. The non-volatile storage medium of the computer device stores an operating system, and may also store a computer program that, when executed by the processor, may cause the processor to implement a personalized medicine recommendation method based on a knowledge graph. The internal memory may also store a computer program which, when executed by the processor, causes the processor to perform a personalized medicine recommendation method based on the knowledge graph. The display screen of the computer equipment can be a liquid crystal display screen or an electronic ink display screen, the input device of the computer equipment can be a touch layer covered on the display screen, can also be keys, a track ball or a touch pad arranged on the shell of the computer equipment, and can also be an external keyboard, a touch pad or a mouse and the like.
It should be understood that, although the steps in the flowcharts of the embodiments of the present invention are shown in sequence as indicated by the arrows, these steps are not necessarily performed in that sequence. Unless explicitly stated herein, the execution of the steps is not strictly limited to this order, and the steps may be executed in other orders. Moreover, at least some of the steps in the various embodiments may include multiple sub-steps or stages, which are not necessarily performed at the same time but may be performed at different times, and which are not necessarily performed sequentially but may be performed in turn or alternately with other steps or with at least a portion of the sub-steps or stages of other steps.
Those skilled in the art will appreciate that all or part of the processes in the methods of the above embodiments may be implemented by a computer program instructing the relevant hardware. The program may be stored in a non-volatile computer-readable storage medium and, when executed, may include the processes of the method embodiments described above. Any reference to memory, storage, database, or other medium used in the various embodiments provided herein may include non-volatile and/or volatile memory. Non-volatile memory can include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory. Volatile memory can include random access memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in a variety of forms, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM), and Rambus dynamic RAM (RDRAM), among others.
The technical features of the above embodiments may be combined arbitrarily. For brevity, not all possible combinations of these technical features are described; however, as long as a combination of technical features involves no contradiction, it should be considered to fall within the scope of this description.
The foregoing examples illustrate only a few embodiments of the invention and are described in detail herein without thereby limiting the scope of the invention. It should be noted that it will be apparent to those skilled in the art that several variations and modifications can be made without departing from the spirit of the invention, which are all within the scope of the invention. Accordingly, the scope of protection of the present invention is to be determined by the appended claims.
The foregoing description of the preferred embodiments of the invention is not intended to be limiting, but rather is intended to cover all modifications, equivalents, and alternatives falling within the spirit and principles of the invention.
Claims (8)
1. A personalized medicine recommendation method based on a knowledge graph, characterized by comprising the following steps:
obtaining a disease dataset of a patient, the disease dataset comprising diagnostic data, therapy records, and medication records;
using a large model to generate, for each medical code in the diagnostic data, therapy records and medication records, knowledge in the form of triples around the concept entity of that medical code; clustering the nodes and edges of the knowledge to form a knowledge graph;
constructing a personalized knowledge subgraph for each patient according to the diagnosis and treatment data of different patients, in combination with the knowledge recommended by the knowledge graph, and marking the data with the time information in the patients' diagnosis and treatment data;
mapping the knowledge subgraph to a word vector space by using a text embedding technology to obtain a feature fusion matrix; and inputting the feature fusion matrix into a bidirectional attention graph neural network for prediction to obtain a predicted drug recommendation result.
2. The knowledge-graph-based personalized medicine recommendation method according to claim 1, wherein the method further comprises:
after acquiring the disease dataset of a patient, performing data preprocessing, specifically including:
performing word embedding on the text data and structured data in the disease dataset through a word embedding model, and mapping them into a condition, procedure and medication record representation matrix X;
and increasing or decreasing the condition, procedure and medication record representation matrix X by a certain data magnitude, so as to meet the data generation requirements of the large model.
3. The knowledge-graph-based personalized medicine recommendation method according to claim 2, wherein generating, by using the large model, knowledge in the form of triples around the concept entity of each medical code in the diagnostic data, therapy records and medication records, and clustering the nodes and edges of the knowledge to form a knowledge graph, specifically comprises the following steps:
using the condition, procedure and medication record representation matrix X, guiding the large model through instructions, examples and prompts to generate knowledge in the form of triples; fusing the generated knowledge in the form of triples into a knowledge graph;
clustering similar nodes and edges in the knowledge graph on the global graph G corresponding to all medical codes by using a hierarchical clustering algorithm to obtain a new global graph G';
creating a new knowledge subgraph for each medical code; knowledge subgraphs of all medical codes form the knowledge graph.
4. The personalized medicine recommendation method based on the knowledge graph according to claim 1, wherein the step of constructing a personalized knowledge subgraph for each patient according to diagnosis and treatment data of different patients by combining knowledge recommended by the knowledge graph specifically comprises the following steps:
for each patient, composing its personalized knowledge subgraph by merging the clustered knowledge of its medical codes;
creating a patient node P and connecting the patient node P to a node corresponding to a diagnosis and treatment record directly corresponding to the knowledge subgraph;
the personalized knowledge sub-graph of the patient is expressed as:
wherein $\{e_1, e_2, \ldots, e_\omega\}$ are the medical codes directly related to the patient, and $\omega$ is the number of medical codes of the patient;
if the patient is represented as a J-visit sequence, then the knowledge sub-graph for patient i is represented as:
the visit sequence is $\{x_1, x_2, \ldots, x_J\}$, where $1 \le j \le J$.
5. The personalized medicine recommendation method based on the knowledge graph according to claim 1, wherein mapping the knowledge subgraph to a word vector space by using a text embedding technology to obtain a feature fusion matrix, and inputting the feature fusion matrix into a bidirectional attention graph neural network for prediction to obtain a predicted drug recommendation result, specifically comprises the following steps:
mapping nodes and edges in the knowledge subgraph to a word vector space by using a text embedding technology to obtain, respectively, word embedding vectors corresponding to the nodes and word embedding vectors corresponding to the edges;
performing dimension reduction on the obtained word embedding vectors corresponding to the nodes and to the edges;
computing, for each visit of a patient, the corresponding knowledge subgraph and the nodes in that knowledge subgraph;
aggregating neighboring nodes across all visited knowledge subgraphs to update the node embeddings;
obtaining, after each node has passed through the configured multiple attention layers, the last-layer representation of each node;
setting the multi-layer perceptron of the bidirectional attention graph neural network, and inputting the last-layer representation of each node to obtain the deep feature $z_{joint}$;
inputting the deep feature $z_{joint}$ into a sigmoid function to obtain the probability output for each drug and give the corresponding drug code, thereby obtaining the predicted drug recommendation result.
6. A personalized medicine recommendation system based on a knowledge graph, the system comprising a data acquisition module, a large model module, a personalized knowledge graph construction module and a bidirectional attention graph neural network module;
the data acquisition module is used for acquiring a disease data set of a patient, wherein the disease data set comprises diagnosis data, therapy records and medication records;
the large model module is used for generating, by means of the large model, knowledge in the form of triples around the concept entity of each medical code in the diagnosis data, the therapy records and the medication records, and for clustering the nodes and edges of the knowledge to form a knowledge graph;
the personalized knowledge graph construction module is used for constructing a personalized knowledge subgraph for each patient according to the diagnosis and treatment data of different patients, in combination with the knowledge recommended by the knowledge graph, and for marking the data with the time information in the patients' diagnosis and treatment data;
the bidirectional attention graph neural network module is used for mapping the knowledge subgraph to a word vector space by using a text embedding technology to obtain a feature fusion matrix, and for inputting the feature fusion matrix into the bidirectional attention graph neural network for prediction to obtain a predicted drug recommendation result.
7. The knowledge-graph-based personalized medicine recommendation system of claim 6, wherein the data acquisition module comprises a diagnosis record module, a surgical therapy record module, and a medication record module.
8. A computer device comprising a memory and a processor, the memory storing a computer program executable on the processor, wherein the processor, when executing the computer program, performs the steps of the knowledge-graph-based personalized medicine recommendation method according to any one of claims 1-5.
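By way of illustration only, the clustering of similar nodes recited in claim 3 can be sketched as a small agglomerative (bottom-up hierarchical) procedure. The claims name only "a hierarchical clustering algorithm", so the single-linkage rule, the cosine distance, the threshold value and the toy embeddings below are all assumptions.

```python
import math

def cosine_distance(a, b):
    """1 minus the cosine similarity of two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)

def agglomerative(embeddings, threshold):
    """Repeatedly merge the two closest clusters (single linkage) until
    the closest pair is farther apart than the threshold."""
    clusters = [[name] for name in embeddings]
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = min(cosine_distance(embeddings[a], embeddings[b])
                        for a in clusters[i] for b in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        if best[0] > threshold:
            break
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters

# Hypothetical node embeddings: two near-synonyms and one unrelated node.
embeddings = {
    "heart attack": [1.0, 0.1],
    "myocardial infarction": [0.9, 0.12],
    "metformin": [0.0, 1.0],
}
clusters = agglomerative(embeddings, threshold=0.05)
```

Each resulting cluster would be collapsed into a single node of the new global graph G'; the same procedure could be applied to edge (relation) embeddings.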
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202311660390.0A CN117672450A (en) | 2023-12-06 | 2023-12-06 | Personalized medicine recommendation method and system based on knowledge graph |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202311660390.0A CN117672450A (en) | 2023-12-06 | 2023-12-06 | Personalized medicine recommendation method and system based on knowledge graph |
Publications (1)
Publication Number | Publication Date |
---|---|
CN117672450A (en) | 2024-03-08 |
Family
ID=90076528
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202311660390.0A (Pending), published as CN117672450A (en) | Personalized medicine recommendation method and system based on knowledge graph | 2023-12-06 | 2023-12-06 |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN117672450A (en) |
- 2023-12-06: CN application CN202311660390.0A, published as CN117672450A (en), status: Pending
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||