CN115240873A - Medicine recommendation method based on machine learning, electronic equipment and computer-readable storage medium - Google Patents
- Publication number
- CN115240873A (application number CN202210548894.2A)
- Authority
- CN
- China
- Prior art keywords
- patient
- drug
- vector
- machine learning
- model
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16H—HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
- G16H70/00—ICT specially adapted for the handling or processing of medical references
- G16H70/40—ICT specially adapted for the handling or processing of medical references relating to drugs, e.g. their side effects or intended usage
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N20/00—Machine learning
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16H—HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
- G16H50/00—ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
- G16H50/50—ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for simulation or modelling of medical disorders
Landscapes
- Engineering & Computer Science (AREA)
- Health & Medical Sciences (AREA)
- Medical Informatics (AREA)
- Public Health (AREA)
- Primary Health Care (AREA)
- Theoretical Computer Science (AREA)
- Software Systems (AREA)
- Data Mining & Analysis (AREA)
- Epidemiology (AREA)
- General Health & Medical Sciences (AREA)
- Artificial Intelligence (AREA)
- General Engineering & Computer Science (AREA)
- Pharmacology & Pharmacy (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Medicinal Chemistry (AREA)
- Evolutionary Computation (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Physics & Mathematics (AREA)
- Computing Systems (AREA)
- Toxicology (AREA)
- General Physics & Mathematics (AREA)
- Mathematical Physics (AREA)
- Chemical & Material Sciences (AREA)
- Biomedical Technology (AREA)
- Databases & Information Systems (AREA)
- Pathology (AREA)
- Medical Treatment And Welfare Office Work (AREA)
Abstract
The invention relates to a medicine recommendation method based on machine learning, an electronic device and a computer-readable storage medium, and belongs to the field of label classification. It aims to solve the problem that drug recommendation models produce combinations with a high drug-drug interaction (DDI) rate. A machine learning model obtains a model output from a core disease condition vector, a global drug vector and medical information; the model output is mapped according to a threshold to obtain a recommended drug combination, which includes core drugs and extension drugs, with the effect of reducing the DDI of the recommended drugs.
Description
Technical Field
The invention belongs to the field of label classification, and relates to a medicine recommendation method, a model and a training method based on machine learning.
Background
Drug recommendation, an important task in natural language processing, aims to recommend drug combinations according to a patient's condition and can be regarded as a multi-label classification task. However, patients often suffer from multiple diseases simultaneously, and the model must consider the drug-drug interactions (DDIs) within a combination when recommending drugs, which makes the task more difficult. Little existing work explores how a patient's drugs and diseases change over time, yet these changes may indicate the future trend of the patient's disease, which is critical to finding core drugs.
Drug recommendations are intended to provide patients with an effective and safe combination of drugs. Medication recommendations are typically based on an Electronic Health Record (EHR) of the patient, including the patient's diagnosis history and the doctor's prescription history. With the aid of the drug recommendation model, doctors can prescribe drugs faster in the actual treatment. In past work, neural network-based drug recommendations may be divided into two categories:
The first category does not take the patient's historical information into account: example-based methods (Gong et al., 2021; Read et al., 2009). This approach thoroughly mines the patient's current diagnostic vector, but the lack of historical information may lead to poor recommendation results.
The second approach is based on patient longitudinal vector modeling, which aims to capture the dependencies between patient history vectors and achieve better results.
At present, most modeling methods based on the longitudinal vector of the patient are modeled from the perspective of medicines, and are rarely modeled from the perspective of the illness state of the patient. However, in real life, if a doctor wants to prescribe a drug to a patient, he prescribes the drug according to the previous situation of the patient. Also, the current condition of the patient is largely related to the previous condition of the patient. Neglecting the trend of the patient's condition, the model fails to highlight the patient's core condition, which may lead to a high DDI of the model results.
Rule-based methods (Read et al., 2009; Choi et al., 2016) rely on manually designed rules to make recommendations. Such methods usually require considerable effort from physicians to design the rules and do not transfer easily.
Example-based methods (Gong et al., 2021; Zhang et al., 2017) use only the patient's current health information and ignore the patient's historical health information. For example, LEAP encodes the current diagnostic vector of the patient and then uses the encoded vector for recommendation. Such an approach is useful when no historical health information is available, but it may be less accurate because it cannot exploit that history.
Longitudinal methods (Le et al., 2018; Shang et al., 2021; Yang et al.; Wu et al., 2022) use the historical health information of patients and mine correlations between visits; most of them are based on RNNs. For example, GAMENet uses the patient's historical health information to build a Graph Augmented Memory Module and uses that history to make medication recommendations for the current visit. However, most of these methods mine correlations from the perspective of the drug, and rarely from the perspective of the patient.
Disclosure of Invention
In order to solve the problem of high DDI of a drug recommendation model, the invention provides the following technical scheme:
In a first aspect, an embodiment of the present invention provides a medicine recommendation method based on machine learning, comprising: learning the historical and current medical data of a patient through a machine learning model; outputting, by the machine learning model, a core disease condition vector and a global drug vector of the patient according to the patient's historical and current medical data; encoding, by the machine learning model, the drug graph and the EHR graph to output medical information; obtaining, by the machine learning model, a model output according to the core disease condition vector, the global drug vector and the medical information; and mapping the model output according to a threshold to obtain a recommended drug combination, wherein the drug combination comprises core drugs and extension drugs.
In a second aspect, the machine learning model of the first aspect:
concatenates the process vector $P_i$ and the diagnosis vector $D_i$ to obtain a patient representation $h_i^{(t)}$, where $h_i^{(t)}$ is the patient representation of the i-th patient at the t-th visit;
models the relationships between the patient representations from the t visits by a GRU whose input is the patient representation $h_i^{(t)}$ and whose output is the patient disease information $e_i^{(t)}$;
inputs the patient disease information $e_i^{(t)}$ into an attention module to obtain the attention score $a_i^{(t)}$ of $e_i^{(t)}$;
multiplies the attention score $a_i^{(t)}$ by the patient disease information $e_i^{(t)}$ to obtain and output the core disease vector $x_i^{(t)}$ of the patient;
inputs the current core disease vector $x_i^{(T)}$ among the core disease vectors $x_i^{(t)}$ into a linear layer to obtain a core query vector $q_i$;
encodes the drug molecular graph through D-MPNN to obtain the embeddings of all drugs in the drug molecular graph;
encodes the EHR graph through a GCN to obtain the embedding of the patient's EHR graph;
multiplies the embeddings of all drugs by the embedding of the patient's EHR graph to obtain the medical information $r$;
inputs the current patient representation $h_i^{(n)}$ into a global drug linear layer to obtain the global drug vector $q'_i$ of the i-th patient at the current time;
adds the core query vector $q_i$ and the global drug vector $q'_i$ to obtain a request vector;
multiplies the request vector by the medical information $r$ to obtain the model output of patient i, which is represented by the following formula:
wherein: $\lambda$ is a learnable parameter, and $\odot$ denotes element-wise multiplication.
In a third aspect, the machine learning model of the second aspect: the core query vector $q_i$ has size $1 \times |M|$, and the medical information has size $|M| \times |M|$;
a sigmoid function compresses the values of the recommendation result to the interval (0, 1), and the model output of patient i, of size $1 \times |M|$, is the drug combination result;
each dimension of the vector represents a drug: if the value corresponding to a drug is greater than 0.5, the machine learning model recommends the drug for the patient, otherwise it does not, and the drug combination formed by the recommended drugs is the recommendation of the machine learning model.
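The thresholding rule of the third aspect can be illustrated with a minimal Python sketch. The drug names and raw scores below are hypothetical values chosen for illustration; only the sigmoid compression and the 0.5 threshold come from the text.

```python
import math

def sigmoid(z: float) -> float:
    """Logistic function mapping a real-valued score to the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def recommend(scores, drug_names, threshold=0.5):
    """Return the drugs whose sigmoid-compressed score exceeds the threshold."""
    probs = [sigmoid(s) for s in scores]
    return [name for name, p in zip(drug_names, probs) if p > threshold]

# Hypothetical raw model scores for a 4-drug vocabulary (|M| = 4).
raw_scores = [2.1, -0.7, 0.3, -3.0]
drugs = ["drug_A", "drug_B", "drug_C", "drug_D"]
combo = recommend(raw_scores, drugs)
print(combo)
```

Because the sigmoid is monotonic, a threshold of 0.5 on the compressed value is equivalent to a threshold of 0 on the raw score.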
In a fourth aspect, the machine learning model of the second aspect: the relationships between the patient representations from the t visits are modeled by the GRU as follows:
$u_t = \sigma(W_u h_t + U_u e_{t-1} + b_u)$
$r_t = \sigma(W_r h_t + U_r e_{t-1} + b_r)$
wherein: $\sigma$ is the sigmoid activation function; $\odot$ is the element-wise product; $u_t$ is the update gate of the GRU module, which captures long-term relationships by controlling how much of the hidden state $e_{t-1}$ flows to the next GRU module; $r_t$ is the reset gate of the GRU module, which captures short-term dependencies in the sequence by controlling how much of the hidden state $e_{t-1}$ flows into the candidate hidden state at time t; $e_t$ is the hidden state of the t-th GRU module and depends on the hidden state $e_{t-1}$ of the previous GRU module, which is passed to the t-th GRU module along the recurrent path of the model; $W_u, W_r, W_h$ are parameter matrices; $U_u, U_r, U_h \in \mathbb{R}^{n_H \times n_H}$ are parameter matrices, and $n_H$ is the hidden layer size.
In a fifth aspect, the machine learning model of the second aspect: the attention function of the attention module is represented by:
wherein: $a_i^{(t)}$ is the attention score of the input $e_i^{(t)}$; $W$ is a parameter matrix; $e_i^{(T)}$ is the current patient condition information; the similarity between the current patient condition information $e_i^{(T)}$ and the condition information $e_i^{(t)}$ of the t-th visit is calculated, and an attention score is assigned to $e_i^{(t)}$ accordingly.
In a sixth aspect, the machine learning model of the second aspect: the step of encoding the drug molecular graph through D-MPNN specifically comprises:
obtaining a set G of molecular graphs of the drugs according to the drug set M, wherein the molecular graph of each drug consists of atoms and atom-atom edges;
obtaining, from the molecular graph $G_i$ of the i-th drug, the neighbor set $N(v)$ of each node v in $G_i$;
the calculation process of D-MPNN is as follows:
wherein: $m_{vw}^{t}$ is the encoded message between node v and node w at the t-th iteration; $a_k$ is the atomic feature of atom k; t is the number of layers; $h_{kv}^{t}$ is the hidden state between node k and node v; $W_i$ is a learnable parameter matrix; $\mathrm{cat}(a_v, e_{vw})$ is the concatenation of the atomic feature $a_v$ of atom v with the bond feature $e_{vw}$ between nodes v and w; $\tau$ denotes the ReLU activation function;
summing the hidden states to obtain the embeddings of all drugs of the drug molecular graph:
$h_v = \tau(W_a \, \mathrm{cat}(a_v, m_v))$
wherein: $W_a$ is a learnable parameter matrix; T is the total number of visits by patient i; $h_v$ is the embedding of node v; $m_v$ is the initial molecular graph encoding obtained by summing the hidden states among all nodes.
In a seventh aspect, the machine learning model of the second aspect: the step of encoding the EHR graph through the GCN to obtain the patient EHR embedding specifically comprises:
letting C denote the drug feature matrix input to the GCN algorithm, A denote the EHR graph matrix of the patient, and GCN denote the GCN function, the EHR encoding is calculated as follows:
the GCN function is defined as follows:
inputting the EHR graph into two GCN layers to obtain the embedding $G_e$ of the patient's EHR graph, calculated as:
$G_e = \mathrm{GCN}(\mathrm{ReLU}(\mathrm{GCN}(C, A))\,W, A)$
where W is a parameter matrix used to adjust the bias between the two GCN layers.
In one aspect, an electronic device includes: memory, a processor and a computer program stored on the memory and executable on the processor, the processor implementing the steps of the method in the first to seventh aspects when executing the computer program.
In one aspect, a computer readable storage medium has stored thereon a computer program which, when executed by a processor, implements the steps of the method of the first to seventh aspects.
Beneficial effects: the invention provides a Patient Condition Change Network (PCCNet), which models the patient's current core drugs by mining the patient's medication sequence and the spatio-temporal changes of the patient's state vector, and assigns some auxiliary drugs to complete the currently recommended drug combination. Experimental results show that the proposed model greatly reduces the DDI of the recommended drugs while achieving results not lower than those of the existing state of the art (SOTA).
Drawings
Fig. 1 is a model architecture of PCCNet.
Detailed Description
To further illustrate the technical means and effects of the present invention adopted to achieve the predetermined objects, the following description is taken in conjunction with the accompanying drawings and preferred embodiments.
Example: the present invention proposes a machine learning model, called the Patient Condition Change Network (PCCNet), that captures changes in a patient's condition and recommends medications. PCCNet can be divided into a patient condition change module and a global medication module. The patient condition change module predicts the patient's core condition by learning the patient's historical conditions and recommends core drugs, which are obtained from the core condition. A core drug is a drug that plays a key role in the patient's treatment. The patient condition change module greatly reduces the DDI (drug-drug interaction) rate of the recommended result while maintaining the Jaccard score and F1 score as far as possible. The global medication module makes recommendations based only on the current patient representation, that is, it considers only the details of the current visit; its main task is therefore to correct the core drugs and add extension drugs. PCCNet mines relationships from the perspective of the patient, trying to find the effect of the patient's historical core conditions on the current condition, and gives medication recommendations accordingly.
Based on the above concept, the main contributions of the present invention are as follows:
The drug recommendation model PCCNet provided by the invention effectively reduces the drug-drug interactions of the recommended drug combination while still giving accurate recommendations.
By mimicking how physicians prescribe in the real world, a patient condition change module is proposed that enhances the interpretability of PCCNet.
Extensive experiments are carried out on the public dataset MIMIC-III and demonstrate the effectiveness of the PCCNet model.
For clarity, technical terms that are not explicitly defined below should be understood in their general meaning in the art.
Molecular representation: the task of molecular representation is to embed a molecule from its molecular structure graph to obtain a molecular representation. Molecular representation has been studied for a long time. Molecular descriptors (Mauri et al., 2006) and drug fingerprints (Duvenaud et al., 2015) are commonly used to represent drug molecules (Rogers and Hahn, 2010). As deep learning has evolved, more and more deep learning models are used to generate molecular representations (Huang et al., 2020b). Huang et al. (2020a) propose direct modeling of subgraphs using graph-based neural network models. Stokes et al. (2020) propose a D-MPNN model, which focuses more on the relationships between atoms. Inspired by Stokes et al. (2020), the present invention uses D-MPNN as the molecular encoder.
Electronic Health Record (EHR): the EHR data of a patient include detailed information about each visit, stored as medical codes and forming the patient's longitudinal vector. Let $X = [x_i^{(1)}, x_i^{(2)}, \ldots, x_i^{(n_i)}]$ represent the longitudinal vector of patient i, where $x_i^{(j)}$ represents the j-th visit of patient i and $n_i$ is the number of visits of patient i. The invention uses the triple {diagnoses, procedures, medications} to represent each visit, $x_i^{(j)} = \{d_i^{(j)}, p_i^{(j)}, m_i^{(j)}\}$, where $d_i^{(j)}$ is the j-th diagnosis record of the i-th patient, and $p_i^{(j)}$ and $m_i^{(j)}$ are defined analogously. Let D, P, M denote the element sets of diagnoses, procedures and medications; $D_i$ denotes the diagnosis set of patient i, $P_i$ the procedure set of patient i, and $M_i$ the medication set of patient i.
Safe medication recommendation: the goal is to recommend safer (lower-DDI) drug combinations for patients while keeping the recommendation accurate. The model is trained on the patient's historical diagnosis data $D_i$, historical process data $P_i$ and historical medication data $M_i$, after which it can recommend a drug combination $M_i$ for patient i.
Historical diagnostic data $D_i$: the DIAGNOSES_ICD.csv file in the MIMIC-III dataset, which contains the diagnoses given by a physician for each visit of a patient.
Historical process data $P_i$: the PROCEDURES_ICD.csv file in the MIMIC-III dataset, which contains the procedure and surgical records made by physicians for each hospitalization of a patient.
The main notations for the model of the invention are explained below:
In one embodiment, the model structure of PCCNet is shown in Fig. 1. The entire model consists of two modules. The first is the patient condition change module, whose task is to form the core vector of the patient's current condition from the patient's historical diagnostic data. The second is the global medication module, which aims to expand the patient's recommended drug combination by producing a drug expansion vector. Finally, the core condition vector and the drug expansion vector are added and multiplied by the drug vectors produced by the drug graph encoder to obtain the final recommendation result.
In one approach, the model inputs are as follows. The patient's health information is converted into codes through standard medical coding. Three embedding tables are obtained through data processing: a diagnosis embedding table, in which each row is the embedding vector of a diagnosis code; a process embedding table, in which each row is the embedding vector of a process code; and a medication embedding table, in which each row is the embedding vector of a medication code.
From the above description, the model structure of the present invention includes a patient condition change module, and the purpose of the module is achieved based on the following modes.
In a first step, the patient condition change module extracts disease information from the patient's historical diagnostic information, simulating the evolution of the patient's condition. First, the process vector and the diagnosis vector are concatenated to obtain the patient representation $h_i^{(t)}$, where $h_i^{(t)}$ is the patient representation of the i-th patient at the t-th visit.
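As a rough illustration of how a patient representation might be formed, the sketch below looks up the codes of one visit in two embedding tables and concatenates the results. It assumes that multiplying a multi-hot code vector by an embedding table amounts to summing the rows of the codes present; the table contents, the names `E_d` and `E_p`, and the helper functions are hypothetical.

```python
def embed_codes(codes, table):
    """Sum the embedding rows of the given codes (multi-hot vector times table)."""
    dim = len(table[0])
    out = [0.0] * dim
    for c in sorted(set(codes)):
        for k in range(dim):
            out[k] += table[c][k]
    return out

def patient_representation(diag_codes, proc_codes, E_d, E_p):
    """Concatenate the diagnosis and process embeddings of one visit."""
    return embed_codes(diag_codes, E_d) + embed_codes(proc_codes, E_p)

# Hypothetical toy embedding tables: 3 diagnosis codes, 2 process codes, dimension 2.
E_d = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
E_p = [[2.0, 2.0], [3.0, 0.0]]

# One visit with diagnosis codes {0, 2} and process code {1}.
h = patient_representation([0, 2], [1], E_d, E_p)
print(h)  # length is the sum of the two embedding dimensions
```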
In a second step, to extract the condition, a GRU is used to model the relationships between the t patient representations of a patient from visit 1 to visit t, where the input of the GRU is the patient representation $h_i^{(t)}$.
It is understood that the patient's condition may change over time. Similarly, if a physician is to prescribe a drug to a patient, the physician must also consider the patient's disease progression to determine the patient's trend of change. Therefore, it is the core of the patient's disease change module that the model can simulate the disease evolution. The dynamic RNN technology can be used for exploring the evolution trend of the patient's condition through the information path between GRUs while extracting the condition, so as to obtain the core condition of the patient.
The GRU (Chung et al., 2014) is an improvement on the RNN and LSTM: it alleviates the vanishing-gradient problem of the RNN and is faster than the LSTM.
The invention models the relationships between the patient representations obtained from the patient's t visits through a GRU, with the following formulas:
$u_t = \sigma(W_u h_t + U_u e_{t-1} + b_u)$
$r_t = \sigma(W_r h_t + U_r e_{t-1} + b_r)$
where $\sigma$ is the sigmoid activation function and $\odot$ is the element-wise product. $u_t$ is the update gate of the GRU module, which captures long-term relationships by controlling how much of the hidden state $e_{t-1}$ flows to the next GRU module. $r_t$ is the reset gate of the GRU module, which captures short-term dependencies in the sequence by controlling how much of the hidden state $e_{t-1}$ flows into the candidate hidden state at time t. $e_t$ is the hidden state of the t-th GRU module and depends on the hidden state $e_{t-1}$ of the previous GRU module, which is passed to the t-th GRU module along the recurrent path of the model. $W_u, W_r, W_h$ are parameter matrices, and $U_u, U_r, U_h \in \mathbb{R}^{n_H \times n_H}$ are also parameter matrices, used as learnable parameters to adjust for bias between different GRU modules, where $n_H$ is the hidden layer size. $h_i^{(t)}$ is the patient representation of the t-th visit of the i-th patient, obtained earlier by concatenating the process vector and the diagnosis vector. The invention uses the GRU to extract the patient's disease information $e_i^{(t)}$, where $e_i^{(t)}$ represents the condition information of the i-th patient at the t-th visit. As a result, the condition information of each visit fuses the condition information of the previous visit, so the model can express the evolution of the patient's condition across visits.
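The recurrence can be sketched in plain Python. The update and reset gates follow the two formulas given above; the candidate state and the final mixing rule are not reproduced in the text, so the sketch assumes the standard GRU form, $\tilde{e}_t = \tanh(W_h h_t + U_h (r_t \odot e_{t-1}) + b_h)$ and $e_t = (1 - u_t) \odot e_{t-1} + u_t \odot \tilde{e}_t$. The helper `constant_params` and all numeric values are hypothetical.

```python
import math

def matvec(W, x):
    """Multiply matrix W (list of rows) by vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def gru_step(h_t, e_prev, p):
    """Advance the condition state by one visit.

    h_t: patient representation of visit t; e_prev: state e_{t-1};
    p: dict of parameters whose names mirror the symbols in the text.
    """
    def affine(W, U, b, state):
        return [a + c + d for a, c, d in zip(matvec(W, h_t), matvec(U, state), b)]

    u = [sigmoid(v) for v in affine(p["W_u"], p["U_u"], p["b_u"], e_prev)]  # update gate
    r = [sigmoid(v) for v in affine(p["W_r"], p["U_r"], p["b_r"], e_prev)]  # reset gate
    gated = [ri * ei for ri, ei in zip(r, e_prev)]
    # Assumed candidate state and mixing rule (standard GRU, not quoted from the source).
    cand = [math.tanh(v) for v in affine(p["W_h"], p["U_h"], p["b_h"], gated)]
    return [(1.0 - ui) * ei + ui * ci for ui, ei, ci in zip(u, e_prev, cand)]

def constant_params(n_in, n_h, value):
    """Hypothetical helper: every weight set to `value`, every bias set to 0."""
    p = {}
    for g in ("u", "r", "h"):
        p["W_" + g] = [[value] * n_in for _ in range(n_h)]
        p["U_" + g] = [[value] * n_h for _ in range(n_h)]
        p["b_" + g] = [0.0] * n_h
    return p

params = constant_params(n_in=3, n_h=2, value=0.1)
visits = [[1.0, 0.0, 1.0], [0.0, 1.0, 1.0]]  # toy h^(1), h^(2)
e = [0.0, 0.0]
states = []
for h in visits:
    e = gru_step(h, e, params)
    states.append(e)  # states[t] plays the role of e^(t+1)
```

Each state is a convex combination of the previous state and a tanh-bounded candidate, which is how information from earlier visits carries forward into later ones.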
In a third step, to screen the patient disease information $e_i^{(t)}$, it is fed into an attention module. The attention mechanism concentrates limited attention on critical information, which saves resources and quickly yields the most useful information. Here, attention further highlights the patient's core condition by assigning weights. The attention function used in the present invention is expressed as:
where $a_i^{(t)}$ is the attention score of the input $e_i^{(t)}$ and $e_i^{(T)}$ is the current patient condition information. The similarity between the current patient condition information $e_i^{(T)}$ and the condition information $e_i^{(t)}$ of the t-th visit is calculated and used to assign an attention score to $e_i^{(t)}$; W is a parameter matrix used to control the output deviation of the attention module.
In a fourth step, the attention score $a_i^{(t)}$ is multiplied by the GRU output $e_i^{(t)}$ to give the output $x_i^{(t)}$, calculated as:
$x_i^{(t)} = a_i^{(t)} \times e_i^{(t)}$
where $x_i^{(t)}$ is the core disease vector obtained by refining the patient disease information $e_i^{(t)}$ while taking the patient's historical evolution into account; the role of $x_i^{(t)}$ is analyzed in the experimental examples below.
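The third and fourth steps can be sketched together. The attention formula itself is not reproduced in the text, so this sketch assumes a bilinear similarity $e_i^{(T)\top} W e_i^{(t)}$ normalized with a softmax over visits; that choice, the identity matrix used for W, and the toy states are all assumptions made for illustration.

```python
import math

def attention_scores(states, W):
    """Score every visit state e_t against the current state e_T.

    Assumed form: softmax over visits of the bilinear similarity e_T^T W e_t.
    """
    e_T = states[-1]
    projected = [[sum(w * x for w, x in zip(row, e_t)) for row in W] for e_t in states]
    logits = [sum(a * b for a, b in zip(e_T, v)) for v in projected]
    m = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(l - m) for l in logits]
    total = sum(exps)
    return [x / total for x in exps]

def core_vectors(states, scores):
    """Compute x_t = a_t * e_t for every visit."""
    return [[a * v for v in e_t] for a, e_t in zip(scores, states)]

# Toy condition states e^(1), e^(2), e^(3) for one patient (hypothetical values).
states = [[0.2, -0.1], [0.4, 0.3], [0.5, 0.6]]
W = [[1.0, 0.0], [0.0, 1.0]]  # identity, so the similarity reduces to a dot product

a = attention_scores(states, W)
x = core_vectors(states, a)
x_T = x[-1]  # the current core disease vector used for recommendation
```

With these toy values the visits most similar to the current one receive the largest weights, which is the screening behavior the text describes.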
In a fifth step, PCCNet uses only the current core disease vector $x_i^{(T)}$ to make recommendations, because $x_i^{(T)}$ contains the condition information from the patient's first visit up to the current visit, whereas the core disease vectors of other visits contain little information about the current visit.
In a sixth step, $x_i^{(T)}$ is fed into a linear layer to obtain the output of the patient condition change module, the core query vector $q_i$.
The model structure of the invention also includes a drug graph encoder and an EHR graph encoder, which work as follows.
The task of the drug graph encoder is to encode the drug molecular graphs and obtain the drug embeddings. A directed message passing neural network (D-MPNN) is used as the drug graph encoder in the present invention. Unlike MPNN, D-MPNN focuses more on edge relationships and achieves better results in molecular representation tasks. Using the drug molecular information alone, or the patient's EHR information alone, would weaken the correspondence among patient, condition and drug. The encoders express this correspondence, and the information is reflected in the calculation of the core condition and the global medication, which noticeably improves the model scores, as shown in the experimental examples of the invention.
EHR graph encoder: the EHR graph is a matrix built from patient prescription records. Simply put, if medication i and medication j appear together in the data of one visit, the corresponding position in the EHR matrix is set to 1. In other words, the EHR graph captures drug synergy relationships that may enhance drug effects; synergistic drugs tend to appear in the same drug combination, and encoding the EHR graph highlights such co-occurrence, thereby improving the accuracy of the recommendation.
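The construction of the EHR matrix can be sketched as follows. The sketch assumes that drugs are indexed by integers, that the matrix is symmetric, and that the diagonal is left at zero; the function name and the toy visits are hypothetical.

```python
from itertools import combinations

def build_ehr_adjacency(visits, num_drugs):
    """Set A[i][j] = 1 when drugs i and j were prescribed in the same visit."""
    A = [[0] * num_drugs for _ in range(num_drugs)]
    for meds in visits:
        for i, j in combinations(sorted(set(meds)), 2):
            A[i][j] = 1
            A[j][i] = 1  # co-occurrence is symmetric
    return A

# Two toy visits over a 4-drug vocabulary.
visits = [[0, 1], [1, 2, 3]]
A = build_ehr_adjacency(visits, num_drugs=4)
for row in A:
    print(row)
```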
Let C denote the drug feature matrix input to the GCN algorithm, let A denote the EHR graph matrix of the patient, and let GCN denote the GCN function; in the present invention, the EHR encoding is then calculated as follows:
To extract the information of the EHR graph better, the present invention uses two GCN layers for encoding. To prevent model overfitting, a ReLU is placed between the two GCN layers as the activation function. Compared with traditional neural network activation functions, ReLU alleviates the vanishing-gradient problem and is faster to compute. The EHR graph is input into the two GCN layers to model the synergistic effects of the drugs in the EHR graph, calculated as:
$G_e = \mathrm{GCN}(\mathrm{ReLU}(\mathrm{GCN}(C, A))\,W, A)$
where W is a parameter matrix used to adjust the bias between the two GCN layers. After this calculation, the embedding $G_e$ of the patient's EHR graph is obtained.
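A minimal sketch of the two-layer encoding $G_e = \mathrm{GCN}(\mathrm{ReLU}(\mathrm{GCN}(C, A))\,W, A)$ is given below. The definition of the GCN function is not reproduced in the text, so the sketch assumes the common symmetric normalization $\mathrm{GCN}(C, A) = \hat{A} C$ with $\hat{A} = \tilde{D}^{-1/2}(A + I)\tilde{D}^{-1/2}$; this normalization and the toy matrices are assumptions.

```python
import math

def matmul(X, Y):
    """Plain matrix product of two lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*Y)] for row in X]

def normalize_adjacency(A):
    """Assumed normalization: D^{-1/2} (A + I) D^{-1/2} with self-loops added."""
    n = len(A)
    A_tilde = [[A[i][j] + (1 if i == j else 0) for j in range(n)] for i in range(n)]
    d = [1.0 / math.sqrt(sum(row)) for row in A_tilde]
    return [[d[i] * A_tilde[i][j] * d[j] for j in range(n)] for i in range(n)]

def relu(X):
    return [[max(0.0, v) for v in row] for row in X]

def ehr_embedding(C, A, W):
    """G_e = GCN(ReLU(GCN(C, A)) W, A), with GCN(X, A) assumed to be A_hat @ X."""
    A_hat = normalize_adjacency(A)
    hidden = relu(matmul(A_hat, C))
    return matmul(A_hat, matmul(hidden, W))

# Toy graph: drugs 0 and 1 co-occur, drug 2 is isolated.
A = [[0, 1, 0], [1, 0, 0], [0, 0, 0]]
C = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]  # one-hot drug features
W = [[1.0, 0.0], [0.0, 1.0], [2.0, 3.0]]                 # hypothetical 3 x 2 weights
G_e = ehr_embedding(C, A, W)
```

In this toy graph the two co-occurring drugs end up with the same embedding, while the isolated drug keeps its own row of W, which shows how co-occurrence is propagated into the embedding.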
D-MPNN encoding of the drug molecular graph: first, the set M of drugs used in the dataset is collected. From the drug set M, a set G of molecular graphs can be obtained, one per drug, each consisting of atoms and atom-atom edges. From the molecular graph $G_i$ of the i-th drug, the neighbor set $N(v)$ of each node v in $G_i$ can be obtained. Let $a_v$ denote the features of node v (atom v) and $e_{vw}$ the features of the edge (bond) between node v and node w. The calculation process of D-MPNN is as follows:
where t is the number of layers, $a_k$ is the atomic feature of atom k, $h_{kv}^{t}$ is the hidden state between node k and node v, $m_{vw}^{t}$ is the encoded message between node v and node w at the t-th iteration, $W_i$ is a learnable parameter matrix, and $\mathrm{cat}(a_v, e_{vw})$ is the concatenation of the atomic feature $a_v$ of atom v with the bond feature $e_{vw}$ between nodes v and w. $\mathrm{MESSAGE}_t$ is the message passing function and $\mathrm{UPDATE}_t$ is the node update function, whose implementations are written out in the formula; $\tau$ denotes the ReLU activation function.
$h_v = \tau(W_a \, \mathrm{cat}(a_v, m_v))$
where $W_a$ is a learnable parameter matrix, $h_v$ is the embedding of node v, and $m_v$ is the initial molecular graph encoding obtained by summing the hidden states among all nodes.
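For orientation, the commonly cited D-MPNN formulation is written out below. The message-passing equation of the invention is not reproduced in the text, so this block is only an assumption about its likely form; in particular, the matrix $W_m$ and the choice of $\mathrm{MESSAGE}_t$ as the identity on the incoming hidden state are taken from the standard formulation and are not named in the source. The last line agrees with the readout formula given above.

```latex
\begin{align}
h_{vw}^{0}   &= \tau\bigl(W_i \,\mathrm{cat}(a_v, e_{vw})\bigr) \\
m_{vw}^{t+1} &= \sum_{k \in N(v) \setminus \{w\}} h_{kv}^{t} \\
h_{vw}^{t+1} &= \tau\bigl(h_{vw}^{0} + W_m \, m_{vw}^{t+1}\bigr) \\
m_v          &= \sum_{k \in N(v)} h_{kv}^{T} \\
h_v          &= \tau\bigl(W_a \,\mathrm{cat}(a_v, m_v)\bigr)
\end{align}
```

Messages live on directed edges, and the sum excludes the reverse edge from w, which is what distinguishes D-MPNN from an atom-centered MPNN.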
The embeddings of all drugs obtained with D-MPNN are multiplied by the patient EHR graph embedding $G_e$ obtained with the GCN to obtain the medical information $r$.
The model structure of the invention also comprises a global medication module, which works as follows. After the patient condition change module, the invention obtains the patient's core condition vector; recommending drugs from this vector greatly reduces the DDI of the recommended combination. However, recommendation from the core condition vector alone yields only a combination of core drugs. Returning only this recommendation may leave too few drugs in the combination and hurt the model's effectiveness. A global medication module is therefore needed to supplement the recommended combination. In practice, a physician likewise first prescribes core drugs for the patient's main disease and then refines the drug combination according to the patient's condition. The global medication module makes its recommendation from the current patient representation $h_i^{(n)}$ only. The module feeds $h_i^{(n)}$ into the global drug linear layer to obtain $q'_i$, the global drug vector of the i-th patient at the current time. The core query vector $q_i$ and the global drug vector $q'_i$ are then added to obtain a request vector, and the request vector is multiplied by the medical information output by the drug graph encoder and the EHR graph encoder to obtain the final result, as follows:
where $\lambda$ is a learnable parameter, $\odot$ denotes element-wise multiplication, and the result is the model output of patient i.
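The combination step can be sketched as follows. The exact formula is not reproduced in the text, so the placement of $\lambda$ is an assumption: the sketch scales the global drug vector element-wise before the addition, that is, it computes $\sigma\bigl((q_i + \lambda \odot q'_i)\, r\bigr)$. The function name and all numeric values are hypothetical.

```python
import math

def recommend_scores(q_core, q_global, lam, r):
    """Return per-drug probabilities sigmoid((q + lam * q') r).

    q_core, q_global, lam: vectors of length |M|; r: |M| x |M| medical information.
    The position of lam in the formula is an assumption, not quoted from the source.
    """
    request = [qc + l * qg for qc, l, qg in zip(q_core, lam, q_global)]
    n_drugs = len(r[0])
    logits = [sum(request[k] * r[k][j] for k in range(len(request)))
              for j in range(n_drugs)]
    return [1.0 / (1.0 + math.exp(-z)) for z in logits]

# Toy example with |M| = 3 and an identity medical-information matrix.
r = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
probs = recommend_scores(
    q_core=[1.0, -1.0, 0.0],
    q_global=[0.0, 2.0, 0.0],
    lam=[0.5, 0.5, 0.5],
    r=r,
)
```

In this example the global vector cancels the negative core score of the second drug, which illustrates how the global medication module can correct and extend the core recommendation.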
In summary, as shown in FIG. 1, the PCCNet model first concatenates the patient's diagnosis vector d_i^(j) and process vector p_i^(j) to obtain the model input h_i^(t). After an initial embedding through a linear layer, the patient's first n-1 visits, denoted h_i^(1) to h_i^(n-1), are input into the patient state change module to obtain the core vector q_i of the patient's condition, and the medication information is input into the D-MPNN for molecular graph encoding. The core vector of the patient's condition and the current patient representation are added to obtain a request vector, which is multiplied by the medical information r to obtain the model output. Finally, the model output is mapped to 0 or 1 through a threshold to obtain the recommended combination.
Model training: the present invention trains the model with two loss functions, binary cross-entropy and multi-label hinge loss.
Binary cross-entropy (BCE) is often used in classification tasks to measure the difference between a target and an output. The present invention uses binary cross-entropy as a loss function, and the formula of BCE is as follows:
Multi-label hinge loss takes into account the error between the true categories and the other categories, and is commonly used to measure the accuracy of multi-label classification. To improve the performance of the model, the invention also uses multi-label hinge loss as a loss function, with the following formula:
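The following sketch is not the patent's own formula. It only illustrates, in plain Python, the two losses named above in their commonly used forms: the mean binary cross-entropy over all drugs, and a multi-label hinge (margin) loss that penalises every pair of a prescribed and a non-prescribed drug whose scores are not separated by a margin of 1. The function names and the toy labels are hypothetical.

```python
import math

def binary_cross_entropy(y_true, y_prob, eps=1e-12):
    """Mean BCE over all drugs; y_true holds 0/1 labels, y_prob probabilities."""
    total = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        total += y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return -total / len(y_true)

def multilabel_hinge_loss(y_true, y_prob):
    """Assumed multi-label margin form: each (true, non-true) drug pair is
    penalised when the true drug does not outscore the other by at least 1."""
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    total = sum(max(0.0, 1.0 - (pp - pn)) for pp in pos for pn in neg)
    return total / len(y_true)

# Toy example: four drugs, two of them prescribed (hypothetical values).
y_true = [1, 0, 1, 0]
y_prob = [0.9, 0.2, 0.6, 0.4]
print(binary_cross_entropy(y_true, y_prob))
print(multilabel_hinge_loss(y_true, y_prob))
```

In training, the two values would typically be combined into a single objective; how they are weighted is not stated in this text.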
The algorithm of the model executes the following steps:
When making a medication recommendation for the ith patient, the algorithm first takes the patient's diagnostic information and process information as the initial input of the model. These two pieces of information are first multiplied by their respective embedding tables to obtain serialized vectors, and the two serialized vectors are then concatenated to obtain a patient representation h_i^(t), where h_i^(t) is the patient representation of the tth visit of the ith patient. That is, if patient i has T visits in total, there are T patient representations for patient i, each of size 256. h_i^(t) is then fed into a GRU module to obtain the condition information e_i^(t); the output size of a single GRU module is 1 × |M|, where |M| is the size of the drug set M. e_i^(t) is then fed into an attention module, which examines all of the patient's condition information e_i and assigns a weight to each condition vector e_i^(t). The weight estimates, from the patient's earlier and later conditions, how strongly e_i^(t) influences the patient's current condition. e_i^(T) is the patient's current diagnostic information adjusted by the condition information of the previous T-1 visits, so the recommendation of the model is based on e_i^(T). After the attention score of e_i^(T) is obtained, the score is multiplied by e_i^(T) to obtain the core disease vector x_i^(T), which is then fed into a linear layer to obtain the core query vector q; both the input size and the output size of this linear layer are |M|.
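The steps above can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions, not the patented implementation: bias terms are omitted, the candidate state and state update follow the standard GRU form, and the attention score is assumed to be a bilinear similarity between each e^(t) and the current e^(T) followed by a softmax, since the attention formula is not reproduced in this text. All names, the random weights, and the toy sizes (4 instead of 256, 3 instead of |M|) are hypothetical.

```python
import math
import random

def matvec(W, x):
    """Multiply matrix W (list of rows) by vector x."""
    return [sum(w * v for w, v in zip(row, x)) for row in W]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def rand_matrix(rows, cols, rng):
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

def gru_step(h, e_prev, p):
    """One GRU step: h is a patient representation, e_prev the previous state."""
    u = [sigmoid(a + b) for a, b in zip(matvec(p["Wu"], h), matvec(p["Uu"], e_prev))]
    r = [sigmoid(a + b) for a, b in zip(matvec(p["Wr"], h), matvec(p["Ur"], e_prev))]
    gated = [ri * ei for ri, ei in zip(r, e_prev)]
    cand = [math.tanh(a + b) for a, b in zip(matvec(p["Wh"], h), matvec(p["Uh"], gated))]
    return [(1.0 - ui) * ei + ui * ci for ui, ei, ci in zip(u, e_prev, cand)]

def attention_scores(states, W):
    """Assumed form: softmax over the bilinear similarity e^(t) . (W e^(T))."""
    current = matvec(W, states[-1])
    logits = [sum(a * b for a, b in zip(e, current)) for e in states]
    top = max(logits)
    exps = [math.exp(v - top) for v in logits]
    total = sum(exps)
    return [v / total for v in exps]

def core_query(visits, p):
    """Run the GRU over all visits, weight the last state, apply a linear layer."""
    e = [0.0] * len(p["Uu"])
    states = []
    for h in visits:
        e = gru_step(h, e, p)
        states.append(e)
    scores = attention_scores(states, p["Wa"])
    x_T = [scores[-1] * v for v in states[-1]]  # core disease vector of the current visit
    return matvec(p["Wq"], x_T), scores

rng = random.Random(0)
in_dim, n_drugs = 4, 3  # toy sizes; the text uses 256 and |M|
p = {name: rand_matrix(n_drugs, in_dim, rng) for name in ("Wu", "Wr", "Wh")}
p.update({name: rand_matrix(n_drugs, n_drugs, rng) for name in ("Uu", "Ur", "Uh", "Wa", "Wq")})
visits = [[rng.random() for _ in range(in_dim)] for _ in range(3)]
q, scores = core_query(visits, p)
print(len(q), len(scores))
```

The returned query has one entry per drug, and the attention scores sum to 1 across the patient's visits.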
The global drug module makes its recommendation only from the patient's current representation. The current patient representation h_i^(T) is fed into a linear layer whose input size is 256 and whose output size is 1 × |M|; the linear layer searches for possible recommendation results through linear fitting, yielding the global query vector q'. Finally, q is multiplied by a learnable parameter λ to adjust its magnitude, and the result is added to q' to form the query vector of the recommendation model.
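A minimal sketch of this combination step is shown below. It assumes λ is a single scalar (the text only says it is a learnable parameter) and uses a plain affine map for the global drug linear layer; the function names and all numeric values are hypothetical toy inputs.

```python
def linear(W, b, x):
    """Affine layer: W is a list of rows, b a bias vector, x the input vector."""
    return [sum(w * v for w, v in zip(row, x)) + bi for row, bi in zip(W, b)]

def request_vector(q_core, h_current, W_global, b_global, lam):
    """Combine the core query q with the global query q' as lam * q + q'."""
    q_global = linear(W_global, b_global, h_current)  # q' from the current visit only
    return [lam * qc + qg for qc, qg in zip(q_core, q_global)]

# Toy example with 3 drugs and a 2-dimensional patient representation.
q_core = [0.2, -0.1, 0.4]
h_current = [1.0, 0.0]
W_global = [[0.5, 0.1], [0.0, 0.3], [-0.2, 0.2]]
b_global = [0.0, 0.1, 0.0]
print(request_vector(q_core, h_current, W_global, b_global, lam=0.5))
```

Scaling q rather than q' lets training decide how much the history-based core query should count relative to the query built from the current visit alone.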
The model needs medical information to make recommendations, and it uses drug molecule information and the patient EHR graph for this purpose. For drug molecule information, the model uses the D-MPNN algorithm to encode the drug molecular graphs, which are obtained from the RDKit library. The model inputs the adjacency matrix of each drug molecular graph into the D-MPNN, which performs message passing over the subgraphs to encode each vertex; all vertex encodings are then summed to obtain the graph encoding of the current drug molecule. For a single drug, the output size of the D-MPNN is |M|, and for the whole drug set the output size is |M| × |M|, which gives the drug molecular graph encoding. For the patient EHR graph, the model uses the GCN algorithm. The model feeds the patient's EHR matrix into a GCN model, and an identity matrix is used as the edge feature input required by the GCN. The model first computes the degree matrix of the EHR matrix and then performs graph convolution, analogous to a convolution transform, to obtain the patient EHR graph encoding, whose output size is |M| × |M|. After the drug molecular graph encoding and the patient EHR graph encoding are obtained, the model multiplies the two encodings to obtain the medical information r, whose size is |M| × |M|.
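The EHR branch and the final product can be sketched as below. The GCN propagation rule is not reproduced in this text, so the sketch assumes the widely used symmetric normalisation D^(-1/2) (A + I) D^(-1/2) and follows the two-layer form G_e = GCN(ReLU(GCN(C, A)) W, A) with an identity feature matrix C. The drug embedding matrix is a stand-in for the D-MPNN output, the multiplication order for r is an assumption, and all names and values are hypothetical.

```python
import math

def matmul(A, B):
    """Plain matrix product of two lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def normalize_adjacency(A):
    """Assumed propagation matrix: D^(-1/2) (A + I) D^(-1/2)."""
    n = len(A)
    A_loop = [[A[i][j] + (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    deg = [sum(row) for row in A_loop]  # degree matrix diagonal
    return [[A_loop[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)] for i in range(n)]

def gcn(X, A_hat):
    """One propagation step: aggregate neighbour features."""
    return matmul(A_hat, X)

def relu(X):
    return [[max(0.0, v) for v in row] for row in X]

def ehr_embedding(A, W):
    """Two-layer form G_e = GCN(ReLU(GCN(C, A)) W, A) with identity features C."""
    n = len(A)
    C = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    A_hat = normalize_adjacency(A)
    hidden = relu(gcn(C, A_hat))
    return gcn(matmul(hidden, W), A_hat)

# Toy EHR graph over 3 drugs (1.0 marks drugs that were prescribed together).
A = [[0.0, 1.0, 0.0],
     [1.0, 0.0, 1.0],
     [0.0, 1.0, 0.0]]
W = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]  # placeholder weights
G_e = ehr_embedding(A, W)
drug_emb = [[0.2, 0.1, 0.0], [0.0, 0.3, 0.1], [0.1, 0.0, 0.4]]  # stand-in for D-MPNN output
r = matmul(drug_emb, G_e)  # medical information, |M| x |M|
print(len(r), len(r[0]))
```

With an identity feature matrix, the first layer simply returns the normalised adjacency, so every drug starts from a representation of its own neighbourhood in the EHR graph.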
Finally, the query vector is multiplied by the medical information r, where the query vector has size 1 × |M| and the medical information has size |M| × |M|. After multiplication, the values are compressed to the (0, 1) interval by a sigmoid function, which gives the recommendation result of the model, of size 1 × |M|. Each dimension of this vector represents a drug: if the corresponding value is greater than 0.5, the model recommends the drug to the patient; otherwise it does not.
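The scoring and thresholding step can be illustrated as follows. This is a small sketch with made-up inputs: the request vector, the matrix r, and the drug names are hypothetical, and only the sigmoid and the 0.5 threshold come from the text.

```python
import math

def recommend(request, r, drug_names, threshold=0.5):
    """Score each drug as sigmoid(request . r) and keep those above the threshold."""
    n_drugs = len(r[0])
    logits = [sum(q * row[j] for q, row in zip(request, r)) for j in range(n_drugs)]
    probs = [1.0 / (1.0 + math.exp(-z)) for z in logits]
    picked = [name for name, p in zip(drug_names, probs) if p > threshold]
    return probs, picked

# Toy example with two drugs; r is the identity so the logits equal the request.
request = [2.0, -1.0]
r = [[1.0, 0.0],
     [0.0, 1.0]]
probs, picked = recommend(request, r, ["drug_0", "drug_1"])
print(picked)
```

Because the sigmoid maps 0 to exactly 0.5, thresholding the output at 0.5 is the same as keeping the drugs whose pre-sigmoid score is positive.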
Experimental example: the experiment is based on the public dataset MIMIC-III (Johnson et al., 2016), which contains approximately 60,000 ICU admission records. The present invention evaluates the performance of PCCNet by comparing it with the following baseline methods:
· LR, standard logistic regression;
· ECC (Read et al., 2009), Ensemble Classifier Chain;
· RETAIN (Choi et al., 2016);
· LEAP (Zhang et al., 2017);
· DMNC (Le et al., 2018);
· GAMENet (Shang et al., 2019);
· MICRON (Yang et al., 2021);
· SafeDrug (Yang et al.);
· COGNet (Wu et al., 2022).
Following previous work (Shang et al., 2019; Yang et al., 2021; Yang et al.; Wu et al., 2022), the present invention uses DDI rate, Jaccard similarity coefficient, F1 score and PRAUC as evaluation indices.
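For orientation, three of these indices can be computed for a single visit as sketched below. The definitions are the usual set-based ones and are assumptions here, since the text does not spell them out; in particular the DDI rate is taken as the fraction of recommended drug pairs that appear in a set of known interacting pairs. Function names, drug names, and the interaction set are hypothetical, and PRAUC is omitted because it needs the raw scores rather than the thresholded set.

```python
from itertools import combinations

def jaccard(pred, true):
    """Size of the intersection divided by size of the union."""
    pred, true = set(pred), set(true)
    union = pred | true
    return len(pred & true) / len(union) if union else 0.0

def f1_score(pred, true):
    """Harmonic mean of precision and recall over the drug sets."""
    pred, true = set(pred), set(true)
    hit = len(pred & true)
    if hit == 0:
        return 0.0
    precision, recall = hit / len(pred), hit / len(true)
    return 2 * precision * recall / (precision + recall)

def ddi_rate(pred, ddi_pairs):
    """Fraction of recommended drug pairs found in the known-interaction set."""
    pairs = list(combinations(sorted(set(pred)), 2))
    if not pairs:
        return 0.0
    bad = sum(1 for a, b in pairs if (a, b) in ddi_pairs or (b, a) in ddi_pairs)
    return bad / len(pairs)

# Toy visit: three recommended drugs, three prescribed drugs, one known interaction.
pred = ["d1", "d2", "d3"]
true = ["d2", "d3", "d4"]
ddi_pairs = {("d1", "d3")}
print(jaccard(pred, true), f1_score(pred, true), ddi_rate(pred, ddi_pairs))
```

Reported figures such as those in the results table would normally be averages of these per-visit values over the test set.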
Result analysis: as shown in the following table, the PCCNet proposed by the present invention greatly reduces the DDI rate of the recommended drug combination (0.0451, against 0.0589 for the next-lowest method, SafeDrug) while outperforming the state-of-the-art methods on both Jaccard and F1 score. Methods that do not take patient history into account, namely LR, ECC and LEAP, recommend only on the basis of current patient health information, which leads to poor results. Methods that do take patient history into account, namely RETAIN (Choi et al., 2016), DMNC (Le et al., 2018), GAMENet (Shang et al., 2019), SafeDrug (Yang et al.), MICRON (Yang et al., 2021) and COGNet (Wu et al., 2022), achieve better results. SafeDrug further improves performance by introducing the molecular structure of drugs, and its local bipartite encoder greatly reduces the DDI rate of the recommended drug combination. MICRON was the first method proposed to model changes in patient disease, but its results are slightly lower than those of SafeDrug because it does not model drug-drug and drug-disease relationships. COGNet uses a Transformer-like (Vaswani et al., 2017) structure to model patients and drugs and introduces both drug vectors and patient electronic health record graphs, but its recommendations have a higher DDI rate.
Model | Jaccard | F1 | PRAUC | DDI rate |
---|---|---|---|---|
LR | 0.4865±0.0021 | 0.6434±0.0019 | 0.7509±0.0018 | 0.0829±0.0009 |
ECC | 0.4996±0.0049 | 0.6569±0.0044 | 0.6844±0.0038 | 0.0846±0.0018 |
RETAIN | 0.4887±0.0028 | 0.6481±0.0027 | 0.7556±0.0033 | 0.0835±0.0020 |
LEAP | 0.4521±0.0024 | 0.6138±0.0026 | 0.6549±0.0033 | 0.0731±0.0008 |
DMNC | 0.4864±0.0025 | 0.6529±0.0030 | 0.7580±0.0039 | 0.0842±0.0011 |
GAMENet | 0.5067±0.0025 | 0.6626±0.0025 | 0.7631±0.0030 | 0.0864±0.0006 |
MICRON | 0.5100±0.0033 | 0.6654±0.0031 | 0.7687±0.0026 | 0.0641±0.0007 |
SafeDrug | 0.5213±0.0030 | 0.6768±0.0027 | 0.7647±0.0025 | 0.0589±0.0005 |
COGNet | 0.5336±0.0011 | 0.6869±0.0010 | 0.7739±0.0009 | 0.0852±0.0005 |
PCCNet (this work) | 0.5352±0.0031 | 0.6875±0.0014 | 0.7655±0.0033 | 0.0451±0.0025 |
Sample analysis: to interpret the effect of the previously extracted core disease vector x_i^(t), a sample analysis was performed to demonstrate the effectiveness of the model. From the MIMIC-III dataset, the present invention randomly selects one patient for analysis. This patient visited the hospital three times in total. The first visit shows diseases such as liver cirrhosis and congestive heart failure; the second shows hepatic coma, acute respiratory failure and congestive heart failure; and the third shows acute renal failure and esophagitis, among others. Patient information and drug information are listed in the table below.
The present invention uses ICD-9 (International Classification of Diseases, Ninth Revision) codes and ATC (Anatomical Therapeutic Chemical) codes to represent the patient's diseases and the drugs used.
The patient record also shows that the patient's condition at a visit may be the same as the previously diagnosed disease or a pathology that develops from it. In the PCCNet model, the present invention uses only the patient's current core disease vector x_i^(T), that is, the core disease vector derived from the patient's current diagnostic information, and uses its output to obtain the drugs. To explore the effect of each core disease vector x_i^(t), the invention makes a recommendation with each x_i^(t) and records the results, which are shown in the table below. The drugs hit by each recommendation are underlined in the table.
It can be seen from the table that each recommendation yields some correct drugs, which indicates that the core disease vector is valid for the recommendation task. Moreover, drugs that are prescribed at every visit of the patient also appear in the combinations recommended from the core disease vector; for example, the model gives J01C in each recommendation.
However, the present invention finds some unusual aspects in the recommendations. For example, drug A12A appears in every prescription made by the physician, and every recommendation made with the core disease vector also hits A12A. Looking up the specific drug name shows that A12A actually represents other mineral supplements, which means that recommendations made with the core disease vector may include not only core drugs but also some adjunctive therapeutic drugs. The present invention studied this case and found that, because approximately half of the drugs prescribed by the physician are adjunctive therapeutic drugs, each recommendation made with the core disease vector contains three to four adjunctive therapeutic drugs. This does not affect the accuracy of the model, however, since these drugs are also prescribed by the doctor for the patient; that is, the patient also needs these adjunctive drugs.
The drug recommendation model PCCNet provided by the invention uses the patient's historical visit information to better capture the evolution of the patient's diseases, and thereby gives a more accurate drug combination with a lower DDI rate. The invention verifies the performance of the model on the public dataset MIMIC-III. The results show that PCCNet can greatly reduce the DDI rate of the recommended combination while still providing an accurate recommended medication combination. Indeed, reducing the DDI of a drug combination is the starting point of PCCNet. In some practical treatments, a low DDI rate is even more important than the accuracy of the drug combination itself. Finally, the present invention demonstrates the effectiveness of each component of PCCNet by ablation. In future work, the present invention will focus on further ways to reduce DDI while increasing the model score.
An embodiment of the present invention further provides an electronic device, which includes a memory, a processor, and a computer program stored in the memory and executable on the processor; when the processor executes the computer program, it implements the steps of the method provided by the above embodiments. The electronic device provided by the embodiment of the invention can realize each implementation of the method embodiments and the corresponding beneficial effects.
The embodiment of the invention also provides a computer readable storage medium, wherein a computer program is stored on the computer readable storage medium, and when the computer program is executed by a processor, the method provided by the embodiment of the invention is realized, and the same technical effect can be achieved.
Those skilled in the art will appreciate that all or part of the processes of the methods of the embodiments described above can be implemented by a computer program, which can be stored in a computer readable storage medium and can include the processes of the embodiments of the methods described above when executed. The storage medium may be a magnetic disk, an optical disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), or the like.
Although the present invention has been described with reference to the preferred embodiments, it will be understood by those skilled in the art that various changes may be made and equivalents may be substituted for elements thereof without departing from the scope of the present invention.
Claims (9)
1. A medicine recommendation method based on machine learning, characterized by comprising:
learning the history and current medical data of the patient through a machine learning model;
the machine learning model outputs a core disease condition vector and a global drug vector of the patient according to the history of the patient and the current medical data;
the machine learning model encodes and outputs medical information according to the medicine graph and the EHR graph;
the machine learning model obtains model output according to the core disease condition vector, the global medicine vector and the medical information;
and mapping the model output according to a threshold value to obtain a recommended medicine combination, wherein the medicine combination comprises a core medicine and an extension medicine.
2. The machine learning-based medication recommendation method of claim 1,
the machine learning model performs the following steps:
concatenating the process vector P_i and the diagnostic vector D_i to obtain a patient representation h_i^(t), where h_i^(t) is the patient representation of the ith patient at the tth visit;
modeling the relationships between the patient representations from the t visits with a GRU, whose input is the patient representation h_i^(t) and whose output is the patient disease information e_i^(t);
inputting the patient disease information e_i^(t) into an attention module to obtain the attention score a_i^(t) of e_i^(t);
multiplying the attention score a_i^(t) by the patient disease information e_i^(t) to obtain and output the patient's core disease vector x_i^(t);
inputting the current core disease vector x_i^(T), taken from the patient's core disease vectors x_i^(t), into a linear layer to obtain a core query vector q_i;
encoding the drug molecular graphs with the D-MPNN to obtain the embeddings of all drugs;
encoding the EHR graph with the GCN to obtain the embedding of the patient's EHR graph;
multiplying the embeddings of all drugs by the embedding of the patient's EHR graph to obtain the medical information r;
inputting the current patient representation h_i^(n) into a global drug linear layer to obtain the global drug vector q'_i of the ith patient at the current time;
adding the core query vector q_i and the global drug vector q'_i to obtain a request vector;
multiplying the request vector by the medical information r to obtain the model output of patient i, which is represented by the following formula:
wherein λ is a learnable parameter and the multiplication is element-wise.
3. The machine learning-based medication recommendation method of claim 2,
the core query vector q_i has size 1 × |M|, and the medical information has size |M| × |M|;
the values of the recommendation result are compressed to the (0, 1) interval by a sigmoid function, and the model output of patient i is the resulting drug combination vector, of size 1 × |M|;
each dimension of the vector represents a drug; if the value corresponding to a drug is greater than 0.5, the machine learning model recommends the drug for the patient, and otherwise it does not; the drug combination formed by the recommended drugs is the recommendation of the machine learning model.
4. The machine learning-based medication recommendation method of claim 2, wherein the modeling of the relationship between patient representations from t visits by the GRU is represented by:
u_t = σ(W_u h_t + U_u e_{t-1} + b_u)
r_t = σ(W_r h_t + U_r e_{t-1} + b_r)
wherein: σ is the sigmoid activation function; the product is element-wise; u_t is the update gate of the GRU module, which captures long-term relationships by controlling how much of the previous hidden state flows to the next GRU; r_t is the reset gate of the GRU module, which captures short-term dependencies in the sequence by controlling how much of the previous hidden state flows into the candidate hidden state at time t; e_t is the hidden state of the tth GRU module and depends on the hidden state e_{t-1} of the previous GRU module, which is input to the tth GRU module through a path of the model; W_u, W_r and W_h are parameter matrices; U_u, U_r and U_h, each of size n_H × n_H, are parameter matrices, and n_H is the hidden layer size.
5. The machine-learning based drug recommendation method of claim 2, wherein the attention function of the attention module is represented by:
wherein: a_i^(t) is the attention score of the input e_i^(t); W is a parameter matrix; e_i^(T) is the current patient condition information; and the attention score assigned to e_i^(t) is obtained by calculating the similarity between the current patient condition information e_i^(T) and the condition information e_i^(t) of the tth visit.
6. The machine-learning based drug recommendation method of claim 2, wherein the step of encoding the drug molecule graph by D-MPNN specifically comprises
obtaining a molecular graph set G of the drugs according to the drug set M, wherein the molecular graph of each drug consists of atoms and atom-atom edges;
obtaining, from the molecular graph G_i of the ith drug, the neighbor set N(v) of each node v in G_i;
the calculation process of D-MPNN is as follows:
wherein: the message term is the encoded message between node v and node w at the t-th iteration; a_k is the atomic feature of atom k; t is the layer index; the hidden-state term is the hidden state between node k and node v; W_i is a learnable parameter matrix; cat(a_v, e_vw) is the concatenation of the atomic feature a_v of atom v and the bond feature e_vw between nodes v and w; τ denotes the ReLU activation function;
summing all hidden states and applying W_a to obtain the embeddings of all drugs in the drug molecular graphs:
h_v = τ(W_a cat(a_v, m_v))
wherein: W_a is a learnable parameter matrix; T is the total number of visits by patient i; h_v is the embedding of node v; m_v is the initial molecular graph encoding, which is the sum of the hidden states among all nodes.
7. The machine learning-based medication recommendation method of claim 2, wherein said step of encoding the EHR graph with the GCN to obtain the embedding of the patient's EHR graph includes:
letting C denote the drug feature vector matrix that the GCN algorithm takes as input, A denote the EHR graph matrix of the patient, and GCN denote the GCN function, the EHR encoding is calculated as follows:
the GCN function is defined as follows:
the EHR graph is input into a two-layer GCN to obtain the embedding G_e of the patient's EHR graph, calculated as follows:
G_e = GCN(ReLU(GCN(C, A)) W, A)
where W is a parameter matrix used to adjust the bias between the two GCN layers.
8. An electronic device, comprising: memory, processor and computer program stored on the memory and executable on the processor, the processor implementing the steps in the method as claimed in claims 1 to 7 when executing the computer program.
9. A computer-readable storage medium, characterized in that a computer program is stored thereon, which computer program, when being executed by a processor, carries out the steps of the method as claimed in claims 1-7.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210548894.2A CN115240873A (en) | 2022-05-20 | 2022-05-20 | Medicine recommendation method based on machine learning, electronic equipment and computer-readable storage medium |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210548894.2A CN115240873A (en) | 2022-05-20 | 2022-05-20 | Medicine recommendation method based on machine learning, electronic equipment and computer-readable storage medium |
Publications (1)
Publication Number | Publication Date |
---|---|
CN115240873A (en) | 2022-10-25 |
Family
ID=83668414
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202210548894.2A Pending CN115240873A (en) | 2022-05-20 | 2022-05-20 | Medicine recommendation method based on machine learning, electronic equipment and computer-readable storage medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN115240873A (en) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN116189847A (en) * | 2023-05-05 | 2023-05-30 | 武汉纺织大学 | Safety medicine recommendation method based on LSTM-CNN strategy of attention mechanism |
CN116417148A (en) * | 2023-05-11 | 2023-07-11 | 纳里健康科技有限公司 | Medical information recommendation method considering user activation interest in 5G application field |
-
2022
- 2022-05-20 CN CN202210548894.2A patent/CN115240873A/en active Pending
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Shang et al. | Gamenet: Graph augmented memory networks for recommending medication combination | |
Yang et al. | Intelligent health care: Applications of deep learning in computational medicine | |
Beam et al. | Clinical concept embeddings learned from massive sources of multimodal medical data | |
Yang et al. | Change matters: Medication change prediction with recurrent residual networks | |
Degoulet et al. | Introduction to clinical informatics | |
Wagholikar et al. | Modeling paradigms for medical diagnostic decision support: a survey and future directions | |
Sushil et al. | Patient representation learning and interpretable evaluation using clinical notes | |
CN111798954A (en) | Drug combination recommendation method based on time attention mechanism and graph convolution network | |
CN113724814B (en) | Triage method, triage device, computing equipment and storage medium | |
CN109887606B (en) | Attention-based diagnosis and prediction method for bidirectional recurrent neural network | |
Guo et al. | An interpretable disease onset predictive model using crossover attention mechanism from electronic health records | |
CN114783603A (en) | Multi-source graph neural network fusion-based disease risk prediction method and system | |
Davazdahemami et al. | A deep learning approach for predicting early bounce-backs to the emergency departments | |
Gong et al. | Prognosis analysis of heart failure based on recurrent attention model | |
Li et al. | DeepAlerts: deep learning based multi-horizon alerts for clinical deterioration on oncology hospital wards | |
Shang et al. | Knowledge guided multi-instance multi-label learning via neural networks in medicines prediction | |
CN115240873A (en) | Medicine recommendation method based on machine learning, electronic equipment and computer-readable storage medium | |
Liu et al. | Research on named entity recognition of Traditional Chinese Medicine chest discomfort cases incorporating domain vocabulary features | |
CN113902186A (en) | Patient death risk prediction method, system, terminal and readable storage medium based on electronic medical record | |
Nath et al. | Application of specialized word embeddings and named entity and attribute recognition to the problem of unsupervised automated clinical coding | |
US20200243194A1 (en) | Computerized Medical Diagnostic and Treatment Guidance | |
Sun et al. | Deep dynamic patient similarity analysis: model development and validation in ICU | |
Yu et al. | AKA-SafeMed: A safe medication recommendation based on attention mechanism and knowledge augmentation | |
Jagannatha et al. | Bidirectional recurrent neural networks for medical event detection in electronic health records | |
Li et al. | A patient information mining network for drug recommendation |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||