CN114694841B - Adverse event risk prediction method based on patient electronic health record - Google Patents

Adverse event risk prediction method based on patient electronic health record

Info

Publication number
CN114694841B
CN114694841B CN202210322129.9A CN202210322129A CN114694841B CN 114694841 B CN114694841 B CN 114694841B CN 202210322129 A CN202210322129 A CN 202210322129A CN 114694841 B CN114694841 B CN 114694841B
Authority
CN
China
Prior art keywords
data
patient
representing
vector
basic
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202210322129.9A
Other languages
Chinese (zh)
Other versions
CN114694841A (en)
Inventor
郑恒杰
刘勇国
张云
朱嘉静
李巧勤
傅翀
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
University of Electronic Science and Technology of China
Original Assignee
University of Electronic Science and Technology of China
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by University of Electronic Science and Technology of China
Priority to CN202210322129.9A
Publication of CN114694841A
Priority to ZA2022/08574A (ZA202208574B)
Application granted
Publication of CN114694841B
Legal status: Active
Anticipated expiration


Classifications

    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H50/00 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
    • G16H50/30 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for calculating health indices; for individual health risk assessment
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/23 Clustering techniques
    • G06F18/232 Non-hierarchical techniques
    • G06F18/2321 Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions
    • G06F18/23213 Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions with fixed number of clusters, e.g. K-means clustering
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/047 Probabilistic or stochastic networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G06N3/084 Backpropagation, e.g. using gradient descent
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H10/00 ICT specially adapted for the handling or processing of patient-related medical or healthcare data
    • G16H10/60 ICT specially adapted for the handling or processing of patient-related medical or healthcare data for patient-specific data, e.g. for electronic patient records
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02A TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
    • Y02A90/00 Technologies having an indirect contribution to adaptation to climate change
    • Y02A90/10 Information and communication technologies [ICT] supporting adaptation to climate change, e.g. for weather forecasting or climate simulation

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Medical Informatics (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biomedical Technology (AREA)
  • Evolutionary Computation (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Public Health (AREA)
  • Mathematical Physics (AREA)
  • Epidemiology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Probability & Statistics with Applications (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Primary Health Care (AREA)
  • Software Systems (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Databases & Information Systems (AREA)
  • Pathology (AREA)
  • Medical Treatment And Welfare Office Work (AREA)
  • Measuring And Recording Apparatus For Diagnosis (AREA)

Abstract

The invention discloses an adverse event risk prediction method based on a patient's electronic health record, comprising the following steps: S1, preprocessing the data; S2, performing K-means clustering and sampling, dividing the data into 3 clusters to obtain 3 cluster centers; S3, sorting the 3 cluster centers from small to large by the maximum value of $P^*$ in the three subsets, which serve respectively as the rare-code subset, the more-common-code subset, and the common-code subset, then feeding the three subsets to the three basic classifiers GRAM+, Dipole+, and RNN+, respectively, for pre-training, and finally fusing the three basic classifiers. The method uses a clustering algorithm to sample suitable training samples for each basic learner and designs an adaptive combination strategy that generates the integration weights of the basic classifiers according to the distance from a training sample to the center of each pre-training set, which makes the model more adaptive. In addition, sampling after clustering significantly reduces the amount of computation needed to train the basic embeddings.

Description

Adverse event risk prediction method based on patient electronic health record
Technical Field
The invention relates to an adverse event risk prediction method based on an electronic health record of a patient.
Background
AIDS is a highly harmful infectious disease caused by infection with the human immunodeficiency virus (HIV). HIV mainly attacks CD4+ T lymphocytes, the most important cells of the human immune system, so that the body loses its immune function, becomes susceptible to many diseases, and faces a high fatality rate. A patient who is actively treated after contracting AIDS can achieve a relatively good outcome, but adverse events such as serious complications impair the treatment effect. Predicting possible future adverse events such as complications by combining conventional risk factors with factors specific to AIDS patients can provide strong support for guiding their medical care. The electronic health records (EHRs) of AIDS patients contain the medical codes of each visit as well as personalized data such as demographic data and vital signs. The medical codes include diagnosis, medication, and procedure codes; for example, the diagnosis code 585.9 denotes chronic kidney disease, and procedure codes denote interventions, treatments, and similar procedures. Each code represents a symptom, disease, abnormal finding, intervention, treatment, or the like. These data are used to predict possible future adverse events so as to help doctors make more reasonable decisions about the patient's medical care.
Chinese patent application CN109887606A, "A diagnosis and prediction method of an attention-based bidirectional recurrent neural network", provides a prediction method based on an attention-based bidirectional recurrent neural network. High-dimensional medical codes (i.e. clinical variables) are first embedded into a low-dimensional code-layer space; the code representations are then input into the attention-based bidirectional recurrent neural network to generate hidden state representations; finally, the medical codes of future visits are predicted by a softmax layer.
Edward Choi et al. (E. Choi, M. T. Bahadori, L. Song, et al. UA-CRNN: GRAM: Graph-based Attention Model for Healthcare Representation Learning [C]. In: Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, London, 2018, pp. 249-256) propose a representation learning method based on a knowledge-graph attention mechanism. It mainly uses the hierarchical information inherent in a medical ontology to learn more informative embedded representations of medical codes and then performs prediction with a deep learning method. However, the above methods have the following problems: (1) the model depends on the amount of training data: it predicts well when training data are sufficient and poorly when they are not; (2) the medical ontology knowledge contained in medical codes is ignored, so prediction is poor for medical codes that occur rarely.
To learn embedded representations of medical codes that carry richer information, the knowledge-graph-based representation learning method incurs a larger computational cost and is harder to train. In addition, the above methods ignore individual differences between patients, which affects prediction accuracy.
Disclosure of Invention
The invention aims to overcome the defects of the prior art and provides an adverse event risk prediction method based on a patient's electronic health record. The method uses a clustering algorithm to sample suitable training samples for each basic learner and designs an adaptive combination strategy that generates the integration weights of the basic classifiers according to the distance between a training sample and the center of each pre-training set, so that the model is more adaptive.
The purpose of the invention is achieved by the following technical solution: an adverse event risk prediction method based on a patient's electronic health record, comprising the following steps:
S1, data preprocessing: in the electronic health record data, the data of each patient are treated as a time-ordered sequence of visits; the sequence is processed as follows:
S11, let $C = \{c_1, c_2, \ldots, c_N\}$ denote the set of all diagnosis codes, where $c_i$ is the $i$-th diagnosis code, $1 \le i \le N$, and $N$ is the total number of diagnosis codes; let $X = [x_1, x_2, \ldots, x_T]$ denote the visit information of a patient, where the $t$-th visit $x_t \in \{0,1\}^N$ is a vector of $N$ elements, each equal to 0 or 1, i.e. $x_t = \{x_{t1}, x_{t2}, \ldots, x_{ti}, \ldots, x_{tN}\}$; if the diagnosis code $c_i \in \{c_1, c_2, \ldots, c_N\}$ with index $i$ appears in the $t$-th visit, then $x_{ti} = 1$, otherwise $x_{ti} = 0$;
S12, let $L = [l_1, l_2, \ldots, l_T]$ denote the personalized data of all visits of a patient, where $l_i$ is the vector representation of the personalized data record of the $i$-th visit; averaging over the patient's $T$ visits gives $l^*$, the mean of each kind of data across visits; missing numerical values are filled with the mean, and missing non-numerical values are filled with the value that occurs most frequently in the patient's data, following the statistical mode;
S13, the diagnosis codes in $X$ are summed to obtain $s^*$, the number of times each unique diagnosis code occurs in all visits of a patient, i.e.
$$s^* = \sum_{t=1}^{T} x_t$$
All $s^*$ are then summed to obtain $S^*$, the number of times each unique diagnosis code occurs in all data; let $P^* = s^*/S^*$ denote the proportion of the occurrences of each diagnosis code in a patient's data relative to all data;
after this processing, the data of the $j$-th patient consist of three parts $X_j$, $L_j$, $F_j$, $1 \le j \le M$, where $M$ is the number of patients whose data were collected; $F_j = [l_j^*, P_j^*]$, where $l_j^*$ is the mean of each kind of data across the visits of the $j$-th patient and $P_j^*$ is the proportion of the occurrences of each diagnosis code in the $j$-th patient's data relative to all data;
S2, K-means clustering and sampling: the data $F_j = [l_j^*, P_j^*]$ of each patient are used as sample points for K-means clustering, which divides the data into 3 clusters and yields 3 cluster centers $\theta_1, \theta_2, \theta_3$; the distance between the $F_j$ of each patient and the $F'$ of each cluster center is then computed, and for each cluster center, at the same sampling rate (symbol given as image BDA0003572143720000027), the corresponding sub-dataset is selected from the data of all patients in order of increasing distance, giving $D' = D_1' \cup D_2' \cup D_3'$; the resulting sub-datasets are used to train the basic classifiers;
S3, the 3 cluster centers $\theta_1, \theta_2, \theta_3$ are sorted from small to large by the maximum value of $P^*$ in the three subsets, which serve respectively as the rare-code subset, the more-common-code subset, and the common-code subset; the three subsets are then fed to the three basic classifiers GRAM+, Dipole+, and RNN+, respectively, for pre-training, and the three basic classifiers are finally fused.
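Step S2 can be illustrated with a minimal pure-Python sketch. It is not the patented implementation: the function names `kmeans` and `sample_nearest`, the Euclidean distance, the random initialisation, and the reading of the sampling rate as a fraction of all M patients are assumptions made for illustration only.

```python
import math
import random

def dist(a, b):
    """Euclidean distance between two feature vectors."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))

def kmeans(points, k=3, iters=50, seed=0):
    """Plain Lloyd's algorithm; returns k cluster centers."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda c: dist(p, centers[c]))
            clusters[nearest].append(p)
        # An empty cluster keeps its previous center.
        new_centers = [
            [sum(col) / len(cl) for col in zip(*cl)] if cl else centers[c]
            for c, cl in enumerate(clusters)
        ]
        if new_centers == centers:
            break
        centers = new_centers
    return centers

def sample_nearest(points, centers, rate):
    """For each center, keep the indices of the ceil(rate * M) closest patients."""
    n_keep = max(1, math.ceil(rate * len(points)))
    subsets = []
    for c in centers:
        order = sorted(range(len(points)), key=lambda j: dist(points[j], c))
        subsets.append(order[:n_keep])
    return subsets

# Toy F_j vectors for six patients (hypothetical values).
points = [[0.0, 0.0], [0.0, 1.0], [10.0, 10.0],
          [10.0, 11.0], [20.0, 0.0], [20.0, 1.0]]
centers = kmeans(points, k=3)
subsets = sample_nearest(points, centers, rate=0.25)
```

Because every center draws from the data of all patients, the three sub-datasets may overlap, which is consistent with writing the result as a union. K-means with random initialisation can settle in a local optimum, so a practical pipeline would typically restart it several times.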
Further, GRAM+ extends GRAM with a global attention mechanism guided by the patient's personalized data. It is designed as follows:
in the knowledge directed acyclic graph formed by the medical ontology, each leaf node is an element of the diagnosis code set of S11, and the ancestor nodes of a leaf node represent the concepts from which the concept represented by the leaf node is derived; every node $c$ is assigned a basic embedding vector $e$, and the final representation of a leaf node is a convex combination of the basic embeddings of itself and its ancestor nodes:
$$g_i = \sum_{j \in A(i)} \alpha_{ij} e_j$$
where $g_i$ is the embedded representation of medical code $c_i$, $A(i)$ is the set of indices of $c_i$ and the ancestor nodes of $c_i$, and $\alpha_{ij}$ is the local attention weight, computed by the Softmax function as follows:
$$\alpha_{ij} = \frac{\exp(f(e_i, e_j))}{\sum_{k \in A(i)} \exp(f(e_i, e_k))}$$
$f(e_i, e_j)$ is a scalar value representing the compatibility between the two basic embeddings $e_i$ and $e_j$; it is obtained by a multilayer perceptron;
the final representations $g_1, g_2, \ldots, g_N$ of all medical codes are concatenated to obtain the embedding matrix $G$, and the diagnosis vector $v_t$ is obtained by multiplying the vector $x_t$ by the embedding matrix $G$ and applying the nonlinear $\tanh(\cdot)$ function:
$$v_1, v_2, \ldots, v_T = \tanh(G[x_1, x_2, \ldots, x_T])$$
The patient's personalized data $L = [l_1, l_2, \ldots, l_T]$ are then used to add a global attention weight $\beta_t$, giving the global representation $u_t$ that contains the patient's personalized data information:
$$u_t = \beta_t v_t, \quad t = 1, 2, \ldots, T$$
$\beta_t$ is computed by the following Softmax function:
$$\beta_t = \frac{\exp(f(l_t, l^*))}{\sum_{i=1}^{T} \exp(f(l_i, l^*))}$$
$f(l_i, l^*)$ is a scalar value representing the compatibility between $l_i$ and $l^*$; it is obtained by a multilayer perceptron;
$u_1, u_2, \ldots, u_T$ are input into a GRU network to obtain the hidden state representations $h_1, h_2, \ldots, h_T$, and a Softmax layer generates the prediction $\hat{y}_t$ for the $t$-th visit, defined as:
$$h_1, h_2, \ldots, h_T = \mathrm{GRU}(u_1, u_2, \ldots, u_T; \theta_r)$$
$$\hat{y}_t = \mathrm{Softmax}(W h_t + b)$$
$\theta_r$ denotes the hyperparameters of the GRU network, and $W$ and $b$ are the weights and biases to be learned;
the loss is computed from the true diagnosis information $y_t$ and the prediction $\hat{y}_t$ as follows:
$$\mathcal{L} = -\frac{1}{T-1} \sum_{t=1}^{T-1} \left( y_t^{\top} \log \hat{y}_t + (1 - y_t)^{\top} \log(1 - \hat{y}_t) \right)$$
The superscript $\top$ denotes the transpose. The loss, i.e. the error between the prediction and the true value, is back-propagated, and the parameters are learned and corrected until the GRAM+ model converges.
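The local attention step of GRAM+ (the final representation of a leaf code as a convex combination of its own and its ancestors' basic embeddings) can be sketched as follows. This is only an illustration under stated assumptions: the compatibility function is taken to be a one-hidden-layer perceptron, and the weights `W`, `b`, `u`, the two-dimensional embeddings, and the ancestor code "585" and node "root" are hypothetical values chosen for the example.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of scalars."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def mlp_score(e_i, e_j, W, b, u):
    """f(e_i, e_j) = u^T tanh(W [e_i; e_j] + b), an assumed one-hidden-layer MLP."""
    z = e_i + e_j  # list concatenation of the two embeddings
    hidden = [math.tanh(sum(w * x for w, x in zip(row, z)) + b_k)
              for row, b_k in zip(W, b)]
    return sum(u_k * h for u_k, h in zip(u, hidden))

def final_embedding(i, A, E, W, b, u):
    """g_i = sum over j in A(i) of alpha_ij * e_j, with alpha from a softmax."""
    scores = [mlp_score(E[i], E[j], W, b, u) for j in A[i]]
    alpha = softmax(scores)
    dim = len(E[i])
    g = [sum(a * E[j][d] for a, j in zip(alpha, A[i])) for d in range(dim)]
    return g, alpha

# Hypothetical basic embeddings for a leaf code and two ancestors.
E = {"585.9": [1.0, 0.0], "585": [0.0, 1.0], "root": [0.5, 0.5]}
A = {"585.9": ["585.9", "585", "root"]}  # A(i): the code itself plus ancestors
W = [[0.1, -0.2, 0.3, 0.0], [0.0, 0.4, -0.1, 0.2]]
b = [0.0, 0.1]
u = [1.0, -1.0]
g, alpha = final_embedding("585.9", A, E, W, b, u)
```

Because the weights come from a softmax, they are positive and sum to one, so `g` always lies inside the convex hull of the leaf's and ancestors' embeddings. A rarely observed code can thereby borrow information from better-trained ancestors.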
Further, Dipole+ uses the patient's personalized data $L = [l_1, l_2, \ldots, l_T]$ as guidance and combines a bidirectional recurrent neural network with an attention mechanism to predict the patient's visit information. First, the visit information $X$ is embedded into representation vectors $v_t$ by a multilayer perceptron; the patient's personalized data $L = [l_1, l_2, \ldots, l_T]$ are then used to add a global attention weight $\beta_t$, giving the global representation $u_t$ that contains the patient's personalized data information:
$$u_t = \beta_t v_t, \quad t = 1, 2, \ldots, T$$
$\beta_t$ is computed by the following Softmax function:
$$\beta_t = \frac{\exp(f(l_t, l^*))}{\sum_{i=1}^{T} \exp(f(l_i, l^*))}$$
$f(l_i, l^*)$ is a scalar value;
the vectors $u_t$ are then input into a bidirectional recurrent neural network; finally, the outputs of the two directions are concatenated, and an attention mechanism based on position in the data sequence generates the latent vector used for prediction.
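The global attention shared by GRAM+, Dipole+, and RNN+ can be sketched in a few lines. The sketch assumes that the weight of visit t is a softmax over the visits of the compatibility between that visit's personalized data and the patient mean; the compatibility function used here (negative squared distance) is a hypothetical stand-in for the multilayer perceptron named in the text, and all numbers are toy values.

```python
import math

def visit_vector(G, x_t):
    """v_t = tanh(G x_t); G has one row per embedding dimension, one column per code."""
    return [math.tanh(sum(g * x for g, x in zip(row, x_t))) for row in G]

def global_attention(L, l_star, score):
    """beta_t: softmax over visits of score(l_t, l*)."""
    scores = [score(l_t, l_star) for l_t in L]
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def guided_visits(V, beta):
    """u_t = beta_t * v_t for every visit."""
    return [[b * v for v in v_t] for v_t, b in zip(V, beta)]

def neg_sq_dist(l_t, l_star):
    """Hypothetical stand-in for the learned compatibility function f(l_t, l*)."""
    return -sum((a - b) ** 2 for a, b in zip(l_t, l_star))

L = [[1.0], [3.0]]            # personalized data of two visits
l_star = [2.0]                # their mean
V = [[0.2, 0.4], [1.0, -1.0]]  # visit vectors v_1, v_2
beta = global_attention(L, l_star, neg_sq_dist)
U = guided_visits(V, beta)
v = visit_vector([[0.5, -0.5, 0.0], [0.0, 1.0, 1.0]], [1, 1, 0])
```

In this toy case both visits are equally far from the mean, so they receive equal weight; with a learned compatibility function the weights would instead reflect how each visit's personalized data relate to the patient's average profile.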
Further, RNN+ is based on an RNN and uses the patient's personalized data $L = [l_1, l_2, \ldots, l_T]$ to guide the patient's visit information $X$ so as to generate the global representation vectors $u_t$ that contain the patient's personalized data information. The global representation vectors $u_t$ are computed in the same way as in Dipole+; they are fed into an attention model based on position in the data sequence, and a unidirectional GRU then makes the prediction.
Further, an adaptive weighted integration strategy is adopted in the model fusion stage: for each sample $X_i = [x_1, x_2, \ldots, x_T]$, its distance to each cluster center is computed, $d_i = [\delta_{i1}, \delta_{i2}, \delta_{i3}]$;
for each sample, the integration weight $w_i$ is generated from $d_i$ by a formula given as image BDA0003572143720000051;
the final integrated output is given by a formula given as image BDA0003572143720000052, which combines the outputs of the three basic classifiers (symbols given as image BDA0003572143720000053).
The invention has the following beneficial effects. Compared with the prior art, the technical solution of the invention takes individual differences between patients into account and uses the patient's personalized data to guide the model toward reasonable attention, which improves prediction accuracy. It also takes into account the influence of different sample sizes on model performance: a clustering algorithm samples suitable training samples for each basic learner, and an adaptive combination strategy generates the integration weights of the basic classifiers according to the distance from a training sample to the center of each pre-training set, which makes the model more adaptive. In addition, sampling after clustering significantly reduces the amount of computation needed to train the basic embeddings.
Drawings
FIG. 1 is a flow chart of an adverse event risk prediction method based on an electronic patient health record of the present invention;
FIG. 2 is a diagram illustrating the structure of the GRAM+ classifier of the present invention;
FIG. 3 is a schematic diagram of the integration strategy of the present invention.
Detailed Description
The invention provides an effective method for predicting the risk of adverse events from the electronic health records of AIDS patients. A clustering method first samples a suitable pre-training set for each basic learner, so that the classifiers' differing prediction performance on codes of different frequencies can be combined. An adaptive combination strategy then generates the integration weights of the basic classifiers according to the distance between a training sample and the center of each pre-training set, which balances the differences in a single model's prediction performance across medical codes of different frequencies. In addition, an attention mechanism guided by personalized data is added to the model to account for individual differences and improve accuracy. The technical solution of the invention is further explained below with reference to the drawings.
As shown in FIG. 1, the adverse event risk prediction method based on a patient's electronic health record of the present invention comprises the following steps:
S1, data preprocessing: in the electronic health record data, the data of each patient are treated as a time-ordered sequence of visits, and each visit contains multiple medical codes (diagnosis, medication, procedure codes, and the like); the sequence is processed as follows:
S11, let $C = \{c_1, c_2, \ldots, c_N\}$ denote the set of all diagnosis codes, where $c_i$ is the $i$-th diagnosis code, $1 \le i \le N$, and $N$ is the total number of diagnosis codes; let $X = [x_1, x_2, \ldots, x_T]$ denote the visit information of a patient, where the $t$-th visit $x_t \in \{0,1\}^N$ is a binary vector of $N$ elements, each equal to 0 or 1, i.e. $x_t = \{x_{t1}, x_{t2}, \ldots, x_{ti}, \ldots, x_{tN}\}$; if the diagnosis code $c_i \in \{c_1, c_2, \ldots, c_N\}$ with index $i$ appears in the $t$-th visit, then $x_{ti} = 1$, otherwise $x_{ti} = 0$;
S12, let $L = [l_1, l_2, \ldots, l_T]$ denote the personalized data of all visits of a patient, where $l_i$ is the vector representation of the personalized data record of the $i$-th visit; averaging over the patient's $T$ visits gives $l^*$, the mean of each kind of data across visits; missing numerical values are filled with the mean, and missing non-numerical values are filled with the value that occurs most frequently in the patient's data, following the statistical mode;
S13, the diagnosis codes in $X$ are summed to obtain $s^*$, the number of times each unique diagnosis code occurs in all visits of a patient, i.e.
$$s^* = \sum_{t=1}^{T} x_t$$
All $s^*$ are then summed to obtain $S^*$, the number of times each unique diagnosis code occurs in all data; let $P^* = s^*/S^*$ denote the proportion of the occurrences of each diagnosis code in a patient's data relative to all data;
after this processing, the data of the $j$-th patient consist of three parts $X_j$, $L_j$, $F_j$, $1 \le j \le M$, where $M$ is the number of patients whose data were collected; $X_j$ is used to train the classifiers, $L_j$ guides the attention mechanism of the classifiers, and $F_j = [l_j^*, P_j^*]$ is used to cluster the patients, where $l_j^*$ is the mean of each kind of data across the visits of the $j$-th patient and $P_j^*$ is the proportion of the occurrences of each diagnosis code in the $j$-th patient's data relative to all data.
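The preprocessing of step S1 can be sketched in plain Python. This is a minimal illustration, not the patented implementation: the helper names, the toy patients, and the codes other than 585.9 are hypothetical, and the ratio P* = s*/S* is read here as an element-wise division in which S* is the per-code total over all patients, which is one plausible interpretation of the text.

```python
from collections import Counter

def multi_hot(visit_codes, code_index):
    """Step S11: encode one visit as a binary vector x_t in {0,1}^N."""
    x = [0] * len(code_index)
    for code in visit_codes:
        x[code_index[code]] = 1
    return x

def impute(column):
    """Step S12: fill None with the mean (numeric) or the mode (non-numeric)."""
    present = [v for v in column if v is not None]
    if not present:
        return list(column)  # nothing to estimate the fill value from
    if all(isinstance(v, (int, float)) for v in present):
        fill = sum(present) / len(present)
    else:
        fill = Counter(present).most_common(1)[0][0]
    return [fill if v is None else v for v in column]

def column_means(rows):
    """l*: mean of each personalized feature over a patient's visits."""
    return [sum(col) / len(col) for col in zip(*rows)]

def code_counts(X):
    """Step S13: s* = sum over visits of x_t, per-code counts for one patient."""
    return [sum(col) for col in zip(*X)]

def proportions(s_all):
    """P*_j = s*_j / S* element-wise (assumed), with S* summed over patients."""
    S = [sum(col) for col in zip(*s_all)]
    return [[s / t if t else 0.0 for s, t in zip(s_j, S)] for s_j in s_all]

# Toy data: three illustrative codes and two hypothetical patients.
code_index = {"042": 0, "585.9": 1, "486": 2}
visits = {"A": [["042"], ["042", "585.9"]], "B": [["042", "486"]]}
vitals = {"A": [[36.5, 70.0], [37.5, 72.0]], "B": [[37.0, 80.0]]}

X = {p: [multi_hot(v, code_index) for v in vs] for p, vs in visits.items()}
s_all = [code_counts(X[p]) for p in ("A", "B")]
P = proportions(s_all)
F = [column_means(vitals[p]) + P[k] for k, p in enumerate(("A", "B"))]
```

Each `F` entry concatenates the visit-averaged personalized data with the code proportions and is the vector later used as a K-means sample point. Since the two parts live on different scales, some feature scaling would probably be needed before clustering; the text does not specify one.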
S2, K-means clustering and sampling: the data $F = [l^*, P^*]$ of each patient are used as sample points for K-means clustering, which divides the data into 3 clusters (the number of clusters is generally equal to the number of basic classifiers used later); data with similar personalized data and similar diagnosis-code frequencies in the visit records are more likely to be gathered into the same cluster. Given the number of clusters 3, the K-means algorithm partitions the data $D$ of all patients into $D = D_1 \cup D_2 \cup D_3$ and yields 3 cluster centers $\theta_1, \theta_2, \theta_3$; the distance between the $F_j$ of each patient and the $F'$ of each cluster center is then computed, and for each cluster center, at the same sampling rate (symbol given as image BDA0003572143720000067), the corresponding sub-dataset is selected from the data of all patients in order of increasing distance, giving $D' = D_1' \cup D_2' \cup D_3'$; the resulting sub-datasets are used to train the basic classifiers;
S3, the 3 cluster centers $\theta_1, \theta_2, \theta_3$ are sorted from small to large by the maximum value of $P^*$ in the three subsets, which serve respectively as the rare-code subset, the more-common-code subset, and the common-code subset; the three subsets are then fed to the three basic classifiers GRAM+, Dipole+, and RNN+, respectively, for pre-training so that each learns its decision boundary, and the three basic classifiers are then fused;
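The assignment in step S3 can be sketched as follows, assuming that "the maximum value of P*" means the largest proportion found anywhere in a sampled subset. The function name `assign_subsets` and the toy proportions are hypothetical.

```python
def assign_subsets(subsets_p):
    """Map each sampled subset to a basic classifier by its largest P* value.

    subsets_p[k] holds the P* vectors of the patients sampled for cluster k.
    The subset with the smallest peak is treated as the rare-code subset.
    """
    peak = [max(max(p) for p in subset) for subset in subsets_p]
    order = sorted(range(len(subsets_p)), key=lambda k: peak[k])
    names = ["GRAM+", "Dipole+", "RNN+"]  # rare, more common, common codes
    return {name: k for name, k in zip(names, order)}

# Three toy subsets, each with one patient's P* vector.
subsets_p = [[[0.5, 0.1]], [[0.05, 0.02]], [[0.2, 0.1]]]
assignment = assign_subsets(subsets_p)
```

Here cluster 1 has the smallest peak proportion and is therefore routed to GRAM+, whose ontology-based embeddings are intended to help with rarely observed codes.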
GRAM+ extends GRAM with a global attention mechanism guided by the patient's personalized data. As shown in FIG. 2, it is designed as follows:
in the knowledge directed acyclic graph formed by the medical ontology, nodes are divided into leaf nodes and ancestor nodes. The medical ontology follows a tree structure in its naming and coding: all leaf nodes are elements of the diagnosis code set of S11, and the ancestor nodes of a leaf node represent the concepts from which the concept represented by the leaf node is derived; every node $c$ is assigned a basic embedding vector $e$, and the final representation of a leaf node is a convex combination of the basic embeddings of itself and its ancestor nodes:
$$g_i = \sum_{j \in A(i)} \alpha_{ij} e_j$$
where $g_i$ is the embedded representation of medical code $c_i$ (i.e. a leaf node), $A(i)$ is the set of indices of $c_i$ and the ancestor nodes of $c_i$, and $\alpha_{ij}$ is the local attention weight, computed by the Softmax function as follows:
$$\alpha_{ij} = \frac{\exp(f(e_i, e_j))}{\sum_{k \in A(i)} \exp(f(e_i, e_k))}$$
$f(e_i, e_j)$ is a scalar value representing the compatibility between the two basic embeddings $e_i$ and $e_j$; it is obtained by a multilayer perceptron (MLP). The basic embeddings are trained with GloVe, which learns the code representations from the global co-occurrence matrix of all nodes $c$;
the final representations $g_1, g_2, \ldots, g_N$ of all medical codes are concatenated to obtain the embedding matrix $G$, and the diagnosis vector $v_t$ is obtained by multiplying the vector $x_t$ by the embedding matrix $G$ and applying the nonlinear $\tanh(\cdot)$ function:
$$v_1, v_2, \ldots, v_T = \tanh(G[x_1, x_2, \ldots, x_T])$$
The patient's personalized data $L = [l_1, l_2, \ldots, l_T]$ are then used to add a global attention weight $\beta_t$, giving the global representation $u_t$ that contains the patient's personalized data information:
$$u_t = \beta_t v_t, \quad t = 1, 2, \ldots, T$$
$\beta_t$ is computed by the following Softmax function:
$$\beta_t = \frac{\exp(f(l_t, l^*))}{\sum_{i=1}^{T} \exp(f(l_i, l^*))}$$
$f(l_i, l^*)$ is a scalar value representing the compatibility between $l_i$ and $l^*$; it is obtained by a multilayer perceptron;
$u_1, u_2, \ldots, u_T$ are input into a GRU network to obtain the hidden state representations $h_1, h_2, \ldots, h_T$, and a Softmax layer generates the prediction $\hat{y}_t$ for the $t$-th visit, defined as:
$$h_1, h_2, \ldots, h_T = \mathrm{GRU}(u_1, u_2, \ldots, u_T; \theta_r)$$
$$\hat{y}_t = \mathrm{Softmax}(W h_t + b)$$
$\theta_r$ denotes the hyperparameters of the GRU network, and $W$ and $b$ are the weights and biases to be learned;
the loss is computed from the true diagnosis information $y_t$ and the prediction $\hat{y}_t$ as follows:
$$\mathcal{L} = -\frac{1}{T-1} \sum_{t=1}^{T-1} \left( y_t^{\top} \log \hat{y}_t + (1 - y_t)^{\top} \log(1 - \hat{y}_t) \right)$$
The superscript $\top$ denotes the transpose. The loss, i.e. the error between the prediction and the true value, is back-propagated, and the parameters are learned and corrected until the GRAM+ model converges.
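The training loss can be illustrated with a short sketch. The loss formula in the source is an image, so the sketch assumes the binary cross-entropy used by the original GRAM model, averaged over the prediction/target pairs that are passed in; the function names and the small `eps` guard against log(0) are illustrative choices.

```python
import math

def visit_loss(y_true, y_pred, eps=1e-12):
    """Cross-entropy for one visit: -(y^T log p + (1 - y)^T log(1 - p))."""
    return -sum(y * math.log(p + eps) + (1 - y) * math.log(1 - p + eps)
                for y, p in zip(y_true, y_pred))

def sequence_loss(Y_true, Y_pred):
    """Average the per-visit loss over aligned (target, prediction) pairs."""
    return sum(visit_loss(y, p) for y, p in zip(Y_true, Y_pred)) / len(Y_true)

# Two codes, two prediction steps (toy values).
Y_true = [[1, 0], [0, 1]]
Y_pred = [[0.5, 0.5], [0.5, 0.5]]
loss = sequence_loss(Y_true, Y_pred)
```

With every predicted probability at 0.5, each code contributes ln 2 to the loss, so this toy case gives 2 ln 2 per visit. Confident wrong predictions are penalised much more strongly, which is what drives the parameter corrections during back-propagation.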
Dipole+ uses the patient's personalized data $L = [l_1, l_2, \ldots, l_T]$ as guidance and combines a bidirectional recurrent neural network with an attention mechanism to predict the patient's visit information. First, the visit information $X$ is embedded into representation vectors $v_t$ by a multilayer perceptron; the patient's personalized data $L = [l_1, l_2, \ldots, l_T]$ are then used to add a global attention weight $\beta_t$, giving the global representation $u_t$ that contains the patient's personalized data information:
$$u_t = \beta_t v_t, \quad t = 1, 2, \ldots, T$$
$\beta_t$ is computed by the following Softmax function:
$$\beta_t = \frac{\exp(f(l_t, l^*))}{\sum_{i=1}^{T} \exp(f(l_i, l^*))}$$
$f(l_i, l^*)$ is a scalar value;
the vectors $u_t$ are then input into a bidirectional recurrent neural network; finally, the outputs of the two directions are concatenated, and an attention mechanism based on position in the data sequence generates the latent vector used for prediction.
RNN+ is based on an RNN and uses the patient's personalized data $L = [l_1, l_2, \ldots, l_T]$ to guide the patient's visit information $X$ so as to generate the global representation vectors $u_t$ that contain the patient's personalized data information. The global representation vectors $u_t$ are computed in the same way as in Dipole+; they are fed into an attention model based on position in the data sequence, and a unidirectional GRU then makes the prediction.
An adaptive weighted integration strategy is adopted in the model fusion stage. As shown in FIG. 3, in the fusion stage the distance of each sample $X_i = [x_1, x_2, \ldots, x_T]$ to each cluster center is computed, $d_i = [\delta_{i1}, \delta_{i2}, \delta_{i3}]$. The distance measures the degree to which a sample belongs to a cluster: the closer a sample lies to the center of a basic classifier's pre-training subset, the better that classifier is adapted to the sample. The distance therefore indirectly measures how well the classifier trained on that cluster can predict the new sample $X_i$. For each sample, the integration weight $w_i$ is generated from $d_i$ by a formula given as image BDA0003572143720000091.
The final integrated output is given by a formula given as image BDA0003572143720000092, which combines the outputs of the three basic classifiers (symbols given as image BDA0003572143720000093). The prediction output (symbol given as image BDA0003572143720000094) represents, for each index, the probability that the corresponding medical code occurs in a future visit, i.e. the risk of the various adverse events that the patient may experience in the future, and thereby helps the doctor make more reasonable decisions about the patient's medical care.
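The fusion stage can be sketched as follows. The weight formula is an image in the source and is not recoverable here, so the softmax over negative distances below is a hypothetical stand-in chosen only because it gives larger weights to closer cluster centers; the weighted sum of classifier outputs is likewise an assumed reading of "adaptive weighted integration". The function names and numbers are illustrative.

```python
import math

def integration_weights(d_i):
    """Hypothetical weighting: softmax over negative distances (closer -> larger)."""
    m = min(d_i)
    exps = [math.exp(-(d - m)) for d in d_i]
    total = sum(exps)
    return [e / total for e in exps]

def fuse(outputs, weights):
    """Weighted sum of the basic classifiers' probability vectors (assumed form)."""
    n = len(outputs[0])
    return [sum(w * out[c] for w, out in zip(weights, outputs)) for c in range(n)]

# Distances of one sample to the three cluster centers (toy values).
d_i = [0.0, 5.0, 5.0]
w_i = integration_weights(d_i)

# Outputs of GRAM+, Dipole+, and RNN+ for two codes (toy values).
outputs = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
fused = fuse(outputs, [0.5, 0.25, 0.25])
```

Any monotonically decreasing function of distance that is normalised to sum to one (for example inverse distance) would express the same idea; which one the patent actually uses cannot be determined from the text.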
It will be appreciated by those of ordinary skill in the art that the embodiments described herein are intended to assist the reader in understanding the principles of the invention and are to be construed as being without limitation to such specifically recited embodiments and examples. Those skilled in the art can make various other specific changes and combinations based on the teachings of the present invention without departing from the spirit of the invention, and these changes and combinations are within the scope of the invention.

Claims (5)

1. A method for predicting risk of an adverse event based on an electronic health record of a patient, comprising the steps of:
S1, data preprocessing: in the electronic health record data, the data of each patient is treated as a time-ordered sequence of visits, and the sequence is processed as follows:
S11, let $C = \{c_1, c_2, \ldots, c_N\}$ denote the set of all diagnosis codes, where $c_i$ is the i-th diagnosis code, $1 \le i \le N$, and N is the total number of diagnosis codes; let $X = [x_1, x_2, \ldots, x_T]$ denote the visit information of a patient, where the t-th visit $x_t \in \{0,1\}^N$, $\{0,1\}^N$ being a vector of N elements each equal to 0 or 1, i.e. $x_t = \{x_{t1}, x_{t2}, \ldots, x_{ti}, \ldots, x_{tN}\}$; if the diagnosis code $c_i \in \{c_1, c_2, \ldots, c_N\}$ with index i appears in the t-th visit, then $x_{ti} = 1$, otherwise $x_{ti} = 0$;
S12, let $L = [l_1, l_2, \ldots, l_T]$ denote the personalized data of all visits of a patient, where $l_i$ is the vector representation of the personalized data record of the i-th visit; averaging over the T visits of each patient gives $l_*$, the mean of each kind of data across the visits; for numerical data, missing values are filled with the mean; for non-numerical data, missing values are filled with the value that occurs most frequently in that patient's data, following the statistical mode principle;
S13, summing each diagnosis code over X gives $s_{*i}$, the frequency of each unique diagnosis code in all the visit information of a patient, i.e.
Figure FDA0003572143710000011
summing all $s_{*i}$ gives $S_*$, the frequency of each unique diagnosis code in all the data; let $P_* = s_*/S_*$ denote the proportion of the frequency of each diagnosis code in a patient's data relative to all the data;
after this processing, the data of the j-th patient consists of three parts $X_j$, $L_j$, $F_j$, $1 \le j \le M$, where M is the number of patients whose data were collected; $F_j = [l_{*j}, P_{*j}]$, where $l_{*j}$ is the mean $l_*$ of the same kind of data across the visits of the j-th patient and $P_{*j}$ is the proportion of the frequency of each diagnosis code in the j-th patient's data relative to all the data;
S2, K-means clustering and sampling: K-means clustering is performed with the data $F_j = [l_{*j}, P_{*j}]$ of each patient as the sample points, dividing the data into 3 clusters and giving 3 cluster centers $\theta_1, \theta_2, \theta_3$; the distance between each patient's $F_j$ and each cluster center is then calculated, and, for each cluster center, at the same sampling rate
Figure FDA0003572143710000012
the corresponding sub-dataset is selected from the data of all patients in order of increasing distance, giving $D' = D_1' \cup D_2' \cup D_3'$ and thereby generating multiple sub-datasets for training the basic classifiers;
S3, the 3 cluster centers $\theta_1, \theta_2, \theta_3$ are sorted in ascending order of their maximum value in $P_*$, and the corresponding subsets are taken as the uncommon-code subset, the relatively common-code subset and the common-code subset respectively; the three subsets are then input to the three basic classifiers GRAM+, Dipole+ and RNN+, respectively, for pre-training, after which model fusion is performed on the three basic classifiers.
2. The method as claimed in claim 1, wherein GRAM+ builds on GRAM by adding a global attention mechanism guided by the patient's personalized data, and is designed as follows:
in the knowledge directed acyclic graph formed by the medical ontology, each leaf node is an element of the diagnosis code set of S11, and an ancestor node of a leaf node represents a concept from which the concept represented by the leaf node is derived; every node c is assigned a basic embedding vector e, and the final representation of a leaf node is expressed as a convex combination of the basic embeddings of itself and its ancestor nodes:
Figure FDA0003572143710000021
where $g_i$ is the final representation of medical code $c_i$, $A(i)$ is the set of indices of code $c_i$ and the ancestor nodes of $c_i$, and $\alpha_{ij}$ is the local attention weight, calculated by the following Softmax function:
Figure FDA0003572143710000022
$f(e_i, e_j)$ is a scalar value representing the compatibility between the two basic embeddings $e_i$ and $e_j$, obtained by a multilayer perceptron;
the final representations $g_1, g_2, \ldots, g_N$ of all medical codes are concatenated to obtain the embedding matrix G, and the diagnosis vector $v_t$ is obtained by multiplying the vector $x_t$ by the embedding matrix G and applying the nonlinear hyperbolic tangent activation function $\tanh(\cdot)$:
$v_1, v_2, \ldots, v_T = \tanh(G[x_1, x_2, \ldots, x_T])$
the patient's personalized data $L = [l_1, l_2, \ldots, l_T]$ is then used to add a global attention weight $\beta_t$, giving a global representation $u_t$ containing the patient's personalized data information:
$u_t = \beta_t v_t, \quad t = 1, 2, \ldots, T$
$\beta_t$ is calculated by the following Softmax function:
Figure FDA0003572143710000023
$f(l_i, l_*)$ is a scalar value representing the compatibility between $l_i$ and $l_*$, obtained by a multilayer perceptron;
$u_1, u_2, \ldots, u_T$ are input to the GRU network to obtain the hidden state representations $h_1, h_2, \ldots, h_T$, and the prediction information
Figure FDA0003572143710000024
is generated by the Softmax layer, defined as:
$h_1, h_2, \ldots, h_T = \mathrm{GRU}(u_1, u_2, \ldots, u_T; \theta_r)$
Figure FDA0003572143710000025
where $\theta_r$ denotes the hyperparameters of the GRU network, and
Figure FDA0003572143710000031
and
Figure FDA0003572143710000032
are the weights and biases to be learned;
the loss is calculated from the true diagnosis information $y_t$ and the prediction information
Figure FDA0003572143710000033
as follows:
Figure FDA0003572143710000034
where the superscript
Figure FDA0003572143710000036
denotes transposition; the loss, which measures the error between the prediction and the ground truth, is back-propagated to learn and correct the parameters until the GRAM+ model converges.
3. The method as claimed in claim 1, wherein Dipole+ uses the patient's personalized data $L = [l_1, l_2, \ldots, l_T]$ as guidance and uses a bidirectional recurrent neural network together with an attention mechanism to make predictions from the patient's information; first, the visit information X is embedded into representation vectors $v_t$ by a multilayer perceptron; the patient's personalized data $L = [l_1, l_2, \ldots, l_T]$ is then used to add a global attention weight $\beta_t$, giving a global representation $u_t$ containing the patient's personalized data information:
$u_t = \beta_t v_t, \quad t = 1, 2, \ldots, T$
$\beta_t$ is calculated by the following Softmax function:
Figure FDA0003572143710000035
$f(l_i, l_*)$ is a scalar value;
the vector $u_t$ is then input to a bidirectional recurrent neural network; finally, the outputs of the two directions are concatenated, and an attention mechanism based on data sequence position is used to generate a latent vector for prediction.
4. The method of claim 3, wherein RNN+ builds on an RNN and uses the patient's personalized data $L = [l_1, l_2, \ldots, l_T]$ to guide the patient visit information representation vector X, generating a global representation vector $u_t$ containing the patient's personalized data information; the global representation vector $u_t$ is computed in the same way as in the Dipole+ method; the global representation vector $u_t$ is fed into an attention model based on data sequence position, and predictions are then made using a unidirectional GRU.
5. The method for predicting the risk of adverse events based on a patient's electronic health record as claimed in claim 1, wherein an adaptive weighted integration strategy is adopted in the model fusion stage: for each sample $X_i = [x_1, x_2, \ldots, x_T]$, its distances to the cluster centers, $d_i = [\delta_{i1}, \delta_{i2}, \delta_{i3}]$, are calculated;
for each sample, the integration weight $w_i$ is generated using the following formula:
Figure FDA0003572143710000041
The final integrated output result is expressed as:
Figure FDA0003572143710000042
where
Figure FDA0003572143710000043
represents the outputs of the three basic classifiers.
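Steps S11 to S13 and the distance-ordered sampling of S2 in claim 1 can be illustrated with a short Python sketch. It is not the claimed implementation: the diagnosis codes, the helper names, the `rate` parameter and the use of `None` for a missing value are all assumptions made for the example. Only the numerical case of S12 is shown, $S_*$ is read as the per-code total over all patients, and the cluster center is taken as given rather than computed by K-means.

```python
import math


def multi_hot(visits, codes):
    """S11: encode each visit as a {0,1}^N vector over the code set C."""
    index = {c: i for i, c in enumerate(codes)}
    X = []
    for visit in visits:
        x_t = [0] * len(codes)
        for c in visit:
            x_t[index[c]] = 1
        X.append(x_t)
    return X


def mean_and_fill(L):
    """S12 (numerical case): per-feature mean l_* over the visits;
    missing entries (None, an assumed marker) are filled with that mean."""
    n_feat = len(L[0])
    l_star = []
    for k in range(n_feat):
        seen = [l[k] for l in L if l[k] is not None]
        l_star.append(sum(seen) / len(seen) if seen else 0.0)
    filled = [[l_star[k] if l[k] is None else l[k] for k in range(n_feat)]
              for l in L]
    return l_star, filled


def code_frequency(X):
    """S13: s_{*i}, the number of visits in which code i appears."""
    return [sum(col) for col in zip(*X)]


def frequency_proportion(s_patient, S_all):
    """S13: P_* = s_* / S_*, with 0.0 for codes that never occur."""
    return [s / S if S else 0.0 for s, S in zip(s_patient, S_all)]


def nearest_subset(features, center, rate):
    """S2: indices of the patients closest to one cluster center,
    in ascending order of distance; `rate` is a hypothetical sampling rate."""
    order = sorted(range(len(features)),
                   key=lambda j: math.dist(features[j], center))
    k = max(1, math.ceil(rate * len(features)))
    return order[:k]


# Illustrative data only: two patients, three diagnosis codes
codes = ["I10", "E11", "N18"]
X_a = multi_hot([["I10"], ["I10", "E11"]], codes)
X_b = multi_hot([["N18"], ["I10", "N18"], ["N18"]], codes)
s_a, s_b = code_frequency(X_a), code_frequency(X_b)
S_all = [a + b for a, b in zip(s_a, s_b)]      # per-code total over all patients
P_a = frequency_proportion(s_a, S_all)
l_star, filled = mean_and_fill([[36.5, None], [37.5, 120.0]])
subset = nearest_subset([[0.0, 0.0], [1.0, 1.0], [5.0, 5.0], [0.2, 0.1]],
                        [0.0, 0.0], 0.5)
```

In a full pipeline the feature vector $F_j$ would be the concatenation of `l_star` and the proportion vector, and `nearest_subset` would be called once per cluster center to build the three training subsets.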

Priority Applications (2)

Application Number | Publication | Priority Date | Filing Date | Title
CN202210322129.9A | CN114694841B (en) | 2022-03-30 | 2022-03-30 | Adverse event risk prediction method based on patient electronic health record
ZA2022/08574A | ZA202208574B (en) | 2022-03-30 | 2022-08-01 | An adverse event risk prediction method based on patients' electronic health records

Publications (2)

Publication Number | Publication Date
CN114694841A (en) | 2022-07-01
CN114694841B (en) | 2023-04-07

Family

ID=82140727

Country Status (2)

Country | Link
CN (1) | CN114694841B (en)
ZA (1) | ZA202208574B (en)

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
CN106778014A (en) * | 2016-12-29 | 2017-05-31 | 浙江大学 | A risk prediction method based on recurrent neural networks
EP3622882A1 (en) * | 2018-09-14 | 2020-03-18 | Fundació Institut de Ciències Fotòniques | System and computer-implemented method for detecting and categorizing pathologies through an analysis of pulsatile blood flow
CN114175173A (en) * | 2019-07-24 | 2022-03-11 | 杨森制药公司 | Learning platform for patient history mapping
CN113871021A (en) * | 2021-09-29 | 2021-12-31 | 曲阜师范大学 | Graph and attention mechanism-based method for predicting circRNA-disease associations

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
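Title
Yun Zhang et al.; A Core Drug Discovery Framework from Large-Scale Literature for Cold Pathogenic Disease Treatment in Traditional Chinese Medicine; Journal of Healthcare Engineering; vol. 2021; pp. 1-18 *
时培莹; Research on prediction methods based on recurrent neural networks and their applications [基于循环神经网络的预测方法及其应用研究]; China Master's Theses Full-text Database [中国优秀硕士学位论文全文数据库]; no. 8; pp. E080-54 *
张帅; Research on medical risk prediction based on temporal graphs [基于时序图的医疗风险预测研究]; China Master's Theses Full-text Database [中国优秀硕士学位论文全文数据库]; no. 9; pp. E054-2 *
李家辉 et al.; Analysis of disease risk factors based on a feature ranking and feature combination algorithm [基于特征排序特征联合算法的疾病危险因素分析]; Application Research of Computers [计算机应用研究]; vol. 38, no. 9; pp. 2757-2761 *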

Also Published As

Publication number | Publication date
CN114694841A (en) | 2022-07-01
ZA202208574B (en) | 2022-11-30

Legal Events

Code | Title
PB01 | Publication
SE01 | Entry into force of request for substantive examination
GR01 | Patent grant