CN111028913A - Hemodialysis treatment scheme aid decision-making method - Google Patents

Hemodialysis treatment scheme aid decision-making method

Info

Publication number
CN111028913A
CN111028913A (Application No. CN201911197143.5A)
Authority
CN
China
Prior art keywords
model
time
loss
medical record
task
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201911197143.5A
Other languages
Chinese (zh)
Inventor
王冠
宋胜男
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing University of Technology
Original Assignee
Beijing University of Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing University of Technology
Priority to CN201911197143.5A
Publication of CN111028913A
Legal status: Pending (current)

Classifications

    • G - PHYSICS
    • G16 - INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H - HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H 20/00 - ICT specially adapted for therapies or health-improving plans, e.g. for handling prescriptions, for steering therapy or for monitoring patient compliance
    • G16H 20/40 - ICT specially adapted for therapies or health-improving plans relating to mechanical, radiation or invasive therapies, e.g. surgery, laser therapy, dialysis or acupuncture
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 - Computing arrangements based on biological models
    • G06N 3/02 - Neural networks
    • G06N 3/04 - Architecture, e.g. interconnection topology
    • G06N 3/045 - Combinations of networks
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 - Computing arrangements based on biological models
    • G06N 3/02 - Neural networks
    • G06N 3/08 - Learning methods
    • G06N 3/084 - Backpropagation, e.g. using gradient descent
    • G - PHYSICS
    • G16 - INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H - HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H 50/00 - ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
    • G16H 50/70 - ICT specially adapted for medical diagnosis, medical simulation or medical data mining for mining of medical data, e.g. analysing previous cases of other patients

Landscapes

  • Engineering & Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Public Health (AREA)
  • Medical Informatics (AREA)
  • Biomedical Technology (AREA)
  • Computational Linguistics (AREA)
  • Epidemiology (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Biophysics (AREA)
  • Primary Health Care (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Computation (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Pathology (AREA)
  • Databases & Information Systems (AREA)
  • Nuclear Medicine, Radiotherapy & Molecular Imaging (AREA)
  • Surgery (AREA)
  • Urology & Nephrology (AREA)
  • External Artificial Organs (AREA)

Abstract

The invention discloses a hemodialysis treatment scheme assistant decision method based on an Attention-LSTM model, which comprises the following steps: 1, preprocessing hemodialysis historical medical record data labeled by professional dialysis doctors to obtain a total sample set; 2, establishing a long short-term memory network based on a self-attention mechanism, and assigning an attention weight α_t to each time-series medical record input into the model, thereby calculating a global feature vector c of the medical record information; 3, inputting the global feature vector c into a multi-task sharing layer to learn the scheme labels; 4, assigning a weight λ_n to the loss of each learning task, weighting and summing the losses to obtain the total loss, and updating the network parameters of the model with the Adam optimization algorithm to obtain the final hemodialysis treatment scheme assistant decision model. The model can assist dialysis clinicians in formulating individualized hemodialysis treatment schemes for patients, reduce the manual workload of clinicians, and improve the hemodialysis efficiency of a hemodialysis center.

Description

Hemodialysis treatment scheme aid decision-making method
Technical Field
The invention relates to the field of intelligent analysis, in particular to an Attention-LSTM-based hemodialysis treatment scheme assistant decision-making method.
Background
Hemodialysis is one of the renal replacement therapies for patients with acute or chronic renal failure. Blood is slowly led out of the body through a vascular access into a hemodialysis machine, where it exchanges substances with the dialysate by diffusion, ultrafiltration, adsorption and convection; metabolic wastes are removed, electrolyte and acid-base balance is maintained, and the purified blood is returned to the body. Traditionally, hemodialysis physicians formulate individualized hemodialysis schemes for patients based on their own clinical experience. The physician reviews the patient's age, weight, renal function, blood routine, electrolytes, cardiac function and other indexes to make an individualized hemodialysis scheme, which includes dialysis frequency, single dialysis time, dialysate concentration, anticoagulation mode and dosage, and so on. Because the treatment cycle of a hemodialysis patient is generally long, a series of historical dialysis medical records accumulates as treatment proceeds, and the clinician often adjusts the dialysis scheme according to these historical records, which entails a heavy manual workload. With the improvement of electronic medical record information systems, the health care field has gradually entered the information era. The large volume of electronic medical record data provides a solid data base for the development of "AI + medicine" and great potential for improving the medical industry. The "AI + medicine" mode has also been applied to the field of hemodialysis. Patent CN 109686446 A, filed by Jiangsu Kogyo health science and technology Limited, discloses a method for predicting a hemodialysis protocol. The method preprocesses training samples from a hemodialysis center, screens important features with Lasso sparse constraints and a Random Forest respectively, and merges the two groups of screened training samples into a final training sample. The training samples are then loaded into a multiple linear regression model, which is iterated to obtain a dialysis scheme prediction model. That method does not consider the time-series nature of hemodialysis medical record data, and its prediction model is a simple multiple linear regression, which may fit complex relationships poorly.
Disclosure of Invention
In order to solve the above problems and reduce the manual workload of dialysis physicians in formulating dialysis schemes, the present invention provides an Attention-LSTM-based hemodialysis treatment scheme decision-making assisting method. Based on a deep learning algorithm, it can assist a dialysis physician in providing a personalized dialysis scheme for a patient, reduce the physician's heavy manual analysis work, and improve the hemodialysis efficiency of a hemodialysis center.
The technical scheme adopted by the invention is that the hemodialysis treatment scheme assistant decision method based on Attention-LSTM comprises the following steps:
step 1, preprocessing the hemodialysis history medical record data labeled by a professional dialysis doctor to obtain a total sample set;
step 2, establishing a long short-term memory (LSTM) network based on a self-attention mechanism, and assigning an attention weight α_t to each time-series medical record input into the model, thereby calculating a global feature vector c of the medical record information;
step 3, inputting the global feature vector c into a multitask sharing layer to learn scheme labels;
step 4, assigning a weight λ_n to the loss of each learning task, weighting and summing the losses to obtain the total loss, and updating the network parameters of the model with the Adam optimization algorithm to obtain the final hemodialysis treatment scheme assistant decision model.
The data preprocessing of step 1 comprises the following substeps:
step 1.1, sensitive fields related to personal information of a patient in hemodialysis historical data, such as the real name, the telephone, the home address and the like of the patient, are erased;
step 1.2, determining the feature data input into the model and the dialysis scheme label data output by the model. The feature data include the patient's age, weight, renal function, blood routine, electrolytes, cardiac function, etc., and the label data include the patient's dialysis frequency, single dialysis time, dialysate concentration, anticoagulation mode and dosage, etc. In particular, the time interval between each historical medical record (including the one corresponding to the new dialysis scheme) and the new dialysis scheme is also taken as an input feature;
step 1.3, filling up the missing model input characteristic data by using a mean filling method;
step 1.4, taking the patient ID as a unique mark for distinguishing patient data; sequencing the dialysis medical record data according to time to form serialized data by taking the ID of the patient as a basis;
step 1.5, randomly dividing the preprocessed sample set into a model training set, a validation set and a test set in the proportion 7:1:2.
Establishing a long short-term memory network based on a self-attention mechanism in step 2, and assigning an attention weight α_t to each time-series medical record input into the model, thereby calculating the global feature vector c of the medical record information, comprises the following substeps:
step 2.1, constructing a long short-term memory (LSTM) network. The training algorithm of the LSTM network is the back-propagation algorithm; when the error reaches the expected value, network learning ends. The LSTM network introduces gating units that give the network the ability to control the accumulation of internal information, so that it can capture long-range dependencies and selectively forget information during learning to prevent overload. The input gate i_t is used to update the cell state; the forget gate f_t decides which information should be discarded or retained; the output gate o_t determines the value of the next hidden state h_t; x_t is the network input at the current time step; h_(t-1) is the hidden state of the previous time step; c_t is the memory cell at the current time step; c_(t-1) is the memory cell of the previous time step; c̃_t is the candidate memory cell; W_xi, W_hi, W_xf, W_hf, W_xo, W_ho, W_xc and W_hc are weight matrices of the network model; b_i, b_f, b_o and b_c are bias terms; σ is the sigmoid activation function, with range 0 to 1; tanh is the hyperbolic tangent activation function. The LSTM formulas are as follows:
i_t = σ(W_xi x_t + W_hi h_(t-1) + b_i)    (1)
f_t = σ(W_xf x_t + W_hf h_(t-1) + b_f)    (2)
o_t = σ(W_xo x_t + W_ho h_(t-1) + b_o)    (3)
c̃_t = tanh(W_xc x_t + W_hc h_(t-1) + b_c)    (4)
c_t = f_t * c_(t-1) + i_t * c̃_t    (5)
h_t = o_t * tanh(c_t)    (6)
The historical medical record data of dialysis patients form a time series: records from longer ago have less influence on the new dialysis scheme, while more recent records have greater influence. The serialized dialysis medical record data are input into the LSTM network, which removes or adds historical information through its carefully designed gate structure and mines the patient's physiological change characteristics and future treatment trend.
step 2.2, adding a self-attention mechanism on top of the LSTM network. The attention mechanism extracts from all the information the parts that need the most attention, thereby suppressing other, useless information. The self-attention mechanism is a special case of the attention mechanism: it assigns a weight to the hidden-layer representation h_t of each medical record produced by the LSTM network from the patient's serialized medical record data, and integrates all h_t into the final representation of the patient's historical medical record information. The hidden-layer representation h_t of each record is multiplied by a parameter matrix W to obtain u_t; u_t is dot-multiplied with a trainable parameter matrix u representing the context information to obtain a similarity score; the scores of all time steps are normalized with softmax to obtain the corresponding attention weights α_t; finally, each h_t is multiplied by its corresponding weight α_t and the results are summed to obtain the global feature vector c representing the patient's historical medical record information. The formulas are as follows:
u_t = tanh(W h_t)    (7)
α_t = exp(score(u_t, u)) / Σ_k exp(score(u_k, u))    (8)
score(u_t, u) = u_t^T u    (9)
c = Σ_t α_t h_t    (10)
the step 3 of inputting the global feature vector c into the multitask sharing layer for the scheme label learning includes the following steps:
The labels contained in the patient's dialysis scheme are taken as the targets of multi-task learning, and the hard-parameter-sharing method of multi-task learning is applied to construct an independent learning structure for each learning task: a hidden layer is shared among all tasks while a task-specific layer is retained for each task. The global feature vector c learned by the Attention-LSTM network is input into the multi-task shared hidden layer. The scheme labels include dialysis frequency, single dialysis time, dialysate concentration, anticoagulation mode, dosage, etc.
Step 4, assigning a weight λ_n to the loss of each learning task and weighting and summing the losses to obtain the total loss, then updating the network parameters of the model with the Adam optimization algorithm to obtain the final hemodialysis treatment scheme assistant decision model, comprises the following substeps:
step 4.1, after preprocessing and normalization, the patient's time-series medical record data are input into the model for training. During multi-task training, each task adopts its own loss function Loss_n with a preset loss weight λ_n. The weight parameter λ_n determines the importance of each task in the total loss; a larger weight is set for an important task, which raises the model's learning sensitivity to that task. For a classification task, the loss function is the cross-entropy loss CE, where i denotes the i-th sample, y_ij indicates whether j is the true label of the i-th sample, and p_ij is the predicted probability that j is the label of the i-th sample. For a regression task, the loss function is the mean square error MSE, where y_i is the true value and ŷ_i is the predicted value. The total model Loss equals the weighted sum of the loss functions of all tasks. The formulas are as follows:
Loss_CE = -Σ_i Σ_j y_ij log(p_ij)    (11)
Loss_MSE = (1/N) Σ_i (y_i - ŷ_i)^2    (12)
Loss = λ_1 Loss_1 + λ_2 Loss_2 + … + λ_n Loss_n    (13)
step 4.2, updating the network parameters of the model with the Adam optimization algorithm. The model is updated iteratively and the total loss value keeps decreasing until the termination condition is met, at which point training stops. The model effect is verified with the validation set, the model parameters are adjusted according to the actual effect, and the test set is used for final verification.
The invention has the following advantages:
(1) The invention further mines the information in the patient's serialized medical record data with a long short-term memory network based on a self-attention mechanism, assigns an attention weight to each time-series medical record input into the model, and integrates the medical record information of every time step into a global feature vector for multi-task learning, which improves the final learning effect of the model. The attention mechanism stems from the study of human vision: humans selectively focus on a portion of all information while ignoring the rest. The attention mechanism mimics this behavior of the human brain when observing things, i.e., more attention is assigned to the more important parts. The self-attention mechanism is an improvement of the attention mechanism that reduces the dependence on external information and is better at capturing the internal correlation of the data. Because a dialysis patient's medical record data form a time series and records from different times have different reference value, the self-attention mechanism is used to mine the importance of the patient's medical record data at each time; information with a larger attention weight draws more of the model's attention, which helps improve the final learning effect of the model.
(2) The method uses a multi-task learning mechanism to train multiple scheme labels simultaneously, which reduces the risk of model overfitting and improves the generalization performance of the model. The hard-parameter-sharing method of multi-task learning constructs an independent learning structure for each learning task on top of a shared layer: a hidden layer is shared among all tasks while a task-specific layer is retained for each task, which reduces the risk of overfitting. When a particular task is being learned, the parts unrelated to that task act as noise in the learning process, and introducing such noise can improve the generalization of model learning. In single-task learning, the back-propagated gradient tends to fall into a local minimum, whereas in multi-task learning the local minima of different tasks lie in different positions, and their interaction can help the model escape from local minima, thereby improving model performance.
Drawings
FIG. 1 is a flow chart of an aided decision method for an Attention-LSTM-based hemodialysis treatment protocol according to the present invention.
FIG. 2 is a schematic diagram of the long short-term memory network of the Attention-LSTM-based hemodialysis treatment protocol aided decision method provided by the present invention.
FIG. 3 is a schematic diagram of the self-attention-based long short-term memory network of the Attention-LSTM-based hemodialysis treatment protocol aided decision method provided by the present invention.
FIG. 4 is a schematic diagram of a hard parameter sharing method in multitask learning of an Attention-LSTM-based hemodialysis treatment protocol aided decision method provided by the present invention.
Detailed Description
The present invention will be described in detail with reference to fig. 1 to 4, and the technical solutions in the embodiments of the present invention will be clearly and completely described below, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
Referring to fig. 1, the present invention herein provides an Attention-LSTM-based hemodialysis treatment protocol aided decision method, comprising the steps of:
step 1, preprocessing the hemodialysis history medical record data labeled by a professional dialysis doctor to obtain a total sample set;
step 2, establishing a long short-term memory (LSTM) network based on a self-attention mechanism, and assigning an attention weight α_t to each time-series medical record input into the model, thereby calculating a global feature vector c of the medical record information;
step 3, inputting the global feature vector c into a multitask sharing layer to learn scheme labels;
step 4, assigning a weight λ_n to the loss of each learning task, weighting and summing the losses to obtain the total loss, and updating the network parameters of the model with the Adam optimization algorithm to obtain the final hemodialysis treatment scheme assistant decision model.
The data preprocessing of step 1 comprises the following substeps:
step 1.1: removing sensitive fields related to personal information of the patient in hemodialysis historical data, such as the real name, telephone, home address and the like of the patient;
step 1.2: characteristic data input into the model and dialysis scheme label data output by the model are determined. The characteristic data includes age, weight, renal function, blood routine, electrolyte, cardiac function, etc. of the patient, and the model label data includes dialysis frequency, single dialysis time, dialysate concentration, anticoagulation mode, dosage, etc. of the patient. In particular, the time interval between each history medical record (including one medical record corresponding to the new dialysis scheme) and the new dialysis scheme of the patient is taken as the characteristic input;
step 1.3: filling up the missing model input characteristic data by using a mean filling method;
step 1.4, taking the patient ID as a unique mark for distinguishing patient data; sequencing the dialysis medical record data according to time to form serialized data by taking the ID of the patient as a basis;
step 1.5, randomly dividing the preprocessed sample set into a model training set, a validation set and a test set in the proportion 7:1:2 (a minimal preprocessing sketch is given below).
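By way of illustration only, the following Python sketch shows one possible implementation of substeps 1.1 to 1.5 under stated assumptions: the column names, the feature list and the helper function are hypothetical and not part of the patent; only the de-identification, mean imputation, per-patient time ordering and 7:1:2 split follow the steps above.

import numpy as np
import pandas as pd

SENSITIVE = ["name", "phone", "address"]                                 # step 1.1 (assumed column names)
FEATURES = ["age", "weight", "creatinine", "hemoglobin", "potassium"]   # step 1.2 (example features)

def build_samples(records: pd.DataFrame, seed: int = 42):
    df = records.drop(columns=SENSITIVE, errors="ignore")               # step 1.1: de-identification
    # step 1.3: mean imputation of missing feature values
    df[FEATURES] = df[FEATURES].fillna(df[FEATURES].mean())
    # step 1.4: order each patient's records by time to form sequences
    df = df.sort_values(["patient_id", "record_time"])
    sequences = [g[FEATURES].to_numpy(dtype=np.float32)
                 for _, g in df.groupby("patient_id")]
    # step 1.5: random 7:1:2 split into training / validation / test sets
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(sequences))
    n_train, n_val = int(0.7 * len(idx)), int(0.1 * len(idx))
    train = [sequences[i] for i in idx[:n_train]]
    val = [sequences[i] for i in idx[n_train:n_train + n_val]]
    test = [sequences[i] for i in idx[n_train + n_val:]]
    return train, val, test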
Step 2, establishing a long short-term memory network based on a self-attention mechanism and assigning an attention weight α_t to each time-series medical record input into the model, thereby calculating the global feature vector c of the medical record information, comprises the following substeps:
step 2.1, constructing a long short-term memory (LSTM) network with reference to fig. 2. The training algorithm of the LSTM network is the back-propagation algorithm; when the error reaches the expected value, network learning ends. The LSTM network introduces gating units that give the network the ability to control the accumulation of internal information, so that it can capture long-range dependencies and selectively forget information during learning to prevent overload. The input gate i_t is used to update the cell state; the forget gate f_t decides which information should be discarded or retained; the output gate o_t determines the value of the next hidden state h_t; x_t is the network input at the current time step; h_(t-1) is the hidden state of the previous time step; c_t is the memory cell at the current time step; c_(t-1) is the memory cell of the previous time step; c̃_t is the candidate memory cell; W_xi, W_hi, W_xf, W_hf, W_xo, W_ho, W_xc and W_hc are weight matrices of the network model; b_i, b_f, b_o and b_c are bias terms; σ is the sigmoid activation function, with range 0 to 1; tanh is the hyperbolic tangent activation function. The LSTM formulas are as follows:
i_t = σ(W_xi x_t + W_hi h_(t-1) + b_i)    (1)
f_t = σ(W_xf x_t + W_hf h_(t-1) + b_f)    (2)
o_t = σ(W_xo x_t + W_ho h_(t-1) + b_o)    (3)
c̃_t = tanh(W_xc x_t + W_hc h_(t-1) + b_c)    (4)
c_t = f_t * c_(t-1) + i_t * c̃_t    (5)
h_t = o_t * tanh(c_t)    (6)
The historical medical record data of dialysis patients form a time series: records from longer ago have less influence on the new dialysis scheme, while more recent records have greater influence. The serialized dialysis medical record data are input into the LSTM network, which removes or adds historical information through its carefully designed gate structure and mines the patient's physiological change characteristics and future treatment trend.
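As a non-limiting illustration, the following sketch implements equations (1) to (6) for a single patient sequence with PyTorch tensors; the hidden size, the random initialisation and the function names are assumptions introduced for the example rather than details from the patent.

import torch

def lstm_cell(x_t, h_prev, c_prev, W_x, W_h, b):
    """One LSTM step. W_x: (4*hidden, input_dim), W_h: (4*hidden, hidden), b: (4*hidden,).
    The four blocks of the affine map correspond to the input gate i, forget gate f,
    output gate o, and the candidate memory cell."""
    z = W_x @ x_t + W_h @ h_prev + b
    i, f, o, g = z.chunk(4)
    i, f, o = torch.sigmoid(i), torch.sigmoid(f), torch.sigmoid(o)  # eqs. (1)-(3)
    g = torch.tanh(g)                                               # eq. (4): candidate cell
    c_t = f * c_prev + i * g                                        # eq. (5): memory cell
    h_t = o * torch.tanh(c_t)                                       # eq. (6): hidden state
    return h_t, c_t

def run_lstm(sequence, hidden_size=64):
    """Feed one patient's time-ordered records (T, input_dim) through the cell."""
    x = torch.as_tensor(sequence, dtype=torch.float32)
    input_dim = x.shape[1]
    W_x = torch.randn(4 * hidden_size, input_dim) * 0.1
    W_h = torch.randn(4 * hidden_size, hidden_size) * 0.1
    b = torch.zeros(4 * hidden_size)
    h = torch.zeros(hidden_size)
    c = torch.zeros(hidden_size)
    hidden_states = []
    for x_t in x:                             # oldest record first, newest last
        h, c = lstm_cell(x_t, h, c, W_x, W_h, b)
        hidden_states.append(h)
    return torch.stack(hidden_states)         # (T, hidden_size): one h_t per record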
step 2.2, referring to fig. 3, adding a self-attention mechanism on top of the LSTM network. The attention mechanism extracts from all the information the parts that need the most attention, thereby suppressing other, useless information. The self-attention mechanism is a special case of the attention mechanism: it assigns a weight to the hidden-layer representation h_t of each medical record produced by the LSTM network from the patient's serialized medical record data, and integrates all h_t into the final representation of the patient's historical medical record information. The hidden-layer representation h_t of each record is multiplied by a parameter matrix W to obtain u_t; u_t is dot-multiplied with a trainable parameter matrix u representing the context information to obtain a similarity score; the scores of all time steps are normalized with softmax to obtain the corresponding attention weights α_t; finally, each h_t is multiplied by its corresponding weight α_t and the results are summed to obtain the global feature vector c representing the patient's historical medical record information. The formulas are as follows:
u_t = tanh(W h_t)    (7)
α_t = exp(score(u_t, u)) / Σ_k exp(score(u_k, u))    (8)
score(u_t, u) = u_t^T u    (9)
c = Σ_t α_t h_t    (10)
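Again as a non-limiting illustration, a minimal sketch of the attention pooling in equations (7) to (10) is given below; the dimensions and the way the parameters W and u are created are assumptions for the example, not values from the patent.

import torch

def attention_pool(H, W, u):
    """H: (T, hidden) hidden states h_t; W: (attn, hidden); u: (attn,) context vector."""
    U = torch.tanh(H @ W.T)                  # eq. (7): u_t = tanh(W h_t)
    scores = U @ u                           # eq. (9): score(u_t, u) = u_t^T u
    alpha = torch.softmax(scores, dim=0)     # eq. (8): attention weights alpha_t
    c = (alpha.unsqueeze(1) * H).sum(dim=0)  # eq. (10): c = sum_t alpha_t * h_t
    return c, alpha

# Example usage with the hidden states produced by the LSTM sketch above:
# H = run_lstm(sequence)
# W = torch.randn(64, H.shape[1]) * 0.1
# u = torch.randn(64) * 0.1
# c, alpha = attention_pool(H, W, u)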
step 3, inputting the global feature vector c into the multitask sharing layer for learning the scheme label, which comprises the following steps:
Referring to fig. 4, the hard-parameter-sharing method of multi-task learning constructs an independent learning structure for each learning task on top of a shared layer: a hidden layer is shared among all tasks while a task-specific layer is retained for each task. The labels contained in the patient's dialysis scheme are taken as the targets of multi-task learning, and the global feature vector c learned by the Attention-LSTM network is input into the multi-task shared hidden layer. The scheme labels include dialysis frequency, single dialysis time, dialysate concentration, anticoagulation mode, dosage, etc.
Multi-task learning is an inductive transfer mechanism whose purpose is to train several different tasks in parallel with a shared representation layer, reducing the risk of model overfitting and improving the generalization performance of the model. While learning, the multi-task learning method can capture the internal relations between different learning tasks and make full use of the data information fed into the model. Hard parameter sharing is the most common approach in multi-task learning with neural networks: it is applied to the hidden layers shared by all tasks, while the task-specific output layers are retained. Because different tasks have different noise patterns, learning several tasks simultaneously lets the model obtain a more general representation, thereby reducing the risk of overfitting. A minimal sketch of such a shared-layer network follows.
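In the sketch below, one shared hidden layer takes the global feature vector c and each scheme label keeps its own task-specific head; the particular tasks, class counts and layer sizes are illustrative assumptions, not values fixed by the patent.

import torch
import torch.nn as nn

class MultiTaskHead(nn.Module):
    def __init__(self, feature_dim=64, shared_dim=128):
        super().__init__()
        # shared hidden layer used by every task (hard parameter sharing)
        self.shared = nn.Sequential(nn.Linear(feature_dim, shared_dim), nn.ReLU())
        # task-specific layers: classification heads for categorical labels,
        # a regression head for a continuous label such as single dialysis time
        self.heads = nn.ModuleDict({
            "dialysis_frequency": nn.Linear(shared_dim, 4),    # classification (4 classes assumed)
            "anticoagulation_mode": nn.Linear(shared_dim, 3),  # classification (3 classes assumed)
            "single_dialysis_time": nn.Linear(shared_dim, 1),  # regression
        })

    def forward(self, c):
        z = self.shared(c)
        return {task: head(z) for task, head in self.heads.items()}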
Step 4, assigning a weight λ_n to the loss of each learning task and weighting and summing the losses to obtain the total loss, then updating the network parameters of the model with the Adam optimization algorithm to obtain the final hemodialysis treatment scheme assistant decision model, comprises the following substeps:
step 4.1, after preprocessing and normalization, the patient's time-series medical record data are input into the model for training. During multi-task training, each task adopts its own loss function Loss_n with a preset loss weight λ_n. The weight parameter λ_n determines the importance of each task in the total loss; a larger weight is set for an important task, which raises the model's learning sensitivity to that task. For a classification task, the loss function is the cross-entropy loss CE, where i denotes the i-th sample, y_ij indicates whether j is the true label of the i-th sample, and p_ij is the predicted probability that j is the label of the i-th sample. For a regression task, the loss function is the mean square error MSE, where y_i is the true value and ŷ_i is the predicted value. The total model Loss equals the weighted sum of the loss functions of all tasks. The formulas are as follows:
Loss_CE = -Σ_i Σ_j y_ij log(p_ij)    (11)
Loss_MSE = (1/N) Σ_i (y_i - ŷ_i)^2    (12)
Loss = λ_1 Loss_1 + λ_2 Loss_2 + … + λ_n Loss_n    (13)
step 4.2, updating the network parameters of the model with the Adam optimization algorithm. The model is updated iteratively and the total loss value keeps decreasing until the termination condition is met, at which point training stops. The model effect is verified with the validation set, the model parameters are adjusted according to the actual effect, and the test set is used for final verification.
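As a non-limiting illustration, the following sketch combines per-task losses with preset weights λ_n as in equation (13) and updates the parameters with Adam. For brevity it assumes the global feature vectors c have already been computed, whereas in the described method the LSTM and attention parameters are trained jointly; the task names, weights and hyperparameters are assumptions for the example.

import torch
import torch.nn as nn

task_weights = {"dialysis_frequency": 1.0, "anticoagulation_mode": 1.0,
                "single_dialysis_time": 0.5}                     # preset lambda_n (assumed values)
ce, mse = nn.CrossEntropyLoss(), nn.MSELoss()

def total_loss(outputs, targets):
    # weighted sum of per-task losses, eq. (13)
    loss = (task_weights["dialysis_frequency"] *
            ce(outputs["dialysis_frequency"], targets["dialysis_frequency"]))
    loss += (task_weights["anticoagulation_mode"] *
             ce(outputs["anticoagulation_mode"], targets["anticoagulation_mode"]))
    loss += (task_weights["single_dialysis_time"] *
             mse(outputs["single_dialysis_time"].squeeze(-1),
                 targets["single_dialysis_time"]))
    return loss

def train(model, batches, epochs=10, lr=1e-3):
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)      # step 4.2: Adam optimizer
    for _ in range(epochs):
        for features, targets in batches:     # features: batch of global vectors c
            optimizer.zero_grad()
            loss = total_loss(model(features), targets)
            loss.backward()                   # back-propagate the total weighted loss
            optimizer.step()
    return model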
The above-mentioned details are only one embodiment of the present invention, and do not limit the scope of the present invention, therefore, those skilled in the art can make various changes and modifications without departing from the spirit and scope of the present invention, and these changes and modifications should also be construed as the protection scope of the present invention. The protection scope of the present invention is defined by the claims, and all structural changes equivalent to the content of the description and drawings of the present invention should be considered to be included in the protection scope of the present invention.

Claims (5)

1. An Attention-LSTM-based hemodialysis treatment scheme decision-making assisting method is characterized in that: the method comprises the following steps:
step 1, preprocessing the hemodialysis history medical record data labeled by a professional dialysis doctor to obtain a total sample set;
step 2, establishing a long short-term memory network based on a self-attention mechanism, and assigning an attention weight α_t to each time-series medical record input into the model, thereby calculating a global feature vector c of the medical record information;
step 3, inputting the global feature vector c into a multitask sharing layer to learn scheme labels;
step 4, assigning a weight λ_n to the loss of each learning task, weighting and summing the losses to obtain the total loss, and updating the network parameters of the model with the Adam optimization algorithm to obtain the final hemodialysis treatment scheme assistant decision model.
2. An Attention-LSTM based hemodialysis treatment protocol aid decision method according to claim 1, wherein: the step 1 comprises the following substeps:
step 1.1, sensitive fields related to personal information of a patient, such as the real name, the telephone, the home address and the like of the patient, in the hemodialysis historical medical record data are erased;
step 1.2, determining characteristic data of an input model and dialysis scheme label data output by the model;
step 1.3, filling up the missing characteristic data by using a mean filling method;
step 1.4, taking the patient ID as a unique mark for distinguishing patient data; sequencing the dialysis medical record data according to time to form serialized data by taking the ID of the patient as a basis;
step 1.5, randomly dividing the preprocessed sample set into a model training set, a validation set and a test set in the proportion 7:1:2.
3. An Attention-LSTM based hemodialysis treatment protocol aid decision method according to claim 1, wherein: the step 2 comprises the following substeps:
step 2.1, constructing a long short-term memory (LSTM) network; the LSTM network introduces gating units that give the network the ability to control the accumulation of internal information, so that it can capture long-range dependencies and selectively forget information during learning to prevent overload; the input gate i_t is used to update the cell state; the forget gate f_t decides which information should be discarded or retained; the output gate o_t determines the value of the next hidden state h_t; x_t is the network input at the current time step; h_(t-1) is the hidden state of the previous time step; c_t is the memory cell at the current time step; c_(t-1) is the memory cell of the previous time step; c̃_t is the candidate memory cell; W_xi, W_hi, W_xf, W_hf, W_xo, W_ho, W_xc and W_hc are weight matrices of the network model; b_i, b_f, b_o and b_c are bias terms; σ is the sigmoid activation function, with range 0 to 1; tanh is the hyperbolic tangent activation function; the LSTM formulas are as follows:
i_t = σ(W_xi x_t + W_hi h_(t-1) + b_i)    (1)
f_t = σ(W_xf x_t + W_hf h_(t-1) + b_f)    (2)
o_t = σ(W_xo x_t + W_ho h_(t-1) + b_o)    (3)
c̃_t = tanh(W_xc x_t + W_hc h_(t-1) + b_c)    (4)
c_t = f_t * c_(t-1) + i_t * c̃_t    (5)
h_t = o_t * tanh(c_t)    (6)
step 2.2, adding a self-attention mechanism on top of the LSTM network; it assigns a weight to the hidden-layer representation h_t of each medical record produced by the LSTM network from the patient's serialized medical record data, and integrates all h_t into the final representation of the patient's historical medical record information; the hidden-layer representation h_t of each record is multiplied by a parameter matrix W to obtain u_t; u_t is dot-multiplied with a trainable parameter matrix u representing the context information to obtain a similarity score; the scores of all time steps are normalized with softmax to obtain the corresponding attention weights α_t; finally, each h_t is multiplied by its corresponding weight α_t and the results are summed to obtain the global feature vector c representing the patient's historical medical record information; the formulas are as follows:
u_t = tanh(W h_t)    (7)
α_t = exp(score(u_t, u)) / Σ_k exp(score(u_k, u))    (8)
score(u_t, u) = u_t^T u    (9)
c = Σ_t α_t h_t    (10)
4. an Attention-LSTM based hemodialysis treatment protocol aid decision method according to claim 1, wherein: the step 3 comprises the following steps:
The labels contained in the patient's dialysis scheme are taken as the targets of multi-task learning, and the hard-parameter-sharing method of multi-task learning is applied to construct an independent learning structure for each learning task on top of a shared layer: a hidden layer is shared among all tasks while a task-specific layer is retained for each task; the global feature vector c learned by the Attention-LSTM network is input into the multi-task shared hidden layer.
5. An Attention-LSTM based hemodialysis treatment protocol aid decision-making approach in accordance with claim 1, wherein: the step 4 comprises the following substeps:
step 4.1, after preprocessing and normalization, the patient's time-series medical record data are input into the model for training; during multi-task training, each task adopts its own loss function Loss_n with a preset loss weight λ_n; the weight parameter λ_n determines the importance of each task in the total loss, and a larger weight is set for an important task, raising the learning sensitivity to that task; for a classification task, the loss function is the cross-entropy loss CE, where i denotes the i-th sample, y_ij indicates whether j is the true label of the i-th sample, and p_ij is the predicted probability that j is the label of the i-th sample; for a regression task, the loss function is the mean square error MSE, where y_i is the true value and ŷ_i is the predicted value; the total model Loss equals the weighted sum of the loss functions of all tasks; the formulas are as follows:
Loss_CE = -Σ_i Σ_j y_ij log(p_ij)    (11)
Loss_MSE = (1/N) Σ_i (y_i - ŷ_i)^2    (12)
Loss = λ_1 Loss_1 + λ_2 Loss_2 + … + λ_n Loss_n    (13)
step 4.2, updating the network parameters of the model by using an Adam optimization algorithm; repeatedly updating the model in an iterative manner, continuously reducing the total loss value until the termination condition is met, and stopping training the model; and verifying the model effect by using the verification set, adjusting the model parameters according to the actual effect, and finally verifying by using the test set.
CN201911197143.5A 2019-11-29 2019-11-29 Hemodialysis treatment scheme aid decision-making method Pending CN111028913A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911197143.5A CN111028913A (en) 2019-11-29 2019-11-29 Hemodialysis treatment scheme aid decision-making method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201911197143.5A CN111028913A (en) 2019-11-29 2019-11-29 Hemodialysis treatment scheme aid decision-making method

Publications (1)

Publication Number Publication Date
CN111028913A true CN111028913A (en) 2020-04-17

Family

ID=70203275

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911197143.5A Pending CN111028913A (en) 2019-11-29 2019-11-29 Hemodialysis treatment scheme aid decision-making method

Country Status (1)

Country Link
CN (1) CN111028913A (en)

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2019025601A1 (en) * 2017-08-03 2019-02-07 Koninklijke Philips N.V. Hierarchical neural networks with granularized attention
CN109493933A (en) * 2018-08-08 2019-03-19 浙江大学 A kind of prediction meanss of the adverse cardiac events based on attention mechanism
CN110334843A (en) * 2019-04-22 2019-10-15 山东大学 A kind of time-varying attention improves be hospitalized medial demand prediction technique and the device of Bi-LSTM
CN110136828A (en) * 2019-05-16 2019-08-16 杭州健培科技有限公司 A method of medical image multitask auxiliary diagnosis is realized based on deep learning

Cited By (15)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111640512B (en) * 2020-06-02 2023-08-15 戴松世 Kidney substitution therapy starting strategy evaluation method and device and electronic equipment
CN111640512A (en) * 2020-06-02 2020-09-08 戴松世 Kidney replacement therapy starting strategy evaluation method and device and electronic equipment
CN111681767B (en) * 2020-06-12 2022-07-05 电子科技大学 Electronic medical record data processing method and system
CN111681767A (en) * 2020-06-12 2020-09-18 电子科技大学 Electronic medical record data processing method and system
CN112782762A (en) * 2021-01-29 2021-05-11 东北大学 Earthquake magnitude determination method based on deep learning
CN112957553A (en) * 2021-02-01 2021-06-15 南通市第二人民医院 Intelligent monitoring system and method for complications of hemodialysis patients
CN113488159A (en) * 2021-08-11 2021-10-08 中国医学科学院阜外医院 Medical department recommendation method and device based on neural network
CN113436733B (en) * 2021-08-26 2021-11-30 肾泰网健康科技(南京)有限公司 Characteristic construction method of hemodialysis scheme generation model based on fusion experience
CN113436733A (en) * 2021-08-26 2021-09-24 肾泰网健康科技(南京)有限公司 Characteristic construction method of hemodialysis scheme generation model based on fusion experience
CN114582463A (en) * 2022-05-05 2022-06-03 成都尚医信息科技有限公司 Personalized motion guidance system and method based on machine learning
CN114937486A (en) * 2022-06-22 2022-08-23 肾泰网健康科技(南京)有限公司 Construction method and application of IDH prediction and intervention measure recommendation multitask model
CN114937486B (en) * 2022-06-22 2023-09-26 肾泰网健康科技(南京)有限公司 IDH prediction and intervention measure recommendation multi-task model construction method and application
CN116313127A (en) * 2023-03-23 2023-06-23 珠海市安克电子技术有限公司 Decision support system based on pre-hospital first-aid big data
CN116453706A (en) * 2023-06-14 2023-07-18 之江实验室 Hemodialysis scheme making method and system based on reinforcement learning
CN116453706B (en) * 2023-06-14 2023-09-08 之江实验室 Hemodialysis scheme making method and system based on reinforcement learning

Similar Documents

Publication Publication Date Title
CN111028913A (en) Hemodialysis treatment scheme aid decision-making method
Zhang ATTAIN: Attention-based time-aware LSTM networks for disease progression modeling.
Ambekar et al. Disease risk prediction by using convolutional neural network
Barbiero et al. Graph representation forecasting of patient's medical conditions: Toward a digital twin
Akter et al. Comprehensive performance assessment of deep learning models in early prediction and risk identification of chronic kidney disease
Zohora et al. Forecasting the risk of type ii diabetes using reinforcement learning
CN111798954A (en) Drug combination recommendation method based on time attention mechanism and graph convolution network
EP1534122A2 (en) Medical decision support systems utilizing gene expression and clinical information and method for use
Pujianto et al. Comparison of Naïve Bayes Algorithm and Decision Tree C4. 5 for Hospital Readmission Diabetes Patients using HbA1c Measurement.
CN116453706B (en) Hemodialysis scheme making method and system based on reinforcement learning
Pan et al. A self-correcting deep learning approach to predict acute conditions in critical care
CN115579141A (en) Interpretable disease risk prediction model construction method and disease risk prediction device
Sheikhalishahi et al. Benchmarking machine learning models on eICU critical care dataset
Srivastava et al. Hybrid diabetes disease prediction framework based on data imputation and outlier detection techniques
Diware et al. Severity-based hierarchical ECG classification using neural networks
Zhang et al. Time-aware adversarial networks for adapting disease progression modeling
CN115985513B (en) Data processing method, device and equipment based on multiple groups of chemical cancer typing
CN117038096A (en) Chronic disease prediction method based on low-resource medical data and knowledge mining
CN113077901A (en) Electronic medical record analysis device and method
Liu et al. Combination of neuro-fuzzy network models with biological knowledge for reconstructing gene regulatory networks
Usha et al. Revived ant colony optimization-based adaboost algorithm for heart disease and diabetes (hdd) prediction
CN114694841B (en) Adverse event risk prediction method based on patient electronic health record
CN113780394B (en) Training method, device and equipment for strong classifier model
Miazi et al. A cloud-based app for early detection of type II diabetes with the aid of deep learning
Xiao et al. Predicting ICU mortality from heterogeneous clinical events with prior medical knowledge

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination