CN114023441A

CN114023441A - Severe AKI early risk assessment model and device based on interpretable machine learning model and development method thereof

Info

Publication number: CN114023441A
Application number: CN202111312687.9A
Authority: CN
Inventors: 刘晓莉; 张政波; 周飞虎; 刘超; 毛智
Original assignee: Chinese PLA General Hospital
Current assignee: Chinese PLA General Hospital
Priority date: 2021-11-08
Filing date: 2021-11-08
Publication date: 2022-02-08

Abstract

The application provides a severe AKI early risk assessment model based on an interpretable machine learning model, a device thereof, and a method for building it. In data set construction, patient selection, inclusion of research variables, and construction of positive and negative sample sets are completed according to a research protocol, based on a plurality of large electronic health record data sets. In data processing, data are extracted and preprocessed, and a statistical feature data set is constructed for developing the machine learning model. In model construction and evaluation, the model is constructed and trained on the statistical feature data set, and its performance is evaluated using 9 defined indexes and 3 validation modes to obtain a trained and calibrated prediction model. In model application, model migration, retraining and recalibration are performed for different application scenarios, starting from the trained prediction model.

Description

Severe AKI early risk assessment model and device based on interpretable machine learning model and development method thereof

Technical Field

The present application relates to machine learning, and more particularly, to a severe AKI early risk assessment model based on interpretable machine learning models, an apparatus and a method for developing the same.

Background

Acute kidney injury (AKI) is a clinical syndrome of great concern, with an incidence as high as 50% in the intensive care unit (ICU) and an associated high mortality. In 2013, an estimated 1.4 to 2.9 million patients with AKI were hospitalized in China, the total medical cost was about 13 billion dollars, and the in-hospital mortality was 12.4%. In addition, 16.1% of critically ill patients abandoned treatment and left the hospital, and about 65.3% of patients died within 3 months after leaving the hospital. AKI has become a huge medical burden in China, and its diagnosis and treatment are seriously inadequate. The rate of unrecognized AKI was high in both university and local hospitals, and diagnosis was delayed in 17.6% of the patients who were identified. Because kidney damage is often detected only at a late stage, when irreversible damage has already occurred, it can lead to severe sequelae that require temporary or long-term dialysis, and even to the death of the patient. An early detection method for kidney damage would win valuable time for effective clinical treatment.

The current clinical practice for high-risk patients is to measure the blood creatinine concentration every day; an elevated concentration indicates impaired renal function. The medical community lacks effective means to predict whether and when a patient will develop AKI. New approaches based on artificial intelligence (AI) can continuously monitor patient health data and predict impending renal injury in time. The 'Previse' automatic diagnosis algorithm developed by the company Dascena in 2017 can predict acute kidney injury a full day before a patient meets the clinical diagnostic criteria, giving clinicians sufficient time to intervene and prevent long-term injury, and it reaches 84% accuracy when predicting 48 hours in advance. In 2019, Google DeepMind published an early AKI prediction model in Nature: a deep learning prediction model with good performance, developed on data from about 720,000 inpatients in the United States.

Although AI has accelerated automated and early recognition of AKI, and Google has achieved the best predictive performance to date, most developed prediction models remain at the modeling stage and do not perform well in clinical practice, mainly because of imperfect methods, poor model discrimination, and potential bias. Four issues can be summarized. Reproducibility: a review reported an attempt to reproduce the results of 28 mortality prediction models built on Electronic Health Record (EHR) data; for half of the articles the data volume differed by about 25%, and model accuracy could not be guaranteed. To date, there is no formal guideline or operating specification that prediction research can follow. Transparency: given the differences in clinical practice and data sets across sites, standards and metrics (such as subgroup analysis and prediction uncertainty) are needed to evaluate model performance, to avoid potential bias and unfairness, and to define the boundaries of model use. Generalizability: the ultimate goal is to move from the narrow-sense generalizability of an algorithm to clinical utility in the setting where it is used. Rationality: sufficient interpretability and similar properties need to be worked out together with users such as medical personnel and patients.

AKI can be classified into 3 stages according to the clinically used KDIGO (Kidney Disease: Improving Global Outcomes) guideline criteria. A large number of studies focus on the entire course of AKI, especially its early stage. However, recent studies indicate that severe AKI (stage 2/3) is associated with a high risk of poor prognosis, whereas the occurrence of stage 1 has not been clearly associated with outcome and serves to support clinical judgment; a larger proportion of stage 1 cases are transient and may resolve with treatment. Raising alerts too early therefore easily causes alarm fatigue and interferes with the treatment of other urgent conditions. Most studies determine the AKI stage from creatinine alone, but recent studies have shown that considering creatinine without urine volume may seriously delay the determination of the AKI stage and onset time, leading to failure of model development. Among the prediction models developed so far, only a small part have been calibrated and subjected to subgroup analysis, and external validation is limited to a few hospitals or regions within the same country, so model performance may not be evaluated comprehensively and effectively, which introduces potential evaluation bias and unfairness. As far as we know, only a few models have been deployed in practical application scenarios, and most remain at the development stage. At present, no study clearly indicates how much local data is required for model migration or retraining when a model is actually used. The goal of developing models is to serve the clinic better and more conveniently, but there is little research on developing models further into platforms or software that medical personnel can understand, trust, and easily operate. These are the core problems and challenges that a model must solve in practical scenarios.

Disclosure of Invention

In view of the above, the present application aims to propose a severe AKI early risk assessment model based on an interpretable machine learning model.

The severe AKI early risk assessment model based on the interpretable machine learning model is a LightGBM model combined with the SHAP method; the model has a plurality of features, of which the first 20 features ranked by importance are:

the sum of urine volume/kg/h in the last 6 hours of the 12-hour observation window, the sum of urine volume/kg/h over the 12 hours, BMI, the mean arterial pressure at the current moment, age, the systolic pressure at the current moment, urea nitrogen/creatinine, the maximum change of inhaled oxygen concentration in the observation window, sex, the sum of urine volume/kg/h in the first 6 hours of the 12-hour observation window, the minimum body temperature in the observation window, the minimum pH in the observation window, whether diuretics are used at the current moment, body weight, the minimum eGFR in the observation window, the maximum change of arterial blood carbon dioxide partial pressure in the observation window, whether hypertension exists, the slope of change of mean arterial pressure in the observation window, and the maximum change of blood oxygen partial pressure in the observation window;

and the model dynamically evaluates, every hour, the risk that the patient to be predicted will develop severe AKI within the next 24 hours, according to input features that correspond to at least some of the plurality of features and are extracted from the health data of the patient to be predicted, and explains the reasoning process.

Preferably, the system further comprises a data processing module;

the health data of the patient to be predicted is processed by a data processing module to extract input features corresponding to at least some of the plurality of features.

On the other hand, the application also provides a severe AKI early risk assessment device based on the interpretable machine learning model, which comprises a computing unit, wherein the computing unit executes the severe AKI early risk assessment model based on the interpretable machine learning model, which is disclosed by any one of claims 1-2, and adopts the SHAP method fused with the model to obtain the ranking of risk factors of the patient to be predicted and the importance degree of each factor for the occurrence of severe AKI in dynamic assessment.

Preferably, the device automatically calls the needed personal information, laboratory examination information, vital sign information, treatment information and urine volume from the hospital information system in real time, and obtains the dynamic evaluation result of the future occurrence risk of the severe AKI per hour and the dynamic change process of important risk factors by using the severe AKI early risk evaluation model based on the interpretable machine learning model;

the device automatically calculates, in parallel, the risk of onset for all patients in the ward, and pushes the information of patients whose risk value exceeds the threshold set by the doctor to the responsible doctors and nurses, so that the patients' condition can be evaluated and the treatment plan adjusted as early as possible.
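As a non-limiting illustration, the ward-level calculation and threshold-based push described above can be sketched in Python as follows. The function names, the dictionary layout and the `predict_risk` callable are hypothetical placeholders for the trained model and the hospital information system interface; they are not part of the application.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List, Tuple


def score_ward(
    features_by_patient: Dict[str, dict],
    predict_risk: Callable[[dict], float],
    max_workers: int = 4,
) -> Dict[str, float]:
    """Score every patient in the ward in parallel.

    `predict_risk` stands in for the trained risk model (hypothetical interface).
    """
    patient_ids = list(features_by_patient)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map preserves input order, so risks line up with patient_ids.
        risks = list(pool.map(predict_risk, (features_by_patient[p] for p in patient_ids)))
    return dict(zip(patient_ids, risks))


def patients_to_alert(
    risk_by_patient: Dict[str, float], threshold: float
) -> List[Tuple[str, float]]:
    """Return (patient_id, risk) pairs above the doctor-set threshold, highest risk first."""
    flagged = [(pid, risk) for pid, risk in risk_by_patient.items() if risk > threshold]
    return sorted(flagged, key=lambda item: item[1], reverse=True)
```

In such a sketch the threshold remains a per-ward setting chosen by the doctor, and only the flagged list would be forwarded to the responsible doctors and nurses.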

In another aspect, the present application further provides a method for developing a severe AKI early risk assessment model based on an interpretable machine learning model, including: data set construction, data processing, model construction and evaluation and model application;

in the data set construction, on the basis of a plurality of large electronic health record data sets, the selection of people, the incorporation of research variables and the construction of positive and negative sample sets are completed according to a research scheme;

in data processing, data is extracted and preprocessed, and a statistical characteristic data set is constructed based on development of a machine learning model;

in the model construction and evaluation, the model is constructed and trained based on the statistical characteristic data set, and the performance of the model is further evaluated based on the formulated 9 indexes and 3 modes to obtain a trained and calibrated prediction model;

in model application, based on a trained prediction model, model migration, training and recalibration are performed for different application scenarios.

Preferably, the plurality of large electronic health profile data sets comprises:

BIDMC 2001-2016, BIDMC 2017-2019, eICU-CRD, Ams-UMC, and PLAGH.

Preferably, the principles for patient selection are as follows:

(1) including patients aged 18 to 90 years admitted to the Intensive Care Unit (ICU);

(2) excluding second and subsequent hospital admissions and ICU admissions;

(3) excluding patients with an ICU stay of less than 1 day or more than 120 days;

(4) excluding patients with fewer than two creatinine measurements during the ICU stay;

(5) excluding patients with a creatinine value higher than 4 mg/dl at admission;

(6) excluding patients with end-stage renal disease;

(7) excluding patients who required dialysis or renal transplant therapy within the first 6 hours of the ICU stay;

(8) excluding patients who developed stage 2 or 3 AKI within 12 hours of ICU admission.
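As a non-limiting illustration, the eight selection principles above can be sketched in Python as follows. The record type `IcuStay` and all of its field names are hypothetical and chosen only for readability; the application does not prescribe any particular data structure.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class IcuStay:
    """Hypothetical per-stay summary; field names are illustrative only."""
    age_years: float
    admission_order: int               # 1 = first hospital and ICU admission
    icu_days: float
    creatinine_count: int              # creatinine measurements during the ICU stay
    admission_creatinine_mg_dl: float
    has_esrd: bool                     # end-stage renal disease
    rrt_in_first_6h: bool              # dialysis or renal transplant therapy in first 6 ICU hours
    severe_aki_in_first_12h: bool      # stage 2 or 3 AKI within 12 hours of ICU admission


def exclusion_reasons(stay: IcuStay) -> List[str]:
    """Return every selection principle (1)-(8) that the stay violates."""
    reasons = []
    if not 18 <= stay.age_years <= 90:
        reasons.append("(1) age outside 18-90 years")
    if stay.admission_order > 1:
        reasons.append("(2) second or later admission")
    if stay.icu_days < 1 or stay.icu_days > 120:
        reasons.append("(3) ICU stay shorter than 1 day or longer than 120 days")
    if stay.creatinine_count < 2:
        reasons.append("(4) fewer than two creatinine measurements")
    if stay.admission_creatinine_mg_dl > 4.0:
        reasons.append("(5) admission creatinine above 4 mg/dl")
    if stay.has_esrd:
        reasons.append("(6) end-stage renal disease")
    if stay.rrt_in_first_6h:
        reasons.append("(7) dialysis or transplant therapy in first 6 hours")
    if stay.severe_aki_in_first_12h:
        reasons.append("(8) stage 2/3 AKI within 12 hours of ICU admission")
    return reasons


def is_eligible(stay: IcuStay) -> bool:
    """A stay is included only if no exclusion rule applies."""
    return not exclusion_reasons(stay)
```

Returning the list of violated rules, rather than a single boolean, makes it straightforward to report how many patients each criterion removes, as is usual in inclusion flow charts.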

Preferably, in the inclusion of the study variables, five categories of study variables are determined and classified into static features and dynamic features, as follows:

personal information, which is a static feature, includes: admission type, age, BMI index, race, type of intensive care unit admitted, sex, height, weight, whether or not there is a history of hypertension, whether or not there is a history of diabetes, whether or not there is a history of heart failure;

vital signs, which are dynamic features, include: heart rate, mean arterial pressure, respiratory rate, systolic pressure, shock index, blood oxygen saturation, body temperature;

laboratory examinations, which are dynamic features, include: alkaline phosphatase, whether alkaline phosphatase is measured, glutamic-pyruvic transaminase, whether glutamic-pyruvic transaminase is measured, glutamic-oxalacetic transaminase, whether glutamic-oxalacetic transaminase is measured, anion gap, base excess, bicarbonate, bilirubin, whether bilirubin is measured, the ratio of blood urea nitrogen to creatinine, blood urea nitrogen, blood calcium, chloride, creatinine, eGFR, the inspired oxygen concentration marker, blood glucose, hematocrit, hemoglobin, the international normalized ratio, lactate, whether lactate is measured, lymphocytes, whether lymphocytes are measured, magnesium, neutrophils, whether neutrophils are measured, the arterial blood carbon dioxide partial pressure, the oxygenation index, whether the oxygenation index is measured, blood oxygen partial pressure, pH, platelets, potassium, prothrombin time, thromboplastin time, sodium, white blood cell count;

treatment information, which is a dynamic feature, includes: whether dobutamine is used, whether dopamine is used, whether epinephrine is used, whether norepinephrine is used, whether mechanical ventilation is performed, whether diuretics are used.

Urine volume, a dynamic feature, which includes: the volume of urine discharged each time.

Preferably, the extracting and preprocessing of the data comprises:

data cleaning, namely naming the same variables of the large electronic health record data sets uniformly, converting them uniformly, removing abnormal values outside the physiological range, and extracting the data recorded during the ICU stay as the research data segment;

data sampling, wherein all dynamically acquired data take the ICU admission time as the starting point and are aligned in units of hours, and if several values fall in the same hour, the average value or the most recent value is taken;

data interpolation, wherein for a variable whose missing proportion is less than or equal to 30 percent, the population median is inserted; otherwise, a flag feature corresponding to the variable is added to indicate whether it was measured (0/1); and 21 percent is inserted for missing values of the inhaled oxygen concentration in time periods without mechanical ventilation;

feature construction, wherein personal information is kept as the original value; for laboratory examinations, the maximum value, the minimum value, the maximum change between consecutive values, the initial value and the last recorded value in the observation window are obtained; for vital signs, the maximum, minimum, median, average, standard deviation, maximum change between consecutive values, average energy, sum of changes, number of times above the average, number of times below the average, slope of the change trend, initial value and last recorded value in the observation window are obtained; for treatment information, the maximum value, the initial value, the last recorded value and the duration in the observation window are obtained; for urine volume, the sum over the observation window, the 12-hour total urine volume/kg/h, the 6-hour total urine volume/kg/h for the first 6 hours and for the last 6 hours of the 12-hour observation window, the initial value and the last recorded value are obtained.
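As a non-limiting illustration, the data sampling and data interpolation steps above can be sketched in Python as follows. All function names are hypothetical. The sketch averages several values in the same hour (the text also permits taking the most recent one), and it assumes that values of variables missing in more than 30 percent of cases are left empty next to their flag feature, which the application does not specify.

```python
from statistics import mean
from typing import Dict, List, Optional, Sequence, Tuple


def align_hourly(
    records: Sequence[Tuple[float, float]], n_hours: int
) -> List[Optional[float]]:
    """Align (hours since ICU admission, value) pairs to an hourly grid.

    Several values in the same hour are averaged; hours without data stay None.
    """
    buckets: Dict[int, List[float]] = {}
    for hours_since_admission, value in records:
        if 0 <= hours_since_admission < n_hours:
            buckets.setdefault(int(hours_since_admission), []).append(value)
    return [mean(buckets[h]) if h in buckets else None for h in range(n_hours)]


def impute(
    series: Sequence[Optional[float]],
    population_median: float,
    missing_rate: float,
    max_missing_rate: float = 0.30,
) -> Tuple[List[Optional[float]], Optional[List[int]]]:
    """Median-fill sparse gaps, or add a 0/1 'measured' flag for frequently missing variables."""
    if missing_rate <= max_missing_rate:
        return [population_median if v is None else v for v in series], None
    # Assumption: values are left missing and only the flag feature is added.
    flags = [0 if v is None else 1 for v in series]
    return list(series), flags


def impute_fio2(
    fio2: Sequence[Optional[float]], ventilated: Sequence[bool]
) -> List[Optional[float]]:
    """Insert 21 (percent, room air) where FiO2 is missing and the patient is not ventilated."""
    return [21.0 if v is None and not vent else v for v, vent in zip(fio2, ventilated)]
```

For example, two heart-rate readings of 80 and 90 recorded in the first hour would be merged into a single hourly value of 85 by `align_hourly`.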

Preferably, in the model construction and evaluation, a suitable machine learning model is adopted. Based on the positive and negative samples labeled in the data set construction and the statistical features constructed in the data processing, 80% of the BIDMC 2001-2016 data set is used for model training and for hyper-parameter tuning by Bayesian optimization, 10% is used for model calibration, and 10% is used for internal model verification. The training, tuning, calibration and internal verification processes are repeated separately to construct early prediction models for severe AKI occurring within the next 6 hours, 12 hours, 18 hours and 24 hours. The constructed early prediction models are combined with the SHAP method to achieve model interpretability, namely a ranking of the risk factors associated with the occurrence of severe AKI and a dynamic evaluation of the importance of each factor for the occurrence of severe AKI.
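As a non-limiting illustration, the 80%/10%/10% partition and the four prediction horizons described above can be sketched in Python as follows. Two points are assumptions that the application does not state: the split is made at patient level so that hours from one patient never fall into two partitions, and a sample at a given hour is labeled positive when severe AKI onset falls within the following horizon. All names are hypothetical.

```python
import random
from typing import Dict, List, Optional, Sequence

HORIZONS_H = (6, 12, 18, 24)  # prediction tasks, in hours


def split_patients(patient_ids: Sequence[str], seed: int = 0) -> Dict[str, List[str]]:
    """Shuffle unique patient IDs and cut them 80% / 10% / 10%."""
    ids = sorted(set(patient_ids))
    random.Random(seed).shuffle(ids)       # seeded for reproducibility
    n_train = len(ids) * 8 // 10
    n_calibration = len(ids) // 10
    return {
        "train": ids[:n_train],                                  # training and tuning
        "calibration": ids[n_train:n_train + n_calibration],     # probability calibration
        "internal_validation": ids[n_train + n_calibration:],    # internal verification
    }


def label_sample(current_hour: int, onset_hour: Optional[int], horizon_h: int) -> int:
    """1 if severe AKI onset lies within (current_hour, current_hour + horizon_h], else 0."""
    if onset_hour is None:                 # patient never develops severe AKI
        return 0
    return int(current_hour < onset_hour <= current_hour + horizon_h)
```

Under these assumptions one model would be trained per entry of `HORIZONS_H`, each on the same patient partition but with its own labels.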

Preferably, in the model construction and evaluation, the 9 indexes are 8 evaluation indexes and 1 function index, and the 9 indexes are used for evaluating the performance of the prediction model;

the 8 evaluation indexes include: AUROC, specificity, sensitivity, accuracy, F1 value, precision, negative predictive value, and AUPR; the 1 function index is interpretability;

in the model construction and evaluation, the three modes are as follows: time sequence verification, external verification and subgroup analysis; evaluating the performance of the prediction model by using the three modes;

performing time sequence verification based on BIDMC 2017-2019;

performing external verification based on eICU-CRD, Ams-UMC and PLAGH;

selecting 8 aspects for subgroup analysis, including: age, divided into 18-44 years, 45-64 years, 65-74 years, 75-84 years, and 85 years or older; race, divided into Asian, African, Hispanic, and white; ICU type, divided into cardiac ICU, medical surgery ICU, neurology ICU, medical ICU, surgical ICU, and trauma surgery ICU; admission SOFA score, divided into 0-1 points, 2 points, 3-4 points, 5-6 points, and 7 points or more; sex, divided into male and female; hospital type, divided into teaching hospitals and non-teaching hospitals; number of beds, divided into <100, 100-249, 250-499, and 500 or more; region, divided into the Midwest, the South, the West and the Northeast of the United States.

Preferably, in the model application, depending on whether the model is to be used in the future by the hospital where it was developed, by hospitals in other regions of the country where it was developed, or by hospitals in other countries, and according to the differences among these scenarios in local population distribution, amount of data owned, and medical care processes, direct use of the model, model migration, or model retraining is selected.

Preferably, if the model is to be used in the future by the hospital where it was developed, the trained prediction model can be used directly;

if the model is to be used by hospitals in other regions of the country where it was developed, the trained prediction model can be used directly, but if the proportion of severe AKI in the use scenario differs greatly from that of the hospital where the model was trained, model migration gives better prediction performance;

if the model is to be used by hospitals in countries other than the country where it was developed, local data must be used for model migration or retraining, and if the local sample size reaches 3000 patients, performance close to that of training on a large sample set can be achieved.
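As a non-limiting illustration, the scenario-dependent choice described above can be written as a small decision helper in Python. The enumeration, the function names and the boolean input are hypothetical; in particular, how a "great difference" in the proportion of severe AKI is judged is left to the user, since the application gives no numeric criterion.

```python
from enum import Enum
from typing import List


class Scenario(Enum):
    DEVELOPMENT_HOSPITAL = "hospital where the model was developed"
    OTHER_REGION_SAME_COUNTRY = "other region of the same country"
    OTHER_COUNTRY = "other country"


def candidate_strategies(
    scenario: Scenario, severe_aki_rate_differs_greatly: bool = False
) -> List[str]:
    """Map an application scenario to the strategies named in the text."""
    if scenario is Scenario.DEVELOPMENT_HOSPITAL:
        return ["direct use"]
    if scenario is Scenario.OTHER_REGION_SAME_COUNTRY:
        # Migration is preferred only when the local severe AKI proportion differs greatly.
        return ["model migration"] if severe_aki_rate_differs_greatly else ["direct use"]
    return ["model migration", "model retraining"]


def local_sample_is_sufficient(n_local_patients: int, minimum: int = 3000) -> bool:
    """True when the local cohort reaches the 3000-patient size mentioned in the text."""
    return n_local_patients >= minimum
```

In every scenario the chosen model would still be recalibrated on local data before use, as described for the model application step.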

Preferably, further comprising: AKI assistant decision support software is constructed;

the AKI assistant decision support software dynamically evaluates the risk of serious AKI of the patient to be predicted within 6, 12, 18 and 24 hours in the future every hour, explains the reasoning process of the prediction model and presents the dynamic change process of the importance degree of the risk factors.

In another aspect, the present application further provides a method for developing a disease prediction model based on an electronic health record, where the prediction model is a machine/deep learning model, the method including: data set construction, data processing, model construction, model evaluation and model application;

in the data set construction, based on a plurality of large electronic health record data sets from different countries, regions and hospitals, according to clinical diagnosis definition and research task identification events and occurrence time in a database, the selection of people and the incorporation of research variables are completed;

in the data processing, data extraction, variable combination, abnormal value removal, data interpolation and state annotation for model construction are carried out;

in model construction and evaluation, a tuned prediction model is obtained based on a large sample data set through feature extraction, model training, hyper-parameter selection, model calibration and internal verification; the optimal model is then selected and its generalizability, robustness and scope of application are evaluated, which involves the choice of evaluation indexes and bias considerations, model selection, time sequence verification, external verification and model interpretation;

in the model application, taking into account the actual application scenario, the available data, the population distribution characteristics and the hospital characteristics, the prediction model is made suitable for local use by direct use, migration to the local setting, or local retraining, followed by model recalibration; and a visualization platform that supports disease assistant decision making and conforms to medical practice habits is built to present the results.

In the application, (1) a construction and use process to be followed in disease prediction model development is proposed for the first time, and the subsequent model development and application are carried out based on this process;

(2) a large-sample time series data set for AKI model research is constructed based on 5 EHR data sets (Medical Information Mart for Intensive Care III (MIMIC-III), Medical Information Mart for Intensive Care IV (MIMIC-IV), eICU Collaborative Research Database (eICU-CRD), AmsterdamUMCdb (Ams-UMC), and PLA General Hospital ICU Database (PLAGH)) from 205 hospitals in the United States, the Netherlands and China;

(3) based on data from BIDMC hospital from 2001 to 2016, 4 machine learning methods (Light Gradient Boosting Machine (LightGBM), Categorical Boosting (CatBoost), Random Forest (RF), and Logistic Regression (LR)) are adopted to construct models for 4 severe AKI risk prediction tasks (within the next 6 hours, 12 hours, 18 hours, and 24 hours), and internal tuning and calibration are completed;

(4) according to the requirements of doctors, 8 evaluation indexes (AUROC, sensitivity, specificity, accuracy, F1 value, precision, negative predictive value and AUPR) are selected, and the 4 models and 4 tasks are evaluated to select the optimal model meeting the clinical requirements;

(5) the model is comprehensively evaluated using time sequence verification, external verification (205 hospitals in 3 countries), and subgroup analysis in 8 aspects (age, race, ICU type, admission SOFA score, sex, hospital type, number of beds, region);

(6) model migration and use are explored for 3 types of application scenarios (future use in the hospital where the model was developed, use in other regions of the country where the model was developed, and use in other countries);

(7) based on the migrated and calibrated model and its interpretability function, data preprocessing, model operation, visual presentation and the like are packaged to obtain a device that fully automatically and dynamically evaluates the risk of severe AKI in ICU patients within the next 12 hours and presents the model's reasoning and interpretation.

According to the application, verification on large-sample, multi-country, multi-center data sets shows that the model has good generalizability and robustness. It can dynamically predict the risk of severe AKI in ICU patients at an early stage, rank the risk factors associated with severe AKI, dynamically present the reasoning process of the model, and draw attention to potential kidney damage earlier than the clinical gold standard KDIGO. After full evaluation, model migration and calibration, the model is suitable for institutions with different levels of medical resources. Finally, the prediction model is integrated into the device, and assistant decision support software is constructed. Based on diagnosis and treatment data acquired in real time, the risk of severe AKI in ICU inpatients is evaluated fully automatically and dynamically, and the trajectory of the importance of the risk factors is presented.

Drawings

FIG. 1 shows a process for constructing and using a disease prediction model;

FIG. 2 shows a process for developing and executing a prediction model of severe acute renal injury;

FIG. 3 shows the patient inclusion procedure for the 5 data sets;

FIG. 4 shows the inclusion procedure for BIDMC hospital patients from 2001 to 2016;

FIG. 5 shows the inclusion procedure for BIDMC hospital patients from 2017 to 2019;

FIG. 6 shows the inclusion procedure for patients of the 202 eICU-CRD hospitals;

FIG. 7 shows the inclusion procedure for Ams-UMC hospital patients from 2003 to 2016;

FIG. 8 shows the inclusion procedure for non-military PLAGH hospital patients from 2008 to 2019;

FIG. 9 shows the frequency distribution of creatinine and urine volume measurements in the 5 data sets;

FIG. 10 shows the inclusion flow and data segmentation usage for the overall study population;

FIG. 11 shows the experimental design for the patients of the 5 data sets;

FIG. 12 shows the performance of the LightGBM model in the 4 prediction tasks;

FIG. 13 shows the performance of the 4 machine learning models in predicting the occurrence of severe AKI 12 hours into the future;

FIG. 14 shows the model performance after calibration in the 4 prediction tasks;

FIG. 15 shows the importance ranking and the interpretability of the model;

FIG. 16 shows the performance of the prediction model in time sequence verification and external verification;

FIG. 17 shows the performance of the prediction model in the 8-aspect subgroup analysis;

FIG. 18 shows migration and retraining for the 3 different application scenarios of the severe AKI prediction model;

FIG. 19 shows the migration and retraining performance for the 3 different application scenarios of the severe AKI prediction model;

FIG. 20 shows a severe AKI early risk assessment device based on the interpretable machine learning model.

Detailed Description

The invention provides an electronic health file-based method for developing a severe AKI early risk assessment model with interpretable function by adopting an ensemble learning method, which specifically comprises the following steps:

Step 1: data pre-processing

For the 5 clinical research data sets, using the diagnosis and treatment information recorded in detail in the electronic health records, patient data are extracted according to the inclusion and exclusion criteria; different sources and names of the same variable are merged; abnormal values and outliers are removed; the extracted data are interpolated to facilitate model construction; and the state of the patient at each moment (whether severe AKI has occurred) is annotated according to the KDIGO clinical diagnostic criteria and the defined task. Through this process, a time-series research data set is constructed.
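As a non-limiting illustration, the state annotation in this step can be sketched in Python using the published KDIGO creatinine and urine output thresholds. The sketch is deliberately simplified and rests on assumptions not stated in the application: a baseline creatinine is taken as given, the 48-hour time constraint on the 0.3 mg/dl rise and the renal replacement therapy criterion are omitted, and urine output is summarized as the mean rate over trailing windows of hourly values. All function names are hypothetical.

```python
from typing import Optional, Sequence


def kdigo_stage_creatinine(scr_mg_dl: float, baseline_mg_dl: float) -> int:
    """Simplified KDIGO stage (0-3) from serum creatinine relative to baseline."""
    ratio = scr_mg_dl / baseline_mg_dl
    if ratio >= 3.0 or scr_mg_dl >= 4.0:
        return 3
    if ratio >= 2.0:
        return 2
    if ratio >= 1.5 or scr_mg_dl - baseline_mg_dl >= 0.3:
        return 1
    return 0


def kdigo_stage_urine(hourly_ml_per_kg: Sequence[float]) -> int:
    """Simplified KDIGO stage (0-3) from hourly urine output in mL/kg, newest value last."""

    def trailing_rate(hours: int) -> Optional[float]:
        # Mean mL/kg/h over the last `hours` hours, or None if history is too short.
        if len(hourly_ml_per_kg) < hours:
            return None
        return sum(hourly_ml_per_kg[-hours:]) / hours

    rate_24h, rate_12h, rate_6h = trailing_rate(24), trailing_rate(12), trailing_rate(6)
    if rate_24h is not None and rate_24h < 0.3:
        return 3
    if rate_12h is not None and rate_12h == 0:   # anuria for 12 hours
        return 3
    if rate_12h is not None and rate_12h < 0.5:
        return 2
    if rate_6h is not None and rate_6h < 0.5:
        return 1
    return 0


def is_severe_aki(scr_mg_dl: float, baseline_mg_dl: float,
                  hourly_ml_per_kg: Sequence[float]) -> bool:
    """Severe AKI (stage 2 or 3) if either the creatinine or the urine criterion reaches stage 2."""
    stage = max(kdigo_stage_creatinine(scr_mg_dl, baseline_mg_dl),
                kdigo_stage_urine(hourly_ml_per_kg))
    return stage >= 2
```

Taking the maximum of the two stages reflects the point made in the Background that urine volume, and not creatinine alone, should enter the staging.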

Step 2: model construction

Statistical features are constructed from the dynamically acquired data for the study population and variables determined in step 1, and are then input into the ML models under exploration for training; 80% of the data set is used for model training and hyper-parameter selection; 10% of the data is used for calibrating the model predictions; and 10% of the data is used for internal verification of model performance.
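As a non-limiting illustration, the statistical features of one observation window can be sketched in Python as follows. The exact definitions are assumptions made for the sketch: "maximum change" is taken as the largest absolute difference between consecutive hourly values, "sum of changes" as the sum of those absolute differences, "average energy" as the mean of squared values, and the trend slope as an ordinary least-squares slope against the hour index. All names are hypothetical.

```python
from statistics import mean, median, pstdev
from typing import Dict, Sequence


def trend_slope(values: Sequence[float]) -> float:
    """Least-squares slope of the values against their hour index (0, 1, 2, ...)."""
    n = len(values)
    if n < 2:
        return 0.0
    x_mean = (n - 1) / 2
    y_mean = mean(values)
    numerator = sum((i - x_mean) * (v - y_mean) for i, v in enumerate(values))
    denominator = sum((i - x_mean) ** 2 for i in range(n))
    return numerator / denominator


def vital_sign_features(values: Sequence[float]) -> Dict[str, float]:
    """Window statistics for one vital sign; `values` must be non-empty hourly values."""
    diffs = [abs(b - a) for a, b in zip(values, values[1:])]
    avg = mean(values)
    return {
        "max": max(values), "min": min(values),
        "median": median(values), "mean": avg, "std": pstdev(values),
        "max_change": max(diffs, default=0.0),
        "sum_change": sum(diffs),
        "mean_energy": mean(v * v for v in values),
        "n_above_mean": sum(v > avg for v in values),
        "n_below_mean": sum(v < avg for v in values),
        "slope": trend_slope(values),
        "first": values[0], "last": values[-1],
    }


def urine_features(hourly_ml: Sequence[float], weight_kg: float) -> Dict[str, float]:
    """Urine features for a 12-hour window of hourly volumes in mL."""

    def rate(volumes: Sequence[float]) -> float:
        return sum(volumes) / weight_kg / len(volumes)   # mL/kg/h

    return {
        "total_ml": sum(hourly_ml),
        "rate_12h": rate(hourly_ml),
        "rate_first_6h": rate(hourly_ml[:6]),
        "rate_last_6h": rate(hourly_ml[6:]),
        "first": hourly_ml[0], "last": hourly_ml[-1],
    }
```

A mean arterial pressure that rises steadily by 2 mmHg per hour would, for instance, yield a `slope` of 2.0 in this sketch.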

Step 3: model evaluation

In order to evaluate the performance of the model comprehensively and understand its boundaries of use, 8 indexes are adopted for evaluation: AUROC, specificity, sensitivity, accuracy, F1 value, precision, negative predictive value and AUPR. Subgroup analysis is carried out on aspects of clinical importance, to avoid potential bias and unfair patient assessment caused by data bias that the model may have learned. Based on these evaluation indexes and bias assessments, the optimal model is selected according to the model verification results; time sequence verification and external verification are then performed on the selected model; and finally the evaluation results given by the model are interpreted, to facilitate the understanding and trust of users.
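As a non-limiting illustration, the 8 evaluation indexes can be computed from labels and predicted probabilities as sketched below in Python. The sketch assumes a fixed decision threshold of 0.5 for the threshold-dependent indexes, computes AUROC by pairwise comparison of positive and negative samples, and approximates AUPR by average precision, which is one common estimator among several. All function names are hypothetical.

```python
from typing import Dict, Sequence


def _ratio(numerator: float, denominator: float) -> float:
    """Safe division that returns 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


def threshold_metrics(y_true: Sequence[int], y_prob: Sequence[float],
                      threshold: float = 0.5) -> Dict[str, float]:
    """Sensitivity, specificity, accuracy, precision, NPV and F1 at one threshold."""
    tp = fp = tn = fn = 0
    for label, prob in zip(y_true, y_prob):
        predicted_positive = prob >= threshold
        if predicted_positive and label:
            tp += 1
        elif predicted_positive:
            fp += 1
        elif label:
            fn += 1
        else:
            tn += 1
    sensitivity = _ratio(tp, tp + fn)
    precision = _ratio(tp, tp + fp)
    return {
        "sensitivity": sensitivity,
        "specificity": _ratio(tn, tn + fp),
        "accuracy": _ratio(tp + tn, tp + fp + tn + fn),
        "precision": precision,
        "npv": _ratio(tn, tn + fn),
        "f1": _ratio(2 * precision * sensitivity, precision + sensitivity),
    }


def auroc(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """Probability that a random positive scores above a random negative (ties count 0.5)."""
    positives = [p for y, p in zip(y_true, y_prob) if y]
    negatives = [p for y, p in zip(y_true, y_prob) if not y]
    wins = sum(1.0 if a > b else 0.5 if a == b else 0.0
               for a in positives for b in negatives)
    return _ratio(wins, len(positives) * len(negatives))


def average_precision(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """Average precision over the ranked list, used here as an estimate of AUPR."""
    ranked = sorted(zip(y_prob, y_true), key=lambda pair: pair[0], reverse=True)
    hits, total = 0, 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label:
            hits += 1
            total += hits / rank          # precision at this positive
    return _ratio(total, hits)
```

The same functions can be applied unchanged to each subgroup, to the time sequence verification set and to each external data set, which keeps the comparison across evaluation modes consistent.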

Step 4: model use

When a prediction model developed on large-sample, high-quality data sets is put into use, the following must be considered: the 3 application scenarios (future use in the hospital where the model was developed, use in other regions of the country where it was developed, and use in other countries) differ, as do the amount of accumulated data, the patient population distribution and the medical care processes of the actual use setting. The following procedures may be employed: directly using the developed model; using the trained model as a pre-trained model for model migration; retraining the model on local data; and calibrating the trained model. In addition, a platform that is convenient for medical staff to operate, namely a device for assessing the risk of severe AKI and for model reasoning and interpretation, is developed so that the model truly serves the clinic.

The present invention will be described in detail with reference to fig. 1 to 20.

The invention provides a disease prediction model and a disease prediction device developed on the basis of electronic health records from multiple countries and hospitals, used mainly for early and dynamic prediction of a patient's probability (risk) of disease onset during ICU (intensive care unit) hospitalization. The aim is to develop a practically deployable risk assessment model guided by actual clinical needs, which fully automatically and dynamically assesses the degree of kidney damage in ICU inpatients, so as to evaluate the occurrence of severe AKI early and help doctors intervene early and treat patients at risk of deterioration. The invention uses the large-sample data accumulated over years in electronic health records, which cover a wide population, so that the model can be developed and evaluated quickly, effectively and at low cost, effectively addressing the time, labor and cost burden of randomized controlled clinical studies. The model shows good prediction performance in migration, verification and subgroup analysis across 205 hospitals in 3 countries. Finally, the method is packaged so that the risk (probability) of a patient developing severe AKI can be calculated fully automatically and the model's reasoning process is presented.

The method proposed in the invention follows, as a whole, the 4 modules of prediction model development and application. Data preprocessing: the data of the multi-source data sets are cleaned and organized according to the formulated research question, and the state of the outcome of interest is labeled. Model construction: features that describe the dynamic change of a patient's disease severity more comprehensively are constructed based on clinical prior knowledge and previous research, and the model is then trained, tuned and calibrated on the selected data set. Model evaluation: to understand the predictive performance of the model more fully, including its generalizability, robustness and interpretability, the model is evaluated with comprehensive evaluation indexes, subgroup analysis, time series validation and external validation, and the best model whose performance meets clinical requirements is finally selected. Model use: to lay the groundwork for popularizing the model and applying it in actual scenarios, this module covers direct use of the model, model migration, model retraining, model calibration and the development of auxiliary decision support software. A model developed in this way is better suited to the actual application scenario and can be integrated into the care process to truly assist doctors in diagnosis and treatment.

The interpretable, electronic-health-record-based method for early risk assessment of severe AKI provided by the invention trains, tunes and calibrates the model on large-sample data, and its predictive performance can meet the requirements of clinical use. Through subgroup analysis of patient data from 205 hospitals in 3 countries, together with model migration and retraining, the final models show good generalizability. The method also yields a ranking of the risk factors associated with kidney damage and deterioration and dynamically evaluates how strongly each factor influences disease development at the current moment, which helps doctors gain a deeper understanding of how AKI develops. Finally, the method is implemented with parallel computing, so that the risk of severe AKI for a patient in the ICU and the reasons behind the model's inference (namely the contribution of each risk factor) can be obtained automatically, intuitively and quickly.

The method for assessing the risk of severe AKI in ICU inpatients based on electronic health records follows the prediction model development and application process (Fig. 1); the specific implementation is shown in Fig. 2. For convenience of description, the structure of Fig. 1 is slightly adjusted: data preprocessing and part of model construction are combined into a data set construction module, and the rest of model construction is combined with model evaluation into a model construction and evaluation module. The method specifically comprises the following steps:

The data set construction module process in the invention is as follows:

(1) Based on the formulated inclusion and exclusion criteria, the study population is screened from 5 data sets: MIMIC-III, MIMIC-IV, eICU-CRD, Ams-UMC and PLAGH. The 2001-2008 and 2008-2016 patients in MIMIC-III and MIMIC-IV are merged and called BIDMC_{(2001～2016)}; the 2017-2019 patients in MIMIC-IV are extracted separately and called BIDMC_{(2017～2019)}; the patients from the eICU-CRD data set are called 202hospitals_(Philip); the other two patient populations are named Ams-UMC and PLAGH.

(2) Applying the inclusion and exclusion criteria of Fig. 3 gives the study sample size and the proportion of severe AKI in each population: BIDMC_{(2001～2016)} (37,968 patients in total, severe AKI 40.7%), BIDMC_{(2017～2019)} (6,722 patients in total, severe AKI 38.8%), 202hospitals_(Philip) (55,224 patients in total, severe AKI 23.6%), Ams-UMC (7,403 patients in total, severe AKI 37.7%) and PLAGH (33,552 patients in total, severe AKI 13.5%). Fig. 4-8 present the patient exclusion details for each data set. Table 1 presents the baseline comparison of the 5 study populations.

(3) According to clinical prior knowledge and the variables included in previous research, the study includes static characteristics (11 in total, such as age, sex and BMI) representing individual patient information and dynamic characteristics (laboratory examinations, vital signs, treatment information and urine volume, numbering 41, 7, 6 and 1 respectively) representing the course of the disease; the included variables are listed in Table 2.

(4) The state of the patient in each hour is labeled according to the KDIGO clinical diagnostic criteria and the task to be solved. For example, when predicting whether severe AKI occurs within 12 hours, the hours from 12 hours before the onset of severe AKI up to the onset are labeled 1, the other hours before onset are labeled 0, and data after the onset of severe AKI are not studied. Fig. 9 presents the differences among the 5 study data sets in the distribution of creatinine and urine volume collection frequency.
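The hourly state labeling described above can be illustrated with a minimal Python sketch. It assumes one reading of the rule: hours within the prediction horizon before the onset of severe AKI are labeled 1, earlier hours are labeled 0, and hours from the onset onward are dropped. The function name `label_hours` and its arguments are illustrative and do not come from the original text.

```python
from typing import List, Optional


def label_hours(n_hours: int, onset_hour: Optional[int], horizon: int = 12) -> List[int]:
    """Return one label per studied hour of a single ICU stay.

    n_hours:    number of recorded hours (hour indices 0..n_hours-1 since ICU admission)
    onset_hour: hour index at which severe AKI is first met, or None if it never occurs
    horizon:    prediction horizon in hours (12 in the example task)
    """
    if onset_hour is None:
        # No severe AKI during the stay: every hour is a negative sample.
        return [0] * n_hours
    # Assumption: hours at or after onset are excluded from the study.
    last = min(onset_hour, n_hours)
    # An hour is positive when onset lies at most `horizon` hours ahead of it.
    return [1 if onset_hour - h <= horizon else 0 for h in range(last)]
```

For a stay with onset at hour 20 and a 12-hour horizon, this yields 8 negative hours (0 to 7) followed by 12 positive hours (8 to 19); the other prediction tasks would only change `horizon` to 6, 18 or 24.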

TABLE 1. Baseline comparison of the populations of the 5 study data sets

TABLE 2. Variables incorporated in the prediction model

The data processing module process in the invention is as follows:

(1) Based on the extracted study population and the included study variables, the same variables from different data sets are given the same names. A. Outliers are removed based on the physiological bounds provided by physicians and the quartiles of the data. B. The dynamic data are arranged as values for the Nth hour relative to ICU admission, and multiple values within one hour are averaged. C. All variables except FiO2 are filled by forward filling, and where forward filling is not possible the median of the whole population is used; for FiO2, missing values that cannot be forward filled and that fall outside mechanical ventilation are filled with 21%. The missing ratios of the included variables in the 5 data sets are shown in Table 3.

(2) Based on the organized data set, statistical features of the data in the observation window are constructed according to the requirements of the research task; model construction itself is described in detail in the third part. Different types of statistical features are constructed according to the acquisition frequency and variation amplitude of each variable, as shown in Table 2. Personal information keeps the original data, giving 11 features. For laboratory examinations, the maximum, minimum, maximum change between two measurements, initial observation and final observation (namely the current value) of the observation window are constructed, giving 91 features. For vital signs, the maximum, minimum, median, mean, standard deviation, maximum change between two measurements, mean of the squared hourly values in the observation window, sum of the changes between consecutive measurements, number of times above the mean, number of times below the mean, slope of a fitted straight line, initial observation and final observation (namely the current value) of the observation window are constructed, giving 187 features. For treatment information, the maximum (0/1), initial observation, final observation and duration (namely the sum) in the observation window are constructed, giving 24 features. For urine volume, the total over the observation window, the 12-hour total urine volume/patient weight/12 hours, the total urine volume of the first 6 of the 12 hours/patient weight/6 hours, the total urine volume of the last 6 of the 12 hours/patient weight/6 hours, the initial observation and the final observation are constructed, giving 6 features. The input feature dimension of the model constructed in this way is 320.
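The filling and window statistics described above can be sketched in plain Python as follows. This is an illustration under stated assumptions, not the implementation used in the study: all function and key names are hypothetical, "change" is taken as the absolute difference between consecutive hourly values, the standard deviation is the population form, and the slope is an ordinary least-squares fit against the hour index.

```python
from statistics import mean, median, pstdev
from typing import Dict, Iterator, List, Optional


def forward_fill(values: List[Optional[float]], fallback: float) -> List[float]:
    """Carry the last observed value forward; use `fallback` before any observation."""
    filled: List[float] = []
    last: Optional[float] = None
    for v in values:
        if v is not None:
            last = v
        filled.append(fallback if last is None else last)
    return filled


def sliding_windows(series: List[float], width: int = 12, step: int = 1) -> Iterator[List[float]]:
    """Yield consecutive observation windows of `width` hours, moved by `step` hours."""
    for end in range(width, len(series) + 1, step):
        yield series[end - width:end]


def window_stats(window: List[float]) -> Dict[str, float]:
    """Vital-sign style statistics for one observation window of hourly values."""
    n = len(window)
    avg = mean(window)
    # Assumption: "change" is the absolute difference between consecutive hours.
    changes = [abs(b - a) for a, b in zip(window, window[1:])]
    # Least-squares slope of the values against the hour index 0..n-1.
    x_mean = (n - 1) / 2
    denom = sum((i - x_mean) ** 2 for i in range(n))
    slope = sum((i - x_mean) * (v - avg) for i, v in enumerate(window)) / denom if denom else 0.0
    return {
        "max": max(window), "min": min(window),
        "median": median(window), "mean": avg, "std": pstdev(window),
        "diff_max": max(changes, default=0.0), "diff_sum": sum(changes),
        "mean_square": mean([v * v for v in window]),
        "above_mean": sum(1 for v in window if v > avg),
        "below_mean": sum(1 for v in window if v < avg),
        "slope": slope, "first": window[0], "now": window[-1],
    }


def urine_features(hourly_ml: List[float], weight_kg: float) -> Dict[str, float]:
    """Weight-normalized urine output over a 12-hour window (ml/kg/h)."""
    total = sum(hourly_ml)
    return {
        "sum": total,
        "uo_12kgh": total / weight_kg / 12,
        "uo_6kgh_before": sum(hourly_ml[:6]) / weight_kg / 6,
        "uo_6kgh_after": sum(hourly_ml[6:]) / weight_kg / 6,
    }
```

In this sketch the `fallback` argument would be the population median for most variables and 21.0 for FiO2 outside mechanical ventilation; each window produced by `sliding_windows` would be passed to `window_stats` to form one row of model input.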

TABLE 3. Missing ratios of the variables in the 5 data sets

The model construction and evaluation module process in the invention is as follows:

Model development uses machine learning methods and includes 4 types of ML models: LightGBM, CatBoost, RF and LR. The prediction task is to assess the risk of developing severe AKI in the next 6, 12, 18 and 24 hours. Statistical features are extracted with a sliding window (12 hours in this study) and a sliding step of 1 hour, and are input to the ML model for classification to obtain the estimated probability. The data of BIDMC_{(2001～2016)} are used for training, tuning, calibration and internal validation of the model, BIDMC_{(2017～2019)} is used for time series validation, and the 202 other hospitals in the United States, Ams-UMC and PLAGH are each used for external validation; the sample set sizes are shown in Fig. 10.

(1) Model construction and training: based on the experimental design, prediction models are constructed for the 4 tasks. The model is trained and tuned on data from 30,374 patients, calibrated on data from 3,796 patients, and its performance is internally validated on data from 3,798 patients.

(2) To better evaluate the generalizability and robustness of the model, detailed and comprehensive evaluation indexes and modes are defined. The indexes comprise 8 calculated indexes with their 95% confidence intervals (AUROC, specificity, sensitivity, accuracy, F1 value, precision, negative predictive value and AUPR) and 1 functional index (interpretability). There are 3 evaluation modes (time series validation, external validation and subgroup analysis). Data from 6,722 patients are used for time series validation, and data from 55,224 patients in 202 hospitals in the United States, 7,403 patients from Ams-UMC and 33,552 patients from PLAGH are each used for external validation. Eight aspects of clinical interest are selected for subgroup analysis, including age (18-44, 45-64, 65-74, 75-84, ≥85), race (ASIAN, BLACK, HISPANIC, WHITE), ICU type (CCU, Med-SICU, NICU, SICU, TSICU), admission SOFA score (0-1, 2, 3-4, 5-6, ≥7), gender (male, female), hospital type (teaching hospital, non-teaching hospital) and number of beds (<100, 100-. For each subgroup, the AUROC value and its standard deviation are obtained by 200 rounds of Bootstrap validation.
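The data split and the Bootstrap estimate of AUROC described above can be sketched as follows. This is a minimal standard-library illustration, assuming a patient-level 80%/10%/10% split and resampling with replacement; the function names, the random seed and the use of the rank-sum identity for AUROC are choices made for this example and are not stated in the original.

```python
import random
from statistics import mean, pstdev
from typing import List, Sequence, Tuple


def split_patients(patient_ids: Sequence[int], seed: int = 0) -> Tuple[List[int], List[int], List[int]]:
    """Shuffle patients and split them 80/10/10 into training, calibration and internal validation."""
    ids = list(patient_ids)
    random.Random(seed).shuffle(ids)
    n_train = int(0.8 * len(ids))
    n_calib = int(0.1 * len(ids))
    return ids[:n_train], ids[n_train:n_train + n_calib], ids[n_train + n_calib:]


def auroc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUROC through the rank-sum (Mann-Whitney U) identity, with average ranks for ties."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # average of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    rank_sum = sum(r for r, y in zip(ranks, labels) if y == 1)
    return (rank_sum - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)


def bootstrap_auroc(labels: Sequence[int], scores: Sequence[float],
                    n_boot: int = 200, seed: int = 0) -> Tuple[float, float]:
    """Mean and standard deviation of AUROC over `n_boot` resamples (both classes must be present)."""
    rng = random.Random(seed)
    n = len(labels)
    values: List[float] = []
    while len(values) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if 0 < sum(ys) < n:  # skip resamples that contain a single class
            values.append(auroc(ys, [scores[i] for i in idx]))
    return mean(values), pstdev(values)
```

Applied to 37,968 patient ids, `split_patients` returns groups of 30,374, 3,796 and 3,798, which matches the counts given in the text. Calling `bootstrap_auroc` on the labels and predicted probabilities of one subgroup gives the kind of AUROC and standard deviation pair reported in the subgroup analysis.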

Fig. 12 presents the internal validation performance of the LightGBM model under the 4 prediction tasks; the predictive capability of the model differs markedly with the prediction lead time. The 6-hour-ahead model performs best, but its lead time is very limited. The 12-hour-ahead model has an AUROC of 0.874 (0.87-0.877) and an AUPR of 0.456 (0.448-0.466), which is within a clinically acceptable range. Fig. 13 compares the 4 ML models on the 12-hour prediction task and shows that LightGBM is the best prediction model; only the main comparison results are listed here, and more detailed indexes are given in Table 4. The predictive performance of LightGBM, CatBoost, Random Forest and Logistic Regression is, respectively: AUROC 0.874 (0.87-0.877) and AUPR 0.456 (0.448-0.466); AUROC 0.855 (0.851-0.858) and AUPR 0.42 (0.411-0.43); AUROC 0.845 (0.842-0.849) and AUPR 0.38 (0.371-0.389); AUROC 0.823 (0.82-0.827) and AUPR 0.285 (0.278-0.293). Fig. 14a-d present the calibration results of the 4 ML models under the 4 prediction tasks; the Brier score of LightGBM is the lowest, i.e. its calibration is consistently the best. Table 5 ranks, based on the SHAP method, the factors associated with the risk of severe AKI for early assessment (top 20 important features). Fig. 15(a) presents the top 20 factors, in order: urineoutput_uo_6kgh_after, urineoutput_uo_12kgh, bmi, mbp_now, age, sbp_now, bun_cre_ratio_min, fio2_diff_max, gender_F, urineoutput_uo_6kgh_before, urineoutput_sum, temperature_min, ph_min, furosemide_now, weight, egfr_min, paco2_diff_max, hypertension, mbp_trend_slope, pao2_diff_max. Fig. 15(b) shows, based on the model's ranking of risk factors, a dynamic evaluation at a given moment of why the risk of severe AKI occurring next is high. Fig. 11 presents how the 5 data sets take part in model construction, evaluation and application.

Fig. 16 shows the results of time series validation and external validation with the model used directly. Performance drops only slightly in time series validation (BIDMC); external validation performance in other hospitals of the same country falls slightly but AUROC remains above 0.80; external validation performance in the two hospitals in other countries falls sharply, and applicability to PLAGH is the worst. Table 6 presents the validation results for the 8 indexes. Thus, when the model is popularized elsewhere, especially in other countries, differences in population distribution and medical practice matter greatly and may make the model unusable as is. Fig. 17 presents the model's predictive performance in the 8 subgroup analyses on BIDMC and the 202 other hospitals, and Table 7 presents more detailed subgroup results. From these figures and tables we conclude that the model shows no bias or unfairness toward small subpopulations, and that it achieves good predictive performance in every subgroup and meets clinical needs.
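Two of the quantities used in this evaluation, the Brier score for calibration and a global ranking of risk factors, can be illustrated with a short Python sketch. The Brier score is the mean squared difference between predicted probability and observed outcome. For the ranking, the sketch assumes that per-sample attribution values (such as SHAP values) are already available and aggregates them by mean absolute value, a common convention; the original text does not state how its ranking was aggregated, and the function names here are hypothetical.

```python
from typing import Dict, List, Sequence


def brier_score(labels: Sequence[int], probs: Sequence[float]) -> float:
    """Mean squared error between predicted probabilities and 0/1 outcomes (lower is better)."""
    return sum((p - y) ** 2 for y, p in zip(labels, probs)) / len(labels)


def rank_factors(contributions: Dict[str, Sequence[float]], top_k: int = 20) -> List[str]:
    """Rank features by mean absolute per-sample attribution (assumed aggregation rule)."""
    importance = {
        name: sum(abs(v) for v in values) / len(values)
        for name, values in contributions.items()
    }
    return sorted(importance, key=importance.get, reverse=True)[:top_k]
```

A perfectly calibrated and perfectly confident model would score 0; comparing the scores of several models on the same held-out calibration data is what allows a statement such as "the Brier score of LightGBM is the lowest".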

TABLE 4. Performance of the 4 machine learning models in the 4 prediction tasks

TABLE 5. Ranking of the top 20 important features of the model predicting the occurrence of severe AKI within the next 12 hours

TABLE 6. Performance of the prediction model in direct time series validation and external validation

TABLE 7. Performance of the prediction model in the 8 subgroup analyses on two data sets

The model application module process in the invention is as follows:

The application scenarios considered are: future use in the hospital where the model was developed, use in hospitals in other regions of the country where the model was developed, and use in hospitals in other countries (Europe, China), as shown in Fig. 18. For each potential application scenario we consider the amount of data it owns (none, small, medium, large), the population distribution of the study target (i.e. the proportion of severe AKI) and differences in medical practice (frequency of measuring and recording creatinine, urine volume, etc.). Table 8 presents, for the 5 study populations, the time of AKI onset, the proportion of severe AKI and its time of onset, the basis for grading AKI (creatinine/urine volume/both) and patient prognosis (dialysis and death). The following can be observed: the proportion of severe AKI at BIDMC is similar in the two periods and exceeds 38%; eICU-CRD contains data from 202 other hospitals in the United States, where the proportion falls to 23.6%; the proportion at Ams-UMC is close to that of BIDMC; and the proportion at PLAGH is the lowest, 13.5%. Except for PLAGH, the median onset time in the 4 populations is roughly after the first ICU day, while at PLAGH it is about one day later. Except for PLAGH, nearly 90% of AKI grades in the 4 populations are determined by urine volume; at PLAGH about 2/3 are determined by urine volume and the remaining 1/3 by creatinine. From the urine volume recording frequency of the 5 data sets in Fig. 9 it can also be inferred that recording frequency affects the identification of AKI to some extent. In addition, the dialysis rate of the 3 study populations other than PLAGH and Ams-UMC is close to 2%, while that of the remaining two hospitals is higher, close to or above 7%. The mortality of patients in the 4 populations other than PLAGH is about 11.9-15.2%, whereas that of PLAGH exceeds 20%. The cumulative number of BIDMC patients from 2017 to 2019 available for evaluating model performance is 6,722; the 202 other hospitals provide 55,224 patients in total, but most individual hospitals have only 500-2,000; Ams-UMC provides 7,403 patients; and PLAGH provides 33,552. Based on these considerations, we carried out the subsequent application studies. Fig. 11 presents the use of the above data.

Model migration and training: three approaches are compared: using the model directly without adjustment; migrating, retraining and calibrating the model starting from the pre-trained model; and retraining and calibrating the model on local data only. Fig. 19 presents the predictive performance of models with different amounts of data in the 3 application scenarios above. In the figure, the black dotted line is the result of validating the model directly without any migration, the blue shaded line is the result after migration, and the green shaded line is the result after retraining.

Fig. 19a shows BIDMC: 1,345 patients are set aside for model evaluation, and we compare models incorporating 200 and 500 to 5,000 new patients (interval 500). With only 200 patients, performance drops (especially for the migrated model) and the uncertainty of the prediction is high. As more patients are incorporated, performance keeps improving and the gap between the retrained model and the model obtained by transfer learning gradually narrows; once the sample size reaches 3,000 patients the two are close and further improvement is small. Both eventually approach the model trained well on the large sample set. Thus, for future use at BIDMC, the model trained on previously accumulated data can be used directly.

Fig. 19b-e show the results for 4 hospitals selected from eICU-CRD, with hospital ids 264, 443, 73 and 420; information on these hospitals is given in Table 9. The severe AKI proportion of the first two is close to that of BIDMC, and that of the last two differs considerably from it. Results are presented at intervals of 200 patients. For the first two hospitals, as more patient data are incorporated, both approaches exceed the directly used model, and the migrated model performs slightly better than the retrained model. The hospital with a lower proportion of severe AKI (id 73) shows the same pattern as the first two. For the hospital with a higher proportion than BIDMC (id 420), the directly used trained model performs better when the amount of local data is small.

Fig. 19f shows the evaluation on Ams-UMC. The AUROC of the directly evaluated model is 0.72; incorporating data from as few as 200 patients improves performance substantially, performance keeps improving as more data are incorporated and tends to stabilize after 3,000 patients, and the difference between the transfer learning model and the retrained model is small throughout. Fig. 19g shows the evaluation on PLAGH, which is consistent with the Ams-UMC results, although the directly evaluated model performs even worse (AUROC 0.62). These 3 application scenarios and the aspects to consider in use offer guidance for subsequent work, such as how to use the model, how much data is needed, and at what amount of data one can stop waiting for more and start training the model.

Assistant decision support software:

We further developed auxiliary decision support software for AKI. Based on communication with physicians, it includes dynamic assessment (hourly), early prediction (risk in the next 12 hours) and inference explanation (the reasons for the model's prediction). The software design follows two principles: easy operation for medical staff (consistent with clinical habits, concise) and automatic calculation (no added workload). On this basis we designed a severe AKI early risk assessment device based on an interpretable machine learning model, as shown in Fig. 20. The device operates as follows: A. the patient's personal information (such as sex, age and BMI), vital signs (such as heart rate, respiratory rate and blood pressure), laboratory examinations (such as creatinine, lactate and blood potassium), treatment information (such as whether mechanical ventilation or vasopressors are used) and urine volume are automatically extracted from the medical information system; B. hourly dynamic calculation is carried out through the intermediate steps of data preprocessing, feature calculation, model operation and inference explanation; C. the severe AKI risk probability calculated in step B is converted into a score for presentation, together with an analysis of the model's decision at the current moment, which helps medical staff understand and trust the result. The device performs parallel computation fully automatically and quickly calculates the risk probabilities of all monitored patients in a ward or on a hospital-wide cloud platform. Without adding to the workload of medical staff, it helps doctors identify early the patients who may develop irreversible kidney damage, so that medium- and high-risk patients can receive early intervention.
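The hourly, ward-wide calculation in steps A to C can be pictured with the following Python sketch. Everything specific in it is an assumption made for illustration: the `predict` and `explain` callables stand in for the trained model and its explanation step, the score is simply the probability scaled to 0-100 (the text does not specify the conversion), and a thread pool is only one possible way to run patients in parallel.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List, Union

Features = Dict[str, float]
Result = Dict[str, Union[int, List[str]]]


def assess_patient(features: Features,
                   predict: Callable[[Features], float],
                   explain: Callable[[Features], Dict[str, float]],
                   top_k: int = 3) -> Result:
    """Score one patient and list the factors with the largest absolute contribution."""
    prob = predict(features)            # step B: model operation
    contributions = explain(features)   # step B: inference explanation
    top = sorted(contributions.items(), key=lambda kv: abs(kv[1]), reverse=True)[:top_k]
    # Step C: assumed conversion of the risk probability to a 0-100 score.
    return {"score": round(100 * prob), "top_factors": [name for name, _ in top]}


def assess_ward(ward: Dict[str, Features],
                predict: Callable[[Features], float],
                explain: Callable[[Features], Dict[str, float]],
                workers: int = 4) -> Dict[str, Result]:
    """Assess every monitored patient in parallel; intended to be called once per hour."""
    ids = list(ward)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = pool.map(lambda pid: assess_patient(ward[pid], predict, explain), ids)
        return dict(zip(ids, results))
```

In a deployment along these lines, `ward` would be rebuilt each hour from the hospital information system (step A) after the preprocessing and feature calculation of step B, and the returned scores and factor lists would drive the display.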

TABLE 8. Severe AKI distribution and prognosis comparison of the 5 study populations

TABLE 9. The 4 hospitals in eICU-CRD used for model migration and retraining

Hospital id	Teaching hospital	Bed count category	Region	Sample size	Severe AKI proportion (%)
264	Yes	≥500	Midwest	1,917	32.9
443	Yes	≥500	South	1,348	31.3
73	Yes	≥500	Midwest	1,868	13.1
420	Yes	≥500	Northeast	1,365	51.3

The model in the application can be packaged after training and run on computing equipment (such as a computer, a server, intelligent equipment and the like) to form a special severe AKI early risk assessment device. The device automatically retrieves required personal information, laboratory examination information, vital sign information, treatment information and urine volume from a hospital information system in real time, and obtains a dynamic evaluation result of the risk of the occurrence of the severe AKI and a dynamic change process of important risk factors per hour by utilizing the severe AKI early risk evaluation model based on the interpretable machine learning model.

The invention has the advantages that:

(1) the risk probability of severe AKI in an ICU patient in the next 12 hours can be predicted, helping doctors notice as early as possible that the patient's kidneys may be seriously injured, so that intervention and treatment can begin early and irreversible kidney injury can be avoided;

(2) the model is trained and evaluated on data from approximately 140,000 ICU inpatients in 205 hospitals across 3 countries, making this the multicenter study with the largest sample size and widest coverage among severe AKI prediction models developed to date;

(3) the development, evaluation and application of the model follow a strict and transparent standard process; 8 evaluation indexes and subgroup analysis in 8 aspects are adopted, and the performance and usage boundaries of the model are evaluated in detail through 3 modes: internal validation, time series validation and external validation;

(4) for 3 application scenarios (local, other regions and other countries), the data ownership, population distribution and other characteristics of actual scenarios are examined to explore how the model should be migrated and applied, finally achieving predictive performance in the different scenarios close to that of the development model;

(5) according to clinical requirements, software (a device) for assisting the diagnosis of severe AKI is developed, which fully automatically and dynamically presents the probability that the patient develops severe AKI in the next 12 hours together with the reasoning process of the model, so that doctors can understand the model more intuitively and conveniently use the analysis results for further treatment and intervention.

Unless defined otherwise, all technical and/or scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. The materials, methods, and examples set forth in this application are illustrative only and not intended to be limiting.

Although the present invention has been described in conjunction with specific embodiments thereof, it is evident that many alternatives, modifications, and variations will be apparent to those skilled in the art in light of the teachings of this application and yet remain within the scope of this application.

Claims

1. A severe AKI early risk assessment model based on an interpretable machine learning model is a LightGBM model based on a fusion SHAP method; the model has a number of features, of which the first 20 features by their importance are:

2. The severe AKI early risk assessment model based on an interpretable machine learning model according to claim 1, wherein: further comprising a data processing module;

3. An interpretable machine learning model-based severe early AKI risk assessment apparatus, comprising a computing unit executing the interpretable machine learning model-based severe early AKI risk assessment model according to any one of claims 1-2, wherein the model adopts a SHAP method fused with the model to obtain a risk factor ranking and a dynamic assessment of the importance degree of each factor on the occurrence of severe AKI of a patient to be predicted.

4. The device for early risk assessment of severe AKI based on interpretable machine learning model according to claim 3, wherein:

the device automatically retrieves the required personal information, laboratory examination information, vital sign information, treatment information and urine volume from a hospital information system in real time, and uses the severe AKI early risk assessment model based on the interpretable machine learning model to obtain an hourly dynamic assessment of the future risk of severe AKI and the dynamic change of important risk factors;

5. A method of developing a severe AKI early risk assessment model based on an interpretable machine learning model, comprising: data set construction, data processing, model construction and evaluation and model application;

6. The method of developing a severe AKI early risk assessment model based on an interpretable machine learning model according to claim 5, wherein:

the plurality of large electronic health record data sets comprises:

BIDMC 2001-2016, BIDMC 2017-2019, eICU-CRD, Ams-UMC, PLAGH.

7. The method of developing a severe AKI early risk assessment model based on an interpretable machine learning model according to claim 6, wherein:

the selection principle of the crowd is as follows:

(1) hospitalized Intensive Care Unit (ICU) patients aged 18 to 90 years;

(6) excluding patients who have been in end stage renal disease;

8. The method of developing a severe AKI early risk assessment model based on an interpretable machine learning model according to claim 7, wherein:

in the inclusion of the research variables, five types of research variables are determined and divided into static characteristics and dynamic characteristics; the method comprises the following steps:

9. The method of claim 8 for developing a severe AKI early risk assessment model based on an interpretable machine learning model, wherein:

the data extraction and pretreatment comprises the following steps:

10. The method of developing a severe AKI early risk assessment model based on an interpretable machine learning model according to claim 6, wherein:

in the model construction and evaluation, a suitable machine learning model is adopted; based on the positive and negative samples labeled in the data set construction and the statistical features constructed in the data processing, 80% of the BIDMC 2001-2016 data set is used for model training and for parameter tuning by Bayesian optimization, 10% is used for model calibration, and 10% is used for internal model validation; the training, tuning, calibration and internal validation processes are repeated for each task to construct early prediction models for severe AKI occurring in the next 6 hours, 12 hours, 18 hours and 24 hours; the constructed early prediction models are fused with the SHAP method to achieve model interpretability, namely the ranking of the risk factors associated with the occurrence of severe AKI and the dynamic evaluation of the importance of each factor to the occurrence of severe AKI.

11. The method of developing a severe AKI early risk assessment model based on an interpretable machine learning model according to claim 6, wherein:

in the model construction and evaluation, the 9 indexes are 8 evaluation indexes and 1 function index, and the 9 indexes are used for evaluating the performance of the prediction model;

performing time series validation based on BIDMC 2017-2019;

performing external verification based on eICU-CRD, Ams-UMC and PLAGH;

12. The method of developing a severe AKI early risk assessment model based on an interpretable machine learning model according to claim 5, wherein:

in the model application, for future use in the hospital where the model was developed, use in hospitals in other regions of the country where the model was developed, and use in hospitals in other countries, direct use of the model, model migration or model retraining is selected according to the differences between scenarios in local population distribution, data ownership and medical care process.

13. The method of developing a severe AKI early risk assessment model based on an interpretable machine learning model according to claim 12, wherein:

if the model is applied for future use in the hospital where the model was developed, the trained prediction model can be used directly;

14. The method of developing a severe AKI early risk assessment model based on an interpretable machine learning model according to claim 5, wherein:

further comprising: AKI assistant decision support software is constructed;

15. A development method of a disease prediction model based on an electronic health record, wherein the prediction model is a machine/deep learning type model, and the development method comprises the following steps: data set construction, data processing, model construction, model evaluation and model application;