CN117690591A

CN117690591A - Method, device, equipment and storage medium for predicting chronic kidney disease progression risk

Info

Publication number: CN117690591A
Application number: CN202311705794.7A
Authority: CN
Inventors: 宋娜娜; 陆雨菲; 朱博文; 李阳; 赵栓; 张伟东; 张健; 杨炎; 陈威泽; 颜芷昕; 陈安南; 孙滢雪; 方艺; 丁小强
Original assignee: Zhongshan Hospital Fudan University
Current assignee: Zhongshan Hospital Fudan University
Priority date: 2023-12-12
Filing date: 2023-12-12
Publication date: 2024-03-12

Abstract

The invention relates to the technical field of artificial intelligence, and discloses a chronic kidney disease progression risk prediction method, device, equipment and storage medium, which address the technical problem that prior-art methods for predicting chronic kidney disease progression risk produce inaccurate results. The method comprises the following steps: training a progress prediction model based on historical medical record information of chronic kidney disease patients; acquiring sample data of a patient to be predicted, and inputting the sample data into the progress prediction model for risk prediction to obtain a risk score; invoking a machine learning interpretation tool to evaluate the progress prediction model and the obtained risk score, obtaining a contribution value of each evaluation index in the sample data to the risk score; and generating a chronic kidney disease progression risk prediction result according to the risk score and the contribution value of each evaluation index to the risk score. The method can provide more accurate and effective prediction results for the progression risk of chronic kidney disease.

Description

Method, device, equipment and storage medium for predicting chronic kidney disease progression risk

Technical Field

The invention relates to the technical field of artificial intelligence, in particular to a method, a device, equipment and a storage medium for predicting chronic kidney disease progression risk.

Background

Chronic kidney disease (CKD) is a progressive disease that manifests as kidney damage or reduced kidney function lasting at least three months. According to a Global Burden of Disease study report, nearly 1.2 million people worldwide die from CKD, and more than one in every seven adults suffers from CKD. In recent years, a large body of literature has emphasized that CKD progression can lead to serious outcomes such as high mortality, end-stage renal disease (ESRD), and mineral and bone disease. Despite the rising incidence of CKD and the extremely high mortality of CKD-induced diseases such as ESRD, CKD is often not detected until it has progressed to the end stage because of its complex underlying background, occult clinical manifestations, and irreversible clinical course, and treatment outcomes are therefore unsatisfactory. Clinical factors such as anemia, hypertension, hyperuricemia, metabolic acidosis, and edema are associated with CKD progression. The traditional method for diagnosing CKD requires expert consultation and multiple examinations and usually relies on the knowledge and experience of doctors, so diagnostic errors and inaccurate results are likely. Particularly in rural areas with limited medical resources, the complicated diagnostic process not only places an additional burden on medical staff but also delays the optimal time for treating patients. Against this background, applying intelligent technologies such as machine learning to improve the performance and efficiency of stratified CKD diagnosis and treatment is of great practical significance for safeguarding public health.
Currently, artificial intelligence has been widely used in the field of kidney disease, and prediction of disease progression and treatment advice are the two most important applications of artificial intelligence in clinical practice of kidney disease, and many studies consider chronic kidney disease prediction schemes based on machine learning.

In the prior art, an artificial intelligence model for stratified diagnosis and treatment of CKD patients is constructed from the two quantitative laboratory indexes emphasized by the Kidney Disease: Improving Global Outcomes (KDIGO) guidelines (2012 edition), namely the urine albumin-to-creatinine ratio (ACR) and the estimated glomerular filtration rate (eGFR). Such a model helps stratify patients and formulate clinical treatment strategies, and can delay CKD progression in time and reduce the incidence of complications. However, because the urinary albumin index, the basic parameter of ACR, has problems of detection cost and accuracy, it is difficult to adopt ACR as a general screening standard for CKD. Therefore, from the standpoint of practicality and cost effectiveness, a more practical and convenient method is needed to replace urinary albumin detection. In addition, in existing artificial intelligence schemes for stratified diagnosis and treatment of CKD patients, most models are black boxes: the contribution of each index to the output cannot be determined, so their reliability is low in the medical field, which places extremely high demands on accuracy and certainty. A method for predicting the progression risk of chronic kidney disease that provides more accurate and effective prediction results is therefore urgently needed.

Disclosure of Invention

The invention mainly aims to solve the technical problem that the prediction result of the chronic kidney disease progression risk prediction method in the prior art is inaccurate.

The first aspect of the present invention provides a method for predicting the risk of progression of chronic kidney disease, comprising:

training a progress prediction model based on historical medical record information of chronic kidney disease patients;

acquiring sample data of a patient to be predicted, and inputting the sample data into the progress prediction model to perform risk prediction to obtain a risk score;

invoking a machine learning interpretation tool to evaluate the progress prediction model and the obtained risk score, and obtaining a contribution value of each evaluation index in the sample data to the risk score;

and generating a chronic kidney disease progression risk prediction result according to the risk score and the contribution value of each evaluation index to the risk score.

Optionally, in a first implementation manner of the first aspect of the present invention, the training of the progress prediction model based on the historical medical record information of chronic kidney disease patients includes:

acquiring historical medical record information of chronic kidney disease patients, and generating training samples based on the historical medical record information and the corresponding diagnosis results;

constructing a plurality of initial models based on a plurality of deep learning algorithms, and training each initial model on the training samples to obtain a plurality of trained candidate models;

and evaluating and screening the prediction performance of each candidate model based on the training samples, and selecting the candidate model with the best prediction performance as the progress prediction model.

Optionally, in a second implementation manner of the first aspect of the present invention, before the acquiring of the historical medical record information of chronic kidney disease patients and the generating of training samples based on the historical medical record information and the corresponding diagnosis results, the method further includes:

calling a regularization-based sparse feature selection algorithm and a random-forest-based recursive feature elimination algorithm to construct a plurality of initial classification models;

inputting the historical medical record information into the initial classification models, calling a grid search algorithm to tune the hyperparameters of each initial classification model, and selecting the initial classification model with the largest area under the receiver operating characteristic (ROC) curve as the feature screening model;

ranking the medical features by importance according to the feature screening model to obtain a medical feature sequence;

and screening out, from the medical feature sequence, the medical feature combinations whose importance weights are greater than a preset threshold to obtain the evaluation indexes.
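The last two steps (importance ranking followed by threshold screening) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the importance weights, the threshold of 0.05, and the function name `screen_evaluation_indexes` are hypothetical and are not taken from the patent.

```python
from typing import Dict, List


def screen_evaluation_indexes(importances: Dict[str, float], threshold: float) -> List[str]:
    """Rank medical features by importance weight and keep those above the threshold."""
    ranked = sorted(importances.items(), key=lambda item: item[1], reverse=True)
    return [name for name, weight in ranked if weight > threshold]


# Hypothetical importance weights, as a feature screening model might output them.
importances = {"age": 0.21, "hemoglobin": 0.34, "sodium": 0.03, "albumin": 0.12}
evaluation_indexes = screen_evaluation_indexes(importances, threshold=0.05)
print(evaluation_indexes)
```

The returned list is already ordered by importance, so it doubles as the medical feature sequence restricted to the retained indexes.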

Optionally, in a third implementation manner of the first aspect of the present invention, the obtaining the historical medical record information of the chronic kidney disease patient, and generating the training sample based on the historical medical record information and the corresponding diagnosis result includes:

collecting electronic medical record data of a chronic kidney disease patient confirmed by pathology from a hospital electronic medical record platform;

extracting medical features from the electronic medical record data to obtain medical features and corresponding medical feature values, and preprocessing the medical features and the corresponding medical feature values;

and labeling the preprocessed medical features and corresponding medical feature values with the grade of kidney function decline and the prognosis risk classification to obtain training samples.

Optionally, in a fourth implementation manner of the first aspect of the present invention, the evaluating and screening of the prediction performance of each candidate model based on the training samples, and the selecting of the candidate model with the best prediction performance as the progress prediction model, includes:

inputting the training samples into each candidate model, and tuning the hyperparameters of each candidate model with a grid search algorithm;

calculating the area under the receiver operating characteristic (ROC) curve of each candidate model;

and selecting the candidate model with the best prediction performance as the progress prediction model according to the area under the ROC curve of each candidate model.
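As a rough illustration of the selection step, the sketch below computes the area under the ROC curve (AUC) with the rank-sum identity and picks the candidate with the largest AUC. The candidate names, labels, and scores are made up for the example, and the helper names `roc_auc` and `select_best_model` are hypothetical; the patent does not prescribe this implementation.

```python
from typing import Dict, Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a positive case outscores a negative one.

    Ties between a positive and a negative score count as one half.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in positives
        for n in negatives
    )
    return wins / (len(positives) * len(negatives))


def select_best_model(labels: Sequence[int], candidate_scores: Dict[str, Sequence[float]]) -> str:
    """Return the name of the candidate model with the largest AUC."""
    aucs = {name: roc_auc(labels, scores) for name, scores in candidate_scores.items()}
    return max(aucs, key=aucs.get)


# Hypothetical progression labels (1 = progressed) and scores from two candidates.
labels = [1, 1, 0, 0, 1, 0]
candidate_scores = {
    "model_a": [0.9, 0.8, 0.3, 0.4, 0.7, 0.2],
    "model_b": [0.6, 0.4, 0.5, 0.3, 0.7, 0.2],
}
best = select_best_model(labels, candidate_scores)
print(best)
```

In practice the scores would usually come from held-out data rather than the samples used for fitting; the text above only states that the training samples are used.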

Optionally, in a fifth implementation manner of the first aspect of the present invention, the evaluation index includes:

age, hemoglobin, N-terminal osteocalcin, 25-hydroxyvitamin D, total protein, glutamic pyruvic transaminase, apolipoprotein E, thyroxine, carbohydrate antigen 19-9, mean corpuscular hemoglobin concentration, unsaturated iron binding capacity, blood homocysteine, albumin, large platelet ratio, total bilirubin, creatine kinase isoenzyme, red blood cell distribution width, IgG antibodies, mean corpuscular volume, folic acid, sodium, urea nitrogen, plasma proteins, neuron-specific enolase, serum total complement activity, neutrophil count, total bile acid, chloride, glycated albumin, low density lipoprotein cholesterol, and non-high density lipoprotein cholesterol.

Optionally, in a sixth implementation manner of the first aspect of the present invention, after the generating a chronic kidney disease progression risk prediction result according to the risk score and the contribution value of each of the evaluation indexes to the risk score, the method further includes:

updating the importance weight of each evaluation index according to the contribution value of that evaluation index to the risk score.

Optionally, in a seventh implementation manner of the first aspect of the present invention, the preprocessing of the medical features and the corresponding medical feature values includes:

standardizing the medical features, unifying the codes, and converting categorical values into one-hot codes;

normalizing the medical feature values;

and detecting outliers in the normalized medical feature values with a clustering algorithm, checking the data for duplicates and class balance after removing extreme outliers, balancing the training samples with a random sampling method, and filling in missing data with a random forest algorithm.
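A minimal sketch of three of these preprocessing steps follows. It assumes min-max scaling for "normalizing" and random oversampling of the minority class for the "random sampling method"; the patent names neither choice, and all function names and values here are hypothetical. Outlier detection and random-forest imputation are left out.

```python
import random
from typing import Dict, List, Sequence, Tuple


def one_hot(values: Sequence[str]) -> Tuple[List[str], List[List[int]]]:
    """Convert categorical values into one-hot codes (columns in sorted category order)."""
    categories = sorted(set(values))
    return categories, [[1 if v == c else 0 for c in categories] for v in values]


def min_max_normalize(values: Sequence[float]) -> List[float]:
    """Scale values to [0, 1]; a constant column maps to all zeros."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def random_oversample(rows: Sequence[list], labels: Sequence[int], seed: int = 0):
    """Balance classes by resampling minority-class rows with replacement."""
    rng = random.Random(seed)
    by_class: Dict[int, List[list]] = {}
    for row, label in zip(rows, labels):
        by_class.setdefault(label, []).append(row)
    target = max(len(group) for group in by_class.values())
    out_rows, out_labels = [], []
    for label, group in by_class.items():
        extra = [rng.choice(group) for _ in range(target - len(group))]
        out_rows.extend(group + extra)
        out_labels.extend([label] * target)
    return out_rows, out_labels


# Hypothetical values for illustration only.
categories, encoded = one_hot(["F", "M", "F"])
normalized = min_max_normalize([100.0, 120.0, 140.0])
balanced_rows, balanced_labels = random_oversample([[0.1], [0.2], [0.3], [0.9]], [0, 0, 0, 1])
print(categories, encoded, normalized, balanced_labels)
```

Fixing the seed keeps the resampling reproducible, which matters when candidate models are later compared on the same balanced samples.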

In a second aspect, the present invention provides a chronic kidney disease progression risk prediction apparatus comprising:

the training module is used for training a progress prediction model based on historical medical record information of chronic kidney disease patients;

the scoring module is used for acquiring sample data of a patient to be predicted, and inputting the sample data into the progress prediction model for risk prediction to obtain a risk score;

the analysis module is used for calling a machine learning interpretation tool to evaluate the progress prediction model and the obtained risk score, obtaining a contribution value of each evaluation index in the sample data to the risk score;

and the result output module is used for generating a chronic kidney disease progression risk prediction result according to the risk score and the contribution value of each evaluation index to the risk score.

Optionally, in a first implementation manner of the second aspect of the present invention, the training module is specifically configured to: acquire historical medical record information of chronic kidney disease patients, and generate training samples based on the historical medical record information and the corresponding diagnosis results; construct a plurality of initial models based on a plurality of deep learning algorithms, and train each initial model on the training samples to obtain a plurality of trained candidate models; and evaluate and screen the prediction performance of each candidate model based on the training samples, and select the candidate model with the best prediction performance as the progress prediction model.

Optionally, in a second implementation manner of the second aspect of the present invention, the chronic kidney disease progression risk prediction apparatus further includes an evaluation index screening module, where the evaluation index screening module is specifically configured to: call a regularization-based sparse feature selection algorithm and a random-forest-based recursive feature elimination algorithm to construct a plurality of initial classification models; input the historical medical record information into the initial classification models, call a grid search algorithm to tune the hyperparameters of each initial classification model, and select the initial classification model with the largest area under the receiver operating characteristic (ROC) curve as the feature screening model; rank the medical features by importance according to the feature screening model to obtain a medical feature sequence; and screen out, from the medical feature sequence, the medical feature combinations whose importance weights are greater than a preset threshold to obtain the evaluation indexes.

Optionally, in a third implementation manner of the second aspect of the present invention, the acquiring of the historical medical record information of chronic kidney disease patients, and the generating of training samples based on the historical medical record information and the corresponding diagnosis results, includes: collecting electronic medical record data of pathologically confirmed chronic kidney disease patients from a hospital electronic medical record platform; extracting medical features from the electronic medical record data to obtain medical features and corresponding medical feature values, and preprocessing the medical features and the corresponding medical feature values; and labeling the preprocessed medical features and corresponding medical feature values with the grade of kidney function decline and the prognosis risk classification to obtain training samples.

Optionally, in a fourth implementation manner of the second aspect of the present invention, the evaluating and screening of the prediction performance of each candidate model based on the training samples, and the selecting of the candidate model with the best prediction performance as the progress prediction model, includes: inputting the training samples into each candidate model, and tuning the hyperparameters of each candidate model with a grid search algorithm; calculating the area under the ROC curve of each candidate model; and selecting the candidate model with the best prediction performance as the progress prediction model according to the area under the ROC curve of each candidate model.

Optionally, in a fifth implementation manner of the second aspect of the present invention, the evaluation index includes: age, hemoglobin, N-terminal osteocalcin, 25-hydroxyvitamin D, total protein, glutamic pyruvic transaminase, apolipoprotein E, thyroxine, carbohydrate antigen 19-9, mean corpuscular hemoglobin concentration, unsaturated iron binding capacity, blood homocysteine, albumin, large platelet ratio, total bilirubin, creatine kinase isoenzyme, red blood cell distribution width, IgG antibodies, mean corpuscular volume, folic acid, sodium, urea nitrogen, plasma proteins, neuron-specific enolase, serum total complement activity, neutrophil count, total bile acid, chloride, glycated albumin, low density lipoprotein cholesterol, and non-high density lipoprotein cholesterol.

Optionally, in a sixth implementation manner of the second aspect of the present invention, the chronic kidney disease progression risk prediction apparatus further includes an importance update module, where the importance update module is specifically configured to: update the importance weight of each evaluation index according to the contribution value of that evaluation index to the risk score.

Optionally, in a seventh implementation manner of the second aspect of the present invention, the preprocessing of the medical features and the corresponding medical feature values includes: standardizing the medical features, unifying the codes, and converting categorical values into one-hot codes; normalizing the medical feature values; and detecting outliers in the normalized medical feature values with a clustering algorithm, checking the data for duplicates and class balance after removing extreme outliers, balancing the training samples with a random sampling method, and filling in missing data with a random forest algorithm.

A third aspect of the present invention provides a chronic kidney disease progression risk prediction device, comprising: a memory and at least one processor, the memory having instructions stored therein; the at least one processor invokes the instructions in the memory to cause the chronic kidney disease progression risk prediction device to perform the steps of the chronic kidney disease progression risk prediction method described above.

A fourth aspect of the present invention provides a computer readable storage medium having instructions stored therein which, when run on a computer, cause the computer to perform the steps of the chronic kidney disease progression risk prediction method described above.

According to the technical scheme provided by the invention, a progress prediction model is trained based on historical medical record information of chronic kidney disease patients; sample data of a patient to be predicted are acquired and input into the progress prediction model for risk prediction to obtain a risk score; a machine learning interpretation tool is invoked to evaluate the progress prediction model and the obtained risk score, obtaining a contribution value of each evaluation index in the sample data to the risk score; and a chronic kidney disease progression risk prediction result is generated according to the risk score and the contribution value of each evaluation index to the risk score. The method can provide more accurate and effective prediction results for the progression risk of chronic kidney disease.

Drawings

FIG. 1 is a flow chart of an embodiment of a method for predicting risk of progression of chronic kidney disease according to an embodiment of the present invention;

FIG. 2 is a flow chart of another embodiment of a method for predicting risk of progression of chronic kidney disease according to an embodiment of the invention;

FIG. 3 is a schematic diagram of the receiver operating characteristic (ROC) curve at the feature evaluation step in another embodiment of a method for predicting risk of progression of chronic kidney disease according to an embodiment of the present invention;

FIG. 4 is a schematic view showing an apparatus for predicting risk of progression of chronic kidney disease according to an embodiment of the present invention;

FIG. 5 is a schematic view showing an embodiment of a chronic kidney disease progression risk prediction apparatus according to an embodiment of the present invention;

FIG. 6 is a schematic diagram of a computer readable medium according to an embodiment of the present invention.

Detailed Description

Exemplary embodiments of the present invention will now be described more fully with reference to the accompanying drawings. However, the exemplary embodiments can be embodied in many different forms and should not be construed as limited to the embodiments set forth herein. Rather, these exemplary embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the concept of the invention to those skilled in the art. The same reference numerals in the drawings denote the same or similar elements, components or portions, and thus a repetitive description thereof will be omitted.

The features, structures, characteristics or other details described in a particular embodiment may be combined in any suitable manner in one or more other embodiments without departing from the technical idea of the invention.

In the description of specific embodiments, features, structures, characteristics, or other details described in the present invention are provided to enable one skilled in the art to fully understand the embodiments. However, it is not excluded that one skilled in the art may practice the present invention without one or more of the specific features, structures, characteristics, or other details.

The flow diagrams depicted in the figures are exemplary only, and do not necessarily include all of the elements and operations/steps, nor must they be performed in the order described. For example, some operations/steps may be decomposed, and some operations/steps may be combined or partially combined, so that the order of actual execution may be changed according to actual situations.

The block diagrams depicted in the figures are merely functional entities and do not necessarily correspond to physically separate entities. That is, the functional entities may be implemented in software, or in one or more hardware modules or integrated circuits, or in different networks and/or processor devices and/or microcontroller devices.

The term "and/or" includes all combinations of any one or more of the associated listed items.

Referring to fig. 1, an embodiment of a method for predicting risk of progression of chronic kidney disease according to an embodiment of the present invention includes:

S101, training a progress prediction model based on historical medical record information of chronic kidney disease patients;

In this embodiment, a scheme for predicting the progression risk of chronic kidney disease (CKD) is specifically described. It is to be understood that the execution subject of the present invention may be a chronic kidney disease progression risk prediction device, or may be a terminal or a server, which is not limited herein. The embodiment of the invention is described by taking a server as the execution subject.

Before receiving a chronic kidney disease progression risk prediction request, the server trains a progress prediction model in advance based on historical medical record information of chronic kidney disease patients. The specific steps of training the progress prediction model include: acquiring historical medical record information of chronic kidney disease patients, and generating training samples based on the historical medical record information and the corresponding diagnosis results; constructing a plurality of initial models based on a plurality of deep learning algorithms, and training each initial model on the training samples to obtain a plurality of trained candidate models; inputting the training samples into each candidate model, and tuning the hyperparameters of each candidate model with a grid search algorithm; calculating the area under the receiver operating characteristic (ROC) curve of each candidate model; and selecting the candidate model with the best prediction performance as the progress prediction model according to the area under the ROC curve of each candidate model.
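The grid search step can be illustrated with a short, generic sketch: every combination of hyperparameter values is tried and the combination with the best score is kept. The hyperparameter names, the grid, and `placeholder_auc` (a stand-in for "train the candidate model with these settings and measure its AUC") are assumptions made for the example, not details from the patent.

```python
from itertools import product
from typing import Callable, Dict, List, Tuple


def grid_search(param_grid: Dict[str, List], evaluate: Callable[[dict], float]) -> Tuple[dict, float]:
    """Exhaustively evaluate every hyperparameter combination and keep the best one."""
    names = list(param_grid)
    best_params, best_score = {}, float("-inf")
    for combo in product(*(param_grid[name] for name in names)):
        params = dict(zip(names, combo))
        score = evaluate(params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score


def placeholder_auc(params: dict) -> float:
    # Hypothetical stand-in: a real pipeline would fit the candidate model with
    # `params` and return its measured AUC instead of this closed-form value.
    return 0.9 - (params["learning_rate"] - 0.1) ** 2 - 0.01 * (params["max_depth"] - 3) ** 2


param_grid = {"learning_rate": [0.01, 0.1, 1.0], "max_depth": [2, 3, 5]}
best_params, best_score = grid_search(param_grid, placeholder_auc)
print(best_params)
```

Running the same search once per candidate model yields one tuned model per algorithm, which can then be compared by AUC.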

S102, acquiring sample data of a patient to be predicted, and inputting the sample data into a progress prediction model to perform risk prediction to obtain a risk score;

After a chronic kidney disease progression risk prediction request is received, sample data of the patient to be predicted are acquired and input into the progress prediction model trained in the previous step to predict the patient's chronic kidney disease progression risk, yielding a risk score for the possible progression of the patient's chronic kidney disease. Specifically, in this embodiment, the pre-constructed progress prediction model calculates, from the specific values of each evaluation index in the sample data of the patient to be predicted and according to the hyperparameters tuned in the model, a risk score for the patient's possible future chronic kidney disease progression. The specific evaluation indexes may include:

age (Age), serum creatinine (CRE), N-terminal osteocalcin (NTX), hemoglobin (HGB), 25-hydroxyvitamin D (25OHD), total protein (TP), glutamic pyruvic transaminase (ALT), apolipoprotein E (APO-E), thyroxine (T4), carbohydrate antigen 19-9 (CA199), mean corpuscular hemoglobin concentration (MCHC), unsaturated iron binding capacity (UIBC), homocysteine (HCY), albumin (ALB), large platelet ratio (P-LCR), total bilirubin (TBIL), creatine kinase isoenzyme (CK-MB), red blood cell distribution width (RDW-SD), IgG antibodies (IGG), mean corpuscular volume (MCV), folic acid (FOL), sodium (NA), urea nitrogen (BUN), plasma protein (GLO), neuron-specific enolase (NSE), serum total complement activity (CH50), neutrophil count (NEUT), total bile acid (TBA), chloride (CL), glycated albumin (GA), low density lipoprotein cholesterol (LDL-C), and non-high density lipoprotein cholesterol (non-HDL-C).

S103, calling a machine learning interpretation tool to evaluate the progress prediction model and the obtained risk score, and obtaining a contribution value of each evaluation index in the sample data to the risk score;

S104, generating a chronic kidney disease progression risk prediction result according to the risk score and the contribution value of each evaluation index to the risk score.

In this embodiment, the prediction process of the progress prediction model for chronic kidney disease risk is analyzed in advance with the SHAP (SHapley Additive exPlanations) interpretability algorithm. The analysis includes calculating SHAP values, which give a compact global view of the model, to explain the contribution of each medical evaluation index to the CKD risk prediction. Specifically, in this embodiment, the machine learning interpretation tool calculates the contribution value of each evaluation index to the risk prediction, so that the evaluation indexes that most influence the prediction result, when the current patient to be predicted is judged to have a certain level of chronic kidney disease progression risk, can be identified.
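To make the idea of a per-index contribution value concrete, the sketch below computes exact Shapley values by enumerating every ordering of the features. It is not the SHAP library and not the patent's implementation: the toy risk function, the baseline patient used to represent "absent" features, and all numbers are assumptions chosen for illustration, and exact enumeration is only feasible for a handful of features.

```python
from itertools import permutations
from math import factorial
from typing import Callable, Dict


def shapley_values(model: Callable[[dict], float], sample: Dict[str, float],
                   baseline: Dict[str, float]) -> Dict[str, float]:
    """Average each feature's marginal effect over all feature orderings."""
    names = list(sample)
    totals = {name: 0.0 for name in names}
    for order in permutations(names):
        current = dict(baseline)          # start from the baseline patient
        previous = model(current)
        for name in order:
            current[name] = sample[name]  # switch one feature to its real value
            value = model(current)
            totals[name] += value - previous
            previous = value
    n_orders = factorial(len(names))
    return {name: total / n_orders for name, total in totals.items()}


def toy_risk(x: Dict[str, float]) -> float:
    # Hypothetical risk function with one interaction term, for illustration only.
    return (0.01 * x["age"] - 0.005 * x["hemoglobin"] + 0.02 * x["urea_nitrogen"]
            + 0.0001 * x["age"] * x["urea_nitrogen"])


sample = {"age": 70, "hemoglobin": 100, "urea_nitrogen": 12}
baseline = {"age": 50, "hemoglobin": 130, "urea_nitrogen": 6}
contributions = shapley_values(toy_risk, sample, baseline)
print(contributions)
```

The contributions add up to the difference between the patient's score and the baseline score, which is the property that lets a risk score be broken down index by index.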

A chronic kidney disease progression risk prediction result is then generated based on the predicted risk score for the chronic kidney disease progression risk of the patient to be predicted and the contribution value of each evaluation index.
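One possible shape for such a result is sketched below: the risk score together with the evaluation indexes ranked by the magnitude of their contribution. The result layout, the field names, and the example numbers are hypothetical; the patent does not specify an output format.

```python
from typing import Dict


def build_prediction_result(risk_score: float, contributions: Dict[str, float], top_n: int = 3) -> dict:
    """Combine the risk score with the most influential evaluation indexes."""
    ranked = sorted(contributions.items(), key=lambda item: abs(item[1]), reverse=True)
    top = []
    for name, value in ranked[:top_n]:
        if value > 0:
            direction = "raises risk"
        elif value < 0:
            direction = "lowers risk"
        else:
            direction = "neutral"
        top.append({"index": name, "contribution": value, "direction": direction})
    return {"risk_score": risk_score, "top_contributors": top}


# Hypothetical risk score and contribution values for one patient.
result = build_prediction_result(
    0.72,
    {"age": 0.12, "hemoglobin": 0.25, "sodium": -0.02, "albumin": -0.18},
    top_n=2,
)
print(result)
```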

In this embodiment, after the chronic kidney disease progression risk prediction result is obtained, the importance weight of each evaluation index may be updated according to its contribution value to the risk score, so that the predictions of the progress prediction model become more accurate.
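The text does not say how the importance weights are updated. One conceivable rule, shown purely as an assumption, blends each old weight with the index's share of the total absolute contribution; the function name, the `learning_rate` parameter, and the numbers are all hypothetical.

```python
from typing import Dict


def update_importance_weights(weights: Dict[str, float], contributions: Dict[str, float],
                              learning_rate: float = 0.1) -> Dict[str, float]:
    """Move each weight a small step toward the index's share of absolute contribution."""
    total = sum(abs(value) for value in contributions.values())
    if total == 0:
        return dict(weights)  # nothing to learn from an all-zero explanation
    updated = {}
    for name, weight in weights.items():
        share = abs(contributions.get(name, 0.0)) / total
        updated[name] = (1 - learning_rate) * weight + learning_rate * share
    return updated


# Hypothetical weights and contribution values for two evaluation indexes.
weights = {"age": 0.5, "hemoglobin": 0.5}
new_weights = update_importance_weights(weights, {"age": 0.3, "hemoglobin": -0.1})
print(new_weights)
```

With weights that sum to one and contributions covering the same indexes, this rule keeps the total at one while shifting weight toward the indexes that drove the score.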

The method provided by the embodiment of the invention can predict the progression risk of the chronic kidney disease based on the effective evaluation indexes, can output the influence information of each evaluation index on the progression risk, and can provide a more accurate, effective and reliable prediction result for the progression risk of the chronic kidney disease.

Referring to fig. 2 and 3, another embodiment of the method for predicting risk of progression of chronic kidney disease according to the present invention includes:

S201, acquiring historical medical record information of chronic kidney disease patients, and screening evaluation indexes based on the historical medical record information;

In this step, the effective evaluation indexes contained in the historical medical record information are first screened by an artificial intelligence algorithm. Electronic medical record data are prepared first: electronic medical records of pathologically confirmed CKD patients are acquired from a hospital electronic medical record platform, medical features are extracted from the electronic medical record data to obtain medical features and corresponding medical feature values, and the medical features and corresponding medical feature values are preprocessed; the feature data of the resulting clinical big data of CKD patients are then normalized.

In a specific embodiment, the step of extracting medical features from the electronic medical record data to obtain medical features and corresponding medical feature values may specifically include: calling a regularized feature sparsification algorithm and a random-forest-based recursive feature elimination algorithm to construct a plurality of initial classification models; inputting the historical medical record information into each initial classification model, calling a grid search algorithm to tune the hyperparameters of the initial classification model, and selecting the initial classification model with the largest area under the receiver operating characteristic curve as the feature screening model; ranking the medical features by importance according to the feature screening model to obtain a medical feature sequence; and screening the medical feature combination whose importance weights are greater than a preset threshold from the medical feature sequence to obtain the evaluation indexes. The specific evaluation indexes are basically the same as those in the previous embodiment and are not described again here. Referring to the receiver operating characteristic curves of the feature evaluation step shown in fig. 3, the area under the curve on the test set stays above approximately 0.90 when the number of features in the combination is kept at a certain level, which indicates that the feature screening model in this step is effective and reasonable to some extent.
Specifically, features are deleted recursively by a recursive feature elimination algorithm, a random forest model is constructed from the remaining features, and the feature combination that contributes most to the prediction result is determined from the AUC (Area Under the Curve) of the model's ROC (Receiver Operating Characteristic) curve.
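The recursive elimination loop described above can be sketched as follows. This is a minimal illustration, not the implementation of the embodiment: `fit_and_score` is a hypothetical callback that would train a random forest on the given feature subset and return its validation AUC together with per-feature importances, and `toy_fit_and_score` is an invented stand-in used only to make the sketch runnable.

```python
from typing import Callable, Dict, List, Sequence, Tuple


def recursive_feature_elimination(
    features: Sequence[str],
    fit_and_score: Callable[[List[str]], Tuple[float, Dict[str, float]]],
    min_features: int = 1,
) -> Tuple[List[str], float]:
    """Drop the least important feature each round; keep the best-AUC subset.

    fit_and_score(subset) is a hypothetical callback returning
    (validation AUC, {feature: importance}) for a model trained on `subset`.
    """
    current = list(features)
    best_subset, best_auc = list(current), float("-inf")
    while len(current) >= min_features:
        auc, importances = fit_and_score(current)
        if auc > best_auc:
            best_subset, best_auc = list(current), auc
        if len(current) == min_features:
            break
        # Remove the feature the current model relies on least.
        weakest = min(current, key=lambda name: importances[name])
        current.remove(weakest)
    return best_subset, best_auc


# Toy stand-in for "train a random forest and measure AUC": two features are
# informative, and every extra noise feature costs a little AUC.
SIGNAL = {"age": 0.5, "hemoglobin": 0.3}


def toy_fit_and_score(subset: List[str]) -> Tuple[float, Dict[str, float]]:
    importances = {name: SIGNAL.get(name, 0.01) for name in subset}
    informative = sum(SIGNAL.get(name, 0.0) for name in subset)
    noise = sum(1 for name in subset if name not in SIGNAL)
    return 0.5 + 0.5 * informative - 0.02 * noise, importances


selected, selected_auc = recursive_feature_elimination(
    ["age", "hemoglobin", "noise_a", "noise_b"], toy_fit_and_score
)
```

In this toy run the two noise features are eliminated first, and the subset with the highest recorded AUC is the one returned.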

In a specific embodiment, the step of normalizing the feature data of the obtained clinical big data of the CKD patients may specifically be performed as follows: standardizing the medical feature values, in particular unifying symbols, letters, characters and medical codes, and converting categorical values into one-hot codes; normalizing the medical feature values so that the algorithm converges; detecting outliers with a k-means clustering algorithm, assessing the weighting and balance of the data after the extreme outliers are removed, and balancing the training sample data by random sampling; and finally filling in the missing data with a random forest algorithm.
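Three of the preprocessing operations named above (one-hot coding, normalization, and balancing by random sampling) can be sketched in plain Python. The sketch makes two assumptions that the text does not state: normalization is min-max scaling to [0, 1], and balancing is done by random oversampling of the smaller classes. All function names are illustrative.

```python
import random
from collections import Counter
from typing import Dict, List, Sequence, Tuple


def one_hot(values: Sequence[str]) -> List[Dict[str, int]]:
    """Turn a categorical column into one indicator column per category."""
    categories = sorted(set(values))
    return [{f"is_{c}": int(v == c) for c in categories} for v in values]


def min_max_normalize(values: Sequence[float]) -> List[float]:
    """Scale a numeric column to [0, 1] (assumed normalization scheme)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def random_oversample(
    rows: Sequence[list], labels: Sequence[str], seed: int = 0
) -> Tuple[List[list], List[str]]:
    """Duplicate random minority-class rows until all classes are equal in size."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_rows, out_labels = list(rows), list(labels)
    for label, count in counts.items():
        pool = [r for r, l in zip(rows, labels) if l == label]
        for _ in range(target - count):
            out_rows.append(rng.choice(pool))
            out_labels.append(label)
    return out_rows, out_labels


sex_columns = one_hot(["male", "female", "male"])
scaled_age = min_max_normalize([10.0, 20.0, 30.0])
balanced_rows, balanced_labels = random_oversample(
    [[1], [2], [3]], ["low", "low", "high"]
)
```

Outlier detection by k-means and random-forest imputation are omitted here because they depend on a model library.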

S202, generating a training sample based on historical medical record information and corresponding diagnosis results;

After the data preprocessing, the medical features and corresponding medical feature values are labeled with the level of kidney function decline and the prognosis risk classification to obtain the training samples.

For a specific training sample, referring to table 1, the historical medical record information can be labeled and classified in this step according to the eGFR (Estimated Glomerular Filtration Rate) and ACR (urinary Albumin-to-Creatinine Ratio, also called the urinary microalbumin-to-creatinine ratio) levels it contains, so as to obtain the prognosis risk classification of the kidney disease. According to the latest international guidelines for chronic kidney disease, the disease is staged along two dimensions, namely:

(1) The glomerular filtration rate is divided into the following stages:

Stage G1: GFR ≥ 90 ml/(min·1.73 m²), indicating normal or increased kidney function;

Stage G2: GFR 60-89 ml/(min·1.73 m²), indicating a slight decrease in kidney function;

Stage G3a: GFR 45-59 ml/(min·1.73 m²), indicating a slight to moderate decrease in kidney function;

Stage G3b: GFR 30-44 ml/(min·1.73 m²), indicating a moderate to severe decrease in kidney function;

Stage G4: GFR 15-29 ml/(min·1.73 m²), indicating a severe decrease in kidney function;

Stage G5: GFR < 15 ml/(min·1.73 m²), indicating renal failure.

(2) Albuminuria is divided into the following 3 stages:

Stage A1: urine microalbumin < 30 mg/d, indicating a normal or slightly elevated urine microalbumin level;

Stage A2: urine microalbumin 30-299 mg/d, indicating a moderately elevated urine microalbumin level;

Stage A3: urine microalbumin ≥ 300 mg/d, indicating a severely elevated urine microalbumin level.

Such staging allows the disease to be judged more clearly and accurately.
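The two staging rules above translate directly into threshold checks. The following sketch uses the cut-offs listed in the text; the function names are illustrative, and the treatment of non-integer values (for example, a GFR of 59.5 falling into stage G3a) is an assumption, since the text gives integer ranges only.

```python
def gfr_stage(gfr: float) -> str:
    """Return the G stage for a GFR value in ml/(min·1.73 m²)."""
    if gfr < 0:
        raise ValueError("GFR cannot be negative")
    # Thresholds are checked from the highest stage boundary downwards.
    for threshold, stage in ((90, "G1"), (60, "G2"), (45, "G3a"), (30, "G3b"), (15, "G4")):
        if gfr >= threshold:
            return stage
    return "G5"


def albuminuria_stage(albumin_mg_per_day: float) -> str:
    """Return the A stage for urine microalbumin in mg/d."""
    if albumin_mg_per_day < 0:
        raise ValueError("urine microalbumin cannot be negative")
    if albumin_mg_per_day < 30:
        return "A1"
    if albumin_mg_per_day < 300:
        return "A2"
    return "A3"


example_stage = albuminuria_stage(120) + gfr_stage(52)  # combined label, e.g. for table lookup
```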

Table 1 training sample classification table

In table 1, training samples classified as A1G1 and A1G2 are labeled as low risk; training samples classified as A1G3a, A2G1 and A2G2 are labeled as medium risk; training samples classified as A1G3b, A2G3a, A3G1 and A3G2 are labeled as high risk; and training samples classified as A1G4, A1G5, A2G3b, A2G4, A2G5, A3G3a, A3G3b, A3G4 and A3G5 are labeled as extremely high risk. The numbers in the table are the sample counts under each classification in one practical implementation.
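The labeling rule of table 1 can be written as a small lookup. In this sketch the low, medium and high risk cells are listed explicitly as in the text, and every remaining valid combination is treated as extremely high risk; the names are illustrative.

```python
A_STAGES = ("A1", "A2", "A3")
G_STAGES = ("G1", "G2", "G3a", "G3b", "G4", "G5")

LOW = {("A1", "G1"), ("A1", "G2")}
MEDIUM = {("A1", "G3a"), ("A2", "G1"), ("A2", "G2")}
HIGH = {("A1", "G3b"), ("A2", "G3a"), ("A3", "G1"), ("A3", "G2")}


def risk_label(a_stage: str, g_stage: str) -> str:
    """Map an (albuminuria stage, GFR stage) pair to its prognosis risk label."""
    if a_stage not in A_STAGES or g_stage not in G_STAGES:
        raise ValueError(f"unknown stage combination: {a_stage}{g_stage}")
    key = (a_stage, g_stage)
    if key in LOW:
        return "low"
    if key in MEDIUM:
        return "medium"
    if key in HIGH:
        return "high"
    return "extremely high"  # all remaining cells of the 3 x 6 grid


extremely_high_cells = [
    a + g for a in A_STAGES for g in G_STAGES if risk_label(a, g) == "extremely high"
]
```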

S203, constructing a plurality of initial models based on a plurality of deep learning algorithms, and training each initial model according to a training sample to obtain a plurality of trained candidate models;

S204, inputting the training samples into each candidate model, and tuning the hyperparameters of each candidate model with a grid search algorithm;

S205, calculating the area under the receiver operating characteristic curve of each candidate model;

S206, selecting the candidate model with the best prediction effect as the progress prediction model according to the area under the receiver operating characteristic curve of each candidate model;

In this embodiment, a plurality of machine learning models are trained on the data set formed by the training samples obtained in the previous step, and the best of the trained models is selected to obtain a multi-class chronic kidney disease risk prediction model.
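The area under the receiver operating characteristic curve used in steps S205 and S206 can be computed without plotting the curve, as the fraction of positive-negative pairs that the model ranks correctly. The sketch below shows this for the binary case and then averages one-vs-rest AUCs over the classes; how the AUC is aggregated over the four risk classes is not stated in the text, so the macro average is an assumption.

```python
from typing import List, Sequence


def binary_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the share of (positive, negative) pairs ranked correctly; ties count half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative sample")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def macro_ovr_auc(
    labels: Sequence[str], prob_rows: Sequence[Sequence[float]], classes: Sequence[str]
) -> float:
    """Average of one-vs-rest AUCs; prob_rows[i][j] is the score of class j for sample i."""
    aucs: List[float] = []
    for idx, cls in enumerate(classes):
        one_vs_rest = [1 if y == cls else 0 for y in labels]
        aucs.append(binary_auc(one_vs_rest, [row[idx] for row in prob_rows]))
    return sum(aucs) / len(aucs)


example_auc = binary_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
perfect_auc = macro_ovr_auc(
    ["low", "high", "low"],
    [[0.9, 0.1], [0.2, 0.8], [0.7, 0.3]],
    ["low", "high"],
)
```

The pairwise form is quadratic in the number of samples, which is acceptable for a sketch; a rank-based formula would be preferable on large data sets.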

Specifically, the initial machine learning models may be built with the gradient boosting (Gradient Boosting Classifier, GBC), k-nearest neighbor (k-Nearest Neighbor, KNN), multi-layer perceptron (Multilayer Perceptron, MLP), and random forest (Random Forest, RF) algorithms, respectively.

The training samples obtained in the previous steps are randomly divided into 5 parts; 4 parts are used as the training data set for constructing the models, and the remaining 1 part is used as the test data set. A plurality of initial machine learning models are constructed, their hyperparameters are tuned by grid search, and the model with the highest ROC AUC is selected and saved as the progress prediction model.
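The split, tuning and selection procedure can be sketched as below. The sketch is library-free: `evaluate` is a hypothetical callback that would train one candidate model with the given hyperparameters on the 4 training parts and return its AUC on the held-out part, and the toy callbacks and grids at the end are invented values used only to exercise the code.

```python
import itertools
import random
from typing import Callable, Dict, List, Sequence, Tuple

Params = Dict[str, object]


def split_five_parts(n_samples: int, test_part: int = 0, seed: int = 0) -> Tuple[List[int], List[int]]:
    """Shuffle sample indices, cut them into 5 parts, and hold one part out for testing."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    parts = [indices[i::5] for i in range(5)]
    train = [i for p, part in enumerate(parts) if p != test_part for i in part]
    return train, parts[test_part]


def grid_search(param_grid: Dict[str, Sequence], evaluate: Callable[[Params], float]) -> Tuple[Params, float]:
    """Try every hyperparameter combination and keep the best-scoring one."""
    names = sorted(param_grid)
    best_params, best_score = {}, float("-inf")
    for combo in itertools.product(*(param_grid[name] for name in names)):
        params = dict(zip(names, combo))
        score = evaluate(params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score


def select_best_model(candidates: Dict[str, Tuple[Dict[str, Sequence], Callable[[Params], float]]]):
    """Tune each candidate, then return (name, best params, best AUC) of the winner."""
    tuned = {name: grid_search(grid, ev) for name, (grid, ev) in candidates.items()}
    winner = max(tuned, key=lambda name: tuned[name][1])
    return winner, tuned[winner][0], tuned[winner][1]


train_idx, test_idx = split_five_parts(10)

# Toy evaluators standing in for "fit on train_idx, score AUC on test_idx".
toy_candidates = {
    "KNN": ({"n_neighbors": [3, 5, 7]}, lambda p: 0.80 - 0.01 * abs(p["n_neighbors"] - 5)),
    "RF": ({"n_estimators": [100, 200]}, lambda p: 0.90 if p["n_estimators"] == 200 else 0.85),
}
best_name, best_params, best_auc = select_best_model(toy_candidates)
```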

S207, acquiring sample data of a patient to be predicted, and inputting the sample data into a progress prediction model for risk prediction to obtain a risk score;

After a chronic kidney disease progression risk prediction request is received, the sample data of the patient to be predicted are acquired and input into the progress prediction model trained in the previous steps, which predicts the patient's risk of chronic kidney disease progression and yields a risk score for the possible progression.

In a specific embodiment, the risk prediction specifically predicts the risk that the chronic kidney disease of the patient to be predicted progresses within a target prediction time. The inputs of the model are the sample data of the patient to be predicted and the target prediction time, and the outputs of the model are the risk level of chronic kidney disease progression within the target prediction time and the risk likelihood score corresponding to that risk level.

S208, calling a machine learning interpretation tool to evaluate the progress prediction model and the obtained risk score, and obtaining a contribution value of each evaluation index in the sample data to the risk score;

S209, generating a chronic kidney disease progression risk prediction result according to the risk score and the contribution value of each evaluation index to the risk score;

The chronic kidney disease progression risk prediction result described in this embodiment includes the risk level output by the progress prediction model, the risk likelihood score corresponding to the risk level, and the contribution value of each evaluation index to the risk score. Specifically, in this step, the chronic kidney disease progression risk prediction result is generated from the predicted risk score of the patient to be predicted and the contribution value of each evaluation index, and is then displayed.
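The contribution values of step S208 can be illustrated with Shapley-value attribution, which is one common basis for machine learning interpretation tools; that it is the tool used in the embodiment is an assumption. The sketch computes exact Shapley values by enumerating feature subsets, replacing absent features with baseline values. The linear `toy_risk` model, the sample and the baseline are invented for illustration.

```python
from itertools import combinations
from math import factorial
from typing import Callable, Dict

Sample = Dict[str, float]


def shapley_values(predict: Callable[[Sample], float], sample: Sample, baseline: Sample) -> Dict[str, float]:
    """Exact Shapley contribution of each feature to predict(sample) - predict(baseline)."""
    names = list(sample)
    n = len(names)

    def value(present: set) -> float:
        # Features outside `present` are replaced by their baseline value.
        mixed = {k: (sample[k] if k in present else baseline[k]) for k in names}
        return predict(mixed)

    contributions: Dict[str, float] = {}
    for name in names:
        others = [k for k in names if k != name]
        total = 0.0
        for size in range(n):
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                present = set(subset)
                total += weight * (value(present | {name}) - value(present))
        contributions[name] = total
    return contributions


def toy_risk(x: Sample) -> float:
    """Invented linear risk score: older age raises it, higher hemoglobin lowers it."""
    return 0.01 * x["age"] - 0.002 * x["hemoglobin"]


patient = {"age": 70.0, "hemoglobin": 90.0}
reference = {"age": 50.0, "hemoglobin": 130.0}
contribution = shapley_values(toy_risk, patient, reference)
```

The contributions sum to the difference between the patient's score and the baseline score, which is what allows them to be reported as a breakdown of the risk score. Exact enumeration is exponential in the number of features, so practical tools approximate it.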

S210, updating the importance weight of the evaluation index according to the contribution value of the evaluation index to the risk score.
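The text does not specify the update rule of step S210. Purely as an illustration, the following sketch blends each existing importance weight with the index's share of the absolute contribution values and renormalizes; the blending rule, the `rate` parameter and the function name are all assumptions.

```python
from typing import Dict


def update_importance_weights(
    weights: Dict[str, float], contributions: Dict[str, float], rate: float = 0.1
) -> Dict[str, float]:
    """Move each weight a fraction `rate` towards the index's share of total |contribution|."""
    total = sum(abs(v) for v in contributions.values())
    if total == 0:
        return dict(weights)  # nothing to learn from an all-zero explanation
    updated = {
        name: (1 - rate) * w + rate * abs(contributions.get(name, 0.0)) / total
        for name, w in weights.items()
    }
    norm = sum(updated.values())
    if norm == 0:
        return dict(weights)
    return {name: w / norm for name, w in updated.items()}


new_weights = update_importance_weights(
    {"age": 0.5, "hemoglobin": 0.5}, {"age": 0.2, "hemoglobin": 0.08}
)
```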

In another specific implementation of this embodiment, based on the foregoing solution, a system analysis platform is built with Docker container virtualization technology as an auxiliary tool for clinical chronic kidney disease progression risk analysis. Specifically, a MySQL database is used to store system data, and the Web server technologies include server technology, Common Gateway Interface (CGI) technology, and PHP (Hypertext Preprocessor) technology. The construction process comprises the following steps:

Step one: constructing the server. An Apache server is used, the web page files are placed under the default www folder of Apache, and remote access is provided over the external network;

Step two: web page design and implementation. Asynchronous JavaScript and XML (Ajax) technology is adopted, so that part of the web page can be updated after a task number is entered and the system prediction result is displayed;

Step three: constructing the database. A MySQL database is used to store the medical features and system prediction results of the patient to be predicted;

Step four: constructing the analysis module. The clinical prediction analysis service of the system is triggered asynchronously through PHP pipe technology.
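The interplay of task numbers, asynchronous analysis and stored results can be sketched as follows. The platform described above uses PHP and MySQL; this sketch uses Python's built-in sqlite3 module as a stand-in so that it runs without a database server, and the table name, column names and function names are hypothetical.

```python
import sqlite3
from typing import Dict, Optional


def open_store() -> sqlite3.Connection:
    """Create an in-memory stand-in for the results database."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE prediction_task ("
        " task_no TEXT PRIMARY KEY,"
        " status TEXT NOT NULL,"
        " risk_level TEXT,"
        " risk_score REAL)"
    )
    return conn


def submit_task(conn: sqlite3.Connection, task_no: str) -> None:
    """Register a task; the analysis service would pick it up asynchronously."""
    conn.execute("INSERT INTO prediction_task (task_no, status) VALUES (?, 'pending')", (task_no,))
    conn.commit()


def complete_task(conn: sqlite3.Connection, task_no: str, risk_level: str, risk_score: float) -> None:
    """Store the prediction result once the analysis service has finished."""
    conn.execute(
        "UPDATE prediction_task SET status = 'done', risk_level = ?, risk_score = ? WHERE task_no = ?",
        (risk_level, risk_score, task_no),
    )
    conn.commit()


def fetch_result(conn: sqlite3.Connection, task_no: str) -> Optional[Dict[str, object]]:
    """What the page would request when the user enters a task number."""
    row = conn.execute(
        "SELECT status, risk_level, risk_score FROM prediction_task WHERE task_no = ?", (task_no,)
    ).fetchone()
    if row is None:
        return None
    return {"status": row[0], "risk_level": row[1], "risk_score": row[2]}


store = open_store()
submit_task(store, "T0001")
pending = fetch_result(store, "T0001")
complete_task(store, "T0001", "high", 0.82)
finished = fetch_result(store, "T0001")
```

Keeping a status column lets the page poll by task number and show the result only once the asynchronous analysis has written it.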

The embodiment of the invention can predict the progression risk of the chronic kidney disease based on the effective evaluation indexes, can output the influence information of each evaluation index on the progression risk, and can provide a more accurate, effective and reliable prediction result for the progression risk of the chronic kidney disease.

The method for predicting the risk of chronic kidney disease progression in the embodiment of the present invention has been described above; the chronic kidney disease progression risk prediction apparatus in the embodiment of the present invention is described below. Referring to fig. 4, one embodiment of the chronic kidney disease progression risk prediction apparatus includes:

The training module 401 is configured to train to obtain a progress prediction model based on the history medical record information of the chronic kidney disease patient;

the scoring module 402 is configured to obtain sample data of a patient to be predicted, input the sample data into the progress prediction model for risk prediction, and obtain a risk score;

the analysis module 403 is configured to invoke a machine learning interpretation tool to evaluate the progress prediction model and the obtained risk score, so as to obtain a contribution value of each evaluation index in the sample data to the risk score;

and a result output module 404, configured to generate a chronic kidney disease progression risk prediction result according to the risk score and the contribution value of each evaluation index to the risk score.
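The cooperation of modules 401 to 404 can be sketched as a small class in which each module is a pluggable function. This is only a structural illustration: the class name, the callable signatures and the toy `train` and `explain` functions are assumptions and do not reflect the actual apparatus.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional

Sample = Dict[str, float]
Model = Callable[[Sample], float]


@dataclass
class RiskPredictionApparatus:
    train: Callable[[List[Sample]], Model]                # training module 401
    explain: Callable[[Model, Sample], Dict[str, float]]  # analysis module 403
    model: Optional[Model] = None

    def fit(self, records: List[Sample]) -> None:
        """Training module 401: build the progress prediction model."""
        self.model = self.train(records)

    def predict(self, sample: Sample) -> Dict[str, object]:
        if self.model is None:
            raise RuntimeError("call fit() before predict()")
        score = self.model(sample)                         # scoring module 402
        contributions = self.explain(self.model, sample)   # analysis module 403
        # Result output module 404: combine score and contributions.
        return {"risk_score": score, "contributions": contributions}


def toy_train(records: List[Sample]) -> Model:
    """Invented 'model': risk grows with age, capped at 1.0; ignores the records."""
    return lambda s: min(1.0, s["age"] / 100)


def toy_explain(model: Model, sample: Sample) -> Dict[str, float]:
    """Invented explanation: attribute the whole score to the single feature."""
    return {"age": model(sample)}


apparatus = RiskPredictionApparatus(train=toy_train, explain=toy_explain)
apparatus.fit([{"age": 60.0}])
result = apparatus.predict({"age": 70.0})
```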

In another embodiment of the present application, the training module is specifically configured to: acquiring history medical record information of a chronic kidney disease patient, and generating a training sample based on the history medical record information and a corresponding diagnosis result;

In another embodiment of the present application, the chronic kidney disease progression risk prediction apparatus further includes an evaluation index screening module, where the evaluation index screening module is specifically configured to:

In another embodiment of the present application, the obtaining the history information of the chronic kidney disease patient, and generating the training sample based on the history information and the corresponding diagnosis result includes:

In another embodiment of the present application, the performing evaluation screening on the prediction effect of each candidate model based on the training samples, and selecting the candidate model with the optimal prediction effect as the progress prediction model includes:

In another embodiment of the present application, the evaluation index includes: age, hemoglobin, N-terminal osteocalcin, 25-hydroxyvitamin D, total protein, glutamic-pyruvic transaminase, apolipoprotein E, thyroxine, carbohydrate antigen 19-9, mean red blood cell hemoglobin concentration, unsaturated iron-binding capacity, blood homocysteine, albumin, major platelet fraction, total bilirubin, creatine kinase isoenzyme, red blood cell distribution width, IgG antibodies, mean red blood cell volume, folic acid, sodium, urea nitrogen, plasma proteins, neuron-specific enolase, serum total complement activity, neutrophil count, total bile acid, chloride, glycated albumin, low-density lipoprotein cholesterol, and non-high-density lipoprotein cholesterol.

In another embodiment of the present application, the chronic kidney disease progression risk prediction apparatus further comprises an importance update module, wherein the importance update module is specifically configured to:

In another embodiment of the present application, the data preprocessing of the medical feature and the corresponding medical feature value includes:

normalizing the medical characteristic value;

The embodiment of the invention can predict the progression risk of chronic kidney disease based on the effective evaluation indexes, can output the influence of each evaluation index on the progression risk, and can provide a more accurate, effective and reliable prediction result for the progression risk of chronic kidney disease. After the chronic kidney disease progression risk prediction result is obtained, the importance weight of each evaluation index can be updated according to its contribution value to the risk score, so that the prediction of the progress prediction model becomes more accurate.

The chronic kidney disease progression risk prediction apparatus in the embodiment of the present invention is described in detail from the point of view of the modularized functional entity in fig. 4 above, and based on the same inventive concept, the embodiment of the present invention further provides a chronic kidney disease progression risk prediction device, and the chronic kidney disease progression risk prediction device in the embodiment of the present invention is described in detail from the point of view of hardware processing below.

Fig. 5 is a schematic structural diagram of an electronic device according to an embodiment of the present disclosure. An electronic device 500 according to this embodiment of the present invention is described below with reference to fig. 5. The electronic device 500 shown in fig. 5 is merely an example, and should not be construed as limiting the functionality and scope of use of embodiments of the present invention.

As shown in fig. 5, the electronic device 500 takes the form of a general purpose computing device. The components of the electronic device 500 may include, but are not limited to: at least one processing unit 510, at least one storage unit 520, a bus 530 connecting the different system components (including the storage unit 520 and the processing unit 510), a display unit 540, etc.

The storage unit 520 stores program code that is executable by the processing unit 510, such that the processing unit 510 performs the steps according to various exemplary embodiments of the invention described in the above processing method section of the present specification. For example, the processing unit 510 may perform the steps shown in fig. 1.

The storage unit 520 may include readable media in the form of volatile storage units, such as a random access memory (RAM) 5201 and/or a cache memory 5202, and may further include a read-only memory (ROM) 5203.

The storage unit 520 may also include a program/utility 5204 having a set (at least one) of program modules 5205, such program modules 5205 including, but not limited to: an operating system, one or more application programs, other program modules, and program data, each or some combination of which may include an implementation of a network environment.

Bus 530 may be one or more of several types of bus structures including a memory unit bus or memory unit controller, a peripheral bus, an accelerated graphics port, a processing unit, or a local bus using any of a variety of bus architectures.

The electronic device 500 may also communicate with one or more external devices 100 (e.g., keyboard, pointing device, bluetooth device, etc.), one or more devices that enable a user to interact with the electronic device 500, and/or any device (e.g., router, modem, etc.) that enables the electronic device 500 to communicate with one or more other computing devices. Such communication may occur through an input/output (I/O) interface 550. Also, electronic device 500 may communicate with one or more networks such as a Local Area Network (LAN), a Wide Area Network (WAN), and/or a public network, such as the Internet, through network adapter 560. The network adapter 560 may communicate with other modules of the electronic device 500 via the bus 530. It should be appreciated that although not shown in fig. 5, other hardware and/or software modules may be used in connection with electronic device 500, including, but not limited to: microcode, device drivers, redundant processing units, external disk drive arrays, RAID systems, tape drives, data backup storage systems, and the like.

From the above description of the embodiments, those skilled in the art will readily appreciate that the exemplary embodiments described herein may be implemented in software, or in software combined with the necessary hardware. Thus, the technical solution according to the embodiments of the present invention may be embodied in the form of a software product, which may be stored in a computer readable storage medium (such as a CD-ROM, a USB flash drive, or a removable hard disk) or on a network, and which includes several instructions that cause a computing device (such as a personal computer, a server, or a network device) to perform the above method according to the present invention. When the computer program is executed by a data processing device, the above method of the present invention, for example the method shown in fig. 1 or fig. 2, is carried out.

Fig. 6 is a schematic diagram of a computer readable medium according to an embodiment of the present disclosure.

A computer program implementing the method shown in fig. 1 or fig. 2 may be stored on one or more computer readable media. The computer readable medium may be a readable signal medium or a readable storage medium. The readable storage medium can be, for example, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or a combination of any of the foregoing. More specific examples (a non-exhaustive list) of the readable storage medium include the following: an electrical connection having one or more wires, a portable disk, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disk read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.

The computer readable signal medium may include a data signal propagated in baseband or as part of a carrier wave, with readable program code embodied therein. Such a propagated data signal may take any of a variety of forms, including, but not limited to, electromagnetic, optical, or any suitable combination of the foregoing. A readable signal medium may also be any readable medium, other than a readable storage medium, that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Program code embodied on a readable medium may be transmitted using any appropriate medium, including but not limited to wireless, wireline, optical fiber cable, RF, etc., or any suitable combination of the foregoing.

Program code for carrying out operations of the present invention may be written in any combination of one or more programming languages, including object oriented programming languages such as Java and C++ and conventional procedural programming languages such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computing device, partly on the user's computing device, as a stand-alone software package, partly on the user's computing device and partly on a remote computing device, or entirely on the remote computing device or server. In the case of a remote computing device, the remote computing device may be connected to the user's computing device through any kind of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or may be connected to an external computing device (e.g., via the Internet using an Internet service provider).

In summary, the invention may be implemented in hardware, or in software modules running on one or more processors, or in a combination thereof. Those skilled in the art will appreciate that some or all of the functionality of some or all of the components in accordance with embodiments of the present invention may be implemented in practice using a general purpose data processing device such as a microprocessor or Digital Signal Processor (DSP). The present invention can also be implemented as an apparatus or device program (e.g., a computer program and a computer program product) for performing a portion or all of the methods described herein. Such a program embodying the present invention may be stored on a computer readable medium, or may have the form of one or more signals. Such signals may be downloaded from an internet website, provided on a carrier signal, or provided in any other form.

The above-described specific embodiments further describe the objects, technical solutions and advantageous effects of the present invention in detail, and it should be understood that the present invention is not inherently related to any particular computer, virtual device or electronic apparatus, and various general-purpose devices may also implement the present invention. The foregoing description of the embodiments of the invention is not intended to be limiting, but rather is intended to cover all modifications, equivalents, alternatives, and improvements that fall within the spirit and scope of the invention.

In this specification, each embodiment is described in a progressive manner, and identical and similar parts of each embodiment are all referred to each other, and each embodiment mainly describes differences from other embodiments.

The foregoing is merely exemplary of the present application and is not intended to limit the present application. Various modifications and changes may be made to the present application by those skilled in the art. Any modifications, equivalent substitutions, improvements, etc. which are within the spirit and principles of the present application are intended to be included within the scope of the claims of the present application.

Claims

1. A method for predicting risk of progression of chronic kidney disease, comprising:

2. The method of claim 1, wherein training the progress prediction model based on the history information of the chronic kidney disease patient comprises:

3. The method according to claim 2, wherein before the acquiring the history information of the chronic kidney disease patient and generating the training sample based on the history information and the corresponding diagnosis result, further comprising:

4. The method of claim 3, wherein the obtaining historical medical record information of the chronic kidney disease patient, and generating training samples based on the historical medical record information and corresponding diagnostic results comprises:

5. The method according to claim 3, wherein the evaluating and screening the prediction effect of each candidate model based on the training samples, and selecting a candidate model with the optimal prediction effect as the progress prediction model comprises:

6. The method of predicting risk of progression of chronic kidney disease according to any one of claims 1-5, wherein the assessment indicator comprises:

7. The method according to claim 6, further comprising, after the generation of the chronic kidney disease progression risk prediction result from the risk score and the contribution value of each of the evaluation indexes to the risk score:

8. A chronic kidney disease progression risk prediction apparatus, characterized in that the chronic kidney disease progression risk prediction apparatus comprises:

9. A chronic kidney disease progression risk prediction device, characterized in that the chronic kidney disease progression risk prediction device comprises: a memory and at least one processor, the memory having instructions stored therein;

the at least one processor invoking the instructions in the memory to cause the chronic kidney disease progression risk prediction device to perform the steps of the chronic kidney disease progression risk prediction method of any one of claims 1-7.

10. A computer readable storage medium having instructions stored thereon, which when executed by a processor, implement the steps of the chronic kidney disease progression risk prediction method of any one of claims 1-7.