CN117690591A - Method, device, equipment and storage medium for predicting chronic kidney disease progression risk
- Publication number
- CN117690591A (application CN202311705794.7)
- Authority
- CN
- China
- Prior art keywords
- risk
- kidney disease
- chronic kidney
- prediction
- model
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Abstract
The invention relates to the technical field of artificial intelligence and discloses a method, device, equipment and storage medium for predicting chronic kidney disease progression risk, which address the technical problem that prior-art methods for predicting chronic kidney disease progression risk produce inaccurate results. The method comprises the following steps: training a progress prediction model based on historical medical record information of chronic kidney disease patients; acquiring sample data of a patient to be predicted and inputting the sample data into the progress prediction model for risk prediction to obtain a risk score; invoking a machine learning interpretation tool to evaluate the progress prediction model and the obtained risk score, obtaining the contribution value of each evaluation index in the sample data to the risk score; and generating a chronic kidney disease progression risk prediction result according to the risk score and the contribution value of each evaluation index to the risk score. The method can provide more accurate and effective prediction results for the progression risk of chronic kidney disease.
Description
Technical Field
The invention relates to the technical field of artificial intelligence, in particular to a method, a device, equipment and a storage medium for predicting chronic kidney disease progression risk.
Background
Chronic kidney disease (CKD) is a progressive disease that manifests as kidney damage or reduced kidney function lasting at least three months. According to a Global Burden of Disease study report, nearly 1.2 million people worldwide die from CKD, and more than one in every seven adults suffers from CKD. In recent years, a large body of literature has emphasized that the progression of CKD can lead to serious complications such as high mortality, end-stage renal disease (ESRD) and mineral bone disease. Despite the rising incidence of CKD and the extremely high mortality of CKD-induced conditions such as ESRD, CKD is often not discovered until it has progressed to the end stage, owing to its complex underlying background, occult clinical manifestations and irreversible clinical course, and treatment outcomes are therefore poor. Clinical factors such as anemia, hypertension, hyperuricemia, metabolic acidosis and edema are associated with CKD progression. The traditional method for diagnosing CKD requires expert consultation and multiple examinations and usually relies on the knowledge and experience of doctors, so diagnostic errors and inaccurate results occur easily. Particularly in rural areas with limited medical resources, the complicated diagnostic process not only places an additional burden on medical staff but also delays the optimal time for treating patients. Against this background, applying intelligent technologies such as machine learning to improve the performance and efficiency of stratified diagnosis and treatment of CKD is of great practical significance for safeguarding public health.
Currently, artificial intelligence has been widely used in the field of kidney disease, and prediction of disease progression and treatment advice are the two most important applications of artificial intelligence in clinical practice of kidney disease, and many studies consider chronic kidney disease prediction schemes based on machine learning.
In the prior art, artificial intelligence models for stratified diagnosis and treatment of CKD patients are constructed from the two quantitative laboratory indexes emphasized by the Kidney Disease: Improving Global Outcomes (KDIGO) guidelines (2012 edition), namely the urine albumin-to-creatinine ratio (ACR) and the estimated glomerular filtration rate (eGFR). Such models help stratify patients and formulate clinical treatment strategies, and can delay the progression of CKD in time and reduce the incidence of complications. However, because measuring urinary albumin, the basic parameter of ACR, is problematic in terms of cost and accuracy, it is difficult to adopt ACR as a general screening standard for CKD. Therefore, from the standpoint of practicality and cost effectiveness, a more useful and convenient method is needed to replace urinary albumin detection. In addition, most existing artificial intelligence models for stratified diagnosis and treatment of CKD patients are black boxes: the contribution of each index to the output cannot be determined, so their reliability is low in the medical field, which places extremely high demands on accuracy and certainty. There is therefore an urgent need for a method that can provide more accurate and effective prediction of the risk of progression of chronic kidney disease.
Disclosure of Invention
The invention mainly aims to solve the technical problem that the prediction result of the chronic kidney disease progression risk prediction method in the prior art is inaccurate.
The first aspect of the present invention provides a method for predicting the risk of progression of chronic kidney disease, comprising:
training based on the historical medical record information of the chronic kidney disease patient to obtain a progress prediction model;
acquiring sample data of a patient to be predicted, and inputting the sample data into the progress prediction model to perform risk prediction to obtain a risk score;
invoking a machine learning interpretation tool to evaluate the progress prediction model and the obtained risk score, and obtaining a contribution value of each evaluation index in the sample data to the risk score;
and generating a chronic kidney disease progression risk prediction result according to the risk score and the contribution value of each evaluation index to the risk score.
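The four steps above can be sketched as follows. This is only a minimal illustration under stated assumptions, not the implementation described in the patent: the logistic model, its weights, the baseline values and the three index names are hypothetical, and exact Shapley values stand in for the unspecified machine learning interpretation tool.

```python
import math
from itertools import combinations

# Hypothetical "trained" model: logistic regression over three evaluation indexes.
WEIGHTS = {"age": 0.04, "hemoglobin": -0.03, "albumin": -0.05}
BIAS = 2.0
# Hypothetical baseline (for example, training-set means) used for "absent" indexes.
BASELINE = {"age": 55.0, "hemoglobin": 120.0, "albumin": 40.0}


def risk_score(sample):
    """Return a risk score in (0, 1) for one patient sample."""
    z = BIAS + sum(WEIGHTS[k] * sample[k] for k in WEIGHTS)
    return 1.0 / (1.0 + math.exp(-z))


def shapley_contributions(sample):
    """Exact Shapley value of each index, measured relative to the baseline."""
    names = list(WEIGHTS)
    n = len(names)

    def value(subset):
        # Indexes in `subset` take the patient's value, the rest the baseline.
        mixed = {k: (sample[k] if k in subset else BASELINE[k]) for k in names}
        return risk_score(mixed)

    contributions = {}
    for name in names:
        others = [k for k in names if k != name]
        total = 0.0
        for size in range(n):
            weight = (math.factorial(size) * math.factorial(n - size - 1)
                      / math.factorial(n))
            for subset in combinations(others, size):
                base = set(subset)
                total += weight * (value(base | {name}) - value(base))
        contributions[name] = total
    return contributions


def predict_with_explanation(sample):
    """Combine the risk score and per-index contributions into one result."""
    return {
        "risk_score": risk_score(sample),
        "contributions": shapley_contributions(sample),
    }


patient = {"age": 70.0, "hemoglobin": 95.0, "albumin": 30.0}
result = predict_with_explanation(patient)
```

In this sketch the contributions sum to the difference between the patient's risk score and the baseline score, which is what lets each index be reported alongside the score as a share of the predicted risk.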
Optionally, in a first implementation manner of the first aspect of the present invention, the training to obtain the progress prediction model based on the historical medical record information of the chronic kidney disease patient includes:
acquiring historical medical record information of a chronic kidney disease patient, and generating a training sample based on the historical medical record information and a corresponding diagnosis result;
constructing a plurality of initial models based on a plurality of deep learning algorithms, and training each initial model according to the training samples to obtain a plurality of trained candidate models;
and evaluating and screening the prediction effect of each candidate model based on the training sample, and selecting the candidate model with the optimal prediction effect as a progress prediction model.
Optionally, in a second implementation manner of the first aspect of the present invention, before the acquiring the historical medical record information of the chronic kidney disease patient and generating the training sample based on the historical medical record information and the corresponding diagnosis result, the method further includes:
calling a regularization-based sparse feature selection algorithm and a random forest recursive feature elimination algorithm to construct a plurality of initial classification models;
inputting the historical medical record information into the initial classification models, calling a grid search algorithm to tune the hyperparameters of each initial classification model, and selecting the initial classification model with the largest area under the receiver operating characteristic (ROC) curve as the feature screening model;
ranking the medical features by importance according to the feature screening model to obtain a medical feature sequence;
and screening the medical feature sequence for medical features whose importance weights are larger than a preset threshold value to obtain the evaluation indexes.
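The last two steps above (importance ranking and threshold screening) can be illustrated with a short sketch. The importance weights and the threshold below are hypothetical values chosen for illustration; in the described method they would come from the selected feature screening model and a preset threshold.

```python
def screen_features(importances, threshold):
    """Rank features by importance and keep those above the threshold."""
    # Sort by descending importance to form the "medical feature sequence".
    ranked = sorted(importances.items(), key=lambda item: item[1], reverse=True)
    # Keep only features whose importance weight exceeds the preset threshold.
    selected = [name for name, weight in ranked if weight > threshold]
    return ranked, selected


# Hypothetical importance weights, as a feature screening model might report them.
importances = {
    "sodium": 0.04,
    "age": 0.21,
    "chloride": 0.01,
    "albumin": 0.15,
    "hemoglobin": 0.18,
}
ranked, evaluation_indexes = screen_features(importances, threshold=0.05)
```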
Optionally, in a third implementation manner of the first aspect of the present invention, the obtaining the historical medical record information of the chronic kidney disease patient, and generating the training sample based on the historical medical record information and the corresponding diagnosis result includes:
collecting electronic medical record data of a chronic kidney disease patient confirmed by pathology from a hospital electronic medical record platform;
extracting medical features from the electronic medical record data to obtain medical features and corresponding medical feature values, and preprocessing the medical features and the corresponding medical feature values;
and labeling the preprocessed medical features and corresponding medical feature values with the grade of kidney function decline and the risk prognosis class to obtain a training sample.
Optionally, in a fourth implementation manner of the first aspect of the present invention, the evaluating and screening the prediction effect of each candidate model based on the training samples, and selecting the candidate model with the optimal prediction effect as the progress prediction model includes:
inputting the training samples into each candidate model, and tuning the hyperparameters of each candidate model using a grid search algorithm;
calculating the area under the receiver operating characteristic (ROC) curve of each candidate model;
and selecting the candidate model with the optimal prediction effect as the progress prediction model according to the area under the ROC curve of each candidate model.
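The selection by area under the ROC curve (AUC) can be sketched as below. This is an illustrative sketch only: the labels, the candidate model names and their predicted scores are hypothetical, and the grid search step is omitted. The AUC is computed from its rank interpretation, the probability that a randomly chosen positive sample scores higher than a randomly chosen negative one.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of positive/negative pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative label")
    wins = 0.0
    for p in positives:
        for q in negatives:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


def select_best_model(labels, candidate_scores):
    """Return the name of the candidate with the highest AUC, plus all AUCs."""
    aucs = {name: roc_auc(labels, scores) for name, scores in candidate_scores.items()}
    best = max(aucs, key=aucs.get)
    return best, aucs


# Hypothetical labels (1 = progression) and predicted scores from two candidates.
labels = [1, 1, 1, 0, 0, 0]
candidate_scores = {
    "model_a": [0.9, 0.8, 0.4, 0.5, 0.3, 0.2],
    "model_b": [0.6, 0.4, 0.7, 0.5, 0.8, 0.1],
}
best_name, aucs = select_best_model(labels, candidate_scores)
```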
Optionally, in a fifth implementation manner of the first aspect of the present invention, the evaluation index includes:
age, hemoglobin, N-terminal osteocalcin, 25-hydroxyvitamin D, total protein, glutamic pyruvic transaminase, apolipoprotein E, thyroxine, carbohydrate antigen 19-9, mean red blood cell hemoglobin concentration, unsaturated iron binding capacity, blood homocysteine, albumin, large platelet ratio, total bilirubin, creatine kinase isoenzyme, red blood cell distribution width, IgG antibodies, mean red blood cell volume, folic acid, sodium, urea nitrogen, plasma proteins, neuron-specific enolase, serum total complement activity, neutrophil count, total bile acid, chloride, glycosylated albumin, low density lipoprotein cholesterol, non-high density lipoprotein cholesterol.
Optionally, in a sixth implementation manner of the first aspect of the present invention, after the generating a chronic kidney disease progression risk prediction result according to the risk score and the contribution value of each of the evaluation indexes to the risk score, the method further includes:
updating the importance weight of each evaluation index according to the contribution value of that evaluation index to the risk score.
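The text does not specify how the importance weights are updated. One possible rule, shown purely as an assumption, blends each old weight with the index's share of the total absolute contribution; the function name, the `rate` parameter and the sample values are all hypothetical.

```python
def update_importance(weights, contributions, rate=0.1):
    """Blend old importance weights with normalized absolute contributions.

    Assumed rule (not from the source): new = (1 - rate) * old + rate * share,
    where share is the index's fraction of the total absolute contribution.
    """
    total = sum(abs(v) for v in contributions.values())
    if total == 0:
        # No contribution information: leave the weights unchanged.
        return dict(weights)
    updated = {}
    for name, old in weights.items():
        share = abs(contributions.get(name, 0.0)) / total
        updated[name] = (1.0 - rate) * old + rate * share
    return updated


# Hypothetical weights and per-index contributions for one prediction.
weights = {"age": 0.5, "albumin": 0.5}
contributions = {"age": 0.3, "albumin": -0.1}
new_weights = update_importance(weights, contributions, rate=0.1)
```

With this rule, weights that already sum to 1 continue to sum to 1 after the update, which keeps them comparable against a fixed screening threshold.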
Optionally, in a seventh implementation manner of the first aspect of the present invention, the data preprocessing the medical feature and the corresponding medical feature value includes:
standardizing the medical features, unifying their encoding, and converting categorical values into one-hot codes;
normalizing the medical feature values;
and detecting abnormal values in the normalized medical feature values using a clustering algorithm, checking the data for duplicates and class balance after removing extreme outliers, balancing the training samples using a random sampling method, and filling in missing data using a random forest algorithm.
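Part of the preprocessing above can be sketched in plain Python. The sketch covers one-hot encoding, normalization and class balancing only; clustering-based outlier detection and random forest imputation are omitted. Min-max scaling and random oversampling are assumptions, since the text names only "normalizing" and "a random sampling method", and all sample values are hypothetical.

```python
import random


def one_hot(value, categories):
    """Encode a categorical value as a one-hot list over `categories`."""
    return [1 if value == c else 0 for c in categories]


def min_max_normalize(values):
    """Scale numeric values to [0, 1]; constant columns map to 0.0."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def oversample(rows, labels, seed=0):
    """Randomly resample smaller classes until every class matches the largest."""
    rng = random.Random(seed)
    by_class = {}
    for row, label in zip(rows, labels):
        by_class.setdefault(label, []).append(row)
    target = max(len(group) for group in by_class.values())
    out_rows, out_labels = [], []
    for label, group in by_class.items():
        # Draw extra rows with replacement from the same class.
        extra = [rng.choice(group) for _ in range(target - len(group))]
        for row in group + extra:
            out_rows.append(row)
            out_labels.append(label)
    return out_rows, out_labels


# Hypothetical values for illustration.
sex_code = one_hot("female", ["male", "female"])
hemoglobin_scaled = min_max_normalize([90.0, 120.0, 150.0])
rows = [[0.1], [0.2], [0.3], [0.9]]
labels = [0, 0, 0, 1]
balanced_rows, balanced_labels = oversample(rows, labels)
```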
In a second aspect, the present invention provides a chronic kidney disease progression risk prediction apparatus comprising:
the training module is used for training to obtain a progress prediction model based on the historical medical record information of the chronic kidney disease patient;
the scoring module is used for acquiring sample data of a patient to be predicted, inputting the sample data into the progress prediction model for risk prediction, and obtaining a risk score;
the analysis module is used for calling a machine learning interpretation tool to evaluate the progress prediction model and the obtained risk scores, and obtaining contribution values of all evaluation indexes in the sample data to the risk scores;
and the result output module is used for generating a chronic kidney disease progression risk prediction result according to the risk score and the contribution value of each evaluation index to the risk score.
Optionally, in a first implementation manner of the second aspect of the present invention, the training module is specifically configured to: acquire historical medical record information of a chronic kidney disease patient, and generate a training sample based on the historical medical record information and a corresponding diagnosis result; construct a plurality of initial models based on a plurality of deep learning algorithms, and train each initial model according to the training samples to obtain a plurality of trained candidate models; and evaluate and screen the prediction effect of each candidate model based on the training sample, and select the candidate model with the optimal prediction effect as the progress prediction model.
Optionally, in a second implementation manner of the second aspect of the present invention, the chronic kidney disease progression risk prediction apparatus further includes an evaluation index screening module, where the evaluation index screening module is specifically configured to: call a regularization-based sparse feature selection algorithm and a random forest recursive feature elimination algorithm to construct a plurality of initial classification models; input the historical medical record information into the initial classification models, call a grid search algorithm to tune the hyperparameters of each initial classification model, and select the initial classification model with the largest area under the receiver operating characteristic (ROC) curve as the feature screening model; rank the medical features by importance according to the feature screening model to obtain a medical feature sequence; and screen the medical feature sequence for medical features whose importance weights are larger than a preset threshold value to obtain the evaluation indexes.
Optionally, in a third implementation manner of the second aspect of the present invention, the acquiring the historical medical record information of the chronic kidney disease patient, and generating the training sample based on the historical medical record information and the corresponding diagnosis result includes: collecting electronic medical record data of a chronic kidney disease patient confirmed by pathology from a hospital electronic medical record platform; extracting medical features from the electronic medical record data to obtain medical features and corresponding medical feature values, and preprocessing the medical features and the corresponding medical feature values; and labeling the preprocessed medical features and corresponding medical feature values with the grade of kidney function decline and the risk prognosis class to obtain a training sample.
Optionally, in a fourth implementation manner of the second aspect of the present invention, the evaluating and screening the prediction effect of each candidate model based on the training samples, and selecting the candidate model with the optimal prediction effect as the progress prediction model includes: inputting the training samples into each candidate model, and tuning the hyperparameters of each candidate model using a grid search algorithm; calculating the area under the receiver operating characteristic (ROC) curve of each candidate model; and selecting the candidate model with the optimal prediction effect as the progress prediction model according to the area under the ROC curve of each candidate model.
Optionally, in a fifth implementation manner of the second aspect of the present invention, the evaluation index includes: age, hemoglobin, N-terminal osteocalcin, 25-hydroxyvitamin D, total protein, glutamic pyruvic transaminase, apolipoprotein E, thyroxine, carbohydrate antigen 19-9, mean red blood cell hemoglobin concentration, unsaturated iron binding capacity, blood homocysteine, albumin, large platelet ratio, total bilirubin, creatine kinase isoenzyme, red blood cell distribution width, IgG antibodies, mean red blood cell volume, folic acid, sodium, urea nitrogen, plasma proteins, neuron-specific enolase, serum total complement activity, neutrophil count, total bile acid, chloride, glycosylated albumin, low density lipoprotein cholesterol, non-high density lipoprotein cholesterol.
Optionally, in a sixth implementation manner of the second aspect of the present invention, the chronic kidney disease progression risk prediction apparatus further includes an importance update module, where the importance update module is specifically configured to: update the importance weight of each evaluation index according to the contribution value of that evaluation index to the risk score.
Optionally, in a seventh implementation manner of the second aspect of the present invention, the data preprocessing the medical feature and the corresponding medical feature value includes: standardizing the medical features, unifying their encoding, and converting categorical values into one-hot codes; normalizing the medical feature values; and detecting abnormal values in the normalized medical feature values using a clustering algorithm, checking the data for duplicates and class balance after removing extreme outliers, balancing the training samples using a random sampling method, and filling in missing data using a random forest algorithm.
A third aspect of the present invention provides chronic kidney disease progression risk prediction equipment, comprising: a memory and at least one processor, the memory having instructions stored therein; the at least one processor invokes the instructions in the memory to cause the chronic kidney disease progression risk prediction equipment to perform the steps of the chronic kidney disease progression risk prediction method described above.
A fourth aspect of the present invention provides a computer readable storage medium having instructions stored therein which, when run on a computer, cause the computer to perform the steps of the chronic kidney disease progression risk prediction method described above.
According to the technical scheme provided by the invention, a progress prediction model is obtained by training based on the historical medical record information of the chronic kidney disease patient; sample data of a patient to be predicted is acquired and input into the progress prediction model for risk prediction to obtain a risk score; a machine learning interpretation tool is invoked to evaluate the progress prediction model and the obtained risk score to obtain a contribution value of each evaluation index in the sample data to the risk score; and a chronic kidney disease progression risk prediction result is generated according to the risk score and the contribution value of each evaluation index to the risk score. The method can provide more accurate and effective prediction results for the progression risk of chronic kidney disease.
Drawings
FIG. 1 is a flow chart of an embodiment of a method for predicting risk of progression of chronic kidney disease according to an embodiment of the present invention;
FIG. 2 is a flow chart of another embodiment of a method for predicting risk of progression of chronic kidney disease according to an embodiment of the invention;
FIG. 3 is a schematic diagram of the receiver operating characteristic (ROC) curve at the feature evaluation step in another embodiment of a method for predicting risk of progression of chronic kidney disease according to an embodiment of the present invention;
FIG. 4 is a schematic view showing an apparatus for predicting risk of progression of chronic kidney disease according to an embodiment of the present invention;
FIG. 5 is a schematic view showing an embodiment of a chronic kidney disease progression risk prediction apparatus according to an embodiment of the present invention;
FIG. 6 is a schematic diagram of a computer readable medium according to an embodiment of the present invention.
Detailed Description
Exemplary embodiments of the present invention will now be described more fully with reference to the accompanying drawings. However, the exemplary embodiments can be embodied in many different forms and should not be construed as limited to the embodiments set forth herein. Rather, these exemplary embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the concept of the invention to those skilled in the art. The same reference numerals in the drawings denote the same or similar elements, components or portions, and thus a repetitive description thereof will be omitted.
The features, structures, characteristics or other details described in a particular embodiment may be combined in any suitable manner in one or more other embodiments without departing from the technical idea of the invention.
In the description of specific embodiments, features, structures, characteristics, or other details described in the present invention are provided to enable one skilled in the art to fully understand the embodiments. However, it is not excluded that one skilled in the art may practice the present invention without one or more of the specific features, structures, characteristics, or other details.
The flow diagrams depicted in the figures are exemplary only, and do not necessarily include all of the elements and operations/steps, nor must they be performed in the order described. For example, some operations/steps may be decomposed, and some operations/steps may be combined or partially combined, so that the order of actual execution may be changed according to actual situations.
The block diagrams depicted in the figures are merely functional entities and do not necessarily correspond to physically separate entities. That is, the functional entities may be implemented in software, or in one or more hardware modules or integrated circuits, or in different networks and/or processor devices and/or microcontroller devices.
The term "and/or" includes any and all combinations of one or more of the associated listed items.
Referring to fig. 1, an embodiment of a method for predicting risk of progression of chronic kidney disease according to an embodiment of the present invention includes:
S101, training a progress prediction model based on historical medical record information of chronic kidney disease patients;
In this embodiment, a scheme for predicting the progression risk of chronic kidney disease (CKD, Chronic Kidney Disease) is specifically described. It is to be understood that the execution subject of the present invention may be a chronic kidney disease progression risk prediction device, or may be a terminal or a server, which is not limited herein. The embodiment of the invention is described by taking a server as the execution subject as an example.
Before receiving a chronic kidney disease progression risk prediction request, the server trains a progress prediction model in advance based on the historical medical record information of chronic kidney disease patients. The specific steps of training the progress prediction model include: acquiring historical medical record information of chronic kidney disease patients, and generating training samples based on the historical medical record information and the corresponding diagnosis results; constructing a plurality of initial models based on a plurality of deep learning algorithms, and training each initial model on the training samples to obtain a plurality of trained candidate models; inputting the training samples into each candidate model, and tuning the hyperparameters of each candidate model with a grid search algorithm; calculating the area under the receiver operating characteristic (ROC) curve of each candidate model; and selecting the candidate model with the best prediction effect as the progress prediction model according to the area under the ROC curve of each candidate model.
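The hyperparameter tuning step can be illustrated with a minimal grid search sketch. It uses only the Python standard library and is not the implementation of this embodiment: the names `grid_search`, `param_grid`, and `toy_score` are hypothetical, and the toy scoring function merely stands in for "train a candidate model with these hyperparameters and return its AUC".

```python
from itertools import product

def grid_search(param_grid, score_fn):
    """Exhaustively evaluate every hyperparameter combination.

    param_grid: dict mapping a parameter name to a list of candidate values.
    score_fn:   callable taking a dict of parameters and returning a score
                (higher is better), for example a validation AUC.
    Returns (best_params, best_score).
    """
    names = sorted(param_grid)
    best_params, best_score = None, float("-inf")
    for values in product(*(param_grid[n] for n in names)):
        params = dict(zip(names, values))
        score = score_fn(params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score

# Hypothetical stand-in for "train the candidate model and return its AUC".
def toy_score(params):
    return 1.0 - abs(params["max_depth"] - 5) * 0.05 - abs(params["learning_rate"] - 0.1)

best_params, best_score = grid_search(
    {"max_depth": [3, 5, 7], "learning_rate": [0.01, 0.1, 0.3]},
    toy_score,
)
print(best_params, best_score)
```

The cost grows with the product of the grid sizes, so in practice the grid is usually kept small for each candidate model.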
S102, acquiring sample data of a patient to be predicted, and inputting the sample data into a progress prediction model to perform risk prediction to obtain a risk score;
After the chronic kidney disease progression risk prediction request is obtained, sample data of the patient to be predicted is acquired and input into the progress prediction model trained in the previous step, which predicts the patient's chronic kidney disease progression risk and outputs a risk score for possible chronic kidney disease progression. Specifically, in this embodiment, the pre-constructed progress prediction model may calculate, based on the specific values of each evaluation index in the sample data of the patient to be predicted and according to the tuned hyperparameters of the model, a risk score for the patient's possible future chronic kidney disease progression, where the specific evaluation indexes may include:
age (Age), serum creatinine (CRE), N-terminal osteocalcin (NTX), hemoglobin (HGB), 25-hydroxyvitamin D (25-OHD), total protein (TP), glutamic pyruvic transaminase (ALT), apolipoprotein E (APO-E), thyroxine (T4), carbohydrate antigen 199 (CA199), mean red blood cell hemoglobin concentration (MCHC), unsaturated iron binding capacity (UIBC), homocysteine (HCY), albumin (ALB), platelet-large cell ratio (P-LCR), total bilirubin (TBIL), creatine kinase isoenzyme (CK-MB), red blood cell distribution width (RDW-SD), IgG antibodies (IGG), mean red blood cell volume (MCV), folic acid, sodium (NA), urea nitrogen, globulin (GLO), neuron-specific enolase (NSE), serum total complement activity (CH50), neutrophil count, total bile acid, chloride (CL), glycated albumin (GA), low-density lipoprotein cholesterol (LDL-C), and non-high-density lipoprotein cholesterol (non-HDL-C).
S103, calling a machine learning interpretation tool to evaluate the progress prediction model and the obtained risk score, and obtaining a contribution value of each evaluation index in the sample data to the risk score;
S104, generating a chronic kidney disease progression risk prediction result according to the risk score and the contribution value of each evaluation index to the risk score.
In this embodiment, the prediction process of the progress prediction model for chronic kidney disease risk is analyzed in advance with the SHAP (Shapley Additive exPlanations) interpretability algorithm. The analysis includes calculating SHAP values to explain the contribution of each medical evaluation index to the CKD risk prediction, which gives a compact global view of the model. Specifically, in this embodiment, the contribution value of each evaluation index to the risk prediction is calculated with the machine learning interpretation tool, so that the evaluation indexes that most strongly influence the prediction result when the current patient is assigned a given level of chronic kidney disease progression risk can be determined.
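The idea behind the contribution values can be sketched by computing exact Shapley values for a toy model. This is an illustrative assumption, not the tool used in this embodiment: the brute-force enumeration below is exponential in the number of features, whereas practical SHAP tooling relies on faster approximate or model-specific algorithms. The function `shapley_values`, the linear `toy_risk` model, and all numeric values are hypothetical.

```python
from itertools import combinations
from math import factorial

def shapley_values(predict, sample, baseline):
    """Exact Shapley values for one sample.

    predict:  callable mapping a dict {feature: value} to a risk score.
    sample:   feature values of the patient being explained.
    baseline: reference values used for features that are "absent".
    """
    features = list(sample)
    n = len(features)

    def value(coalition):
        # Features in the coalition take the patient's values, others the baseline.
        x = {f: (sample[f] if f in coalition else baseline[f]) for f in features}
        return predict(x)

    phi = {}
    for f in features:
        others = [g for g in features if g != f]
        total = 0.0
        for k in range(n):
            # Shapley weight for coalitions of size k that exclude f.
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                s = set(subset)
                total += weight * (value(s | {f}) - value(s))
        phi[f] = total
    return phi

# Hypothetical linear risk model over three of the evaluation indexes.
def toy_risk(x):
    return 0.02 * x["CRE"] - 0.01 * x["HGB"] + 0.005 * x["Age"]

patient = {"CRE": 150.0, "HGB": 100.0, "Age": 60.0}
reference = {"CRE": 80.0, "HGB": 130.0, "Age": 50.0}
contributions = shapley_values(toy_risk, patient, reference)
print(contributions)
```

The contributions sum to the difference between the patient's score and the reference score, which is the additive property that makes the values usable as per-index explanations.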
A chronic kidney disease progression risk prediction result is then generated based on the predicted risk score of the patient to be predicted and the contribution value of each evaluation index.
In this embodiment, after the chronic kidney disease progression risk prediction result is obtained, the importance weight of each evaluation index may be updated according to its contribution value to the risk score, so that the prediction effect of the progress prediction model is more accurate.
The method provided by the embodiment of the invention can predict the progression risk of the chronic kidney disease based on the effective evaluation indexes, can output the influence information of each evaluation index on the progression risk, and can provide a more accurate, effective and reliable prediction result for the progression risk of the chronic kidney disease.
Referring to fig. 2 and 3, another embodiment of the method for predicting risk of progression of chronic kidney disease according to the present invention includes:
S201, acquiring historical medical record information of chronic kidney disease patients, and screening evaluation indexes based on the historical medical record information;
In this step, the effective evaluation indexes contained in the historical medical record information are first screened with an artificial intelligence algorithm. Electronic medical record data is prepared first: the electronic medical records of CKD patients with pathologically confirmed diagnoses are collected from a hospital electronic medical record platform, medical features are extracted from the electronic medical record data to obtain medical features and corresponding medical feature values, and the medical features and corresponding values are preprocessed. The feature data of the resulting clinical big data of CKD patients is then normalized.
In a specific embodiment, the step of extracting medical features from the electronic medical record data to obtain medical features and corresponding medical feature values may specifically include: invoking a regularization-based feature sparsification algorithm and a random-forest-based recursive feature elimination algorithm to construct a plurality of initial classification models; inputting the historical medical record information into the initial classification models, invoking a grid search algorithm to tune the hyperparameters of the initial classification models, and selecting the initial classification model with the highest area under the receiver operating characteristic (ROC) curve as the feature screening model; ranking the medical features by importance according to the feature screening model to obtain a medical feature sequence; and screening the medical feature combination whose importance weights in the medical feature sequence are greater than a preset threshold to obtain the evaluation indexes. The specific evaluation indexes are basically the same as those in the previous embodiment and are not described again here. With continued reference to the ROC curve of the feature evaluation step in FIG. 3, the area under the curve on the test set is generally greater than 0.90 when the number of features in the combination is kept within a certain range, which indicates that the feature screening model in this step is effective and reasonable to some extent.
Specifically, features are removed recursively with a recursive feature elimination algorithm, a random-forest-based model is built from the remaining features, and the feature combination that contributes most to the prediction result is determined according to the AUC (Area Under the Curve) value of the model's ROC (Receiver Operating Characteristic) curve.
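As an illustration of the AUC criterion, the following sketch computes the area under the ROC curve for a binary problem as the probability that a randomly chosen positive sample receives a higher score than a randomly chosen negative one. It is a standard-library illustration only; `roc_auc` is a hypothetical helper name, and for the multi-class risk levels of this embodiment it would have to be applied per class (for example one class against the rest), which is an assumption rather than something the text specifies.

```python
def roc_auc(labels, scores):
    """AUC via pairwise comparison of positives and negatives (ties count 0.5)."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative sample")
    wins = 0.0
    for p in positives:
        for q in negatives:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(positives) * len(negatives))

labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
auc = roc_auc(labels, scores)
print(auc)  # 0.75
```

Three of the four positive/negative pairs are ordered correctly here, hence 0.75; a value of 0.5 corresponds to random ranking and 1.0 to perfect separation.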
In a specific embodiment, the step of normalizing the feature data of the obtained clinical big data of CKD patients may specifically be performed as follows: standardizing the medical feature values, in particular unifying symbols, letters, characters and medical codes, and converting categorical values into one-hot codes; normalizing the feature values to ensure convergence of the algorithm; detecting abnormal values with the k-means clustering algorithm, assessing the weighting and balance of the data after the extreme outliers are removed, and balancing the training sample data by random sampling; and finally filling in missing data with a random forest algorithm.
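Two of the preprocessing operations named above, one-hot encoding and normalization, can be sketched as follows. The sketch assumes min-max scaling to [0, 1], which the text does not specify, and the helper names `one_hot` and `min_max_normalize` as well as the example column of sex codes are hypothetical.

```python
def one_hot(values):
    """Encode categorical values as one-hot vectors (categories in sorted order)."""
    categories = sorted(set(values))
    index = {c: i for i, c in enumerate(categories)}
    encoded = []
    for v in values:
        row = [0] * len(categories)
        row[index[v]] = 1
        encoded.append(row)
    return categories, encoded

def min_max_normalize(values):
    """Scale numeric values to [0, 1]; a constant column maps to all zeros."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

# Hypothetical categorical column and serum creatinine values.
categories, sex_encoded = one_hot(["F", "M", "F"])
cre_scaled = min_max_normalize([60.0, 120.0, 180.0])
print(categories, sex_encoded, cre_scaled)
```

Scaling statistics (here the minimum and maximum) would normally be computed on the training data only and reused for new patients, so that the test data does not leak into preprocessing.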
S202, generating a training sample based on historical medical record information and corresponding diagnosis results;
The medical features and corresponding medical feature values after data preprocessing are labeled with the kidney function decline level and the prognosis risk classification to obtain training samples.
For specific training samples, referring to Table 1, in this step the historical medical record information can be labeled and classified according to the eGFR (Estimated Glomerular Filtration Rate) and ACR (Albumin-to-Creatinine Ratio, also called the urinary albumin-to-creatinine ratio) levels in the historical medical record information, so as to obtain the prognosis risk classification of the kidney disease. According to the latest international guidelines for chronic kidney disease, the disease is staged along two dimensions, namely:
(1) The glomerular filtration rate (GFR) is divided into the following stages:
Stage G1: GFR ≥ 90 mL/(min·1.73 m²), indicating normal or increased kidney function;
Stage G2: GFR 60-89 mL/(min·1.73 m²), indicating a slight decrease in renal function;
Stage G3a: GFR 45-59 mL/(min·1.73 m²), indicating a slight to moderate decrease in renal function;
Stage G3b: GFR 30-44 mL/(min·1.73 m²), indicating a moderate to severe decline in renal function;
Stage G4: GFR 15-29 mL/(min·1.73 m²), indicating a severe decline in renal function;
Stage G5: GFR < 15 mL/(min·1.73 m²), indicating renal failure.
(2) Albuminuria is divided into 3 stages:
Stage A1: urinary microalbumin < 30 mg/d, indicating a normal or slightly elevated urinary microalbumin level;
Stage A2: urinary microalbumin of 30-299 mg/d, indicating a moderately elevated urinary microalbumin level;
Stage A3: urinary microalbumin ≥ 300 mg/d, indicating a severely elevated urinary microalbumin level.
This staging allows the disease to be assessed more clearly and accurately.
Table 1. Training sample classification
In Table 1, training samples belonging to the classes A1G1 and A1G2 are labeled as low risk; training samples belonging to the classes A1G3a, A2G1 and A2G2 are labeled as medium risk; training samples belonging to the classes A1G3b, A2G3a, A3G1 and A3G2 are labeled as high risk; and training samples belonging to the classes A1G4, A1G5, A2G3b, A2G4, A2G5, A3G3a, A3G3b, A3G4 and A3G5 are labeled as extremely high risk. The numbers in the table are the numbers of samples falling under each class in one practical implementation.
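The staging and labeling rules above can be expressed as a small lookup. The sketch below is an illustration under stated assumptions: the function names and the `CLASSES_BY_RISK` table are hypothetical, non-integer eGFR values between the listed integer ranges are assigned by lower bound (for example 89.5 falls in G2), and the extremely-high-risk group is taken to include A2G3b, since A2G3a is already labeled high risk.

```python
def g_stage(egfr):
    """Map eGFR in mL/(min·1.73 m²) to a GFR stage."""
    if egfr >= 90:
        return "G1"
    if egfr >= 60:
        return "G2"
    if egfr >= 45:
        return "G3a"
    if egfr >= 30:
        return "G3b"
    if egfr >= 15:
        return "G4"
    return "G5"

def a_stage(albumin_mg_per_day):
    """Map urinary microalbumin in mg/d to an albuminuria stage."""
    if albumin_mg_per_day < 30:
        return "A1"
    if albumin_mg_per_day < 300:
        return "A2"
    return "A3"

CLASSES_BY_RISK = {
    "low": ["A1G1", "A1G2"],
    "medium": ["A1G3a", "A2G1", "A2G2"],
    "high": ["A1G3b", "A2G3a", "A3G1", "A3G2"],
    "extremely high": ["A1G4", "A1G5", "A2G3b", "A2G4", "A2G5",
                       "A3G3a", "A3G3b", "A3G4", "A3G5"],
}
# Invert the table so each of the 18 combined classes maps to one risk label.
RISK_BY_CLASS = {cls: risk for risk, classes in CLASSES_BY_RISK.items() for cls in classes}

def risk_label(egfr, albumin_mg_per_day):
    """Return the prognosis risk label for one training sample."""
    return RISK_BY_CLASS[a_stage(albumin_mg_per_day) + g_stage(egfr)]

print(risk_label(50, 100))  # A2G3a -> high
```

Keeping the mapping in a table rather than nested conditions makes it easy to check that all 3 × 6 combinations are covered exactly once.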
S203, constructing a plurality of initial models based on a plurality of deep learning algorithms, and training each initial model according to a training sample to obtain a plurality of trained candidate models;
S204, inputting the training samples into each candidate model, and tuning the hyperparameters of each candidate model with a grid search algorithm;
S205, calculating the area under the receiver operating characteristic (ROC) curve of each candidate model;
S206, selecting the candidate model with the best prediction effect as the progress prediction model according to the area under the ROC curve of each candidate model;
In this embodiment, a plurality of machine learning models are trained on the data set formed by the training samples obtained in the previous step, and the best of the trained machine learning models is selected to obtain a multi-class chronic kidney disease risk prediction model.
Specifically, the initial machine learning models may be built with the gradient boosting classifier (Gradient Boosting Classifier, GBC), k-nearest neighbor (k-Nearest Neighbor, KNN), multilayer perceptron (Multilayer Perceptron, MLP), and random forest (Random Forest, RF) algorithms, respectively.
The training samples obtained in the previous steps are randomly divided into 5 parts; 4 parts are used as the training data set for building the models, and the remaining part is used as the test data set. A plurality of initial machine learning models are built, their hyperparameters are tuned by grid search, and the model with the highest AUC of the ROC curve is selected and saved as the progress prediction model.
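The random five-way partition and the final selection by AUC can be sketched as follows. This is a minimal illustration, not the embodiment's code: `five_fold_splits` and `select_best_model` are hypothetical names, the sketch enumerates all five possible train/test assignments although the text describes using one of them, and the AUC numbers are placeholders rather than results from this document.

```python
import random

def five_fold_splits(n_samples, seed=0):
    """Yield (train_indices, test_indices) for a random partition into 5 parts."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::5] for i in range(5)]
    for i in range(5):
        test = folds[i]
        # The other four parts together form the training set.
        train = [j for k, fold in enumerate(folds) if k != i for j in fold]
        yield train, test

def select_best_model(auc_by_model):
    """Return the name of the candidate model with the highest AUC."""
    return max(auc_by_model, key=auc_by_model.get)

splits = list(five_fold_splits(20))

# Placeholder AUC values for the four candidate algorithms (not measured results).
best = select_best_model({"GBC": 0.93, "KNN": 0.85, "MLP": 0.90, "RF": 0.92})
print(len(splits), best)
```

Fixing the random seed keeps the partition reproducible, which matters when several candidate models are to be compared on the same split.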
S207, acquiring sample data of a patient to be predicted, and inputting the sample data into a progress prediction model for risk prediction to obtain a risk score;
After the chronic kidney disease progression risk prediction request is obtained, sample data of the patient to be predicted is acquired and input into the progress prediction model trained in the previous step, which predicts the patient's chronic kidney disease progression risk and outputs a risk score for possible chronic kidney disease progression.
In a specific embodiment, risk prediction specifically means predicting the risk that the chronic kidney disease of the patient to be predicted will progress within a target prediction time. The input of the model is the sample data of the patient to be predicted and the target prediction time, and the output of the model is the risk level of possible chronic kidney disease progression within the target prediction time and the risk likelihood score corresponding to that risk level.
S208, calling a machine learning interpretation tool to evaluate the progress prediction model and the obtained risk score, and obtaining a contribution value of each evaluation index in the sample data to the risk score;
In this embodiment, the prediction process of the progress prediction model for chronic kidney disease risk is analyzed in advance with the SHAP (Shapley Additive exPlanations) interpretability algorithm. The analysis includes calculating SHAP values to explain the contribution of each medical evaluation index to the CKD risk prediction, which gives a compact global view of the model. Specifically, in this embodiment, the contribution value of each evaluation index to the risk prediction is calculated with the machine learning interpretation tool, so that the evaluation indexes that most strongly influence the prediction result when the current patient is assigned a given level of chronic kidney disease progression risk can be determined.
S209, generating a chronic nephrosis progression risk prediction result according to the risk score and the contribution value of each evaluation index to the risk score;
The chronic kidney disease progression risk prediction result described in this embodiment includes the risk level output by the progress prediction model, the risk likelihood score corresponding to the risk level, and the contribution value of each evaluation index to the risk score. Specifically, in this step, the chronic kidney disease progression risk prediction result is generated based on the predicted risk score of the patient to be predicted and the contribution value of each evaluation index, and the result is displayed.
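One way to bundle the three parts of the prediction result is a small record type, sketched below. The class `ProgressionRiskResult`, the helper `build_result`, and the example probabilities and contribution values are all hypothetical; the sketch assumes the model exposes one probability per risk level and that the reported level is the most probable one.

```python
from dataclasses import dataclass, field

@dataclass
class ProgressionRiskResult:
    risk_level: str                 # risk level output by the model
    likelihood: float               # likelihood score of that risk level
    contributions: dict = field(default_factory=dict)  # index -> contribution value

    def top_indices(self, k=3):
        """Evaluation indexes with the largest absolute contribution."""
        ranked = sorted(self.contributions.items(),
                        key=lambda kv: abs(kv[1]), reverse=True)
        return [name for name, _ in ranked[:k]]

def build_result(class_probabilities, contributions):
    """Pick the most probable risk level and attach the contribution values."""
    level = max(class_probabilities, key=class_probabilities.get)
    return ProgressionRiskResult(level, class_probabilities[level], dict(contributions))

# Placeholder model outputs for one patient.
result = build_result(
    {"low": 0.05, "medium": 0.15, "high": 0.60, "extremely high": 0.20},
    {"CRE": 1.4, "HGB": 0.3, "Age": 0.05, "ALB": -0.6},
)
print(result.risk_level, result.likelihood, result.top_indices(2))
```

Ranking by absolute value keeps indexes that lower the risk (negative contributions) visible alongside those that raise it.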
S210, updating the importance weight of the evaluation index according to the contribution value of the evaluation index to the risk score.
In this embodiment, after the chronic kidney disease progression risk prediction result is obtained, the importance weight of each evaluation index may be updated according to its contribution value to the risk score, so that the prediction effect of the progress prediction model is more accurate.
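The text does not state how the importance weights are updated. Purely as an illustration, the sketch below blends each old weight with the index's share of the total absolute contribution, using an exponential moving average. The update rule, the function name `update_importance`, the `rate` parameter, and all numbers are assumptions rather than the method of this embodiment.

```python
def update_importance(weights, contributions, rate=0.1):
    """Blend old importance weights with normalized absolute contributions.

    weights:       dict index -> current importance weight.
    contributions: dict index -> contribution value for the latest prediction.
    rate:          blending factor in [0, 1]; 0 keeps the old weights unchanged.
    """
    total = sum(abs(v) for v in contributions.values())
    if total == 0:
        return dict(weights)
    updated = {}
    for name, old in weights.items():
        share = abs(contributions.get(name, 0.0)) / total
        updated[name] = (1 - rate) * old + rate * share
    return updated

weights = {"CRE": 0.5, "HGB": 0.3, "Age": 0.2}
new_weights = update_importance(weights, {"CRE": 1.0, "HGB": 0.5, "Age": 0.5}, rate=0.5)
print(new_weights)
```

If the old weights sum to 1 and every contribution belongs to a weighted index, the updated weights also sum to 1, so they stay comparable with a fixed screening threshold.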
In another specific implementation of this embodiment, based on the foregoing solution, a system analysis platform is built with Docker container virtualization technology to provide an auxiliary tool for clinical chronic kidney disease progression risk analysis. Specifically, a MySQL database is used to store system data, and the Web server technologies include server technology, Common Gateway Interface (CGI) technology, and PHP (Hypertext Preprocessor) technology. The specific construction process includes the following steps:
Step one: server construction. The server is built with Apache; the web page files are placed under the default www folder of Apache, and remote access is provided through the external network;
Step two: web page design and implementation. Asynchronous JavaScript and XML (Ajax) technology is adopted, so that part of a web page can be updated by entering a task number to display the system prediction result;
Step three: database construction. A MySQL database is used to store the medical features and system prediction results of the patient to be predicted;
Step four: analysis module construction. The system's clinical prediction analysis service is triggered asynchronously through PHP pipeline technology.
The embodiment of the invention can predict the progression risk of the chronic kidney disease based on the effective evaluation indexes, can output the influence information of each evaluation index on the progression risk, and can provide a more accurate, effective and reliable prediction result for the progression risk of the chronic kidney disease.
The method for predicting the risk of chronic kidney disease progression in the embodiment of the present invention is described above; the apparatus for predicting the risk of chronic kidney disease progression is described below. Referring to FIG. 4, an embodiment of the chronic kidney disease progression risk prediction apparatus in the embodiment of the present invention includes:
a training module 401, configured to train a progress prediction model based on the historical medical record information of chronic kidney disease patients;
a scoring module 402, configured to obtain sample data of a patient to be predicted, and input the sample data into the progress prediction model for risk prediction to obtain a risk score;
an analysis module 403, configured to invoke a machine learning interpretation tool to evaluate the progress prediction model and the obtained risk score, so as to obtain a contribution value of each evaluation index in the sample data to the risk score;
and a result output module 404, configured to generate a chronic kidney disease progression risk prediction result according to the risk score and the contribution value of each evaluation index to the risk score.
The embodiment of the invention can predict the progression risk of the chronic kidney disease based on the effective evaluation indexes, can output the influence information of each evaluation index on the progression risk, and can provide a more accurate, effective and reliable prediction result for the progression risk of the chronic kidney disease.
In another embodiment of the present application, the training module is specifically configured to:
acquire historical medical record information of chronic kidney disease patients, and generate training samples based on the historical medical record information and the corresponding diagnosis results;
construct a plurality of initial models based on a plurality of deep learning algorithms, and train each initial model on the training samples to obtain a plurality of trained candidate models;
and evaluate and screen the prediction effect of each candidate model based on the training samples, and select the candidate model with the best prediction effect as the progress prediction model.
In another embodiment of the present application, the chronic kidney disease progression risk prediction apparatus further includes an evaluation index screening module, where the evaluation index screening module is specifically configured to:
invoke a regularization-based feature sparsification algorithm and a random-forest-based recursive feature elimination algorithm to construct a plurality of initial classification models;
input the historical medical record information into the initial classification models, invoke a grid search algorithm to tune the hyperparameters of the initial classification models, and select the initial classification model with the highest area under the receiver operating characteristic (ROC) curve as the feature screening model;
rank the medical features by importance according to the feature screening model to obtain a medical feature sequence;
and screen the medical feature combination whose importance weights in the medical feature sequence are greater than a preset threshold to obtain the evaluation indexes.
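The last two operations of the screening module, ranking by importance and keeping the features above a preset threshold, can be sketched in a few lines. The function name `screen_features`, the importance weights, and the threshold value are hypothetical; in the embodiment the weights would come from the selected feature screening model.

```python
def screen_features(importances, threshold):
    """Rank features by importance weight and keep those above the threshold."""
    ranked = sorted(importances.items(), key=lambda kv: kv[1], reverse=True)
    return [name for name, weight in ranked if weight > threshold]

# Placeholder importance weights for a handful of medical features.
importances = {"CRE": 0.31, "Age": 0.22, "HGB": 0.18, "TP": 0.04, "NA": 0.01}
selected = screen_features(importances, threshold=0.05)
print(selected)  # ['CRE', 'Age', 'HGB']
```

The returned list preserves the importance order, so it doubles as the medical feature sequence truncated at the threshold.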
In another embodiment of the present application, acquiring the historical medical record information of chronic kidney disease patients and generating training samples based on the historical medical record information and the corresponding diagnosis results includes:
collecting, from a hospital electronic medical record platform, electronic medical record data of chronic kidney disease patients with pathologically confirmed diagnoses;
extracting medical features from the electronic medical record data to obtain medical features and corresponding medical feature values, and preprocessing the medical features and the corresponding medical feature values;
and labeling the preprocessed medical features and corresponding medical feature values with the kidney function decline level and the prognosis risk classification to obtain training samples.
In another embodiment of the present application, evaluating and screening the prediction effect of each candidate model based on the training samples and selecting the candidate model with the best prediction effect as the progress prediction model includes:
inputting the training samples into each candidate model, and tuning the hyperparameters of each candidate model with a grid search algorithm;
calculating the area under the receiver operating characteristic (ROC) curve of each candidate model;
and selecting the candidate model with the best prediction effect as the progress prediction model according to the area under the ROC curve of each candidate model.
In another embodiment of the present application, the evaluation indexes include: age, hemoglobin, N-terminal osteocalcin, 25-hydroxyvitamin D, total protein, glutamic pyruvic transaminase, apolipoprotein E, thyroxine, carbohydrate antigen 199, mean red blood cell hemoglobin concentration, unsaturated iron binding capacity, blood homocysteine, albumin, platelet-large cell ratio, total bilirubin, creatine kinase isoenzyme, red blood cell distribution width, IgG antibodies, mean red blood cell volume, folic acid, sodium, urea nitrogen, globulin, neuron-specific enolase, serum total complement activity, neutrophil count, total bile acid, chloride, glycated albumin, low-density lipoprotein cholesterol, and non-high-density lipoprotein cholesterol.
In another embodiment of the present application, the chronic kidney disease progression risk prediction apparatus further comprises an importance update module, wherein the importance update module is specifically configured to:
update the importance weight of each evaluation index according to the contribution value of that evaluation index to the risk score.
In another embodiment of the present application, the data preprocessing of the medical features and the corresponding medical feature values includes:
standardizing the medical features by unifying the codes and converting categorical values into one-hot codes;
normalizing the medical feature values;
and detecting abnormal values in the normalized medical feature values with a clustering algorithm, assessing the weighting and balance of the data after the extreme outliers are removed, balancing the training samples by random sampling, and filling in missing data with a random forest algorithm.
The embodiment of the invention can predict the progression risk of chronic kidney disease based on the effective evaluation indexes, can output the influence of each evaluation index on the progression risk, and can provide a more accurate, effective and reliable prediction result for the progression risk of chronic kidney disease. After the chronic kidney disease progression risk prediction result is obtained, the importance weight of each evaluation index can be updated according to its contribution value to the risk score, so that the prediction effect of the progress prediction model is more accurate.
The chronic kidney disease progression risk prediction apparatus in the embodiment of the present invention is described in detail from the point of view of the modularized functional entity in fig. 4 above, and based on the same inventive concept, the embodiment of the present invention further provides a chronic kidney disease progression risk prediction device, and the chronic kidney disease progression risk prediction device in the embodiment of the present invention is described in detail from the point of view of hardware processing below.
Fig. 5 is a schematic structural diagram of an electronic device according to an embodiment of the present disclosure. An electronic device 500 according to this embodiment of the present invention is described below with reference to fig. 5. The electronic device 500 shown in fig. 5 is merely an example, and should not be construed as limiting the functionality and scope of use of embodiments of the present invention.
As shown in fig. 5, the electronic device 500 is embodied in the form of a general purpose computing device. The components of electronic device 500 may include, but are not limited to: at least one processing unit 510, at least one memory unit 520, a bus 530 connecting the different system components (including the memory unit 520 and the processing unit 510), a display unit 540, etc.
The memory unit 520 stores program code that is executable by the processing unit 510, such that the processing unit 510 performs the steps according to the various exemplary embodiments of the invention described in the method section of this specification. For example, the processing unit 510 may perform the steps shown in FIG. 1.
The memory unit 520 may include readable media in the form of volatile memory units, such as Random Access Memory (RAM) 5201 and/or cache memory unit 5202, and may further include Read Only Memory (ROM) 5203.
The storage unit 520 may also include a program/utility 5204 having a set (at least one) of program modules 5205, such program modules 5205 including, but not limited to: an operating system, one or more application programs, other program modules, and program data, each or some combination of which may include an implementation of a network environment.
The bus 530 may represent one or more of several types of bus structures, including a memory bus or memory controller, a peripheral bus, an accelerated graphics port, and a processor or local bus using any of a variety of bus architectures.
The electronic device 500 may also communicate with one or more external devices 100 (e.g., keyboard, pointing device, bluetooth device, etc.), one or more devices that enable a user to interact with the electronic device 500, and/or any device (e.g., router, modem, etc.) that enables the electronic device 500 to communicate with one or more other computing devices. Such communication may occur through an input/output (I/O) interface 550. Also, electronic device 500 may communicate with one or more networks such as a Local Area Network (LAN), a Wide Area Network (WAN), and/or a public network, such as the Internet, through network adapter 560. The network adapter 560 may communicate with other modules of the electronic device 500 via the bus 530. It should be appreciated that although not shown in fig. 5, other hardware and/or software modules may be used in connection with electronic device 500, including, but not limited to: microcode, device drivers, redundant processing units, external disk drive arrays, RAID systems, tape drives, data backup storage systems, and the like.
From the above description of the embodiments, those skilled in the art will readily appreciate that the exemplary embodiments described herein may be implemented in software, or in software combined with the necessary hardware. Thus, the technical solution according to the embodiments of the present invention may be embodied in the form of a software product, which may be stored in a computer readable storage medium (such as a CD-ROM, a USB disk or a mobile hard disk) or on a network, and which includes several instructions that cause a computing device (such as a personal computer, a server or a network device) to perform the above-mentioned method according to the present invention. When the computer program stored on the computer readable medium is executed by a data processing device, the above-described method of the present invention, such as the method shown in fig. 1 or fig. 2, is carried out.
Fig. 6 is a schematic diagram of a computer readable medium according to an embodiment of the present disclosure.
A computer program implementing the method shown in fig. 1 or fig. 2 may be stored on one or more computer readable media. The computer readable medium may be a readable signal medium or a readable storage medium. The readable storage medium can be, for example, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or a combination of any of the foregoing. More specific examples (a non-exhaustive list) of the readable storage medium include the following: an electrical connection having one or more wires, a portable disk, a hard disk, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), optical fiber, portable compact disk read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
The computer readable signal medium may include a data signal propagated in baseband or as part of a carrier wave, with readable program code embodied therein. Such a propagated data signal may take any of a variety of forms, including, but not limited to, electromagnetic, optical, or any suitable combination of the foregoing. A readable signal medium may also be any readable medium, other than a readable storage medium, that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Program code embodied on a readable medium may be transmitted using any appropriate medium, including but not limited to wireless, wireline, optical fiber cable, RF, etc., or any suitable combination of the foregoing.
Program code for carrying out operations of the present invention may be written in any combination of one or more programming languages, including an object oriented programming language such as Java, C++ or the like and conventional procedural programming languages, such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computing device, partly on the user's computing device, as a stand-alone software package, partly on the user's computing device and partly on a remote computing device, or entirely on the remote computing device or server. Where a remote computing device is involved, the remote computing device may be connected to the user's computing device through any kind of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or may be connected to an external computing device (e.g., via the Internet using an Internet service provider).
In summary, the invention may be implemented in hardware, or in software modules running on one or more processors, or in a combination thereof. Those skilled in the art will appreciate that some or all of the functionality of some or all of the components in accordance with embodiments of the present invention may be implemented in practice using a general purpose data processing device such as a microprocessor or Digital Signal Processor (DSP). The present invention can also be implemented as an apparatus or device program (e.g., a computer program and a computer program product) for performing a portion or all of the methods described herein. Such a program embodying the present invention may be stored on a computer readable medium, or may have the form of one or more signals. Such signals may be downloaded from an internet website, provided on a carrier signal, or provided in any other form.
The above-described specific embodiments further describe the objects, technical solutions and advantageous effects of the present invention in detail, and it should be understood that the present invention is not inherently related to any particular computer, virtual device or electronic apparatus, and various general-purpose devices may also implement the present invention. The foregoing description of the embodiments of the invention is not intended to be limiting, but rather is intended to cover all modifications, equivalents, alternatives, and improvements that fall within the spirit and scope of the invention.
In this specification, the embodiments are described progressively: each embodiment mainly describes its differences from the other embodiments, and for the parts that are identical or similar, the embodiments may be referred to one another.
The foregoing is merely exemplary of the present application and is not intended to limit the present application. Various modifications and changes may be made to the present application by those skilled in the art. Any modifications, equivalent substitutions, improvements, etc. which are within the spirit and principles of the present application are intended to be included within the scope of the claims of the present application.
Claims (10)
1. A method for predicting risk of progression of chronic kidney disease, comprising:
training based on the historical medical record information of chronic kidney disease patients to obtain a progress prediction model;
acquiring sample data of a patient to be predicted, and inputting the sample data into the progress prediction model to perform risk prediction to obtain a risk score;
invoking a machine learning interpretation tool to evaluate the progress prediction model and the obtained risk score, and obtaining a contribution value of each evaluation index in the sample data to the risk score;
and generating a chronic kidney disease progression risk prediction result according to the risk score and the contribution value of each evaluation index to the risk score.
2. The method of claim 1, wherein the training based on the historical medical record information of chronic kidney disease patients to obtain a progress prediction model comprises:
acquiring historical medical record information of a chronic kidney disease patient, and generating a training sample based on the historical medical record information and a corresponding diagnosis result;
constructing a plurality of initial models based on a plurality of deep learning algorithms, and training each initial model according to the training samples to obtain a plurality of trained candidate models;
and evaluating and screening the prediction effect of each candidate model based on the training sample, and selecting the candidate model with the optimal prediction effect as a progress prediction model.
3. The method according to claim 2, further comprising, before the acquiring of historical medical record information of the chronic kidney disease patient and the generating of the training sample based on the historical medical record information and the corresponding diagnosis result:
calling a regularization-based feature sparsification algorithm and a random forest recursive feature elimination algorithm to construct a plurality of initial classification models;
inputting the historical medical record information into the initial classification models, calling a grid search algorithm to tune the hyperparameters of each initial classification model, and selecting the initial classification model with the largest area under the receiver operating characteristic (ROC) curve as a feature screening model;
ranking the medical features by importance according to the feature screening model to obtain a medical feature sequence;
and screening out, from the medical feature sequence, the combination of medical features whose importance weights are greater than a preset threshold to obtain the evaluation indexes.
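The last two steps of claim 3 (importance ranking, then thresholding) can be sketched as follows. This is only an illustration: the importance weights are invented numbers standing in for the output of the feature screening model, and the threshold value and the name `screen_features` are assumptions.

```python
def screen_features(importances, threshold):
    """Rank features by importance weight and keep those above the threshold.

    `importances` maps a medical feature name to its importance weight, as it
    might be reported by a fitted feature screening model (hypothetical here).
    """
    # Medical feature sequence: most important feature first.
    ranked = sorted(importances.items(), key=lambda item: item[1], reverse=True)
    # Evaluation indexes: features whose weight is strictly greater than the threshold.
    selected = [name for name, weight in ranked if weight > threshold]
    return ranked, selected


# Invented importance weights, for illustration only.
importances = {
    "age": 0.21,
    "hemoglobin": 0.17,
    "sodium": 0.02,
    "albumin": 0.30,
    "folic_acid": 0.05,
}
ranked, evaluation_indexes = screen_features(importances, threshold=0.05)
```

With these numbers the retained evaluation indexes are albumin, age and hemoglobin; folic acid sits exactly at the threshold and is dropped because the claim requires a weight greater than the threshold.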
4. The method of claim 3, wherein the obtaining historical medical record information of the chronic kidney disease patient, and generating training samples based on the historical medical record information and corresponding diagnostic results comprises:
collecting electronic medical record data of a chronic kidney disease patient confirmed by pathology from a hospital electronic medical record platform;
extracting medical features from the electronic medical record data to obtain medical features and corresponding medical feature values, and preprocessing the medical features and the corresponding medical feature values;
and labeling the preprocessed medical features and corresponding medical feature values with a kidney function decline grade and a risk prognosis classification to obtain training samples.
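A minimal sketch of the preprocessing and labeling steps of claim 4 is given below. The claim does not specify the preprocessing method or the grading rule, so everything concrete here is an assumption: median imputation is used as the preprocessing step, `decline_pct` is a hypothetical field holding the percentage decline in kidney function, and the cutoffs of 10 and 30 are placeholder values, not clinical thresholds.

```python
from statistics import median

FEATURES = ["hemoglobin", "albumin"]  # two of the indexes listed in claim 6


def impute_median(records, features):
    """Fill missing (None) feature values with the median of the observed values."""
    medians = {}
    for f in features:
        observed = [r[f] for r in records if r.get(f) is not None]
        medians[f] = median(observed)
    filled = []
    for r in records:
        row = dict(r)
        for f in features:
            if row.get(f) is None:
                row[f] = medians[f]
        filled.append(row)
    return filled


def label_decline(decline_pct, cutoffs=(10.0, 30.0)):
    """Map a decline percentage to grade 0, 1 or 2 using placeholder cutoffs."""
    return sum(decline_pct >= c for c in cutoffs)


def build_training_samples(records, features=FEATURES, high_risk_grade=2):
    """Preprocess the records, then attach a decline grade and a risk class."""
    samples = []
    for r in impute_median(records, features):
        grade = label_decline(r["decline_pct"])
        samples.append({
            "x": [r[f] for f in features],
            "grade": grade,
            "high_risk": int(grade >= high_risk_grade),
        })
    return samples


# Invented records, for illustration only.
records = [
    {"hemoglobin": 130.0, "albumin": 41.0, "decline_pct": 4.0},
    {"hemoglobin": None, "albumin": 35.0, "decline_pct": 18.0},
    {"hemoglobin": 98.0, "albumin": None, "decline_pct": 42.0},
    {"hemoglobin": 120.0, "albumin": 38.0, "decline_pct": 31.0},
]
training_samples = build_training_samples(records)
```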
5. The method according to claim 3, wherein the evaluating and screening the prediction effect of each candidate model based on the training samples, and selecting a candidate model with the optimal prediction effect as the progress prediction model comprises:
inputting the training samples into each candidate model, and tuning the hyperparameters of each candidate model using a grid search algorithm;
calculating the area under the receiver operating characteristic (ROC) curve of each candidate model;
and selecting the candidate model with the optimal prediction effect as the progress prediction model according to the area under the ROC curve of each candidate model.
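The selection procedure of claim 5 can be sketched as a grid search scored by the area under the ROC curve. The sketch below is a simplification under stated assumptions: the two candidate families (`linear_model`, `ratio_model`) are hypothetical fixed scoring rules rather than trained deep learning models, their hyperparameter grids and the data are invented, and the AUC is computed directly from its pairwise definition.

```python
from itertools import product


def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical candidate families; each builder returns a scoring function.
def linear_model(w_urea, w_albumin):
    return lambda x: w_urea * x["urea_nitrogen"] + w_albumin * x["albumin"]


def ratio_model(eps):
    return lambda x: x["urea_nitrogen"] / (x["albumin"] + eps)


CANDIDATES = {
    "linear": (linear_model, {"w_urea": [0.0, 1.0], "w_albumin": [-1.0, 0.0, 1.0]}),
    "ratio": (ratio_model, {"eps": [1.0, 10.0]}),
}


def grid_search(build, grid, samples, labels):
    """Try every hyperparameter combination and keep the one with the highest AUC."""
    best_auc, best_params = -1.0, None
    keys = list(grid)
    for values in product(*(grid[k] for k in keys)):
        params = dict(zip(keys, values))
        model = build(**params)
        auc = roc_auc(labels, [model(x) for x in samples])
        if auc > best_auc:
            best_auc, best_params = auc, params
    return best_auc, best_params


# Invented training samples, for illustration only (1 = disease progressed).
samples = [
    {"urea_nitrogen": 5.0, "albumin": 42.0},
    {"urea_nitrogen": 7.0, "albumin": 40.0},
    {"urea_nitrogen": 12.0, "albumin": 33.0},
    {"urea_nitrogen": 15.0, "albumin": 30.0},
    {"urea_nitrogen": 6.5, "albumin": 31.0},
    {"urea_nitrogen": 9.5, "albumin": 44.0},
]
labels = [0, 0, 1, 1, 1, 0]

results = {
    name: grid_search(build, grid, samples, labels)
    for name, (build, grid) in CANDIDATES.items()
}
best_name = max(results, key=lambda name: results[name][0])
best_auc, best_params = results[best_name]
```

Scoring on the same samples used for tuning, as the claim describes, tends to overstate performance; a held-out or cross-validated split would be the more cautious choice in practice.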
6. The method of predicting risk of progression of chronic kidney disease according to any one of claims 1-5, wherein the assessment indicator comprises:
age, hemoglobin, N-terminal osteocalcin, 25-hydroxyvitamin D, total protein, glutamic pyruvic transaminase, apolipoprotein E, thyroxine, carbohydrate antigen 19-9, mean corpuscular hemoglobin concentration, unsaturated iron binding capacity, blood homocysteine, albumin, large platelet ratio, total bilirubin, creatine kinase isoenzyme, red blood cell distribution width, IgG antibodies, mean corpuscular volume, folic acid, sodium, urea nitrogen, plasma proteins, neuron-specific enolase, serum total complement activity, neutrophil count, total bile acid, chloride, glycosylated albumin, low density lipoprotein cholesterol, non-high density lipoprotein cholesterol.
7. The method according to claim 6, further comprising, after the generation of the chronic kidney disease progression risk prediction result from the risk score and the contribution value of each of the evaluation indexes to the risk score:
and updating the importance weight of the evaluation index according to the contribution value of the evaluation index to the risk score.
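The claim does not say how the importance weights are updated from the contribution values. One possible rule, shown purely as an assumption, blends each old weight with that index's share of the total absolute contribution; the name `update_importance`, the blending rate and the numbers are all hypothetical.

```python
def update_importance(weights, contributions, rate=0.1):
    """Blend each importance weight with the index's share of absolute contribution.

    new_weight = (1 - rate) * old_weight + rate * |contribution| / sum(|contributions|)
    This exponential-smoothing rule is an illustrative assumption, not the patent's rule.
    """
    total = sum(abs(v) for v in contributions.values())
    if total == 0:
        return dict(weights)  # nothing to learn from an all-zero attribution
    updated = {}
    for name, old in weights.items():
        share = abs(contributions.get(name, 0.0)) / total
        updated[name] = (1 - rate) * old + rate * share
    return updated


# Invented weights and contribution values, for illustration only.
weights = {"age": 0.5, "albumin": 0.3, "urea_nitrogen": 0.2}
contributions = {"age": 0.02, "albumin": 0.06, "urea_nitrogen": 0.12}
new_weights = update_importance(weights, contributions, rate=0.5)
```

If the old weights sum to 1 and every contribution belongs to a weighted index, the updated weights also sum to 1, so they remain comparable against a fixed screening threshold.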
8. A chronic kidney disease progression risk prediction apparatus, characterized in that the chronic kidney disease progression risk prediction apparatus comprises:
the training module is used for training based on the historical medical record information of chronic kidney disease patients to obtain a progress prediction model;
the scoring module is used for acquiring sample data of a patient to be predicted, inputting the sample data into the progress prediction model for risk prediction, and obtaining a risk score;
the analysis module is used for calling a machine learning interpretation tool to evaluate the progress prediction model and the obtained risk scores, and obtaining contribution values of all evaluation indexes in the sample data to the risk scores;
and the result output module is used for generating a chronic kidney disease progression risk prediction result according to the risk score and the contribution value of each evaluation index to the risk score.
9. A chronic kidney disease progression risk prediction device, characterized in that the chronic kidney disease progression risk prediction device comprises: a memory and at least one processor, the memory having instructions stored therein;
the at least one processor invoking the instructions in the memory to cause the chronic kidney disease progression risk prediction device to perform the steps of the chronic kidney disease progression risk prediction method of any one of claims 1-7.
10. A computer readable storage medium having instructions stored thereon, which when executed by a processor, implement the steps of the chronic kidney disease progression risk prediction method of any one of claims 1-7.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202311705794.7A CN117690591A (en) | 2023-12-12 | 2023-12-12 | Method, device, equipment and storage medium for predicting chronic kidney disease progression risk |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202311705794.7A CN117690591A (en) | 2023-12-12 | 2023-12-12 | Method, device, equipment and storage medium for predicting chronic kidney disease progression risk |
Publications (1)
Publication Number | Publication Date |
---|---|
CN117690591A true CN117690591A (en) | 2024-03-12 |
Family
ID=90128017
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202311705794.7A Pending CN117690591A (en) | 2023-12-12 | 2023-12-12 | Method, device, equipment and storage medium for predicting chronic kidney disease progression risk |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN117690591A (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN117710066A (en) * | 2024-02-05 | 2024-03-15 | 厦门傲凡科技股份有限公司 | Financial customer recommendation method and system |
-
2023
- 2023-12-12 CN CN202311705794.7A patent/CN117690591A/en active Pending
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||