CN117954089A - Method and device for constructing chronic kidney disease traditional Chinese and Western medicine chronic disease management effect prediction model

Info

Publication number
CN117954089A
Authority
CN
China
Prior art keywords
model
decision tree
western medicine
traditional chinese
disease management
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202410060582.6A
Other languages
Chinese (zh)
Inventor
吴一帆
张显龙
陈惠芬
傅立哲
唐芳
刘旭生
卢富华
张敏
古月瑜
许苑
张腊
吕莹
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangdong Hospital of Traditional Chinese Medicine
Original Assignee
Guangdong Hospital of Traditional Chinese Medicine

Classifications

    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02A: TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
    • Y02A90/00: Technologies having an indirect contribution to adaptation to climate change
    • Y02A90/10: Information and communication technologies [ICT] supporting adaptation to climate change, e.g. for weather forecasting or climate simulation

Landscapes

  • Medical Treatment And Welfare Office Work (AREA)

Abstract

The invention discloses a method and a device for constructing a prediction model for the effect of traditional Chinese and Western medicine chronic disease management in chronic kidney disease (CKD). A data acquisition module acquires case data and constructs a case data set containing traditional Chinese and Western medicine case information; a model construction module constructs, from the case data set, a decision-tree-based CKD traditional Chinese and Western medicine chronic disease management effect prediction model and uses it to predict whether an endpoint event occurs; and a model evaluation module evaluates the effect of the prediction model in a preset manner. The invention comprehensively collects multi-dimensional, multi-item, large-sample clinical features of traditional Chinese and Western medicine and provides more detailed information on the factors influencing CKD prognosis, so that the efficacy of chronic disease management can be predicted more accurately. The resulting prediction model carries the characteristics of traditional Chinese medicine and integrates multi-dimensional prognostic factors.

Description

Method and device for constructing chronic kidney disease traditional Chinese and western medicine chronic disease management effect prediction model
Technical Field
The invention relates to the technical field of traditional Chinese medicines, in particular to a method and a device for constructing a chronic kidney disease traditional Chinese and Western medicine chronic disease management effect prediction model.
Background
Chronic kidney disease (CKD) is a global public health problem. Big data and artificial intelligence (AI) are assisting kidney disease research, improving the clinical practice of nephrology and helping patients manage their disease better. AI is increasingly applied in nephrology research. In disease prediction, the work mainly involves hospitalized patients who develop acute kidney injury (AKI) and diabetic patients who develop diabetic nephropathy. In diagnosis, it mainly covers identifying diabetic nephropathy from biopsy images, screening CKD patients from ultrasound or retinal images, identifying renal cell carcinoma from magnetic resonance imaging (MRI) or computed tomography (CT) images, and monitoring arteriovenous fistula dysfunction in real time. In prognosis prediction, it mainly covers the start of dialysis in patients with CKD, diabetic nephropathy, or IgA nephropathy, death in dialysis patients, and survival in kidney transplant patients. In treatment management, it covers anemia management and heparin dosing in hemodialysis patients, formulation of the dialysis regimen in peritoneal dialysis patients, and tacrolimus dosing in kidney transplant patients. Other prediction tasks include arteriovenous fistula failure in hemodialysis, acute rejection after kidney transplantation, the pathogenic microorganisms of acute peritonitis, and parathyroid hormone (iPTH) levels in hemodialysis (HD) patients.
A systematic review was used to search and collate the research evidence on AI-based models for predicting disease progression in non-dialysis CKD patients. It found that existing CKD prediction model research has the following shortcomings:
(1) The collection of influencing factors lacks comprehensiveness
In terms of predictors, most studies collect mainly demographics, blood and urine test results, medical history, histological examination, oral Western medicines, and the like. Many other factors also influence the course of kidney disease and should be included: traditional Chinese medicine factors such as oral decoctions, Chinese patent medicines, enemas, and traditional Chinese medicine syndromes, and other factors such as self-management, a high-quality low-protein diet, exercise, sleep, disease cognition, and patient education. If self-management factors such as diet, exercise, education, and sleep are used to build the prediction model, the model is better suited to clinical practice and helps medical staff manage patients. Future prospective data collection for AI modeling should therefore gather such data on CKD patients as comprehensively as possible.
(2) A sample size calculation guideline for AI prediction models has yet to be established
AI is superior to traditional statistical methods in handling large data sets with many features, but it needs enough training data to prevent model errors and biased output; small training sets are prone to overfitting, especially with complex models. In conventional regression analysis, a ratio of about 10 events (outcomes) per feature (variable) usually ensures an adequate sample size, but this is often insufficient for more complex models. There is no universally accepted sample size calculation guideline for AI algorithms; in general, the more complex the model and the more features and parameters it has, the larger the training data set must be. Establishing such a guideline is therefore a direction for future work.
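The 10-to-1 rule of thumb for conventional regression mentioned above can be illustrated with a short calculation. This is a minimal sketch: the function name `min_sample_size` and the example numbers are hypothetical and do not come from the patent.

```python
import math

def min_sample_size(n_features: int, event_rate: float,
                    events_per_variable: int = 10) -> int:
    """Smallest cohort that yields `events_per_variable` events per feature."""
    if not 0 < event_rate <= 1:
        raise ValueError("event_rate must be in (0, 1]")
    required_events = n_features * events_per_variable
    return math.ceil(required_events / event_rate)

# Illustrative numbers only: 20 candidate features, 25% of patients reach the endpoint.
n_needed = min_sample_size(20, 0.25)
print(n_needed)  # 800
```

As the text notes, this ratio is only a lower bound for regression and is often insufficient for more complex models.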
(3) Lack of effective internal and external validation
In terms of model validation, most studies use a split-sample method for internal validation, in which the original data set is randomly split into a training set and a validation set. This only creates two similar but smaller data sets that differ by chance alone, not in time or place, so it is a statistically inefficient and weak form of validation. For small data sets, sample splitting actually increases the risk of bias; for large data sets, it brings no real benefit.
(4) Model results lack interpretability
In terms of model performance, most of the prediction models developed in these studies perform well, so constructing a CKD prediction model with AI is feasible. Performance comparisons among machine learning models show that AI algorithms such as artificial neural networks, random forests, and XGBoost achieve high accuracy. Although these models perform very well, their results often cannot be explained; some studies involve hundreds of predictors, and it is unknown which predictors drive the risk prediction. This makes the models hard to use and promote in clinical practice, and it is also why most studies developed a prediction model but could not present the results in a visual form such as a web-based clinical decision system or calculator.
(5) Limited evidence of clinical practice
AI models offer a reliable and accurate method for personalized prediction of prognosis in CKD patients. AI algorithms can help clinicians identify, at an early stage, patients at high risk of renal function progression so that they receive timely treatment and management. In actual clinical practice, however, AI-developed prediction models are rarely applied, so the evidence from clinical use is also very limited.
Disclosure of Invention
The invention provides a method and a device for constructing a CKD traditional Chinese and Western medicine chronic disease management effect prediction model, which can effectively improve on the accuracy of conventional prediction of disease progression risk in CKD patients participating in chronic disease management.
In a first aspect, the application provides a method for constructing a CKD traditional Chinese and Western medicine chronic disease management effect prediction model, comprising the following steps:
a data acquisition module acquires case data and constructs a case data set containing traditional Chinese and Western medicine case information;
a model construction module constructs, from the case data set, a decision-tree-based CKD traditional Chinese and Western medicine chronic disease management effect prediction model, and predicts through the model whether an endpoint event occurs; the prediction model comprises a plurality of decision tree models for different years; each annual decision tree model is trained with that year's equalized (class-balanced) sample data as the training set and tested on a test set that has not been equalized, yielding the main variables of that year's decision tree model, and the main variables are used to judge whether an endpoint event occurs in the sample data;
and a model evaluation module evaluates the effect of the CKD traditional Chinese and Western medicine chronic disease management effect prediction model in a preset manner.
In some embodiments, the model construction module constructing the CKD traditional Chinese and Western medicine chronic disease management effect prediction model from the case data set comprises:
creating a test set and an equalized training set from the case data set using sampling with replacement;
and constructing the decision-tree-based CKD traditional Chinese and Western medicine chronic disease management effect prediction model and training it on the training set to complete its construction.
In some embodiments, constructing the decision-tree-based CKD traditional Chinese and Western medicine chronic disease management effect prediction model comprises:
determining the best feature as the root node, the best feature being the one that separates the case data set into different categories as far as possible;
dividing the case data set into a plurality of subsets, determining the best feature in each subset, and generating the branches of the decision tree;
reducing the size of the decision tree through a pruning operation;
determining the category of each instance of the training set through the branches of the decision tree to complete construction of the decision tree model; the category is whether an endpoint event occurs.
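The root-node step above can be sketched as a search for the single feature and threshold that best separate the two outcome classes. The patent does not state which splitting criterion is used; Gini impurity is assumed here purely for illustration, and the function names and toy data are hypothetical.

```python
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels (0.0 means a pure node)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def best_split(rows, labels):
    """Return (feature_index, threshold, weighted_gini) of the best binary split.

    rows: equal-length lists of numeric features; labels: 0/1 endpoint flags.
    Returns None if no split produces two non-empty subsets.
    """
    best = None
    for f in range(len(rows[0])):
        for threshold in sorted({r[f] for r in rows}):
            left = [y for r, y in zip(rows, labels) if r[f] <= threshold]
            right = [y for r, y in zip(rows, labels) if r[f] > threshold]
            if not left or not right:
                continue  # a split must produce two non-empty subsets
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
            if best is None or score < best[2]:
                best = (f, threshold, score)
    return best

# Toy data (made up): columns are [eGFR, urea]; label 1 = endpoint event.
rows = [[12, 20], [15, 9], [45, 8], [60, 22], [30, 7], [10, 25]]
labels = [1, 1, 0, 0, 0, 1]
root = best_split(rows, labels)
print(root)  # (0, 15, 0.0): split on eGFR <= 15
```

Applying the same search recursively to each subset would generate the branches; pruning is not shown in this sketch.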
In some embodiments, the CKD traditional Chinese and Western medicine chronic disease management effect prediction model comprises a first-year decision tree model, a second-year decision tree model, a third-year decision tree model, and a fourth-year decision tree model;
the first-year decision tree model is a decision tree trained with the equalized first-year sample data as the training set; the second-year decision tree model is a decision tree trained with the equalized second-year sample data as the training set; the third-year decision tree model is a decision tree trained with the equalized third-year sample data as the training set; and the fourth-year decision tree model is a decision tree trained with the equalized fourth-year sample data as the training set.
In some embodiments, the main variables of the first-year decision tree model include one or more of the following: eGFR, urea, Hb, total traditional Chinese medicine syndrome differentiation score, listlessness and debilitation, sex, exercise cognition item score, PRO;
the main variables of the second-year decision tree model include one or more of the following: eGFR, urea, ALB, age, kidney-injuring drug cognition item score, urine protein/creatinine ratio, damp-heat syndrome, listlessness and debilitation, TCO2, LDL-C;
the main variables of the third-year decision tree model include one or more of the following: urea, ALB, item score for cognition of the causes of kidney disease exacerbation;
the main variables of the fourth-year decision tree model include one or more of the following: eGFR, damp-heat syndrome, TG, nocturia.
In some embodiments, the data acquisition module includes a data acquisition unit and a data processing unit. The data acquisition unit acquires the case data, and the data processing unit cleans the case data, the data cleaning including one or more of the following: removing variables with missing values, equalization (class balancing), correlation analysis, variable classification, variable merging, or variable assignment.
In some embodiments, evaluating the effect of the CKD traditional Chinese and Western medicine chronic disease management effect prediction model in a preset manner comprises:
collecting case information, calculating the prediction accuracy, sensitivity, and specificity of the prediction model, and evaluating its prediction effect according to the prediction accuracy.
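The three evaluation measures named above follow from the confusion matrix of predicted versus observed endpoint events. The sketch below uses a hypothetical helper `evaluate` and made-up labels; it is not the patent's evaluation code.

```python
def evaluate(y_true, y_pred):
    """Accuracy, sensitivity and specificity for 0/1 labels (1 = endpoint event)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    def ratio(num, den):
        # Undefined when a class is absent from the test set.
        return num / den if den else float("nan")

    return {
        "accuracy": ratio(tp + tn, len(pairs)),
        "sensitivity": ratio(tp, tp + fn),  # share of true events detected
        "specificity": ratio(tn, tn + fp),  # share of non-events correctly excluded
    }

# Made-up test-set labels and predictions.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 0, 1, 0]
metrics = evaluate(y_true, y_pred)
print(metrics)
```

Because the test set is not equalized, accuracy alone can look high when events are rare, which is one reason sensitivity and specificity are reported alongside it.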
In some embodiments, the case data includes one or more of the following: basic information, diagnosis and treatment, nutritional status, traditional Chinese and Western medicine influencing factors, and clinical prognostic outcome.
In a second aspect, the application provides a CKD traditional Chinese and Western medicine chronic disease management effect prediction model device, comprising: a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor implements the above construction method when executing the computer program.
The method and device provided by the invention comprehensively collect multi-dimensional, multi-item, large-sample clinical features of traditional Chinese and Western medicine and provide more detailed information on the factors influencing CKD prognosis, so that the chronic disease management effect can be predicted more accurately; the resulting prediction model carries the characteristics of traditional Chinese medicine and integrates multi-dimensional prognostic factors. The model is built with a decision tree, an artificial intelligence method that is not restricted by statistical conditions and can be adjusted repeatedly according to clinical practice and the needs of medical workers so as to meet clinical work requirements. The results of a decision tree model are more interpretable than those of other machine learning algorithms. When applied clinically, the model built by this method shows good prediction performance, has high clinical value and significance, and provides evidence for the application of AI in clinical practice.
Drawings
FIG. 1 is a schematic flow chart of a method for constructing a model for predicting chronic kidney disease management effect of traditional Chinese and Western medicine;
FIG. 2 is a schematic diagram of a first annual decision tree model of a method for constructing a model for predicting chronic kidney disease management effects of traditional Chinese and Western medicine;
FIG. 3 is a schematic diagram of a second annual decision tree model of the method for constructing a model for predicting chronic kidney disease management effect of traditional Chinese and Western medicine;
FIG. 4 is a schematic diagram of a third annual decision tree model of a method for constructing a model for predicting chronic kidney disease management effect of traditional Chinese and Western medicine;
FIG. 5 is a schematic diagram of a fourth annual decision tree model of a method for constructing a model for predicting chronic kidney disease management effects of traditional Chinese and Western medicine;
FIG. 6 shows the area under the receiver operating characteristic curve (AUC) of the decision tree models, built by the construction method provided by the invention, for predicting endpoint events of CKD patients at 1-4 years (test set);
FIG. 7 is a first schematic interface diagram of the chronic kidney disease traditional Chinese and Western medicine chronic disease management effect prediction model device;
FIG. 8 is a second schematic interface diagram of the chronic kidney disease traditional Chinese and Western medicine chronic disease management effect prediction model device provided by the invention;
FIG. 9 is a third schematic interface diagram of the chronic kidney disease traditional Chinese and Western medicine chronic disease management effect prediction model device provided by the invention;
FIG. 10 is a fourth schematic interface diagram of the chronic kidney disease traditional Chinese and Western medicine chronic disease management effect prediction model device.
Detailed Description
The invention is described in further detail below with reference to the accompanying drawings.
As shown in FIG. 1, the application provides a method and a device for constructing a chronic kidney disease traditional Chinese and Western medicine chronic disease management effect prediction model. The model is not restricted by statistical conditions and can be adjusted repeatedly according to clinical practice and the needs of medical workers, so as to meet clinical work requirements and improve prediction accuracy.
Specifically, the construction method comprises the following steps:
Step S1, obtaining case data and constructing a case data set containing traditional Chinese and Western medicine case information;
The medical record data must undergo a series of processing operations before it can be used to construct a model. First, case data are obtained: subjects are screened, and cases meeting the criteria are selected from the kidney disease chronic disease management database according to the inclusion and exclusion criteria. A total of 2701 CKD patients who attended the kidney disease chronic disease management clinic from April 25, 2016 to May 14, 2021 and agreed to join chronic disease management were screened from the database, and 1373 CKD patients were finally included according to the inclusion and exclusion criteria described below. Inclusion criteria: age of at least 18 years; at least two creatinine results, with the follow-up time and the interval between the first and last creatinine measurements exceeding 6 months; a definite diagnosis of CKD with reference to the 2012 KDIGO guideline; eGFR > 5 mL/(min·1.73 m²); and signed informed consent for chronic disease management, including consent to use of the data.
Exclusion criteria: renal replacement therapy (dialysis or transplantation) already performed at the first visit; a diagnosed history of renal malignancy, or acute death from a clearly extrarenal disease (e.g., acute lung infection); severely incomplete medical record data (incomplete baseline scales or more than 30% of test indicators missing).
Second, clinical data are collected for the screened subjects. The nephrology chronic disease management clinic has fixed follow-up, health education, and evaluation procedures for CKD patients. Patients attending for a first or a regular visit are enrolled in chronic disease management if they are willing; they sign the informed consent, their basic information is collected, and a traditional Chinese medicine chronic disease management plan is formulated. Thereafter, follow-up with treatment adjustment and health education takes place every other month, laboratory indicators are collected every 3-6 months, and the scales are completed every 6 months. The time of the creatinine measurement within the half year after enrollment in chronic disease management is taken as the baseline time.
Clinical data include basic information, laboratory test results, clinical diagnosis and treatment, scales or questionnaires, nutritional assessment, and clinical endpoint events. Each patient who agrees to join chronic disease management fills out a basic information questionnaire covering: sex, age, body mass index (BMI), working status, blood pressure, payment method, marital status, alcohol use, smoking, education, self-care ability, primary disease, CKD stage, and past history (hypertension, diabetes, hyperuricemia, hyperlipidemia, cardiovascular disease, urinary calculi, etc.). Primary disease and past history are first self-reported by the patient and then verified by researchers against the patient's history, examination results, and oral medication recorded in the hospital information system (HIS).
Laboratory test results are obtained from the hospital laboratory system, mainly those from the time the patient was enrolled in chronic disease management. They include: urinalysis (urine occult blood BLD, urine protein PRO), complete blood count (white blood cell count WBC, neutrophil percentage NEUT%, lymphocyte percentage LYMPH%, red blood cell count RBC, hemoglobin Hb, hematocrit HCT, platelet count PLT), liver function (alanine aminotransferase ALT, aspartate aminotransferase AST, albumin ALB), kidney function (urea, creatinine Cr, total carbon dioxide TCO2, uric acid UA), blood lipids (triglycerides TG, total cholesterol TC, high-density lipoprotein cholesterol HDL-C, low-density lipoprotein cholesterol LDL-C), glucose GLU, urine protein/urine creatinine ratio, and electrolytes (potassium K+, sodium Na+, calcium Ca2+, phosphorus P).
Clinical diagnosis and treatment data are the traditional Chinese and Western medication records within three months after the patient's baseline time, obtained from the outpatient diagnosis and treatment system. They record whether the patient took: traditional Chinese medicine decoctions, diuretics, ACEI/ARB antihypertensives, CCB antihypertensives, alpha-blockers, beta-blockers, hypoglycemic agents (insulin, biguanides, glinides, DPP-4 inhibitors, thiazolidinediones, sulfonylureas, SGLT-2 inhibitors, alpha-glucosidase inhibitors), lipid-lowering agents (statins, fibrates), uric-acid-lowering agents (febuxostat, benzbromarone, allopurinol), sodium bicarbonate tablets, erythropoietin (EPO), polysaccharide iron, keto acid tablets, folic acid tablets, calcium supplements (calcitriol, calQi), hormones, traditional Chinese medicine immunosuppressants (tripterygium wilfordii, kunzan capsules), Western medicine immunosuppressants (cyclosporine, leflunomide tablets, mycophenolate mofetil, tacrolimus, cyclophosphamide), turbidity-reducing Chinese patent medicines (uremic toxin, sea kunkenxi, rennin), tonifying Chinese patent medicines (gold water trenbao, bailing capsules), three astragalus oral liquid, nephritis, yellow sunflower capsules, and other Chinese medicines including compound tablet, kidney-aging granule, and kidney-tonifying granule, as well as other drugs.
The scales and questionnaires comprise the CKD syndrome differentiation scale, the kidney disease cognition questionnaire, and the disease management questionnaire. The CKD syndrome differentiation scale is based on the graded quantitative table of traditional Chinese medicine symptoms of chronic renal failure in the Guiding Principles for Clinical Research of New Chinese Medicines (2002). The primary syndromes include spleen-kidney qi deficiency, liver-kidney yin deficiency, spleen-kidney yang deficiency, deficiency of both qi and yin, and deficiency of both yin and yang; the concurrent syndromes include damp-turbidity syndrome, damp-heat syndrome, heat-toxin syndrome, and blood stasis syndrome. A primary syndrome is judged from its primary and secondary symptoms, and a concurrent syndrome is judged when two of its symptoms are present. The kidney disease cognition questionnaire is a disease cognition measurement tool developed by Wu Yifan et al. for non-dialysis CKD patients on a five-point Likert scale, in which a higher score indicates better understanding of the disease. The scale has 18 items covering the patient's awareness of the disease, diet, exercise, medication, and examination results. Its reliability index, Cronbach's alpha, is 0.946. The disease management questionnaire is a comprehensive questionnaire drawn up by the department's medical staff to investigate whether patients, in daily life, take medicine on time, learn about kidney disease, control their diet, pay attention to exercise, and control blood pressure and blood sugar, and what their mood is like.
Whether medicine is taken on time is evaluated with the 4-item Morisky Medication Adherence Scale (MMAS-4), which has a Cronbach's alpha of 0.61. The scale has 4 items, each answered yes or no; compliance is judged good only if all 4 answers are no, and poor otherwise.
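The MMAS-4 rule described above reduces to a single check. The function name and the encoding of answers as booleans are illustrative choices, not part of the patent.

```python
def mmas4_good_compliance(answers):
    """answers: four booleans, True meaning the patient answered "yes".

    Compliance is good only when every one of the four answers is "no".
    """
    if len(answers) != 4:
        raise ValueError("MMAS-4 has exactly 4 items")
    return not any(answers)

print(mmas4_good_compliance([False, False, False, False]))  # True
print(mmas4_good_compliance([False, True, False, False]))   # False
```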
The nutritional assessment is a body composition analysis (Inbody) performed every 3-6 months with the body composition analyzer Inbody-770. The clinical endpoint event is renal replacement therapy (peritoneal dialysis, hemodialysis, or kidney transplantation) or eGFR < 5 mL/(min·1.73 m²).
After the data are acquired, the case data are cleaned. Data cleaning comprises one or more of the following: removing variables with missing values, equalization (class balancing), correlation analysis, variable classification, variable merging, or variable assignment.
Variables with missing values are removed because missing values in part of the data readily degrade the model's predictions. The deletions are shown in Table 1:
Table 1 List of deleted variables
Systolic pressure, diastolic pressure, Ca2+, P, and working status each had more than 30% missing values and were deleted; glucose, whose missing rate only just exceeded 30%, was retained for its clinical significance. Self-care ability and tongue body were deleted because a single category accounted for more than 90% of cases. Using the correlations among variables, the remaining missing values were filled by model-based single imputation with the mice package of the R language, so as to maximize statistical power and reduce bias.
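The two screening rules above (more than 30% missing, or one category above 90%) can be sketched as follows. This is an illustrative Python outline only: the patent performs imputation in R with mice and does not publish its screening code, and the function name, the `keep` override, and the toy table are assumptions.

```python
from collections import Counter

def screen_variables(table, max_missing=0.30, max_dominant=0.90, keep=()):
    """Return the names of variables that survive screening.

    table: dict mapping variable name -> list of values (None = missing).
    keep:  names retained despite a high missing rate (clinical override).
    """
    kept = []
    for name, values in table.items():
        observed = [v for v in values if v is not None]
        missing_rate = 1 - len(observed) / len(values)
        if missing_rate > max_missing and name not in keep:
            continue  # too many missing values
        if observed:
            top_share = Counter(observed).most_common(1)[0][1] / len(observed)
            if top_share > max_dominant:
                continue  # one category dominates, little information
        kept.append(name)
    return kept

# Toy table (made up) with 10 patients per variable.
table = {
    "SBP": [120, None, 135, None, 140, None, 128, None, 150, 132],
    "GLU": [5.1, None, 6.3, None, 7.0, None, 5.8, None, 6.1, 5.5],
    "self_care": ["yes"] * 10,
    "eGFR": [90, 75, 60, 52, 45, 38, 30, 22, 15, 8],
}
survivors = screen_variables(table, keep=("GLU",))
print(survivors)  # ['GLU', 'eGFR']
```

Here the dominant-category share is computed over observed values only, which is one possible reading of the rule.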
Correlation-analysis dimension reduction: Pearson correlation analysis was performed on all variables, and when the absolute value of the correlation coefficient between two variables was greater than or equal to 0.85, one of the two was deleted and the other retained according to clinical importance. The deleted variables include: CKD stage (highly correlated with eGFR: -0.95), blood pressure control over the past month (highly correlated with hypertension: -0.89), blood glucose control over the past month (highly correlated with diabetes: -0.91), LYMPH% (highly correlated with NEUT%: -0.91), TC (highly correlated with LDL-C: 0.88), and some InBody indicators, such as intracellular water (ICW), extracellular water (ECW), protein content, mineral content, body fat mass (BFM), soft lean mass (SLM), fat-free mass (FFM), skeletal muscle mass (SMM), percent body fat (PBF), the segmental FFM, total body water (TBW), ICW and ECW of the right upper limb, left upper limb, trunk, right lower limb and left lower limb, the segmental ECW/TBW ratios, the 50 kHz whole-body phase angle, neck circumference, chest circumference, abdominal circumference, hip circumference, right upper arm circumference, left upper arm circumference, right leg circumference and left leg circumference. The InBody indicators were deleted one after another because they were highly correlated with each other. In addition, SCR was deleted because its clinical significance is similar to that of eGFR.
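The correlation screen described above can be illustrated with a short sketch that computes the Pearson coefficient for every pair of variables and reports the pairs whose absolute correlation reaches 0.85. The function names and the column-dictionary layout are assumptions made for illustration; which member of a flagged pair is deleted remains a clinical judgment, as stated in the text, and is not automated here.

```python
from itertools import combinations
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length numeric lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def highly_correlated_pairs(columns, threshold=0.85):
    """List (name_a, name_b, r) for every pair with |r| >= threshold."""
    pairs = []
    for a, b in combinations(columns, 2):
        r = pearson(columns[a], columns[b])
        if abs(r) >= threshold:
            pairs.append((a, b, r))
    return pairs
```

A reviewer would then inspect each reported pair and keep the clinically more important variable, for example keeping eGFR and dropping CKD stage.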
Reassignment and classification: some variables were classified, or classified and assigned values, according to the characteristics of the modeling population and statistical principles. Table 2 details the value assignments of the dependent and independent variables.
Table 2. Assignment of dependent and independent variables
S2: constructing, from the case data set, a decision-tree-based CKD traditional Chinese and Western medicine chronic disease management effect prediction model, which predicts whether an endpoint event occurs;
The CKD traditional Chinese and Western medicine chronic disease management effect prediction model is constructed from the case data set. Using sampling with replacement, a training set and a test set are created from the case data set; a decision-tree-based prediction model is then built and trained on the training set, which completes the construction of the CKD traditional Chinese and Western medicine chronic disease management effect prediction model.
Sampling with replacement is used because the case data set consists of two groups of patients of unequal size, namely the group in which an endpoint event occurred and the group in which none occurred. This class imbalance would impair the prediction accuracy of the model. Sampling with replacement therefore reduces the number of samples in the larger group and increases the number of samples in the smaller group until the two are similar in size. The balanced data form the training set, and the original, unbalanced case data set serves as the test set.
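A minimal sketch of such class balancing by sampling with replacement is given below. It assumes that each class is resampled to the same target size, namely the total number of samples divided by the number of classes; this choice is an assumption, although it is consistent with the first-year counts reported in this description (1212 original samples, 606 per class in the training set). The function name, the `label_key` field and the seed handling are hypothetical.

```python
import random

def balance_with_replacement(samples, label_key="endpoint", seed=1):
    """Resample every class, with replacement, to the same target size.

    The larger class is effectively undersampled and the smaller class
    oversampled, so the balanced set has the same total size as the input
    (up to integer rounding).
    """
    rng = random.Random(seed)
    groups = {}
    for sample in samples:
        groups.setdefault(sample[label_key], []).append(sample)
    target = len(samples) // len(groups)  # assumed per-class target size
    balanced = []
    for members in groups.values():
        balanced.extend(rng.choices(members, k=target))  # draws with replacement
    rng.shuffle(balanced)
    return balanced
```

The original list is left untouched, so it can still serve as the unbalanced test set described above.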
A decision tree is a machine learning method commonly used for supervised classification. A decision tree comprises a root node, a number of internal nodes and a number of leaf nodes. The leaf nodes correspond to decision results and every other node corresponds to an attribute test; the sample set contained in a node is divided among its child nodes according to the result of the attribute test, and the root node contains the full sample set. The path from the root node to each leaf node corresponds to a sequence of attribute tests. Specifically, the process comprises:
Determining an optimal feature as a root node for classifying the case dataset into as many different categories as possible;
Dividing the case data set into a plurality of subsets, determining the best features in each subset, and generating branches of a decision tree;
reducing the size of the decision tree through pruning operation;
determining the category of an instance of the training set through branches of the decision tree to complete construction of a decision tree model; the category is whether an endpoint event occurs.
The pruning operation prevents overfitting: it reduces the size of the decision tree while maintaining its generalization performance.
In the application, the training set is modeled with a C4.5 decision tree under ten-fold cross-validation. To avoid overfitting, the pruned form of C4.5 is used (unpruned = false); during pruning the minimum number of instances per leaf node is set to 10 and the pruning confidence factor is set to 0.25. To ensure reproducibility of the experiment, the random seed is set to 1, and the number of samples processed at a time during training is 100 (batchSize = 100).
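The description does not state how the "best feature" at each node is chosen. In C4.5 as it is commonly described, candidate splits are ranked by the gain ratio, i.e. the information gain divided by the split information. The following sketch computes the gain ratio for a categorical attribute; it is given only to illustrate that criterion, the function names are hypothetical, and it does not cover continuous thresholds, pruning or the cross-validation settings listed above.

```python
from collections import Counter
from math import log2

def entropy(items):
    """Shannon entropy (in bits) of a list of discrete values."""
    total = len(items)
    return -sum((c / total) * log2(c / total) for c in Counter(items).values())

def gain_ratio(values, labels):
    """Gain ratio of splitting `labels` by the categorical attribute `values`."""
    total = len(labels)
    subsets = {}
    for value, label in zip(values, labels):
        subsets.setdefault(value, []).append(label)
    # Expected entropy of the labels after the split.
    remainder = sum(len(s) / total * entropy(s) for s in subsets.values())
    information_gain = entropy(labels) - remainder
    split_information = entropy(values)
    if split_information == 0:  # attribute takes a single value: no split
        return 0.0
    return information_gain / split_information
```

An attribute that separates the two outcome classes perfectly obtains the maximum gain ratio, while an attribute unrelated to the outcome obtains a value of zero.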
The CKD traditional Chinese and Western medicine chronic disease management effect prediction model comprises several decision tree models, one for each year. Each yearly decision tree model is trained with the balanced sample data of that year as the training set and tested with the sample data that did not undergo balancing; this yields the main variables of that year's decision tree model, which determine whether an endpoint event occurs for a given sample.
As shown in Figs. 2-6, the application comprises decision tree models for different years. The first-year decision tree model is trained with the balanced first-year sample data as the training set, and the trained first-year model is tested with the original samples, i.e. the test set. The main variables of the first-year model are: eGFR, Urea, Hb, total syndrome differentiation score, fatigue and weakness, sex, exercise knowledge question score, and PRO.
Here, eGFR: estimated glomerular filtration rate; Urea: urea; Hb: hemoglobin; PRO: urine protein. "0" means that no endpoint event occurred and "1" means that an endpoint event occurred. Exercise knowledge question score ("Do you know which exercises you should do in daily life?"): 1 point = do not know at all, 2 points = know a little, 3 points = basically know, 4 points = mostly know, 5 points = know very clearly.
When eGFR > 17.5 ml/min/1.73 m² and Urea ≤ 17.46 mmol/L, the probability of an endpoint event within one year is low: none of the 537 samples meeting these conditions had an endpoint event within one year (0%).
When eGFR > 17.5 ml/min/1.73 m² and Urea > 17.46 mmol/L: if the total syndrome differentiation score is ≤ 15 points, the probability of an endpoint event within one year is low, and none of the 12 samples meeting these conditions had an endpoint event within one year (0%); if the total syndrome differentiation score is > 15 points, the probability is high, and all 24 samples meeting these conditions had an endpoint event within one year (100%).
When eGFR ≤ 17.5 ml/min/1.73 m² and Urea ≤ 13.91 mmol/L: if Hb ≤ 88 g/L, the probability of an endpoint event within one year is high, and all 18 samples meeting these conditions had an endpoint event within one year (100%); if Hb > 88 g/L, the probability is low, and none of the 12 samples meeting these conditions had an endpoint event within one year (0%).
When eGFR ≤ 17.5 ml/min/1.73 m², Urea > 13.91 mmol/L and eGFR > 13.3 ml/min/1.73 m²: if the patient is male, the probability of an endpoint event within one year is high, and 115 of the 127 samples meeting these conditions had an endpoint event within one year (90.55%); if the patient is female, the probability is low, and none of the 13 samples meeting these conditions had an endpoint event within one year (0%).
When eGFR ≤ 17.5 ml/min/1.73 m², Urea > 13.91 mmol/L and eGFR ≤ 13.3 ml/min/1.73 m²: with mild fatigue and weakness, the probability of an endpoint event within one year is high, and 336 of the 341 samples meeting these conditions had an endpoint event within one year (98.53%). Without fatigue and weakness, an exercise knowledge question score of 1 or 2 points indicates a high probability of an endpoint event within one year: at 1 point, 16 of the 18 samples meeting the conditions had an endpoint event (88.89%), and at 2 points all 16 samples meeting the conditions had an endpoint event (100%). Without fatigue and weakness, a score of 3 points or of 4/5 points indicates a low probability: none of the 3 and 4 samples, respectively, had an endpoint event within one year (0%). With moderate/severe fatigue and weakness, a PRO of 0 to ± or of 1+ indicates a low probability: none of the 2 and 4 samples, respectively, had an endpoint event within one year (0%). With moderate/severe fatigue and weakness, a PRO of 2+ or of 3+ to 4+ indicates a high probability: all of the 61 and 20 samples, respectively, had an endpoint event within one year (100%).
In the first-year decision tree model, the training accuracy is 97.77% and the test accuracy is 94.47%. In the confusion matrix of the training set, 579 samples without an endpoint event were correctly classified and 27 were misclassified; 606 samples with an endpoint event were correctly classified and 0 were misclassified. In the confusion matrix of the test set, 1112 samples without an endpoint event were correctly classified and 67 were misclassified; 33 samples with an endpoint event were correctly classified and 0 were misclassified. The first-year decision tree model therefore has a low misdiagnosis rate and a high accuracy.
Similarly, the second-year decision tree model is trained with the balanced second-year sample data as the training set, and the trained model is tested with the original samples (the sample data before balancing). The main variables of the second-year model are: eGFR, Urea, ALB, age, nephrotoxic drug knowledge question score, urinary protein-to-creatinine ratio, damp-heat syndrome, fatigue and weakness, TCO2, and LDL-C.
Here, eGFR: estimated glomerular filtration rate; Urea: urea; ALB: albumin; TCO2: total carbon dioxide; LDL-C: low-density lipoprotein cholesterol. "0" means that no endpoint event occurred and "1" means that an endpoint event occurred. Nephrotoxic drug knowledge question score ("Do you know which drugs can damage the kidneys?"): 1 point = do not know at all, 2 points = know a little, 3 points = basically know, 4 points = mostly know, 5 points = know very clearly.
When eGFR > 39.43 ml/min/1.73 m², the probability that no endpoint event occurs within two years is high: none of the 316 samples meeting this condition had an endpoint event (0%).
When eGFR ≤ 39.43 ml/min/1.73 m², Urea > 15.51 mmol/L and age ≤ 53 years, the probability of an endpoint event within two years is high: 194 of the 197 samples meeting these conditions had an endpoint event within two years (98.48%). When eGFR ≤ 39.43 ml/min/1.73 m², Urea > 15.51 mmol/L, age > 53 years and urinary protein-to-creatinine ratio > 1.18 g/g, the probability of an endpoint event within two years is high: 117 of the 123 samples meeting these conditions had an endpoint event within two years (95.12%). When eGFR ≤ 39.43 ml/min/1.73 m², Urea > 15.51 mmol/L, age > 53 years and urinary protein-to-creatinine ratio ≤ 1.18 g/g: if TCO2 > 17.5 mmol/L, the probability that no endpoint event occurs within two years is high, and 18 of the 23 samples meeting these conditions had no endpoint event within two years (endpoint event rate 21.74%); if TCO2 ≤ 17.5 mmol/L, the probability of an endpoint event within two years is high, and all 13 samples meeting these conditions had an endpoint event within two years (100%).
When eGFR ≤ 39.43 ml/min/1.73 m² and Urea ≤ 15.51 mmol/L: if ALB > 42.8 g/L, the probability that no endpoint event occurs within two years is high, and none of the 51 samples meeting these conditions had an endpoint event within two years (0%). If ALB ≤ 42.8 g/L and the nephrotoxic drug knowledge question score is 3 points or 4/5 points, the probability that no endpoint event occurs within two years is high, and none of the 6 and 3 samples, respectively, had an endpoint event within two years (0%). If ALB ≤ 42.8 g/L, the score is 2 points and there is no fatigue and weakness, the probability that no endpoint event occurs within two years is high, and none of the 3 samples meeting these conditions had an endpoint event within two years (0%). If ALB ≤ 42.8 g/L, the score is 2 points and there is mild or moderate/severe fatigue and weakness, the probability of an endpoint event within two years is high: with mild fatigue and weakness, 36 of the 37 samples meeting the conditions had an endpoint event within two years (97.30%), and with moderate/severe fatigue and weakness, 10 of the 11 samples had an endpoint event within two years (90.91%). If ALB ≤ 42.8 g/L, the score is 1 point and damp-heat syndrome is present, the probability of an endpoint event within two years is high, and 27 of the 28 samples meeting these conditions had an endpoint event within two years (96.43%).
If ALB ≤ 42.8 g/L, the score is 1 point and damp-heat syndrome is absent: with LDL-C ≤ 5.78 mmol/L, the probability that no endpoint event occurs within two years is high, and 10 of the samples meeting these conditions had no endpoint event within two years (endpoint event rate 28.57%); with LDL-C > 5.78 mmol/L, the probability of an endpoint event within two years is high, and all 13 samples meeting these conditions had an endpoint event within two years (100%).
In the second-year decision tree model, the training accuracy is 94.03% and the test accuracy is 95.35%. In the confusion matrix of the training set, 376 samples without a CKD endpoint event were correctly classified and 43 were misclassified; 412 samples with a CKD endpoint event were correctly classified and 7 were misclassified. In the confusion matrix of the test set, 723 samples without a CKD endpoint event were correctly classified and 35 were misclassified; 77 samples with a CKD endpoint event were correctly classified and 4 were misclassified. The second-year decision tree model therefore has a low misdiagnosis rate and a high accuracy.
The third-year decision tree model is trained with the balanced third-year sample data as the training set, and the trained model is tested with the original samples (the sample data before balancing). The main variables of the third-year decision tree model are: Urea, ALB, and the kidney disease aggravating factor knowledge question score.
Here, Urea: urea; ALB: albumin. "0" means that no endpoint event occurred and "1" means that an endpoint event occurred. Kidney disease aggravating factor knowledge question score ("Do you know which factors can aggravate your kidney disease?"): 1 point = do not know at all, 2 points = know a little, 3 points = basically know, 4 points = mostly know, 5 points = know very clearly.
When Urea ≤ 10.26 mmol/L, the probability that no endpoint event occurs within three years is high: 184 of the 192 samples meeting this condition had no endpoint event within three years (endpoint event rate 4.17%).
When Urea > 15.82 mmol/L (and therefore also > 10.26 mmol/L), the probability of an endpoint event within three years is high: 170 of the 181 samples meeting this condition had an endpoint event within three years (93.92%).
When Urea > 10.26 mmol/L but ≤ 15.82 mmol/L: if ALB > 43.4 g/L, the probability that no endpoint event occurs within three years is high, and 28 of the 33 samples meeting these conditions had no endpoint event within three years (endpoint event rate 15.15%). If ALB ≤ 43.4 g/L and the kidney disease aggravating factor knowledge question score is 1, 2 or 3 points, the probability of an endpoint event within three years is high: at 1 point, all 12 samples meeting the conditions had an endpoint event within three years (100%); at 2 points, 27 of the 34 samples had an endpoint event within three years (79.41%); at 3 points, 15 of the 17 samples had an endpoint event within three years (88.24%). If ALB ≤ 43.4 g/L and the score is 4/5 points, the probability that no endpoint event occurs within three years is high, and none of the 5 samples meeting these conditions had an endpoint event within three years (0%).
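Because the third-year tree is small, the branch rules stated above can be written out directly as a function. The sketch below only transcribes those thresholds; the function name, parameter names and the 0/1 return convention ("0" no endpoint event, "1" endpoint event, as defined above) are illustrative assumptions, and the sketch is not the trained model itself.

```python
def predict_year3_endpoint(urea, alb, knowledge_score):
    """Transcription of the third-year decision rules stated in the text.

    urea            -- urea in mmol/L
    alb             -- albumin in g/L
    knowledge_score -- kidney disease aggravating factor knowledge score (1-5)
    Returns 1 if an endpoint event within three years is predicted, else 0.
    """
    if urea <= 10.26:
        return 0
    if urea > 15.82:
        return 1
    # From here on: 10.26 < urea <= 15.82
    if alb > 43.4:
        return 0
    # ALB <= 43.4: scores of 1 to 3 points predict an endpoint event,
    # scores of 4 or 5 points predict none.
    return 1 if knowledge_score <= 3 else 0
```

Written this way, each return statement corresponds to one leaf of the tree, which makes the rule set easy to check against Fig. 4-style tree diagrams or to embed in a data entry form.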
In the third-year decision tree model, the training accuracy is 87.76% and the test accuracy is 89.89%. In the confusion matrix of the training set, 204 samples without a CKD endpoint event were correctly classified and 33 were misclassified; 212 samples with a CKD endpoint event were correctly classified and 25 were misclassified. In the confusion matrix of the test set, 334 samples without a CKD endpoint event were correctly classified and 42 were misclassified; 93 samples with a CKD endpoint event were correctly classified and 6 were misclassified. The third-year decision tree model therefore has a low misdiagnosis rate and a high accuracy.
The fourth-year decision tree model is trained with the balanced fourth-year sample data as the training set, and the trained model is tested with the original samples (the sample data before balancing). The main variables of the fourth-year decision tree model are: eGFR, damp-heat syndrome, TG, and nocturia.
Here, eGFR: estimated glomerular filtration rate; TG: triglycerides. "0" means that no endpoint event occurred and "1" means that an endpoint event occurred.
When eGFR > 28.59 ml/min/1.73 m² and TG > 3.16 mmol/L, the probability of an endpoint event within four years is high: 7 of the 10 samples meeting these conditions had an endpoint event within four years (70.00%). When eGFR > 28.59 ml/min/1.73 m² and TG ≤ 3.16 mmol/L, the probability that no endpoint event occurs within four years is high: 124 of the 135 samples meeting these conditions had no endpoint event (endpoint event rate 8.15%).
When eGFR ≤ 15.99 ml/min/1.73 m² (and therefore also ≤ 28.59 ml/min/1.73 m²), the probability of an endpoint event within four years is high: all 78 samples meeting this condition had an endpoint event within four years (100%).
When eGFR ≤ 28.59 ml/min/1.73 m² and eGFR > 15.99 ml/min/1.73 m²: with damp-heat syndrome, the probability of an endpoint event within four years is high, and all 14 samples meeting these conditions had an endpoint event within four years (100%). Without damp-heat syndrome and without nocturia, the probability that no endpoint event occurs within four years is high, and 10 of the samples meeting these conditions had no endpoint event within four years (endpoint event rate 28.57%). Without damp-heat syndrome but with mild or moderate/severe nocturia, the probability of an endpoint event within four years is high: 24 of the 27 samples with mild nocturia had an endpoint event within four years (88.89%), and 9 of the 16 samples with moderate/severe nocturia had an endpoint event within four years (56.25%).
In the fourth-year decision tree model, the training accuracy is 82.65% and the test accuracy is 82.71%. In the confusion matrix of the training set, 121 samples without a CKD endpoint event were correctly classified and 26 were misclassified; 122 samples with a CKD endpoint event were correctly classified and 25 were misclassified. In the confusion matrix of the test set, 144 samples without a CKD endpoint event were correctly classified and 33 were misclassified; 100 samples with a CKD endpoint event were correctly classified and 18 were misclassified. The fourth-year decision tree model therefore has a low misdiagnosis rate and a high accuracy.
S3: evaluating the effect of the CKD traditional Chinese and Western medicine chronic disease management effect prediction model in a preset manner.
Evaluating the effect of the CKD traditional Chinese and Western medicine chronic disease management effect prediction model in a preset manner comprises:
collecting case information, calculating the prediction accuracy, sensitivity and specificity of the CKD traditional Chinese and Western medicine chronic disease management effect prediction model, and evaluating the predictive effect of the model according to the prediction accuracy.
The area under the receiver operating characteristic curve (AUC) is used as the main index for evaluating the predictive performance of the models above and assesses their discrimination; an AUC > 0.7 is considered to indicate predictive value. The secondary evaluation indices are sensitivity, specificity and accuracy. Fig. 6 shows the AUC of the decision tree model, and Table 3 lists the objective evaluation indices of the decision tree model.
Table 3. Model evaluation indices
Training set    Sensitivity    Specificity    Accuracy    AUC
Year 1          1.000          0.955          0.978       0.984
Year 2          0.983          0.897          0.940       0.963
Year 3          0.895          0.861          0.878       0.922
Year 4          0.830          0.823          0.827       0.893
Test set        Sensitivity    Specificity    Accuracy    AUC
Year 1          1.000          0.943          0.945       0.976
Year 2          0.951          0.954          0.954       0.979
Year 3          0.939          0.888          0.899       0.936
Year 4          0.848          0.814          0.827       0.899
In the training set, the AUCs of the first-year, second-year, third-year and fourth-year decision tree models are 0.984, 0.963, 0.922 and 0.893, respectively; in the test set they are 0.976, 0.979, 0.936 and 0.899, respectively. All values exceed the 0.7 threshold, indicating good discrimination in every year.
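For readers unfamiliar with the AUC, the following sketch shows one standard way to compute it: as the probability that a randomly chosen case with an endpoint event receives a higher predicted score than a randomly chosen case without one, with ties counted as one half. This is a generic illustration, not the software used in the application; the function names and the use of predicted probabilities as scores are assumptions.

```python
def roc_auc(scores, labels):
    """AUC as the share of (positive, negative) pairs ranked correctly.

    scores -- predicted probabilities or scores, higher means more likely "1"
    labels -- true outcomes, 1 = endpoint event, 0 = no endpoint event
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a correctly ranked pair
    return wins / (len(positives) * len(negatives))

def has_predictive_value(auc, threshold=0.7):
    """Apply the AUC > 0.7 criterion stated in the text."""
    return auc > threshold
```

The pairwise loop is quadratic in the number of cases, which is acceptable for cohorts of the size described here; a rank-based formulation would be preferable for much larger data sets.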
Specifically, in terms of accuracy, the first-year decision tree model has the highest training accuracy: its training and test accuracies are 0.978 and 0.945, respectively. The training and test accuracies of the second-year decision tree model are 0.940 and 0.954, those of the third-year decision tree model are 0.878 and 0.899, and those of the fourth-year decision tree model are both 0.827.
In terms of sensitivity, the first-year decision tree model reaches 1.000 on both the training set and the test set; the second-year decision tree model reaches 0.983 and 0.951 on the training set and the test set, respectively; the third-year decision tree model reaches 0.895 and 0.939; and the fourth-year decision tree model reaches 0.830 and 0.848.
In terms of specificity, the first-year decision tree model has the highest training specificity, with 0.955 and 0.943 on the training set and the test set, respectively; the second-year decision tree model reaches 0.897 and 0.954; the third-year decision tree model reaches 0.861 and 0.888; and the fourth-year decision tree model reaches 0.823 and 0.814.
Overall, the decision tree model performs well in the first, second, third and fourth years.
Based on the construction method, the application also provides a device that can be used in clinical practice and that carries the method for constructing the chronic kidney disease traditional Chinese and Western medicine chronic disease management effect prediction model.
As shown in Figs. 7-10, the device may host a mini-program on WeChat, Alipay or other smart-terminal platforms, so that medical staff and patients can enter data directly and quickly view the prediction result. Specifically, the mini-program in the device is developed as follows:
Creating a responsive page with a fixed width and setting the page background color parameter; the whole page uses a Flex layout, the page margins are 32rpx × 40rpx, the page height equals the height of the display interface, the first-level child elements are centered vertically and horizontally, and the frame is set up automatically according to the actual project.
Creating a blank CKD prediction form; building the data input boxes of the prediction model; storing the answers corresponding to the data input boxes; connecting to the interface of the CKD traditional Chinese and Western medicine chronic disease management effect prediction model; obtaining the prediction result list and the detailed information of the evaluation record; returning the prediction model result; and storing the result data, with the details saved to the database in a master-detail table mode.
Setting the initial chart items, creating the CKD prediction chart, and displaying the occurrence probability of the endpoint event, including: "first year occurrence probability", "second year occurrence probability", "third year occurrence probability" and "fourth year occurrence probability".
In the applet platform, case data also need to be screened and collected. Specifically, according to the inclusion and exclusion criteria, 47 CKD patients were enrolled who attended the nephrology chronic disease management clinic within a preset period and agreed to join chronic disease management. The inclusion criteria were: age 18 years or older; at least two creatinine results, with more than 6 months of follow-up between the first and the last creatinine measurement; a definite diagnosis of CKD according to the 2012 KDIGO guideline with eGFR > 5 ml/(min·1.73 m²); and signed informed consent for chronic disease management together with consent to the use of the data. The exclusion criteria were: patients who had already received renal replacement therapy (dialysis or transplantation) at the first visit; patients with a diagnosed history of renal malignancy; patients with acute death caused by a definite non-renal disease (for example acute lung infection); and patients with insufficient medical record data, i.e. those missing the predictor data used by the first-year model of the CKD traditional Chinese and Western medicine chronic disease management effect prediction model.
The case data required by the first-year prediction model constructed with the decision tree were collected, the occurrence of a first-year endpoint event was predicted for the enrolled CKD patients, and the accuracy, sensitivity and specificity were calculated against the endpoint events that actually occurred. Among the samples without an endpoint event, 44 were correctly classified and 1 was misclassified; among the samples with an endpoint event, 2 were correctly classified and 0 were misclassified. In the validation cohort, the first-year model therefore achieved an accuracy of 97.87%, a sensitivity of 1 and a specificity of 0.978.
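The validation figures above follow directly from the four confusion matrix counts. The short sketch below reproduces the calculation with the counts reported for the validation cohort (2 true positives, 0 false negatives, 44 true negatives, 1 false positive); the function name and the dictionary layout are illustrative assumptions.

```python
def classification_metrics(tp, fn, tn, fp):
    """Accuracy, sensitivity and specificity from confusion matrix counts.

    tp / fn -- cases with an endpoint event classified correctly / incorrectly
    tn / fp -- cases without an endpoint event classified correctly / incorrectly
    """
    return {
        "accuracy": (tp + tn) / (tp + fn + tn + fp),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
    }

# Counts reported for the 47-patient validation cohort.
validation = classification_metrics(tp=2, fn=0, tn=44, fp=1)
```

With these counts the accuracy is 46/47, the sensitivity is 2/2 and the specificity is 44/45, which round to the reported 97.87%, 1 and 0.978. The same function applied to the first-year test set counts (33, 0, 1112, 67) gives the reported test accuracy of 94.47%.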
The invention provides a method and a device for constructing a CKD traditional Chinese and Western medicine chronic disease management effect prediction model. It comprehensively collects multidimensional, multi-item, large-sample clinical characteristics from both traditional Chinese and Western medicine, provides more detailed information on the factors that influence CKD prognosis, and thereby predicts the chronic disease management effect more accurately, yielding a prediction model that has traditional Chinese medicine characteristics and integrates multidimensional prognostic factors. The invention uses the decision tree, an artificial intelligence method, to construct the CKD traditional Chinese and Western medicine chronic disease management effect prediction model; the model is not restricted by statistical preconditions and can be adjusted repeatedly according to clinical practice and the needs of medical workers, so as to meet the requirements of clinical work. The results of a decision tree model are also more interpretable than those of other machine learning algorithms. When applied clinically, the model built by this construction method shows good predictive performance, has high clinical value and significance, and provides evidence for the application of AI in clinical practice.
The application also provides a device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, the processor implementing the construction method as described in any one of the above when executing the computer program.
The apparatus may include: a memory storing executable program code;
a processor coupled to the memory;
a transceiver for communicating with other devices or a communication network, receiving or transmitting network messages;
and a bus for connecting the memory, the processor and the transceiver for internal communication.
The transceiver receives the information transmitted from the network and transmits the information to the processor through the bus, the processor calls the executable program codes stored in the memory through the bus to process the information, and transmits the processing result to the transceiver through the bus to send the processing result, so that the construction method provided by the embodiment of the application is realized.
Embodiments of the present application also provide a non-transitory machine-readable storage medium having stored thereon an executable program which, when executed by a processor, causes the processor to perform the construction method as provided in the above embodiments.
The embodiments described above are illustrative only, and the modules illustrated as separate components may or may not be physically separate, and components shown as modules may or may not be physical modules, may be located in one place, or may be distributed over multiple network modules. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment. Those of ordinary skill in the art will understand and implement the present invention without undue burden.
From the above detailed description of the embodiments, it will be apparent to those skilled in the art that the embodiments may be implemented by means of software plus a necessary general-purpose hardware platform, or of course by means of hardware. Based on this understanding, the foregoing technical solutions may be embodied, in essence or in part, in the form of a software product that may be stored in a computer-readable storage medium, including read-only memory (ROM), random access memory (RAM), programmable read-only memory (PROM), erasable programmable read-only memory (EPROM), one-time programmable read-only memory (OTPROM), electrically erasable programmable read-only memory (EEPROM), compact disc read-only memory (CD-ROM) or other optical disc storage, magnetic disk storage, tape storage, or any other computer-readable medium that can be used to carry or store data.
Finally, it should be noted that the embodiments disclosed above are only preferred embodiments of the present invention and are used only to illustrate, not to limit, its technical solutions. Although the invention has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art will understand that the technical solutions recorded in the various embodiments can still be modified, or some of their technical features can be replaced by equivalents, and that such modifications and substitutions do not depart from the spirit and scope of the corresponding technical solutions.

Claims (9)

1. A method for constructing a chronic kidney disease (CKD) traditional Chinese and western medicine chronic disease management effect prediction model, characterized by comprising the following steps:
obtaining case data and constructing a case data set containing traditional Chinese and western medicine case information;
constructing, according to the case data set, a decision tree-based CKD traditional Chinese and western medicine chronic disease management effect prediction model, and predicting whether an endpoint event occurs through the CKD traditional Chinese and western medicine chronic disease management effect prediction model; wherein the CKD traditional Chinese and western medicine chronic disease management effect prediction model comprises a plurality of decision tree models for different years; each annual decision tree model is trained using the equalized sample data of the corresponding year as a training set, and the trained decision tree model is tested using a test set that has not been equalized, so as to obtain the main variables of that annual decision tree model; and whether an endpoint event occurs for the sample data is judged through the main variables;
and evaluating the effect of the CKD traditional Chinese and western medicine chronic disease management effect prediction model in a preset manner.
2. The method for constructing a chronic kidney disease traditional Chinese and western medicine chronic disease management effect prediction model according to claim 1, wherein constructing the decision tree-based CKD traditional Chinese and western medicine chronic disease management effect prediction model according to the case data set comprises:
creating a test set and an equalized training set from the case data set using sampling with replacement;
and constructing the decision tree-based CKD traditional Chinese and western medicine chronic disease management effect prediction model and training it with the training set to complete the construction of the CKD traditional Chinese and western medicine chronic disease management effect prediction model.
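The sampling step of claim 2 can be illustrated with a minimal sketch. The claim does not state the split ratio, the label encoding, or the exact equalization procedure, so the function name `split_and_equalize`, the `endpoint` key, the `test_size` parameter, and the choice to oversample the smaller class are all assumptions made for illustration only.

```python
import random

def split_and_equalize(cases, label_key="endpoint", test_size=10, seed=0):
    """Hold out an untouched test set, then equalize the training part by
    sampling with replacement until both classes have the same size.

    Assumptions (not from the source): labels are 1 (endpoint event) or 0,
    and equalization is done by oversampling the smaller class.
    """
    rng = random.Random(seed)
    shuffled = list(cases)
    rng.shuffle(shuffled)

    test_set = shuffled[:test_size]   # kept as-is, without equalization
    train = shuffled[test_size:]

    positives = [c for c in train if c[label_key] == 1]
    negatives = [c for c in train if c[label_key] == 0]
    if not positives or not negatives:
        raise ValueError("training split must contain both classes")

    target = max(len(positives), len(negatives))

    def top_up(group):
        # random.choices draws with replacement; k may be 0 for the larger class
        return group + rng.choices(group, k=target - len(group))

    balanced = top_up(positives) + top_up(negatives)
    rng.shuffle(balanced)
    return balanced, test_set
```

Keeping the test set out of the equalization step means the evaluation reflects the class proportions of the original case data.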
3. The method for constructing a chronic kidney disease traditional Chinese and western medicine chronic disease management effect prediction model according to claim 2, wherein constructing the decision tree-based CKD traditional Chinese and western medicine chronic disease management effect prediction model comprises the steps of:
determining the best feature as the root node, so as to classify the case data set into different categories as far as possible;
dividing the case data set into a plurality of subsets, determining the best feature in each subset, and generating the branches of the decision tree;
reducing the size of the decision tree through a pruning operation;
determining the category of each instance of the training set through the branches of the decision tree to complete the construction of the decision tree model; the category being whether an endpoint event occurs.
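The steps of claim 3 can be sketched as follows. The claim does not name the splitting criterion or the pruning method, so this sketch assumes information gain over categorical features and uses a depth cap as a simple stand-in for pruning; the function names and the `max_depth` parameter are hypothetical.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())

def information_gain(rows, labels, feature):
    """Entropy reduction obtained by splitting on one categorical feature."""
    groups = {}
    for row, label in zip(rows, labels):
        groups.setdefault(row[feature], []).append(label)
    remainder = sum(len(g) / len(labels) * entropy(g) for g in groups.values())
    return entropy(labels) - remainder

def best_feature(rows, labels, features):
    """Feature with the highest information gain (assumed criterion)."""
    return max(features, key=lambda f: information_gain(rows, labels, f))

def build_tree(rows, labels, features, max_depth=3):
    """Recursively split on the best feature; max_depth limits tree size."""
    majority = Counter(labels).most_common(1)[0][0]
    if len(set(labels)) == 1 or not features or max_depth == 0:
        return majority  # leaf: predicted category
    feat = best_feature(rows, labels, features)
    if information_gain(rows, labels, feat) <= 0:
        return majority  # no useful split left
    remaining = [f for f in features if f != feat]
    branches = {}
    for value in {r[feat] for r in rows}:
        idx = [i for i, r in enumerate(rows) if r[feat] == value]
        branches[value] = build_tree(
            [rows[i] for i in idx], [labels[i] for i in idx], remaining, max_depth - 1
        )
    return {"feature": feat, "branches": branches, "default": majority}

def predict(tree, row):
    """Follow the branches until a leaf (a plain label) is reached."""
    while isinstance(tree, dict):
        tree = tree["branches"].get(row[tree["feature"]], tree["default"])
    return tree
```

With labels encoded as 1 (endpoint event) and 0 (no endpoint event), `predict` returns the category of an instance by walking the branches from the root node.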
4. The method for constructing a chronic kidney disease traditional Chinese and western medicine chronic disease management effect prediction model according to claim 3, wherein the CKD traditional Chinese and western medicine chronic disease management effect prediction model comprises a first-year decision tree model, a second-year decision tree model, a third-year decision tree model and a fourth-year decision tree model;
the first-year decision tree model is trained with the decision tree model using the equalized first-year sample data as a training set; the second-year decision tree model is trained with the decision tree model using the equalized second-year sample data as a training set; the third-year decision tree model is trained with the decision tree model using the equalized third-year sample data as a training set; and the fourth-year decision tree model is trained with the decision tree model using the equalized fourth-year sample data as a training set.
5. The method for constructing a chronic kidney disease traditional Chinese and western medicine chronic disease management effect prediction model according to claim 4, wherein the main variables of the first-year decision tree model include one or more of the following: eGFR, urea, Hb, total traditional Chinese medicine syndrome differentiation score, listlessness and debilitation, sex, motor cognition subject score, PRO;
the main variables of the second-year decision tree model include one or more of the following: eGFR, urea, ALB, age, kidney injury medicine cognition subject score, urinary protein-to-creatinine ratio, damp-heat syndrome, listlessness and debilitation, TCO2, LDL-C;
the main variables of the third-year decision tree model include one or more of the following: urea, ALB, cognition subject score for the causes of kidney disease exacerbation;
the main variables of the fourth-year decision tree model include one or more of the following: eGFR, damp-heat syndrome, TG, nocturia.
6. The method for constructing a chronic kidney disease traditional Chinese and western medicine chronic disease management effect prediction model according to claim 5, wherein obtaining case data and constructing a case data set containing traditional Chinese and western medicine case information comprises:
acquiring case data, and performing data cleaning on the case data to construct the case data set containing traditional Chinese and western medicine case information;
the data cleaning includes one or more of the following: removal of variables with missing values, equalization, correlation analysis, variable classification, variable merging or variable assignment.
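Two of the cleaning operations listed in claim 6, removal of variables with missing values and variable assignment, can be sketched as below. The claim gives no threshold or coding scheme, so `max_missing_ratio`, the use of `None` for a missing value, and the example mapping are assumptions for illustration only.

```python
def drop_sparse_variables(records, max_missing_ratio=0.5):
    """Remove variables whose share of missing values exceeds the threshold.

    Assumption: a missing value is stored as None or the key is absent.
    """
    variables = sorted({key for record in records for key in record})
    keep = []
    for variable in variables:
        missing = sum(1 for record in records if record.get(variable) is None)
        if missing / len(records) <= max_missing_ratio:
            keep.append(variable)
    return [{variable: record.get(variable) for variable in keep} for record in records]

def assign_codes(records, variable, mapping):
    """Replace the categorical values of one variable with numeric codes.

    Values not present in the mapping become None.
    """
    return [{**record, variable: mapping.get(record.get(variable))} for record in records]
```

For example, `assign_codes(records, "sex", {"male": 1, "female": 0})` would turn a text variable into a coded one that a decision tree can split on.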
7. The method for constructing a chronic kidney disease traditional Chinese and western medicine chronic disease management effect prediction model according to claim 6, wherein evaluating the effect of the CKD traditional Chinese and western medicine chronic disease management effect prediction model in a preset manner comprises:
collecting case information, calculating the prediction accuracy, sensitivity and specificity of the CKD traditional Chinese and western medicine chronic disease management effect prediction model, and evaluating the prediction effect of the CKD traditional Chinese and western medicine chronic disease management effect prediction model according to the prediction accuracy.
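The three evaluation measures named in claim 7 follow from the counts of a binary confusion matrix. The sketch below assumes labels are encoded as 1 (endpoint event) and 0 (no endpoint event); the function name `evaluate_predictions` is hypothetical.

```python
def evaluate_predictions(y_true, y_pred):
    """Return accuracy, sensitivity and specificity for binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # events correctly predicted
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # non-events correctly predicted
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # non-events predicted as events
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # events that were missed
    total = tp + tn + fp + fn
    if total == 0:
        raise ValueError("no samples to evaluate")
    return {
        "accuracy": (tp + tn) / total,
        # share of true endpoint events that the model detected
        "sensitivity": tp / (tp + fn) if (tp + fn) else None,
        # share of true non-events that the model left unflagged
        "specificity": tn / (tn + fp) if (tn + fp) else None,
    }
```

Because the test set is not equalized, accuracy alone can look favourable when endpoint events are rare, which is one reason to report sensitivity and specificity alongside it.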
8. The method for constructing a chronic kidney disease traditional Chinese and western medicine chronic disease management effect prediction model according to claim 7, wherein the case data includes one or more of the following: basic information, diagnosis and treatment conditions, nutritional conditions, traditional Chinese and western medicine influence factors and clinical prognosis outcomes.
9. A device for constructing a chronic kidney disease traditional Chinese and western medicine chronic disease management effect prediction model, characterized by comprising: a memory, a processor and a computer program stored on the memory and executable on the processor, the processor implementing the method for constructing a CKD traditional Chinese and western medicine chronic disease management effect prediction model according to any one of claims 1 to 8 when executing the computer program.

Priority Applications (1)

Application Number: CN202410060582.6A
Priority Date: 2024-01-15
Filing Date: 2024-01-15
Title: Method and device for constructing chronic kidney disease traditional Chinese and western medicine chronic disease management effect prediction model

Publications (1)

Publication Number: CN117954089A (en)
Publication Date: 2024-04-30

Family

ID=90801247

Country Status (1)

Country Link
CN (1) CN117954089A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination