WO2021190300A1 - AI chronic kidney disease risk screening modeling method, chronic kidney disease risk screening method and system
- Publication number
- WO2021190300A1 (application PCT/CN2021/079849)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- kidney disease
- chronic kidney
- medical
- data
- xgboost
Classifications
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16H—HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
- G16H50/00—ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
- G16H50/30—ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for calculating health indices; for individual health risk assessment
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N20/00—Machine learning
- G06N20/20—Ensemble learning
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G06N3/084—Backpropagation, e.g. using gradient descent
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16H—HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
- G16H50/00—ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
- G16H50/20—ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for computer-aided diagnosis, e.g. based on medical expert systems
Definitions
- The present invention relates to a chronic kidney disease risk screening method and system, in particular to constructing a chronic kidney disease risk screening model by a machine learning method, and to a chronic kidney disease risk screening assessment method and system that use the model to screen the medical characteristic indices of medical examinees and give a chronic kidney disease risk assessment value, so as to achieve efficient, low-cost and accurate chronic kidney disease risk screening.
- Chronic kidney disease is characterized by high prevalence, low awareness, poor prognosis and high medical costs. It is another disease that seriously endangers human health, after cardiovascular and cerebrovascular diseases, diabetes and malignant tumors. In recent years, as China's population has aged and the incidence of diabetes, hypertension and other diseases has risen year by year, the prevalence of chronic kidney disease has also risen year by year. The prevalence of chronic kidney disease among people over 18 years old in China is 10.8%, while the awareness rate is less than 5%. There is therefore an urgent need for an effective chronic kidney disease risk screening system to carry out early screening, raise awareness, enable early detection and early treatment, prevent the continued deterioration of kidney function, and reduce the economic burden on individuals, families and society. At present, chronic kidney disease risk screening requires the examinee to be examined in a hospital, where a nephrologist makes a judgment based on clinical guidelines and practical experience, which is not conducive to efficient large-scale screening.
- An AI chronic kidney disease risk screening method includes the following steps:
- Step S1: establish an effective chronic kidney disease risk screening model;
- Step S2: organize the user data to be screened;
- Step S3: input the user data to be screened into the chronic kidney disease risk screening model for model calculation to obtain the kidney disease risk prediction result.
- Step S11: prepare medical record data. Collect patients' electronic medical records from the hospital electronic medical record platform, covering both patients diagnosed with chronic kidney disease and patients without chronic kidney disease;
- the electronic medical records of patients diagnosed with chronic kidney disease are collected by comparing the physician's diagnosis in the electronic medical record with the disease names in the chronic kidney disease name database;
- the electronic medical records of patients without chronic kidney disease are taken from patients and physical examinees seen in the internal medicine department during the same period, excluding records with an unclear medical history or incomplete examination and laboratory test data, and excluding patients with acute diseases, severe infections or tumors;
- the medical record data includes disease course records, examination and laboratory test results, doctor's orders, surgical records, nursing records, and the confirmed diagnosis results; the examination and laboratory test results include medical features and thresholds;
- the chronic kidney disease name database contains the names of the various diseases that can be classified as chronic kidney disease.
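- The name-database comparison in step S11 can be sketched as follows. This is a minimal illustration only: the record layout, the `diagnosis` key and the function name `label_records` are assumptions, not taken from the patent, and the exclusion rules for non-chronic kidney disease records are not shown.

```python
def label_records(records, ckd_names):
    """Split records into CKD and non-CKD groups by diagnosis name lookup.

    `records` is a list of dicts with a hypothetical "diagnosis" key;
    `ckd_names` stands in for the chronic kidney disease name database.
    """
    # Normalize names once so the comparison ignores case and padding.
    known = {name.strip().lower() for name in ckd_names}
    ckd, non_ckd = [], []
    for record in records:
        diagnosis = record["diagnosis"].strip().lower()
        if diagnosis in known:
            ckd.append(record)
        else:
            non_ckd.append(record)
    return ckd, non_ckd
```

- An exact, case-insensitive match is the simplest choice; a real name database would likely also need synonym handling.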
- Step S12: extract medical features. Chronic kidney disease medical feature extraction is performed on the qualified electronic medical record data obtained in step S11 to obtain medical features and medical feature values; the chronic kidney disease medical features include basic information, past history, family history, subjective symptoms, blood tests, and urine tests.
- the basic information table includes 7 specific feature fields: gender, age, height, weight, blood pressure, pregnancy status, and occupation;
- the past history table includes four specific feature fields: diabetes, hypertension, smoking history, and drinking history;
- the family history table includes 5 specific feature fields: chronic kidney disease, diabetes, hypertension, renal cysts, and polycystic kidney disease;
- the subjective symptoms table includes 24 specific feature fields: convulsions, polyuria, nausea, fever, fatigue, joint pain, dry mouth, urgency, dysuria, vomiting, rash, macroscopic hematuria, upper respiratory tract infection, oliguria, loss of appetite, edema, headache, dizziness, anuria, foamy urine, chest tightness, dry eyes, low back pain, and eclampsia;
- the blood test table includes: blood C-reactive protein, blood white blood cell count, hemoglobin, red blood cell count, blood sugar, platelets, blood hepatitis B e antibody, blood hepatitis B e antigen, blood hepatitis B surface antibody, blood hepatitis B surface antigen, blood hepatitis B core antibody, blood hepatitis C antibody, erythrocyte sedimentation rate, blood lactate dehydrogenase, blood albumin, blood aspartate aminotransferase, blood alanine aminotransferase, blood total protein, blood total bilirubin, blood total cholesterol, blood triglycerides, blood creatinine, blood uric acid, blood urea nitrogen, blood potassium, blood sodium, blood calcium, blood phosphorus, blood chloride, blood cystatin C, anti-neutrophil cytoplasmic antibody, complement C4, complement C3, complement C2, complement C1q, immunoglobulin A, immunoglobulin E, immunoglobulin G, and immunoglobulin
- the urine test table includes 20 specific feature fields: urine white blood cells, urine specific gravity, urine bilirubin, urine protein, urine red blood cells, urine creatinine, urine occult blood, urine ketone bodies, urine microalbumin, urine casts, urine albumin, urine pH, urobilinogen, urine nitrite, urine glucose, microscopic hematuria, urine osmotic pressure, urine sodium, 24-hour urine output, and 24-hour urine protein quantification;
- the medical characteristic value is the specific value of each medical characteristic among basic information, past history, family history, subjective symptoms, blood test, and urine test characteristics.
- Step S13: feature data standardization and data cleaning. Feature data standardization is performed on the clinical data on kidney disease obtained in step S12, records with missing values are removed, and a standard data sample is obtained.
- the standard data sample includes a standard medical feature data set and a standard diagnosis result set, whose entries are in one-to-one correspondence. Standardization includes the following two steps:
- different expressions of the same feature are replaced with the entry from the feature standard library, so that feature descriptions are unified and standardized medical feature data is obtained;
- medical feature values are replaced so that symbols, letters, characters, units, and medical codes are unified.
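- The two standardization steps and the removal of records with missing values might look like the sketch below. The alias tables, field names and function names are hypothetical stand-ins for the feature standard library, which the patent does not spell out.

```python
# Hypothetical alias tables standing in for the feature standard library.
FEATURE_ALIASES = {
    "sex": "gender",
    "scr": "blood creatinine",
    "serum creatinine": "blood creatinine",
}
VALUE_ALIASES = {
    "+": "positive", "pos": "positive",
    "-": "negative", "neg": "negative",
    "f": "female", "m": "male",
}


def standardize_record(record):
    """Unify feature names (step 1) and textual feature values (step 2)."""
    clean = {}
    for name, value in record.items():
        key = name.strip().lower()
        key = FEATURE_ALIASES.get(key, key)
        if isinstance(value, str):
            text = value.strip().lower()
            value = VALUE_ALIASES.get(text, text)
        clean[key] = value
    return clean


def build_standard_sample(records, required_features):
    """Standardize every record and drop those with a missing required value."""
    sample = []
    for record in records:
        clean = standardize_record(record)
        if all(clean.get(f) not in (None, "") for f in required_features):
            sample.append(clean)
    return sample
```

- Dropping incomplete records follows the text above; whether any imputation is used instead is not stated in the source.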
- Step S14: feature screening. The kidney disease-related features provided by nephrologists are combined with features selected by applying statistical methods to the standard data samples; the selected features, covering the epidemiological, examination and laboratory test, and symptom features relevant to the kidney disease screening task, form the selected medical feature data set.
- the kidney disease-related features provided by the nephrologists are a table of features that the nephrologists, drawing on their clinical experience, supply offline.
- T-test and Chi-square test are commonly used methods in statistics and belong to the prior art.
- Python programs that implement the T-test and Chi-square test are existing computer programs on the market and also belong to the prior art.
- the present invention only uses the above statistical methods and related software to calculate the probability value P.
- P value: the threshold is set at P less than 0.05, at which the selected features can be considered to have a statistically significant correlation with the risk of chronic kidney disease, so it is reasonable to select these features to build the model.
- The T test is taken as an example for further explanation.
- T test and chi-square test are used to screen out the influencing factors related to chronic kidney disease.
- the T test compares the mean value of each factor between the two groups to study whether the factor differs significantly between subjects diagnosed with chronic kidney disease and those who are not.
- Basic premise: the sample data follows a normal or approximately normal distribution; the test is applied to quantitative data (white blood cell count, red blood cell count, hemoglobin, etc.). The operation is as follows:
- the normality test is performed on the quantitative data.
- the normality test result is confirmed by viewing the Q-Q plot: if the data points lie close to a straight line, the data can be considered to follow a normal distribution. The corresponding P value is then obtained through the T test and compared with the significance level. If P < 0.05, the null hypothesis (H0: the factor does not differ significantly between subjects with and without a diagnosis of chronic kidney disease) is rejected, and the difference in this factor between the two groups is considered statistically significant. The factor is then treated as one of the factors that influence the presence or absence of chronic kidney disease. In this way the influencing factors related to chronic kidney disease are screened out from all the factors.
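- For reference, one common form of the two-sample statistic is shown below. The source does not say which variant of the T test is used, so the unequal-variance (Welch) form is an assumption; the symbols are illustrative, with subscript 1 for the chronic kidney disease group and 0 for the non-chronic kidney disease group.

```latex
t = \frac{\bar{x}_1 - \bar{x}_0}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_0^2}{n_0}}}
```

- Here \bar{x}, s^2 and n are the sample mean, sample variance and sample size of the factor in each group; the P value is read from the t distribution for this statistic.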
- the principle and steps of the chi-square test are similar to the above, but it is applied to categorical data (gender, urine occult blood, etc.).
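- As an illustration of the chi-square screening for a binary feature, the sketch below computes the Pearson statistic for a 2x2 table of feature value against diagnosis, without continuity correction. It is a standard-library stand-in, not the program the patent refers to; the function name and table layout are assumptions.

```python
import math


def chi_square_2x2(table):
    """Pearson chi-square test for a 2x2 contingency table.

    `table` is [[a, b], [c, d]]: rows are the feature's two categories,
    columns are CKD / non-CKD counts. Returns (chi2, p_value).
    """
    row_totals = [sum(row) for row in table]
    col_totals = [table[0][j] + table[1][j] for j in range(2)]
    n = sum(row_totals)
    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (table[i][j] - expected) ** 2 / expected
    # With 1 degree of freedom the chi-square tail probability
    # equals erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value
```

- A feature would be kept when the returned P value is below 0.05, matching the threshold stated above.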
- Step S15: split the feature data set
- the StratifiedShuffleSplit stratified splitting method is an existing technology, available as a functional module for Python programs.
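- The idea behind a stratified shuffle split can be sketched with the standard library alone. This is a hand-rolled stand-in for illustration, not the StratifiedShuffleSplit module itself; the function name, `test_fraction` and `seed` are assumptions, and the split ratio is not given in the source.

```python
import random
from collections import defaultdict


def stratified_shuffle_split(labels, test_fraction=0.2, seed=0):
    """Return (train_indices, test_indices) preserving class proportions."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)
    train, test = [], []
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)
        # Take the same fraction from every class for the test set.
        n_test = round(len(indices) * test_fraction)
        test.extend(indices[:n_test])
        train.extend(indices[n_test:])
    rng.shuffle(train)
    rng.shuffle(test)
    return train, test
```

- Splitting per class keeps the proportion of chronic kidney disease cases the same in the training data and the test data.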
- the BP neural network includes neuron weights and biases;
- the random forest is composed of multiple decision trees; each decision tree includes multiple nodes, and each node holds a medical feature and a threshold;
- XGBoost includes XGBoost decision trees and the relationship between the XGBoost decision trees;
- each XGBoost decision tree includes a plurality of nodes, and each node holds a medical feature and a threshold;
- the relationship between the XGBoost decision trees is given by a gradient descent optimization algorithm: each decision tree is obtained from the previous decision tree according to the gradient descent optimization algorithm;
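- The relationship between successive trees can be written in the generic gradient boosting form below. This is the textbook formulation, given as an assumption about what the text means; the symbols F_m, f_m, \eta and L are not from the patent, and XGBoost itself additionally uses second-order information and regularization.

```latex
F_m(x) = F_{m-1}(x) + \eta \, f_m(x), \qquad
f_m \text{ is fitted to } g_i = -\left.\frac{\partial L\bigl(y_i, F(x_i)\bigr)}{\partial F(x_i)}\right|_{F = F_{m-1}}
```

- Here F_{m-1} is the ensemble after the previous tree, f_m is the next tree, \eta is a learning rate, and L is the loss between the diagnosis y_i and the prediction.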
- the training data is processed by the BP neural network algorithm, the XGBoost algorithm and the random forest algorithm separately, yielding a BP neural network prediction result set, an XGBoost prediction result set and a random forest prediction result set.
- the total prediction result set consists of prediction result values, each of which is either yes or no, where yes represents chronic kidney disease and no represents non-chronic kidney disease; a voting method is applied to the total prediction result set: the yes and no values among the predictions are counted and the value with the larger count wins, giving the chronic kidney disease prediction result;
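- The voting step can be sketched as follows, assuming each model's predictions are a list of "yes"/"no" values in the same patient order. The function and parameter names are illustrative.

```python
from collections import Counter


def majority_vote(bp_preds, xgb_preds, rf_preds):
    """Combine three yes/no prediction lists by per-patient majority vote."""
    results = []
    for votes in zip(bp_preds, xgb_preds, rf_preds):
        # With three binary voters there is always a strict majority.
        winner, _count = Counter(votes).most_common(1)[0]
        results.append(winner)
    return results
```

- Using three voters with two possible values avoids ties, so no tie-breaking rule is needed.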
- the selected medical feature data whose chronic kidney disease prediction results do not match the diagnosis results in the corresponding patient standard diagnosis result set is fed back into the random forest algorithm for further training, and the medical features and thresholds in the decision tree nodes are adjusted until the prediction results agree with the diagnosis results in the standard diagnosis result set, yielding adapted medical features and thresholds in the decision tree nodes that can distinguish chronic kidney disease;
- likewise, the selected medical feature data whose chronic kidney disease prediction results do not match the diagnosis results in the corresponding patient standard diagnosis result set is fed back into the XGBoost algorithm for further training, and the medical features and thresholds in the XGBoost decision tree nodes and the relationship between the XGBoost decision trees are adjusted until the prediction results agree with the diagnosis results in the standard diagnosis result set, yielding adapted medical features and thresholds in the XGBoost decision tree nodes and an adapted relationship between the XGBoost decision trees that can distinguish chronic kidney disease;
- the adapted chronic kidney disease risk screening parameter set capable of distinguishing chronic kidney disease includes the BP neural network neuron weights and biases, the medical features and thresholds in the random forest decision tree nodes, the medical features and thresholds in the XGBoost decision tree nodes, and the relationship between the XGBoost decision trees;
- this chronic kidney disease risk screening parameter set, together with the BP neural network algorithm, the XGBoost algorithm, and the random forest algorithm, constitutes the chronic kidney disease risk screening model;
- Step S17: test the chronic kidney disease risk screening model
- the chronic kidney disease risk screening model is run on the test data obtained in step S15, and the accuracy, recall, and precision of the results are calculated. If the average of these three test indicators exceeds 0.95, the chronic kidney disease artificial intelligence screening model is effective; if the average does not reach 0.95, return to step S16, use the training data again to adjust the algorithm parameters and obtain a new adapted chronic kidney disease risk screening parameter set, and so obtain a new chronic kidney disease risk screening model;
- the accuracy is the ratio of the number of test cases the model predicts correctly (chronic kidney disease plus non-chronic kidney disease) to the total number of test cases;
- the recall is the ratio of the number of chronic kidney disease cases in the test data that the model predicts correctly to the total number of cases diagnosed as chronic kidney disease in the test data;
- the precision is the ratio of the number of chronic kidney disease cases that the model predicts correctly to the total number of cases the model predicts as chronic kidney disease in the test data.
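- The three indicators and the 0.95 acceptance check can be sketched as below, assuming "yes"/"no" labels as in the voting step. The function names and the handling of empty denominators are assumptions.

```python
def screening_metrics(y_true, y_pred, positive="yes"):
    """Return (accuracy, recall, precision) for yes/no screening results."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    accuracy = (tp + tn) / len(pairs)
    # Guard against empty denominators (an assumption, not in the source).
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    return accuracy, recall, precision


def model_is_effective(y_true, y_pred, threshold=0.95):
    """Accept the model when the mean of the three indicators exceeds 0.95."""
    return sum(screening_metrics(y_true, y_pred)) / 3 > threshold
```

- When `model_is_effective` returns False, the procedure above returns to the training step and adjusts the parameters again.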
- Establishing the effective chronic kidney disease risk screening model: after steps S16 and S17, the chronic kidney disease risk screening model whose accuracy, precision, and recall exceed 0.95 is determined to be the effective chronic kidney disease risk screening model.
- Step S2: organize the user data to be screened. The hospital or medical examination center provides the user data to be screened, which is standardized to obtain standardized user data that meets the input requirements of the chronic kidney disease risk screening model.
- the user data to be screened is medical feature data of the user obtained by a hospital or a physical examination center.
- Step S3: input the standardized user data to be screened into the chronic kidney disease risk screening model for model calculation to obtain the kidney disease risk prediction result. The standardized user data may be entered into the model by single import, batch import, or manual input.
- the present invention also proposes a method for constructing an AI chronic kidney disease risk screening model, which includes the following steps:
- A1: obtain a chronic kidney disease risk screening model from training data
- the sklearn package for the Python language is used, and three models (BP neural network, XGBoost and random forest) are combined into an ensemble learning classifier system; an adapted chronic kidney disease risk screening parameter set that can distinguish chronic kidney disease is established.
- the data is trained iteratively in the three models (BP neural network, XGBoost and random forest) to optimize the chronic kidney disease risk screening parameter set, finally obtaining an adapted chronic kidney disease risk screening parameter set that can distinguish chronic kidney disease.
- the chronic kidney disease risk screening parameter set includes the weights and biases of the BP neural network neurons that can distinguish chronic kidney disease, the medical features and thresholds in the random forest decision tree nodes, the medical features and thresholds in the XGBoost decision tree nodes, and the relationship between the XGBoost decision trees;
- the BP neural network includes neuron weights and biases;
- the random forest is composed of multiple decision trees; each decision tree includes multiple nodes, and each node holds a medical feature and a threshold;
- XGBoost includes XGBoost decision trees and the relationship between the XGBoost decision trees;
- each XGBoost decision tree includes a plurality of nodes, and each node holds a medical feature and a threshold;
- the relationship between the XGBoost decision trees is given by a gradient descent optimization algorithm: each decision tree is obtained from the previous decision tree according to the gradient descent optimization algorithm;
- the training data is processed by the BP neural network algorithm, the XGBoost algorithm and the random forest algorithm separately, yielding a BP neural network prediction result set, an XGBoost prediction result set and a random forest prediction result set.
- the total prediction result set consists of prediction result values, each of which is either yes or no, where yes represents chronic kidney disease and no represents non-chronic kidney disease; a voting method is applied to the total prediction result set: the yes and no values among the predictions are counted and the value with the larger count wins, giving the chronic kidney disease prediction result;
- the selected medical feature data whose chronic kidney disease prediction results do not match the diagnosis results in the corresponding patient standard diagnosis result set is fed back into the random forest algorithm for further training, and the medical features and thresholds in the decision tree nodes are adjusted until the prediction results agree with the diagnosis results in the standard diagnosis result set, yielding adapted medical features and thresholds in the decision tree nodes that can distinguish chronic kidney disease;
- likewise, the selected medical feature data whose chronic kidney disease prediction results do not match the diagnosis results in the corresponding patient standard diagnosis result set is fed back into the XGBoost algorithm for further training, and the medical features and thresholds in the XGBoost decision tree nodes and the relationship between the XGBoost decision trees are adjusted until the prediction results agree with the diagnosis results in the standard diagnosis result set, yielding adapted medical features and thresholds in the XGBoost decision tree nodes and an adapted relationship between the XGBoost decision trees that can distinguish chronic kidney disease;
- the adapted chronic kidney disease risk screening parameter set capable of distinguishing chronic kidney disease includes the BP neural network neuron weights and biases, the medical features and thresholds in the random forest decision tree nodes, the medical features and thresholds in the XGBoost decision tree nodes, and the relationship between the XGBoost decision trees;
- this chronic kidney disease risk screening parameter set, together with the BP neural network algorithm, the XGBoost algorithm, and the random forest algorithm, constitutes the chronic kidney disease risk screening model;
- A2: test the chronic kidney disease risk screening model
- the chronic kidney disease risk screening model is run on the test data, and the accuracy, recall, and precision of the results are calculated. If the average of these three test indicators exceeds 0.95, the chronic kidney disease artificial intelligence screening model is effective; if the average does not reach 0.95, return to step A1, use the training data again to adjust the algorithm parameters and obtain a new adapted chronic kidney disease risk screening parameter set, and so obtain a new chronic kidney disease risk screening model;
- the test data comes from electronic medical records;
- the accuracy is the ratio of the number of test cases the model predicts correctly (chronic kidney disease plus non-chronic kidney disease) to the total number of test cases;
- the recall is the ratio of the number of chronic kidney disease cases in the test data that the model predicts correctly to the total number of cases diagnosed as chronic kidney disease in the test data;
- the precision is the ratio of the number of chronic kidney disease cases that the model predicts correctly to the total number of cases the model predicts as chronic kidney disease in the test data.
- A3: establish the effective chronic kidney disease risk screening model. After steps A1 and A2, the chronic kidney disease risk screening model whose accuracy, precision, and recall exceed 0.95 is determined to be the effective chronic kidney disease risk screening model.
- the present invention also proposes an AI chronic kidney disease risk screening system, including an effective risk screening model for chronic kidney disease risk.
- the effective chronic kidney disease risk screening model comprises three models: BP neural network, XGBoost and random forest.
- the invention uses an ensemble of machine learning algorithms (BP neural network, XGBoost and random forest) to train the chronic kidney disease risk screening model, which can automatically screen out high-risk groups for chronic kidney disease based on basic body measurement information, symptom information, medical examination information, family history, past history, living habits and other data, with an accuracy as high as 0.96.
- the invention constructs a machine learning model for chronic kidney disease risk screening. It can improve the general public's awareness of kidney disease risk and provide guidance for healthy living.
- the model trained with the ensemble machine learning algorithm reaches an accuracy of 96%; the cloud-based deployment scheme enables large-scale, efficient and accurate screening, which saves medical resources to a large extent.
- Fig. 1 is a construction process and application diagram of an effective risk screening model for chronic kidney disease risk according to the present invention.
- An AI chronic kidney disease risk screening method includes the following steps:
- Step S1: establish an effective chronic kidney disease risk screening model;
- Step S2: organize the user data to be screened;
- Step S3: input the user data to be screened into the chronic kidney disease risk screening model for model calculation to obtain the kidney disease risk prediction result.
- Step S11: prepare medical record data. Collect patients' electronic medical records from the hospital electronic medical record platform, covering both patients diagnosed with chronic kidney disease and patients without chronic kidney disease;
- the electronic medical records of patients diagnosed with chronic kidney disease are collected by comparing the physician's diagnosis in the electronic medical record with the disease names in the chronic kidney disease name database;
- the electronic medical records of patients without chronic kidney disease are taken from patients and physical examinees seen in the internal medicine department during the same period, excluding records with an unclear medical history or incomplete examination and laboratory test data, and excluding patients with acute diseases, severe infections or tumors;
- the medical record data includes disease course records, examination and laboratory test results, doctor's orders, surgical records, nursing records, and the confirmed diagnosis results; the examination and laboratory test results include medical features and thresholds;
- the chronic kidney disease name database contains the names of the various diseases that can be classified as chronic kidney disease.
- Step S12: extract medical features. Chronic kidney disease medical feature extraction is performed on the qualified electronic medical record data obtained in step S11 to obtain medical features and medical feature values; the chronic kidney disease medical features include basic information, past history, family history, subjective symptoms, blood tests, and urine tests.
- the basic information table includes 7 specific feature fields: gender, age, height, weight, blood pressure, pregnancy status, and occupation;
- the past history table includes four specific feature fields: diabetes, hypertension, smoking history, and drinking history;
- the family history table includes 5 specific feature fields: chronic kidney disease, diabetes, hypertension, renal cysts, and polycystic kidney disease;
- the subjective symptoms table includes 24 specific feature fields: convulsions, polyuria, nausea, fever, fatigue, joint pain, dry mouth, urgency, dysuria, vomiting, rash, macroscopic hematuria, upper respiratory tract infection, oliguria, loss of appetite, edema, headache, dizziness, anuria, foamy urine, chest tightness, dry eyes, low back pain, and eclampsia;
- the blood test table includes: blood C-reactive protein, blood white blood cell count, hemoglobin, red blood cell count, blood sugar, platelets, blood hepatitis B e antibody, blood hepatitis B e antigen, blood hepatitis B surface antibody, blood hepatitis B surface antigen, blood hepatitis B core antibody, blood hepatitis C antibody, erythrocyte sedimentation rate, blood lactate dehydrogenase, blood albumin, blood aspartate aminotransferase, blood alanine aminotransferase, blood total protein, blood total bilirubin, blood total cholesterol, blood triglycerides, blood creatinine, blood uric acid, blood urea nitrogen, blood potassium, blood sodium, blood calcium, blood phosphorus, blood chloride, blood cystatin C, anti-neutrophil cytoplasmic antibody, complement C4, complement C3, complement C2, complement C1q, immunoglobulin A, immunoglobulin E, immunoglobulin G, and immunoglobulin
- the urine test table includes 20 specific feature fields: urine white blood cells, urine specific gravity, urine bilirubin, urine protein, urine red blood cells, urine creatinine, urine occult blood, urine ketone bodies, urine microalbumin, urine casts, urine albumin, urine pH, urobilinogen, urine nitrite, urine glucose, microscopic hematuria, urine osmotic pressure, urine sodium, 24-hour urine output, and 24-hour urine protein quantification;
- the medical characteristic value is the specific value of each medical characteristic among basic information, past history, family history, subjective symptoms, blood test, and urine test characteristics.
- Step S13: feature data standardization and data cleaning. Feature data standardization is performed on the clinical data on kidney disease obtained in step S12, records with missing values are removed, and a standard data sample is obtained.
- the standard data sample includes a standard medical feature data set and a standard diagnosis result set, whose entries are in one-to-one correspondence. Standardization includes the following two steps:
- different expressions of the same feature are replaced with the entry from the feature standard library, so that feature descriptions are unified and standardized medical feature data is obtained;
- medical feature values are replaced so that symbols, letters, characters, units, and medical codes are unified.
- Step S14: feature screening. The kidney disease-related features provided by nephrologists are combined with the kidney disease-related features screened out by applying statistical methods to the standard data samples, to summarize the epidemiological, examination and test, and symptom features used for the kidney disease screening task and obtain the selected medical feature data set.
- The kidney disease-related features provided by the nephrologists are a table of kidney disease-related features, based on medical experience, that the nephrologists provide offline.
- The t-test and the chi-square test are commonly used statistical methods and belong to the prior art.
- Python computer programs that perform the t-test and the chi-square test are existing, commercially available programs and also belong to the prior art.
- The present invention only uses the above statistical methods and related software to calculate the probability value P.
- The threshold is set at P < 0.05: features whose P value is below 0.05 can be considered to have a highly significant correlation with the risk of chronic kidney disease, so it is reasonable to select these features to build the model.
- The t-test is taken as an example for further explanation.
- The t-test and the chi-square test are used to screen out the influencing factors related to chronic kidney disease.
- The t-test compares the mean value of each factor to determine whether the factor differs significantly between patients diagnosed with and without chronic kidney disease.
- Basic premise: the sample data follow a normal or approximately normal distribution. The t-test is used for quantitative data (white blood cells, red blood cells, hemoglobin, etc.). The operation is as follows:
- A normality test is first performed on the quantitative data.
- The normality test result is confirmed by viewing the Q-Q plot: if the data points lie essentially along a straight line, the data can be considered normally distributed. The corresponding P value is then obtained through the t-test and compared with the significance level. If P < 0.05, the null hypothesis (H0: the factor does not differ significantly between patients with and without a diagnosis of chronic kidney disease) is rejected, and the difference in this factor between the presence and absence of chronic kidney disease is considered statistically significant. The factor is therefore regarded as one of the factors affecting the presence or absence of chronic kidney disease. In this way, the influencing factors related to chronic kidney disease are screened out from all the factors.
- The principle and steps of the chi-square test are similar to the above, but it is applied to categorical data (gender, urine occult blood, etc.).
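As an illustration of the chi-square screening of a categorical feature, the following is a minimal sketch in pure Python, assuming a 2x2 table (feature present or absent versus chronic kidney disease yes or no) and Pearson's statistic without continuity correction. The function name `chi_square_2x2` and the counts are hypothetical and not taken from the patent, which only says that existing Python statistics software is used.

```python
import math

def chi_square_2x2(a, b, c, d):
    """Pearson chi-square test (no continuity correction) for a 2x2 table.

    a, b: feature present, with CKD / without CKD
    c, d: feature absent,  with CKD / without CKD
    Returns (chi2, p_value) for 1 degree of freedom.
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("every row and column must contain at least one sample")
    chi2 = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # With 1 degree of freedom, P(X > chi2) equals erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value

# Hypothetical counts: urine occult blood positive/negative vs. CKD yes/no.
chi2, p = chi_square_2x2(60, 20, 40, 80)
keep_feature = p < 0.05  # retain the feature when H0 is rejected
```

A feature with more than two categories would need the general contingency-table form of the test rather than this 2x2 shortcut.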
- Step S15: splitting the feature data set.
- The StratifiedShuffleSplit stratified splitting method is prior art and is a functional module available in Python.
- The BP neural network includes neuron weights and biases;
- The random forest is composed of multiple decision trees; each decision tree includes multiple nodes, and the nodes are medical features and thresholds;
- XGBoost includes XGBoost decision trees and the relationship between the XGBoost decision trees;
- Each XGBoost decision tree includes multiple nodes, and the nodes are medical features and thresholds;
- The relationship between the XGBoost decision trees is a gradient descent optimization algorithm: each subsequent decision tree is obtained from the previous decision tree according to the gradient descent optimization algorithm;
- The training data are processed by the BP neural network algorithm, the XGBoost algorithm, and the random forest algorithm respectively, yielding a BP neural network prediction result set, an XGBoost prediction result set, and a random forest prediction result set.
- The total prediction result set consists of prediction result values, each of which is either "yes" or "no": "yes" represents chronic kidney disease and "no" represents non-chronic kidney disease. A voting method is applied to the total prediction result set: the "yes" and "no" predictions are counted, and the value with the larger count wins, giving the chronic kidney disease prediction result;
- The selected medical feature data of the samples whose chronic kidney disease prediction result does not match the diagnosis result in the corresponding patient's standard diagnosis result set are fed into the random forest algorithm for further training, and the medical features and thresholds in the decision tree nodes are adjusted until the prediction result matches the diagnosis result, yielding decision tree node medical features and thresholds adapted to distinguishing chronic kidney disease;
- The selected medical feature data of the same mismatched samples are fed into the XGBoost algorithm for further training, and the medical features and thresholds in the XGBoost decision tree nodes and the relationship between the XGBoost decision trees are adjusted until the prediction result matches the diagnosis result, yielding XGBoost decision tree node medical features and thresholds, and a relationship between the XGBoost decision trees, adapted to distinguishing chronic kidney disease;
- The adapted chronic kidney disease risk screening parameter set capable of distinguishing chronic kidney disease includes the adapted BP neural network neuron weights and biases, the medical features and thresholds in the random forest decision tree nodes, the medical features and thresholds in the XGBoost decision tree nodes, and the relationship between the XGBoost decision trees;
- This chronic kidney disease risk screening parameter set, together with the BP neural network algorithm, the XGBoost algorithm, and the random forest algorithm, constitutes the chronic kidney disease risk screening model;
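The voting step described above can be sketched as follows. This is a minimal illustration in pure Python, assuming each of the three models has already produced one "yes" or "no" per sample; the function name `majority_vote` and the toy predictions are hypothetical.

```python
from collections import Counter

def majority_vote(*model_predictions):
    """Combine per-model yes/no predictions sample by sample."""
    if len({len(p) for p in model_predictions}) != 1:
        raise ValueError("all models must predict the same number of samples")
    combined = []
    for votes in zip(*model_predictions):
        # The value predicted by the most models wins for this sample.
        combined.append(Counter(votes).most_common(1)[0][0])
    return combined

# Hypothetical outputs of the BP neural network, XGBoost and random forest.
bp_pred = ["yes", "no", "yes", "no"]
xgb_pred = ["yes", "yes", "no", "no"]
rf_pred = ["no", "yes", "yes", "no"]
final_pred = majority_vote(bp_pred, xgb_pred, rf_pred)
```

With three models and two possible values, a tie cannot occur, so no tie-breaking rule is needed in this setting.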
- Step S17: testing the chronic kidney disease risk screening model.
- The chronic kidney disease risk screening model is run on the test data obtained in step S15, and the accuracy, recall, and precision of the results are calculated. If the average of these three test indicators exceeds 0.95, the chronic kidney disease artificial intelligence screening model is effective; if the average does not reach 0.95, the process returns to step S16, the training data are used again to tune the algorithm parameters, the adapted chronic kidney disease risk screening parameter set is obtained anew, and the chronic kidney disease risk screening model is obtained again;
- The accuracy is the ratio of the number of chronic kidney disease cases plus the number of non-chronic kidney disease cases that the model correctly predicts in the test data to the total number of test data;
- The recall is the ratio of the number of chronic kidney disease cases that the model correctly predicts in the test data to the total number of cases diagnosed as chronic kidney disease in the test data;
- The precision is the ratio of the number of chronic kidney disease cases that the model correctly predicts in the test data to the total number of cases that the model predicts as chronic kidney disease.
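The three test indicators defined above, and the acceptance rule based on their average, can be written out as a short sketch. It is pure Python, the function names are hypothetical, and the labels are toy data; note that the text also states a stricter criterion elsewhere (all three indicators above 0.95), which is not what this sketch checks.

```python
def screening_metrics(y_true, y_pred, positive="yes"):
    """Return (accuracy, recall, precision) as defined in the text."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    accuracy = (tp + tn) / len(pairs)
    recall = tp / (tp + fn) if tp + fn else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    return accuracy, recall, precision

def model_is_effective(metrics, threshold=0.95):
    """Average-based rule: the mean of the three indicators must exceed the threshold."""
    return sum(metrics) / len(metrics) > threshold

# Toy diagnosis results and model predictions ("yes" = chronic kidney disease).
y_true = ["yes", "yes", "yes", "no", "no"]
y_pred = ["yes", "yes", "no", "no", "yes"]
metrics = screening_metrics(y_true, y_pred)
effective = model_is_effective(metrics)
```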
- Step S18: establishing the effective chronic kidney disease risk screening model. A chronic kidney disease risk screening model obtained through steps S16 and S17 whose accuracy, precision, and recall all exceed 0.95 is determined to be the effective chronic kidney disease risk screening model, finally giving an effective chronic kidney disease model.
- Further, step S2 organizes the data of the users to be screened: a hospital or physical examination center provides the data of the users to be screened, and the provided data are standardized to obtain standardized data that meet the data input standard of the chronic kidney disease risk screening model.
- The data of the users to be screened are the medical feature data of those users obtained from examinations at a hospital or physical examination center.
- Step S3: the standardized data of the users to be screened are input into the chronic kidney disease risk screening model for model calculation, finally giving the kidney disease risk prediction result. Further, the standardized data are input into the model by import, batch import, or manual entry.
- The present invention also proposes a method for constructing an AI chronic kidney disease risk screening model, which includes the following steps:
- A1: training on the training data to obtain the chronic kidney disease risk screening model.
- Using the sklearn package of the Python development language, the three models BP neural network, XGBoost, and random forest are selected to build an ensemble learning classifier system, and a chronic kidney disease risk screening parameter set adapted to distinguishing chronic kidney disease is established.
- The data are trained and iteratively trained in the three models BP neural network, XGBoost, and random forest, and the chronic kidney disease risk screening parameter set is tuned, finally yielding the adapted chronic kidney disease risk screening parameter set capable of distinguishing chronic kidney disease.
- This chronic kidney disease risk screening parameter set includes the adapted BP neural network neuron weights and biases, the medical features and thresholds in the random forest decision tree nodes, the medical features and thresholds in the XGBoost decision tree nodes, and the relationship between the XGBoost decision trees;
- The BP neural network includes neuron weights and biases;
- The random forest is composed of multiple decision trees; each decision tree includes multiple nodes, and the nodes are medical features and thresholds;
- XGBoost includes XGBoost decision trees and the relationship between the XGBoost decision trees;
- Each XGBoost decision tree includes multiple nodes, and the nodes are medical features and thresholds;
- The relationship between the XGBoost decision trees is a gradient descent optimization algorithm: each subsequent decision tree is obtained from the previous decision tree according to the gradient descent optimization algorithm;
- The training data are processed by the BP neural network algorithm, the XGBoost algorithm, and the random forest algorithm respectively, yielding a BP neural network prediction result set, an XGBoost prediction result set, and a random forest prediction result set.
- The total prediction result set consists of prediction result values, each of which is either "yes" or "no": "yes" represents chronic kidney disease and "no" represents non-chronic kidney disease. A voting method is applied to the total prediction result set: the "yes" and "no" predictions are counted, and the value with the larger count wins, giving the chronic kidney disease prediction result;
- The selected medical feature data of the samples whose chronic kidney disease prediction result does not match the diagnosis result in the corresponding patient's standard diagnosis result set are fed into the random forest algorithm for further training, and the medical features and thresholds in the decision tree nodes are adjusted until the prediction result matches the diagnosis result, yielding decision tree node medical features and thresholds adapted to distinguishing chronic kidney disease;
- The selected medical feature data of the same mismatched samples are fed into the XGBoost algorithm for further training, and the medical features and thresholds in the XGBoost decision tree nodes and the relationship between the XGBoost decision trees are adjusted until the prediction result matches the diagnosis result, yielding XGBoost decision tree node medical features and thresholds, and a relationship between the XGBoost decision trees, adapted to distinguishing chronic kidney disease;
- The adapted chronic kidney disease risk screening parameter set capable of distinguishing chronic kidney disease includes the adapted BP neural network neuron weights and biases, the medical features and thresholds in the random forest decision tree nodes, the medical features and thresholds in the XGBoost decision tree nodes, and the relationship between the XGBoost decision trees;
- This chronic kidney disease risk screening parameter set, together with the BP neural network algorithm, the XGBoost algorithm, and the random forest algorithm, constitutes the chronic kidney disease risk screening model;
- A2: testing the chronic kidney disease risk screening model.
- The chronic kidney disease risk screening model is run on the test data, and the accuracy, recall, and precision of the results are calculated. If the average of these three test indicators exceeds 0.95, the chronic kidney disease artificial intelligence screening model is effective; if the average does not reach 0.95, the process returns to step A1, the training data are used again to tune the algorithm parameters, the adapted chronic kidney disease risk screening parameter set is obtained anew, and the chronic kidney disease risk screening model is obtained again;
- The test data come from electronic medical records;
- The accuracy is the ratio of the number of chronic kidney disease cases plus the number of non-chronic kidney disease cases that the model correctly predicts in the test data to the total number of test data;
- The recall is the ratio of the number of chronic kidney disease cases that the model correctly predicts in the test data to the total number of cases diagnosed as chronic kidney disease in the test data;
- The precision is the ratio of the number of chronic kidney disease cases that the model correctly predicts in the test data to the total number of cases that the model predicts as chronic kidney disease.
- A3: establishing the effective chronic kidney disease risk screening model. A chronic kidney disease risk screening model obtained through steps A1 and A2 whose accuracy, precision, and recall all exceed 0.95 is determined to be the effective chronic kidney disease risk screening model.
- The present invention also proposes an AI chronic kidney disease risk screening system, which includes the effective chronic kidney disease risk screening model.
- The effective chronic kidney disease risk screening model includes the three models BP neural network, XGBoost, and random forest.
Abstract
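A method and system for chronic kidney disease risk screening, specifically relating to the construction of a chronic kidney disease risk screening model by machine learning methods, comprising: establishing an effective chronic kidney disease risk screening model; organizing the data of the users to be screened; and feeding the data of the users to be screened into the chronic kidney disease risk screening model for model calculation, finally obtaining a kidney disease risk result. An efficient, low-cost, and highly accurate chronic kidney disease risk screening system is thereby achieved. The method trains the chronic kidney disease risk screening model with an ensemble of the machine learning algorithms BP neural network, XGBoost, and random forest, and can automatically screen out people at high risk of chronic kidney disease from data such as basic body measurements, symptoms, medical test and examination results, family history, past history, and lifestyle habits, with an accuracy of 0.96 or higher.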
Description
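The present invention relates to a chronic kidney disease risk screening method and system, and specifically to the construction of a chronic kidney disease risk screening model by machine learning methods and to a chronic kidney disease risk screening and assessment method and system. The model, assessment method, and system are used to screen the medical feature indicators of persons undergoing physical examination and to give a chronic kidney disease risk assessment value, thereby achieving efficient, low-cost, and highly accurate chronic kidney disease risk screening.
Chronic kidney disease is characterized by high prevalence, low awareness, poor prognosis, and high medical costs, and is another disease that seriously endangers human health, after cardiovascular and cerebrovascular diseases, diabetes, and malignant tumors. In recent years, as China's population has aged and the incidence of diseases such as diabetes and hypertension has risen year by year, the prevalence of chronic kidney disease has also risen year by year. The prevalence of chronic kidney disease among people over 18 in China is 10.8%, while the awareness rate is below 5%. An effective chronic kidney disease risk screening system is therefore urgently needed for early population screening, to raise awareness, facilitate early detection and early treatment, prevent the continued deterioration of renal function, and reduce the economic burden on individuals, families, and society. At present, chronic kidney disease risk screening requires the person to be examined in hospital, with a nephrologist making a judgment based on clinical guidelines and practical experience, which is not conducive to efficient population screening.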
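Summary of the invention
To solve the above technical problem, the present invention proposes an AI chronic kidney disease risk screening method, comprising the following steps:
Step S1: establishing an effective chronic kidney disease risk screening model;
Step S2: organizing the data of the users to be screened;
Step S3: feeding the data of the users to be screened into the chronic kidney disease risk screening model for model calculation, finally obtaining the kidney disease risk prediction result.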
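Establishing the effective chronic kidney disease risk screening model comprises the following steps:
Step S11: preparing medical record data. Patients' electronic medical records are collected from the hospital electronic medical record platform, gathering the records of patients diagnosed with chronic kidney disease and of patients without chronic kidney disease;
The electronic medical records of patients diagnosed with chronic kidney disease are collected as follows: the physician's diagnosis in each electronic medical record is compared with the disease names in the chronic kidney disease name database, to obtain the electronic medical records of chronic kidney disease patients;
The electronic medical records of non-chronic kidney disease patients are collected as follows: data of patients admitted to the internal medicine department and of persons undergoing physical examination during the same period are taken, excluding records with unclear medical history or incomplete test and examination data, and records of patients with concurrent acute disease, severe infection, or tumors; the medical record data include progress notes, examination and test results, physician orders, surgical records, nursing records, and the actual diagnosis results, and the examination and test results include medical features and thresholds;
Qualified electronic medical record data are obtained;
The chronic kidney disease name database contains the various medical disease names that can be judged to be chronic kidney disease.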
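The record selection in step S11 can be sketched as a simple lookup plus exclusion filter. This is a minimal illustration only: the disease names, the record field names (`diagnosis`, `history_clear`, `tests_complete`, `acute_infection_or_tumor`), and the function names are all assumptions made for the example and do not come from the patent.

```python
# Hypothetical excerpt of the chronic kidney disease name database.
CKD_DISEASE_NAMES = {
    "chronic kidney disease",
    "chronic renal failure",
    "diabetic nephropathy",
}

def is_ckd_record(record):
    """True if the physician's diagnosis matches a name in the database."""
    return record["diagnosis"].strip().lower() in CKD_DISEASE_NAMES

def is_eligible_control(record):
    """Non-CKD record that passes the exclusion criteria named in the text."""
    return (
        not is_ckd_record(record)
        and record["history_clear"]
        and record["tests_complete"]
        and not record["acute_infection_or_tumor"]
    )

# Toy electronic medical records (field names are assumptions).
records = [
    {"diagnosis": "Diabetic nephropathy", "history_clear": True,
     "tests_complete": True, "acute_infection_or_tumor": False},
    {"diagnosis": "Gastritis", "history_clear": True,
     "tests_complete": True, "acute_infection_or_tumor": False},
    {"diagnosis": "Gastritis", "history_clear": True,
     "tests_complete": False, "acute_infection_or_tumor": False},
]
ckd_records = [r for r in records if is_ckd_record(r)]
control_records = [r for r in records if is_eligible_control(r)]
```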
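Step S12: medical feature extraction. Chronic kidney disease medical features are extracted from the qualified electronic medical record data obtained in step S11, extracting the medical features and the medical feature values; the chronic kidney disease medical features include the categories basic information, past history, family history, subjective symptoms, blood tests, and urine tests.
The basic information table includes: gender, age, height, weight, blood pressure, pregnancy status, and occupation; 7 specific feature fields;
The past history table includes: diabetes, hypertension, smoking history, and drinking history; 4 specific feature fields;
The family history table includes: chronic kidney disease, diabetes, hypertension, renal cyst, and polycystic kidney; 5 specific feature fields;
The subjective symptoms table includes: convulsions, polyuria, nausea, fever, fatigue, joint pain, dry mouth, urinary urgency, painful urination, vomiting, rash, gross hematuria, upper respiratory tract infection, oliguria, loss of appetite, edema, headache, dizziness, anuria, foamy urine, chest tightness, dry eyes, low back pain, and eclampsia; 24 specific feature fields;
The blood test table includes: blood C-reactive protein, blood white blood cell count, hemoglobin, red blood cell count, blood glucose, platelets, blood hepatitis B e antibody, blood hepatitis B e antigen, blood hepatitis B surface antibody, blood hepatitis B surface antigen, blood hepatitis B core antibody, blood hepatitis C antibody, erythrocyte sedimentation rate, blood lactate dehydrogenase, blood albumin, blood aspartate aminotransferase, blood alanine aminotransferase, blood total protein, blood total bilirubin, blood total cholesterol, blood triglycerides, blood creatinine, blood uric acid, blood urea nitrogen, blood potassium, blood sodium, blood calcium, blood phosphorus, blood chloride, blood cystatin C, anti-neutrophil cytoplasmic antibody, complement C4, complement C3, complement C2, complement C1q, immunoglobulin A, immunoglobulin E, immunoglobulin G, and immunoglobulin M; 39 specific feature fields;
The urine test table includes: urine white blood cells, urine specific gravity, urine bilirubin, urine protein, urine red blood cells, urine creatinine, urine occult blood, urine ketone bodies, urine microalbumin, urinary casts, urine albumin, urine pH, urobilinogen, urine nitrite, urine glucose, microscopic hematuria, urine osmolality, urine sodium, 24-hour urine volume, and 24-hour urine protein quantification; 20 specific feature fields;
The medical feature values are the specific values of each medical feature in the basic information, past history, family history, subjective symptoms, blood test, and urine test categories.
The big data of clinical manifestations of kidney disease are obtained.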
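Step S13: feature data standardization and data cleaning. Feature data standardization is performed on the big data of clinical manifestations of kidney disease obtained in step S12, and data with missing values are removed, to obtain a standard data sample. The standard data sample includes a standard medical feature data set and a standard diagnosis result set, and the data in the two sets are in one-to-one correspondence. Step S13 includes the following two steps:
S131: feature data standardization.
A standard library and a chronic kidney disease specialist database are established. Image recognition software is used to recognize specialist books and literature on chronic kidney disease, and the results are stored in the chronic kidney disease specialist database; specialist e-books and electronic literature on chronic kidney disease are stored in the same database. Based on this database, a standard library of blood test items, urine test items, symptoms, and other medical entity terms is constructed manually. The standard library contains the standard name of each medical term and the similar names that have appeared for it, and each term is coded for unique identification, forming the feature standard library.
For the medical features and medical feature values extracted in step S12, different expressions of the same feature are replaced according to the feature standard library, unifying the feature descriptions to obtain standardized medical feature data.
Specifically, the replacement of medical feature values mainly covers symbols, letters, characters, units, and medical codes, which are thereby unified.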
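The replacement against the feature standard library in S131 can be pictured as an alias lookup. The sketch below is an assumption-laden illustration: the codes (`U004`, `B022`), the aliases, the value mapping, and the function name `standardize` are invented for the example, and the patent does not describe the library's actual layout.

```python
# Hypothetical excerpt of a feature standard library: unique code, standard
# name, and the variant names that have appeared in records.
FEATURE_STANDARD_LIBRARY = {
    "U004": {"standard_name": "urine protein",
             "aliases": {"尿蛋白", "PRO", "urinary protein"}},
    "B022": {"standard_name": "blood creatinine",
             "aliases": {"血肌酐", "Scr", "serum creatinine"}},
}

# Hypothetical mapping that unifies symbols and wording of feature values.
VALUE_STANDARD = {
    "+": "positive", "阳性": "positive", "pos": "positive",
    "-": "negative", "阴性": "negative", "neg": "negative",
}

# Index every known name (lower-cased) to its unique code.
ALIAS_INDEX = {}
for code, entry in FEATURE_STANDARD_LIBRARY.items():
    for name in entry["aliases"] | {entry["standard_name"]}:
        ALIAS_INDEX[name.lower()] = code

def standardize(name, value):
    """Return (code, standard_name, unified_value); unknown names raise KeyError."""
    code = ALIAS_INDEX[name.strip().lower()]
    standard_name = FEATURE_STANDARD_LIBRARY[code]["standard_name"]
    if isinstance(value, str):
        cleaned = value.strip()
        value = VALUE_STANDARD.get(cleaned.lower(), cleaned)
    return code, standard_name, value

protein = standardize("PRO", " + ")
creatinine = standardize("血肌酐", 88.4)
```

Raising on unknown names, rather than passing them through, makes unmapped expressions visible so they can be added to the library by hand.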
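S132: data cleaning.
For the standardized medical feature data, data with missing values are removed. For quantitative data, erroneous data are eliminated using the three-standard-deviation rule; qualitative and ordinal data are quantified using a unified coding scheme, forming the standard medical feature data set.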
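The two cleaning rules for quantitative data, dropping records with missing values and then applying the three-standard-deviation rule, can be sketched as follows. This is a minimal illustration using only the standard library; the function name `clean_records`, the field name `creatinine`, and the values are hypothetical, and the sample standard deviation is an assumption since the text does not say which estimator is used.

```python
import statistics

def clean_records(records, field):
    """Drop records with missing values, then drop 3-sigma outliers in `field`."""
    complete = [r for r in records if all(v is not None for v in r.values())]
    values = [r[field] for r in complete]
    mean = statistics.mean(values)
    sd = statistics.stdev(values)  # sample standard deviation (assumption)
    if sd == 0:
        return complete
    return [r for r in complete if abs(r[field] - mean) <= 3 * sd]

# Toy data: 20 plausible values, one gross outlier, one missing value.
records = [{"creatinine": float(v)} for v in range(70, 90)]
records.append({"creatinine": 8000.0})
records.append({"creatinine": None})
cleaned = clean_records(records, "creatinine")
```

The rule only works with enough samples: in a very small group a single extreme value inflates the standard deviation so much that it can never lie more than three deviations from the mean.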
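Step S14: feature screening. The kidney disease-related features provided by nephrologists are combined with the kidney disease-related features screened out by applying statistical methods to the standard data samples, to summarize the epidemiological, examination and test, and symptom features used for the kidney disease screening task and obtain the selected medical feature data set.
The kidney disease-related features provided by the nephrologists are a table of kidney disease-related features, based on medical experience, that the nephrologists provide offline.
The statistical methods applied to the standard data samples to screen kidney disease-related features are the t-test and the chi-square test. The t-test and the chi-square test are commonly used statistical methods and belong to the prior art; Python computer programs that perform the t-test and the chi-square test are existing, commercially available programs and also belong to the prior art.
The present invention only uses the above statistical methods and related software to calculate the probability value P. The threshold is set at P < 0.05: features whose P value is below 0.05 can be considered to have a highly significant correlation with the risk of chronic kidney disease, so it is reasonable to select these features to build the model.
For ease of understanding, the t-test is taken as an example for further explanation. For the big data of clinical manifestations of kidney disease extracted in step S12, the t-test and the chi-square test are used to screen out the influencing factors related to chronic kidney disease. The t-test compares the mean value of each factor to determine whether the factor differs significantly between patients diagnosed with and without chronic kidney disease. Basic premise: the sample data follow a normal or approximately normal distribution; the test is used for quantitative data (numerical data such as white blood cells, red blood cells, and hemoglobin). The operation is as follows:
The quantitative data and the study data are input into a Python program that calls the scipy package. First, a normality test is performed on the quantitative data, and the result is confirmed by viewing the Q-Q plot: if the data points lie essentially along a straight line, the data can be considered normally distributed. The corresponding P value is then obtained through the t-test and compared with the significance level. If P < 0.05, the null hypothesis (H0: the factor does not differ significantly between patients with and without a diagnosis of chronic kidney disease) is rejected, and the difference in this factor between the presence and absence of chronic kidney disease is considered statistically significant. The factor is therefore regarded as one of the factors affecting the presence or absence of chronic kidney disease. In this way, the influencing factors related to chronic kidney disease are screened out from all the factors.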
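For reference, the two-sample t statistic behind the P value can be written out. The text does not say which variant of the t-test is used, so the pooled-variance (Student) form is shown here as an assumption; the symbols are introduced only for this illustration, with subscript 1 for the chronic kidney disease group and subscript 0 for the non-chronic kidney disease group, $\bar{x}$ the group mean, $s^2$ the group sample variance, and $n$ the group size.

```latex
t = \frac{\bar{x}_1 - \bar{x}_0}{s_p \sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_0}}},
\qquad
s_p^2 = \frac{(n_1 - 1)\,s_1^2 + (n_0 - 1)\,s_0^2}{n_1 + n_0 - 2}
```

Under H0, $t$ follows a t distribution with $n_1 + n_0 - 2$ degrees of freedom, and the factor is retained when the two-sided P value computed from it is below 0.05.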
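The principle and steps of the chi-square test are similar to the above, but it is applied to categorical data (gender, urine occult blood, etc.).
Step S15: splitting the feature data set.
Using the StratifiedShuffleSplit stratified splitting method of Python's sklearn package, the selected medical feature data set obtained in step S13 is divided into N parts, N > 2; N-1 of the parts are used as the model's training data, and the remaining part as the model's test data.
The StratifiedShuffleSplit stratified splitting method is prior art and is a functional module available in Python.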
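The idea of a stratified split, keeping the proportion of chronic kidney disease and non-chronic kidney disease cases the same in the training and test data, can be sketched in pure Python. This is a stand-in written for illustration, not the sklearn StratifiedShuffleSplit implementation that the text names; the function name `stratified_holdout`, the seed, and the toy labels are assumptions.

```python
import random
from collections import defaultdict

def stratified_holdout(labels, n_parts, seed=0):
    """Return (train_idx, test_idx); about 1/n_parts of each class is held out."""
    if n_parts <= 2:
        raise ValueError("the text requires N > 2")
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)
    rng = random.Random(seed)
    train_idx, test_idx = [], []
    for indices in by_class.values():
        rng.shuffle(indices)  # shuffle within each class before splitting
        n_test = max(1, round(len(indices) / n_parts))
        test_idx.extend(indices[:n_test])
        train_idx.extend(indices[n_test:])
    return sorted(train_idx), sorted(test_idx)

# Toy labels: 30 chronic kidney disease records and 70 others, N = 5.
labels = ["ckd"] * 30 + ["non_ckd"] * 70
train_idx, test_idx = stratified_holdout(labels, n_parts=5)
```

Here one fifth of each class goes to the test data, so both subsets keep the 30:70 class ratio of the full data set.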
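S16: training on the training data to obtain the chronic kidney disease risk screening model.
Using the sklearn package of the Python development language, the three algorithms BP neural network, XGBoost, and random forest are selected to build an ensemble learning classifier system;
The BP neural network includes neuron weights and biases;
The random forest is composed of multiple decision trees; each decision tree includes multiple nodes, and the nodes are medical features and thresholds;
XGBoost includes XGBoost decision trees and the relationship between the XGBoost decision trees; each XGBoost decision tree includes multiple nodes, and the nodes are medical features and thresholds; the relationship between the XGBoost decision trees is a gradient descent optimization algorithm, each subsequent decision tree being obtained from the previous decision tree according to the gradient descent optimization algorithm;
The training data are processed by the BP neural network algorithm, the XGBoost algorithm, and the random forest algorithm respectively, yielding a BP neural network prediction result set, an XGBoost prediction result set, and a random forest prediction result set;
The BP neural network prediction result set, the XGBoost prediction result set, and the random forest prediction result set are merged into a total prediction result set. The total prediction result set consists of prediction result values, each either "yes" or "no", where "yes" represents chronic kidney disease and "no" represents non-chronic kidney disease. A voting method is applied to the total prediction result set: the "yes" and "no" predictions are counted, and the value with the larger count wins, giving the chronic kidney disease prediction result;
The method further includes an iterative training step:
The chronic kidney disease prediction result is compared with the diagnosis result in the corresponding patient's standard diagnosis result set. Where the two do not match, the selected medical feature data of the mismatched samples are fed into the BP neural network for further training, and the neuron weights and biases of the BP neural network are adjusted until the prediction result matches the diagnosis result, yielding neuron weights and biases adapted to distinguishing chronic kidney disease;
At the same time, the selected medical feature data of the mismatched samples are fed into the random forest algorithm for further training, and the medical features and thresholds in the decision tree nodes are adjusted until the prediction result matches the diagnosis result, yielding decision tree node medical features and thresholds adapted to distinguishing chronic kidney disease;
At the same time, the selected medical feature data of the mismatched samples are fed into the XGBoost algorithm for further training, and the medical features and thresholds in the XGBoost decision tree nodes and the relationship between the XGBoost decision trees are adjusted until the prediction result matches the diagnosis result, yielding XGBoost decision tree node medical features and thresholds, and a relationship between the XGBoost decision trees, adapted to distinguishing chronic kidney disease;
A chronic kidney disease risk screening parameter set adapted to distinguishing chronic kidney disease is thereby obtained; it includes the adapted BP neural network neuron weights and biases, the medical features and thresholds in the random forest decision tree nodes, the medical features and thresholds in the XGBoost decision tree nodes, and the relationship between the XGBoost decision trees;
This parameter set, together with the BP neural network algorithm, the XGBoost algorithm, and the random forest algorithm, constitutes the chronic kidney disease risk screening model;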
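Step S17: testing the chronic kidney disease risk screening model.
The chronic kidney disease risk screening model is run on the test data obtained in step S15, and the accuracy, recall, and precision of the results are calculated. If the average of these three test indicators exceeds 0.95, the chronic kidney disease artificial intelligence screening model is effective; if the average does not reach 0.95, the process returns to step S16, the training data are used again to tune the algorithm parameters, the adapted chronic kidney disease risk screening parameter set is obtained anew, and the chronic kidney disease risk screening model is obtained again;
The accuracy is the ratio of the number of chronic kidney disease cases plus the number of non-chronic kidney disease cases that the model correctly predicts in the test data to the total number of test data;
The recall is the ratio of the number of chronic kidney disease cases that the model correctly predicts in the test data to the total number of cases diagnosed as chronic kidney disease in the test data;
The precision is the ratio of the number of chronic kidney disease cases that the model correctly predicts in the test data to the total number of cases that the model predicts as chronic kidney disease.
S18: establishing the effective chronic kidney disease risk screening model. A chronic kidney disease risk screening model obtained through steps S16 and S17 whose accuracy, precision, and recall all exceed 0.95 is determined to be the effective chronic kidney disease risk screening model, finally giving an effective chronic kidney disease model.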
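Further, step S2 organizes the data of the users to be screened: a hospital or physical examination center provides the data of the users to be screened, and the provided data are standardized to obtain standardized data that meet the data input standard of the chronic kidney disease risk screening model.
The data of the users to be screened are the medical feature data of those users obtained from examinations at a hospital or physical examination center.
Step S3: the standardized data of the users to be screened are input into the chronic kidney disease risk screening model for model calculation, finally giving the kidney disease risk prediction result. Further, the standardized data are input into the model by import, batch import, or manual entry.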
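The present invention also proposes a method for constructing an AI chronic kidney disease risk screening model, comprising the following steps:
A1: training on the training data to obtain the chronic kidney disease risk screening model.
Using the sklearn package of the Python development language, the three models BP neural network, XGBoost, and random forest are selected to build an ensemble learning classifier system. A chronic kidney disease risk screening parameter set adapted to distinguishing chronic kidney disease is established: the data are trained and iteratively trained in the three models, and the parameter set is tuned, finally yielding the adapted chronic kidney disease risk screening parameter set capable of distinguishing chronic kidney disease. This parameter set includes the adapted BP neural network neuron weights and biases, the medical features and thresholds in the random forest decision tree nodes, the medical features and thresholds in the XGBoost decision tree nodes, and the relationship between the XGBoost decision trees;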
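The BP neural network includes neuron weights and biases;
The random forest is composed of multiple decision trees; each decision tree includes multiple nodes, and the nodes are medical features and thresholds;
XGBoost includes XGBoost decision trees and the relationship between the XGBoost decision trees; each XGBoost decision tree includes multiple nodes, and the nodes are medical features and thresholds; the relationship between the XGBoost decision trees is a gradient descent optimization algorithm, each subsequent decision tree being obtained from the previous decision tree according to the gradient descent optimization algorithm;
The training data are processed by the BP neural network algorithm, the XGBoost algorithm, and the random forest algorithm respectively, yielding a BP neural network prediction result set, an XGBoost prediction result set, and a random forest prediction result set;
The BP neural network prediction result set, the XGBoost prediction result set, and the random forest prediction result set are merged into a total prediction result set. The total prediction result set consists of prediction result values, each either "yes" or "no", where "yes" represents chronic kidney disease and "no" represents non-chronic kidney disease. A voting method is applied to the total prediction result set: the "yes" and "no" predictions are counted, and the value with the larger count wins, giving the chronic kidney disease prediction result;
The method further includes an iterative training step:
The chronic kidney disease prediction result is compared with the diagnosis result in the corresponding patient's standard diagnosis result set. Where the two do not match, the selected medical feature data of the mismatched samples are fed into the BP neural network for further training, and the neuron weights and biases of the BP neural network are adjusted until the prediction result matches the diagnosis result, yielding neuron weights and biases adapted to distinguishing chronic kidney disease;
At the same time, the selected medical feature data of the mismatched samples are fed into the random forest algorithm for further training, and the medical features and thresholds in the decision tree nodes are adjusted until the prediction result matches the diagnosis result, yielding decision tree node medical features and thresholds adapted to distinguishing chronic kidney disease;
At the same time, the selected medical feature data of the mismatched samples are fed into the XGBoost algorithm for further training, and the medical features and thresholds in the XGBoost decision tree nodes and the relationship between the XGBoost decision trees are adjusted until the prediction result matches the diagnosis result, yielding XGBoost decision tree node medical features and thresholds, and a relationship between the XGBoost decision trees, adapted to distinguishing chronic kidney disease;
A chronic kidney disease risk screening parameter set adapted to distinguishing chronic kidney disease is thereby obtained; it includes the adapted BP neural network neuron weights and biases, the medical features and thresholds in the random forest decision tree nodes, the medical features and thresholds in the XGBoost decision tree nodes, and the relationship between the XGBoost decision trees;
This parameter set, together with the BP neural network algorithm, the XGBoost algorithm, and the random forest algorithm, constitutes the chronic kidney disease risk screening model;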
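A2: testing the chronic kidney disease risk screening model.
The chronic kidney disease risk screening model is run on the test data, and the accuracy, recall, and precision of the results are calculated. If the average of these three test indicators exceeds 0.95, the chronic kidney disease artificial intelligence screening model is effective; if the average does not reach 0.95, the process returns to step A1, the training data are used again to tune the algorithm parameters, the adapted chronic kidney disease risk screening parameter set is obtained anew, and the chronic kidney disease risk screening model is obtained again; the test data come from electronic medical records;
The accuracy is the ratio of the number of chronic kidney disease cases plus the number of non-chronic kidney disease cases that the model correctly predicts in the test data to the total number of test data;
The recall is the ratio of the number of chronic kidney disease cases that the model correctly predicts in the test data to the total number of cases diagnosed as chronic kidney disease in the test data;
The precision is the ratio of the number of chronic kidney disease cases that the model correctly predicts in the test data to the total number of cases that the model predicts as chronic kidney disease.
A3: establishing the effective chronic kidney disease risk screening model. A chronic kidney disease risk screening model obtained through steps A1 and A2 whose accuracy, precision, and recall all exceed 0.95 is determined to be the effective chronic kidney disease risk screening model.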
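Further, the present invention also proposes an AI chronic kidney disease risk screening system, including the effective chronic kidney disease risk screening model; the effective chronic kidney disease risk screening model includes the ensemble learning classifier system built from the three models BP neural network, XGBoost, and random forest, and the adapted chronic kidney disease risk screening parameter set capable of distinguishing chronic kidney disease.
The present invention trains the chronic kidney disease risk screening model with an ensemble of the machine learning algorithms BP neural network, XGBoost, and random forest, and can automatically screen out people at high risk of chronic kidney disease from data such as basic body measurements, symptoms, medical test and examination results, family history, past history, and lifestyle habits, with an accuracy as high as 0.96. The present invention constructs a machine learning model for chronic kidney disease risk screening. It can raise the general public's awareness of kidney disease risk and provide guidance for healthy living. The model trained with the machine learning ensemble algorithm has an accuracy as high as 96%; a cloud-based deployment scheme enables high-volume, efficient, and highly accurate screening, saving medical resources to a large extent.
Figure 1 is a diagram of the construction process and application of the effective chronic kidney disease risk screening model of the present invention.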
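Embodiment 1:
As shown in Figure 1, an AI chronic kidney disease risk screening method comprises the following steps:
Step S1: establishing an effective chronic kidney disease risk screening model;
Step S2: organizing the data of the users to be screened;
Step S3: feeding the data of the users to be screened into the chronic kidney disease risk screening model for model calculation, finally obtaining the kidney disease risk prediction result.
Establishing the effective chronic kidney disease risk screening model comprises the following steps:
Step S11: preparing medical record data. Patients' electronic medical records are collected from the hospital electronic medical record platform, gathering the records of patients diagnosed with chronic kidney disease and of patients without chronic kidney disease;
The electronic medical records of patients diagnosed with chronic kidney disease are collected as follows: the physician's diagnosis in each electronic medical record is compared with the disease names in the chronic kidney disease name database, to obtain the electronic medical records of chronic kidney disease patients;
The electronic medical records of non-chronic kidney disease patients are collected as follows: data of patients admitted to the internal medicine department and of persons undergoing physical examination during the same period are taken, excluding records with unclear medical history or incomplete test and examination data, and records of patients with concurrent acute disease, severe infection, or tumors; the medical record data include progress notes, examination and test results, physician orders, surgical records, nursing records, and the actual diagnosis results, and the examination and test results include medical features and thresholds;
Qualified electronic medical record data are obtained;
The chronic kidney disease name database contains the various medical disease names that can be judged to be chronic kidney disease.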
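Step S12: medical feature extraction. Chronic kidney disease medical features are extracted from the qualified electronic medical record data obtained in step S11, extracting the medical features and the medical feature values; the chronic kidney disease medical features include the categories basic information, past history, family history, subjective symptoms, blood tests, and urine tests.
The basic information table includes: gender, age, height, weight, blood pressure, pregnancy status, and occupation; 7 specific feature fields;
The past history table includes: diabetes, hypertension, smoking history, and drinking history; 4 specific feature fields;
The family history table includes: chronic kidney disease, diabetes, hypertension, renal cyst, and polycystic kidney; 5 specific feature fields;
The subjective symptoms table includes: convulsions, polyuria, nausea, fever, fatigue, joint pain, dry mouth, urinary urgency, painful urination, vomiting, rash, gross hematuria, upper respiratory tract infection, oliguria, loss of appetite, edema, headache, dizziness, anuria, foamy urine, chest tightness, dry eyes, low back pain, and eclampsia; 24 specific feature fields;
The blood test table includes: blood C-reactive protein, blood white blood cell count, hemoglobin, red blood cell count, blood glucose, platelets, blood hepatitis B e antibody, blood hepatitis B e antigen, blood hepatitis B surface antibody, blood hepatitis B surface antigen, blood hepatitis B core antibody, blood hepatitis C antibody, erythrocyte sedimentation rate, blood lactate dehydrogenase, blood albumin, blood aspartate aminotransferase, blood alanine aminotransferase, blood total protein, blood total bilirubin, blood total cholesterol, blood triglycerides, blood creatinine, blood uric acid, blood urea nitrogen, blood potassium, blood sodium, blood calcium, blood phosphorus, blood chloride, blood cystatin C, anti-neutrophil cytoplasmic antibody, complement C4, complement C3, complement C2, complement C1q, immunoglobulin A, immunoglobulin E, immunoglobulin G, and immunoglobulin M; 39 specific feature fields;
The urine test table includes: urine white blood cells, urine specific gravity, urine bilirubin, urine protein, urine red blood cells, urine creatinine, urine occult blood, urine ketone bodies, urine microalbumin, urinary casts, urine albumin, urine pH, urobilinogen, urine nitrite, urine glucose, microscopic hematuria, urine osmolality, urine sodium, 24-hour urine volume, and 24-hour urine protein quantification; 20 specific feature fields;
The medical feature values are the specific values of each medical feature in the basic information, past history, family history, subjective symptoms, blood test, and urine test categories.
The big data of clinical manifestations of kidney disease are obtained.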
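Step S13: feature data standardization and data cleaning. Feature data standardization is performed on the big data of clinical manifestations of kidney disease obtained in step S12, and data with missing values are removed, to obtain a standard data sample. The standard data sample includes a standard medical feature data set and a standard diagnosis result set, and the data in the two sets are in one-to-one correspondence. Step S13 includes the following two steps:
S131: feature data standardization.
A standard library and a chronic kidney disease specialist database are established. Image recognition software is used to recognize specialist books and literature on chronic kidney disease, and the results are stored in the chronic kidney disease specialist database; specialist e-books and electronic literature on chronic kidney disease are stored in the same database. Based on this database, a standard library of blood test items, urine test items, symptoms, and other medical entity terms is constructed manually. The standard library contains the standard name of each medical term and the similar names that have appeared for it, and each term is coded for unique identification, forming the feature standard library.
For the medical features and medical feature values extracted in step S12, different expressions of the same feature are replaced according to the feature standard library, unifying the feature descriptions to obtain standardized medical feature data.
Specifically, the replacement of medical feature values mainly covers symbols, letters, characters, units, and medical codes, which are thereby unified.
S132: data cleaning.
For the standardized medical feature data, data with missing values are removed. For quantitative data, erroneous data are eliminated using the three-standard-deviation rule; qualitative and ordinal data are quantified using a unified coding scheme, forming the standard medical feature data set.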
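Step S14: feature screening. The kidney disease-related features provided by nephrologists are combined with the kidney disease-related features screened out by applying statistical methods to the standard data samples, to summarize the epidemiological, examination and test, and symptom features used for the kidney disease screening task and obtain the selected medical feature data set.
The kidney disease-related features provided by the nephrologists are a table of kidney disease-related features, based on medical experience, that the nephrologists provide offline.
The statistical methods applied to the standard data samples to screen kidney disease-related features are the t-test and the chi-square test. The t-test and the chi-square test are commonly used statistical methods and belong to the prior art; Python computer programs that perform the t-test and the chi-square test are existing, commercially available programs and also belong to the prior art.
The present invention only uses the above statistical methods and related software to calculate the probability value P. The threshold is set at P < 0.05: features whose P value is below 0.05 can be considered to have a highly significant correlation with the risk of chronic kidney disease, so it is reasonable to select these features to build the model.
For ease of understanding, the t-test is taken as an example for further explanation. For the big data of clinical manifestations of kidney disease extracted in step S12, the t-test and the chi-square test are used to screen out the influencing factors related to chronic kidney disease. The t-test compares the mean value of each factor to determine whether the factor differs significantly between patients diagnosed with and without chronic kidney disease. Basic premise: the sample data follow a normal or approximately normal distribution; the test is used for quantitative data (numerical data such as white blood cells, red blood cells, and hemoglobin). The operation is as follows:
The quantitative data and the study data are input into a Python program that calls the scipy package. First, a normality test is performed on the quantitative data, and the result is confirmed by viewing the Q-Q plot: if the data points lie essentially along a straight line, the data can be considered normally distributed. The corresponding P value is then obtained through the t-test and compared with the significance level. If P < 0.05, the null hypothesis (H0: the factor does not differ significantly between patients with and without a diagnosis of chronic kidney disease) is rejected, and the difference in this factor between the presence and absence of chronic kidney disease is considered statistically significant. The factor is therefore regarded as one of the factors affecting the presence or absence of chronic kidney disease. In this way, the influencing factors related to chronic kidney disease are screened out from all the factors.
The principle and steps of the chi-square test are similar to the above, but it is applied to categorical data (gender, urine occult blood, etc.).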
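Step S15: splitting the feature data set.
Using the StratifiedShuffleSplit stratified splitting method of Python's sklearn package, the selected medical feature data set obtained in step S13 is divided into N parts, N > 2; N-1 of the parts are used as the model's training data, and the remaining part as the model's test data.
The StratifiedShuffleSplit stratified splitting method is prior art and is a functional module available in Python.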
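S16: training on the training data to obtain the chronic kidney disease risk screening model.
Using the sklearn package of the Python development language, the three algorithms BP neural network, XGBoost, and random forest are selected to build an ensemble learning classifier system;
The BP neural network includes neuron weights and biases;
The random forest is composed of multiple decision trees; each decision tree includes multiple nodes, and the nodes are medical features and thresholds;
XGBoost includes XGBoost decision trees and the relationship between the XGBoost decision trees; each XGBoost decision tree includes multiple nodes, and the nodes are medical features and thresholds; the relationship between the XGBoost decision trees is a gradient descent optimization algorithm, each subsequent decision tree being obtained from the previous decision tree according to the gradient descent optimization algorithm;
The training data are processed by the BP neural network algorithm, the XGBoost algorithm, and the random forest algorithm respectively, yielding a BP neural network prediction result set, an XGBoost prediction result set, and a random forest prediction result set;
The BP neural network prediction result set, the XGBoost prediction result set, and the random forest prediction result set are merged into a total prediction result set. The total prediction result set consists of prediction result values, each either "yes" or "no", where "yes" represents chronic kidney disease and "no" represents non-chronic kidney disease. A voting method is applied to the total prediction result set: the "yes" and "no" predictions are counted, and the value with the larger count wins, giving the chronic kidney disease prediction result;
The method further includes an iterative training step:
The chronic kidney disease prediction result is compared with the diagnosis result in the corresponding patient's standard diagnosis result set. Where the two do not match, the selected medical feature data of the mismatched samples are fed into the BP neural network for further training, and the neuron weights and biases of the BP neural network are adjusted until the prediction result matches the diagnosis result, yielding neuron weights and biases adapted to distinguishing chronic kidney disease;
At the same time, the selected medical feature data of the mismatched samples are fed into the random forest algorithm for further training, and the medical features and thresholds in the decision tree nodes are adjusted until the prediction result matches the diagnosis result, yielding decision tree node medical features and thresholds adapted to distinguishing chronic kidney disease;
At the same time, the selected medical feature data of the mismatched samples are fed into the XGBoost algorithm for further training, and the medical features and thresholds in the XGBoost decision tree nodes and the relationship between the XGBoost decision trees are adjusted until the prediction result matches the diagnosis result, yielding XGBoost decision tree node medical features and thresholds, and a relationship between the XGBoost decision trees, adapted to distinguishing chronic kidney disease;
A chronic kidney disease risk screening parameter set adapted to distinguishing chronic kidney disease is thereby obtained; it includes the adapted BP neural network neuron weights and biases, the medical features and thresholds in the random forest decision tree nodes, the medical features and thresholds in the XGBoost decision tree nodes, and the relationship between the XGBoost decision trees;
This parameter set, together with the BP neural network algorithm, the XGBoost algorithm, and the random forest algorithm, constitutes the chronic kidney disease risk screening model;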
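Step S17: testing the chronic kidney disease risk screening model.
The chronic kidney disease risk screening model is run on the test data obtained in step S15, and the accuracy, recall, and precision of the results are calculated. If the average of these three test indicators exceeds 0.95, the chronic kidney disease artificial intelligence screening model is effective; if the average does not reach 0.95, the process returns to step S16, the training data are used again to tune the algorithm parameters, the adapted chronic kidney disease risk screening parameter set is obtained anew, and the chronic kidney disease risk screening model is obtained again;
The accuracy is the ratio of the number of chronic kidney disease cases plus the number of non-chronic kidney disease cases that the model correctly predicts in the test data to the total number of test data;
The recall is the ratio of the number of chronic kidney disease cases that the model correctly predicts in the test data to the total number of cases diagnosed as chronic kidney disease in the test data;
The precision is the ratio of the number of chronic kidney disease cases that the model correctly predicts in the test data to the total number of cases that the model predicts as chronic kidney disease.
S18: establishing the effective chronic kidney disease risk screening model. A chronic kidney disease risk screening model obtained through steps S16 and S17 whose accuracy, precision, and recall all exceed 0.95 is determined to be the effective chronic kidney disease risk screening model, finally giving an effective chronic kidney disease model.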
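Further, step S2 organizes the data of the users to be screened: a hospital or physical examination center provides the data of the users to be screened, and the provided data are standardized to obtain standardized data that meet the data input standard of the chronic kidney disease risk screening model.
The data of the users to be screened are the medical feature data of those users obtained from examinations at a hospital or physical examination center.
Step S3: the standardized data of the users to be screened are input into the chronic kidney disease risk screening model for model calculation, finally giving the kidney disease risk prediction result. Further, the standardized data are input into the model by import, batch import, or manual entry.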
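Embodiment 2:
The present invention also proposes a method for constructing an AI chronic kidney disease risk screening model, comprising the following steps:
A1: training on the training data to obtain the chronic kidney disease risk screening model.
Using the sklearn package of the Python development language, the three models BP neural network, XGBoost, and random forest are selected to build an ensemble learning classifier system. A chronic kidney disease risk screening parameter set adapted to distinguishing chronic kidney disease is established: the data are trained and iteratively trained in the three models, and the parameter set is tuned, finally yielding the adapted chronic kidney disease risk screening parameter set capable of distinguishing chronic kidney disease. This parameter set includes the adapted BP neural network neuron weights and biases, the medical features and thresholds in the random forest decision tree nodes, the medical features and thresholds in the XGBoost decision tree nodes, and the relationship between the XGBoost decision trees;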
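The BP neural network includes neuron weights and biases;
The random forest is composed of multiple decision trees; each decision tree includes multiple nodes, and the nodes are medical features and thresholds;
XGBoost includes XGBoost decision trees and the relationship between the XGBoost decision trees; each XGBoost decision tree includes multiple nodes, and the nodes are medical features and thresholds; the relationship between the XGBoost decision trees is a gradient descent optimization algorithm, each subsequent decision tree being obtained from the previous decision tree according to the gradient descent optimization algorithm;
The training data are processed by the BP neural network algorithm, the XGBoost algorithm, and the random forest algorithm respectively, yielding a BP neural network prediction result set, an XGBoost prediction result set, and a random forest prediction result set;
The BP neural network prediction result set, the XGBoost prediction result set, and the random forest prediction result set are merged into a total prediction result set. The total prediction result set consists of prediction result values, each either "yes" or "no", where "yes" represents chronic kidney disease and "no" represents non-chronic kidney disease. A voting method is applied to the total prediction result set: the "yes" and "no" predictions are counted, and the value with the larger count wins, giving the chronic kidney disease prediction result;
The method further includes an iterative training step:
The chronic kidney disease prediction result is compared with the diagnosis result in the corresponding patient's standard diagnosis result set. Where the two do not match, the selected medical feature data of the mismatched samples are fed into the BP neural network for further training, and the neuron weights and biases of the BP neural network are adjusted until the prediction result matches the diagnosis result, yielding neuron weights and biases adapted to distinguishing chronic kidney disease;
At the same time, the selected medical feature data of the mismatched samples are fed into the random forest algorithm for further training, and the medical features and thresholds in the decision tree nodes are adjusted until the prediction result matches the diagnosis result, yielding decision tree node medical features and thresholds adapted to distinguishing chronic kidney disease;
At the same time, the selected medical feature data of the mismatched samples are fed into the XGBoost algorithm for further training, and the medical features and thresholds in the XGBoost decision tree nodes and the relationship between the XGBoost decision trees are adjusted until the prediction result matches the diagnosis result, yielding XGBoost decision tree node medical features and thresholds, and a relationship between the XGBoost decision trees, adapted to distinguishing chronic kidney disease;
A chronic kidney disease risk screening parameter set adapted to distinguishing chronic kidney disease is thereby obtained; it includes the adapted BP neural network neuron weights and biases, the medical features and thresholds in the random forest decision tree nodes, the medical features and thresholds in the XGBoost decision tree nodes, and the relationship between the XGBoost decision trees;
This parameter set, together with the BP neural network algorithm, the XGBoost algorithm, and the random forest algorithm, constitutes the chronic kidney disease risk screening model;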
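A2: testing the chronic kidney disease risk screening model.
The chronic kidney disease risk screening model is run on the test data, and the accuracy, recall, and precision of the results are calculated. If the average of these three test indicators exceeds 0.95, the chronic kidney disease artificial intelligence screening model is effective; if the average does not reach 0.95, the process returns to step A1, the training data are used again to tune the algorithm parameters, the adapted chronic kidney disease risk screening parameter set is obtained anew, and the chronic kidney disease risk screening model is obtained again; the test data come from electronic medical records;
The accuracy is the ratio of the number of chronic kidney disease cases plus the number of non-chronic kidney disease cases that the model correctly predicts in the test data to the total number of test data;
The recall is the ratio of the number of chronic kidney disease cases that the model correctly predicts in the test data to the total number of cases diagnosed as chronic kidney disease in the test data;
The precision is the ratio of the number of chronic kidney disease cases that the model correctly predicts in the test data to the total number of cases that the model predicts as chronic kidney disease.
A3: establishing the effective chronic kidney disease risk screening model. A chronic kidney disease risk screening model obtained through steps A1 and A2 whose accuracy, precision, and recall all exceed 0.95 is determined to be the effective chronic kidney disease risk screening model.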
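Embodiment 3:
Further, the present invention also proposes an AI chronic kidney disease risk screening system, including the effective chronic kidney disease risk screening model; the effective chronic kidney disease risk screening model includes the ensemble learning classifier system built from the three models BP neural network, XGBoost, and random forest, and the adapted chronic kidney disease risk screening parameter set capable of distinguishing chronic kidney disease.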
Claims (4)
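- An AI chronic kidney disease risk screening method, characterized by comprising the following steps: step S1, establishing a valid chronic kidney disease risk screening model; step S2, preparing the data of the user to be screened; step S3, feeding the data of the user to be screened into the chronic kidney disease risk screening model for model computation, finally obtaining a kidney disease risk prediction result; wherein establishing the valid chronic kidney disease risk screening model comprises the following steps:
  Step S11: preparing medical record data; collecting the electronic medical records of patients diagnosed with chronic kidney disease and of non-chronic kidney disease patients; the electronic medical records of patients diagnosed with chronic kidney disease are collected by comparing the physician's diagnosis in each electronic medical record against the disease names in a chronic kidney disease name database, yielding the electronic medical records of chronic kidney disease patients; the electronic medical records of non-chronic kidney disease patients are collected from the data of patients admitted to internal medicine and of people undergoing physical examination during the same period, excluding records with unclear medical history or incomplete laboratory and examination data, and the records of patients with concurrent acute disease, severe infection, or tumors; the medical record data comprise progress notes, examination and laboratory results, physician orders, surgical records, nursing records, and the actual diagnosis, the examination and laboratory results comprising medical features and thresholds; qualified electronic medical record data are thereby obtained; the chronic kidney disease name database contains the various medical disease names that can be judged to be chronic kidney disease;
  Step S12: medical feature extraction; chronic kidney disease medical features are extracted from the qualified electronic medical record data obtained in step S11; the chronic kidney disease medical features comprise basic information, past medical history, family history, subjective symptoms, blood test, and urine test feature data, yielding a large body of data on the clinical manifestations of kidney disease;
  Step S13: feature data standardization and data cleaning; the data on the clinical manifestations of kidney disease obtained in step S12 are standardized and records with missing values are removed, yielding standard data samples; the standard data samples comprise a standard medical feature data set and a standard diagnosis result set, whose entries are in one-to-one correspondence;
  Step S14: feature selection; combining the kidney-disease-related features provided by nephrology experts with the standard medical feature data set, the epidemiological, examination and laboratory, and symptom features to be used for the kidney disease screening task are summarized, yielding the selected medical feature data set;
  Step S15: splitting the feature data set; using the StratifiedShuffleSplit stratified splitting method of Python's sklearn package, the selected medical feature data set obtained in step S13 is divided into N parts, N>2; N-1 of these parts are taken as the training data of the model and the remaining part as the test data of the model;
  Step S16: training on the data to obtain the chronic kidney disease risk screening model; using the sklearn package of the Python programming language, an ensemble learning classifier system is built from three algorithms: a BP neural network, XGBoost, and a random forest; the BP neural network comprises neuron weights and biases; the random forest consists of multiple decision trees, each decision tree comprising multiple nodes, each node being a medical feature with a threshold; XGBoost comprises XGBoost decision trees and the relationship between them; each XGBoost decision tree comprises multiple nodes, each node being a medical feature with a threshold; the relationship between the XGBoost decision trees is a gradient descent optimization algorithm, in which each subsequent decision tree is derived from the preceding decision tree according to the gradient descent optimization algorithm;
  the training data are processed separately by the BP neural network algorithm, the XGBoost algorithm, and the random forest algorithm, yielding a BP neural network prediction result set, an XGBoost prediction result set, and a random forest prediction result set, which are merged into one overall prediction result set consisting of prediction values; each prediction value is either yes or no, where yes denotes chronic kidney disease and no denotes non-chronic kidney disease; the overall prediction result set is put to a vote: the yes and no values are counted, and whichever value is more numerous wins, yielding the chronic kidney disease prediction result;
  the method further comprises an iterative training step: the chronic kidney disease prediction result is compared with the diagnosis of the corresponding patient in the standard diagnosis result set; where the two disagree, the selected medical feature data corresponding to the mismatched predictions are fed back into the BP neural network for further training, adjusting its neuron weights and biases until the chronic kidney disease prediction results agree with the diagnoses in the standard diagnosis result set, thereby obtaining fitted neuron weights and biases capable of identifying chronic kidney disease; at the same time, the selected medical feature data corresponding to the mismatched predictions are fed back into the random forest algorithm for further training, adjusting the medical features and thresholds at the decision tree nodes until the prediction results agree with the diagnoses in the standard diagnosis result set, thereby obtaining fitted medical features and thresholds at the decision tree nodes capable of identifying chronic kidney disease; at the same time, the selected medical feature data corresponding to the mismatched predictions are fed back into the XGBoost algorithm for further training, adjusting the medical features and thresholds at the XGBoost decision tree nodes and the relationship between the XGBoost decision trees until the prediction results agree with the diagnoses in the standard diagnosis result set, thereby obtaining fitted medical features and thresholds at the XGBoost decision tree nodes, and a fitted relationship between the XGBoost decision trees, capable of identifying chronic kidney disease;
  this yields the fitted chronic kidney disease risk screening parameter set capable of identifying chronic kidney disease, which comprises the fitted neuron weights and biases of the BP neural network, the medical features and thresholds at the nodes of the random forest decision trees, the medical features and thresholds at the nodes of the XGBoost decision trees, and the relationship between the XGBoost decision trees; accordingly, the fitted chronic kidney disease risk screening parameter set, together with the BP neural network algorithm, the XGBoost algorithm, and the random forest algorithm, constitutes the chronic kidney disease risk screening model;
  Step S17: testing the chronic kidney disease risk screening model; the chronic kidney disease risk screening model is run on the test data obtained in step S15, and accuracy, recall, and precision are computed from the results; if the mean of these three test metrics exceeds 0.95, the chronic kidney disease artificial intelligence screening model is valid; if the mean does not reach 0.95, the process returns to step S16, where the training data are used again and the algorithm parameters are tuned to obtain a new fitted chronic kidney disease risk screening parameter set and hence a new chronic kidney disease risk screening model; accuracy is the ratio of the number of chronic kidney disease cases plus the number of non-chronic kidney disease cases that the model predicts correctly in the test data to the total number of test data; recall is the ratio of the number of chronic kidney disease cases that the model predicts correctly in the test data to the total number of test data whose diagnosis is chronic kidney disease; precision is the ratio of the number of chronic kidney disease cases that the model predicts correctly in the test data to the total number of cases that the model predicts as chronic kidney disease;
  Step S18: establishing the valid chronic kidney disease risk screening model; a chronic kidney disease risk screening model obtained through steps S16 and S17 whose accuracy, precision, and recall all exceed 0.95 is determined to be the valid chronic kidney disease risk screening model.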
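- The AI chronic kidney disease risk screening method according to claim 1, characterized in that step S13, feature data standardization and data cleaning, comprises the following two steps:
  S131, feature data standardization; a standard library and a chronic kidney disease specialist database are established; image recognition software is used to recognize chronic kidney disease specialist books and literature, which are stored in the chronic kidney disease specialist database, and chronic kidney disease specialist e-books and electronic literature are likewise stored in that database; based on the chronic kidney disease specialist database, a standard library of blood test items, urine test items, symptoms, and other medical entity terms is manually constructed; the standard library contains, for each medical term, its standard name and the similar names that have appeared, each coded for unique identification, forming the feature standard library; for the medical features and medical feature values extracted in step S12, different expressions of the same feature are replaced by reference to the feature standard library so that feature descriptions are unified, yielding standardized medical feature data; replacement of medical feature values means replacement of symbols, letters, text, units, and medical codes, so that symbols, letters, text, units, and medical codes are unified;
  S132, data cleaning; from the standardized medical feature data, records with missing values are removed; for quantitative data, erroneous data are eliminated by the three-standard-deviation method; qualitative and ordinal data are quantified by a unified coding method, forming the standard medical feature data set.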
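- A method of constructing an AI chronic kidney disease risk screening model, characterized by comprising:
  A1, training on the data to obtain the chronic kidney disease risk screening model; using the sklearn package of the Python programming language, an ensemble learning classifier system is built from three models: a BP neural network, XGBoost, and a random forest; to establish a fitted chronic kidney disease risk screening parameter set capable of identifying chronic kidney disease, the data are trained and iteratively trained in the three models and the parameter set is tuned, finally yielding the fitted chronic kidney disease risk screening parameter set, which comprises the fitted neuron weights and biases of the BP neural network, the medical features and thresholds at the nodes of the random forest decision trees, the medical features and thresholds at the nodes of the XGBoost decision trees, and the relationship between the XGBoost decision trees; the BP neural network comprises neuron weights and biases; the random forest consists of multiple decision trees, each decision tree comprising multiple nodes, each node being a medical feature with a threshold; XGBoost comprises XGBoost decision trees and the relationship between them; each XGBoost decision tree comprises multiple nodes, each node being a medical feature with a threshold; the relationship between the XGBoost decision trees is a gradient descent optimization algorithm, in which each subsequent decision tree is derived from the preceding decision tree according to the gradient descent optimization algorithm;
  the training data are processed separately by the BP neural network algorithm, the XGBoost algorithm, and the random forest algorithm, yielding a BP neural network prediction result set, an XGBoost prediction result set, and a random forest prediction result set, which are merged into one overall prediction result set consisting of prediction values; each prediction value is either yes or no, where yes denotes chronic kidney disease and no denotes non-chronic kidney disease; the overall prediction result set is put to a vote: the yes and no values are counted, and whichever value is more numerous wins, yielding the chronic kidney disease prediction result;
  the method further comprises an iterative training step: the chronic kidney disease prediction result is compared with the diagnosis of the corresponding patient in the standard diagnosis result set; where the two disagree, the selected medical feature data corresponding to the mismatched predictions are fed back into the BP neural network for further training, adjusting its neuron weights and biases until the chronic kidney disease prediction results agree with the diagnoses in the standard diagnosis result set, thereby obtaining fitted neuron weights and biases capable of identifying chronic kidney disease; at the same time, the selected medical feature data corresponding to the mismatched predictions are fed back into the random forest algorithm for further training, adjusting the medical features and thresholds at the decision tree nodes until the prediction results agree with the diagnoses in the standard diagnosis result set, thereby obtaining fitted medical features and thresholds at the decision tree nodes capable of identifying chronic kidney disease; at the same time, the selected medical feature data corresponding to the mismatched predictions are fed back into the XGBoost algorithm for further training, adjusting the medical features and thresholds at the XGBoost decision tree nodes and the relationship between the XGBoost decision trees until the prediction results agree with the diagnoses in the standard diagnosis result set, thereby obtaining fitted medical features and thresholds at the XGBoost decision tree nodes, and a fitted relationship between the XGBoost decision trees, capable of identifying chronic kidney disease;
  this yields the fitted chronic kidney disease risk screening parameter set capable of identifying chronic kidney disease, which comprises the fitted neuron weights and biases of the BP neural network, the medical features and thresholds at the nodes of the random forest decision trees, the medical features and thresholds at the nodes of the XGBoost decision trees, and the relationship between the XGBoost decision trees; accordingly, the fitted chronic kidney disease risk screening parameter set, together with the BP neural network algorithm, the XGBoost algorithm, and the random forest algorithm, constitutes the chronic kidney disease risk screening model;
  A2, testing the chronic kidney disease risk screening model; the chronic kidney disease risk screening model is run on the test data, and accuracy, recall, and precision are computed from the results; if the mean of these three test metrics exceeds 0.95, the chronic kidney disease artificial intelligence screening model is valid; if the mean does not reach 0.95, the process returns to step A1, where the training data are used again and the algorithm parameters are tuned to obtain a new fitted chronic kidney disease risk screening parameter set and hence a new chronic kidney disease risk screening model; accuracy is the ratio of the number of chronic kidney disease cases plus the number of non-chronic kidney disease cases that the model predicts correctly in the test data to the total number of test data; recall is the ratio of the number of chronic kidney disease cases that the model predicts correctly in the test data to the total number of test data whose diagnosis is chronic kidney disease; precision is the ratio of the number of chronic kidney disease cases that the model predicts correctly in the test data to the total number of cases that the model predicts as chronic kidney disease; the test data are drawn from electronic medical records;
  A3, establishing the valid chronic kidney disease risk screening model; a chronic kidney disease risk screening model obtained through steps A1 and A2 whose accuracy, precision, and recall all exceed 0.95 is determined to be the valid chronic kidney disease risk screening model.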
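- An AI chronic kidney disease risk screening system, characterized by comprising the AI chronic kidney disease risk screening method according to claim 1, wherein the valid chronic kidney disease risk screening model comprises the ensemble learning classifier system built from the BP neural network, XGBoost, and random forest models, together with the fitted chronic kidney disease risk screening parameter set capable of identifying chronic kidney disease.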
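The three-standard-deviation rule that the claims name for cleaning quantitative data can be sketched as follows. The claims do not say whether the population or the sample standard deviation is meant, nor whether the rule is applied once or repeatedly; this sketch assumes the population standard deviation and a single pass, and the function name and the sample values are hypothetical.

```python
from statistics import mean, pstdev


def remove_outliers_3sigma(values):
    """Drop values lying more than 3 standard deviations from the mean."""
    mu = mean(values)
    sigma = pstdev(values)  # assumption: population standard deviation
    if sigma == 0:
        return list(values)  # all values identical, nothing to drop
    return [v for v in values if abs(v - mu) <= 3 * sigma]


# Hypothetical measurements: twenty plausible readings and one entry error.
readings = [80.0] * 20 + [1000.0]
cleaned = remove_outliers_3sigma(readings)
```

Note that a single extreme value can only exceed three population standard deviations when the sample has more than ten points, so the rule is uninformative on very small samples.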
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010225048.8A CN111554401B (zh) | 2020-03-26 | 2020-03-26 | Ai慢性肾病风险筛查建模方法、慢性肾病风险筛查方法及系统 |
CN202010225048.8 | 2020-03-26 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2021190300A1 (zh) | 2021-09-30 |
Family
ID=72007254
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/CN2021/079849 WO2021190300A1 (zh) | 2020-03-26 | 2021-03-10 | Ai慢性肾病风险筛查建模方法、慢性肾病风险筛查方法及系统 |
Country Status (2)
Country | Link |
---|---|
CN (1) | CN111554401B (zh) |
WO (1) | WO2021190300A1 (zh) |
Families Citing this family (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111554401B (zh) * | 2020-03-26 | 2020-12-29 | 肾泰网健康科技(南京)有限公司 | Ai慢性肾病风险筛查建模方法、慢性肾病风险筛查方法及系统 |
CN112017771B (zh) * | 2020-08-31 | 2024-02-27 | 吾征智能技术(北京)有限公司 | 一种基于精液常规检查数据的疾病预测模型的构建方法及系统 |
CN112017785B (zh) * | 2020-11-02 | 2021-02-05 | 平安科技(深圳)有限公司 | 一种疾病风险预测系统、方法、装置、设备及介质 |
CN112652391A (zh) * | 2020-12-16 | 2021-04-13 | 浙江大学温州研究院 | 一种用于识别慢性阻塞性肺疾病急性加重的系统 |
CN113744869B (zh) * | 2021-09-07 | 2024-03-26 | 中国医科大学附属盛京医院 | 基于机器学习建立早期筛查轻链型淀粉样变性的方法及其应用 |
CN113643778B (zh) * | 2021-10-14 | 2022-01-21 | 山东大学齐鲁医院 | 基于电子病历资料的院内心脏骤停筛选方法及系统 |
Citations (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2011012665A1 (en) * | 2009-07-28 | 2011-02-03 | Universiteit Maastricht | In vitro method for predicting whether a compound is genotoxic in vivo |
CN108573753A (zh) * | 2018-04-26 | 2018-09-25 | 葛晓雪 | 一种融合Bagging的XGboost慢性肾病分期预测算法 |
CN109616168A (zh) * | 2018-12-14 | 2019-04-12 | 北京工业大学 | 一种基于电子病历的医疗领域智能管理模型构建方法 |
CN109741835A (zh) * | 2018-12-04 | 2019-05-10 | 平安科技(深圳)有限公司 | 基于大数据的慢性肾病监管方法、装置、设备及存储介质 |
CN109754878A (zh) * | 2018-11-30 | 2019-05-14 | 平安科技(深圳)有限公司 | 慢性肾病筛查方法、装置、设备及存储介质 |
CN110751548A (zh) * | 2019-09-04 | 2020-02-04 | 浪潮金融信息技术有限公司 | 一种应用于智慧银行的用户贷款风险预测方法 |
CN111554401A (zh) * | 2020-03-26 | 2020-08-18 | 肾泰网健康科技(南京)有限公司 | 一种构建ai慢性肾病筛查模型的方法、慢性肾病筛查方法及系统 |
- 2020-03-26: CN application CN202010225048.8A, patent CN111554401B (zh), status: Active
- 2021-03-10: WO application PCT/CN2021/079849, publication WO2021190300A1 (zh), status: Application Filing
Non-Patent Citations (1)
Title |
---|
HUANG, SHIXIN ET AL.: "Cognitive map study of type 2 diabetic nephropathy based on BP neural network model.", CHINESE JOURNAL OF ENDOCRINOLOGY AND METABOLISM, vol. 33, no. 11, 30 November 2017 (2017-11-30), pages 20, XP055853689 * |
Cited By (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN114496243A (zh) * | 2021-12-31 | 2022-05-13 | 东软集团股份有限公司 | 数据处理方法、装置、存储介质及电子设备 |
CN115565681A (zh) * | 2022-10-21 | 2023-01-03 | 电子科技大学(深圳)高等研究院 | 面向不平衡数据的IgA肾病的预测分析系统 |
CN116246752A (zh) * | 2023-03-27 | 2023-06-09 | 中国医学科学院肿瘤医院 | 一种全身麻醉术后恶心呕吐预测模型的生成和使用方法 |
CN116246752B (zh) * | 2023-03-27 | 2024-01-16 | 中国医学科学院肿瘤医院 | 一种全身麻醉术后恶心呕吐预测模型的生成和使用方法 |
CN117995413A (zh) * | 2024-02-01 | 2024-05-07 | 中国人民解放军陆军军医大学第二附属医院 | 基于血清Klotho的慢性肾脏病预测模型的构建方法及应用 |
CN118280566A (zh) * | 2024-06-04 | 2024-07-02 | 成都科瑞普医疗器械有限公司 | 一种用于非医疗诊断的肾脏功能评估方法 |
Also Published As
Publication number | Publication date |
---|---|
CN111554401A (zh) | 2020-08-18 |
CN111554401B (zh) | 2020-12-29 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
WO2021190300A1 (zh) | Ai慢性肾病风险筛查建模方法、慢性肾病风险筛查方法及系统 | |
CN110827993A (zh) | 基于集成学习的早期死亡风险评估模型建立方法及装置 | |
CN114023441A (zh) | 基于可解释机器学习模型的严重aki早期风险评估模型、装置及其开发方法 | |
CN110246577B (zh) | 一种基于人工智能辅助妊娠期糖尿病遗传风险预测的方法 | |
CN108511056A (zh) | 基于脑卒中患者相似性分析的治疗方案推荐方法及系统 | |
CN113128654B (zh) | 一种用于冠心病预诊断中的改进型随机森林模型及其预诊断系统 | |
US20100185573A1 (en) | Method and Apparatus for Diagnosing an Allergy of the Upper Respiratory Tract Using a Neural Network | |
CN112967803A (zh) | 基于集成模型的急诊患者早期死亡率预测方法及系统 | |
CN109872819A (zh) | 一种基于重症监护检测项的急性肾损伤发病概率预测系统 | |
CN113327679A (zh) | 一种肺栓塞临床风险及预后评分方法与系统 | |
CN114220540A (zh) | 一种糖尿病肾病风险预测模型的构建方法及应用 | |
CN113470816A (zh) | 一种基于机器学习的糖尿病肾病预测方法、系统和预测装置 | |
CN117116477A (zh) | 基于随机森林和XGBoost的前列腺癌患病风险预测模型的构建方法及系统 | |
CN107480419A (zh) | 胎儿出生缺陷智能化诊断系统 | |
CN115116612A (zh) | 一种儿童患者病情智能风险评估系统及方法 | |
CN114974585A (zh) | 一种妊娠期代谢综合征早期风险预测评估模型构建方法 | |
CN113571180A (zh) | 基于c肽分层及脏器功能的2型糖尿病人工智能诊疗管理系统 | |
CN116130105A (zh) | 一种基于神经网络的健康风险预测方法 | |
CN114550896A (zh) | 基于人工神经网络的头晕患者急诊预检分诊决策方法、装置及模型 | |
CN112927795A (zh) | 基于bagging算法的乳腺癌预测方法 | |
Desai | Early Detection and Prevention of Chronic Kidney Disease | |
CN117198532A (zh) | 一种基于机器学习的icu患者脓毒症风险预测方法及系统 | |
CN113972003A (zh) | 基于评分系统的糖尿病患病风险模型的构建方法 | |
TWI848789B (zh) | 建立用於預測罹患糖尿病腎病變風險之模型以及基於模型預測糖尿病腎病變的方法 | |
Alam | Identification of malignant mesothelioma risk factors through association rule mining |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
 | 121 | Ep: the EPO has been informed by WIPO that EP was designated in this application | Ref document number: 21776090; Country of ref document: EP; Kind code of ref document: A1 |
 | NENP | Non-entry into the national phase | Ref country code: DE |
 | 122 | Ep: PCT application non-entry in European phase | Ref document number: 21776090; Country of ref document: EP; Kind code of ref document: A1 |
 | 32PN | Ep: public notification in the EP bulletin as address of the addressee cannot be established | Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 25.04.2023) |