CN116913550A - Modeling method and application of PPI-related diabetes risk prediction model - Google Patents
- Publication number
- CN116913550A (application number CN202310746231.6A)
- Authority
- CN
- China
- Prior art keywords
- model
- risk
- diabetes
- ppi
- module
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16H—HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
- G16H70/00—ICT specially adapted for the handling or processing of medical references
- G16H70/40—ICT specially adapted for the handling or processing of medical references relating to drugs, e.g. their side effects or intended usage
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/27—Regression, e.g. linear or logistic regression
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16H—HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
- G16H10/00—ICT specially adapted for the handling or processing of patient-related medical or healthcare data
- G16H10/20—ICT specially adapted for the handling or processing of patient-related medical or healthcare data for electronic clinical trials or questionnaires
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16H—HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
- G16H50/00—ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
- G16H50/20—ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for computer-aided diagnosis, e.g. based on medical expert systems
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16H—HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
- G16H50/00—ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
- G16H50/70—ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for mining of medical data, e.g. analysing previous cases of other patients
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02A—TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
- Y02A90/00—Technologies having an indirect contribution to adaptation to climate change
- Y02A90/10—Information and communication technologies [ICT] supporting adaptation to climate change, e.g. for weather forecasting or climate simulation
Abstract
The application relates to the technical field of medical diagnosis and treatment, and in particular to a modeling method and application of a PPI-related diabetes risk prediction model. The prediction model comprises the following modules: (1) a data acquisition module; (2) a data processing module; (3) a model construction module; and (4) an additional risk and safety threshold calculation module. By screening commonly used predictors, the application selects 12 optimal predictors: age, sex, education level, income, BMI, abdominal obesity, smoking, parental history of diabetes, past history of dyslipidemia, cardiovascular and cerebrovascular disease, antihypertensive treatment for hypertension, and statin use. The application uses the prediction model to screen for the population at high risk of PPI-related diabetes and provides a basis for personalized PPI treatment.
Description
Technical Field
The application relates to the technical field of medical diagnosis and treatment, and in particular to a modeling method and application of a PPI-related diabetes risk prediction model.
Background
Proton pump inhibitors (PPIs) are a class of gastric acid secretion inhibitors commonly used in clinical practice. They are widely used to treat acute and chronic acid-related diseases of the digestive system, including gastroesophageal reflux disease, peptic ulcer and upper gastrointestinal hemorrhage, and to prevent and treat stress-related gastric mucosal lesions. Because PPIs have a fast onset of action, strong and durable acid inhibition, and good short-term safety and tolerability, they are very widely used clinically and are among the ten most widely used medicines worldwide. Despite their good short-term safety, a growing number of studies have confirmed that long-term use of PPIs carries many health risks, such as fractures, chronic kidney disease, and iron or vitamin B12 deficiency. The applicant's previous studies found that long-term use of proton pump inhibitors increases the risk of developing type 2 diabetes. Given the number of PPI users, the consequences would be very serious if long-term use of PPIs increases the risk of diabetes. However, PPIs have important clinical value: regular administration is very important for the treatment of gastroesophageal reflux, gastric ulcers, Helicobacter pylori infection and other diseases, and blindly stopping the medication can compromise the therapeutic effect. How to use PPIs rationally and precisely is therefore important for clinicians and patients, but no guidance tool for personalized medication is available at present.
At present, researchers in the field have carried out many studies on diabetes risk models. For example, patent application CN113903450A discloses a system for constructing a type 2 diabetes risk prediction model. The system determines the risk factors of type 2 diabetes by investigating the relationship between residents' disease status and their behavior and lifestyle, establishes a regression model, obtains the weight corresponding to each risk factor by mathematical conversion of the regression coefficients, and establishes a risk scoring system, thereby providing scientific and reasonable suggestions for the prevention of and intervention in type 2 diabetes. Patent application CN102063568A discloses a model for predicting diabetes at the individual level. The model uses the individual's age, sex, height (cm), weight (kg), waist circumference (cm), family history of diabetes, history of hypertension, history of hyperlipidemia and most recent fasting blood glucose value (mmol/L), calculates weighted values of the diabetes risk factors from effect sizes pooled by meta-analysis, and further calculates the risk of diabetes onset (Pn) of the target individual over the next n years; a Pn value of 5% or more indicates a population at risk of diabetes, and a value below 5% indicates the normal population. However, the above models consider only everyday factors and do not consider the influence of PPI use, and precise PPI treatment based on a diabetes risk model has not yet been reported. In view of the potential risks of PPIs, both the American Gastroenterological Association (AGA) and the Chinese guidelines for clinical application specify strict indications as well as controlled doses and durations of use; for example, the AGA recommends that most patients with an indication for chronic PPI use who take twice-daily dosing should be considered for step-down to once-daily PPI.
Other measures include regular screening or testing of bone density, creatinine, magnesium or vitamin B12 levels in long-term PPI users. However, such screening (e.g., for iron or vitamin B12 deficiency) has not been proven to be of benefit, and the question of how to weigh the benefits and risks of long-term PPI use has therefore not yet been resolved.
Constructing a prediction model of PPI-related diabetes can guide individualized, precise medication for patients at different risk levels, but related research and applications have not yet been reported. To address the above technical problems, the application builds a prediction model for the risk of PPI-related diabetes and, on that basis, establishes a risk stratification device for evaluating the risk that long-term PPI use is complicated by diabetes in different individuals. The device divides the clinical population of PPI users into a low-risk group and a high-risk group. For the population in which PPI use does not increase the absolute risk, PPIs can be used safely; for the high-risk population, it is suggested that the necessity of long-term PPI use be carefully assessed, that alternative treatment regimens be sought, and that periodic screening for abnormal blood glucose and type 2 diabetes mellitus (T2DM) be carried out.
Disclosure of Invention
To address the above technical problems, the application constructs a diabetes prediction model and, by calculating the additional risk that PPI use brings to populations with different diabetes risks, identifies the population at high risk of developing diabetes after long-term PPI use, thereby providing a basis for the rational use of PPIs. The specific technical scheme is as follows:
The primary object of the present application is to provide a marker for the risk of PPI-related diabetes mellitus, said marker comprising fasting glucose, body mass index, gamma-glutamyl transpeptidase, triglyceride, sex, age, uric acid, hemoglobin A1c, smoking, drinking, physical activity and family history.
A second object of the present application is to provide the use of said marker for predicting the risk of PPI-related type 2 diabetes.
The third object of the present application is to provide a method for constructing a predictive model of risk of PPI-related diabetes, comprising the steps of:
(1) Data acquisition module: using a questionnaire to acquire basic demographic characteristics of individuals, such as age and sex, together with information on lifestyle and health status;
(2) Data processing module: performing variable preprocessing on the acquired data, and screening the potential predictors for preliminary inclusion based on single-factor analysis;
(3) Model construction module:
S1, constructing an initial model: constructing an initial diabetes risk prediction model based on the potential predictors;
S2, optimizing the model: selecting important predictors for the initial model by LASSO regression, and determining the optimal lambda value by 10-fold cross-validation, so as to determine the variables that have a predictive effect on diabetes;
S3, obtaining the optimal model: incorporating the initially selected variables into a Cox regression model, and screening the variables that finally enter the model by stepwise regression;
S4, model verification: verifying the accuracy and the calibration of the model;
(4) Additional risk and safety threshold calculation module: calculating the additional risk incurred by populations at different risk levels after PPI use, and determining the high-risk population.
Preferably, the potential predictors preliminarily included in step (2) include: age, sex, education level, BMI (weight (kg) / height (m) squared), abdominal obesity, smoking, physical activity/exercise, daily vegetable/fruit intake, weekly intake of red meat/processed meat, parental history of diabetes, past history of dyslipidemia, cardiovascular and cerebrovascular disease, antihypertensive treatment for hypertension, and statin use.
Preferably, in the model verification of step (3), the accuracy of the model is evaluated by plotting the ROC curve and calculating the AUC value and Harrell's C statistic, and the calibration of the model is evaluated by a calibration curve.
Preferably, the variables that finally enter the model in step (3) include age, sex, education level, income, BMI, abdominal obesity, smoking, parental history of diabetes, past history of dyslipidemia, cardiovascular and cerebrovascular disease, antihypertensive treatment for hypertension, and statin use.
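The variable selection in step S2 relies on the shrinkage behaviour of the LASSO penalty: coefficients whose magnitude does not exceed lambda are set exactly to zero, and the corresponding predictors drop out. A minimal sketch of that mechanism is given below. It uses the closed-form soft-thresholding rule, which is the exact LASSO solution only for least squares with an orthonormal design; the application applies LASSO in a survival setting, so this is an illustration of the principle and not the fitting procedure of the application. The function names, coefficient values and lambda are hypothetical.

```python
def soft_threshold(coefficient, lam):
    """Shrink a coefficient toward zero by lam; return 0.0 if |coefficient| <= lam."""
    if coefficient > lam:
        return coefficient - lam
    if coefficient < -lam:
        return coefficient + lam
    return 0.0


def select_variables(coefficients, lam):
    """Return the predictor names whose shrunken coefficient is non-zero."""
    return [name for name, b in coefficients.items()
            if soft_threshold(b, lam) != 0.0]


# Hypothetical unpenalized coefficients, for illustration only.
example = {"age": 0.40, "physical_activity": 0.01, "bmi": -0.30}
kept = select_variables(example, lam=0.05)  # weak predictors are dropped
```

A larger lambda removes more predictors, which is why the value of lambda is chosen by cross-validation rather than fixed in advance.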
A fourth object of the present application is to provide a system for constructing a predictive model of PPI-related diabetes risk, which applies the construction method and comprises:
a data acquisition module, at least used for collecting data and obtaining a sample data set;
a data processing module, at least used for extracting from the sample data set the valid samples that can be used to construct an assessment model;
a model construction module, at least used for constructing a model from the incomplete data set of the valid samples, fitting a training set by LASSO regression and Cox stepwise regression, and recording the optimal model parameters;
an additional risk and safety threshold calculation module, used for calculating the additional risk and the safety threshold from the 10-year risk predicted by the model.
A fifth object of the present application is to provide a prediction system for predicting risk of PPI-related diabetes, comprising:
a pre-input module, at least used for inputting the data to be evaluated into the evaluation model;
the PPI-related diabetes risk prediction model constructed by the method, at least used for evaluating the data to be evaluated;
and a display module, at least used for displaying the prediction result.
It is a sixth object of the present application to provide a computer program product stored on a computer readable medium, comprising a computer readable program for providing a user input interface for applying said predictive system for predicting a risk of PPI-related diabetes when executed on an electronic device.
A seventh object of the present application is to provide a computer-readable storage medium storing instructions that, when executed on a computer, cause the computer to apply the predictive system for predicting PPI-related diabetes risk.
The beneficial effects of the application are as follows: 1) By screening commonly used predictors, the application selects 12 optimal predictors: age, sex, education level, income, BMI, abdominal obesity, smoking, parental history of diabetes, past history of dyslipidemia, cardiovascular and cerebrovascular disease, antihypertensive treatment for hypertension, and statin use. 2) The application uses the prediction model to screen for the population at high risk of PPI-related diabetes and provides a basis for personalized PPI treatment.
Drawings
FIG. 1 is a flow chart of model building
FIG. 2 is a nomogram of the diabetes prediction model
FIG. 3 shows the ROC curves for model validation
FIG. 4 is the 10-year calibration curve
Detailed Description
In order to make the present application better understood by those skilled in the art, the following description will clearly and completely describe the technical solutions in the embodiments of the present application with reference to the accompanying drawings, and it is apparent that the described embodiments are only some embodiments of the present application, not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the application without making any inventive effort, are intended to be within the scope of the application.
The terms "first," "second," "third," "fourth" and the like in the description and in the claims and in the above drawings, if any, are used for distinguishing between similar objects and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used may be interchanged where appropriate such that the embodiments of the application described herein may be implemented, for example, in sequences other than those illustrated or otherwise described herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
Ten-fold cross-validation (10-fold cross-validation) in the following examples is a common method for testing the accuracy of an algorithm. The data set is divided into ten parts; in turn, 9 parts are used as training data and 1 part as test data. Each run yields a corresponding accuracy (or error rate), and the average of the 10 results is taken as the estimate of the accuracy of the algorithm. In general, the 10-fold cross-validation is itself repeated several times (e.g., 10 repetitions of 10-fold cross-validation), and the mean of these averages is taken as the estimate. The data set is split into 10 parts because a large number of experiments on many data sets with different learning techniques have shown that 10 folds is an appropriate choice for obtaining the best error estimate, and there is also some theoretical support for this. This is not a settled conclusion, however, and debate remains; 5-fold or 20-fold cross-validation appears to give results comparable to 10-fold.
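The splitting and averaging procedure described above can be sketched as follows. This is a minimal illustration using only the Python standard library; the function names and the `evaluate` callback (which stands in for training a model on the training indices and scoring it on the test indices) are assumptions made for the example and are not part of the application.

```python
import random


def k_fold_indices(n_samples, k=10, seed=0):
    """Split range(n_samples) into k shuffled folds of near-equal size."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Fold i takes every k-th shuffled index starting at position i.
    return [indices[i::k] for i in range(k)]


def cross_validate(n_samples, evaluate, k=10, seed=0):
    """Average the score returned by evaluate(train_idx, test_idx) over k folds."""
    folds = k_fold_indices(n_samples, k, seed)
    scores = []
    for i, test_idx in enumerate(folds):
        # The other k-1 folds together form the training set.
        train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
        scores.append(evaluate(train_idx, test_idx))
    return sum(scores) / k
```

Repeating the procedure with different `seed` values and averaging the results corresponds to the repeated 10-fold cross-validation mentioned above.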
In the technical scheme of the application, BMI = weight (kg) / height (m) squared.
Embodiment I, a method for constructing a predictive model of PPI-related diabetes risk
Referring to FIG. 1, a method for constructing a predictive model of PPI-related diabetes risk includes the following steps:
(1) Data acquisition module: using a questionnaire to acquire basic demographic characteristics of individuals, such as age and sex, together with information on lifestyle and health status;
(2) Data processing module: performing variable preprocessing on the acquired data, and screening the potential predictors for preliminary inclusion based on single-factor analysis;
The potential predictors preliminarily included are: age, sex, education level, BMI (weight (kg) / height (m) squared), abdominal obesity, smoking, physical activity/exercise, daily vegetable/fruit intake, weekly intake of red meat/processed meat, parental history of diabetes, past history of dyslipidemia, cardiovascular and cerebrovascular disease, antihypertensive treatment for hypertension, and statin use.
(3) Model construction module:
S1, constructing an initial model: constructing an initial diabetes risk prediction model based on the potential predictors;
S2, optimizing the model: selecting important predictors for the initial model by LASSO regression, and determining the optimal lambda value by 10-fold cross-validation, so as to determine the variables that have a predictive effect on diabetes;
S3, obtaining the optimal model: incorporating the initially selected variables into a Cox regression model, and screening the variables that finally enter the model by stepwise regression;
The variables that finally enter the model are age, sex, education level, income, BMI, abdominal obesity, smoking, parental history of diabetes, past history of dyslipidemia, cardiovascular and cerebrovascular disease, antihypertensive treatment for hypertension, and statin use.
S4, model verification: verifying the accuracy and the calibration of the model;
The accuracy of the model is evaluated by plotting the ROC curve and calculating the AUC value and Harrell's C statistic, and the calibration of the model is evaluated by a calibration curve.
(4) Additional risk and safety threshold calculation module: calculating the additional risk incurred by populations at different risk levels after PPI use, and determining the high-risk population.
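Harrell's C statistic, used in step S4 to assess accuracy, is the proportion of comparable pairs of individuals in which the one who developed the event earlier was also assigned the higher predicted risk. The sketch below is a simplified pairwise implementation under stated assumptions: pairs with tied follow-up times are ignored, tied risk scores count as half concordant, and the function name and argument layout are hypothetical.

```python
def harrell_c(times, events, risk_scores):
    """Harrell's C: share of comparable pairs ranked correctly by risk score.

    times:       follow-up time of each subject
    events:      1 if the event was observed, 0 if censored
    risk_scores: higher score means higher predicted risk
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # each pair is anchored on a subject with an observed event
        for j in range(n):
            # Comparable only if subject i had the event before j's follow-up ended.
            if times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking.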
Embodiment II, a predictive model for risk of PPI-related diabetes
A system for constructing a predictive model of PPI-related diabetes risk, which applies the construction method and comprises:
a data acquisition module, at least used for collecting data and obtaining a sample data set;
a data processing module, at least used for extracting from the sample data set the valid sample features that can be used to construct an assessment model;
a model construction module, at least used for constructing a model from the incomplete data set of the valid samples, fitting a training set by LASSO regression and Cox stepwise regression, and recording the optimal model parameters;
an additional risk and safety threshold calculation module, used for calculating the additional risk and the safety threshold from the 10-year risk predicted by the model.
Using the prediction model, the risk of a patient developing diabetes over the next 10 years is calculated, together with the additional risk incurred after the individual uses PPIs; the high-risk population whose diabetes risk increases after PPI use is thereby identified, enabling individualized precise treatment.
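One way such a calculation could be carried out is sketched below. The application does not state its formula, so everything here is an assumption: the 10-year risk is taken from the standard Cox relation, risk = 1 - S0(t)^exp(lp), and PPI use is assumed to multiply the hazard by a constant hazard ratio. The baseline survival, linear predictor, hazard ratio and threshold are hypothetical inputs, not values from the application.

```python
import math


def ten_year_risk(baseline_survival, linear_predictor):
    """Cox-model absolute risk: 1 - S0(t) ** exp(lp)."""
    return 1.0 - baseline_survival ** math.exp(linear_predictor)


def additional_risk(risk_without_ppi, hazard_ratio):
    """Extra absolute risk if PPI use multiplies the hazard by hazard_ratio."""
    risk_with_ppi = 1.0 - (1.0 - risk_without_ppi) ** hazard_ratio
    return risk_with_ppi - risk_without_ppi


def is_high_risk(risk_without_ppi, hazard_ratio, threshold):
    """Flag an individual whose additional risk reaches the chosen threshold."""
    return additional_risk(risk_without_ppi, hazard_ratio) >= threshold
```

Under this assumption the same relative effect of PPIs translates into a larger absolute increase for individuals whose baseline risk is higher, which is what makes stratification by baseline risk informative.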
Embodiment III, application of predictive model of PPI-related diabetes risk
In the following we will take UKbiobank database as an example to illustrate the implementation of the application.
UK biobank is a large cohort of about 50 tens of thousands of people conducted in the UK region, and we will use this cohort population to illustrate our inventive technique.
1. After the exclusion of the baseline diabetic, cancerous, and potentially predictor information deficient population, a total 309468 human inclusion analysis, the baseline characteristics of the inclusion population are shown in table 1.
Table 1 baseline characteristics of the inclusion group
2. After an average follow-up of 12.6 years, a total of 9650 cases of type 2 diabetes develop. The difference between the 15 potential predictors was found between diabetic and non-diabetic patients, including: age, sex, education received, BMI (weight (KG)/height (meter square), abdominal obesity, smoking, physical activity/exercise, daily vegetable/fruit intake, weekly intake of red meat/processed meat, family history of parent diabetes, past history of dyslipidemia, cardiovascular and cerebrovascular diseases, whether to receive hypertension hypotensive treatment, statin use.
3. Variables were first screened using a LASSO regression model. At the lambda value that optimized the C-index, three variables (physical activity/exercise, daily vegetable/fruit intake, and weekly red/processed meat intake) were excluded. The remaining variables were entered into a Cox regression model and further screened by bidirectional stepwise regression. The final prediction model is shown in Table 2.
Table 2 Variables included in the final prediction model
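As a non-limiting illustration of choosing the LASSO penalty by k-fold cross-validation (the claims specify 10 folds), the following minimal sketch shows only the selection loop. The penalized Cox fit itself is abstracted behind `fit_and_score`, a hypothetical callback that is assumed to fit the model on the training rows with a given lambda and return a validation score, such as the C-index, where higher is better. The function names and the candidate lambda grid are assumptions.

```python
import random
from typing import Callable, List, Sequence


def k_fold_indices(n: int, k: int = 10, seed: int = 0) -> List[List[int]]:
    """Shuffle the row indices 0..n-1 and deal them into k near-equal folds."""
    if not 2 <= k <= n:
        raise ValueError("need 2 <= k <= n")
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def select_lambda(
    n: int,
    lambdas: Sequence[float],
    fit_and_score: Callable[[List[int], List[int], float], float],
    k: int = 10,
    seed: int = 0,
) -> float:
    """Return the candidate lambda with the highest mean validation score."""
    folds = k_fold_indices(n, k, seed)
    best_lambda, best_score = lambdas[0], float("-inf")
    for lam in lambdas:
        fold_scores = []
        for held_out, valid in enumerate(folds):
            # Train on every fold except the one held out for validation.
            train = [i for f, fold in enumerate(folds) if f != held_out for i in fold]
            fold_scores.append(fit_and_score(train, valid, lam))
        mean_score = sum(fold_scores) / k
        if mean_score > best_score:
            best_lambda, best_score = lam, mean_score
    return best_lambda
```

Variables whose coefficients are shrunk to zero at the selected lambda would then be dropped before the stepwise Cox regression.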
4. The variables included in the final model were age, sex, education level, income, BMI, abdominal obesity, smoking, parental history of diabetes, history of dyslipidemia, cardiovascular and cerebrovascular disease, receipt of antihypertensive treatment, and statin use. To facilitate clinical use of the model, a nomogram was further constructed, as shown in Fig. 2.
5. The predictive performance of the model was evaluated. The C-statistic of the model was 0.814. ROC curves at 3, 5, and 10 years are shown in Fig. 3, with AUC values of 0.823, 0.828, and 0.824, respectively, indicating that the model has good accuracy. The mean 10-year diabetes risk predicted by the model was 1.08%, while the observed 10-year diabetes risk was 2.17%, and the slope of the 10-year calibration curve was 0.966 (Fig. 4), indicating that the model is well calibrated.
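As a non-limiting illustration of the evaluation step, the following minimal sketch computes Harrell's C statistic from follow-up times, event indicators, and model risk scores, and estimates an observed risk at a time horizon with the Kaplan-Meier method for comparison against the mean predicted risk. This is a simplified, quadratic-time version that skips pairs with tied follow-up times. The use of Kaplan-Meier for the observed risk is an assumption, and an established survival analysis library would normally be used instead.

```python
from typing import Sequence


def harrell_c(times: Sequence[float], events: Sequence[int],
              risk_scores: Sequence[float]) -> float:
    """Share of usable pairs in which the subject with the earlier event
    has the higher risk score; ties in score count as 0.5."""
    concordant, usable = 0.0, 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a usable pair is anchored on a subject who had the event
        for j in range(n):
            if times[i] < times[j]:  # subject j was still event-free at times[i]
                usable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    if usable == 0:
        raise ValueError("no usable pairs")
    return concordant / usable


def km_event_risk(times: Sequence[float], events: Sequence[int],
                  horizon: float) -> float:
    """Observed risk at `horizon`: 1 minus the Kaplan-Meier survival estimate."""
    survival = 1.0
    at_risk = len(times)
    for t in sorted(set(times)):
        if t > horizon:
            break
        deaths = sum(1 for ti, ei in zip(times, events) if ti == t and ei)
        leaving = sum(1 for ti in times if ti == t)
        if deaths:
            survival *= 1.0 - deaths / at_risk
        at_risk -= leaving  # events and censorings at t leave the risk set
    return 1.0 - survival
```

A value of `harrell_c` near 0.5 indicates no discrimination and a value near 1 indicates near-perfect ranking. Comparing `km_event_risk(times, events, 10)` with the mean predicted 10-year risk gives a coarse check of calibration.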
The established prediction model was used to predict each individual's diabetes risk over the next 10 years. The population was divided into 10 equal groups by predicted risk, and the additional diabetes risk after PPI use was then calculated for each group. As shown in Table 3, for nearly half of the population, whose 10-year diabetes risk is low (<1%), the additional risk from PPI use is small (0.49%); whereas for about 10% of the population, whose 10-year diabetes risk is high (>5%), PPI use increases the diabetes risk by more than 3%. These results show that the prediction model constructed by this method can identify high-risk groups whose diabetes risk increases after PPI use, enabling individualized precision treatment.
Table 3 Additional risk of DM attributable to PPI use in populations at different risk levels
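As a non-limiting illustration of the stratification described above, the following minimal sketch sorts predicted 10-year risks, cuts them into 10 near-equal groups, and reports the mean additional risk per group. The conversion from baseline risk to additional risk is an assumption: it applies a proportional-hazards relation, S_PPI = S^HR, with a hypothetical hazard ratio for PPI use. The exact calculation behind Table 3 is not specified in the text, and the hazard ratio value is a placeholder.

```python
from statistics import mean
from typing import Dict, List, Sequence

# Hypothetical hazard ratio for PPI use, for illustration only.
ASSUMED_PPI_HAZARD_RATIO = 1.3


def additional_risk(baseline_risk: float,
                    hazard_ratio: float = ASSUMED_PPI_HAZARD_RATIO) -> float:
    """Extra absolute risk under proportional hazards, where S_ppi = S ** HR."""
    risk_with_ppi = 1.0 - (1.0 - baseline_risk) ** hazard_ratio
    return risk_with_ppi - baseline_risk


def decile_summary(risks: Sequence[float]) -> List[Dict[str, float]]:
    """Sort by predicted risk, cut into 10 near-equal groups, summarize each."""
    ordered = sorted(risks)
    n = len(ordered)
    rows = []
    for d in range(10):
        group = ordered[d * n // 10:(d + 1) * n // 10]
        if not group:
            continue  # fewer than 10 individuals leaves some groups empty
        rows.append({
            "decile": d + 1,
            "mean_risk": mean(group),
            "mean_additional_risk": mean(additional_risk(r) for r in group),
        })
    return rows
```

Because the same relative increase produces a larger absolute increase at higher baseline risk, the additional risk grows across the deciles, which is the pattern used to flag the high-risk group.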
This patent constructs a diabetes risk prediction model that can calculate an individual's diabetes risk over the next 10 years. More importantly, the model can be used to calculate the additional diabetes risk caused by PPI use and to identify high-risk groups. The patent can guide PPI prescribing and enable individualized precision treatment.
Compared with the prior art, the method constructs a diabetes prediction model to stratify individuals by their PPI-related diabetes risk over the next 10 years, screens out both low-risk groups for whom PPI use is reasonable and high-risk groups, and provides PPI use recommendations, thereby achieving personalized medication. The application has the following advantages: 1) treatment is tailored to individualized risk, which reduces the risk of disease recurrence caused by restricting the dose and duration of use; 2) the application can reduce additional costs compared with conventional blood glucose testing.
Apart from the prediction model constructed in this patent, individuals can use previously published diabetes prediction models to determine their future risk of diabetes and to roughly estimate the risk that PPI use may add. In addition to model-based stratification, individuals can weigh the benefits of PPI use against the potential diabetes risk according to whether they have common diabetes risk factors. If an individual has multiple risk factors for diabetes, such as obesity, family history, or hypertension, it is advisable to carefully assess the need for long-term PPI use, seek alternative treatment regimens, and screen periodically for abnormal blood glucose and T2DM. Individuals without diabetes risk factors may use PPIs safely.
Claims (10)
1. A marker of PPI-associated diabetes risk, said marker comprising fasting glucose, body mass index, gamma-glutamyl transpeptidase, triglycerides, sex, age, uric acid, hemoglobin A1c, smoking, alcohol consumption, physical activity, and family history.
2. The use of the marker of claim 1 for predicting risk of PPI-associated type 2 diabetes.
3. A method for constructing a predictive model of risk of PPI-associated diabetes, comprising the steps of:
(1) Data acquisition module: acquiring individuals' basic demographic characteristics (such as age and sex), lifestyle, and health-status information by questionnaire;
(2) Data processing module: preprocessing the variables in the acquired data, and screening the potential predictors for preliminary inclusion based on univariate analysis;
(3) Model construction module:
S1, constructing an initial model: constructing an initial diabetes risk prediction model based on the potential predictors;
S2, optimizing the model: selecting important predictors for the initial model using LASSO regression, and determining the optimal lambda value by 10-fold cross-validation so as to determine the variables with predictive value for diabetes;
S3, obtaining the optimal model: entering the initially selected variables into a Cox regression model, and screening the variables that finally enter the model by stepwise regression;
S4, model verification: verifying the accuracy and calibration of the model;
(4) Additional risk and safety threshold calculation module: calculating the additional risk that PPI use brings to populations at different risk levels, and determining the high-risk group.
4. The method of claim 3, wherein the potential predictors preliminarily included in step (2) comprise: age, sex, education level, income, BMI, abdominal obesity, smoking, physical activity/exercise, daily vegetable/fruit intake, weekly intake of red meat/processed meat, parental history of diabetes, history of dyslipidemia, cardiovascular and cerebrovascular disease, receipt of antihypertensive treatment, and statin use.
5. The method of claim 3, wherein in the model verification in step (3), the accuracy of the model is evaluated by plotting ROC curves and calculating AUC values and Harrell's C statistic, and the calibration of the model is evaluated by a calibration curve.
6. The method of claim 3, wherein the variables finally entering the model in step (3) include age, sex, education level, income, BMI, abdominal obesity, smoking, parental history of diabetes, history of dyslipidemia, cardiovascular and cerebrovascular disease, receipt of antihypertensive treatment, and statin use.
7. A system for constructing a predictive model of PPI-associated diabetes risk, applied to the construction method of any one of claims 3-6, comprising:
a data acquisition module, at least used for data acquisition and for obtaining a sample data set;
a data processing module, at least used for extracting from the sample data set valid sample features that can be used to construct an assessment model;
a model construction module, at least used for constructing a model from the incomplete data set of the valid samples, fitting the training set using LASSO regression and stepwise Cox regression, and recording the optimal model parameters;
an additional risk and safety threshold calculation module, used for calculating the additional risk and the safety thresholds of the model from the model's 10-year risk prediction value.
8. A predictive system for predicting risk of PPI-associated diabetes, comprising:
a pre-input module, at least used for inputting the data to be evaluated into the evaluation model;
a PPI-associated diabetes risk prediction model constructed by the method of any one of claims 3-6, at least used for evaluating the data to be evaluated;
a display module, at least used for displaying the prediction result.
9. A computer program product stored on a computer readable medium, comprising a computer readable program for providing a user input interface for applying the prediction system for predicting PPI-related diabetes risk of claim 7 when executed on an electronic device.
10. A computer readable storage medium storing instructions that when executed on a computer cause the computer to apply the predictive system for predicting risk of PPI-associated diabetes of claim 8.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310746231.6A CN116913550A (en) | 2023-06-25 | 2023-06-25 | Modeling method and application of PPI-related diabetes risk prediction model |
Publications (1)
Publication Number | Publication Date |
---|---|
CN116913550A (en) | 2023-10-20 |
Family
ID=88363879
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202310746231.6A Pending CN116913550A (en) | 2023-06-25 | 2023-06-25 | Modeling method and application of PPI-related diabetes risk prediction model |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN116913550A (en) |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||