CN117332203A - System and method for data exploration and analysis of a type 2 diabetes disease-specific cohort
- Publication number
- CN117332203A (application CN202311457263.0A)
- Authority
- CN
- China
- Prior art keywords
- analysis
- type
- diabetes
- data exploration
- queues
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/10—Complex mathematical operations
- G06F17/18—Complex mathematical operations for evaluating statistical data, e.g. average values, frequency distributions, probability functions, regression analysis
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/10—Complex mathematical operations
- G06F17/15—Correlation function computation including computation of convolution operations
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/213—Feature extraction, e.g. by transforming the feature space; Summarisation; Mappings, e.g. subspace methods
- G06F18/2132—Feature extraction, e.g. by transforming the feature space; Summarisation; Mappings, e.g. subspace methods based on discrimination criteria, e.g. discriminant analysis
- G06F18/21322—Rendering the within-class scatter matrix non-singular
- G06F18/21324—Rendering the within-class scatter matrix non-singular involving projections, e.g. Fisherface techniques
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/27—Regression, e.g. linear or logistic regression
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16H—HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
- G16H50/00—ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
- G16H50/30—ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for calculating health indices; for individual health risk assessment
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- Data Mining & Analysis (AREA)
- Theoretical Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Mathematical Physics (AREA)
- Pure & Applied Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- Computational Mathematics (AREA)
- Mathematical Analysis (AREA)
- Mathematical Optimization (AREA)
- Databases & Information Systems (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Life Sciences & Earth Sciences (AREA)
- Evolutionary Biology (AREA)
- Bioinformatics & Computational Biology (AREA)
- Evolutionary Computation (AREA)
- Health & Medical Sciences (AREA)
- Algebra (AREA)
- Artificial Intelligence (AREA)
- Software Systems (AREA)
- Public Health (AREA)
- Medical Informatics (AREA)
- Operations Research (AREA)
- Biomedical Technology (AREA)
- Computing Systems (AREA)
- Pathology (AREA)
- Probability & Statistics with Applications (AREA)
- Epidemiology (AREA)
- General Health & Medical Sciences (AREA)
- Primary Health Care (AREA)
- Investigating Or Analysing Biological Materials (AREA)
Abstract
The invention relates to a system and a method for data exploration and analysis of a type 2 diabetes disease-specific cohort. The analysis system is an interactive Web application built with Shiny. The data exploration and analysis method comprises the following steps: S1: acquiring an ADaM data set and a predefined file for type 2 diabetes patients; S2: dividing the type 2 diabetes population into different subgroups or cohorts; S3: selecting a suitable statistical method in the analysis system dashboard according to the data exploration target; S4: according to the statistical method, selecting the applicable population and the variables and confounding factors that enter the model; S5: outputting the result chart and adjusting its width and height. The invention helps users simplify the workflow of multivariate data analysis on a type 2 diabetes disease-specific cohort, presents results more intuitively, and provides a basis for promoting standardized treatment and management of diabetes in primary care.
Description
Technical Field
The invention relates to the technical field of data exploration and analysis, in particular to a system and a method for data exploration and analysis of a type 2 diabetes disease-specific cohort.
Background
Diabetes is a metabolic disease characterized by hyperglycemia, which is caused by defective insulin secretion, impaired insulin action, or both, and which leads to chronic damage to and dysfunction of various tissues, especially the eyes, kidneys, heart, blood vessels, and nerves.
At present, disease diagnosis, treatment, and management practices differ widely across regions and levels of medical institutions in China. Research on diabetes is mostly concentrated in tertiary hospitals, and high-quality studies focused on primary-care diagnosis, treatment, and management are lacking. Exploring the glucose-lowering efficacy and safety of different treatment regimens in patients at primary-level hospitals, together with the factors that influence treatment outcomes and costs for these patients, can provide a basis for promoting standardized treatment and effective management of diabetes in primary care.
Current studies generally follow the same analytical workflow: drawing up a study protocol and a statistical analysis plan; creating the corresponding analysis data sets for a set of research objectives; writing analysis code to generate a series of pre-designed TFLs (tables, figures, and listings, i.e. statistical analysis reports); and completing the study summary report.
This workflow has two problems: 1. When a study involves multiple subgroups and cohorts, or when the same population is analyzed against different study indicators, a large number of similar, identically formatted reports are generated, which makes the output lengthy, hard to read, and slow to mine for useful information; 2. When building a multi-factor model, confounding factors must be included for adjustment, their selection and combination must be tuned repeatedly, and each combination requires the code to be rewritten.
Disclosure of Invention
The invention addresses the situation in which, when exploring the influence of blood glucose management in type 2 diabetes patients on long-term prognostic outcomes, a study involves multiple subgroups, cohorts, and study indicators, so that the traditional TFL presentation becomes cumbersome and hard to read, and in which building a multi-factor model requires repeatedly tuning confounding factors and rewriting code. To this end, it provides a system and a method for data exploration and analysis of a type 2 diabetes disease-specific cohort.
To achieve the above purpose, the invention provides the following technical solution: a system and method for data exploration and analysis of a type 2 diabetes disease-specific cohort, in which a data exploration and analysis system is constructed by establishing a connection between the server (Server) and the user interface (UI) of a Shiny application. The system comprises interactive elements such as option boxes for variables and statistical methods and drop-down boxes for selecting type 2 diabetes subgroups or cohorts. When users interact with the application through the UI dashboard, the server logic runs R code according to the user's input and operations, responds, and updates the state of the application, so that data and charts can be manipulated and viewed in real time;
further, the variables set in the analysis system are specifically: sex, age, region of visit, length of hospital stay, medical insurance, insulin use (basal vs. premixed), diabetes medication regimen, whether in-hospital blood glucose biochemical indicators reach target, whether a hypoglycemic event occurs, whether the patient is readmitted within 90 days of discharge, change in fasting blood glucose, whether the last HbA1c within 90 days of discharge is < 7%, other disease categories (coronary heart disease, hypertension, stroke, chronic renal insufficiency), number of other diseases, service costs, treatment costs, and total costs;
further, the statistical methods set in the analysis system specifically include:
1) Hypothesis testing: a statistical inference that determines whether differences between samples, or between a sample and the population, are due to sampling error or to intrinsic differences. For continuous variables, the number of cases (N), mean (Mean), standard deviation (SD), minimum (Min), maximum (Max), median (Median), and interquartile range (IQR) are described; between-group comparisons of normally distributed data use the t test, and between-group comparisons of non-normally distributed data use the Mann-Whitney U test (Wilcoxon rank-sum test). For categorical variables, the frequency (N) and percentage (%) are described, and between-group comparisons use the chi-square test; when any expected frequency is less than 1, or 20% of the expected frequencies are less than 5, Fisher's exact test is used instead of the chi-square test. Finally, whether a between-group difference is statistically significant is determined from the P value. The results are displayed as a table containing the descriptive statistics and P values, with statistically significant P values shown in bold;
2) Logistic regression: corrects for the influence of confounding factors, which can bias or distort the prediction of the target variable. The confounding factors are adjusted for through logistic regression analysis, and the adjusted model coefficients are then analyzed. The results are displayed as a forest plot, as commonly used in scientific literature, showing the magnitude, direction, and confidence interval of each coefficient;
3) Correlation analysis: assesses the degree of association between two or more variables in a study, helping to understand the linear relationship between the variables and how they change together. The Pearson correlation test is used for continuous variables, specifically:

$$r = \frac{\sum (x - m_x)(y - m_y)}{\sqrt{\sum (x - m_x)^2 \, \sum (y - m_y)^2}} \qquad (1)$$

$$t = \frac{r}{\sqrt{1 - r^2}} \, \sqrt{n - 2} \qquad (2)$$

In equation 1, x and y are two vectors of length n, and $m_x$ and $m_y$ are the means of x and y, respectively. In equation 2, n is the number of observations (length) in the x and y variables; the corresponding P value is obtained from the t distribution table, and whether the correlation coefficient differs significantly from zero is finally determined from the P value;
the Spearman correlation test is used for non-continuous variables, specifically:

$$\rho = \frac{\sum (x' - m_{x'})(y' - m_{y'})}{\sqrt{\sum (x' - m_{x'})^2 \, \sum (y' - m_{y'})^2}} \qquad (3)$$

In equation 3, x' and y' are the ranks of x and y, and $m_{x'}$ and $m_{y'}$ are the means of those ranks;
the results of the correlation analysis are displayed as a heat map, in which the strength of the correlation coefficient is represented by different color codes;
4) Multiple linear regression and linear regression hypothesis testing:
a) The multiple linear regression model explores the relationship between several independent variables and one continuous dependent variable in a study and explains the variation of the dependent variable by estimating regression coefficients. The results are displayed as a forest plot, as commonly used in scientific literature, showing the magnitude, direction, and confidence interval of each coefficient;
b) The linear regression hypothesis test performs regression diagnostics and evaluates the goodness of fit of a regression model in a study. The results are displayed as residual plots, comprising a plot of residuals against fitted values, a residual Q-Q plot, a residuals-versus-leverage plot, and a scale-location plot.
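The quantities behind the regression diagnostic plots can be sketched as follows. The patent's system runs R code inside Shiny; this Python sketch is only a language-neutral illustration, it is restricted to a single predictor to stay within the standard library, and the function name and sample data are hypothetical.

```python
import math

def simple_ols_diagnostics(x, y):
    """Fit y = a + b*x by ordinary least squares and return diagnostic quantities."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    fitted = [intercept + slope * xi for xi in x]
    residuals = [yi - fi for yi, fi in zip(y, fitted)]
    # Leverage (hat value) of each observation in a one-predictor model.
    leverage = [1 / n + (xi - mean_x) ** 2 / sxx for xi in x]
    # Residual standard error with n - 2 degrees of freedom.
    sigma = math.sqrt(sum(r ** 2 for r in residuals) / (n - 2))
    # Standardized residuals feed the Q-Q, scale-location and leverage plots.
    standardized = [
        r / (sigma * math.sqrt(1 - h)) for r, h in zip(residuals, leverage)
    ]
    return {
        "fitted": fitted,
        "residuals": residuals,
        "leverage": leverage,
        "standardized": standardized,
    }

# Hypothetical data, for illustration only.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
diag = simple_ols_diagnostics(x, y)
```

Plotting `residuals` against `fitted` gives the residuals-versus-fitted view, and `standardized` against `leverage` gives the leverage view; a multiple regression would need the full hat matrix instead of the closed-form leverage used here.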
Further, the method by which a user, after entering the dashboard of the analysis system, performs data exploration on the type 2 diabetes disease-specific cohort comprises the following steps:
S1: acquiring an ADaM data set and a predefined file for type 2 diabetes patients;
S2: dividing the type 2 diabetes population into different subgroups or cohorts according to the characteristics of the ADaM data set and the data exploration targets required in the predefined file;
S3: selecting a suitable statistical method in the analysis system dashboard according to the data exploration target;
S4: selecting, in the analysis system dashboard and according to the statistical method, the type 2 diabetes cohort population to which the method applies and the outcome variables and confounding factors that enter the model;
S5: outputting the result chart and adjusting its width and height with the system's slider.
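As a rough illustration of how the choices made in steps S2 to S5 fit together, the sketch below collects them into one request and looks up the output form that the description assigns to each statistical method. It is written in Python rather than the R/Shiny of the actual system, and every identifier, default size, and example value is hypothetical.

```python
# Output form per statistical method, as stated in the description.
# The dictionary keys are hypothetical identifiers, not names from the patent.
OUTPUT_FORM = {
    "hypothesis_test": "table",
    "logistic_regression": "forest plot",
    "correlation": "heat map",
    "multiple_linear_regression": "forest plot",
    "regression_diagnostics": "residual plots",
}

def plan_analysis(method, cohort, outcome, confounders, width=800, height=600):
    """Collect the choices from steps S2-S5 into one request description."""
    if method not in OUTPUT_FORM:
        raise ValueError(f"unknown method: {method}")
    return {
        "method": method,                  # S3: chosen statistical method
        "cohort": cohort,                  # S2/S4: subgroup or cohort
        "outcome": outcome,                # S4: outcome variable
        "confounders": list(confounders),  # S4: confounding factors
        "output": OUTPUT_FORM[method],     # S5: form of the result chart
        "size": (width, height),           # S5: adjustable chart size
    }

request = plan_analysis(
    "logistic_regression", "basal insulin", "readmission_90d", ["age", "sex"]
)
```

Keeping the method-to-output mapping in one table means a new cohort or confounder combination only changes the request, not the analysis code, which is the rework the description aims to avoid.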
Further, in step S2, the data exploration targets for the type 2 diabetes disease-specific cohort population are specifically:
1) Exploring the current state of treatment of type 2 diabetes patients, reflected in the comparison of the glucose-lowering efficacy and safety of basal insulin and premixed insulin;
2) Exploring the effect of blood glucose management during hospitalization of type 2 diabetes patients on long-term prognostic outcomes, reflected in the effects of in-hospital blood glucose changes and treatment on chronic disease and readmission.
Further, the type 2 diabetes population is divided into different subgroups or cohorts, specifically:
according to the exploration target, different subgroups or cohorts are divided based on the department visited, the insulin used during inpatient treatment, and the interval into which the last fasting blood glucose result before discharge falls.
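The division into subgroups and cohorts can be sketched as a grouping on the three attributes named above. This is a Python illustration, not the system's R code; the record fields, the sample patients, and in particular the fasting blood glucose cut-offs are assumptions, since the text does not give the interval boundaries.

```python
from collections import defaultdict

# Hypothetical cut-offs (mmol/L) for the last fasting blood glucose before discharge.
FBG_BINS = [
    (0.0, 7.0, "<7.0"),
    (7.0, 10.0, "7.0-<10.0"),
    (10.0, float("inf"), ">=10.0"),
]

def fbg_interval(value):
    """Return the label of the interval that contains the given value."""
    for low, high, label in FBG_BINS:
        if low <= value < high:
            return label
    raise ValueError("fasting blood glucose must be a non-negative number")

def stratify(records):
    """Group patient ids by (department, insulin type, FBG interval)."""
    groups = defaultdict(list)
    for rec in records:
        key = (rec["department"], rec["insulin"], fbg_interval(rec["last_fbg"]))
        groups[key].append(rec["patient_id"])
    return dict(groups)

# Hypothetical records, for illustration only.
records = [
    {"patient_id": 1, "department": "endocrinology", "insulin": "basal", "last_fbg": 6.4},
    {"patient_id": 2, "department": "endocrinology", "insulin": "premixed", "last_fbg": 8.1},
    {"patient_id": 3, "department": "internal medicine", "insulin": "basal", "last_fbg": 6.9},
    {"patient_id": 4, "department": "endocrinology", "insulin": "basal", "last_fbg": 6.0},
]
cohorts = stratify(records)
```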
Further, in step S4, the model-building page of the analysis system is used to select the type 2 diabetes cohort population to which the method or model applies and the outcome variables and confounding factors that enter the model, specifically:
according to the method or model chosen in the statistical method option box in step S3, the type 2 diabetes cohort population to which that method or model applies is selected, together with the outcome variables and confounding factors that enter the model.
Further, after a result chart is output by a statistical method in the analysis system, the width and height of the chart are adjusted with the system's slider.
Compared with the prior art, the technical scheme of the application has the following beneficial effects:
With the system and method for data exploration and analysis of a type 2 diabetes disease-specific cohort, the analysis system supports combined selection of subgroups and cohorts (different cohorts can be selected within one subgroup) and free combination of the dependent variable and multiple independent variables in multiple regression analysis, so that the status of the cohort of interest can be obtained and read quickly. This reduces the repeated screening and visualization work across multiple subgroups and their corresponding variables when a user analyzes data with different statistical methods, allows results to be presented more flexibly and intuitively, and provides a basis for promoting standardized treatment and effective management of diabetes in primary care.
Drawings
FIG. 1 is a flow chart of the method of the present invention;
FIG. 2 is a schematic diagram of the study objectives and study population selection provided in the examples;
FIG. 3 is a schematic diagram of a hypothesis testing result table provided in the embodiment;
FIG. 4 is a schematic diagram of the logistic regression results forest plot provided in the embodiment;
FIG. 5 is a schematic diagram of the correlation analysis result provided in the embodiment;
FIG. 6 is a schematic diagram of the multiple regression results forest plot provided in the embodiment;
FIG. 7 is a schematic diagram of the residual plots of the linear regression hypothesis test provided in the embodiment.
Detailed Description
The drawings are for illustrative purposes only and are not to be construed as limiting the present patent.
All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
It will be appreciated by those skilled in the art that certain well-known structures in the drawings and descriptions thereof may be omitted.
The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings. The embodiment provides a system and a method for data exploration and analysis of a type 2 diabetes disease-specific cohort, in which an interactive Web application is built for analyzing the data of the type 2 diabetes disease-specific cohort. As shown in FIG. 1, the analysis method comprises the following steps:
S1: acquiring an ADaM data set and a predefined file for type 2 diabetes patients;
S2: dividing the type 2 diabetes population into different subgroups or cohorts according to the characteristics of the ADaM data set and the data exploration targets required in the predefined file;
S3: selecting a suitable statistical method in the analysis system dashboard according to the data exploration target;
S4: selecting, in the analysis system dashboard and according to the statistical method, the type 2 diabetes cohort population to which the method applies and the outcome variables and confounding factors that enter the model;
S5: outputting the result chart and adjusting its width and height with the system's slider;
in this embodiment, the ADaM data set and predefined study plan for type 2 diabetes patients in step S1 are intended for a retrospective analysis of medical record data from county-level hospitals in 8 different provinces; the operations for selecting study targets and study populations in the analysis system, corresponding to steps S2 and S3, are shown in FIG. 2.
The charts produced in the analysis system by the statistical methods of this embodiment, namely hypothesis testing, logistic regression, correlation analysis, multiple regression, and linear regression hypothesis testing, are shown in FIGS. 3 to 7, respectively.
Hypothesis testing is used to compare, among patients seen in different departments, the differences in glucose-lowering efficacy and safety under different insulin types; the indicators compared may include: the rate at which in-hospital blood glucose biochemical indicators reach target, the incidence of hypoglycemic events, and the incidence of readmission within 90 days.
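The indicators listed here are categorical, so the choice between the chi-square test and Fisher's exact test described earlier applies. The Python sketch below (the system itself uses R) computes expected frequencies for a contingency table and applies that rule. It reads the 20% criterion as "more than 20% of cells have an expected frequency below 5", the usual form of the rule; that reading, the function names, and the example counts are assumptions.

```python
def expected_frequencies(table):
    """Expected cell counts under independence for an r x c contingency table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    return [[r * c / total for c in col_totals] for r in row_totals]

def choose_categorical_test(table):
    """Pick the test name according to the expected-frequency rule."""
    expected = [e for row in expected_frequencies(table) for e in row]
    share_small = sum(1 for e in expected if e < 5) / len(expected)
    # Assumption: "20% of expected frequencies" is read as "more than 20%".
    if min(expected) < 1 or share_small > 0.20:
        return "fisher_exact"
    return "chi_square"

# Hypothetical counts: rows are insulin types, columns are event yes/no.
large_sample = [[30, 20], [25, 25]]
small_sample = [[3, 1], [2, 8]]
choice_large = choose_categorical_test(large_sample)
choice_small = choose_categorical_test(small_sample)
```

The sketch only selects the test; computing the P value itself would be delegated to a statistics library.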
Logistic regression is used to compare, among patients seen in different departments, the differences in glucose-lowering efficacy and safety under different insulin types; the variables involved may include: sex, age, whether blood glucose biochemical indicators reach target, whether a hypoglycemic event occurs, and whether the patient is readmitted within 90 days of discharge.
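A forest plot of logistic regression results needs a point estimate and a confidence interval per variable. The Python sketch below shows one common way to derive them from fitted coefficients: exponentiating each coefficient into an odds ratio with a Wald 95% interval. The description only says the plot shows the magnitude, direction, and confidence interval of the coefficients, so the odds-ratio scale, the Wald interval, and all numbers below are assumptions for illustration.

```python
import math

def odds_ratio_interval(coef, std_err, z=1.96):
    """Return (odds ratio, lower bound, upper bound) for one coefficient."""
    return (
        math.exp(coef),
        math.exp(coef - z * std_err),
        math.exp(coef + z * std_err),
    )

# Hypothetical adjusted coefficients (log-odds) and standard errors.
fitted = {
    "age": (0.03, 0.01),
    "premixed_insulin": (0.41, 0.18),
    "hypoglycemia": (-0.10, 0.25),
}
rows = {name: odds_ratio_interval(b, se) for name, (b, se) in fitted.items()}

# A variable is flagged when its interval excludes an odds ratio of 1.
significant = [name for name, (_, lo, hi) in rows.items() if lo > 1 or hi < 1]
```

Each entry of `rows` corresponds to one line of the forest plot: the odds ratio is the marker and the two bounds are the whiskers.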
Correlation analysis is used to analyze blood glucose changes and treatment effects in patients grouped by the interval of their last fasting blood glucose result before discharge; the indicators analyzed may include: change in fasting blood glucose, whether the last HbA1c within 90 days of discharge is < 7%, other disease categories (coronary heart disease, hypertension, stroke, chronic renal insufficiency), number of other diseases, occurrence of readmission within 90 days, service costs, treatment costs, total costs, age, sex, region of visit, length of hospital stay, medical insurance, insulin use (basal vs. premixed), and diabetes medication regimen.
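The Pearson coefficient, the Spearman coefficient (Pearson applied to ranks), and the t statistic used to look up the P value can be sketched in plain Python as follows. The system itself runs R; the function names and the sample vectors here are hypothetical.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = math.sqrt(sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y))
    return num / den

def ranks(values):
    """Average ranks (1-based); tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average
        i = j + 1
    return result

def spearman_rho(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson_r(ranks(x), ranks(y))

def t_statistic(r, n):
    """t value for testing r against zero, with n - 2 degrees of freedom."""
    return r / math.sqrt(1 - r * r) * math.sqrt(n - 2)

# Hypothetical data: y grows monotonically but not linearly with x.
x = [1, 2, 3, 4, 5]
y = [1, 4, 9, 16, 25]
r = pearson_r(x, y)
rho = spearman_rho(x, y)
t = t_statistic(r, len(x))
```

Computing these coefficients for every pair of selected variables yields the matrix that a heat map would color-code.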
Multiple regression and regression hypothesis testing are used to analyze blood glucose changes and treatment effects in patients grouped by the interval of their last fasting blood glucose result before discharge. The dependent variables may include: length of hospital stay, service costs, treatment costs, total costs, whether the last HbA1c within 90 days of discharge is < 7%, and occurrence of readmission within 90 days. The independent variables may include: change in fasting blood glucose, age, sex, region of visit, other disease categories (coronary heart disease, hypertension, stroke, chronic renal insufficiency), number of other diseases, insulin use (basal vs. premixed), and diabetes medication regimen.
The beneficial effects of the embodiment are as follows:
The analysis system supports combined selection of subgroups and cohorts (different cohorts can be selected within one subgroup) and free combination of the dependent variable and the independent variables in multiple regression analysis, so that the status of the cohort of interest can be obtained and read quickly. This reduces the repeated screening and visualization work on the variables corresponding to each subgroup when a user analyzes data with different statistical methods, allows results to be presented more flexibly and intuitively, and provides a basis for promoting standardized treatment and effective management of diabetes in primary care.
It is noted that relational terms such as first and second, and the like are used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Moreover, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising one … …" does not exclude the presence of other like elements in a process, method, article, or apparatus that comprises the element.
Although embodiments of the present invention have been shown and described, it will be understood by those skilled in the art that various changes, modifications, substitutions and alterations can be made therein without departing from the principles and spirit of the invention, the scope of which is defined in the appended claims and their equivalents.
Claims (7)
1. A system and method for data exploration and analysis of a type 2 diabetes disease-specific cohort, characterized in that a data exploration and analysis system comprising interactive elements such as option boxes for variables and statistical methods and drop-down boxes for selecting type 2 diabetes subgroups or cohorts is constructed by establishing a connection between the user interface and the server of a Shiny application; when a user interacts with the application through the UI dashboard, the server logic runs R code according to the user's input and operations, responds, and updates the state of the application, so that data and charts can be manipulated and viewed in real time;
the method for data exploration in the analysis system comprises the following steps:
S1: acquiring an ADaM data set and a predefined file for type 2 diabetes patients;
S2: dividing the type 2 diabetes population into different subgroups or cohorts according to the characteristics of the ADaM data set and the data exploration targets required in the predefined file;
S3: selecting a suitable statistical method in the analysis system dashboard according to the data exploration target;
S4: selecting, in the analysis system dashboard and according to the statistical method, the type 2 diabetes cohort population to which the method applies and the outcome variables and confounding factors that enter the model;
S5: outputting the result chart and adjusting its width and height with the system's slider.
2. The system and method for data exploration and analysis of a type 2 diabetes disease-specific cohort according to claim 1, wherein the variables provided in the analysis system are specifically:
sex, age, region of visit, length of hospital stay, medical insurance, insulin use, diabetes medication, whether the in-hospital blood glucose biochemical indices meet the target, whether a hypoglycemic event occurred, whether the patient was readmitted within 90 days of discharge, change in fasting blood glucose, whether the last HbA1c measured in hospital within 90 days of discharge was below 7%, types of other diseases, number of other diseases, service fees, treatment fees, and total cost.
3. The system and method for data exploration and analysis of a type 2 diabetes disease-specific cohort according to claim 1, wherein the statistical methods provided in the analysis system are specifically:
1) Hypothesis testing: a statistical inference that judges whether differences between samples in the study arise from sampling error or from intrinsic differences; the results are displayed as a table containing descriptive statistics and P values, with statistically significant P values shown in bold;
2) Logistic regression: adjusts for the influence of confounding factors, avoiding bias or error in the prediction of the target variable caused by those factors; the results are displayed as a forest plot, a form commonly used in the scientific literature, showing the magnitude, direction, and confidence interval of each coefficient;
3) Correlation analysis: evaluates the degree of association between two or more variables in the study, using the Pearson correlation test for continuous variables and the Spearman correlation test for non-continuous variables; the results are displayed as a heat map in which color coding represents the strength of the correlation coefficients;
4) Multiple linear regression and linear regression assumption testing:
a) The multiple linear regression model explores the relation between several independent variables and one continuous dependent variable in the study, explaining the variation of the dependent variable by estimating regression coefficients; the results are displayed as a forest plot, a form commonly used in the scientific literature, showing the magnitude, direction, and confidence interval of each coefficient;
b) The linear regression assumption test performs regression diagnostics to evaluate the goodness of fit of the regression model in the study; the results are displayed as residual plots, comprising a residuals-versus-fitted-values plot, a residual Q-Q plot, a residuals-versus-leverage plot, and a scale-location plot.
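The choice between the Pearson and Spearman tests in item 3) can be illustrated with a small sketch. It is written in Python using only the standard library, not the R code of the claimed system, and the function names are illustrative assumptions. It computes only the correlation coefficients, not the accompanying P values; Spearman's coefficient is obtained as the Pearson coefficient of the rank-transformed data, with tied values given their average rank.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    if ss_x == 0 or ss_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(ss_x * ss_y)


def average_ranks(values):
    """Ranks starting at 1; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value equals the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared_rank = (start + end) / 2 + 1
        for position in range(start, end + 1):
            ranks[order[position]] = shared_rank
        start = end + 1
    return ranks


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))
```

For a monotonic but nonlinear relation, such as `x = [1, 2, 3, 4, 5]` and `y = [1, 4, 9, 16, 25]`, `spearman(x, y)` equals 1 up to floating-point rounding while `pearson(x, y)` is below 1, which is why rank-based correlation suits ordinal or otherwise non-continuous variables.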
4. The system and method for data exploration and analysis of a type 2 diabetes disease-specific cohort according to claim 1, wherein the data exploration objectives for the type 2 diabetes disease-specific cohort in step S2 are specifically:
1) exploring the current treatment status of type 2 diabetes patients, as reflected in a comparison of the glucose-lowering efficacy and safety of basal insulin and premixed insulin;
2) exploring the effect of glycemic management during hospitalization on the long-term prognosis of type 2 diabetes patients, as reflected in the effects of in-hospital blood glucose changes and treatment on chronic disease and readmission.
5. The system and method for data exploration and analysis of a type 2 diabetes disease-specific cohort according to claim 1, wherein dividing the type 2 diabetes patient population into different subgroups or cohorts in step S2 is specifically: according to the exploration objective, dividing the subgroups or cohorts based on the department visited, insulin use during hospitalization, and the interval into which the last fasting blood glucose result before discharge falls.
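The subgroup division in this claim can be sketched as grouping patient records by a three-part key. The Python example below is a hedged illustration rather than the system's R implementation: the record field names, the insulin-use labels, and above all the fasting blood glucose cut points are assumptions, since the claim does not state any interval boundaries.

```python
from collections import defaultdict

# Hypothetical upper bounds (mmol/L) for the fasting blood glucose intervals.
# The claim does not state any cut points; these are placeholders only.
FBG_UPPER_BOUNDS = [(4.4, "<4.4"), (7.0, "4.4 to <7.0")]
FBG_TOP_LABEL = ">=7.0"


def fbg_interval(value):
    """Label the interval containing the last pre-discharge fasting glucose."""
    for upper, label in FBG_UPPER_BOUNDS:
        if value < upper:
            return label
    return FBG_TOP_LABEL


def split_into_subgroups(patients):
    """Group patient IDs by (department, insulin use, glucose interval)."""
    groups = defaultdict(list)
    for patient in patients:
        key = (
            patient["department"],
            patient["insulin_use"],
            fbg_interval(patient["last_fbg"]),
        )
        groups[key].append(patient["patient_id"])
    return dict(groups)
```

Keeping the cut points in one table would make it straightforward to re-divide the population when the exploration objective calls for different intervals.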
6. The system and method for data exploration and analysis of a type 2 diabetes disease-specific cohort according to claim 1, wherein selecting, on the model construction page of the analysis system in step S4, the population and variables to which the model applies is specifically:
according to the method or model provided in the statistical method option box in step S3, selecting the type 2 diabetes disease-specific patient population to which the method or model applies, and selecting the outcome variables and confounding factors to be included in the model.
7. The system and method for data exploration and analysis of a type 2 diabetes disease-specific cohort according to claim 1, further comprising step S5: after a result chart has been output using a statistical method in the analysis system, adjusting the width and height of the chart with the system's slider controls.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202311457263.0A CN117332203A (en) | 2023-11-03 | 2023-11-03 | System and method for carrying out data exploration and analysis on type 2 diabetes special disease queue |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202311457263.0A CN117332203A (en) | 2023-11-03 | 2023-11-03 | System and method for carrying out data exploration and analysis on type 2 diabetes special disease queue |
Publications (1)
Publication Number | Publication Date |
---|---|
CN117332203A (en) | 2024-01-02 |
Family
ID=89290415
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202311457263.0A Pending CN117332203A (en) | 2023-11-03 | 2023-11-03 | System and method for carrying out data exploration and analysis on type 2 diabetes special disease queue |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN117332203A (en) |
-
2023
- 2023-11-03 CN CN202311457263.0A patent/CN117332203A/en active Pending
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Powell et al. | Using routine comparative data to assess the quality of health care: understanding and avoiding common pitfalls | |
Bouch et al. | Severity scoring systems in the critically ill | |
US20170061102A1 (en) | Methods and systems for identifying or selecting high value patients | |
RUBENFELD et al. | Outcomes research in critical care: results of the American Thoracic Society critical care assembly workshop on outcomes research | |
Schilling et al. | A new self-report measure of self-management of type 1 diabetes for adolescents | |
US20030216628A1 (en) | Methods and systems for assessing glycemic control using predetermined pattern label analysis of blood glucose readings | |
Plebani | Quality and future of clinical laboratories: the Vico’s whole cyclical theory of the recurring cycles | |
US20090216556A1 (en) | Patient Monitoring | |
CN102770761A (en) | Tracking the probability for imminent hypoglycemia in diabetes from self-monitoring blood glucose (SMBG) data | |
CA2702042A1 (en) | Multi automated severity scoring | |
Van Allen et al. | A longitudinal examination of hope and optimism and their role in type 1 diabetes in youths | |
Bedini et al. | Performance evaluation of three blood glucose monitoring systems using ISO 15197: 2013 accuracy criteria, consensus and surveillance error grid analyses, and insulin dosing error modeling in a hospital setting | |
Pellathy et al. | Intensive care unit scoring systems | |
Simmons | How should blood glucose meter system analytical performance be assessed? | |
Obstfeld et al. | Data mining approaches to reference interval studies | |
Cadamuro et al. | Presentation and formatting of laboratory results: a narrative review on behalf of the European Federation of Clinical Chemistry and Laboratory Medicine (EFLM) Working Group “postanalytical phase”(WG-POST) | |
Inoue et al. | Low HbA1c levels and all-cause or cardiovascular mortality among people without diabetes: the US National Health and Nutrition Examination Survey 1999–2015 | |
Brankovic et al. | Explainable machine learning for real-time deterioration alert prediction to guide pre-emptive treatment | |
Kunadian et al. | Cumulative funnel plots for the early detection of interoperator variation: retrospective database analysis of observed versus predicted results of percutaneous coronary intervention | |
CN111406294A (en) | Automatically generating rules for laboratory instruments | |
Vázquez-Ingelmo et al. | Usability study of CARTIER-IA: a platform for medical data and imaging management | |
Johnson et al. | Mood trajectories following daily life events | |
Jeong et al. | Large-scale performance evaluation of Accu-Chek inform II point-of-care glucose meters | |
Zijlstra et al. | A comprehensive performance evaluation of five blood glucose systems in the hypo-, eu-, and hyperglycemic range | |
CN117332203A (en) | System and method for carrying out data exploration and analysis on type 2 diabetes special disease queue |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||