CN117095828A - Blood glucose value prediction and alarm method based on follow-up record of diabetics - Google Patents

Blood glucose value prediction and alarm method based on follow-up record of diabetics

Info

Publication number
CN117095828A
Authority
CN
China
Prior art keywords
blood glucose
data
model
follow
prediction
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202311029689.6A
Other languages
Chinese (zh)
Inventor
胡伟
张文红
陈斯蕾
陈雯
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nanjing University
Original Assignee
Nanjing University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nanjing University filed Critical Nanjing University
Priority to CN202311029689.6A priority Critical patent/CN117095828A/en
Publication of CN117095828A publication Critical patent/CN117095828A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G16INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16HHEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H50/00ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
    • G16H50/70ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for mining of medical data, e.g. analysing previous cases of other patients
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/10Pre-processing; Data cleansing
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/24Classification techniques
    • G06F18/241Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2415Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on parametric or probabilistic models, e.g. based on likelihood ratio or false acceptance rate versus a false rejection rate
    • G06F18/24155Bayesian classification
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/24Classification techniques
    • G06F18/243Classification techniques relating to the number of classes
    • G06F18/24323Tree-organised classifiers
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/25Fusion techniques
    • G06F18/254Fusion techniques of classification results, e.g. of results related to same input data
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/27Regression, e.g. linear or logistic regression
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N5/00Computing arrangements using knowledge-based models
    • G06N5/01Dynamic search techniques; Heuristics; Dynamic trees; Branch-and-bound
    • GPHYSICS
    • G16INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16HHEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H50/00ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
    • G16H50/30ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for calculating health indices; for individual health risk assessment
    • GPHYSICS
    • G16INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16HHEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H80/00ICT specially adapted for facilitating communication between medical practitioners or patients, e.g. for collaborative diagnosis, therapy or health monitoring
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02ATECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
    • Y02A90/00Technologies having an indirect contribution to adaptation to climate change
    • Y02A90/10Information and communication technologies [ICT] supporting adaptation to climate change, e.g. for weather forecasting or climate simulation

Landscapes

  • Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • Health & Medical Sciences (AREA)
  • General Physics & Mathematics (AREA)
  • Medical Informatics (AREA)
  • General Engineering & Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Public Health (AREA)
  • Evolutionary Biology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Pathology (AREA)
  • General Health & Medical Sciences (AREA)
  • Epidemiology (AREA)
  • Primary Health Care (AREA)
  • Biomedical Technology (AREA)
  • Databases & Information Systems (AREA)
  • Probability & Statistics with Applications (AREA)
  • Computational Linguistics (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Investigating Or Analysing Biological Materials (AREA)

Abstract

The application discloses a blood glucose value prediction and alarm method based on the follow-up records of diabetic patients. The method collects patients' follow-up records, extracts and cleans the raw blood glucose data, fills gaps by linear interpolation, resamples the data at a fixed time interval, fills the remaining gaps by quadratic interpolation, and finally smooths and denoises the data with an exponentially weighted moving average. It then tunes parameters automatically using the Bayesian information criterion, trains a differential autoregressive moving average (ARIMA) model, a gradient boosting decision tree (GBDT) model and a long short-term memory (LSTM) network model separately, and integrates the three models. When the integrated model predicts that a patient's future blood glucose level or its trend is abnormal, alarm information is generated to help medical staff monitor and manage the patient more closely. The application is convenient, low in cost and widely applicable.

Description

Blood glucose value prediction and alarm method based on follow-up record of diabetics
Technical Field
The application relates to the technical field of medical care and machine learning, in particular to a blood glucose value prediction and alarm method based on follow-up records of diabetics.
Background
Diabetes is a chronic metabolic disease that affects hundreds of millions of people. Patients not only suffer from its complications; their eating and living habits are also affected over the long term, which brings psychological distress and continually endangers their physical and mental health. At the current state of the art there is no complete cure for diabetes, so monitoring and controlling blood glucose levels is of paramount importance in the daily management of diabetic patients. However, most patients currently have to record their blood glucose levels manually and adjust insulin injections and diet according to their own experience and professional advice, an approach that is error-prone, inaccurate, and demanding of time and effort.
Digital medicine plays an increasingly important role in diabetes management, and periodic follow-up is an important part of it. Follow-up refers to regular communication between medical staff and diabetic patients, for example through online questionnaires, telephone consultations or offline physical examinations, in order to understand each patient's physical condition and treatment effect and to adjust the treatment as needed. In addition, digital medical technology helps medical staff collect and analyze patient data such as blood glucose level, insulin use, diet and exercise habits, so that they can better understand the patient's condition and treatment effect and adjust management accordingly.
In recent years, with the development of artificial intelligence and machine learning, blood glucose prediction for diabetic patients has attracted growing interest. Blood glucose prediction refers to monitoring a patient's physiological state with devices such as sensors and portable blood glucose meters, and predicting the blood glucose level over a future period with techniques such as machine learning, so that the level can be better regulated and managed. However, although prediction methods based on sensor-acquired data can monitor blood glucose in real time, they are costly and difficult to deploy widely.
Disclosure of Invention
Purpose of the invention: in view of the problems and shortcomings of the prior art, the application aims to provide a blood glucose value prediction and alarm method based on the follow-up records of diabetic patients. The method uses artificial intelligence and machine learning techniques to analyze and model a patient's follow-up records, predicts the blood glucose level and its trend over a future period, and raises an alarm to the patient and medical staff according to the prediction. Compared with the prior art, the application works alongside digital therapy and uses routine follow-up records, which are sparse but easy to obtain, as its data source, making it convenient, low in cost and widely applicable.
Technical scheme: to achieve the above purpose, the application adopts a blood glucose value prediction and alarm method based on the follow-up records of diabetic patients, comprising the following steps:
(1.1) collecting the patient's follow-up records, extracting first blood glucose value data from them and cleaning the data; then using a missing-data restoration and fine-adjustment algorithm for blood glucose value data to fill in the missing blood glucose values, so that second blood glucose value data with a complete time series is recovered from the incomplete follow-up data;
(1.2) dividing the second blood glucose value data into a training set and a test set by forward validation; tuning parameters automatically by grid search and training a differential autoregressive moving average (ARIMA) model; and training a gradient boosting decision tree (GBDT) model and a long short-term memory (LSTM) network model by supervised learning;
(1.3) based on the ARIMA model, the GBDT model and the LSTM network model obtained in step (1.2), providing a self-learning blood glucose value prediction integration method, which evaluates the test performance of the three models over the most recent period and adaptively selects the best way to integrate them according to their performance differences;
(1.4) predicting the blood glucose value of the diabetic patient over a future period with the integrated model obtained in step (1.3); and monitoring the patient's future blood glucose level and trend against a preset normal range and preset rules of change: if the blood glucose value exceeds a specified threshold or rises continuously, the situation is treated as abnormal, and corresponding alarm information is generated and sent to the doctor and the patient.
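The alarm rule in step (1.4) can be illustrated with a minimal Python sketch. The function name `check_alarms`, the alarm labels and the `rising_steps` parameter are hypothetical, and the range limits passed in are placeholders chosen for illustration; the application only states that a preset normal range and rules of change are used.

```python
def check_alarms(predicted, low, high, rising_steps=3):
    """Return alarm labels for a list of predicted blood glucose values (mmol/L).

    low/high: preset normal range (supplied by the caller; illustrative only).
    rising_steps: number of consecutive increases treated as a continuous rise
                  (hypothetical parameter, not specified in the source).
    """
    alarms = []

    # Rule 1: any predicted value outside the preset normal range.
    if any(v < low or v > high for v in predicted):
        alarms.append("out_of_range")

    # Rule 2: the predicted values rise for several consecutive steps.
    run = 0
    for prev, cur in zip(predicted, predicted[1:]):
        run = run + 1 if cur > prev else 0
        if run >= rising_steps:
            alarms.append("continuous_rise")
            break

    return alarms


if __name__ == "__main__":
    print(check_alarms([6.0, 6.2, 6.5, 6.9], low=3.9, high=7.0))
```

A non-empty result would then be turned into a notification for the doctor and the patient.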
Further, the step (1.1) includes the steps of:
(2.1) collecting the patient's follow-up records and extracting the original first blood glucose value data according to the blood glucose index to be predicted; then cleaning null values and abnormal values from the first blood glucose value data to ensure the accuracy and consistency of the data;
(2.2) processing the cleaned data obtained in step (2.1) with the missing-data restoration and fine-adjustment algorithm for blood glucose value data:
(2.2.1) first, using linear interpolation over the data points adjacent to each missing point to make an initial estimate of the missing blood glucose value and fill it in;
(2.2.2) resampling the interpolated data obtained in step (2.2.1) at a predetermined time interval, so that the spacing between data points is uniform; at the same time, analyzing the time-series pattern of the existing data and applying a second, quadratic filling step to obtain continuous, equally spaced blood glucose value data;
(2.2.3) fine-adjusting the continuous data obtained in step (2.2.2): to keep the data smooth over time and avoid abnormal fluctuations and discontinuities, the data is smoothed and denoised with an exponentially weighted moving average, giving the final second blood glucose value data.
Further, the step (1.2) includes the steps of:
(3.1) dividing the second blood glucose value data automatically, in a given proportion and in chronological order, into a training set and a test set by forward validation: taking the current time as the reference and using the acquisition times of the blood glucose data in the follow-up records, the earliest 80% of the data is used as the training set for model training and parameter tuning, and the latest 20% is used as the test set for evaluating the model's performance and predictive ability;
(3.2) based on the second blood glucose value data, using a relatively-optimal model identification method in which a grid search minimizes the Bayesian information criterion to determine automatically the three hyperparameters of the model, namely the autoregressive order, the differencing order and the moving-average (lag) order; then training the differential autoregressive moving average (ARIMA) model on the training set obtained in step (3.1);
(3.3) converting the training set and the test set into supervised samples with a sliding window method; then specifying the number of weak learners and the objective function of the decision trees to construct a regression model, fitting the residuals step by step with a greedy algorithm within the gradient boosting framework, and training the gradient boosting decision tree (GBDT) model on the supervised samples;
(3.4) based on the supervised samples obtained in step (3.3), adjusting the number of layers and the number of neurons of the long short-term memory (LSTM) network model according to actual requirements, and specifying an activation function and a loss function to train the model.
Further, the method for integrating the blood glucose level prediction for autonomous learning in the step (1.3) comprises the following steps:
(4.1) evaluating the trained differential autoregressive moving average (ARIMA) model, gradient boosting decision tree (GBDT) model and long short-term memory (LSTM) network model obtained in step (1.2) on the test set obtained in step (3.1), which covers the most recent period, using the mean absolute error (MAE) as the evaluation index to quantify each model's prediction error;
(4.2) determining the relative performance differences between the three models from their evaluation results on the test data obtained in step (4.1), and selecting the integration mode adaptively according to these differences: if the test performance of a model falls below a given threshold, that model is discarded and the other models are used in its place; if the three models perform similarly, their predictions are combined by a dynamic integration method in which each model's weight is proportional to the inverse of its test error, so that better-performing models receive higher weights;
(4.3) starting from the model weights obtained in step (4.2), adjusting the weight distribution dynamically through a continual learning process according to the latest prediction performance and changes in the data, so as to adapt flexibly to changing data and prediction requirements.
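The weighting rule in steps (4.1) to (4.3) can be sketched as follows. This is a minimal illustration under stated assumptions, not the application's implementation: the names `inverse_error_weights`, `ensemble_predict` and `drop_threshold` are hypothetical, "performance below a threshold" is interpreted as "MAE above a threshold", and every MAE is assumed to be strictly positive.

```python
def inverse_error_weights(maes, drop_threshold=None):
    """Map model name -> weight proportional to 1 / MAE.

    maes: dict of model name -> test-set MAE (assumed > 0).
    drop_threshold: models whose MAE exceeds this value are discarded
                    (hypothetical interpretation of the discard rule).
    """
    kept = {
        name: err
        for name, err in maes.items()
        if drop_threshold is None or err <= drop_threshold
    }
    inverse = {name: 1.0 / err for name, err in kept.items()}
    total = sum(inverse.values())
    return {name: value / total for name, value in inverse.items()}


def ensemble_predict(predictions, weights):
    """Weighted sum of the per-model predictions for one time point."""
    return sum(weights[name] * predictions[name] for name in weights)


if __name__ == "__main__":
    weights = inverse_error_weights(
        {"arima": 1.0, "gbdt": 0.5, "lstm": 4.0}, drop_threshold=2.0
    )
    print(weights)
    print(ensemble_predict({"arima": 6.0, "gbdt": 9.0, "lstm": 100.0}, weights))
```

Recomputing the weights each time a new test window becomes available is one simple way to realize the dynamic adjustment of step (4.3).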
Beneficial effects: (1) the application provides a missing-data restoration and fine-adjustment algorithm for blood glucose value data, which improves the quality and completeness of the data through processing and correction and thereby benefits blood glucose management and prediction for diabetic patients; (2) the application provides a self-learning blood glucose value prediction integration method, which assigns and adjusts model weights on its own according to the latest prediction performance so that multiple models work together, making the most of the strengths of the different models and improving the accuracy and stability of prediction; (3) the application uses the routine follow-up records and feedback of diabetic patients to predict blood glucose and raise alarms, so no additional implanted or wearable device is needed to acquire patient data, which keeps the cost low and deployment easy.
Drawings
FIG. 1 is an overall process flow diagram of the present application;
FIG. 2 is a graph showing the effect of the missing-data restoration and fine-adjustment algorithm on blood glucose value data.
Detailed Description
The application is further illustrated below with reference to the accompanying drawings and specific embodiments. It should be understood that these embodiments merely illustrate the application and do not limit its scope; after reading the application, modifications of equivalent form made by those skilled in the art fall within the scope defined by the appended claims.
The application provides a digital therapy for managing and controlling the blood glucose level of diabetic patients. It uses artificial intelligence and machine learning techniques to analyze and model a patient's routine historical follow-up records, predicts the blood glucose level and its trend over a future period, and raises an alarm to the patient and medical staff according to the prediction. With this method the blood glucose level of diabetic patients can be managed and controlled more comprehensively, efficiently and individually, improving the effect and quality of treatment.
The complete flow of the application is shown in FIG. 1 and comprises five main parts: collecting the follow-up data of a diabetic patient to obtain first blood glucose value data and performing preliminary cleaning and processing; processing the data with the missing-data restoration and fine-adjustment algorithm to fill in missing values and restore the completeness of the time series, giving second blood glucose value data; training three models and testing their performance; integrating the models with the self-learning blood glucose value prediction integration method according to their performance differences; and finally using the integrated model to predict the patient's future blood glucose and raise an alarm in abnormal situations.
Specific embodiments are described below:
1. Collecting follow-up data of diabetic patients to obtain first blood glucose value data, and performing preliminary cleaning and processing
When collecting the follow-up records of a diabetic patient, the type of blood glucose to be predicted is selected first, for example fasting blood glucose or two-hour postprandial blood glucose. The originally measured blood glucose values are then extracted together with their follow-up times to form the first blood glucose value data, and the times are converted into a unified date or timestamp format for subsequent processing and analysis.
When cleaning the first blood glucose value data, possible duplicate entries must be handled. For example, the blood glucose value at the same time point may be recorded more than once in the follow-up records. To ensure the accuracy and consistency of the data, these duplicates are identified and removed, and multiple measurements at the same time point are averaged to give a single blood glucose value for that time point.
To resolve the differences in blood glucose units caused by different follow-up methods, the data is standardized: all blood glucose values are converted to millimoles per liter (mmol/L) for subsequent analysis and comparison. In addition, null values and Chinese-language entries indicating that no measurement was taken are both represented by the string "nan", to ensure the integrity and reliability of the data.
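The cleaning steps described above (averaging duplicates, unifying units, marking missing entries) can be sketched in Python. This is a minimal sketch under assumptions: the record layout `(date, value, unit)`, the function name `clean_records`, the Chinese marker `未测` for "not measured" and the conversion factor of roughly 18 mg/dL per mmol/L are illustrative choices, not details given in the application.

```python
from collections import defaultdict
from statistics import mean

MGDL_PER_MMOLL = 18.0  # approximate glucose conversion factor (assumption)
MISSING_MARKERS = (None, "", "nan", "未测")  # "未测" is a hypothetical "not measured" entry


def clean_records(records):
    """Clean raw follow-up entries.

    records: iterable of (iso_date, value, unit) tuples, unit "mmol/L" or "mg/dL".
    Returns a chronologically sorted list of (iso_date, value_in_mmol_per_l),
    where dates without any valid measurement carry the string "nan".
    """
    grouped = defaultdict(list)
    for date, value, unit in records:
        grouped[date]  # register the date even if the entry is missing
        if value in MISSING_MARKERS:
            continue
        v = float(value)
        if unit == "mg/dL":
            v /= MGDL_PER_MMOLL  # unify units to mmol/L
        grouped[date].append(v)

    cleaned = []
    for date, values in sorted(grouped.items()):  # ISO dates sort chronologically
        # Duplicate measurements on the same date are averaged.
        cleaned.append((date, mean(values) if values else "nan"))
    return cleaned


if __name__ == "__main__":
    raw = [
        ("2021-01-01", 6.0, "mmol/L"),
        ("2021-01-01", 8.0, "mmol/L"),
        ("2021-01-16", 126.0, "mg/dL"),
        ("2021-02-01", "未测", "mmol/L"),
    ]
    print(clean_records(raw))
```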
2. Processing the data with the missing-data restoration and fine-adjustment algorithm for blood glucose value data to obtain second blood glucose value data
Because the application is based on the routine follow-up records of diabetic patients, missing data is common: as the bold "raw" line in FIG. 2 shows, the patient's follow-up records over the period from 2020 to 2023 contain many gaps. A missing-data restoration and fine-adjustment algorithm for blood glucose value data is therefore provided, which aims to recover the patient's blood glucose level and trend as fully as possible from the limited follow-up records.
First, the blood glucose value at each missing point is filled in initially by linear interpolation between the adjacent known data points, giving the "linear" line in FIG. 2. The linear interpolation formula is:
$$G(t) = G(t_1) + \frac{G(t_2) - G(t_1)}{t_2 - t_1}\,(t - t_1)$$
where $G(t)$ is the blood glucose value at the time point $t$ to be filled, and $G(t_1)$ and $G(t_2)$ are the blood glucose values at the known time points $t_1$ and $t_2$.
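As a small illustration of this step, the sketch below fills a missing value on the straight line between its two neighbouring known points. The function names and the use of `None` to mark a missing value are assumptions made for the example; time points are plain numbers (for instance day offsets).

```python
def linear_interpolate(t, t1, g1, t2, g2):
    """Value at time t on the straight line through (t1, g1) and (t2, g2)."""
    return g1 + (g2 - g1) * (t - t1) / (t2 - t1)


def fill_linear(times, values):
    """Fill None entries that lie between two known points.

    Leading or trailing gaps have no neighbour on one side and are left as None.
    """
    known = [i for i, v in enumerate(values) if v is not None]
    filled = list(values)
    for left, right in zip(known, known[1:]):
        for i in range(left + 1, right):
            filled[i] = linear_interpolate(
                times[i], times[left], values[left], times[right], values[right]
            )
    return filled


if __name__ == "__main__":
    print(fill_linear([0, 15, 30, 45], [6.0, None, 8.0, None]))
```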
To unify the temporal representation of the data, the algorithm then resamples the blood glucose value data at a fixed time interval, for example 15 days, to ensure continuity and consistency. Because the acquisition times of the original data may not coincide with the sampling grid, some grid points may have no data after resampling. To fill these points more finely, the algorithm fits the trend of the existing data with a quadratic polynomial and generates the corresponding fill values. In this way the missing points are supplemented and the blood glucose value data shows a more continuous and complete trend over time, yielding the corresponding trace in FIG. 2.
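The resampling and quadratic filling step might look like the following sketch. The application does not say how the quadratic polynomial is fitted, so this example makes an explicit assumption: each empty grid point is filled from the parabola through its three nearest known samples (Lagrange form). Time is expressed as integer day offsets, and `resample_quadratic` is a hypothetical name.

```python
def quadratic_through(points, t):
    """Evaluate at t the parabola passing through three (time, value) points."""
    (t0, g0), (t1, g1), (t2, g2) = points
    return (
        g0 * (t - t1) * (t - t2) / ((t0 - t1) * (t0 - t2))
        + g1 * (t - t0) * (t - t2) / ((t1 - t0) * (t1 - t2))
        + g2 * (t - t0) * (t - t1) / ((t2 - t0) * (t2 - t1))
    )


def resample_quadratic(samples, step=15):
    """Resample onto a regular grid of `step` days.

    samples: chronologically sorted list of (day, value) with at least three
             entries at distinct days.
    Grid points that coincide with a sample keep its value; all others are
    filled from the parabola through the three nearest samples (assumption).
    """
    known = dict(samples)
    start, end = samples[0][0], samples[-1][0]
    resampled = []
    for t in range(start, end + 1, step):
        if t in known:
            resampled.append((t, known[t]))
        else:
            nearest = sorted(samples, key=lambda p: abs(p[0] - t))[:3]
            resampled.append((t, quadratic_through(sorted(nearest), t)))
    return resampled


if __name__ == "__main__":
    print(resample_quadratic([(0, 0.0), (10, 1.0), (30, 9.0), (45, 20.25)]))
```

A global least-squares quadratic fit would be an equally plausible reading of the text; the local three-point version is used here only because it needs no linear algebra library.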
To keep the data smooth over time and reduce abnormal fluctuations and discontinuities, the algorithm smooths and denoises the data with an exponentially weighted moving average, giving the final second blood glucose value data, namely the "ewm" line in FIG. 2. The core idea of the exponentially weighted moving average is to average the data with exponentially decaying weights. Its calculation formula is:
$$E_t = \alpha \cdot G_t + (1 - \alpha) \cdot E_{t-1}$$
where $E_t$ is the weighted moving average at time point $t$, $G_t$ is the original observation at time point $t$, and $E_{t-1}$ is the weighted moving average at time point $t-1$. The smoothing factor $\alpha$ is specified as required, with $0 \le \alpha \le 1$. The weight of an observation decays exponentially as it moves further from the current time, so earlier observations carry less weight. Recent data therefore has a greater influence on the average, which captures the trend of the blood glucose value data better.
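The recursion can be written in a few lines of Python. In this sketch the first smoothed value is initialised to the first observation and the default `alpha=0.3` is an arbitrary illustrative choice; the application does not specify either.

```python
def ewma(values, alpha=0.3):
    """Exponentially weighted moving average: E_t = alpha*G_t + (1-alpha)*E_{t-1}."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    smoothed = [values[0]]  # E_0 is set to the first observation (assumption)
    for g in values[1:]:
        smoothed.append(alpha * g + (1.0 - alpha) * smoothed[-1])
    return smoothed


if __name__ == "__main__":
    print(ewma([6.0, 8.0, 8.0], alpha=0.5))
```

A larger `alpha` tracks recent measurements more closely, while a smaller one smooths more aggressively.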
3. Training three models on the processed second blood glucose value data and testing their performance
3.1 Differential autoregressive moving average (ARIMA) model
The differential autoregressive moving average model, i.e. the autoregressive integrated moving average (ARIMA) model, is a commonly used time series prediction model for modeling and predicting data with certain trends and seasonality. It combines an autoregressive model, a differencing operation and a moving average model. The ARIMA model comprises three parts:
Autoregressive model: based on the historical observations of the time series, the autoregressive model expresses the current value as a linear combination of several past values. The order of the autoregressive model is denoted p and is the number of past observations in the linear combination. A first-order autoregressive model is written AR(1), a second-order model AR(2), and so on.
Differencing: differencing is used to remove non-stationarity, converting a non-stationary series into a stationary one. The number of differences is denoted d. One difference (d=1) means taking the difference between adjacent observations of the original series once to form a new series. If the series is still not stationary after the first difference, the new series can be differenced a second time (d=2), a third time (d=3), and so on until a stationary series is obtained.
Moving average model: based on the residual terms of the time series, the moving average model expresses the current value as a linear combination of several past residual terms. The order of the moving average model is denoted q and is the number of residual terms in the linear combination. A first-order moving average model is written MA(1), a second-order model MA(2), and so on. The calculation formula of the overall ARIMA model is:
$$Y_t = c + \varphi_1 Y_{t-1} + \varphi_2 Y_{t-2} + \dots + \varphi_p Y_{t-p} + \theta_1 \varepsilon_{t-1} + \theta_2 \varepsilon_{t-2} + \dots + \theta_q \varepsilon_{t-q} + \varepsilon_t$$
where $Y_t$ is the observation at time $t$, $c$ is a constant, $\varphi_i$ ($1 \le i \le p$) are the coefficients of the autoregressive model, $\theta_j$ ($1 \le j \le q$) are the coefficients of the moving average model, and $\varepsilon_t$ is the residual term at time $t$.
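To make the three parts concrete, the sketch below applies d rounds of differencing and evaluates the formula above for one step ahead, with the unknown future residual taken at its expected value of zero. The coefficients in the usage example are made up for illustration; in practice they would be estimated from the training data.

```python
def difference(series, d=1):
    """Apply d rounds of first-order differencing between adjacent values."""
    for _ in range(d):
        series = [b - a for a, b in zip(series, series[1:])]
    return list(series)


def arma_one_step(c, phi, theta, history, residuals):
    """Expected next value of the (differenced) series.

    phi:   AR coefficients [phi_1, ..., phi_p]
    theta: MA coefficients [theta_1, ..., theta_q]
    history, residuals: past observations and residuals, most recent last.
    """
    ar = sum(phi_i * history[-i] for i, phi_i in enumerate(phi, start=1))
    ma = sum(theta_j * residuals[-j] for j, theta_j in enumerate(theta, start=1))
    return c + ar + ma


if __name__ == "__main__":
    print(difference([5.0, 6.0, 8.0, 11.0], d=1))
    print(arma_one_step(1.0, [0.5], [0.2], [5.0, 6.0], [0.1, 0.5]))
```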
To train the ARIMA model, the processed blood glucose data of the diabetic patient must first be analyzed to check the stationarity of the series and to determine the model's differencing order (d), autoregressive order (p) and moving-average order (q), which is usually an empirical process. To select relatively optimal orders, maximum ranges are set for p, d and q, with p and q between 0 and 5 and d between 0 and 1. All order combinations are traversed, an ARIMA model is built for each, and the Bayesian information criterion (BIC) of each model is calculated from the estimated model parameters and the processed blood glucose value data:
BIC=-2·ln(L)+k·ln(n)
where L is the maximized value of the model's likelihood function, k is the number of model parameters, and n is the number of blood glucose data points. A smaller BIC value therefore indicates a model that fits better with lower complexity. The p, d and q of the model with the smallest BIC value are selected as the optimal orders, giving the ARIMA model of optimal order.
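The order search can be sketched as a plain grid search. To keep the example self-contained, model fitting is abstracted behind a hypothetical callback `fit(p, d, q)` that returns the maximized log-likelihood ln(L); in practice this would wrap a time series library. The parameter count `k = p + q + 1` (AR and MA coefficients plus the constant) is also an assumption.

```python
import math
from itertools import product


def bic(log_likelihood, k, n):
    """BIC = -2*ln(L) + k*ln(n), with log_likelihood = ln(L)."""
    return -2.0 * log_likelihood + k * math.log(n)


def select_order(fit, n, max_p=5, max_d=1, max_q=5):
    """Return the (p, d, q) with the smallest BIC over the whole grid.

    fit: hypothetical callback, fit(p, d, q) -> maximized log-likelihood.
    n:   number of blood glucose data points.
    """
    best_score, best_order = None, None
    for p, d, q in product(range(max_p + 1), range(max_d + 1), range(max_q + 1)):
        k = p + q + 1  # assumed parameter count: AR + MA coefficients + constant
        score = bic(fit(p, d, q), k, n)
        if best_score is None or score < best_score:
            best_score, best_order = score, (p, d, q)
    return best_order


if __name__ == "__main__":
    # Toy stand-in for a real fit: the likelihood peaks at (2, 0, 1).
    toy_fit = lambda p, d, q: -50.0 * (abs(p - 2) + abs(q - 1) + d)
    print(select_order(toy_fit, n=100))
```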
Following the idea of forward validation, and taking the current time as the reference together with the acquisition times of the blood glucose data in the follow-up records, the earliest 80% of the processed second blood glucose value data is used as the training set for model training and parameter tuning, and the latest 20% is used as the test set for evaluating the model's performance and predictive ability. The ARIMA model of optimal order obtained in the previous step is trained, the trained model is used to predict the test set, and the mean absolute error (MAE) between the predicted and actual values is calculated as the model's performance index. The formula for MAE is:
$$\mathrm{MAE} = \frac{1}{n} \sum_{i=1}^{n} \left| Y'_i - Y_i \right|$$
where $n$ is the number of blood glucose data points in the test set, $Y'_i$ is the model's predicted value for the $i$-th blood glucose data point, and $Y_i$ is the true value of the $i$-th blood glucose data point in the test data. The performance of the ARIMA model on the blood glucose test set is evaluated from the calculated MAE: a smaller MAE indicates a smaller prediction error and a better model.
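The chronological 80/20 split and the MAE metric can be sketched as below. The function names are illustrative; the series is assumed to be sorted from oldest to newest, so no shuffling takes place and the test set always lies after the training set in time.

```python
def forward_split(series, train_ratio=0.8):
    """Split a chronologically ordered series into (earliest part, latest part)."""
    cut = int(len(series) * train_ratio)
    return series[:cut], series[cut:]


def mean_absolute_error(actual, predicted):
    """MAE = (1/n) * sum(|predicted_i - actual_i|)."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    return sum(abs(p - a) for a, p in zip(actual, predicted)) / len(actual)


if __name__ == "__main__":
    train, test = forward_split(list(range(10)))
    print(train, test)
    print(mean_absolute_error([7.0, 8.0], [6.5, 9.0]))
```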
3.2 Gradient boosting decision tree (GBDT) model
Because the gradient boosting decision tree is a supervised learning algorithm, it needs features and corresponding target variables as input for training. A sliding window method is therefore used to convert the patient's blood glucose time series into supervised samples: with a custom window size of 8, the past 8 blood glucose values in the window are taken as the features and the blood glucose value at the current time point as the target variable, and the two are combined into one sample. After the whole time series has been processed, the supervised samples are divided into a training set and a test set in a ratio of 8:2 by forward validation.
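The sliding-window conversion can be sketched as a single list comprehension. The function name `make_supervised` is illustrative; the default window of 8 follows the text above.

```python
def make_supervised(series, window=8):
    """Turn a time series into (features, target) samples.

    Each sample uses the `window` values preceding time i as features and the
    value at time i as the target, so a series of length n gives n - window
    samples (none if the series is not longer than the window).
    """
    return [
        (series[i - window:i], series[i])
        for i in range(window, len(series))
    ]


if __name__ == "__main__":
    samples = make_supervised([float(v) for v in range(10)], window=8)
    for features, target in samples:
        print(features, "->", target)
```

Because the samples stay in chronological order, the same 8:2 forward split can be applied to them directly.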
The gradient boosting decision tree model is an iteratively boosted ensemble of decision trees whose predictive ability improves step by step as weak learners (decision trees) are added. Its core idea is to optimize the model's loss function by gradient descent: the residual between the current model's prediction and the actual value becomes the training target of the next decision tree. The main steps for training the gradient boosting decision tree model are as follows:
(1) Initializing the model: the initial blood glucose prediction is set to a constant (e.g., the mean of the training set), the number of weak-learner decision trees is set to 1000, and iterative training begins.
(2) Constructing a decision tree: the residual between the current model's predicted value and the actual blood glucose value is calculated and used as the training target of the next decision tree. Each decision tree is fitted to minimize the squared loss function (MSE), given by:
$$\mathrm{MSE} = \frac{1}{n} \sum_{i=1}^{n} \left( Y_i - Y'_i \right)^2$$
where n is the number of training samples, Y_i is the actual value of sample i and Y'_i is its predicted value. To determine the tree structure, the decision tree is built with a greedy algorithm that uses the objective function value as the evaluation function and traverses all feature split points to select the optimal split point and split rule.
(3) Updating the model: the newly constructed decision tree is added to the previous model, updating the model's blood glucose prediction. In iteration t, the model's blood glucose prediction for sample i is the sum of the prediction of the model from the previous t-1 rounds and the prediction of the current round's tree:
$$Y'^{(t)}_i = Y'^{(t-1)}_i + f_t(x_i)$$
where x_i is the feature vector of sample i and f_t is the decision tree built in round t.
(4) Iterative training: steps (2) and (3) are repeated. Because the number of weak learners equals the number of iteration rounds, training is complete after 1000 rounds.
The final trained gradient boosting decision tree model combines 1000 weak-learner decision trees. As with the previous model, it is used to predict the test set, and the mean absolute error (MAE) between the predicted and actual values is calculated as its performance index.
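The training loop in steps (1) to (4) can be illustrated with a deliberately small sketch. It assumes depth-1 trees (stumps) and adds each tree's output without shrinkage; the tree depth, the helper names and the toy data are assumptions made for brevity, not details given in the text, and a production model would normally rely on an established library.

```python
from typing import List, Sequence, Tuple

# (feature index, threshold, value if x[f] <= threshold, value otherwise)
Stump = Tuple[int, float, float, float]


def fit_stump(X: Sequence[Sequence[float]], r: Sequence[float]) -> Stump:
    """Greedy search over every feature and split point for the lowest squared error."""
    mean_r = sum(r) / len(r)
    best: Stump = (0, float("inf"), mean_r, mean_r)  # fallback: no split at all
    best_sse = sum((v - mean_r) ** 2 for v in r)
    for f in range(len(X[0])):
        # Dropping the largest value keeps both sides of every split non-empty.
        for thr in sorted({row[f] for row in X})[:-1]:
            left = [r[i] for i, row in enumerate(X) if row[f] <= thr]
            right = [r[i] for i, row in enumerate(X) if row[f] > thr]
            lm, rm = sum(left) / len(left), sum(right) / len(right)
            sse = sum((v - lm) ** 2 for v in left) + sum((v - rm) ** 2 for v in right)
            if sse < best_sse:
                best, best_sse = (f, thr, lm, rm), sse
    return best


def stump_value(stump: Stump, x: Sequence[float]) -> float:
    f, thr, left_value, right_value = stump
    return left_value if x[f] <= thr else right_value


def fit_gbdt(X, y, n_trees: int = 1000) -> Tuple[float, List[Stump]]:
    base = sum(y) / len(y)                 # (1) constant initial prediction
    pred = [base] * len(y)
    trees: List[Stump] = []
    for _ in range(n_trees):               # (4) one weak learner per round
        residuals = [yi - pi for yi, pi in zip(y, pred)]  # (2) next training target
        stump = fit_stump(X, residuals)
        trees.append(stump)
        pred = [p + stump_value(stump, x) for p, x in zip(pred, X)]  # (3) update
    return base, trees


def predict_gbdt(model: Tuple[float, List[Stump]], x: Sequence[float]) -> float:
    base, trees = model
    return base + sum(stump_value(t, x) for t in trees)


# Toy data: one feature (a previous reading) and the value to predict.
X = [[4.8], [5.1], [6.9], [7.2]]
y = [5.0, 5.0, 7.0, 7.0]
model = fit_gbdt(X, y, n_trees=10)
print(predict_gbdt(model, [5.0]), predict_gbdt(model, [7.0]))
```

On this toy data a single split already removes all residual error, so the later rounds add trees that contribute zero; real follow-up data would keep reducing the residual over many rounds.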
3.3 Long short-term memory network model
A three-layer bidirectional long short-term memory (LSTM) network model is constructed to predict blood glucose values. The LSTM units in each layer comprise a memory cell, a forget gate, an input gate and an output gate. The bidirectional structure captures both forward and backward context in the sequence data, which improves prediction performance. In this model, each LSTM layer has 100 neurons, a dropout rate of 0.2 is applied between layers to reduce overfitting, and a fully connected layer finally maps the model output to the predicted value of the current blood glucose.
Next, the three-layer bidirectional LSTM network model is trained on the labeled training data set obtained in step 3.2. During training, the blood glucose values of the past 8 time steps are taken as input, and an adaptive moment estimation (Adam) optimizer adjusts the model parameters according to the loss between the predicted and actual values, specifically the squared loss function (MSE).
After training, the model is likewise used to predict the test set, and the mean absolute error (MAE) between the predicted and actual values is calculated to evaluate its performance.
4. Model integration using the autonomous-learning blood glucose value prediction integration method
The scheme adaptively selects the optimal integration mode according to the relative performance differences between the models. For example, for models A, B and C, the scheme first finds the smallest MAE achieved by the three models on the test set, then divides each model's MAE by this minimum to obtain the mean absolute error ratio (MAERatio). Taking model A as an example:
$$\mathrm{MAERatio}(A) = \frac{\mathrm{MAE}(A)}{\min\left(\mathrm{MAE}(A), \mathrm{MAE}(B), \mathrm{MAE}(C)\right)}$$
where min(·) denotes the minimum operation, MAE(A) is the mean absolute error of model A on the test set, and so on. The mean absolute error ratios of models B and C are calculated in the same way.
Then, for each model, the mean OtherMean of the mean absolute error ratios of all models other than the current one is calculated. Taking model A as an example:
$$\mathrm{OtherMean}(A) = \frac{\mathrm{MAERatio}(B) + \mathrm{MAERatio}(C)}{2}$$
where MAERatio(B) and MAERatio(C) are the mean absolute error ratios of models B and C, respectively.
The relative performance ratio is obtained by dividing the mean absolute error ratio of the current model by the mean of the ratios of the other models. Taking model A as an example:
$$\mathrm{Score}(A) = \frac{\mathrm{MAERatio}(A)}{\mathrm{OtherMean}(A)}$$
where Score(A) is the relative performance ratio of model A. The relative performance ratios of models B and C are calculated in the same way.
When the relative performance ratio of a model exceeds the given threshold of 1.8, that model is considered to perform significantly worse than the others. In this case, the integration method autonomously identifies and discards the model so that it does not degrade the overall performance of the integrated model.
The integration method then treats the remaining models as having similar test performance. To integrate them, their predictions are combined using a dynamic integration method, and the integrated model is defined as:
$$Y(j) = \sum_{i=1}^{m} w_i \, Y'_i(j)$$
where m is the number of sub-models remaining in the integrated prediction model (m=3 when all three models A, B and C are retained), Y'_i(j) is the blood glucose value predicted by the i-th model at time j, w_i is the weight of the i-th model in the integrated prediction model, and Y(j) is the prediction of the integrated model at time j. The method assigns weights based on the inverse of each model's test error, so that better-performing models receive higher weights. This makes full use of the strengths of each model and improves the overall prediction performance of the integrated model. The weight w_i is calculated as:
$$w_i = \frac{1/\mathrm{MAE}_i}{\sum_{k=1}^{m} 1/\mathrm{MAE}_k}$$
where MAE_i is the mean absolute error of the i-th model on the test set, and so on.
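The selection and weighting procedure of this section can be sketched as follows. The function names and the MAE values are hypothetical, and the weights are assumed to be inverse MAE values normalized to sum to 1; the threshold of 1.8 is the one given in the text.

```python
from typing import Dict

DISCARD_THRESHOLD = 1.8  # threshold on the relative performance ratio, from the text


def relative_scores(mae: Dict[str, float]) -> Dict[str, float]:
    """Score(m) = MAERatio(m) / mean of the MAERatio values of the other models.

    Assumes at least two models and strictly positive MAE values.
    """
    best = min(mae.values())
    ratio = {name: value / best for name, value in mae.items()}
    scores = {}
    for name in mae:
        others = [ratio[other] for other in ratio if other != name]
        scores[name] = ratio[name] / (sum(others) / len(others))
    return scores


def ensemble_weights(
    mae: Dict[str, float], threshold: float = DISCARD_THRESHOLD
) -> Dict[str, float]:
    """Drop models whose score exceeds the threshold, then weight by inverse MAE."""
    scores = relative_scores(mae)
    kept = {name: value for name, value in mae.items() if scores[name] <= threshold}
    inverse_total = sum(1.0 / value for value in kept.values())
    return {name: (1.0 / value) / inverse_total for name, value in kept.items()}


def ensemble_predict(weights: Dict[str, float], predictions: Dict[str, float]) -> float:
    """Y(j) = sum of w_i * Y'_i(j) over the retained models."""
    return sum(weight * predictions[name] for name, weight in weights.items())


# Hypothetical test-set MAE values (mmol/L) for the three sub-models.
mae = {"ARIMA": 0.8, "GBDT": 1.0, "LSTM": 2.0}
scores = relative_scores(mae)
weights = ensemble_weights(mae)
print({name: round(score, 3) for name, score in scores.items()})
print({name: round(weight, 3) for name, weight in weights.items()})
print(round(ensemble_predict(weights, {"ARIMA": 6.0, "GBDT": 6.9, "LSTM": 9.0}), 3))
```

In this example the third model's score is about 2.22, above 1.8, so it is dropped and the other two share the weight in proportion to 1/MAE. The model with the lowest MAE always has a score of at most 1, so at least one model is always retained.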
Through this dynamic learning process, the integration method automatically adjusts the internal structure of the integrated model and the weight distribution of the sub-models according to the latest data changes and prediction performance, discarding models that predict poorly and emphasizing models that predict well. This flexibility and adaptability keep the integrated model sensitive to new data and new conditions, helping to maintain the accuracy and stability of blood glucose predictions.
After the composition of the integrated model has been determined, the remaining sub-models are trained on the diabetic patient's complete blood glucose time-series data. These may include the ARIMA sub-model, the gradient boosting decision tree sub-model and the LSTM network sub-model, and the retained sub-models are then combined into the integrated model.
5. Predicting the future blood glucose of diabetic patients with the integrated model and raising alarms in abnormal situations
Based on the time intervals of the blood glucose records in the training data, the integrated model is used to predict the diabetic patient's blood glucose values 15, 30 and 45 days ahead. Longer-term predictions over other time ranges can be made as actual needs require, and change rules and thresholds for the different blood glucose types are configured in the system.
During blood glucose prediction, the system checks whether each predicted value falls outside the normal range. For example, if a fasting blood glucose value is outside the range 3.9-6.1 mmol/L, or a 2-hour postprandial blood glucose value exceeds 7.8 mmol/L, the system flags the value as abnormal. The system also considers the trend of future blood glucose: if the predictions show blood glucose rising continuously over a future period, this is likewise flagged as an abnormal situation.
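The abnormality checks can be sketched as below. The thresholds are the ones quoted in the text; the function names, the type labels and the exact trend rule (every consecutive prediction strictly higher than the previous one, over at least two steps) are assumptions made for illustration, since the text does not define "continuously rising" precisely.

```python
from typing import List, Sequence

FASTING_RANGE = (3.9, 6.1)  # mmol/L, normal fasting range quoted in the text
POSTPRANDIAL_2H_MAX = 7.8   # mmol/L, 2-hour postprandial limit quoted in the text


def is_out_of_range(value: float, kind: str) -> bool:
    """Check one predicted value against the range for its blood glucose type."""
    if kind == "fasting":
        low, high = FASTING_RANGE
        return value < low or value > high
    if kind == "postprandial_2h":
        return value > POSTPRANDIAL_2H_MAX
    raise ValueError(f"unknown blood glucose type: {kind}")


def is_continuously_rising(values: Sequence[float], min_steps: int = 2) -> bool:
    """Assumed trend rule: at least `min_steps` steps, each strictly upward."""
    steps = list(zip(values, values[1:]))
    return len(steps) >= min_steps and all(later > earlier for earlier, later in steps)


def check_alarms(predictions: Sequence[float], kind: str) -> List[str]:
    """Return the reasons (possibly none) for flagging a set of predictions."""
    reasons = []
    for position, value in enumerate(predictions, start=1):
        if is_out_of_range(value, kind):
            reasons.append(f"prediction {position} out of range: {value} mmol/L")
    if is_continuously_rising(predictions):
        reasons.append("blood glucose rising continuously")
    return reasons


# Hypothetical fasting predictions for 15, 30 and 45 days ahead.
print(check_alarms([5.8, 6.0, 6.4], "fasting"))
print(check_alarms([5.2, 5.0, 5.1], "fasting"))
```

Returning a list of reasons rather than a single flag lets the alarm message tell medical staff whether the concern is a threshold breach, a rising trend, or both.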
For diabetic patients flagged as abnormal, the system triggers an alarm mechanism. The alarm information is sent to medical staff through channels such as mobile applications, email and short messages, so that they can promptly adjust the patient's treatment plan, dietary control, medication management and other measures to keep blood glucose within a reasonable range.
In summary, in combination with digital therapeutics for diabetes, the application provides a blood glucose value prediction and alarm method based on the follow-up records of diabetic patients. The method requires no additional equipment: recording and storing the patient's routine follow-up records is sufficient to achieve accurate prediction and effective management of blood glucose. The method also has an autonomous learning capability, adapting promptly to the patient's most recent blood glucose changes by dynamically adjusting the structure of the internal model, which ensures the accuracy and stability of blood glucose prediction. Overall, it is an innovative, economical and practical solution for blood glucose management.

Claims (4)

1. A blood glucose value prediction and alarm method based on follow-up records of diabetic patients, characterized by comprising the following steps:
(1.1) collecting follow-up records of the patient, extracting first blood glucose value data from them and performing data cleaning; then using a missing-value restoration and fine-tuning algorithm for blood glucose value data to fill in the missing blood glucose values, thereby recovering second blood glucose value data with time-series integrity from the incomplete follow-up data;
(1.2) dividing the second blood glucose value data into a training data set by walk-forward validation; training an autoregressive integrated moving average (ARIMA) model with parameters tuned automatically by a grid search method; and training a gradient boosting decision tree model and a long short-term memory (LSTM) network model by supervised learning;
(1.3) based on the ARIMA model, the gradient boosting decision tree model and the LSTM network model obtained in step (1.2), providing an autonomous-learning blood glucose value prediction integration method which evaluates the test performance of the three models over the most recent period and adaptively selects the optimal method for integrating the models according to their performance differences;
(1.4) predicting the blood glucose values of the diabetic patient for a future period based on the integrated model obtained in step (1.3); monitoring the patient's future blood glucose values and their trend against a preset normal range and change rules for blood glucose values; and, if a blood glucose value exceeds a specified threshold or blood glucose rises continuously, treating this as an abnormal situation, generating corresponding alarm information and notifying the doctor and the patient.
2. The blood glucose value prediction and alarm method based on follow-up records of diabetic patients according to claim 1, wherein step (1.1) comprises the following steps:
(2.1) collecting follow-up records of patients and extracting the original first blood glucose value data according to the blood glucose index to be predicted; then cleaning null values and abnormal values in the first blood glucose value data to ensure the accuracy and consistency of the data;
(2.2) processing the cleaned data obtained in step (2.1) with the missing-value restoration and fine-tuning algorithm for blood glucose value data:
(2.2.1) first, using linear interpolation over the data points adjacent to the missing data to make an initial estimate of the blood glucose value at each missing point, so as to fill in the missing data;
(2.2.2) resampling the interpolated data obtained in step (2.2.1) at predetermined time intervals so that the time intervals between data points are consistent; at the same time, analyzing the time-series pattern of the existing data and obtaining continuous, equally spaced blood glucose value data by a secondary filling method;
(2.2.3) fine-tuning the continuous blood glucose value data obtained in step (2.2.2): to keep the data smooth over time and avoid abnormal fluctuations and discontinuities, smoothing and denoising the data with an exponentially weighted moving average method to obtain the final processed second blood glucose value data.
3. The blood glucose value prediction and alarm method based on follow-up records of diabetic patients according to claim 1, wherein step (1.2) comprises the following steps:
(3.1) automatically dividing the second blood glucose value data in a given proportion by walk-forward validation, into a training set and a test set in time order: according to the acquisition time of the blood glucose data in the follow-up records and taking the current time as the reference, the earliest 80% of the data are used as the training set for model training and parameter tuning, and the most recent 20% of the data are used as the test set for evaluating the performance and predictive power of the model;
(3.2) based on the second blood glucose value data, using a relative optimal model identification method to automatically determine, by a grid search method that minimizes the Bayesian information criterion, the three optimal hyperparameters of the model: the autoregressive order (p), the differencing order (d) and the moving-average (lag-term) order (q); and then training the ARIMA model on the training set obtained in step (3.1);
(3.3) converting the training set and the test set into supervised samples using a sliding window method; then specifying the number of weak learners and the objective function of the decision trees to construct a regression model, fitting the residual step by step with a greedy algorithm within the gradient boosting framework, and training the gradient boosting decision tree model on the supervised samples;
(3.4) based on the supervised samples obtained in step (3.3), dynamically adjusting the number of layers and the number of neurons of the LSTM network model according to actual requirements, and specifying an activation function and a loss function to train the model.
4. The blood glucose value prediction and alarm method based on follow-up records of diabetic patients according to claim 3, wherein the autonomous-learning blood glucose value prediction integration method comprises the following steps:
(4.1) evaluating the test performance of the three models over the most recent period using the test set obtained in step (3.1), with the mean absolute error as the evaluation index quantifying each model's prediction error;
(4.2) determining the relative performance difference of each model from the evaluation results of the three models on the test set obtained in step (4.1); based on these differences, adaptively selecting the optimal integration mode: if the test performance of a model is lower than a given threshold, discarding that model and selecting the other models in its place; if the three models have similar test performance, combining the prediction results of the three models with a dynamic integration method and assigning weights according to the inverse of their test errors, so that better-performing models receive higher weights;
(4.3) based on the model weights obtained in step (4.2), dynamically adjusting the weight distribution through a continuous learning process according to the latest prediction performance and data changes, so as to adapt flexibly to continuously changing data and prediction requirements.

Application number: CN202311029689.6A
Publication number: CN117095828A (en)
Title: Blood glucose value prediction and alarm method based on follow-up record of diabetics
Priority date: 2023-08-15
Filing date: 2023-08-15
Publication date: 2023-11-21
Status: Pending
Family ID: 88780799
Country: CN

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination