CN116451110A - Blood glucose prediction model construction method based on signal energy characteristics and pulse period - Google Patents

Info

Publication number
CN116451110A
Authority
CN
China
Prior art keywords
signal
features
pulse wave
value
blood glucose
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310227933.3A
Other languages
Chinese (zh)
Inventor
季忠
米鹏
杨粟瑞
刘子嘉
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Chongqing University
Original Assignee
Chongqing University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Chongqing University
Priority to CN202310227933.3A
Publication of CN116451110A

Classifications

    • A HUMAN NECESSITIES
    • A61 MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61B DIAGNOSIS; SURGERY; IDENTIFICATION
    • A61B5/00 Measuring for diagnostic purposes; Identification of persons
    • A61B5/145 Measuring characteristics of blood in vivo, e.g. gas concentration, pH value; Measuring characteristics of body fluids or tissues, e.g. interstitial fluid, cerebral tissue
    • A61B5/14532 Measuring characteristics of blood in vivo, e.g. gas concentration, pH value; Measuring characteristics of body fluids or tissues, e.g. interstitial fluid, cerebral tissue for measuring glucose, e.g. by tissue impedance measurement
    • A HUMAN NECESSITIES
    • A61 MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61B DIAGNOSIS; SURGERY; IDENTIFICATION
    • A61B5/00 Measuring for diagnostic purposes; Identification of persons
    • A61B5/02 Detecting, measuring or recording pulse, heart rate, blood pressure or blood flow; Combined pulse/heart-rate/blood pressure determination; Evaluating a cardiovascular condition not otherwise provided for, e.g. using combinations of techniques provided for in this group with electrocardiography or electroauscultation; Heart catheters for measuring blood pressure
    • A61B5/024 Detecting, measuring or recording pulse rate or heart rate
    • A61B5/02416 Detecting, measuring or recording pulse rate or heart rate using photoplethysmograph signals, e.g. generated by infrared radiation
    • A HUMAN NECESSITIES
    • A61 MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61B DIAGNOSIS; SURGERY; IDENTIFICATION
    • A61B5/00 Measuring for diagnostic purposes; Identification of persons
    • A61B5/72 Signal processing specially adapted for physiological signals or for diagnostic purposes
    • A61B5/7225 Details of analog processing, e.g. isolation amplifier, gain or sensitivity adjustment, filtering, baseline or drift compensation
    • A HUMAN NECESSITIES
    • A61 MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61B DIAGNOSIS; SURGERY; IDENTIFICATION
    • A61B5/00 Measuring for diagnostic purposes; Identification of persons
    • A61B5/72 Signal processing specially adapted for physiological signals or for diagnostic purposes
    • A61B5/7235 Details of waveform analysis
    • A61B5/7253 Details of waveform analysis characterised by using transforms
    • A61B5/726 Details of waveform analysis characterised by using transforms using Wavelet transforms
    • A HUMAN NECESSITIES
    • A61 MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61B DIAGNOSIS; SURGERY; IDENTIFICATION
    • A61B5/00 Measuring for diagnostic purposes; Identification of persons
    • A61B5/72 Signal processing specially adapted for physiological signals or for diagnostic purposes
    • A61B5/7235 Details of waveform analysis
    • A61B5/7264 Classification of physiological signals or data, e.g. using neural networks, statistical classifiers, expert systems or fuzzy systems
    • A61B5/7267 Classification of physiological signals or data, e.g. using neural networks, statistical classifiers, expert systems or fuzzy systems involving training the classification device
    • A HUMAN NECESSITIES
    • A61 MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61B DIAGNOSIS; SURGERY; IDENTIFICATION
    • A61B5/00 Measuring for diagnostic purposes; Identification of persons
    • A61B5/72 Signal processing specially adapted for physiological signals or for diagnostic purposes
    • A61B5/7271 Specific aspects of physiological measurement analysis
    • A61B5/7275 Determining trends in physiological measurement data; Predicting development of a medical condition based on physiological measurements, e.g. determining a risk factor
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G06N3/084 Backpropagation, e.g. using gradient descent
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02A TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
    • Y02A90/00 Technologies having an indirect contribution to adaptation to climate change
    • Y02A90/10 Information and communication technologies [ICT] supporting adaptation to climate change, e.g. for weather forecasting or climate simulation

Landscapes

  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Molecular Biology (AREA)
  • Artificial Intelligence (AREA)
  • Biophysics (AREA)
  • General Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Veterinary Medicine (AREA)
  • Medical Informatics (AREA)
  • Heart & Thoracic Surgery (AREA)
  • Surgery (AREA)
  • Animal Behavior & Ethology (AREA)
  • Pathology (AREA)
  • Public Health (AREA)
  • Physiology (AREA)
  • Signal Processing (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Psychiatry (AREA)
  • Evolutionary Computation (AREA)
  • Mathematical Physics (AREA)
  • Data Mining & Analysis (AREA)
  • Cardiology (AREA)
  • Computational Linguistics (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Software Systems (AREA)
  • Power Engineering (AREA)
  • Fuzzy Systems (AREA)
  • Emergency Medicine (AREA)
  • Optics & Photonics (AREA)
  • Measuring Pulse, Heart Rate, Blood Pressure Or Blood Flow (AREA)

Abstract

The invention relates to a method for constructing a blood glucose prediction model based on signal energy features and pulse period, and belongs to the technical field of biomedical signal processing. The method comprises the following steps. Preliminary screening: the Kaiser-Teager energy features and logarithmic energy entropy features of the PPG signal are extracted and input into an SVM classifier to classify the signal quality. Fine screening: signals whose pulse wave interval period does not satisfy the set threshold are identified and removed. The pulse wave waveform features and the energy features of the finely screened signals are combined with human physiological parameters to form a 41-dimensional feature vector. The features are ranked by importance using xgboost, features are added in order of decreasing importance, and a blood glucose estimation data set is constructed. The constructed data set is input into a particle swarm BP neural network to build a noninvasive blood glucose estimation model. The signal quality screening method is simple to operate and computationally light, and can improve the stability and applicability of blood glucose estimation models on wearable devices.

Description

Blood glucose prediction model construction method based on signal energy characteristics and pulse period
Technical Field
The invention belongs to the technical field of biomedical signal processing, and relates to a blood glucose prediction model construction method based on signal energy characteristics and pulse periods.
Background
Currently, blood glucose is measured mainly with either a biochemical analyzer or a rapid blood glucose meter. The former is mainly used in settings such as hospitals; it requires a large blood sample, takes a long time, and the equipment is bulky. The latter is mainly used at home: a miniature glucometer needs only 1-3 μL of blood and can compute the blood glucose concentration in a short time.
Both the biochemical analyzer and fingertip blood sampling are invasive measurement methods. Frequent blood glucose measurement is tedious and painful for the patient and carries a risk of infection. In addition, for household miniature glucometers, the cost of test strips is a considerable expense. In practice, most diabetic patients cannot monitor their blood glucose concentration continuously because the measurement is cumbersome and painful; as a result, they do not receive appropriate treatment and their blood glucose concentration is not effectively controlled. A truly noninvasive method for measuring blood glucose concentration therefore has great practical significance.
Noninvasive blood glucose detection based on the PPG signal stands out for its convenient signal acquisition: the PPG signal reflects blood flow in the vessels and can be acquired at the subject's fingertip. The PPG signal can be used not only to assess cardiovascular disease but also to predict blood pressure and blood glucose. Many wearable devices already support PPG monitoring, so extracting features from the PPG signal to build a blood glucose prediction model is a very promising direction.
However, signals acquired by wearable devices are often of poor quality because the device is worn improperly or the patient moves. Screening signals by quality eliminates the influence of abnormal signals on subsequent results and reduces the amount of computation. Some acquired PPG signals contain no pulse or are heavily distorted; features extracted from such signals are erroneous or cannot be extracted at all, which seriously affects the training efficiency and prediction accuracy of the model. Signal quality screening is therefore an important step both in algorithm development and in actual use.
Disclosure of Invention
Therefore, the invention aims to provide a simple and fast PPG signal quality screening method based on signal energy features and pulse period. The method ensures the quality of the PPG signals used for subsequent feature extraction and supplies high-quality PPG signals for subsequent model construction, thereby improving the stability and accuracy of the noninvasive blood glucose estimation model.
In order to achieve the above purpose, the present invention provides the following technical solution:
A blood glucose prediction model construction method based on signal energy features and pulse period, specifically comprising the following steps:
S1: removing high-frequency noise from the collected PPG signals;
S2: preliminary screening of PPG signal quality: extracting the Kaiser-Teager energy features and logarithmic energy entropy features of the PPG signal denoised in step S1, using the Kaiser-Teager operator and an entropy method, and inputting these features into an SVM classifier to classify the signal quality as good or bad;
S3: finely screening the good-quality PPG signals from step S2: identifying signals whose pulse wave interval period does not satisfy the set threshold and removing them from the data set;
S4: combining 39 features, namely the pulse wave waveform features, autoregressive coefficients, heart rate, and signal energy features of the signals screened in step S3, with 2 human physiological parameter features to form a 41-dimensional feature vector;
S5: ranking the features by importance with an xgboost regressor, adding features in order of decreasing importance while comparing the results, and constructing a blood glucose estimation data set;
S6: inputting the data set constructed in step S5 into a particle swarm BP neural network to build a noninvasive blood glucose estimation model.
Further, in step S2, the Kaiser-Teager energy feature is mainly used to determine the instantaneous energy distribution of the signal, which indicates whether the signal is noisy or clean and helps decide whether to retain it. In the discrete domain, the energy operator can be computed from only three adjacent signal values, so it has low computational complexity and high time resolution. The PPG sampling rate is 64 Hz and the continuous glucose monitor records a blood glucose value every 5 minutes; the 30 seconds of PPG signal preceding each blood glucose reading are extracted and paired with that reading. One PPG segment S_W therefore contains 1920 points and is divided into frames of length L_frame = 64, giving frame signals S_f(τ, n), where τ indexes the samples within a frame (τ = 1, …, L_frame) and n is the frame number (n = 1, …, 30). The Kaiser-Teager feature KTE(τ, n) is calculated as follows:
$$\mathrm{KTE}(\tau, n) = S_f(\tau, n)^2 - S_f(\tau - 1, n)\, S_f(\tau + 1, n)$$
For the KTE(τ, n) of each frame (τ = 2, 3, …, L_frame − 1), the mean, variance, percentile, and skewness are calculated; each statistic is then averaged over all frames (n = 1, 2, …, 30) to obtain 4 Kaiser-Teager energy features.
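The framing and statistics described above can be sketched as follows. This is a minimal illustration, not the patented implementation: it assumes the standard discrete Teager-Kaiser operator x(t)^2 − x(t−1)·x(t+1), and the function names, the population (rather than sample) variance, and the percentile level `q=50` are assumptions, since the text does not state which percentile is used.

```python
import math
from statistics import fmean, pvariance

FS = 64  # sampling rate in Hz (from the text); one frame = one second


def kte(frame):
    """Kaiser-Teager energy for the interior samples of one frame."""
    return [frame[t] ** 2 - frame[t - 1] * frame[t + 1]
            for t in range(1, len(frame) - 1)]


def percentile(values, q):
    """Linear-interpolation percentile, q in [0, 100]."""
    xs = sorted(values)
    pos = (len(xs) - 1) * q / 100.0
    lo, hi = math.floor(pos), math.ceil(pos)
    return xs[lo] + (xs[hi] - xs[lo]) * (pos - lo)


def skewness(values):
    """Population skewness; returns 0.0 for a constant sequence."""
    m = fmean(values)
    sd = math.sqrt(pvariance(values, m))
    if sd == 0.0:
        return 0.0
    return fmean([((v - m) / sd) ** 3 for v in values])


def kte_features(segment, frame_len=FS, q=50):
    """Return 4 features: per-frame mean, variance, percentile and
    skewness of the KTE, each averaged over all frames.
    q=50 is an assumed percentile level (not given in the text)."""
    frames = [segment[i:i + frame_len]
              for i in range(0, len(segment) - frame_len + 1, frame_len)]
    per_frame = []
    for f in frames:
        e = kte(f)
        per_frame.append((fmean(e), pvariance(e), percentile(e, q), skewness(e)))
    return [fmean(col) for col in zip(*per_frame)]
```

For a 30-second segment sampled at 64 Hz (1920 points), `kte_features` splits the signal into 30 frames and returns a list of 4 values. A pure sinusoid A·sin(Ωt) gives a constant energy A²·sin²(Ω), which is a convenient sanity check.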
Further, in step S2, the logarithmic energy entropy feature is a time-domain entropy measure, which is calculated according to the full-band energy spectrum:
The statistical characteristics of the logarithmic energy entropy sequence are then calculated, namely its mean, variance, and percentile, giving 3 features. The features extracted by the two methods are combined into a 7-dimensional feature vector, which serves as the input to the SVM classifier for preliminary signal quality screening.
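As a hedged illustration of the three entropy statistics, the sketch below assumes a commonly used definition of logarithmic energy entropy, the sum of log(x²) over the samples of a frame, computed once per one-second frame. The exact formula used in the source is not reproduced in the text, so this definition, the small constant `eps`, the function names, and the percentile level `q=50` are all assumptions.

```python
import math
from statistics import fmean, pvariance


def log_energy_entropy(frame, eps=1e-12):
    """Assumed definition: sum of log(x^2) over the frame.
    eps guards against log(0) for zero-valued samples."""
    return sum(math.log(x * x + eps) for x in frame)


def logen_features(segment, frame_len=64, q=50):
    """Mean, variance and percentile of the per-frame entropy sequence."""
    seq = [log_energy_entropy(segment[i:i + frame_len])
           for i in range(0, len(segment) - frame_len + 1, frame_len)]
    xs = sorted(seq)
    pos = (len(xs) - 1) * q / 100.0
    lo, hi = math.floor(pos), math.ceil(pos)
    pct = xs[lo] + (xs[hi] - xs[lo]) * (pos - lo)
    return [fmean(seq), pvariance(seq), pct]
```

Concatenating these 3 values with the 4 Kaiser-Teager statistics gives the 7-dimensional input vector for the SVM quality classifier.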
Further, step S3 specifically includes: locating the valleys of the preliminarily screened good-quality PPG signal with a differential threshold method, fitting the baseline drift of the PPG signal by cubic spline interpolation, and subtracting the baseline drift from the original signal to obtain a PPG signal with the baseline drift removed.
The peaks and valleys of the pulse wave are located with the differential threshold method; the peak intervals, the valley intervals, and the numbers of peaks and valleys are calculated; the threshold is set to the heart rate range (50, 140); and signals that do not satisfy the threshold are removed. The main steps of the differential threshold method are as follows:
(1) Compute the differential signal of the PPG signal and set the parts smaller than zero to 0, giving PPG_diff;
(2) Obtain the coordinates peaks of the maxima of PPG_diff, and calculate diff_mean as the threshold for judging whether a maximum qualifies;
(3) Traverse the maxima: if the differential signal at a maximum satisfies PPG_diff(peaks) > diff_mean and the distance to the previous maximum, peaks(i) − peaks(i−1), lies within the range required by the heart rate, add peaks(i) to the result, giving the screened maxima coordinates peaks2;
(4) The coordinates peaks2 are offset from the peaks of the original PPG signal, so the 1/3 heart-rate period following each differential maximum in peaks2 is searched to obtain the peak coordinates rpeaks;
(5) Remove duplicates from rpeaks to obtain the final peaks of the PPG signal;
(6) Search the 1/2 period preceding each peak in rpeaks and take the coordinate of the minimum as the valley.
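Steps (1) to (6) can be sketched roughly as below. Several details are assumptions made for illustration only: `diff_mean` is taken as the mean of the clipped differential signal, step (3) enforces only the minimum spacing implied by the upper heart rate bound, and the "heart-rate period" used for the search windows is estimated as the median spacing of `peaks2`. The function name `locate_peaks` is hypothetical.

```python
from statistics import fmean, median


def locate_peaks(ppg, fs=64, max_bpm=140):
    """Return (rpeaks, valleys) sample indices for a PPG segment."""
    # (1) first difference with negative parts set to 0
    diff = [max(ppg[i + 1] - ppg[i], 0.0) for i in range(len(ppg) - 1)]
    # (2) local maxima of the differential signal and the threshold
    cand = [i for i in range(1, len(diff) - 1)
            if diff[i - 1] < diff[i] >= diff[i + 1]]
    if not cand:
        return [], []
    diff_mean = fmean(diff)  # assumed threshold definition
    # (3) keep maxima above the threshold that are not too close together
    min_gap = fs * 60.0 / max_bpm
    peaks2 = []
    for i in cand:
        if diff[i] > diff_mean and (not peaks2 or i - peaks2[-1] >= min_gap):
            peaks2.append(i)
    if len(peaks2) < 2:
        return [], []
    period = median([b - a for a, b in zip(peaks2, peaks2[1:])])
    # (4) search the following 1/3 period for the true signal peak
    win = max(1, int(period / 3))
    rpeaks = []
    for i in peaks2:
        stop = min(i + win + 1, len(ppg))
        rpeaks.append(max(range(i, stop), key=lambda k: ppg[k]))
    # (5) remove duplicates
    rpeaks = sorted(set(rpeaks))
    # (6) search the preceding 1/2 period for the valley
    half = max(1, int(period / 2))
    valleys = []
    for r in rpeaks:
        start = max(r - half, 0)
        valleys.append(min(range(start, r + 1), key=lambda k: ppg[k]))
    return rpeaks, valleys
```

On a clean 75 beats-per-minute sinusoid sampled at 64 Hz, this returns one peak per cycle, spaced 51 or 52 samples apart, each with a valley located before it.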
After the peaks and valleys of the pulse wave are obtained with the differential threshold method, the peak interval is used as the main criterion: if two or more peak intervals fall outside the heart rate range (50, 140), the signal is considered to contain large jumps and is removed; otherwise the signal is retained and its heart rate is calculated as an input feature for the subsequent model.
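The interval rule above can be written compactly. In this sketch the function name and the `max_bad` parameter are hypothetical; the heart rate range is interpreted in beats per minute, and the heart rate feature is computed from the mean peak interval.

```python
def screen_by_peak_intervals(rpeaks, fs=64, hr_range=(50, 140), max_bad=1):
    """Return (keep, heart_rate_bpm) for a list of sorted peak indices.

    A segment is rejected when more than `max_bad` peak intervals map to
    a heart rate outside `hr_range`, i.e. two or more by default.
    """
    gaps = [b - a for a, b in zip(rpeaks, rpeaks[1:])]
    if not gaps:
        return False, None
    bpm = [60.0 * fs / g for g in gaps]
    bad = sum(1 for v in bpm if not hr_range[0] < v < hr_range[1])
    if bad > max_bad:
        return False, None
    # heart rate from the mean peak interval
    heart_rate = 60.0 * fs * len(gaps) / sum(gaps)
    return True, heart_rate
```

For example, peaks spaced exactly 64 samples apart at 64 Hz give a retained segment with a heart rate of 60, while a segment with two 16-sample intervals is rejected.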
Further, in step S4, 25 features in total are extracted from the pulse wave waveform. They are extracted as follows: first, the pulse wave signal is divided into single pulse wave periods according to the valley sequence obtained with the differential threshold method in step S3; then the waveform features are extracted from each single-period signal, giving a feature vector of 25 features.
In step S4, the autoregressive coefficients are obtained with an autoregressive model: the model predicts the value at the current point from the p preceding points of the pulse wave, and the resulting model coefficients characterize the pulse wave. The autoregressive model can be described by the following formula:
$$S(\tau, n) = \sum_{i=1}^{p} AR_i\, S(\tau - i, n) + b + e(n)$$
where S(τ, n) is the τ-th value of the n-th frame of the PPG signal, e(n) is the error between the prediction of the autoregressive model and the true value, p is the regression order, AR_i are the coefficients of the autoregressive model, and b is the bias term. Here p = 5, which together with the bias term gives 6 autoregressive coefficient features.
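One way to obtain the 6 autoregressive features is an ordinary least-squares fit, sketched below with NumPy. The text does not say how the coefficients are estimated, so the least-squares choice and the function name `ar_features` are assumptions.

```python
import numpy as np


def ar_features(x, p=5):
    """Fit x[t] = sum_i AR_i * x[t-i] + b by least squares.

    Returns an array [AR_1, ..., AR_p, b] of length p + 1.
    """
    x = np.asarray(x, dtype=float)
    # Each row holds the p previous samples (most recent first)
    # followed by a constant 1 for the bias term b.
    rows = [np.append(x[t - p:t][::-1], 1.0) for t in range(p, len(x))]
    A = np.vstack(rows)
    y = x[p:]
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef
```

With the default `p=5` the returned vector has 6 entries, matching the 5 coefficients plus the bias term described above.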
Further, the heart rate is calculated from the average pulse wave interval, giving 1 feature.
Further, in step S4, the 41-dimensional feature vector includes: 4 Kaiser-Teager energy features, 3 statistics of the logarithmic energy entropy sequence, 25 pulse wave waveform features, 6 autoregressive coefficient features, 1 heart rate feature, and 2 physiological features (gender and glycosylated hemoglobin).
Further, step S5 specifically includes: constructing an xgboost regressor to obtain the importance ranking of the features, adding the features one at a time in that order, and checking the model indexes on the data set formed by the corresponding features until the performance of the model no longer improves, which finally yields a data set of 10 features; a noninvasive blood glucose estimation model is then built with the PSO-BP neural network. The feature importance ranking and the construction of the blood glucose estimation model mainly comprise the following steps:
S51: extracting the 41-feature vectors from the screened signals to construct a data set;
S52: applying Z-score normalization to the data set; the features may contain outliers outside the normal value range, and to limit their influence Z-score normalization is used, i.e. the difference between the raw data and the mean, divided by the standard deviation:
$$z = \frac{x - \mu}{\sigma}$$
where μ and σ are the mean and standard deviation of the feature;
S53: dividing the normalized data into a training set, a validation set, and a test set in the ratio 7:1:2, and obtaining feature importance scores with xgboost. With xgboost, i.e. gradient boosted trees, the importance score of each feature can be obtained fairly directly once the boosted trees have been built; the score measures how valuable the feature is in constructing the boosted decision trees of the model. The more a feature is used to build decision trees in the model, the more important it is. The features are ranked from high to low by importance score;
S54: adding the highest-ranked features one at a time according to the ranking from step S53, then building an xgboost model to check the model accuracy for each number of features, and obtaining the feature combination with the best final model accuracy.
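Steps S52 and S54 can be illustrated with a small sketch. The `evaluate` callback is hypothetical: in the described method it would train an xgboost model on the given features and return a validation error (lower is better), but any scoring function can be plugged in. The function names and the stopping tolerance `tol` are also assumptions.

```python
from statistics import fmean, pstdev


def zscore_columns(rows):
    """Z-score each column: (x - mean) / standard deviation."""
    cols = list(zip(*rows))
    # A constant column has zero deviation; divide by 1.0 instead.
    stats = [(fmean(c), pstdev(c) or 1.0) for c in cols]
    return [[(v - m) / s for v, (m, s) in zip(row, stats)] for row in rows]


def add_features_in_rank_order(ranked, evaluate, tol=0.0):
    """Add features by decreasing importance until the error stops improving.

    ranked:   feature names sorted from most to least important
    evaluate: hypothetical callback mapping a feature list to an error
    """
    best_score, best_set = float("inf"), []
    current = []
    for name in ranked:
        current.append(name)
        score = evaluate(current)
        if score < best_score - tol:
            best_score, best_set = score, list(current)
        else:
            break
    return best_set, best_score
```

Stopping at the first non-improving feature is one reading of "until the performance of the model no longer improves"; a more tolerant variant could continue for a few more features before giving up.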
Further, step S6 specifically includes:
S61: constructing a particle swarm BP neural network on the data set built from the importance ranking in step S54, with the weights of the BP neural network optimized by a particle swarm algorithm: 5% of the data set is selected at random, and the error between the blood glucose predicted by the BP neural network and the true blood glucose is used as the fitness function to optimize the weights;
S62: using the optimized weights as the initial weights of the BP neural network, dividing the remaining 95% of the data set into a training set, a validation set, and a test set in the ratio 7:1:2, and training the BP neural network to obtain the blood glucose estimation model.
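The particle swarm stage of S61 can be sketched with a generic minimizer. This is a textbook global-best PSO, not necessarily the variant used in the source; the function name, the inertia and acceleration constants, and the initialization range are assumptions. In the described method, `fitness` would map a flattened weight vector to the BP network's prediction error on the 5% subset.

```python
import random


def pso_minimize(fitness, dim, n_particles=20, n_iter=50,
                 w=0.7, c1=1.5, c2=1.5, seed=0):
    """Minimize `fitness` over R^dim; returns (best, best_value, history)."""
    rng = random.Random(seed)
    pos = [[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_f = [fitness(p) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_f[i])
    gbest, gbest_f = pbest[g][:], pbest_f[g]
    history = [gbest_f]
    for _ in range(n_iter):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # inertia + pull toward personal best + pull toward global best
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                pos[i][d] += vel[i][d]
            f = fitness(pos[i])
            if f < pbest_f[i]:
                pbest[i], pbest_f[i] = pos[i][:], f
                if f < gbest_f:
                    gbest, gbest_f = pos[i][:], f
        history.append(gbest_f)
    return gbest, gbest_f, history
```

The returned `gbest` vector would then be reshaped into the network's weight matrices and used as the starting point for gradient-based training in S62.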
The invention has the beneficial effects that:
1) The invention first performs preliminary signal quality screening using the energy features of the pulse wave signal and then screens further using the pulse period, so most useless signals are removed, the amount of computation is reduced, and the model prediction accuracy is greatly improved.
2) The modeling process considers both the energy features and the waveform features of the pulse wave and extracts most of the information in the pulse wave, making the noninvasive blood glucose estimation model more stable and accurate.
3) The signal quality screening method is simple to operate and computationally light, can run quickly on embedded devices, provides a basis for pulse wave signal processing on wearable devices, and can further improve the stability and applicability of blood glucose estimation models on wearable devices.
Additional advantages, objects, and features of the invention will be set forth in part in the description which follows and in part will become apparent to those having ordinary skill in the art upon examination of the following or may be learned from practice of the invention. The objects and other advantages of the invention may be realized and obtained by means of the instrumentalities and combinations particularly pointed out in the specification.
Drawings
To make the objects, technical solutions, and advantages of the present invention clearer, the invention is described in detail below with reference to the accompanying drawings, in which:
FIG. 1 is a flow chart of the signal quality screening method of the present invention;
FIG. 2 is a diagram of the pulse wave waveform feature points;
FIG. 3 is the ROC curve of the SVM quality classifier (AUC score of 0.996);
FIG. 4 shows 10 randomly selected segments of raw PPG signal;
FIG. 5 shows the preprocessed (denoised, screened, and baseline-drift-removed) PPG signal;
FIG. 6 is a graph of model metrics for different numbers of features;
FIG. 7 shows the top 10 features ranked by xgboost feature importance;
FIG. 8 is a schematic diagram of Clarke error grid analysis results of a PSO-BP model;
FIG. 9 is a schematic representation of the results of Bland-Altman consistency analysis of the PSO-BP model.
Detailed Description
Other advantages and effects of the present invention will become apparent to those skilled in the art from the following disclosure, which describes embodiments of the present invention with reference to specific examples. The invention may also be practiced or applied in other embodiments, and the details of this description may be modified or varied without departing from the spirit and scope of the present invention. It should be noted that the illustrations provided in the following embodiments merely illustrate the basic idea of the present invention, and the following embodiments and the features in the embodiments may be combined with each other provided they do not conflict.
Referring to FIGS. 1 to 9, the present invention provides a pulse wave signal quality screening method based on pulse wave energy features and a noninvasive blood glucose prediction method that combines pulse wave waveform features. The flow chart is shown in FIG. 1: high-frequency interference in the collected PPG signal is removed by wavelet threshold denoising; the pulse wave signal quality is preliminarily evaluated using the pulse wave energy features; a threshold check is then applied to the pulse period detected with the differential threshold method, and pulse wave signals satisfying the threshold are taken as high-quality PPG signals. Finally, the waveform features of the pulse wave are extracted and the energy features of the signal are added to form the input features of the PPG signal, as shown in Table 3. To reduce the dimension of the feature vector, each feature is compared against the invasive blood glucose value, the 10 features with the highest correlation ranking are taken as the input features, and a noninvasive blood glucose prediction model is finally constructed with a particle swarm BP neural network.
Signals from portable devices are often distorted or even missing. Accurately judging signal quality and removing invalid and low-quality signals can greatly improve the utilization of PPG signals and the accuracy of subsequent algorithms. Here, PPG signal quality is evaluated from the energy features and pulse wave waveform features of the PPG signal, so that signals can be screened quickly and conveniently.
Two kinds of energy features of the PPG signal are extracted in this method: first, the Kaiser-Teager energy operator, which tracks the instantaneous energy distribution of the signal; second, the logarithmic energy entropy feature, which captures the overall variation of the signal. The two kinds of energy features are combined, a signal quality classification data set is constructed by manual labeling, and a model for preliminary PPG signal quality classification is trained with an SVM classifier. The specific implementation steps are as follows:
(1) 324 PPG signals are manually selected and their quality is labeled by hand: PPG signals with a clear waveform and no obvious distortion are given label 1, representing high-quality PPG signals, whereas heavily distorted PPG signals are given label 0, representing low-quality PPG signals. Of these, 90 are high-quality PPG signals and 234 are low-quality PPG signals;
(2) The energy features of the selected PPG signals are extracted as the input features of the SVM quality classifier, with 70% of the data used as the training set and 30% as the test set;
(3) The performance of the SVM classifier is evaluated on the test set. Because the data samples are imbalanced, several indexes are needed to compare classifier performance; they are mainly derived from the confusion matrix (the binary confusion matrix is shown in Table 1);
Table 1. Binary classification confusion matrix
The performance indexes of the binary classifier are calculated from the confusion matrix as shown in Table 2:
Table 2. Binary classifier index calculations
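The contents of Tables 1 and 2 are not reproduced in this text, so the sketch below uses the standard definitions of indexes that are commonly derived from a binary confusion matrix. Which of these the source actually reports is an assumption, as is the function name.

```python
def binary_metrics(tp, fn, fp, tn):
    """Common indexes computed from binary confusion-matrix counts."""
    total = tp + fn + fp + tn
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0       # sensitivity
    specificity = tn / (tn + fp) if tn + fp else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {
        "accuracy": (tp + tn) / total,
        "precision": precision,
        "recall": recall,
        "specificity": specificity,
        "f1": f1,
    }
```

With 90 positive and 234 negative samples, accuracy alone can look good even when most positives are missed, which is why recall, specificity, and F1 are worth reporting alongside it.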
When the positive and negative samples are imbalanced, another evaluation method is needed: the ROC curve. The ROC curve remains essentially unchanged under class imbalance, for example when, as in this method, there are far more negative samples than positive samples. It is a composite index that reflects sensitivity and specificity as continuous variables. The ROC curve is computed from the following two variables:
$$\mathrm{FPR} = \frac{FP}{FP + TN}, \qquad \mathrm{TPR} = \frac{TP}{TP + FN}$$
where TP, FP, TN, and FN are the numbers of true positives, false positives, true negatives, and false negatives in the confusion matrix. FPR is the proportion of all negative samples that are predicted as positive, also called the false positive rate; TPR is the proportion of all positive samples that are predicted as positive, also called the true positive rate. A classification algorithm generally outputs a probability value, and a threshold can be set: a prediction above the threshold is assigned to one class, and a prediction at or below it to the other. Different thresholds therefore give different false positive and true positive rates; plotting them gives the ROC curve, and the area under the ROC curve is the AUC score. The performance indexes of the classifier are shown in Table 3, and the ROC curve is shown in FIG. 3.
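The threshold sweep described above can be sketched as follows, with the AUC computed by the trapezoidal rule. The function names are hypothetical, labels are assumed to be 0 or 1, and both classes must be present in `labels`.

```python
def roc_points(labels, scores):
    """Return ROC points (FPR, TPR), sweeping the threshold from high to low."""
    pos = sum(labels)
    neg = len(labels) - pos
    points = [(0.0, 0.0)]
    for thr in sorted(set(scores), reverse=True):
        # Samples scoring at or above the threshold are predicted positive.
        tp = sum(1 for y, s in zip(labels, scores) if s >= thr and y == 1)
        fp = sum(1 for y, s in zip(labels, scores) if s >= thr and y == 0)
        points.append((fp / neg, tp / pos))
    return points


def auc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    return sum((x1 - x0) * (y0 + y1) / 2.0
               for (x0, y0), (x1, y1) in zip(points, points[1:]))
```

A classifier that ranks every positive sample above every negative one gives an AUC of 1.0, while a ranking that gets one of four positive-negative pairs wrong gives 0.75.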
Table 3. SVM quality classifier performance indexes
The data set used in this embodiment is from the open source data set of BIG idaeas, and participants continuously collect PPG signals for 10 days through the wearable device employee E4 wristband, with a sampling rate of 64Hz, and synchronously collect pulse wave signals at 5 minute intervals using the Dexcom 6 continuous blood glucose monitor. Firstly, pulse wave data of 30 seconds before blood sugar acquisition are extracted through the time of blood sugar acquisition each time, a preliminary data set is constructed, and then the algorithm described in the method is carried out. Acquisition the raw PPG signal is shown in fig. 4. Most pulse wave signals have more noise, baseline drift, distortion and the like, and PPG signal quality judgment is needed to carry out subsequent processing.
The preprocessing stage of the signal comprises: firstly, carrying out wavelet threshold processing on an original acquired PPG signal to remove high-frequency noise of the original signal; then dividing the signal quality by an SVM signal quality classifier, and reserving PPG signals classified into high quality; obtaining pulse period of PPG signal by peak positioned by differential threshold method, retaining PPG signal meeting the requirement of set threshold, and eliminating others to obtain high quality PPG signal, so as to facilitate feature extraction and improve accuracy of model; finally, fitting the wave trough of the PPG signal through the cubic spline difference value to obtain a base line, subtracting the base line drift to obtain a denoised and base drift removed PPG signal, as shown in figure 5.
After the pulse wave signals are preprocessed, a total of 5752 PPG signals meeting the quality requirements are obtained. The troughs of each PPG signal are located by the differential threshold method, and the signal is divided at the troughs into single pulse waves. Whether the length of each single pulse wave meets the set threshold is then judged: pulse waves that do not meet it are removed, and the pulse wave waveform features of those that do are extracted, as shown in Table 5. The energy features of the PPG signal, including the Kaiser-Teager features and the logarithmic energy entropy features, are also extracted, as shown in Table 4. Together these form the input feature vector, which has 41 dimensions in total.
The extracted feature vector has a relatively large dimension (41). Many of the features are linearly correlated with one another or have little correlation with blood glucose, so feature screening is needed to improve the accuracy and stability of the model. The method adopts xgboost to screen features by importance. The screening steps are as follows:
(1) Normalize the extracted features and construct a model on the full feature vector using xgboost;
(2) Add features one at a time in descending order of feature importance, find the point at which model precision and feature quantity are balanced, and finally obtain a model with 10 features.
For gradient boosting algorithms, once the boosted trees have been created, an importance score for each attribute can be obtained relatively directly. The score measures the value of a feature in constructing the boosted decision trees in the model: the more a feature is used to construct the decision trees, the higher its relative importance. The top 10 features obtained using xgboost are shown in FIG. 7, where the feature numbers are the same as in Tables 4 and 5.
TABLE 4 Pulse wave energy features
TABLE 5 Pulse wave waveform features
To determine how many feature parameters to select, the features are added one at a time in order of importance ranking, a model is constructed to check the effect, and the balance point beyond which adding features no longer improves model precision is found, giving the optimal feature combination. Model training is then carried out with a particle swarm BP neural network on the data set of this feature combination. Because the BP neural network easily falls into a local optimum, a small amount of data (5% of the data set) is selected for the particle swarm to optimize the initial weights of the BP neural network, and the remaining data (95% of the data set) are divided into a training set and a test set at a ratio of 7:3. The BP neural network with particle-swarm-initialized weights is then trained using stochastic gradient descent. The specific steps of training the particle swarm BP neural network are as follows:
(1) First, determine the architecture of the BP neural network according to the number of selected features. The BP neural network used by the method has a single hidden layer: the input layer size N_in equals the number of selected features, the output layer size N_out = 1, and the hidden layer size N_hidden is determined by the input layer size as N_hidden = 2×N_in + 1;
(2) Initialize the particles according to the BP neural network architecture determined in step (1), where each particle represents all the weights of the BP neural network;
(3) Set the fitness function to the root mean square error between the true blood glucose and the output of the BP neural network whose weights are given by the particle;
(4) Iterate the particle swarm algorithm until the error meets the requirement or the number of iterations reaches the upper limit, obtaining the particle-swarm-optimized BP neural network weights;
(5) Train the BP neural network initialized with the particle-swarm-optimized weights on the remaining data set to obtain the PSO-BP neural network.
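The five steps above can be sketched in NumPy as follows. This is a minimal illustration under stated assumptions, not the embodiment's implementation: the tanh activation, the swarm size, the inertia and acceleration constants, the learning rate, the synthetic data and all function names are hypothetical, and plain full-batch gradient descent stands in for the stochastic gradient descent named in the text.

```python
import numpy as np

rng = np.random.default_rng(0)

def unpack(w, n_in, n_hid):
    """Split a flat particle into W1, b1, W2, b2 of a single-hidden-layer net."""
    i = n_in * n_hid
    return w[:i].reshape(n_in, n_hid), w[i:i + n_hid], w[i + n_hid:i + 2 * n_hid], w[-1]

def forward(w, X, n_in, n_hid):
    W1, b1, W2, b2 = unpack(w, n_in, n_hid)
    H = np.tanh(X @ W1 + b1)          # hidden activations (tanh is an assumption)
    return H, H @ W2 + b2

def rmse(w, X, y, n_in, n_hid):
    return float(np.sqrt(np.mean((forward(w, X, n_in, n_hid)[1] - y) ** 2)))

def pso_initial_weights(X, y, n_in, n_hid, n_particles=20, n_iter=50,
                        inertia=0.7, c1=1.5, c2=1.5):
    """Steps (2)-(4): each particle is one full weight vector; fitness is RMSE."""
    dim = n_in * n_hid + 2 * n_hid + 1
    pos = rng.normal(0.0, 0.5, size=(n_particles, dim))
    vel = np.zeros_like(pos)
    pbest = pos.copy()
    pbest_fit = np.array([rmse(p, X, y, n_in, n_hid) for p in pos])
    gbest = pbest[np.argmin(pbest_fit)].copy()
    history = [float(pbest_fit.min())]
    for _ in range(n_iter):
        r1, r2 = rng.random(pos.shape), rng.random(pos.shape)
        vel = inertia * vel + c1 * r1 * (pbest - pos) + c2 * r2 * (gbest - pos)
        pos = pos + vel
        fit = np.array([rmse(p, X, y, n_in, n_hid) for p in pos])
        better = fit < pbest_fit
        pbest[better], pbest_fit[better] = pos[better], fit[better]
        gbest = pbest[np.argmin(pbest_fit)].copy()
        history.append(float(pbest_fit.min()))
    return gbest, history

def train_gradient_descent(w, X, y, n_in, n_hid, lr=0.01, epochs=200):
    """Step (5): full-batch gradient descent on the mean squared error."""
    w = w.copy()
    for _ in range(epochs):
        _, _, W2, _ = unpack(w, n_in, n_hid)
        H, out = forward(w, X, n_in, n_hid)
        err = out - y
        dH = np.outer(err, W2) * (1.0 - H ** 2)   # back-propagated through tanh
        grad = np.concatenate([(X.T @ dH).ravel(), dH.sum(axis=0),
                               H.T @ err, [err.sum()]]) * (2.0 / len(y))
        w -= lr * grad
    return w

# Synthetic stand-in for the 10 screened features (hypothetical data).
n_in = 10
n_hid = 2 * n_in + 1                      # step (1): N_hidden = 2 x N_in + 1
X = rng.normal(size=(400, n_in))
y = X[:, 0] - 0.5 * X[:, 1] + 0.1 * rng.normal(size=400)
n_pso = int(0.05 * len(X))                # 5% of the data for the particle swarm
X_rest, y_rest = X[n_pso:], y[n_pso:]
n_train = int(0.7 * len(X_rest))          # 7:3 split of the remaining 95%
w0, history = pso_initial_weights(X[:n_pso], y[:n_pso], n_in, n_hid)
w = train_gradient_descent(w0, X_rest[:n_train], y_rest[:n_train], n_in, n_hid)
test_rmse = rmse(w, X_rest[n_train:], y_rest[n_train:], n_in, n_hid)
```

The swarm only has to find a reasonable starting region of weight space; the gradient-based training that follows does the fine adjustment, which is why a small subset of the data suffices for the particle swarm stage.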
The data set after feature screening is used to train the PSO-BP model. Model performance is then evaluated by mean absolute error (MAE), root mean square error (RMSE) and mean square error (MSE), as shown in Table 6; the accuracy of the noninvasive blood glucose estimation model from a medical standpoint is evaluated by Clarke error grid analysis; and the consistency between the estimated and real blood glucose values is judged by Bland-Altman consistency analysis.
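For reference, the error indexes and the Bland-Altman statistics named above can be computed as in the following minimal sketch. The function names and the four glucose values are hypothetical, and the 1.96 standard deviation limits of agreement are the conventional choice rather than something stated in the source.

```python
import numpy as np

def regression_metrics(y_true, y_pred):
    """MAE, MSE and RMSE between reference and estimated values."""
    err = np.asarray(y_pred, dtype=float) - np.asarray(y_true, dtype=float)
    mse = float(np.mean(err ** 2))
    return {"MAE": float(np.mean(np.abs(err))), "MSE": mse, "RMSE": float(np.sqrt(mse))}

def bland_altman(y_true, y_pred):
    """Bias and 95% limits of agreement (bias +/- 1.96 SD of the differences)."""
    diff = np.asarray(y_pred, dtype=float) - np.asarray(y_true, dtype=float)
    bias = float(np.mean(diff))
    sd = float(np.std(diff, ddof=1))      # sample standard deviation
    return bias, bias - 1.96 * sd, bias + 1.96 * sd

# Hypothetical reference and estimated glucose values, for illustration only.
reference = [100.0, 120.0, 140.0, 160.0]
estimated = [110.0, 115.0, 150.0, 155.0]
metrics = regression_metrics(reference, estimated)
bias, lower, upper = bland_altman(reference, estimated)
```

In a Bland-Altman plot the differences are drawn against the mean of the two measurements, with horizontal lines at the bias and at the two limits of agreement.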
TABLE 6 PSO-BP model performance index
Finally, for the model built on the screened-feature data set by the particle-swarm-optimized BP neural network, the Clarke error grid analysis is shown in FIG. 8 and the Bland-Altman consistency analysis is shown in FIG. 9.
Finally, it is noted that the above embodiments are only for illustrating the technical solution of the present invention and not for limiting the same, and although the present invention has been described in detail with reference to the preferred embodiments, it should be understood by those skilled in the art that modifications and equivalents may be made thereto without departing from the spirit and scope of the present invention, which is intended to be covered by the claims of the present invention.

Claims (10)

1. The blood glucose prediction model construction method based on the signal energy characteristics and the pulse period is characterized by comprising the following steps of:
S1: removing high-frequency noise from the collected PPG signals;
S2: preliminary screening of PPG signal quality: extracting Kaiser-Teager energy features and logarithmic energy entropy features of the PPG signal denoised in step S1 by using the Kaiser-Teager operator and an entropy method, and inputting these features into an SVM classifier to divide the signals into good signal quality and bad signal quality;
S3: accurately screening the good-quality PPG signals preliminarily screened in step S2, identifying signals whose pulse wave interval period does not accord with the threshold setting, and removing them from the data set;
S4: extracting pulse wave waveform features, autoregressive coefficients, heart rate and signal energy features from the signals screened in step S3, 39 features in total, and combining them with 2 physiological-parameter features of the human body to form a 41-dimensional feature vector;
S5: ranking the features by the importance obtained from an xgboost regressor, and sequentially adding the features of high importance to construct a blood glucose estimation data set;
S6: inputting the data set constructed in step S5 into a particle swarm BP neural network to construct a noninvasive blood glucose estimation model.
2. The method according to claim 1, wherein in step S2, the Kaiser-Teager energy features are extracted to determine the instantaneous energy distribution of the signal and thereby indicate whether the signal is noisy or clean; the sampling rate of the PPG signal is 64 Hz, the continuous glucose monitor collects a blood glucose value once every 5 minutes, and the PPG signal of the 30 seconds before each blood glucose collection is extracted to correspond to that blood glucose value; a segment of PPG signal S_W contains 1920 points and is divided into frames of length L_frame = 64, giving frame signals S_f(τ, n), where τ indexes the samples within each frame (τ = 1, …, L_frame) and n denotes the frame number (n = 1, …, 30); the Kaiser-Teager feature KTE(τ, n) is calculated as follows:
for the KTE(τ, n) calculated in each frame (τ = 2, 3, …, L_frame − 1), the mean, variance, percentile and skewness are calculated, and each statistic is averaged over all frames (n = 1, 2, …, 30) to obtain 4 Kaiser-Teager energy features.
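The formula image for KTE(τ, n) is not reproduced in this text. A minimal sketch of the feature computation is given below, assuming the standard discrete Teager-Kaiser operator KTE(τ, n) = S_f(τ, n)² − S_f(τ − 1, n)·S_f(τ + 1, n), which is consistent with the stated index range τ = 2, …, L_frame − 1. The function name and the choice of the 50th percentile are hypothetical, since the source does not say which percentile is used.

```python
import numpy as np

def kte_features(segment, frame_len=64, percentile=50):
    """Four frame-averaged statistics of the Teager-Kaiser energy of a PPG segment."""
    x = np.asarray(segment, dtype=float)
    n_frames = len(x) // frame_len
    frames = x[:n_frames * frame_len].reshape(n_frames, frame_len)
    # Assumed operator: x[t]^2 - x[t-1] * x[t+1] on the interior samples of each frame.
    kte = frames[:, 1:-1] ** 2 - frames[:, :-2] * frames[:, 2:]
    mean = kte.mean(axis=1)
    var = kte.var(axis=1)
    pct = np.percentile(kte, percentile, axis=1)   # percentile choice is an assumption
    std = kte.std(axis=1)
    centered = kte - mean[:, None]
    skew = (centered ** 3).mean(axis=1) / np.where(std > 0, std, 1.0) ** 3
    # Average each per-frame statistic over all frames.
    return np.array([mean.mean(), var.mean(), pct.mean(), skew.mean()])

# A 30 s segment at 64 Hz (1920 samples) of a clean 1.2 Hz sinusoid.
omega = 2 * np.pi * 1.2 / 64
segment = np.sin(omega * np.arange(1920))
features = kte_features(segment)
```

For a clean sinusoid the operator returns the constant sin²(ω), so the variance feature is close to zero; noise and motion artifacts make the energy fluctuate, which is what lets these statistics separate clean from noisy segments.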
3. The method according to claim 2, wherein in step S2, the logarithmic energy entropy feature is a time-domain entropy measure, calculated from a full-band energy spectrum:
and the statistical features of the logarithmic energy entropy sequence are calculated to obtain its mean, variance and percentile, giving 3 features.
4. The method for constructing a blood glucose prediction model according to claim 3, wherein step S3 specifically comprises: obtaining the valleys of the preliminarily screened good-quality PPG signal by the differential threshold method, fitting the baseline drift of the PPG signal by cubic spline interpolation, and subtracting the baseline drift from the original signal to obtain the PPG signal with baseline drift removed.
5. The method for constructing a blood glucose prediction model according to claim 3 or 4, wherein step S3 specifically comprises: locating the peaks and valleys of the pulse wave by the differential threshold method, calculating the peak intervals, the valley intervals and the numbers of peaks and valleys, setting the threshold to the heart rate range (50, 140), and removing signals that do not meet the threshold; locating by the differential threshold method comprises the following steps:
(1) obtaining the differential signal of the PPG signal and setting the part smaller than zero to 0 to obtain PPG_diff;
(2) acquiring the maximum coordinates peaks of PPG_diff and calculating diff_mean as the threshold condition for judging whether a maximum is accepted;
(3) traversing the maximum coordinates peaks: if the differential signal satisfies PPG_diff(peaks(i)) > diff_mean and the difference between successive maximum coordinates, peaks(i) − peaks(i−1), lies within the required heart rate range, adding the current peaks(i) to the final result, to obtain the screened maximum coordinates peaks2;
(4) since the extremum coordinates peaks2 are offset from the peaks of the original PPG signal, searching the 1/3 heart rate period that follows each differential extremum in peaks2 to obtain the peak coordinates rpeaks;
(5) removing duplicates from rpeaks to finally obtain the peaks of the PPG signal;
(6) searching the 1/2 period that precedes each peak in rpeaks and taking the coordinate of the minimum value to obtain the valleys.
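A simplified sketch of steps (1) to (6) is given below. Several details are assumptions, because the threshold formula in step (2) is not reproduced in this text: diff_mean is taken to be the mean of the rectified difference signal, only the lower bound of the beat interval is enforced in step (3), and the period is estimated as the median spacing of the accepted differential maxima. The function name and the sinusoidal test signal are hypothetical.

```python
import numpy as np

def locate_peaks_valleys(ppg, fs=64, hr_range=(50, 140)):
    """Return (rpeaks, valleys) sample indices of a PPG segment."""
    ppg = np.asarray(ppg, dtype=float)
    min_gap = fs * 60.0 / hr_range[1]      # shortest plausible beat interval
    max_gap = fs * 60.0 / hr_range[0]      # longest plausible beat interval
    # (1) first difference with the negative part set to zero
    d = np.diff(ppg)
    d[d < 0] = 0.0
    # (2) local maxima of the rectified difference; the threshold is ASSUMED
    #     to be the mean of the rectified difference signal
    cand = np.flatnonzero((d[1:-1] > d[:-2]) & (d[1:-1] >= d[2:])) + 1
    diff_mean = d.mean()
    # (3) keep maxima above the threshold that are not too close together
    #     (simplification: only the lower bound of the interval is enforced)
    peaks2 = []
    for c in cand:
        if d[c] > diff_mean and (not peaks2 or c - peaks2[-1] >= min_gap):
            peaks2.append(int(c))
    if not peaks2:
        return np.array([], dtype=int), np.array([], dtype=int)
    period = int(np.median(np.diff(peaks2))) if len(peaks2) > 1 else int(max_gap)
    # (4) the steepest upslope precedes the pulse peak, so look for the signal
    #     maximum in the 1/3 period that follows each differential maximum
    rpeaks = [c + int(np.argmax(ppg[c:c + period // 3 + 1])) for c in peaks2]
    # (5) remove duplicates
    rpeaks = np.unique(rpeaks)
    # (6) the valley is the signal minimum in the 1/2 period before each peak
    valleys = np.array([max(0, r - period // 2)
                        + int(np.argmin(ppg[max(0, r - period // 2):r + 1]))
                        for r in rpeaks])
    return rpeaks, valleys

# Synthetic check (hypothetical): 10 s of a 75 beats-per-minute sinusoid at 64 Hz.
fs = 64
t = np.arange(10 * fs) / fs
ppg = np.sin(2 * np.pi * 1.25 * t)
rpeaks, valleys = locate_peaks_valleys(ppg, fs=fs)
heart_rate = 60.0 * fs / np.median(np.diff(rpeaks))
```

The heart rate estimated from the peak spacing is what the (50, 140) range check in claim 5 would then be applied to.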
6. The method for constructing a blood glucose prediction model according to claim 5, wherein in step S4, the pulse wave waveform features are extracted as follows: first, the pulse wave signal is divided into signals of single pulse wave periods according to the valley sequence obtained by the differential threshold method in step S3; then, pulse wave waveform features are extracted from the signal of each pulse wave period, giving 25 features that form a feature vector.
7. The method for constructing a blood glucose prediction model according to claim 1, wherein in step S4, the autoregressive coefficients are extracted by an autoregressive model; specifically, the autoregressive model predicts the value of the current point from the p preceding points of the pulse wave, and the coefficients of the autoregressive model are obtained to characterize the pulse wave; the autoregressive model is described by the following formula:
wherein S(τ, n) represents the τ-th value of the n-th frame of the PPG signal, e(n) represents the error between the autoregressive model prediction and the true value, p represents the regression order, AR_i represents the coefficients of the autoregressive model, and b represents the bias term.
8. The method according to claim 1, wherein in step S4, the 41-dimensional feature vector includes: 4 Kaiser-Teager energy features, 3 statistics of the logarithmic energy entropy sequence, 25 pulse wave waveform features, 6 autoregressive coefficient features, 1 heart rate feature, and 2 physiological features, namely gender and glycosylated hemoglobin.
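The autoregressive formula itself is not reproduced in this text. The sketch below assumes the standard form S(τ, n) = Σ_{i=1..p} AR_i · S(τ − i, n) + b + e(n), which matches the symbol definitions above, and estimates the coefficients by ordinary least squares. The function name and the noise-free test sequence are hypothetical; the default order p = 6 follows the 6 autoregressive coefficient features listed in the claims.

```python
import numpy as np

def ar_coefficients(x, p=6):
    """Least-squares estimate of AR coefficients and bias for one signal frame."""
    x = np.asarray(x, dtype=float)
    # Column i holds the signal delayed by i samples, aligned with targets x[p:].
    lagged = np.column_stack([x[p - i:len(x) - i] for i in range(1, p + 1)])
    design = np.column_stack([lagged, np.ones(len(x) - p)])
    solution, *_ = np.linalg.lstsq(design, x[p:], rcond=None)
    return solution[:p], float(solution[p])

# Noise-free AR(2) sequence (hypothetical): x[t] = 1.5 x[t-1] - 0.9 x[t-2] + 0.2
x = np.zeros(60)
x[0], x[1] = 1.0, 0.5
for k in range(2, len(x)):
    x[k] = 1.5 * x[k - 1] - 0.9 * x[k - 2] + 0.2
coeffs, bias_term = ar_coefficients(x, p=2)
```

On a real PPG frame the residual e(n) is not zero, and the fitted coefficients summarize the shape of the waveform in a handful of numbers.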
9. The method of constructing a blood glucose prediction model according to claim 8, wherein step S5 specifically comprises: constructing an xgboost regressor to obtain the importance ranking of the features, sequentially adding the features, and checking the model indexes on the data set formed by the corresponding features until the performance indexes of the model no longer improve, finally obtaining a data set of 10 features; the feature importance ranking method comprises the following steps:
S51: extracting the feature vectors of 41 features from the screened signals to construct a data set;
S52: applying Z-score normalization to the data set;
S53: dividing the normalized data into a training set, a validation set and a test set at a ratio of 7:1:2, obtaining feature importance scores through xgboost, and ranking the features from high to low by importance score;
S54: sequentially adding high-ranking features according to the feature ranking obtained in step S53, and constructing an xgboost model for each number of features to check model precision, so as to obtain the feature combination with the best model precision.
10. The method of constructing a blood glucose prediction model according to claim 9, wherein step S6 specifically comprises:
S61: constructing a particle swarm BP neural network on the data set constructed from the importance ranking in step S54, randomly selecting 5% of the data set, and optimizing the weights of the BP neural network with a particle swarm algorithm, taking the error between the blood glucose predicted by the BP neural network and the real blood glucose as the fitness function;
S62: using the optimized weights as the initial weights of the BP neural network, dividing the remaining 95% of the data set into a training set, a validation set and a test set at a ratio of 7:1:2, and training the BP neural network to obtain the blood glucose estimation model.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310227933.3A CN116451110A (en) 2023-03-10 2023-03-10 Blood glucose prediction model construction method based on signal energy characteristics and pulse period

Publications (1)

Publication Number Publication Date
CN116451110A true CN116451110A (en) 2023-07-18

Family

ID=87122783

Country Status (1)

Country Link
CN (1) CN116451110A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116803344A (en) * 2023-07-27 2023-09-26 迈德医疗科技(深圳)有限公司 Blood glucose classification method and system based on multi-norm clustering and double-layer discrete network
CN116803344B (en) * 2023-07-27 2024-02-13 迈德医疗科技(深圳)有限公司 Blood glucose classification method and system based on multi-norm clustering and double-layer discrete network

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination