CN113974566B - COPD acute exacerbation prediction method based on time window - Google Patents

Publication number
CN113974566B
Authority
CN
China
Prior art keywords
days
model
features
time window
predicting
Legal status
Active
Application number
CN202111319613.8A
Other languages
Chinese (zh)
Other versions
CN113974566A (en)
Inventor
王琨
朱威
李强
陆银美
侯应伟
Current Assignee
Wuxi Qiyi Medical Technology Co ltd
Original Assignee
Wuxi Qiyi Medical Technology Co ltd

Classifications

    • A: HUMAN NECESSITIES
    • A61: MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61B: DIAGNOSIS; SURGERY; IDENTIFICATION
    • A61B5/00: Measuring for diagnostic purposes; Identification of persons
    • A61B5/72: Signal processing specially adapted for physiological signals or for diagnostic purposes
    • A61B5/7271: Specific aspects of physiological measurement analysis
    • A61B5/7275: Determining trends in physiological measurement data; Predicting development of a medical condition based on physiological measurements, e.g. determining a risk factor

Landscapes

  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Engineering & Computer Science (AREA)
  • Pathology (AREA)
  • Heart & Thoracic Surgery (AREA)
  • Psychiatry (AREA)
  • Signal Processing (AREA)
  • Physics & Mathematics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Biophysics (AREA)
  • Artificial Intelligence (AREA)
  • Biomedical Technology (AREA)
  • Physiology (AREA)
  • Medical Informatics (AREA)
  • Molecular Biology (AREA)
  • Surgery (AREA)
  • Animal Behavior & Ethology (AREA)
  • General Health & Medical Sciences (AREA)
  • Public Health (AREA)
  • Veterinary Medicine (AREA)
  • Investigating Or Analysing Biological Materials (AREA)

Abstract

The invention discloses a time window-based method for predicting acute exacerbation of COPD, comprising the following steps: S1, collecting lung indexes of a patient twice daily (morning and afternoon) using a small lung instrument, an electronic stethoscope and other equipment; S2, using lung monitoring indexes from a fixed time window so that the model supports prediction for days T+1, T+2 and T+3 while remaining usable; S3, extracting further features from these features to reflect changes in the patient's lung monitoring indexes; S4, taking the 7 days preceding one exacerbation as a positive sample; S5, performing a significance test on the features; S6, using the 235 significant features as model inputs to predict whether the patient has an exacerbation on day T+d (d = 1, 2, 3). Because the method uses lung monitoring data from a time window to predict whether the patient is at risk of an acute exacerbation of COPD, the patient can self-monitor at home, which is of important significance for the home care of COPD patients.

Description

COPD acute exacerbation prediction method based on time window
Technical Field
The invention relates to the technical field of COPD acute exacerbation prediction, in particular to a time window-based COPD acute exacerbation prediction method.
Background
Chronic obstructive pulmonary disease (hereinafter referred to as "COPD") is a disease involving chronic bronchitis, emphysema (which damages the alveolar structure), or both, in which the airways from the bronchi to the alveoli become obstructed; symptoms of this disease include long-term cough with sputum, respiratory distress due to the reduced air flow rate caused by airway obstruction, and frequent respiratory tract infections (such as the common cold); this disease causes high mortality worldwide and is increasing rapidly due to smoking, air pollution, etc.; the etiology of COPD is an abnormal chronic inflammatory response of the lung to toxic molecules or gases, together with a complex interplay of various contributing factors (such as smoking, urbanization, pollution, infectious respiratory disease, etc.).
Combinations of clinical parameters have been used to predict acute exacerbations of COPD in patients; however, these clinical parameters are not adequate for accurate prediction in individual cases; furthermore, although the likelihood of an acute exacerbation can be assessed from the above factors once a COPD patient goes to the hospital, COPD patients cannot predict this likelihood themselves; thus, patients who go to the hospital only after an acute exacerbation of COPD has occurred may have poor outcomes.
Although some literature currently predicts COPD acute exacerbation events using statistical or machine learning methods, this literature has the following drawbacks:
1. existing research mainly uses cross-sectional data and cannot use time-series data to give real-time early warning of a patient's COPD acute exacerbation events;
2. current research does not perform systematic feature mining to improve the predictive ability of the model;
3. current studies cannot make predictions for days T+1, T+2 and T+3; existing models can only predict the risk probability of a future exacerbation in a patient.
Disclosure of Invention
The invention aims to provide a time window-based COPD acute exacerbation prediction method that uses lung monitoring data from a time window to predict whether a patient is at risk of an acute exacerbation of COPD on day T+d (d = 1, 2, 3); the patient can self-monitor and self-warn at home, and the operation is simple, which is of important significance for the home care of COPD patients, so as to solve the problems set out in the background.
In order to achieve the above purpose, the present invention provides the following technical solution:
A method for predicting acute exacerbation of COPD based on a time window, comprising the following steps:
S1, collecting lung indexes of a patient twice daily (morning and afternoon) using a small lung instrument, an electronic stethoscope and other devices, such as FVC, FEV1, PEF and the maximum energy value of lung vibration collected by the stethoscope; FVC is measured with the "small lung instrument" and is the forced vital capacity, i.e. the maximum volume of air that can be exhaled as quickly as possible after a maximal inhalation; FEV1 is measured with the "small lung instrument" and is the volume of air exhaled in the first second of a maximal exhalation following a maximal deep inhalation; PEF is measured with the "small lung instrument" and is the instantaneous flow rate at the moment expiratory flow is fastest during the forced vital capacity measurement;
S2, so that the model supports prediction for days T+1, T+2 and T+3 while remaining usable, prediction uses the patient's lung monitoring indexes from a fixed time window (7 days); 32 indexes of the patient are collected by electronic equipment every morning and evening, and the indexes in the 7-day time window are distinguished by date and by whether they were taken in the morning, giving 32 × 7 × 2 = 448 features;
S3, extracting further features from these features to reflect changes in the patient's lung monitoring indexes; the data expansion includes: sliding-window statistics of the indexes, such as the 3-day mean/variance and the 5-day mean/variance; and differences in the daytime indexes; the number of extended features is 1744;
S4, taking the 7 days preceding one exacerbation as a positive sample; for negative samples, the time window must not fall within 30 days before or after an acute attack, so that the attack does not affect the monitoring indexes; negative samples are generated by sampling from all data with 7 consecutive days of observations;
S5, performing a significance test on the features, which finds 235 features significantly correlated with whether an exacerbation occurs on day T+d (d = 1, 2, 3);
S6, using the 235 significant features as model inputs to predict whether an exacerbation occurs on day T+d (d = 1, 2, 3); the model is a decision-tree-based ensemble model (XGBoost, LightGBM and CatBoost), and model performance is evaluated using 5-fold cross-validation.
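The raw feature layout of step S2 can be illustrated with a short Python sketch. It is only a sketch under stated assumptions: the session labels "am"/"pm", the feature naming scheme and the function name `flatten_window` are hypothetical and do not come from the patent; only the counts (32 indexes, 7 days, 2 sessions per day) are taken from the text.

```python
from typing import Dict, List, Tuple

N_INDICATORS = 32        # indexes per measurement session (from S2)
N_DAYS = 7               # fixed time window in days (from S2)
SESSIONS = ("am", "pm")  # two sessions per day; these labels are assumptions


def flatten_window(window: List[Dict[str, List[float]]]) -> Tuple[List[str], List[float]]:
    """Flatten a 7-day window into one named feature vector.

    window[d][s] holds the 32 index values of day d (0 = oldest) and session s.
    """
    if len(window) != N_DAYS:
        raise ValueError("expected exactly 7 days of data")
    names: List[str] = []
    values: List[float] = []
    for day, sessions in enumerate(window):
        for session in SESSIONS:
            readings = sessions[session]
            if len(readings) != N_INDICATORS:
                raise ValueError("expected 32 indexes per session")
            for i, value in enumerate(readings):
                # Hypothetical naming scheme: index number, day offset, session.
                names.append(f"idx{i:02d}_day{day}_{session}")
                values.append(value)
    return names, values


# Synthetic placeholder readings, only to show the resulting shape.
window = [
    {"am": [float(i) for i in range(N_INDICATORS)],
     "pm": [float(i) + 0.5 for i in range(N_INDICATORS)]}
    for _ in range(N_DAYS)
]
names, values = flatten_window(window)
print(len(values))  # 448 = 32 * 7 * 2
```

Keeping a name for every position makes it possible to trace a later significance test or model explanation back to a specific index, day and session.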
Preferably, the interpretation method of the XGBoost model comprises the following steps:
(1) Parsing the structure of the XGBoost model to obtain the tree structure of each individual tree;
(2) Inputting a test sample into the XGBoost model and, according to the tree structure, obtaining the effective leaf node reached by the test sample in each tree and the effective path leading to that leaf node;
(3) Calculating the contribution value of each feature along the effective path, and interpreting the XGBoost model according to the contribution values obtained.
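The three interpretation steps above can be illustrated on a single hand-written tree. This is a minimal sketch of path-based attribution (often called Saabas-style attribution), assuming each node stores the expected output of the samples that reach it; the tree, its feature names and its values are hypothetical, and the patent does not specify the exact contribution formula. For an ensemble, the per-tree contributions would simply be summed.

```python
from typing import Dict, Tuple

# Hypothetical toy tree: every node stores the expected output ("value") of
# the samples reaching it; internal nodes additionally store a split.
TREE = {
    "value": 0.10, "feature": "fev1_mean_3d", "threshold": 1.5,
    "left": {
        "value": 0.40, "feature": "vibration_energy_max", "threshold": 0.8,
        "left": {"value": 0.20},
        "right": {"value": 0.70},
    },
    "right": {"value": 0.02},
}


def explain(tree: dict, sample: Dict[str, float]) -> Tuple[float, float, Dict[str, float]]:
    """Walk the effective path of `sample` and attribute value changes to features.

    Returns (bias, leaf_value, contributions), where
    bias + sum(contributions.values()) == leaf_value.
    """
    node = tree
    bias = node["value"]
    contributions: Dict[str, float] = {}
    while "feature" in node:  # stop when an effective leaf node is reached
        feature = node["feature"]
        child = node["left"] if sample[feature] < node["threshold"] else node["right"]
        # The change in expected output at this split is credited to the split feature.
        contributions[feature] = contributions.get(feature, 0.0) + child["value"] - node["value"]
        node = child
    return bias, node["value"], contributions


bias, leaf, contrib = explain(TREE, {"fev1_mean_3d": 1.2, "vibration_energy_max": 0.9})
print(bias, leaf, contrib)
```

The decomposition is exact by construction: the root value plus all contributions along the path equals the value of the leaf the sample lands in.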
Preferably, XGBoost uses the Boosting ensemble method, is widely used in data mining, can handle missing values and supports regularization, and optimizes the cost function using second-order information to speed up convergence.
Preferably, LightGBM is a new gradient boosting tree framework that supports the GBDT, GBRT, GBM and MART algorithms and is a complete solution for distributed training based on the DMTK framework.
Preferably, the CatBoost algorithm includes: in the sensing period, the secondary user sends the energy value sensed in the channel to the fusion center as a feature energy vector, and the primary user intermittently sends information on whether it occupies the spectrum resource to the fusion center as a label, thereby completing the construction of the training data set. The model is then trained in the fusion center using the CatBoost algorithm.
Preferably, the CatBoost algorithm was proposed by Yandex; it optimizes the handling of categorical features during the training phase rather than in data preprocessing, and computes leaf node values when selecting the tree structure in a way that reduces overfitting.
Preferably, the prediction period uses eight days as a time window, labelled (T-7, T-6, T-5, T-4, T-3, T-2, T-1, T); for positive samples, day T is the onset date of the acute exacerbation; for negative samples, the time window must not fall within 7 days before or after an acute attack.
Preferably, in order to provide early warning, 3 groups of prediction tasks are set in advance within the prediction period:
(1) Task_1, using the observations from day T-5 to day T-1 to predict whether an acute exacerbation occurs on day T;
(2) Task_2, using the observations from day T-6 to day T-2 to predict whether an acute exacerbation occurs on day T;
(3) Task_3, using the observations from day T-7 to day T-3 to predict whether an acute exacerbation occurs on day T.
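The three tasks differ only in which five observation days they use, and therefore in how far ahead of day T the prediction is made. A small Python sketch of this bookkeeping follows; the dictionary and function names are hypothetical, while the day offsets are taken from the task definitions above.

```python
from typing import Dict, List, Tuple

# (first, last) observation day as an offset from day T, per the task definitions.
TASK_WINDOWS: Dict[str, Tuple[int, int]] = {
    "Task_1": (-5, -1),
    "Task_2": (-6, -2),
    "Task_3": (-7, -3),
}


def observation_days(task: str, t: int) -> List[int]:
    """Absolute day numbers observed by `task` when predicting day t."""
    first, last = TASK_WINDOWS[task]
    return list(range(t + first, t + last + 1))


def lead_time(task: str) -> int:
    """Number of days between the last observation and the predicted day T."""
    return -TASK_WINDOWS[task][1]


for task in TASK_WINDOWS:
    print(task, observation_days(task, t=10), "lead time:", lead_time(task))
```

Read this way, Task_1, Task_2 and Task_3 correspond to warning one, two and three days ahead of the last observed day.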
Preferably, in order to reduce the number of features, a Kolmogorov-Smirnov test is performed on the features; this test checks whether two distributions are identical, and is applied to the distribution of each feature on the positive samples versus its distribution on the negative samples, with a significance level of 0.05.
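The screening described above can be sketched in plain Python. The sketch computes the two-sample Kolmogorov-Smirnov statistic exactly and uses the common asymptotic approximation for the p-value; this approximation, the function names and the synthetic data are assumptions, and in practice a library routine such as `scipy.stats.ks_2samp` would be a natural choice.

```python
import math
from bisect import bisect_right
from typing import Dict, List, Sequence, Tuple


def ks_two_sample(a: Sequence[float], b: Sequence[float]) -> Tuple[float, float]:
    """Return (D, approximate p-value) of the two-sample KS test."""
    xs, ys = sorted(a), sorted(b)
    n1, n2 = len(xs), len(ys)
    d = 0.0
    for v in xs + ys:  # the largest ECDF gap is attained at a data point
        gap = abs(bisect_right(xs, v) / n1 - bisect_right(ys, v) / n2)
        d = max(d, gap)
    ne = n1 * n2 / (n1 + n2)
    lam = (math.sqrt(ne) + 0.12 + 0.11 / math.sqrt(ne)) * d
    if lam < 0.2:  # the series below converges slowly here; p is ~1 anyway
        return d, 1.0
    p = 2.0 * sum((-1) ** (j - 1) * math.exp(-2.0 * j * j * lam * lam)
                  for j in range(1, 101))
    return d, min(max(p, 0.0), 1.0)


def select_features(pos: Dict[str, List[float]], neg: Dict[str, List[float]],
                    alpha: float = 0.05) -> List[str]:
    """Keep features whose positive/negative distributions differ at level alpha."""
    return [name for name in pos if ks_two_sample(pos[name], neg[name])[1] < alpha]


# Synthetic example: one clearly shifted feature, one nearly identical feature.
pos = {"shifted": [i / 10 + 5 for i in range(30)], "same": [i / 10 for i in range(30)]}
neg = {"shifted": [i / 10 for i in range(30)], "same": [i / 10 + 0.05 for i in range(30)]}
selected = select_features(pos, neg)
print(selected)
```

Because each feature is tested separately, the screening step is independent of the downstream model and can be rerun for each task setting.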
In summary, by adopting the above technical solution, the beneficial effects of the invention are as follows:
1. the method can predict and give early warning of whether an acute exacerbation of COPD will occur on day T+d (d = 1, 2, 3), using lung monitoring data from a time window to predict whether the patient is at risk of an acute exacerbation of COPD on that day;
2. the invention uses feature engineering, and in its data construction method the selection of positive and negative samples combines data sampling with medical knowledge, so that the model performance is markedly improved;
3. the model is highly practical: the patient can self-monitor and self-warn at home, and the operation is simple, which is of important significance for the home care of COPD patients.
Drawings
FIG. 1 is a flow chart of the model construction of the present invention;
FIG. 2 is a ROC curve of five models based on LightGBM for task_1 setting of the present invention;
FIG. 3 is a ROC curve of five models based on LightGBM under task_2 setting of the present invention.
Detailed Description
To make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments are described clearly and completely below with reference to the accompanying drawings; the described embodiments are evidently only some, not all, of the embodiments of the present invention; the following detailed description of the embodiments provided in the accompanying drawings is therefore not intended to limit the scope of the invention as claimed, but merely represents selected embodiments of the invention; all other embodiments obtained by a person of ordinary skill in the art on the basis of these embodiments without inventive effort fall within the scope of protection of the invention;
The invention provides a method for predicting acute exacerbation of COPD based on a time window, as shown in FIGS. 1-3, comprising the following steps:
S1, collecting lung indexes of a patient twice daily (morning and afternoon) using a small lung instrument, an electronic stethoscope and other devices, such as FVC, FEV1, PEF and the maximum energy value of lung vibration collected by the stethoscope; FVC is measured with the "small lung instrument" and is the forced vital capacity, i.e. the maximum volume of air that can be exhaled as quickly as possible after a maximal inhalation; FEV1 is measured with the "small lung instrument" and is the volume of air exhaled in the first second of a maximal exhalation following a maximal deep inhalation; PEF is measured with the "small lung instrument" and is the instantaneous flow rate at the moment expiratory flow is fastest during the forced vital capacity measurement (the lung indexes are shown in Table 1);
S2, so that the model supports prediction for days T+1, T+2 and T+3 while remaining usable, prediction uses the patient's lung monitoring indexes from a fixed time window (7 days); 32 indexes of the patient are collected by electronic equipment every morning and evening, and the indexes in the 7-day time window are distinguished by date and by whether they were taken in the morning, giving 32 × 7 × 2 = 448 features;
S3, extracting further features from these features to reflect changes in the patient's lung monitoring indexes; the data expansion includes: sliding-window statistics of the indexes, such as the 3-day mean/variance and the 5-day mean/variance; and differences in the daytime indexes; the number of extended features is 1744;
S4, taking the 7 days preceding one exacerbation as a positive sample; for negative samples, the time window must not fall within 30 days before or after an acute attack, so that the attack does not affect the monitoring indexes; negative samples are generated by sampling from all data with 7 consecutive days of observations;
S5, performing a significance test on the features, which finds 235 features significantly correlated with whether an exacerbation occurs on day T+d (d = 1, 2, 3);
S6, using the 235 significant features as model inputs to predict whether an exacerbation occurs on day T+d (d = 1, 2, 3); the model is a decision-tree-based ensemble model (XGBoost, LightGBM and CatBoost), and model performance is evaluated using 5-fold cross-validation.
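The feature expansion of step S3 can be sketched for a single index as follows. Several choices here are assumptions rather than statements of the patent: the rolling statistics are computed on the mean of the two daily readings, the variance is the population variance, "difference in the daytime index" is read as the afternoon reading minus the morning reading, and all names are hypothetical.

```python
from statistics import mean, pvariance
from typing import Dict, List, Tuple


def rolling_stats(series: List[float], width: int) -> List[Tuple[float, float]]:
    """(mean, population variance) of every full window of `width` days."""
    return [
        (mean(series[end - width:end]), pvariance(series[end - width:end]))
        for end in range(width, len(series) + 1)
    ]


def extend_features(am: List[float], pm: List[float]) -> Dict[str, float]:
    """Derive sliding-window and within-day difference features for one index."""
    features: Dict[str, float] = {}
    daily = [(a + p) / 2 for a, p in zip(am, pm)]  # assumed daily summary
    for width in (3, 5):  # 3-day and 5-day windows, as named in S3
        for k, (m, v) in enumerate(rolling_stats(daily, width)):
            features[f"mean_{width}d_{k}"] = m
            features[f"var_{width}d_{k}"] = v
    for day, (a, p) in enumerate(zip(am, pm)):
        features[f"am_pm_diff_day{day}"] = p - a  # assumed reading of "daytime difference"
    return features


# Synthetic 7-day example for one index.
am = [1.0] * 7
pm = [3.0] * 7
feats = extend_features(am, pm)
print(len(feats))  # 5 + 5 (3-day) + 3 + 3 (5-day) + 7 differences = 23
```

Applying such a function to each of the 32 indexes yields a much larger candidate set, which is why a subsequent significance test is used to prune it.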
Specifically, the interpretation method of the XGBoost model comprises the following steps:
(1) Parsing the structure of the XGBoost model to obtain the tree structure of each individual tree;
(2) Inputting a test sample into the XGBoost model and, according to the tree structure, obtaining the effective leaf node reached by the test sample in each tree and the effective path leading to that leaf node;
(3) Calculating the contribution value of each feature along the effective path, and interpreting the XGBoost model according to the contribution values obtained.
Specifically, XGBoost uses the Boosting ensemble method, is widely used in data mining, can handle missing values and supports regularization, and optimizes the cost function using second-order information to speed up convergence.
Specifically, LightGBM is a new gradient boosting tree framework that supports the GBDT, GBRT, GBM and MART algorithms; owing to its fully greedy tree growth method and its histogram-based memory and computation optimizations, it is several times faster than existing gradient boosting tree implementations; it is a complete solution for distributed training based on the DMTK framework, and quickly became a common tool among data mining competitors after its release.
Specifically, the CatBoost algorithm includes: in the sensing period, the secondary user sends the energy value sensed in the channel to the fusion center as a feature energy vector, and the primary user intermittently sends information on whether it occupies the spectrum resource to the fusion center as a label, thereby completing the construction of the training data set. The model is then trained in the fusion center using the CatBoost algorithm.
Specifically, the CatBoost algorithm was proposed by Yandex; it optimizes the handling of categorical features during the training phase rather than in the data preprocessing phase, and computes leaf node values when selecting the tree structure in a way that reduces overfitting.
Specifically, positive samples are cut out using an eight-day time window as the prediction period, labelled (T-7, T-6, T-5, T-4, T-3, T-2, T-1, T); for positive samples, day T is the onset date of the acute exacerbation; for negative samples, the time window must not fall within 7 days before or after an acute attack.
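The negative-sample rule can be sketched as a search for fully observed windows that stay clear of every attack. The text gives both 30 days (step S4) and 7 days (the paragraph above) as the exclusion distance, so the sketch below takes it as a parameter; the function name, the inclusive treatment of the boundary and the synthetic data are assumptions.

```python
from typing import Iterable, List


def negative_windows(observed_days: Iterable[int], exacerbation_days: Iterable[int],
                     window: int = 7, margin: int = 30) -> List[int]:
    """Start days of all `window`-day runs usable as negative samples.

    A run qualifies if every day in it was observed and no day in it lies
    within `margin` days (inclusive, an assumption) of an exacerbation day.
    """
    observed = set(observed_days)
    attacks = list(exacerbation_days)
    starts: List[int] = []
    for start in sorted(observed):
        days = range(start, start + window)
        if not all(d in observed for d in days):
            continue  # the window has a gap in the observations
        if any(abs(d - e) <= margin for d in days for e in attacks):
            continue  # too close to an acute attack
        starts.append(start)
    return starts


# Synthetic example: 100 consecutive observed days, one attack on day 50.
observed = range(0, 100)
starts_30 = negative_windows(observed, exacerbation_days=[50], margin=30)
starts_7 = negative_windows(observed, exacerbation_days=[50], margin=7)
print(len(starts_30), len(starts_7))
```

A wider margin leaves fewer candidate windows, so the choice between 30 and 7 days trades cleaner negatives against sample size.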
Specifically, in order to provide early warning, 3 groups of prediction tasks are set in advance within the prediction period:
(1) Task_1, using the observations from day T-5 to day T-1 to predict whether an acute exacerbation occurs on day T;
(2) Task_2, using the observations from day T-6 to day T-2 to predict whether an acute exacerbation occurs on day T;
(3) Task_3, using the observations from day T-7 to day T-3 to predict whether an acute exacerbation occurs on day T.
Specifically, in order to reduce the number of features, a Kolmogorov-Smirnov test is performed on the features; this test checks whether two distributions are identical, and is applied to the distribution of each feature on the positive samples versus its distribution on the negative samples, with a significance level of 0.05.
Table 1: observed value feature names and their interpretation;
Table 2: top fifty features that pass the significance test, with their p-values;
Using stratified k-fold cross-validation (k = 5), the data were split into 5 folds, and each time divided 8:2 into a training set and a test set for training and testing the model. The evaluation indexes are sensitivity, specificity and AUC, where the threshold is chosen as the threshold at which sensitivity just exceeds 0.9, and the specificity is the specificity at that threshold. The models used are CatBoost, XGBoost and LightGBM, and the other hyperparameters are obtained by hyperparameter search with cross-validation; three tasks are set: Task_1, Task_2 and Task_3, and under each task the following models are set:
(1) M_all, trained with all features;
(2) M_sig, using all features that pass the significance test;
(3) M_sigSTE, using the electronic stethoscope features that pass the significance test;
(4) M_sigLSI, using the small lung instrument features that pass the significance test;
(5) M_sig50, using the 50 features with the lowest p-values that pass the significance test under the task setting;
(6) M_sig25, using the top 25 features that pass the significance test;
(7) M_orig, trained with all raw observations.
            Task_1    Task_2    Task_3
M_all       0.8135    0.8135    0.8135
M_sig       0.9268    0.9045    0.8887
M_sigSTE    0.9020    0.8845    0.8302
M_sigLSI    0.8279    0.7158    0.6617
M_sig50     0.8826    0.8000    0.8631
M_sig25     0.8173    0.8075    0.8816
M_orig      0.7361    0.7434    0.5782
Table 3. AUC mean score for cross-validation;
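The stratified 5-fold procedure behind such scores can be sketched in plain Python. The sketch assumes binary labels and a simple round-robin assignment within each class; the function names, the random seed and the synthetic label vector are hypothetical. Each of the five rounds then uses four folds for training and one for testing, which gives the 8:2 split mentioned in the text.

```python
import random
from typing import Iterator, List, Sequence, Tuple


def stratified_folds(labels: Sequence[int], k: int = 5, seed: int = 0) -> List[List[int]]:
    """Assign sample indices to k folds so each fold keeps the class ratio."""
    rng = random.Random(seed)
    folds: List[List[int]] = [[] for _ in range(k)]
    for cls in sorted(set(labels)):
        indices = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(indices)
        for position, index in enumerate(indices):
            folds[position % k].append(index)  # deal out round-robin per class
    return folds


def splits(labels: Sequence[int], k: int = 5, seed: int = 0) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (train_indices, test_indices) for each of the k rounds."""
    folds = stratified_folds(labels, k, seed)
    for held_out in range(k):
        test = sorted(folds[held_out])
        train = sorted(i for f in range(k) if f != held_out for i in folds[f])
        yield train, test


# Synthetic, imbalanced labels: 10 positive and 90 negative samples.
labels = [1] * 10 + [0] * 90
all_splits = list(splits(labels))
for train, test in all_splits:
    print(len(train), len(test), sum(labels[i] for i in test))
```

Stratification matters here because exacerbation windows are rare; without it, some test folds could contain very few positive samples and give unstable AUC estimates.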
The Task_1 setting had 123 significant features: 31 small lung instrument features and 92 electronic stethoscope features that passed the significance test.
The Task_2 setting had 134 significant features: 33 small lung instrument features and 101 electronic stethoscope features that passed the significance test. The Task_3 setting had 131 significant features: 28 small lung instrument features and 103 electronic stethoscope features that passed the significance test.
Table 3 reports the AUC mean score for cross-validation, where the model used was LightGBM. (1) Task_1 obtains a higher score, which is consistent with intuition (predicting one day ahead is simpler than predicting two or three days ahead);
(2) With only the features generated by the small lung instrument the score drops markedly, whereas with only the features generated by the stethoscope the performance remains good, indicating that the observation data of the electronic stethoscope have stronger discriminative and predictive power;
(3) Screening the features with the significance test gives a clear improvement over directly using the raw observations or all features;
(4) With only the top 50 or top 25 significant features, the model score drops somewhat, indicating that the fitting ability of the model decreases once the number of features is reduced. The ROC curves of the five LightGBM-based models under the Task_1 setting are shown in FIG. 2;
The scores in Table 3 were obtained with LightGBM. To verify performance under other models, the results under the XGBoost and CatBoost models are given below:
            Task_1    Task_2    Task_3
M_all       0.8772    0.8673    0.8142
M_sig       0.9181    0.8946    0.8233
M_sigSTE    0.9036    0.8792    0.8110
M_sigLSI    0.8279    0.7610    0.7000
M_sig50     0.8372    0.7831    0.8184
M_sig25     0.8177    0.8047    0.8203
M_orig      0.7812    0.7881    0.6659
Table 3-1. AUC mean score for cross-validation. The model used is XGBoost.
Table 3-2. AUC mean score for cross-validation. The model used is a catboost.
                  Sensitivity    Specificity    Probability threshold
Task_1 M_sig50    0.9043         0.7345         0.0113
Task_2 M_sig50    0.9043         0.7098         0.0091
Task_3 M_sig50    0.9043         0.6623         0.0042
Table 4. Sensitivity and specificity values of the optimal model M_sig for each task setting.
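How a probability threshold with sensitivity just above 0.9 could be picked is sketched below. The sketch reads the rule as "the highest cutoff at which sensitivity still exceeds 0.9", which is an interpretation of the text; the function name and the synthetic scores are assumptions, and a sample is predicted positive when its score is at or above the cutoff.

```python
from typing import Sequence, Tuple


def pick_threshold(scores: Sequence[float], labels: Sequence[int],
                   min_sensitivity: float = 0.9) -> Tuple[float, float, float]:
    """Return (threshold, sensitivity, specificity).

    Candidate cutoffs are tried from highest to lowest, so the first one whose
    sensitivity exceeds `min_sensitivity` also gives the best specificity
    among all cutoffs that satisfy the constraint.
    """
    n_pos = sum(1 for y in labels if y == 1)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("both classes are required")
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= threshold)
        sensitivity = tp / n_pos
        if sensitivity > min_sensitivity:
            tn = sum(1 for s, y in zip(scores, labels) if y == 0 and s < threshold)
            return threshold, sensitivity, tn / n_neg
    raise ValueError("no threshold reaches the requested sensitivity")


# Synthetic predicted probabilities: four positive and four negative samples.
scores = [0.9, 0.8, 0.7, 0.3, 0.6, 0.4, 0.2, 0.1]
labels = [1, 1, 1, 1, 0, 0, 0, 0]
threshold, sensitivity, specificity = pick_threshold(scores, labels)
print(threshold, sensitivity, specificity)
```

Fixing sensitivity first reflects the screening use case: missed exacerbations are treated as costlier than false alarms, and specificity is then reported at whatever cutoff that constraint implies.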
To examine the effect of different decision tree models on the prediction tasks, the following table reports model performance with the significant features under the three task settings. XGBoost, LightGBM and CatBoost, three of the most powerful gradient boosting algorithms based on decision trees, were compared; according to the experimental results, LightGBM performs best under all three task settings.
                LightGBM    CatBoost    XGBoost
Task_1 M_sig    0.9268      0.8852      0.9181
Task_2 M_sig    0.9045      0.8505      0.8946
Task_3 M_sig    0.8887      0.8722      0.8233
Table 5. Cross-validation average AUC of the three decision tree ensemble models (CatBoost, LightGBM, XGBoost), based on the best feature combination, M_sig, under each task setting.
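The quantity reported in these tables, the AUC averaged over folds, can be computed without any library. The sketch below uses the pairwise-ranking definition of AUC (the probability that a randomly chosen positive sample scores higher than a randomly chosen negative one, with ties counted as one half); the per-fold scores are synthetic and the function name is hypothetical.

```python
from statistics import mean
from typing import List, Sequence


def auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Area under the ROC curve via pairwise comparison of positives and negatives."""
    pos: List[float] = [s for s, y in zip(scores, labels) if y == 1]
    neg: List[float] = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes are required")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a win
    return wins / (len(pos) * len(neg))


# Synthetic (scores, labels) pairs standing in for two cross-validation folds.
fold_results = [
    ([0.9, 0.8, 0.3, 0.2], [1, 0, 1, 0]),
    ([0.7, 0.6, 0.4, 0.1], [1, 1, 0, 0]),
]
fold_aucs = [auc(s, y) for s, y in fold_results]
mean_auc = mean(fold_aucs)
print(fold_aucs, mean_auc)
```

The pairwise form is quadratic in the number of samples, which is acceptable for small clinical data sets; a sort-based rank formula gives the same value for larger ones.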
Example 2
Five-fold cross-validation was performed on Task_1 using the LightGBM model, with the AUC on each fold as follows:
Five-fold cross-validation was performed on Task_2 using the LightGBM model, with the AUC on each fold as follows:
Five-fold cross-validation was performed on Task_3 using the LightGBM model, with the AUC on each fold as follows:
Example 3
Five-fold cross-validation was performed on Task_1 using the LightGBM model, with the following average scores for ACC, precision, recall, F1 and AUC:
Table 5. Average scores of the various indicators cross-validated on Task_1.
Five-fold cross-validation was performed on Task_2 using the LightGBM model, with the following average scores for ACC, precision, recall, F1 and AUC:
            AUC       ACC       Precision    Recall    F1
M_all       0.8135    0.8694    0.5415       0.7142    0.5475
M_sig       0.9045    0.8665    0.5365       0.9142    0.6441
M_sigSTE    0.8845    0.9108    0.6758       0.5714    0.6112
M_sigLSI    0.7158    0.7509    0.2969       0.7428    0.4028
M_sig50     0.8000    0.8807    0.4988       0.6243    0.5275
M_sig25     0.8075    0.8950    0.4902       0.6571    0.5322
M_orig      0.7434    0.8423    0.7107       0.6       0.5572
Table 6. Average scores of the various indicators cross-validated on Task_2.
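For reference, the indicators in this table follow from the confusion counts of a single fold as sketched below; the function name and the example predictions are hypothetical. Because the table reports averages over five folds, the averaged F1 need not equal the F1 computed from the averaged precision and recall, which may be why some rows do not satisfy the F1 formula exactly.

```python
from typing import Dict, Sequence


def classification_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """ACC, precision, recall and F1 for binary labels (1 = exacerbation)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {
        "ACC": (tp + tn) / len(y_true),
        "precision": precision,
        "recall": recall,  # identical to sensitivity
        "F1": f1,
    }


# Synthetic single-fold example: 3 true exacerbations among 8 windows.
metrics = classification_metrics([1, 1, 1, 0, 0, 0, 0, 0], [1, 1, 0, 1, 0, 0, 0, 0])
print(metrics)
```

With imbalanced data, ACC can stay high even when recall is modest, so precision, recall and F1 are the more informative columns for comparing the models.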
Five-fold cross-validation was performed on Task_3 using the LightGBM model, with the following average scores for ACC, precision, recall, F1 and AUC:
Table 7. Average scores of the various indicators cross-validated on Task_3.
The foregoing is only a preferred embodiment of the present invention, but the scope of protection of the present invention is not limited thereto; any equivalent substitution or modification made by a person skilled in the art, within the technical scope disclosed by the present invention and according to its technical solution and inventive concept, shall fall within the scope of protection of the present invention.
It is noted that relational terms such as first and second, and the like are used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions; moreover, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus.

Claims (5)

1. A method for predicting acute exacerbation of COPD based on a time window, the method comprising the following steps:
S1, collecting lung indexes of a patient twice daily, in the morning and afternoon, using a small lung instrument, wherein the lung indexes comprise FVC, FEV1, PEF and the maximum lung vibration energy value collected by a stethoscope; FVC is measured with the "small lung instrument" and is the forced vital capacity, namely the maximum volume of air that can be exhaled as quickly as possible after a maximal inhalation; FEV1 is measured with the "small lung instrument" and is the volume of air exhaled in the first second of a maximal exhalation following a maximal deep inhalation; PEF is measured with the "small lung instrument" and is the instantaneous flow rate at the moment expiratory flow is fastest during the forced vital capacity measurement;
S2, so that the model supports prediction for days T+1, T+2 and T+3 while remaining usable, predicting from the patient's lung monitoring indexes in a fixed 7-day time window; 32 indexes of the patient are collected by electronic equipment every morning and evening, and the indexes in the 7-day time window are distinguished by date and by whether they were taken in the morning, giving 32 × 7 × 2 = 448 features;
S3, extracting further features from these features to reflect changes in the patient's lung monitoring indexes; the data expansion includes: sliding-window statistics of the indexes; and differences in the daytime indexes;
S4, taking one exacerbation together with the preceding 7 days as a positive sample; for negative samples, the time window must not fall within 30 days before or after an acute attack, so that the attack does not affect the monitoring indexes; negative samples are generated by sampling from all data with 7 consecutive days of observations;
S5, performing a significance test on the features, which finds 235 features significantly correlated with whether an exacerbation occurs on day T+d, wherein d = 1, 2, 3;
S6, using the 235 significant features as model inputs to predict whether an exacerbation occurs on day T+d; the model is a decision-tree-based ensemble model: XGBoost, LightGBM and CatBoost, and model performance is evaluated using 5-fold cross-validation;
positive samples are cut out using an eight-day time window as the prediction period, the eight days being labelled T-7, T-6, T-5, T-4, T-3, T-2, T-1 and T; for positive samples, day T is the onset date of the acute exacerbation; for negative samples, the time window must not fall within 7 days before or after an acute attack;
in order to provide early warning, 3 groups of prediction tasks are set in advance within the prediction period:
(1) Task_1, using the observations from day T-5 to day T-1 to predict whether an acute exacerbation occurs on day T;
(2) Task_2, using the observations from day T-6 to day T-2 to predict whether an acute exacerbation occurs on day T;
(3) Task_3, using the observations from day T-7 to day T-3 to predict whether an acute exacerbation occurs on day T;
in order to reduce the number of features, a Kolmogorov-Smirnov test is performed on the features; this test checks whether two distributions are identical, and is applied to the distribution of each feature on the positive samples versus its distribution on the negative samples, with a significance level of 0.05.
2. A method for predicting acute exacerbations of COPD based on a time window according to claim 1, wherein: the interpretation method of the XGBoost model comprises the following steps:
(1) Parsing the tree model meta-structure of the XGBoost model to obtain the tree structure of each individual tree;
(2) Inputting a test sample into the XGBoost model and, according to the tree structures, obtaining the effective leaf node corresponding to the test sample and the effective path in the tree that leads to that leaf node;
(3) Calculating the contribution value of each feature according to the effective path, and interpreting the XGBoost model according to the obtained contribution values.
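One common way to turn an effective path into feature contributions is to credit each split feature with the change in node value along the path. The sketch below illustrates that idea on a toy tree and is not necessarily the calculation used in the patent. The nested-dict tree format, the per-node `value` field, the feature names `spo2` and `resp_rate`, and the rule that a value below the threshold goes left are all assumptions made for this example; it does not call the XGBoost library.

```python
from typing import Any, Dict, List, Tuple

Node = Dict[str, Any]


def path_contributions(tree: Node, sample: Dict[str, float]) -> Tuple[float, Dict[str, float], float]:
    """Walk the effective path and credit each split feature with the value change.

    Returns (root value, per-feature contributions, leaf value).
    """
    contributions: Dict[str, float] = {}
    node = tree
    while "feature" in node:  # internal nodes carry a split feature
        feature = node["feature"]
        # Assumed convention: values below the threshold go to the left child.
        child = node["left"] if sample[feature] < node["threshold"] else node["right"]
        contributions[feature] = contributions.get(feature, 0.0) + child["value"] - node["value"]
        node = child
    return tree["value"], contributions, node["value"]


def ensemble_contributions(trees: List[Node], sample: Dict[str, float]) -> Tuple[float, Dict[str, float]]:
    """Sum the bias and per-feature contributions over every tree."""
    bias, total = 0.0, {}
    for tree in trees:
        root_value, contribs, _ = path_contributions(tree, sample)
        bias += root_value
        for feature, delta in contribs.items():
            total[feature] = total.get(feature, 0.0) + delta
    return bias, total


# Toy tree with hypothetical features; "value" is the expected output at each node.
TOY_TREE: Node = {
    "feature": "spo2", "threshold": 92.0, "value": 0.0,
    "left": {
        "feature": "resp_rate", "threshold": 24.0, "value": 0.5,
        "left": {"value": 0.25},
        "right": {"value": 0.75},
    },
    "right": {"value": -0.5},
}

if __name__ == "__main__":
    print(path_contributions(TOY_TREE, {"spo2": 90.0, "resp_rate": 26.0}))
```

By construction, the root value plus the sum of the contributions equals the value of the effective leaf, so the explanation accounts for the whole of each tree's output.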
3. A method for predicting acute exacerbations of COPD based on a time window according to claim 1, wherein: XGBoost is a Boosting ensemble method that is widely used in data mining; it can handle missing values and apply regularization, and it uses second-order information of the cost function to accelerate optimization.
4. A method for predicting acute exacerbations of COPD based on a time window according to claim 1, wherein: LightGBM is a new gradient boosting tree framework that supports the GBDT, GBRT, GBM and MART algorithms and is a complete solution for distributed training based on the DMTK framework.
5. A method for predicting acute exacerbations of COPD based on a time window according to claim 1, wherein: the Catboost algorithm includes: in the sensing period, the secondary user sends the energy values of the sensed channel to the fusion center as a feature energy vector, and the primary user intermittently sends information on whether the spectrum resource is occupied to the fusion center as a label, thereby completing construction of the training data set, and a model is trained by the Catboost algorithm at the fusion center.
CN202111319613.8A 2021-11-09 2021-11-09 COPD acute exacerbation prediction method based on time window Active CN113974566B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111319613.8A CN113974566B (en) 2021-11-09 2021-11-09 COPD acute exacerbation prediction method based on time window

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111319613.8A CN113974566B (en) 2021-11-09 2021-11-09 COPD acute exacerbation prediction method based on time window

Publications (2)

Publication Number Publication Date
CN113974566A CN113974566A (en) 2022-01-28
CN113974566B CN113974566B (en) 2023-09-19

Family

ID=79747333

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111319613.8A Active CN113974566B (en) 2021-11-09 2021-11-09 COPD acute exacerbation prediction method based on time window

Country Status (1)

Country Link
CN (1) CN113974566B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117894478B (en) * 2024-03-14 2024-05-28 天津市肿瘤医院(天津医科大学肿瘤医院) Informationized intelligent management method for severe cases of oncology department of severe cases of oncology

Citations (7)

Publication number Priority date Publication date Assignee Title
CN107451390A (en) * 2017-02-22 2017-12-08 Cc和I研究有限公司 System for predicting acute exacerbations in patients with chronic obstructive pulmonary disease
CN110123274A (en) * 2019-04-29 2019-08-16 上海电气集团股份有限公司 A kind of monitoring system of septicopyemia
CN110289061A (en) * 2019-06-27 2019-09-27 黎檀实 A kind of Time Series Forecasting Methods of the traumatic hemorrhagic shock condition of the injury
CN111657888A (en) * 2020-05-28 2020-09-15 首都医科大学附属北京天坛医院 Severe acute respiratory distress syndrome early warning method and system
CN113057588A (en) * 2021-03-17 2021-07-02 上海电气集团股份有限公司 Disease early warning method, device, equipment and medium
WO2021148967A1 (en) * 2020-01-23 2021-07-29 Novartis Ag A computer-implemented system and method for outputting a prediction of a probability of a hospitalization of patients with chronic obstructive pulmonary disorder
CN113469227A (en) * 2021-06-18 2021-10-01 南京润楠医疗电子研究院有限公司 Forced expiration total amount prediction method

Family Cites Families (1)

Publication number Priority date Publication date Assignee Title
US20150080671A1 (en) * 2013-05-29 2015-03-19 Technical University Of Denmark Sleep Spindles as Biomarker for Early Detection of Neurodegenerative Disorders

Also Published As

Publication number Publication date
CN113974566A (en) 2022-01-28

Similar Documents

Publication Publication Date Title
Botha et al. Detection of tuberculosis by automatic cough sound analysis
US10332638B2 (en) Methods and systems for pre-symptomatic detection of exposure to an agent
CN109166630B (en) Infectious disease data monitoring and processing method and system
CN111261282A (en) Sepsis early prediction method based on machine learning
EP2677927B1 (en) Respiration monitoring method and system
JP2002542868A (en) Air quality analysis method and apparatus based on human response and clustering method
CN108597601A (en) Diagnosis of chronic obstructive pulmonary disease auxiliary system based on support vector machines and method
CN101939738A (en) Method and apparatus for monitoring physiological parameter variability over time for one or more organs
CN106714682B (en) Device, system, method and computer program for assessing the risk of an exacerbation and/or hospitalization
CN112216402A (en) Epidemic situation prediction method and device based on artificial intelligence, computer equipment and medium
CN111919242B (en) System and method for processing multiple signals
CN113974566B (en) COPD acute exacerbation prediction method based on time window
Khan et al. Automated system design for classification of chronic lung viruses using non-linear dynamic system features and k-nearest neighbour
CN115240803A (en) Model training method, complication prediction system, complication prediction device, and complication prediction medium
Kristinsson et al. Prediction of serious outcomes based on continuous vital sign monitoring of high-risk patients
CN117133464B (en) Intelligent monitoring system and monitoring method for health of old people
CN109192312B (en) Intelligent management system and method for adverse events of heart failure patients
KR102169637B1 (en) Method for predicting of mortality risk and device for predicting of mortality risk using the same
Joshe et al. Symptoms analysis based chronic obstructive pulmonary disease prediction in Bangladesh using machine learning approach
CN114191665A (en) Method and device for classifying man-machine asynchronous phenomena in mechanical ventilation process
Orlandic et al. A semi-supervised algorithm for improving the consistency of crowdsourced datasets: The COVID-19 case study on respiratory disorder classification
Abdullah et al. MERS-CoV disease estimation (MDE) A study to estimate a MERS-CoV by classification algorithms
Xu et al. Automated detection of airflow obstructive diseases: A systematic review of the last decade (2013-2022)
Rehm et al. Use of Machine Learning to Screen for Acute Respiratory Distress Syndrome Using Raw Ventilator Waveform Data
Martins et al. Be-sys: Big data e-health system for analysis and detection of risk of septic shock in adult patients

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant