CN113974566B - COPD acute exacerbation prediction method based on time window - Google Patents
- Publication number
- CN113974566B (application number CN202111319613.8A)
- Authority
- CN
- China
- Prior art keywords
- days
- model
- features
- time window
- predicting
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
- A—HUMAN NECESSITIES
- A61—MEDICAL OR VETERINARY SCIENCE; HYGIENE
- A61B—DIAGNOSIS; SURGERY; IDENTIFICATION
- A61B5/00—Measuring for diagnostic purposes; Identification of persons
- A61B5/72—Signal processing specially adapted for physiological signals or for diagnostic purposes
- A61B5/7271—Specific aspects of physiological measurement analysis
- A61B5/7275—Determining trends in physiological measurement data; Predicting development of a medical condition based on physiological measurements, e.g. determining a risk factor
Landscapes
- Health & Medical Sciences (AREA)
- Life Sciences & Earth Sciences (AREA)
- Engineering & Computer Science (AREA)
- Pathology (AREA)
- Heart & Thoracic Surgery (AREA)
- Psychiatry (AREA)
- Signal Processing (AREA)
- Physics & Mathematics (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Biophysics (AREA)
- Artificial Intelligence (AREA)
- Biomedical Technology (AREA)
- Physiology (AREA)
- Medical Informatics (AREA)
- Molecular Biology (AREA)
- Surgery (AREA)
- Animal Behavior & Ethology (AREA)
- General Health & Medical Sciences (AREA)
- Public Health (AREA)
- Veterinary Medicine (AREA)
- Investigating Or Analysing Biological Materials (AREA)
Abstract
The invention discloses a time window-based method for predicting acute exacerbation of COPD, comprising the following steps: S1, collecting lung indexes of a patient twice daily (morning and afternoon) using a small lung instrument, an electronic stethoscope and other equipment; S2, supporting model prediction for days T+1, T+2 and T+3 while maintaining the usability of the model; S3, extracting further features from these features, the further features reflecting changes in the patient's lung monitoring indexes; S4, taking one exacerbation (its preceding 7 days) as a positive sample; S5, performing a significance test on the features; S6, using the 235 significant features as model input parameters to predict whether the patient has an exacerbation on day T+d (d=1, 2, 3). The method uses lung monitoring data from a time window to predict whether the patient is at risk of acute exacerbation of COPD, so that the patient can self-monitor at home, which is of great significance for the home care of COPD patients.
Description
Technical Field
The invention relates to the technical field of COPD acute exacerbation prediction, in particular to a time window-based COPD acute exacerbation prediction method.
Background
Chronic obstructive pulmonary disease (hereinafter "COPD") is a disease involving chronic bronchitis, emphysema (a disease that damages the alveolar structures), or both, in which the airways from the bronchi to the alveoli are obstructed; its symptoms include long-term cough with sputum, respiratory distress due to reduced airflow caused by airway obstruction, and frequent respiratory tract infections (such as the common cold); the disease causes high mortality worldwide and is increasing rapidly due to smoking, air pollution and the like; the etiology of COPD is an abnormal chronic inflammatory response of the lung to toxic molecules or gases, with various factors (such as smoking, urbanization, pollution and infectious respiratory disease) involved in a complex manner.
Combinations of clinical parameters have been used to predict acute exacerbations of COPD in patients; however, these clinical parameters are not adequate for accurate prediction in individual cases; furthermore, although the likelihood of an acute exacerbation can be assessed once a COPD patient goes to the hospital, patients cannot predict their own likelihood of acute exacerbation; as a result, outcomes may be poor when a patient goes to the hospital only after an acute exacerbation has already occurred.
Although the literature currently includes work that predicts COPD acute exacerbation events using statistical or machine learning methods, that work has the following drawbacks:
1. existing research mainly uses cross-sectional data and cannot use time-series data to give real-time early warning of acute exacerbation events in COPD patients;
2. current research does not perform systematic feature mining to improve the predictive capability of the model;
3. current studies cannot make predictions for days T+1, T+2 and T+3; existing models can only predict the risk probability of a future exacerbation in a patient.
Disclosure of Invention
The invention aims to provide a time window-based COPD acute exacerbation prediction method that uses lung monitoring data from a time window to predict whether a patient is at risk of COPD acute exacerbation on day T+d (d=1, 2, 3); the patient can self-monitor and self-warn at home, and operation is simple, which is of great significance for the home care of COPD patients and solves the problems described in the background.
In order to achieve the above purpose, the present invention provides the following technical solutions:
a method for predicting acute exacerbation of COPD based on a time window, comprising the steps of:
S1, collecting lung indexes of the patient twice daily (morning and afternoon) using a small lung instrument, an electronic stethoscope and other devices, the indexes including FVC, FEV1, PEF and the maximum energy value of lung vibration collected by the stethoscope; FVC, measured with the "small lung instrument", is the forced vital capacity, namely the maximum volume of air that can be exhaled as quickly as possible after a maximal inhalation; FEV1, measured with the "small lung instrument", is the volume of gas exhaled in the first second of a maximal exhalation following a maximal deep inhalation; PEF, measured with the "small lung instrument", is the instantaneous flow rate at the moment expiratory flow is fastest during the forced vital capacity measurement;
S2, so that the model supports prediction for days T+1, T+2 and T+3 while remaining usable, prediction uses the patient's lung monitoring indexes over a fixed time window (7 days); 32 indexes are collected from the patient every morning and evening through electronic equipment, and the indexes within the seven-day time window are distinguished by date and by whether they were collected in the morning, giving a feature count of 32 × 7 × 2 = 448;
S3, extracting further features from these features, the further features reflecting changes in the patient's lung monitoring indexes; the data expansion includes: sliding-window statistics of the indexes, such as the 3-day mean/variance and the 5-day mean/variance; and differences in the daytime index; the number of extended features is 1744;
S4, taking one exacerbation (its preceding 7 days) as a positive sample; for negative samples, the time window must not fall within 30 days before or after an acute exacerbation period, so that the exacerbation does not influence the monitoring indexes; negative samples are generated by sampling from all data in which 7 consecutive days of observation are available;
S5, performing a significance test on the features, and finding 235 features that are significantly correlated with whether an exacerbation occurs on day T+d (d=1, 2, 3);
S6, using the 235 significant features as model input parameters to predict whether an exacerbation occurs on day T+d (d=1, 2, 3); the model is a decision-tree-based ensemble model: XGBoost, LightGBM and CatBoost, and model performance is evaluated using 5-fold cross-validation.
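The feature count in step S2 follows from flattening one observation window: 32 indexes, 7 days, 2 sessions per day. The following is a minimal sketch of that flattening; the function name `flatten_window`, the nested-list layout and the day-major ordering are assumptions made for illustration and are not specified in the text.

```python
N_INDICATORS = 32  # indexes collected per session (from step S2)
N_DAYS = 7         # fixed time window length in days (from step S2)
N_SESSIONS = 2     # two collection sessions per day

def flatten_window(window):
    """Flatten window[day][session][indicator] into one feature vector.

    The nested-list layout and the ordering of the output are
    illustrative assumptions, not taken from the patent.
    """
    if len(window) != N_DAYS:
        raise ValueError("expected %d days" % N_DAYS)
    features = []
    for day in window:
        if len(day) != N_SESSIONS:
            raise ValueError("expected %d sessions per day" % N_SESSIONS)
        for session in day:
            if len(session) != N_INDICATORS:
                raise ValueError("expected %d indexes" % N_INDICATORS)
            features.extend(session)
    return features

# A dummy window filled with zeros, only to check the resulting length.
example_window = [[[0.0] * N_INDICATORS for _ in range(N_SESSIONS)]
                  for _ in range(N_DAYS)]
example_features = flatten_window(example_window)
```

Each of the 448 positions then corresponds to one (day, session, index) combination, which is what lets the model distinguish indexes by date and by session.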
Preferably, the interpretation method of the XGBoost model comprises the following steps:
(1) Performing tree model element structure analysis on the XGBoost model to analyze the tree structure of each single tree;
(2) Inputting a test sample to the XGBoost model, and acquiring an effective leaf node corresponding to the test sample and an effective path of a tree of the effective leaf node according to a tree structure;
(3) And calculating a contribution value of the feature according to the effective path, and explaining the XGBoost model according to the obtained contribution value.
Preferably, XGBoost uses the Boosting ensemble method and is widely used in data mining; it can handle missing values, applies regularization, and uses second-order information to accelerate optimization of the cost function.
Preferably, LightGBM is a new gradient boosting tree framework that supports the GBDT, GBRT, GBM and MART algorithms and is a complete solution for distributed training based on the DMTK framework.
Preferably, the Catboost algorithm includes: in the sensing period, the secondary user sends the energy value in the sensed channel to the fusion center as a characteristic energy vector, and the primary user intermittently sends information of occupying the frequency spectrum resource to the fusion center as a label, so that the construction of the training data set is completed. The model is trained in the fusion center using the Catboost algorithm.
Preferably, the CatBoost algorithm was proposed by Yandex; it handles categorical features during the training phase rather than in data preprocessing, and it computes leaf node values when selecting the tree structure, which reduces overfitting.
Preferably, the prediction period uses eight days as the time window, the eight days being denoted (T-7, T-6, T-5, T-4, T-3, T-2, T-1, T); for a positive sample, day T is the onset date of the acute exacerbation; for a negative sample, the time window must not fall within 7 days before or after an acute exacerbation period.
Preferably, to achieve an early-warning effect, 3 groups of advance-prediction tasks are set within the prediction period:
(1) Task_1, using the observations from day T-5 to day T-1 to predict whether an acute exacerbation occurs on day T;
(2) Task_2, using the observations from day T-6 to day T-2 to predict whether an acute exacerbation occurs on day T;
(3) Task_3, using the observations from day T-7 to day T-3 to predict whether an acute exacerbation occurs on day T.
Preferably, to reduce the number of features, a Kolmogorov-Smirnov test is performed on the features; this test compares whether two distributions are identical, and it is applied to the distribution of each feature over the positive samples versus its distribution over the negative samples, at a significance level of 0.05.
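The screening step can be illustrated with a small, dependency-free sketch of a two-sample Kolmogorov-Smirnov test. The p-value here uses the standard asymptotic series with a small-sample correction factor, which is an approximation chosen for illustration; the text does not say how the p-value was computed, and in practice a library routine such as SciPy's `ks_2samp` would normally be used. The function names and the dictionary layout are hypothetical.

```python
import math

def ks_statistic(sample_a, sample_b):
    """Largest gap between the two empirical CDFs."""
    a, b = sorted(sample_a), sorted(sample_b)
    n, m = len(a), len(b)
    i = j = 0
    d = 0.0
    while i < n and j < m:
        x = min(a[i], b[j])
        while i < n and a[i] <= x:
            i += 1
        while j < m and b[j] <= x:
            j += 1
        d = max(d, abs(i / n - j / m))
    return d

def ks_pvalue(d, n, m, terms=100):
    """Asymptotic p-value for the KS statistic (an approximation)."""
    en = math.sqrt(n * m / (n + m))
    lam = (en + 0.12 + 0.11 / en) * d
    if lam < 0.2:
        return 1.0  # the series converges poorly here; p is essentially 1
    total = 0.0
    for k in range(1, terms + 1):
        total += (-1) ** (k - 1) * math.exp(-2.0 * k * k * lam * lam)
    return min(1.0, max(0.0, 2.0 * total))

def select_significant(pos_features, neg_features, alpha=0.05):
    """Keep features whose positive/negative distributions differ at level alpha."""
    selected = []
    for name in pos_features:
        a, b = pos_features[name], neg_features[name]
        if ks_pvalue(ks_statistic(a, b), len(a), len(b)) < alpha:
            selected.append(name)
    return selected

# Hypothetical data: one shifted feature, one identical feature.
example_pos = {"f_shift": list(range(20, 40)), "f_same": list(range(20))}
example_neg = {"f_shift": list(range(20)), "f_same": list(range(20))}
example_selected = select_significant(example_pos, example_neg)
```

Only the feature whose distribution differs between positive and negative samples survives the screen, which is the behaviour the significance test is meant to provide.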
In summary, the beneficial effects of the invention are as follows due to the adoption of the technology:
1. the method can predict and early warn whether COPD acute exacerbation exists on the T+d days (d=1, 2, 3), and predict whether a patient has COPD acute exacerbation risk on the T+d days (d=1, 2, 3) by using lung monitoring data of a time window;
2. the invention uses feature engineering, and in its data construction method the selection of positive and negative samples combines data sampling with medical knowledge, so that model performance is markedly improved;
3. the model has high practicability, the patient can monitor and self early warn at home, and the model is simple to operate, so that the model has important significance for the home care of COPD patients.
Drawings
FIG. 1 is a flow chart of the model construction of the present invention;
FIG. 2 is a ROC curve of five models based on LightGBM for task_1 setting of the present invention;
fig. 3 is a ROC curve of five models based on LightGBM under task_2 setting of the present invention.
Detailed Description
To make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions of the embodiments are described clearly and completely below with reference to the accompanying drawings; the described embodiments are some, but not all, embodiments of the present invention; the following detailed description of the embodiments provided in the accompanying drawings is therefore not intended to limit the scope of the invention as claimed, but merely represents selected embodiments of the invention; all other embodiments that a person of ordinary skill in the art would obtain from the embodiments of the invention without inventive effort fall within the scope of the invention;
the invention provides a method for predicting acute exacerbation of COPD based on a time window, which is shown in figures 1-3 and comprises the following steps:
S1, collecting lung indexes of the patient twice daily (morning and afternoon) using a small lung instrument, an electronic stethoscope and other devices, the indexes including FVC, FEV1, PEF and the maximum energy value of lung vibration collected by the stethoscope; FVC, measured with the "small lung instrument", is the forced vital capacity, namely the maximum volume of air that can be exhaled as quickly as possible after a maximal inhalation; FEV1, measured with the "small lung instrument", is the volume of gas exhaled in the first second of a maximal exhalation following a maximal deep inhalation; PEF, measured with the "small lung instrument", is the instantaneous flow rate at the moment expiratory flow is fastest during the forced vital capacity measurement (the lung indexes are shown in Table 1);
S2, so that the model supports prediction for days T+1, T+2 and T+3 while remaining usable, prediction uses the patient's lung monitoring indexes over a fixed time window (7 days); 32 indexes are collected from the patient every morning and evening through electronic equipment, and the indexes within the seven-day time window are distinguished by date and by whether they were collected in the morning, giving a feature count of 32 × 7 × 2 = 448;
S3, extracting further features from these features, the further features reflecting changes in the patient's lung monitoring indexes; the data expansion includes: sliding-window statistics of the indexes, such as the 3-day mean/variance and the 5-day mean/variance; and differences in the daytime index; the number of extended features is 1744;
S4, taking one exacerbation (its preceding 7 days) as a positive sample; for negative samples, the time window must not fall within 30 days before or after an acute exacerbation period, so that the exacerbation does not influence the monitoring indexes; negative samples are generated by sampling from all data in which 7 consecutive days of observation are available;
S5, performing a significance test on the features, and finding 235 features that are significantly correlated with whether an exacerbation occurs on day T+d (d=1, 2, 3);
S6, using the 235 significant features as model input parameters to predict whether an exacerbation occurs on day T+d (d=1, 2, 3); the model is a decision-tree-based ensemble model: XGBoost, LightGBM and CatBoost, and model performance is evaluated using 5-fold cross-validation.
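The data expansion of step S3 can be sketched for a single index as follows. This is an illustrative reading only: the use of population variance and the interpretation of the index difference as a day-to-day difference are assumptions, since the text does not specify either, and the function names are hypothetical.

```python
from statistics import mean, pvariance

def rolling_stats(daily_values, width):
    """Mean and variance over each run of `width` consecutive days."""
    stats = []
    for start in range(len(daily_values) - width + 1):
        chunk = daily_values[start:start + width]
        stats.append((mean(chunk), pvariance(chunk)))
    return stats

def day_to_day_diffs(daily_values):
    """Change in the index from each day to the next (assumed reading)."""
    return [b - a for a, b in zip(daily_values, daily_values[1:])]

# Hypothetical seven-day series for one index and one session.
example_series = [10, 12, 14, 16, 18, 20, 22]
example_3day = rolling_stats(example_series, 3)
example_5day = rolling_stats(example_series, 5)
example_diffs = day_to_day_diffs(example_series)
```

A seven-day series yields five 3-day windows and three 5-day windows per index, so applying such statistics to every index and session expands the feature set well beyond the raw observations.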
Specifically, the interpretation method of the XGBoost model comprises the following steps:
(1) Performing tree model element structure analysis on the XGBoost model to analyze the tree structure of each single tree;
(2) Inputting a test sample to the XGBoost model, and acquiring an effective leaf node corresponding to the test sample and an effective path of a tree of the effective leaf node according to a tree structure;
(3) And calculating a contribution value of the feature according to the effective path, and explaining the XGBoost model according to the obtained contribution value.
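One common way to compute feature contributions from a decision path is to credit each split feature with the change in expected value between a node and the child the sample falls into. The sketch below applies that idea to a single toy tree; whether the patent's interpretation method works exactly this way is not stated, so this is an assumed reading, and the tree layout, node values and feature names are all hypothetical.

```python
def path_contributions(tree, sample):
    """Attribute one tree's output to the features on the sample's path.

    Each node is a dict with "value" (expected output at that node);
    internal nodes also carry "feature", "threshold", "left", "right".
    Returns (root value, per-feature contributions, leaf value).
    """
    contributions = {}
    node = tree
    while "feature" in node:
        feature = node["feature"]
        if sample[feature] < node["threshold"]:
            child = node["left"]
        else:
            child = node["right"]
        change = child["value"] - node["value"]
        contributions[feature] = contributions.get(feature, 0.0) + change
        node = child
    return tree["value"], contributions, node["value"]

# Toy tree with made-up values and feature names.
example_tree = {
    "value": 0.25, "feature": "fev1_mean_3d", "threshold": 1.5,
    "left": {
        "value": 0.5, "feature": "sound_energy_max", "threshold": 50,
        "left": {"value": 0.375},
        "right": {"value": 0.75},
    },
    "right": {"value": 0.125},
}
example_sample = {"fev1_mean_3d": 1.2, "sound_energy_max": 60}
example_bias, example_contribs, example_leaf = path_contributions(
    example_tree, example_sample)
```

The root value plus the contributions adds up to the leaf value, so the explanation accounts for the whole prediction of the tree; for an ensemble, the per-tree contributions would be summed over all trees.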
Specifically, XGBoost uses the Boosting ensemble method and is widely used in data mining; it can handle missing values, applies regularization, and uses second-order information to accelerate optimization of the cost function.
Specifically, LightGBM is a new gradient boosting tree framework that supports the GBDT, GBRT, GBM and MART algorithms; owing to its fully greedy tree growth method and its histogram-based memory and computation optimizations, it is several times faster than existing gradient boosting tree implementations; it is a complete solution for distributed training based on the DMTK framework, and after its release LightGBM quickly became a common tool among data mining competitors.
Specifically, the Catboost algorithm includes: in the sensing period, the secondary user sends the energy value in the sensed channel to the fusion center as a characteristic energy vector, and the primary user intermittently sends information of occupying the frequency spectrum resource to the fusion center as a label, so that the construction of the training data set is completed. The model is trained in the fusion center using the Catboost algorithm.
Specifically, the CatBoost algorithm was proposed by Yandex; it handles categorical features during the training phase rather than in the data preprocessing phase, and it computes leaf node values when selecting the tree structure, which reduces overfitting.
Specifically, the prediction period uses eight days as the time window for extracting positive samples, the eight days being denoted (T-7, T-6, T-5, T-4, T-3, T-2, T-1, T); for a positive sample, day T is the onset date of the acute exacerbation; for a negative sample, the time window must not fall within 7 days before or after an acute exacerbation period.
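The sample construction rule can be sketched as two small helpers. Days are represented as integer day indexes, which is an assumption, as are the function names. The exclusion margin is left as a parameter because the text gives 7 days here and 30 days in step S4.

```python
def positive_window(onset_day, window_len=8):
    """Days T-7 .. T for an exacerbation whose onset is day T."""
    return list(range(onset_day - window_len + 1, onset_day + 1))

def is_negative_window(window_end, onset_days, window_len=8, exclusion_days=7):
    """True if the window keeps clear of every exacerbation onset.

    The window covers days window_end - window_len + 1 .. window_end and
    is rejected if any onset lies within `exclusion_days` of it.
    """
    window_start = window_end - window_len + 1
    for onset in onset_days:
        if window_start - exclusion_days <= onset <= window_end + exclusion_days:
            return False
    return True

# Hypothetical patient with a single exacerbation onset on day 100.
example_onsets = [100]
example_positive = positive_window(100)
example_far = is_negative_window(80, example_onsets)
example_near = is_negative_window(95, example_onsets)
example_far_30 = is_negative_window(80, example_onsets, exclusion_days=30)
```

With the 7-day margin the window ending on day 80 qualifies as a negative sample, while the 30-day margin rejects it, which shows how strongly the choice of margin affects the number of available negative samples.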
Specifically, to achieve an early-warning effect, 3 groups of advance-prediction tasks are set within the prediction period:
(1) Task_1, using the observations from day T-5 to day T-1 to predict whether an acute exacerbation occurs on day T;
(2) Task_2, using the observations from day T-6 to day T-2 to predict whether an acute exacerbation occurs on day T;
(3) Task_3, using the observations from day T-7 to day T-3 to predict whether an acute exacerbation occurs on day T.
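The three tasks differ only in how many days before day T the five-day observation window ends. A minimal sketch of that relationship follows; the names `TASK_LEAD_DAYS` and `observation_offsets` are hypothetical.

```python
# Lead time in days between the last observed day and day T, per task.
TASK_LEAD_DAYS = {"Task_1": 1, "Task_2": 2, "Task_3": 3}

def observation_offsets(task, n_days=5):
    """Day offsets relative to T that a task is allowed to observe."""
    lead = TASK_LEAD_DAYS[task]
    first = -(lead + n_days - 1)
    return list(range(first, -lead + 1))

example_offsets = {task: observation_offsets(task) for task in TASK_LEAD_DAYS}
```

Task_1 therefore sees offsets -5 to -1, Task_2 sees -6 to -2 and Task_3 sees -7 to -3, so all three fit inside the eight-day window (T-7, ..., T).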
Specifically, to reduce the number of features, a Kolmogorov-Smirnov test is performed on the features; this test compares whether two distributions are identical, and it is applied to the distribution of each feature over the positive samples versus its distribution over the negative samples, at a significance level of 0.05.
Table 1: observed value feature names and their interpretation;
table 2. Top fifty features that pass the significance test and P-value scoring;
Stratified k-fold cross-validation (k=5) is used: the data are split into 5 folds, and each time the data are divided 8:2 into a training set and a test set for training and testing the model. The evaluation indexes are sensitivity, specificity and AUC, where the threshold is the minimum threshold at which sensitivity exceeds 0.9 and the specificity is the specificity at that threshold. The models used are CatBoost, XGBoost and LightGBM, and the other hyperparameters are obtained by hyperparameter search with cross-validation; three tasks are set: Task_1, Task_2 and Task_3, and under each task the following models are set:
(1) M_all, trained with all the features;
(2) M_sig, using all features that pass the significance test;
(3) M_sigSTE, using the electronic stethoscope features that pass the significance test;
(4) M_sigLSI, using the small lung instrument features that pass the significance test;
(5) M_sig50, using the first 50 features with the lowest p-values that pass the significance test under the task setting;
(6) M_sig25, using the first 25 features that pass the significance test;
(7) M_orig, trained with all the raw observations.
Model | Task_1 | Task_2 | Task_3 |
---|---|---|---|
M_all | 0.8135 | 0.8135 | 0.8135 |
M_sig | 0.9268 | 0.9045 | 0.8887 |
M_sigSTE | 0.9020 | 0.8845 | 0.8302 |
M_sigLSI | 0.8279 | 0.7158 | 0.6617 |
M_sig50 | 0.8826 | 0.8000 | 0.8631 |
M_sig25 | 0.8173 | 0.8075 | 0.8816 |
M_orig | 0.7361 | 0.7434 | 0.5782 |
Table 3. Mean cross-validation AUC scores;
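The evaluation protocol behind these scores, stratified 5-fold cross-validation with an 8:2 split in each round, can be sketched without any library as follows. The round-robin assignment and the fixed seed are illustrative choices; the text does not describe how folds were actually drawn, and the function names are hypothetical.

```python
import random

def stratified_folds(labels, k=5, seed=0):
    """Deal the indexes of each class round-robin into k folds."""
    rng = random.Random(seed)
    folds = [[] for _ in range(k)]
    for cls in sorted(set(labels)):
        indexes = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(indexes)
        for position, i in enumerate(indexes):
            folds[position % k].append(i)
    return folds

def train_test_splits(labels, k=5, seed=0):
    """Yield (train, test) index lists, holding out one fold per round."""
    folds = stratified_folds(labels, k, seed)
    for held_out in range(k):
        test = sorted(folds[held_out])
        train = sorted(i for f, fold in enumerate(folds)
                       if f != held_out for i in fold)
        yield train, test

# Hypothetical label vector: 10 positive and 40 negative samples.
example_labels = [1] * 10 + [0] * 40
example_splits = list(train_test_splits(example_labels))
```

Stratification keeps the proportion of positive samples the same in every fold, which matters here because exacerbation windows are much rarer than negative windows.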
The Task_1 setting had 123 significant features: 31 small lung instrument features and 92 electronic stethoscope features passed the significance test.
The Task_2 setting had 134 significant features: 33 small lung instrument features and 101 electronic stethoscope features passed the significance test. The Task_3 setting had 131 significant features: 28 small lung instrument features and 103 electronic stethoscope features passed the significance test.
Table 3 reports the mean cross-validation AUC scores, where the model used is LightGBM. (1) Task_1 obtains a higher score, which is consistent with intuition (predicting one day ahead is easier than predicting two or three days ahead);
(2) Using only the features generated by the small lung instrument lowers the score markedly, whereas using only the features generated by the stethoscope still performs well, so the observation data of the electronic stethoscope have stronger discriminative and predictive power;
(3) Screening features with the significance test gives a marked improvement over directly using the raw observations or all the features;
(4) With the top 50 or top 25 most significant features, the model score drops somewhat, indicating reduced fitting ability once the number of features is reduced. The ROC curves of the five LightGBM-based models under the Task_1 setting are shown in FIG. 2;
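For reference, the AUC used throughout these comparisons equals the probability that a randomly chosen positive sample receives a higher score than a randomly chosen negative one, with ties counted as one half. A minimal pairwise sketch follows; the function name and the example data are illustrative.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of positive/negative pairs ranked correctly."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a correct ranking
    return wins / (len(positives) * len(negatives))

# Small made-up example: three of the four pairs are ranked correctly.
example_auc = roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation, which gives a scale for reading the scores in Table 3.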
Table 3 reports the mean cross-validation AUC scores for the LightGBM model. To verify performance under other models, the results under the XGBoost and CatBoost models are given below:
Model | Task_1 | Task_2 | Task_3 |
---|---|---|---|
M_all | 0.8772 | 0.8673 | 0.8142 |
M_sig | 0.9181 | 0.8946 | 0.8233 |
M_sigSTE | 0.9036 | 0.8792 | 0.8110 |
M_sigLSI | 0.8279 | 0.7610 | 0.7000 |
M_sig50 | 0.8372 | 0.7831 | 0.8184 |
M_sig25 | 0.8177 | 0.8047 | 0.8203 |
M_orig | 0.7812 | 0.7881 | 0.6659 |
Table 3-1. Mean cross-validation AUC scores. The model used is XGBoost.
Table 3-2. Mean cross-validation AUC scores. The model used is CatBoost.
Task and model | Sensitivity | Specificity | Probability threshold |
---|---|---|---|
Task_1, M_sig50 | 0.9043 | 0.7345 | 0.0113 |
Task_2, M_sig50 | 0.9043 | 0.7098 | 0.0091 |
Task_3, M_sig50 | 0.9043 | 0.6623 | 0.0042 |
Table 4. Sensitivity and specificity values of the optimal model M_sig for each task setting.
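The operating points in Table 4 fix sensitivity just above 0.9 and report the specificity at the corresponding probability threshold. The sketch below shows one way such a threshold could be chosen. Because sensitivity can only rise as the threshold falls, it takes the highest threshold at which sensitivity reaches the target; this is an interpretation of the rule described in the text, not a confirmed reproduction of it, and the function name and data are hypothetical.

```python
def pick_threshold(labels, scores, target_sensitivity=0.9):
    """Return (threshold, sensitivity, specificity) at the target sensitivity.

    Samples with score >= threshold are predicted positive. Thresholds are
    tried from high to low, so the first hit keeps specificity as high as
    possible while still reaching the target.
    """
    n_pos = sum(1 for y in labels if y == 1)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need both positive and negative samples")
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for y, s in zip(labels, scores) if y == 1 and s >= threshold)
        if tp / n_pos >= target_sensitivity:
            tn = sum(1 for y, s in zip(labels, scores) if y == 0 and s < threshold)
            return threshold, tp / n_pos, tn / n_neg
    raise ValueError("target sensitivity not reachable")

# Made-up scores: 10 positive samples followed by 4 negative samples.
example_y = [1] * 10 + [0] * 4
example_s = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.05,
             0.02, 0.03, 0.15, 0.01]
example_point = pick_threshold(example_y, example_s)
```

Very small thresholds such as those in Table 4 are expected under this kind of rule when positive samples are rare, since few windows receive a high predicted probability.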
To verify the effect of different decision tree models on the performance of the prediction tasks, the following table reports the model performance with the significant features under the three task settings; XGBoost, LightGBM and CatBoost, three of the most powerful gradient boosting algorithms based on decision tree models, are compared, and according to the experimental results LightGBM performs best under all three task settings.
Task and model | LightGBM | CatBoost | XGBoost |
---|---|---|---|
Task_1, M_sig | 0.9268 | 0.8852 | 0.9181 |
Task_2, M_sig | 0.9045 | 0.8505 | 0.8946 |
Task_3, M_sig | 0.8887 | 0.8722 | 0.8233 |
Table 5. Mean cross-validation AUC of the three decision tree ensemble models, CatBoost, LightGBM and XGBoost, using the best feature combination, M_sig, under each task setting.
Example 2
Five-fold cross-validation was performed on Task_1 using the LightGBM model, with the AUC on each fold as follows:
Five-fold cross-validation was performed on Task_2 using the LightGBM model, with the AUC on each fold as follows:
Five-fold cross-validation was performed on Task_3 using the LightGBM model, with the AUC on each fold as follows:
Example 3
Five-fold cross-validation was performed on Task_1 using the LightGBM model, with the following mean scores for ACC, precision, recall, F1 and AUC:
Table 5. Mean scores of the various indicators cross-validated on Task_1.
Five-fold cross-validation was performed on Task_2 using the LightGBM model, with the following mean scores for ACC, precision, recall, F1 and AUC:
Model | AUC | ACC | Precision | Recall | F1 |
---|---|---|---|---|---|
M_all | 0.8135 | 0.8694 | 0.5415 | 0.7142 | 0.5475 |
M_sig | 0.9045 | 0.8665 | 0.5365 | 0.9142 | 0.6441 |
M_sigSTE | 0.8845 | 0.9108 | 0.6758 | 0.5714 | 0.6112 |
M_sigLSI | 0.7158 | 0.7509 | 0.2969 | 0.7428 | 0.4028 |
M_sig50 | 0.8000 | 0.8807 | 0.4988 | 0.6243 | 0.5275 |
M_sig25 | 0.8075 | 0.8950 | 0.4902 | 0.6571 | 0.5322 |
M_orig | 0.7434 | 0.8423 | 0.7107 | 0.6 | 0.5572 |
Table 6. Mean scores of the various indicators cross-validated on Task_2.
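In Table 6 the F1 column is not the harmonic mean of the precision and recall columns; for M_orig, a precision of 0.7107 and a recall of 0.6 would give about 0.65, whereas the table reports 0.5572. One plausible explanation, which the source does not state, is that each metric was computed per fold and then averaged. The sketch below uses made-up confusion counts for two folds to show how the mean of per-fold F1 scores can differ from the F1 of the mean precision and recall.

```python
def fold_metrics(tp, fp, fn, tn):
    """Accuracy, precision, recall and F1 from one fold's confusion counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}

def average_over_folds(per_fold_counts):
    """Average each metric over folds given (tp, fp, fn, tn) tuples."""
    per_fold = [fold_metrics(*counts) for counts in per_fold_counts]
    return {key: sum(m[key] for m in per_fold) / len(per_fold)
            for key in per_fold[0]}

# Hypothetical confusion counts (tp, fp, fn, tn) for two folds.
example_counts = [(4, 0, 4, 12), (2, 6, 0, 12)]
example_avg = average_over_folds(example_counts)
example_f1_of_means = (2 * example_avg["precision"] * example_avg["recall"]
                       / (example_avg["precision"] + example_avg["recall"]))
```

In this example the fold-averaged F1 is about 0.53 while the F1 computed from the averaged precision and recall is about 0.68, the same kind of gap seen in the table.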
Five-fold cross-validation was performed on Task_3 using the LightGBM model, with the following mean scores for ACC, precision, recall, F1 and AUC:
Table 7. Mean scores of the various indicators cross-validated on Task_3.
The foregoing is only a preferred embodiment of the present invention, but the scope of protection of the present invention is not limited thereto; any equivalent substitution or modification made by a person skilled in the art, within the technical scope disclosed by the present invention and according to the technical solution and inventive concept of the present invention, shall be covered by the scope of protection of the present invention.
It is noted that relational terms such as first and second, and the like are used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions; moreover, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus.
Claims (5)
1. A method for predicting acute exacerbation of COPD based on a time window, the method comprising the steps of:
S1, collecting lung indexes of a patient twice daily, in the morning and afternoon, using a small lung instrument and a stethoscope, wherein the lung indexes comprise FVC, FEV1, PEF and the maximum lung vibration energy value collected by the stethoscope; FVC, measured with the "small lung instrument", is the forced vital capacity, namely the maximum volume of air that can be exhaled as quickly as possible after a maximal inhalation; FEV1, measured with the "small lung instrument", is the volume of gas exhaled in the first second of a maximal exhalation following a maximal deep inhalation; PEF, measured with the "small lung instrument", is the instantaneous flow rate at the moment expiratory flow is fastest during the forced vital capacity measurement;
S2, so that the model supports prediction for days T+1, T+2 and T+3 while remaining usable, predicting from the patient's lung monitoring indexes over a fixed 7-day time window, collecting 32 indexes of the patient every morning and evening through electronic equipment, and distinguishing the indexes within the 7-day time window by date and by whether they were collected in the morning, giving a feature count of 32 × 7 × 2 = 448;
S3, extracting further features from these features, the further features reflecting changes in the patient's lung monitoring indexes; the data expansion includes: sliding-window statistics of the indexes; and differences in the daytime index;
S4, taking one exacerbation and its preceding 7 days as a positive sample; for negative samples, the time window must not fall within 30 days before or after an acute exacerbation period, so that the exacerbation does not influence the monitoring indexes; negative samples are generated by sampling from all data in which 7 consecutive days of observation are available;
S5, performing a significance test on the features, and finding 235 features that are significantly correlated with whether an exacerbation occurs on day T+d, wherein d=1, 2, 3;
S6, using the 235 significant features as model input parameters to predict whether an exacerbation occurs on day T+d; the model is a decision-tree-based ensemble model: XGBoost, LightGBM and CatBoost, and model performance is evaluated using 5-fold cross-validation;
the prediction period uses eight days as the time window for extracting positive samples, the eight days being denoted T-7, T-6, T-5, T-4, T-3, T-2, T-1 and T; for a positive sample, day T is the onset date of the acute exacerbation; for a negative sample, the time window must not fall within 7 days before or after an acute exacerbation period;
in order to achieve the effect of early warning, 3 groups of prediction tasks are set in advance in the prediction period:
(1) Task_1, adopting an observation value from T-5 days to T-1 day, and predicting whether the acute exacerbation is carried out on the T day;
(2) Task_2, adopting an observed value from T-6 days to T-2 days, and predicting whether the acute exacerbation is carried out on the T day;
(3) Task_3, using the observed value from T-7 days to T-3 days, predicts whether the acute exacerbation is carried out on the T day;
to reduce the number of features, a Kolmogorov-Smirnov test is performed on the features; this test compares whether two distributions are identical, and it is applied to the distribution of each feature over the positive samples versus its distribution over the negative samples, at a significance level of 0.05.
2. A method for predicting acute exacerbations of COPD based on a time window according to claim 1, wherein: the interpretation method of the XGBoost model comprises the following steps:
(1) Performing tree model element structure analysis on the XGBoost model to analyze the tree structure of each single tree;
(2) Inputting a test sample to the XGBoost model, and acquiring an effective leaf node corresponding to the test sample and an effective path of a tree of the effective leaf node according to a tree structure;
(3) And calculating a contribution value of the feature according to the effective path, and explaining the XGBoost model according to the obtained contribution value.
3. A method for predicting acute exacerbations of COPD based on a time window according to claim 1, wherein: XGBoost uses the Boosting ensemble method and is widely used in data mining; it can handle missing values, applies regularization, and uses second-order information to accelerate optimization of the cost function.
4. A method for predicting acute exacerbations of COPD based on a time window according to claim 1, wherein: LightGBM is a new gradient boosting tree framework that supports the GBDT, GBRT, GBM and MART algorithms and is a complete solution for distributed training based on the DMTK framework.
5. A method for predicting acute exacerbations of COPD based on a time window according to claim 1, wherein: the Catboost algorithm includes: in the sensing period, the secondary user sends the energy value in the sensed channel to the fusion center as a characteristic energy vector, and the primary user intermittently sends information of occupying the frequency spectrum resource or not to the fusion center as a label, so that the construction of a training data set is completed, and a model is trained by a Catboost algorithm in the fusion center.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202111319613.8A CN113974566B (en) | 2021-11-09 | 2021-11-09 | COPD acute exacerbation prediction method based on time window |
Publications (2)
Publication Number | Publication Date |
---|---|
CN113974566A CN113974566A (en) | 2022-01-28 |
CN113974566B CN113974566B (en) | 2023-09-19 |
Family
ID=79747333
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202111319613.8A Active CN113974566B (en) | 2021-11-09 | 2021-11-09 | COPD acute exacerbation prediction method based on time window |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN113974566B (en) |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN117894478B (en) * | 2024-03-14 | 2024-05-28 | 天津市肿瘤医院(天津医科大学肿瘤医院) | Informationized intelligent management method for severe cases of oncology department of severe cases of oncology |
Citations (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107451390A (en) * | 2017-02-22 | 2017-12-08 | Cc和I研究有限公司 | System for predicting acute exacerbations in patients with chronic obstructive pulmonary disease |
CN110123274A (en) * | 2019-04-29 | 2019-08-16 | 上海电气集团股份有限公司 | A kind of monitoring system of septicopyemia |
CN110289061A (en) * | 2019-06-27 | 2019-09-27 | 黎檀实 | A kind of Time Series Forecasting Methods of the traumatic hemorrhagic shock condition of the injury |
CN111657888A (en) * | 2020-05-28 | 2020-09-15 | 首都医科大学附属北京天坛医院 | Severe acute respiratory distress syndrome early warning method and system |
CN113057588A (en) * | 2021-03-17 | 2021-07-02 | 上海电气集团股份有限公司 | Disease early warning method, device, equipment and medium |
WO2021148967A1 (en) * | 2020-01-23 | 2021-07-29 | Novartis Ag | A computer-implemented system and method for outputting a prediction of a probability of a hospitalization of patients with chronic obstructive pulmonary disorder |
CN113469227A (en) * | 2021-06-18 | 2021-10-01 | 南京润楠医疗电子研究院有限公司 | Forced expiration total amount prediction method |
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20150080671A1 (en) * | 2013-05-29 | 2015-03-19 | Technical University Of Denmark | Sleep Spindles as Biomarker for Early Detection of Neurodegenerative Disorders |
Also Published As
Publication number | Publication date |
---|---|
CN113974566A (en) | 2022-01-28 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Botha et al. | Detection of tuberculosis by automatic cough sound analysis | |
US10332638B2 (en) | Methods and systems for pre-symptomatic detection of exposure to an agent | |
CN109166630B (en) | Infectious disease data monitoring and processing method and system | |
CN111261282A (en) | Sepsis early prediction method based on machine learning | |
EP2677927B1 (en) | Respiration monitoring method and system | |
JP2002542868A (en) | Air quality analysis method and apparatus based on human response and clustering method | |
CN108597601A (en) | Diagnosis of chronic obstructive pulmonary disease auxiliary system based on support vector machines and method | |
CN101939738A (en) | Method and apparatus for monitoring physiological parameter variability over time for one or more organs | |
CN106714682B (en) | Device, system, method and computer program for assessing the risk of an exacerbation and/or hospitalization | |
CN112216402A (en) | Epidemic situation prediction method and device based on artificial intelligence, computer equipment and medium | |
CN111919242B (en) | System and method for processing multiple signals | |
CN113974566B (en) | COPD acute exacerbation prediction method based on time window | |
Khan et al. | Automated system design for classification of chronic lung viruses using non-linear dynamic system features and k-nearest neighbour | |
CN115240803A (en) | Model training method, complication prediction system, complication prediction device, and complication prediction medium | |
Kristinsson et al. | Prediction of serious outcomes based on continuous vital sign monitoring of high-risk patients | |
CN117133464B (en) | Intelligent monitoring system and monitoring method for health of old people | |
CN109192312B (en) | Intelligent management system and method for adverse events of heart failure patients | |
KR102169637B1 (en) | Method for predicting of mortality risk and device for predicting of mortality risk using the same | |
Joshe et al. | Symptoms analysis based chronic obstructive pulmonary disease prediction in Bangladesh using machine learning approach | |
CN114191665A (en) | Method and device for classifying man-machine asynchronous phenomena in mechanical ventilation process | |
Orlandic et al. | A semi-supervised algorithm for improving the consistency of crowdsourced datasets: The COVID-19 case study on respiratory disorder classification | |
Abdullah et al. | MERS-CoV disease estimation (MDE) A study to estimate a MERS-CoV by classification algorithms | |
Xu et al. | Automated detection of airflow obstructive diseases: A systematic review of the last decade (2013-2022) | |
Rehm et al. | Use of Machine Learning to Screen for Acute Respiratory Distress Syndrome Using Raw Ventilator Waveform Data | |
Martins et al. | Be-sys: Big data e-health system for analysis and detection of risk of septic shock in adult patients |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||