CN113871009A - Sepsis prediction system, storage medium and apparatus in intensive care unit - Google Patents
Sepsis prediction system, storage medium and apparatus in intensive care unit
- Publication number
- CN113871009A
- Authority
- CN
- China
- Prior art keywords
- sepsis
- data
- intensive care
- care unit
- medical monitoring
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16H—HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
- G16H50/00—ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
- G16H50/30—ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for calculating health indices; for individual health risk assessment
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
- G06F18/241—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
- G06F18/2411—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on the proximity to a decision surface, e.g. support vector machines
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
- G06F18/241—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
- G06F18/2415—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on parametric or probabilistic models, e.g. based on likelihood ratio or false acceptance rate versus a false rejection rate
- G06F18/24155—Bayesian classification
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16H—HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
- G16H50/00—ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
- G16H50/70—ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for mining of medical data, e.g. analysing previous cases of other patients
Landscapes
- Engineering & Computer Science (AREA)
- Data Mining & Analysis (AREA)
- Public Health (AREA)
- Medical Informatics (AREA)
- Health & Medical Sciences (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Databases & Information Systems (AREA)
- Evolutionary Biology (AREA)
- Evolutionary Computation (AREA)
- General Engineering & Computer Science (AREA)
- Bioinformatics & Computational Biology (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Artificial Intelligence (AREA)
- Biomedical Technology (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Life Sciences & Earth Sciences (AREA)
- Pathology (AREA)
- Primary Health Care (AREA)
- Epidemiology (AREA)
- General Health & Medical Sciences (AREA)
- Probability & Statistics with Applications (AREA)
- Measuring And Recording Apparatus For Diagnosis (AREA)
Abstract
The invention belongs to the technical field of medical data mining and provides a sepsis prediction system, a storage medium and a device for use in an intensive care unit. The system comprises a data preprocessing module, a feature selection and extraction module, and a sepsis prediction module. The data preprocessing module is used for acquiring medical monitoring data of a person to be monitored in an intensive care unit and preprocessing the medical monitoring data; the feature selection and extraction module is used for receiving the preprocessed medical monitoring data in time-series order and performing feature selection and feature extraction on the received data; the sepsis prediction module is used for converting the time series into a feature vector through network transformation, inputting the feature vector and the current timestamp information into a trained sepsis prediction model, and predicting the probability of sepsis. The sepsis prediction model is formed by connecting a plurality of classifiers in parallel, and the finally predicted sepsis probability is the average of the output probabilities of the plurality of classifiers.
Description
Technical Field
The invention belongs to the technical field of medical data mining, and particularly relates to a sepsis prediction system, a storage medium and equipment in an intensive care unit.
Background
The statements in this section merely provide background information related to the present disclosure and may not necessarily constitute prior art.
Recent studies have shown that approximately 31.5 million people worldwide suffer from sepsis each year and that more than 6 million of them die from it. The incidence and mortality of sepsis are higher in intensive care units (ICUs), which account for approximately two-thirds of the total number of deaths. The cost of treating sepsis also rises year by year: statistics show that sepsis costs as much as $17 billion per year in the United States, and hospital care for sepsis costs approximately 2.5 billion pounds per year in the UK. In 2016, the European Society of Intensive Care Medicine and the Society of Critical Care Medicine jointly issued the Sepsis-3 diagnostic standard, which defines sepsis as life-threatening organ dysfunction caused by a dysregulated host response to infection. Furthermore, mortality increases sharply as antibiotic treatment is delayed; in septic shock, for example, the risk of death increases by about 10% for each hour of delay. Early detection and timely treatment of sepsis are therefore critical to reducing sepsis mortality in the ICU.
A multi-parameter monitor in an ICU generates about 500 MB of data in 24 hours, yet clinically significant medical data are largely discarded because the equipment stores data only briefly and dedicated data analysis is lacking. Time-series data analysis methods for sepsis are currently lacking in intensive care units. Early intensive care units used scoring systems to monitor the development of a patient's condition; for sepsis monitoring, the Sequential Organ Failure Assessment (SOFA) and the systemic inflammatory response syndrome (SIRS) criteria are commonly used. However, the development of sepsis is a dynamic process and fixed scoring criteria may not always meet requirements, which introduces uncertainty into these scoring criteria.
Disclosure of Invention
In order to solve the technical problems described in the background, the invention provides a sepsis prediction system, a storage medium and a device in an intensive care unit, which make full use of the various data features in the ICU through data mining, can dynamically adjust early-warning information as conditions change over time, and offer high prediction speed and high prediction accuracy.
In order to achieve the purpose, the invention adopts the following technical scheme:
a first aspect of the invention provides a sepsis prediction system in an intensive care unit, comprising:
the data preprocessing module is used for acquiring and preprocessing medical monitoring data of a person to be monitored in the intensive care unit;
the feature selection and extraction module is used for receiving the preprocessed medical monitoring data in time-series order and performing feature selection and feature extraction on the received data;
the sepsis prediction module is used for converting the time series into a feature vector through network transformation, inputting the feature vector and the current timestamp information into a trained sepsis prediction model, and predicting the probability of sepsis;
the sepsis prediction model is formed by connecting a plurality of classifiers in parallel, and the finally predicted sepsis probability is the average value of the output probabilities of the plurality of classifiers.
A second aspect of the invention provides a computer readable storage medium having stored thereon a computer program which, when executed by a processor, performs the steps of:
acquiring medical monitoring data of a person to be monitored in an intensive care unit, and preprocessing the medical monitoring data;
receiving the preprocessed medical monitoring data according to a time sequence, and performing feature selection and feature extraction from the received data;
converting the time series into a feature vector through network transformation, inputting the feature vector and the current timestamp information into a trained sepsis prediction model, and predicting the probability of sepsis;
the sepsis prediction model is formed by connecting a plurality of classifiers in parallel, and the finally predicted sepsis probability is the average value of the output probabilities of the plurality of classifiers.
A third aspect of the present invention provides a computer device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, the processor implementing the following steps when executing the program:
acquiring medical monitoring data of a person to be monitored in an intensive care unit, and preprocessing the medical monitoring data;
receiving the preprocessed medical monitoring data according to a time sequence, and performing feature selection and feature extraction from the received data;
converting the time series into a feature vector through network transformation, inputting the feature vector and the current timestamp information into a trained sepsis prediction model, and predicting the probability of sepsis;
the sepsis prediction model is formed by connecting a plurality of classifiers in parallel, and the finally predicted sepsis probability is the average value of the output probabilities of the plurality of classifiers.
Compared with the prior art, the invention has the beneficial effects that:
The invention receives the preprocessed medical monitoring data in time-series order, performs feature selection and feature extraction on the received data, converts the time series into a feature vector through network transformation, inputs the feature vector and the current timestamp information into a trained sepsis prediction model, and predicts the probability of sepsis. The sepsis prediction model is formed by connecting a plurality of classifiers in parallel, and the finally predicted probability of sepsis is the average of the output probabilities of the plurality of classifiers. The invention thus makes full use of the various data features in the intensive care unit, can dynamically adjust early-warning information as conditions change over time, and offers high prediction speed and high prediction accuracy.
Advantages of additional aspects of the invention will be set forth in part in the description which follows, and in part will be obvious from the description, or may be learned by practice of the invention.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, are included to provide a further understanding of the invention. They illustrate exemplary embodiments of the invention and, together with the description, serve to explain the invention and not to limit it.
Fig. 1 is a schematic structural diagram of a sepsis prediction system in an intensive care unit according to an embodiment of the present invention;
Fig. 2 is a schematic diagram of sepsis prediction in an intensive care unit according to an embodiment of the present invention;
Fig. 3 is a diagram of the overall structural framework of the LightGBM model according to an embodiment of the present invention;
Fig. 4 is a sepsis care profile table in an intensive care unit according to an embodiment of the present invention.
Detailed Description
The invention is further described with reference to the following figures and examples.
It is to be understood that the following detailed description is exemplary and is intended to provide further explanation of the invention as claimed. Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs.
It is noted that the terminology used herein is for the purpose of describing particular embodiments only and is not intended to limit the exemplary embodiments according to the invention. As used herein, the singular forms "a", "an" and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It should also be understood that when the terms "comprises" and/or "comprising" are used in this specification, they specify the presence of the stated features, steps, operations, devices, components, and/or combinations thereof.
Example one
As shown in fig. 1, the present embodiment provides a sepsis prediction system in an intensive care unit, which specifically includes the following modules:
(1) The data preprocessing module is used for acquiring the medical monitoring data of the person to be monitored in the intensive care unit and preprocessing the medical monitoring data.
For example, the medical care data in the intensive care unit comprise 40 feature indexes, namely 8 vital sign indexes, 26 laboratory indexes and 6 demographic indexes, and each data index is recorded with a timestamp once per hour. Fig. 4 shows a sepsis care profile table in which the medical care data in the intensive care unit are recorded.
The preprocessing comprises normalizing the acquired medical care data of the patient to be monitored in the intensive care unit, filling missing values, and screening and replacing abnormal values. Missing values are filled using random normal distribution and multi-neighborhood missing-value interpolation.
Specifically, because the acquired ICU data suffer from problems such as abnormal data, missing data and non-uniform data formats, the format of the acquired data must first be normalized, after which missing values are filled and abnormal values are screened and replaced.
Missing values are common in raw clinical data sets. In ICU data, time-series values are missing because of long assay times and sampling intervals, which is an obstacle to clinical data analysis.
The data preprocessing comprises the following specific steps:
calculating the variance and standard deviation of each feature variable, setting a threshold (for example, any value between 0.2 and 0.8), and treating data that exceed the threshold as abnormal values to be filtered out;
filling missing values of the ICU clinical data set using the autoimpute package, by random normal distribution and multi-neighborhood missing-value interpolation;
and finally, converting the data to uniform units and converting the data format into time-series data.
autoimpute is a Python package for analyzing and implementing imputation methods.
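The missing-value filling and normalization steps can be sketched as follows. This is a minimal illustration that uses only the Python standard library as a stand-in for the autoimpute package named above; the function names and the heart-rate values are assumptions made for the example, and outlier screening and multi-neighborhood interpolation are not shown.

```python
import random
import statistics

def fill_missing_random_normal(values, seed=0):
    """Replace None entries with draws from N(mean, std) of the observed values."""
    rng = random.Random(seed)
    observed = [v for v in values if v is not None]
    mean = statistics.fmean(observed)
    std = statistics.pstdev(observed)
    return [rng.gauss(mean, std) if v is None else v for v in values]

def min_max_normalize(values):
    """Scale values linearly to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

# Hourly heart-rate readings with two missing hours (illustrative values only).
heart_rate = [82.0, None, 90.0, 88.0, None, 95.0]
filled = fill_missing_random_normal(heart_rate)
normalized = min_max_normalize(filled)
```

Observed readings are left unchanged; only the missing hours receive sampled values.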
(2) The feature selection and extraction module is used for receiving the preprocessed medical monitoring data in time-series order and performing feature selection and feature extraction on the received data.
Optimal groups of feature data are selected from the preprocessed medical monitoring data through multiple rounds of feature selection. Sepsis features are then extracted using principal component analysis and non-negative matrix factorization.
Because the acquired data contain many types of features and some of them have a high missing rate, features whose missing rate exceeds a set threshold are removed first. For example, the missing-rate threshold is set to 0.9; that is, a feature is removed if its missing rate reaches 0.9.
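The missing-rate filter can be sketched as below. The table layout (feature name mapped to a list of hourly values, with None for a missing reading) and the feature names are assumptions made for illustration.

```python
def missing_rate(column):
    """Fraction of entries in a column that are missing (None)."""
    return sum(v is None for v in column) / len(column)

def drop_sparse_features(table, threshold=0.9):
    """Keep only features whose missing rate is below the threshold."""
    return {name: col for name, col in table.items()
            if missing_rate(col) < threshold}

# Toy table: feature name -> hourly values (hypothetical names and values).
table = {
    "heart_rate": [80, 82, None, 85, 90, 88, 91, 87, 86, 84],
    "lactate": [None, None, None, None, None, None, None, None, None, 2.1],
}
kept = drop_sparse_features(table)
```

Here the second feature has a missing rate of exactly 0.9, so it reaches the threshold and is removed.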
Then, the feature_selection module of the scikit-learn machine learning library is used for preliminary feature selection. Feature selection can be repeated with different selection methods until the optimal groups of feature data are selected.
Each feature selection method yields a ranking of feature relevance. Based on the combined ranking, the top n (for example, n = 20) features most relevant to sepsis are selected from the full set of features in the data set (for example, 40).
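One simple way to combine several rankings is to average each feature's rank position and keep the top n. The text does not state how the rankings are combined, so the mean-rank rule below is an assumption, as are the feature names.

```python
def aggregate_rankings(rankings, n):
    """Combine several rankings (best feature first) by mean rank position."""
    features = rankings[0]
    mean_rank = {
        f: sum(r.index(f) for r in rankings) / len(rankings) for f in features
    }
    # A lower mean rank means the feature was judged more relevant overall.
    return sorted(features, key=lambda f: mean_rank[f])[:n]

# Rankings produced by three hypothetical selection methods.
rankings = [
    ["lactate", "heart_rate", "age", "temp"],
    ["heart_rate", "lactate", "temp", "age"],
    ["lactate", "temp", "heart_rate", "age"],
]
top2 = aggregate_rankings(rankings, n=2)
```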
Recursive feature elimination (RFE) with cross-validation is performed using a support vector classifier (SVC). Given an external estimator that assigns weights to features (e.g., the coefficients of a linear model), RFE selects features by recursively considering smaller and smaller feature sets and finally produces a feature ranking.
In RFE, the external estimator is first trained on the initial set of features, and the importance of each feature is obtained from its feature_importances_ attribute, which represents how relevant each feature is to the sepsis outcome; the features are then ranked by this importance.
It should be noted that other feature selection methods can also be used, and those skilled in the art can choose among them according to the actual situation, for example:
scheme 1: univariate feature selection based on the F-value in the analysis of variance (ANOVA), in which the F-value is used to calculate the association between each feature and the sepsis outcome;
scheme 2: recursive feature elimination with cross-validation based on a support vector machine, in which the degree of association between each feature and the sepsis outcome is determined through overall metrics such as recall, accuracy, precision and AUC, and features with a low degree of association are eliminated;
scheme 3: feature selection based on a random forest, in which 10 decision trees are trained to classify the data set and the features that appear most frequently in the feature subsets are finally selected.
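Recursive feature elimination with cross-validation on a support vector classifier might look like the following with scikit-learn. The synthetic data set, the linear kernel, the 5-fold setting and the AUC scoring are assumptions chosen to keep the sketch small; they are not taken from the text.

```python
from sklearn.datasets import make_classification
from sklearn.feature_selection import RFECV
from sklearn.svm import SVC

# Synthetic stand-in for the preprocessed ICU table (not real patient data).
X, y = make_classification(n_samples=200, n_features=10, n_informative=4,
                           random_state=0)

# A linear-kernel SVC exposes coef_, which RFE uses as the feature weights.
selector = RFECV(estimator=SVC(kernel="linear"), step=1, cv=5,
                 scoring="roc_auc")
selector.fit(X, y)

ranking = selector.ranking_    # rank 1 marks a selected feature
selected = selector.support_   # boolean mask of the kept features
```

Features with rank 1 are the ones retained after cross-validation picks the best subset size; larger ranks indicate features eliminated earlier.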
The specific steps of feature selection and feature extraction are as follows:
Basic feature selection is implemented with the scikit-learn library using recursive feature elimination (RFE) with cross-validation based on a support vector classifier (SVC). Given an external estimator that assigns weights to features (e.g., the coefficients of a linear model), RFE selects features by recursively considering smaller and smaller feature sets and finally produces a feature ranking. For feature extraction, the main features of sepsis are extracted using principal component analysis (PCA) and non-negative matrix factorization (NMF). PCA obtains new factors from the eigenvectors of the sample covariance matrix that are associated with the largest eigenvalues; the resulting representation minimizes the squared reconstruction error.
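The PCA step can be illustrated directly from its definition. The sketch below uses NumPy and random data as a stand-in for the selected features; the function name and the choice of two components are assumptions, and NMF is not shown.

```python
import numpy as np

def pca(X, n_components):
    """Project X onto the covariance eigenvectors with the largest eigenvalues."""
    mean = X.mean(axis=0)
    Xc = X - mean
    cov = np.cov(Xc, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)   # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:n_components]
    components = eigvecs[:, order]
    scores = Xc @ components                 # new factors
    reconstruction = scores @ components.T + mean
    return scores, reconstruction

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))                # synthetic stand-in data
scores, X_hat = pca(X, n_components=2)
full_scores, X_full = pca(X, n_components=5)
```

Keeping all five components reconstructs the data exactly, while keeping two gives the best rank-two approximation in the squared-error sense.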
(3) The sepsis prediction module is used for converting the time series into a feature vector through network transformation, inputting the feature vector and the current timestamp information into a trained sepsis prediction model, and predicting the probability of sepsis.
The sepsis prediction model is formed by connecting a plurality of classifiers in parallel, and the finally predicted sepsis probability is the average value of the output probabilities of the plurality of classifiers.
In a specific implementation, each classifier consists of a LightGBM model and a Bayesian optimizer, and the LightGBM model uses the Bayesian optimizer to avoid overfitting.
In training the sepsis prediction model, a grid search is used to find the parameters that give the best performance of the sepsis prediction model.
In training the sepsis prediction model, the preprocessed medical data are received in time-series order and the time series is converted into feature vectors through network transformation. These feature vectors are input into the LightGBM gradient boosting algorithm together with the current timestamp information to learn the combinations of features related to sepsis, which ultimately yields a risk score for the patient. To perform the sepsis classification task, LightGBM grows trees using a leaf-wise strategy with a depth constraint: at each iteration, the leaf with the largest split gain (the largest reduction in loss) among all current leaves is found and split. Compared with a level-wise growth strategy, leaf-wise growth achieves a lower error for the same number of splits and is more efficient.
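A depth-constrained leaf-wise configuration could be expressed with LightGBM parameters such as the following. The parameter names are LightGBM's own; every value is an assumption for illustration and is not taken from the text.

```python
# Illustrative LightGBM parameters; the values are assumptions.
params = {
    "objective": "binary",      # sepsis vs. non-sepsis
    "boosting": "gbdt",
    "num_leaves": 31,           # upper bound on leaves per tree (leaf-wise growth)
    "max_depth": 6,             # depth constraint on leaf-wise growth
    "min_data_in_leaf": 20,
    "learning_rate": 0.05,
    "max_bin": 255,             # maximum number of histogram bins per feature
}

# A binary tree of depth d has at most 2**d leaves, so num_leaves should not
# exceed that bound when a depth limit is set.
leaf_budget_ok = params["num_leaves"] <= 2 ** params["max_depth"]
```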
The improved LightGBM algorithm based on multi-feature fusion replaces the traditional pre-sorted algorithm with a histogram algorithm, which discretizes continuous feature values into k integer bins and constructs a histogram of width k. While traversing the data, statistics are accumulated in the histogram using the discretized value as the index, and the histogram is then traversed to find the optimal split point.
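The histogram idea can be sketched for a single feature as follows. This is a simplified illustration, not LightGBM's implementation: it assumes equal-width bins, unit hessians and a squared-gradient-sum gain, and the function name is hypothetical.

```python
import numpy as np

def best_histogram_split(feature, gradients, k=8):
    """Find the bin boundary that maximizes a simplified split gain."""
    # Discretize the continuous feature into k equal-width bins (0..k-1).
    edges = np.linspace(feature.min(), feature.max(), k + 1)
    bins = np.clip(np.digitize(feature, edges[1:-1]), 0, k - 1)

    # Accumulate gradient sums and counts per bin, indexed by the bin number.
    grad_hist = np.bincount(bins, weights=gradients, minlength=k)
    count_hist = np.bincount(bins, minlength=k)

    total_grad, total_count = grad_hist.sum(), count_hist.sum()
    best_gain, best_bin = -np.inf, None
    left_grad = left_count = 0.0
    # Scan the k-1 candidate boundaries instead of every sorted sample value.
    for b in range(k - 1):
        left_grad += grad_hist[b]
        left_count += count_hist[b]
        right_grad = total_grad - left_grad
        right_count = total_count - left_count
        if left_count == 0 or right_count == 0:
            continue
        gain = (left_grad ** 2 / left_count + right_grad ** 2 / right_count
                - total_grad ** 2 / total_count)
        if gain > best_gain:
            best_gain, best_bin = gain, b
    if best_bin is None:
        raise ValueError("feature has a single distinct value")
    return best_bin, edges[best_bin + 1], best_gain

feature = np.array([0.1, 0.2, 0.3, 0.4, 2.6, 2.7, 2.8, 2.9])
gradients = np.array([-1.0, -1.0, -1.0, -1.0, 1.0, 1.0, 1.0, 1.0])
best_bin, threshold, gain = best_histogram_split(feature, gradients, k=4)
```

The scan visits only k-1 boundaries regardless of the number of samples, which is the source of the speed and memory savings over a pre-sorted search.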
The process of training the sepsis prediction model is as follows:
The data are first partitioned into a training set and a test set, and 10-fold cross-validation is performed to validate the performance of LightGBM and to minimize overfitting of the model during training. Model derivation and model parameter optimization are carried out on the training set, and model evaluation is carried out on the test set. Predictive classification of sepsis is performed with the improved LightGBM model based on multi-feature fusion and a Bayesian optimization algorithm, and the cut-off threshold of the regression value is determined through the Bayesian optimizer and gradient optimization.
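The 10-fold partitioning can be sketched with the standard library alone. The function name, the shuffling seed and the sample count are assumptions; model fitting inside each fold is omitted.

```python
import random

def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train_idx, val_idx) pairs for k-fold cross-validation."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        val_idx = folds[i]
        train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train_idx, val_idx

splits = list(k_fold_indices(100, k=10))
```

Each sample is used for validation exactly once and for training in the other nine folds.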
When sepsis data are scarce, leaf-wise growth may cause overfitting. LightGBM can therefore use additional parameters, tuned by the Bayesian optimization algorithm, to limit the depth of the tree and avoid overfitting, which also speeds up training and reduces memory usage. First, a Gaussian process is selected as the prior in the Bayesian optimization algorithm to represent the distributional assumption about the function being optimized. Second, a maximum probability of improvement function is constructed as the acquisition function, which determines the next point to evaluate from the posterior distribution of the model. The hyper-parameter search space is then converted from a graph structure to a tree structure, and non-parametric estimation is used instead of parametric estimation, which yields gains in both efficiency and accuracy.
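Reading the acquisition function as the standard probability-of-improvement criterion (an interpretation of the translated text, not something it states explicitly), the choice of the next point can be sketched as below. The posterior means and standard deviations are hypothetical numbers, and fitting the Gaussian process itself is not shown.

```python
import math

def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def probability_of_improvement(mu, sigma, best_so_far, xi=0.0):
    """P(f(x) > best_so_far + xi) under a Gaussian posterior N(mu, sigma^2)."""
    if sigma == 0.0:
        return 1.0 if mu > best_so_far + xi else 0.0
    return normal_cdf((mu - best_so_far - xi) / sigma)

# Posterior mean and std of validation AUROC at three candidate settings
# (hypothetical numbers); the candidate with the highest score is tried next.
candidates = {
    "num_leaves=15": (0.80, 0.02),
    "num_leaves=31": (0.82, 0.05),
    "num_leaves=63": (0.78, 0.10),
}
best_auc = 0.81
scores = {name: probability_of_improvement(mu, sd, best_auc)
          for name, (mu, sd) in candidates.items()}
next_point = max(scores, key=scores.get)
```

A candidate with a lower mean can still score well if its uncertainty is large, which is how the criterion balances exploitation and exploration.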
In training the sepsis prediction model, the binary cross-entropy between the true outcome and the predicted outcome is used as the loss function.
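For reference, the binary cross-entropy over a batch of predictions can be computed as follows. The clipping constant eps is an assumption added to avoid taking the logarithm of zero, and the example labels and probabilities are illustrative.

```python
import math

def binary_cross_entropy(y_true, y_prob, eps=1e-15):
    """Mean of -[y*log(p) + (1-y)*log(1-p)] over all samples."""
    total = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1.0 - eps)   # keep p strictly inside (0, 1)
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(y_true)

loss = binary_cross_entropy([1, 0, 1], [0.9, 0.2, 0.6])
```

An uninformative prediction of 0.5 for every sample gives a loss of ln 2, which is a useful baseline when reading training curves.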
In this embodiment, the sepsis threshold is determined from the ROC curve and the confusion matrix within the Bayesian optimization algorithm. Because LightGBM produces a regression value for each feature vector, a cut-off threshold must be determined so that overfitting during model training does not degrade the accuracy of the model.
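One common way to pick a cut-off from the ROC curve and the confusion matrix is to maximize Youden's J (true positive rate minus false positive rate). The text does not name the criterion, so Youden's J is an assumption here, as are the scores and labels; both classes are assumed to be present.

```python
def confusion_counts(y_true, y_score, threshold):
    """Return (tp, fp, tn, fn) when scores >= threshold are predicted positive."""
    tp = fp = tn = fn = 0
    for y, s in zip(y_true, y_score):
        pred = 1 if s >= threshold else 0
        if pred == 1 and y == 1:
            tp += 1
        elif pred == 1 and y == 0:
            fp += 1
        elif pred == 0 and y == 0:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn

def best_cutoff(y_true, y_score):
    """Scan every observed score and keep the cut-off with the largest J."""
    best_t, best_j = None, -1.0
    for t in sorted(set(y_score)):
        tp, fp, tn, fn = confusion_counts(y_true, y_score, t)
        j = tp / (tp + fn) - fp / (fp + tn)
        if j > best_j:
            best_t, best_j = t, j
    return best_t, best_j

y_true = [0, 0, 0, 1, 0, 1, 1, 1]
y_score = [0.1, 0.2, 0.3, 0.4, 0.55, 0.6, 0.8, 0.9]
cutoff, j = best_cutoff(y_true, y_score)
```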
The Light Gradient Boosting Machine (LightGBM) is a framework that implements the GBDT algorithm efficiently. It supports efficient parallel training, offers faster training speed, lower memory consumption and better accuracy, and supports distributed processing, so it can process massive data quickly. Considering that LightGBM is not prone to overfitting but is sensitive to abnormal values, the model forward-fills missing values for all variables at an early stage and uses the binary cross-entropy between the true outcome and the predicted outcome as the loss function.
The sepsis classification performance of the LightGBM model is compared with the Sequential Organ Failure Assessment (SOFA) score, the Modified Early Warning Score (MEWS), the systemic inflammatory response syndrome (SIRS) score, and the Simplified Acute Physiology Score (SAPS), and the clinical characteristics of sepsis are learned from this comparison. To improve the training effect, 90% of the raw sepsis data are randomly split into multiple equal, disjoint sub-data sets, and one LightGBM classifier is trained on each sub-data set. Furthermore, because the numbers of sepsis and non-sepsis samples are imbalanced, each sub-data set is balanced separately using random undersampling. Finally, the outputs of the multiple classifiers are combined using the geometric mean.
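The splitting and balancing scheme can be sketched on sample indices as follows. The function name, the 1:1 target ratio and the toy label vector are assumptions; training a classifier on each subset is not shown.

```python
import random

def split_and_balance(labels, n_subsets=5, train_fraction=0.9, seed=0):
    """Return index lists: disjoint equal-size subsets, each undersampled to 1:1."""
    rng = random.Random(seed)
    indices = list(range(len(labels)))
    rng.shuffle(indices)
    train = indices[: int(train_fraction * len(indices))]
    size = len(train) // n_subsets
    subsets = []
    for i in range(n_subsets):
        chunk = train[i * size:(i + 1) * size]
        pos = [j for j in chunk if labels[j] == 1]
        neg = [j for j in chunk if labels[j] == 0]
        n = min(len(pos), len(neg))
        # Randomly discard samples of the majority class.
        subsets.append(rng.sample(pos, n) + rng.sample(neg, n))
    return subsets

# Toy labels with 10% positive (sepsis) samples.
labels = [1 if i % 10 == 0 else 0 for i in range(1000)]
subsets = split_and_balance(labels)
```

Because the subsets are disjoint, each classifier sees different patients, and undersampling keeps every subset balanced at the cost of discarding most non-sepsis samples.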
For example, in the initial setting the classification module comprises 5 classifiers, which prevents overfitting of any single classifier from affecting the overall accuracy of the model, and the outputs of the classifiers are combined using the geometric mean.
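Combining the outputs of five classifiers can be sketched as below, showing both the plain average and the geometric mean mentioned in the text. The five probabilities are illustrative values, and the geometric mean assumes every probability is strictly positive.

```python
import math

def arithmetic_mean(probs):
    """Plain average of the classifier probabilities."""
    return sum(probs) / len(probs)

def geometric_mean(probs):
    """n-th root of the product, computed in log space for stability."""
    return math.exp(sum(math.log(p) for p in probs) / len(probs))

# Sepsis probabilities from five parallel classifiers (illustrative values).
classifier_outputs = [0.62, 0.70, 0.55, 0.66, 0.91]
avg = arithmetic_mean(classifier_outputs)
geo = geometric_mean(classifier_outputs)
```

The geometric mean never exceeds the arithmetic mean, so a single overconfident classifier pulls the combined score up less strongly.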
The model is optimized by using a gradient-free algorithm to find the regression value that maximizes the prediction score, and the predictive performance of the model is evaluated using the area under the receiver operating characteristic curve (AUROC), the area under the precision-recall curve (AUPRC), the sensitivity and the specificity. Parameters are optimized while the model is trained, and the classification result of each classifier is evaluated: if the evaluation result does not meet the set threshold, the result is fed back into the classifier for re-classification until the evaluation requirement is met, after which the classification result is output, as shown in Fig. 2 and Fig. 3.
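Three of the evaluation metrics can be computed from scratch as follows. The AUROC here is the pairwise-ranking form (ties count one half), the 0.5 decision threshold and the example scores are assumptions, and AUPRC is not shown.

```python
def auroc(y_true, y_score):
    """Probability that a random positive is scored above a random negative."""
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def sensitivity_specificity(y_true, y_score, threshold=0.5):
    """Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP)."""
    tp = sum(1 for y, s in zip(y_true, y_score) if y == 1 and s >= threshold)
    fn = sum(1 for y, s in zip(y_true, y_score) if y == 1 and s < threshold)
    tn = sum(1 for y, s in zip(y_true, y_score) if y == 0 and s < threshold)
    fp = sum(1 for y, s in zip(y_true, y_score) if y == 0 and s >= threshold)
    return tp / (tp + fn), tn / (tn + fp)

y_true = [0, 0, 1, 1]
y_score = [0.1, 0.4, 0.35, 0.8]
area = auroc(y_true, y_score)
sens, spec = sensitivity_specificity(y_true, y_score)
```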
Example two
The present embodiment provides a computer-readable storage medium having stored thereon a computer program which, when executed by a processor, performs the steps of:
acquiring medical monitoring data of a person to be monitored in an intensive care unit, and preprocessing the medical monitoring data;
receiving the preprocessed medical monitoring data according to a time sequence, and performing feature selection and feature extraction from the received data;
converting the time series into a feature vector through network transformation, inputting the feature vector and the current timestamp information into a trained sepsis prediction model, and predicting the probability of sepsis;
the sepsis prediction model is formed by connecting a plurality of classifiers in parallel, and the finally predicted sepsis probability is the average value of the output probabilities of the plurality of classifiers.
Example three
The embodiment provides a computer device, which includes a memory, a processor, and a computer program stored in the memory and executable on the processor, and the processor executes the computer program to implement the following steps:
acquiring medical monitoring data of a person to be monitored in an intensive care unit, and preprocessing the medical monitoring data;
receiving the preprocessed medical monitoring data as a time series, and performing feature selection and feature extraction on the received data;
converting the time series into a feature vector through network transformation, inputting the feature vector and current timestamp information into a trained sepsis prediction model, and predicting the probability of sepsis;
the sepsis prediction model is formed by connecting a plurality of classifiers in parallel, and the final predicted sepsis probability is the average of the output probabilities of the plurality of classifiers.
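The preprocessing step is characterised in claims 2 and 3 as normalization, filling of missing values and screening and replacement of abnormal values. A minimal single-signal sketch follows. The plausibility range, the order of operations, the clamping of random draws and the z-score form of normalization are all assumptions, and the neighbor-based interpolation mentioned in claim 3 is not shown.

```python
import random
from statistics import mean, stdev

def preprocess(values, low, high, seed=0):
    """Screen abnormal values, fill gaps with normal draws, then z-score."""
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    # 1. Screen abnormal values: out-of-range readings are treated as missing.
    screened = [v if v is not None and low <= v <= high else None
                for v in values]
    observed = [v for v in screened if v is not None]
    if len(observed) < 2:
        raise ValueError("need at least two valid readings")
    mu, sigma = mean(observed), stdev(observed)
    if sigma == 0:
        raise ValueError("signal is constant; z-score is undefined")
    # 2. Fill gaps with draws from a normal distribution fitted to the
    #    observed readings, clamped to the plausible range.
    filled = [v if v is not None
              else min(max(rng.gauss(mu, sigma), low), high)
              for v in screened]
    # 3. Normalise with the statistics of the observed readings.
    return [(v - mu) / sigma for v in filled]

# Hypothetical heart-rate series with one gap and one sensor artefact (400).
heart_rate = [82, None, 90, 400, 86, 88]
normalised = preprocess(heart_rate, low=30, high=250)
print([round(v, 2) for v in normalised])
```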
The present invention is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
The above description is only a preferred embodiment of the present invention and is not intended to limit the present invention, and various modifications and changes may be made by those skilled in the art. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention should be included in the protection scope of the present invention.
Claims (10)
1. A sepsis prediction system in an intensive care unit, comprising:
the data preprocessing module is used for acquiring and preprocessing medical monitoring data of a person to be monitored in the intensive care unit;
the feature selection and extraction module is used for receiving the preprocessed medical monitoring data as a time series and performing feature selection and feature extraction on the received data;
the sepsis prediction module is used for converting the time series into a feature vector through network transformation, inputting the feature vector and the current timestamp information into a trained sepsis prediction model, and predicting the probability of sepsis;
the sepsis prediction model is formed by connecting a plurality of classifiers in parallel, and the final predicted sepsis probability is the average of the output probabilities of the plurality of classifiers.
2. The sepsis prediction system in an intensive care unit as claimed in claim 1, wherein the preprocessing includes normalization of the acquired medical monitoring data of the person to be monitored in the intensive care unit, filling of missing values, and screening and replacement of abnormal values.
3. The sepsis prediction system in an intensive care unit as claimed in claim 2, wherein the missing values are filled using a random normal distribution and a multi-neighbor missing-value interpolation method.
4. The sepsis prediction system in an intensive care unit as claimed in claim 1, characterized in that the optimal sets of feature data are selected from the preprocessed medical monitoring data based on a plurality of feature selection methods.
5. The sepsis prediction system in an intensive care unit as claimed in claim 1, characterized in that the feature extraction for sepsis is performed based on principal component analysis and non-negative matrix factorization.
6. The sepsis prediction system in an intensive care unit as claimed in claim 1, wherein each classifier is composed of a LightGBM model and a Bayesian optimizer, the LightGBM model using the Bayesian optimizer to avoid overfitting.
7. The sepsis prediction system in an intensive care unit as claimed in claim 1, characterized in that, in training the sepsis prediction model, a grid search is used to find the parameters that give the best performance of the sepsis prediction model.
8. The sepsis prediction system in an intensive care unit as claimed in claim 1, characterized in that, in training the sepsis prediction model, the binary cross entropy between the true result and the predicted result is used as the loss function.
9. A computer-readable storage medium, on which a computer program is stored, which program, when executed by a processor, performs the steps of:
acquiring medical monitoring data of a person to be monitored in an intensive care unit, and preprocessing the medical monitoring data;
receiving the preprocessed medical monitoring data as a time series, and performing feature selection and feature extraction on the received data;
converting the time series into a feature vector through network transformation, inputting the feature vector and current timestamp information into a trained sepsis prediction model, and predicting the probability of sepsis;
the sepsis prediction model is formed by connecting a plurality of classifiers in parallel, and the final predicted sepsis probability is the average of the output probabilities of the plurality of classifiers.
10. A computer device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the processor when executing the program performs the steps of:
acquiring medical monitoring data of a person to be monitored in an intensive care unit, and preprocessing the medical monitoring data;
receiving the preprocessed medical monitoring data as a time series, and performing feature selection and feature extraction on the received data;
converting the time series into a feature vector through network transformation, inputting the feature vector and current timestamp information into a trained sepsis prediction model, and predicting the probability of sepsis;
the sepsis prediction model is formed by connecting a plurality of classifiers in parallel, and the final predicted sepsis probability is the average of the output probabilities of the plurality of classifiers.
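Claims 7 and 8 recite a grid search over model parameters and a binary cross-entropy loss. The following sketch combines the two on a toy one-feature logistic scorer, which is only a stand-in for the LightGBM classifiers of claim 6; the data, the grid values and the function names are assumptions for illustration.

```python
import math
from itertools import product

def binary_cross_entropy(y_true, y_prob, eps=1e-12):
    """Mean binary cross entropy between true labels and predicted probabilities."""
    total = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1.0 - eps)  # keep log() finite
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(y_true)

def logistic_score(x, slope, bias):
    """Toy scorer mapping one feature to a probability."""
    return 1.0 / (1.0 + math.exp(-(slope * x + bias)))

def grid_search(xs, ys, grid):
    """Try every parameter combination and keep the one with the lowest loss."""
    best_params, best_loss = None, float("inf")
    for slope, bias in product(grid["slope"], grid["bias"]):
        probs = [logistic_score(x, slope, bias) for x in xs]
        loss = binary_cross_entropy(ys, probs)
        if loss < best_loss:
            best_params, best_loss = {"slope": slope, "bias": bias}, loss
    return best_params, best_loss

# Hypothetical normalised feature values and sepsis labels.
xs = [-2.0, -1.0, -0.5, 0.5, 1.0, 2.0]
ys = [0, 0, 0, 1, 1, 1]
grid = {"slope": [0.5, 1.0, 2.0, 4.0], "bias": [-1.0, 0.0, 1.0]}
best_params, best_loss = grid_search(xs, ys, grid)
print(best_params, round(best_loss, 4))
```

In practice the loss would be measured on held-out validation data rather than on the training samples, so that the search does not simply reward over-fitting.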
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202111137716.2A CN113871009A (en) | 2021-09-27 | 2021-09-27 | Sepsis prediction system, storage medium and apparatus in intensive care unit |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202111137716.2A CN113871009A (en) | 2021-09-27 | 2021-09-27 | Sepsis prediction system, storage medium and apparatus in intensive care unit |
Publications (1)
Publication Number | Publication Date |
---|---|
CN113871009A (en) | 2021-12-31 |
Family
ID=78991291
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202111137716.2A Pending CN113871009A (en) | 2021-09-27 | 2021-09-27 | Sepsis prediction system, storage medium and apparatus in intensive care unit |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN113871009A (en) |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109119167A (en) * | 2018-07-11 | 2019-01-01 | 山东师范大学 | Pyemia anticipated mortality system based on integrated model |
CN111261282A (en) * | 2020-01-21 | 2020-06-09 | 南京航空航天大学 | Sepsis early prediction method based on machine learning |
CN111951975A (en) * | 2020-08-19 | 2020-11-17 | 哈尔滨工业大学 | Sepsis early warning method based on deep learning model GPT-2 |
CN113057587A (en) * | 2021-03-17 | 2021-07-02 | 上海电气集团股份有限公司 | Disease early warning method and device, electronic equipment and storage medium |
Non-Patent Citations (1)
Title |
---|
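张振宇 (Zhang Zhenyu): "Research on the Application of Improved Ensemble Techniques in Thyroid Ultrasound Image Classification" (改进集成技术在甲状腺超声图像分类中的应用研究), China Master's Theses Full-text Database, Medicine and Health Sciences, no. 10, 15 October 2014 (2014-10-15), pages 11 - 12 * |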
Cited By (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN114628024A (en) * | 2022-02-24 | 2022-06-14 | 重庆市急救医疗中心(重庆市第四人民医院、重庆市急救医学研究所) | Application of biomarker group in preparation of product for auxiliary screening of sepsis, auxiliary screening method and system thereof |
CN115240854A (en) * | 2022-07-29 | 2022-10-25 | 中国医学科学院北京协和医院 | Method and system for processing pancreatitis prognosis data |
CN115240854B (en) * | 2022-07-29 | 2023-10-03 | 中国医学科学院北京协和医院 | Pancreatitis prognosis data processing method and system |
CN115579147A (en) * | 2022-09-26 | 2023-01-06 | 一选(浙江)医疗科技有限公司 | Sepsis recognition model training method, sepsis early warning method and device |
CN115579147B (en) * | 2022-09-26 | 2024-02-09 | 一选(浙江)医疗科技有限公司 | Sepsis recognition model training method, sepsis early warning method and sepsis early warning device |
CN116051476A (en) * | 2022-12-23 | 2023-05-02 | 广州市番禺区中心医院 | Automatic evaluation system for pneumosepsis based on scanning image analysis |
CN116051476B (en) * | 2022-12-23 | 2023-08-18 | 广州市番禺区中心医院 | Automatic evaluation system for pneumosepsis based on scanning image analysis |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN113871009A (en) | Sepsis prediction system, storage medium and apparatus in intensive care unit | |
CN111261282A (en) | Sepsis early prediction method based on machine learning | |
US20180150609A1 (en) | Server and method for predicting future health trends through similar case cluster based prediction models | |
Karabulut et al. | Analysis of cardiotocogram data for fetal distress determination by decision tree based adaptive boosting approach | |
CN112633601B (en) | Method, device, equipment and computer medium for predicting disease event occurrence probability | |
Choubey et al. | GA_J48graft DT: a hybrid intelligent system for diabetes disease diagnosis | |
US20180114123A1 (en) | Rule generation method and apparatus using deep learning | |
CN111243736A (en) | Survival risk assessment method and system | |
CN116153495A (en) | Prognosis survival prediction method for immunotherapy of esophageal cancer patient | |
CN111696670B (en) | Intelligent interpretation method for prenatal fetal monitoring based on deep forest | |
CN108399434A (en) | The analyzing and predicting method of the higher-dimension time series data of feature based extraction | |
CN114927230B (en) | Prognosis decision support system and method for severe heart failure patient based on machine learning | |
CN108053885A (en) | A kind of hemorrhagic conversion forecasting system | |
CN111370126A (en) | ICU mortality prediction method and system based on penalty integration model | |
CN113593708A (en) | Sepsis prognosis prediction method based on integrated learning algorithm | |
CN115474939A (en) | Autism spectrum disorder recognition model based on deep expansion neural network | |
CN110400610B (en) | Small sample clinical data classification method and system based on multichannel random forest | |
CN111930601A (en) | Deep learning-based database state comprehensive scoring method and system | |
CN110522446A (en) | A kind of electroencephalogramsignal signal analysis method that accuracy high practicability is strong | |
CN114756420A (en) | Fault prediction method and related device | |
CN113948206B (en) | Disease stage model fusion method based on multi-level framework | |
KR102607425B1 (en) | Method and apparatus for evaluation questions determination | |
CN114974581A (en) | Method for predicting and evaluating long-term death risk of hyperglycemia crisis | |
US11151457B1 (en) | Predictor generation genetic algorithm | |
KR102634529B1 (en) | Agricultural Price Prediction Apparatus Using Multi-Step Time Series Forecasting and Method for Predicting Agricultural Product Price Using the Same |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||