CN117558452B - MODS risk assessment model construction method, device, equipment and medium - Google Patents
MODS risk assessment model construction method, device, equipment and medium
- Publication number
- CN117558452B (CN202410041125.2A)
- Authority
- CN
- China
- Prior art keywords
- mods
- cytokines
- model
- characteristic
- cytokine
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16H—HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
- G16H50/00—ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
- G16H50/30—ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for calculating health indices; for individual health risk assessment
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
- G06F18/241—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
Abstract
The MODS risk assessment model construction method, device, equipment and medium provided by the application screen cytokines in the peripheral blood of patients with severe trauma through multiple successive levels. A plurality of characteristic cytokines with significant differences are first screened out, and an initial MODS risk assessment model is constructed based on the characteristic cytokines. The initial MODS risk assessment model is then trained on an expanded data set; during training on this large amount of data, a plurality of target cytokines are screened from the characteristic cytokines, and a target MODS risk assessment model is constructed based on the target cytokines. The risk assessment model can accurately and rapidly predict MODS caused by an inflammatory factor storm in the early stage of severe trauma, which helps improve the success rate of treating severe trauma.
Description
Technical Field
The application relates to the technical field of MODS occurrence risk assessment, in particular to a method, a device, equipment and a medium for constructing an MODS risk assessment model.
Background
Trauma is a global public health problem. With advances in medical care, mortality from post-traumatic bleeding has been greatly reduced. However, secondary injury after hemostasis has not been brought under control. Secondary injury in the form of systemic inflammatory response syndrome (SIRS) and multiple organ dysfunction syndrome (MODS) caused by severe trauma is a major cause of poor prognosis and poor therapeutic effect in such patients, accounting for about two-thirds of total trauma mortality, yet there is currently no clinically effective risk assessment strategy for MODS.
Disclosure of Invention
In view of the foregoing, the present application aims to provide a method, a device, equipment and a medium for constructing a MODS risk assessment model, so as to enable early warning of MODS caused by inflammatory factor storms.
Based on the above objects, the present application provides a method for constructing a MODS risk assessment model, including:
acquiring a preprocessed original data set, wherein the original data set comprises a plurality of MODS samples, and each MODS sample comprises a plurality of cytokines and corresponding concentration values thereof;
analyzing the relationship between each cytokine and MODS based on the preprocessed raw data set, and determining a plurality of characteristic cytokines;
constructing an initial MODS risk assessment model based on a plurality of the characteristic cytokines;
training the initial MODS risk assessment model, calculating model inclusion frequencies of a plurality of characteristic cytokines in the training process, and taking the characteristic cytokines with the model inclusion frequencies higher than a preset value as target cytokines;
and constructing a target MODS risk assessment model based on the target cytokines.
Further, the acquiring the preprocessed original data set includes:
acquiring an original data set;
standardizing the concentration value of each cytokine in the original data set to obtain the preprocessed original data set.
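The application does not say which standardization is used. As a minimal sketch, assuming a per-cytokine z-score (subtract the mean, divide by the standard deviation across samples), the step could look like the following. The function name `standardize` and the example cytokine names and values are hypothetical.

```python
from statistics import mean, pstdev

def standardize(samples):
    """Z-score each cytokine across samples (assumed standardization rule).

    `samples` is a list of dicts mapping cytokine name -> concentration.
    """
    names = list(samples[0])
    out = [dict() for _ in samples]
    for name in names:
        column = [s[name] for s in samples]
        mu, sigma = mean(column), pstdev(column)
        for row, value in zip(out, column):
            # A constant column carries no information; map it to 0.
            row[name] = 0.0 if sigma == 0 else (value - mu) / sigma
    return out

# Hypothetical concentrations for three samples.
raw = [
    {"cytokine_A": 10.0, "cytokine_B": 5.0},
    {"cytokine_A": 20.0, "cytokine_B": 5.0},
    {"cytokine_A": 30.0, "cytokine_B": 5.0},
]
scaled = standardize(raw)
```

After this step every non-constant cytokine has mean 0 and unit variance, so regression coefficients fitted later are comparable across cytokines measured on different scales.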
Further, the analyzing the relationship between each cytokine and MODS based on the preprocessed raw data set, and determining a plurality of characteristic cytokines, comprising:
based on the preprocessed original data set, determining an initial probability relation between each cytokine and MODS by using a descriptive statistical method, and determining a plurality of first cytokines based on the initial probability relation;
analyzing a first probability relation between each first cytokine and MODS by applying a univariate logistic regression model, and determining a plurality of second cytokines based on the first probability relation;
and analyzing a second probability relation between the plurality of second cytokines and the MODS by using a multivariate logistic regression model, and determining a plurality of characteristic cytokines based on the second probability relation.
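The univariate stage of this screening can be sketched as follows. This is an illustration under stated assumptions, not the applicant's implementation: each cytokine is fitted alone in a one-variable logistic regression by Newton-Raphson, and it is kept when the absolute Wald z-statistic of its coefficient exceeds 1.96. The threshold, the function names and the toy data are all hypothetical.

```python
import math

def wald_z(x, y, iters=25):
    """Fit P(y=1) = sigmoid(b0 + b1*x) by Newton-Raphson; return (b1, z)."""
    b0 = b1 = 0.0
    for _ in range(iters):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, yi in zip(x, y):
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * xi)))
            w = p * (1.0 - p)
            g0 += yi - p              # score for the intercept
            g1 += (yi - p) * xi       # score for the slope
            h00 += w
            h01 += w * xi
            h11 += w * xi * xi
        det = h00 * h11 - h01 * h01
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    se_b1 = math.sqrt(h00 / det)      # from the inverse information matrix
    return b1, b1 / se_b1

def univariate_screen(columns, y, z_min=1.96):
    """Keep cytokines whose |Wald z| exceeds z_min (assumed cut-off)."""
    return [name for name, x in columns.items() if abs(wald_z(x, y)[1]) > z_min]

# Toy data: cytokine_A tracks MODS status, cytokine_B does not.
y = [0, 0, 1, 0, 1, 0, 1, 1] * 4
columns = {
    "cytokine_A": [-2, -1, -1, 0, 0, 1, 1, 2] * 4,
    "cytokine_B": [1, -1, -1, 1, 1, -1, -1, 1] * 4,
}
second_cytokines = univariate_screen(columns, y)
```

Note that the fit does not converge on perfectly separable data, so a real pipeline would need a guard for that case.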
Further, the analyzing the second probability relation between the plurality of second cytokines and MODS by using the multivariate logistic regression model, and determining the plurality of characteristic cytokines based on the second probability relation includes:
substituting a plurality of groups of second cytokine levels in each MODS sample into the multivariate logistic regression model for calculation, and obtaining a plurality of process calculation models, wherein each process calculation model is used for representing a second probability relation between part of second cytokines and MODS;
and screening optimization process calculation models from the plurality of process calculation models based on the second probability relation between part of the second cytokines and MODS, calculating the model inclusion frequencies of the second cytokines involved in the plurality of optimization process calculation models, and taking the second cytokines with model inclusion frequencies higher than a preset value as the characteristic cytokines.
Further, the constructing an initial MODS risk assessment model based on a plurality of the characteristic cytokines includes:
based on the preprocessed original data set, calculating an initial likelihood estimate of each characteristic cytokine by using a maximum likelihood estimation method, and taking the initial likelihood estimate as the initial model parameter of that characteristic cytokine;
and constructing an initial MODS risk assessment model based on the plurality of characteristic cytokines and initial model parameters corresponding to each characteristic cytokine.
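Because the application states that the initial MODS risk assessment model is a Logistic regression model, the maximum likelihood step can be written out explicitly. The notation is introduced here for illustration and is not taken from the application: $x_{ij}$ is the preprocessed concentration of characteristic cytokine $j$ in sample $i$, $y_i \in \{0, 1\}$ indicates whether MODS occurred, and $\beta_j$ is the model parameter of cytokine $j$.

```latex
p_i = P(y_i = 1 \mid x_i)
    = \frac{1}{1 + \exp\!\bigl(-(\beta_0 + \sum_{j=1}^{k} \beta_j x_{ij})\bigr)}

\ell(\beta) = \sum_{i=1}^{n} \bigl[\, y_i \ln p_i + (1 - y_i) \ln (1 - p_i) \,\bigr],
\qquad
\hat{\beta} = \arg\max_{\beta} \ell(\beta)
```

Under this reading, the initial model parameters are the components of $\hat{\beta}$, and the predicted MODS risk for a new sample is its $p_i$.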
Further, the training the initial MODS risk assessment model, calculating model inclusion frequencies of the plurality of characteristic cytokines in the training process, and taking the characteristic cytokines with model inclusion frequencies higher than a preset value as target cytokines includes:
resampling the preprocessed original data set with replacement to form a plurality of sample sets, wherein each sample set comprises a plurality of MODS samples;
substituting each sample set into the initial MODS risk assessment model for training, and obtaining a plurality of process training models, wherein each process training model is used for representing the relationship between part of characteristic cytokines and MODS;
and screening optimization process training models from the plurality of process training models based on the relation between part of the characteristic cytokines and MODS, calculating the model inclusion frequencies of the characteristic cytokines involved in the plurality of optimization process training models, and taking the characteristic cytokines with model inclusion frequencies higher than a preset value as target cytokines.
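The resampling and inclusion-frequency steps above can be sketched as follows. This is a hedged illustration: the data set is resampled with replacement, a selection rule is applied to each resample, and the fraction of resamples in which each cytokine is retained is its model inclusion frequency. The per-resample rule used here (`mean_gap_selector`, a difference in group means) is only a simple stand-in for the logistic-regression-based model selection described in the application. The number of resamples, the 0.5 preset value and all names are hypothetical.

```python
import random
from collections import Counter

def bootstrap_inclusion_frequency(samples, labels, select, n_boot=200, seed=0):
    """Fraction of bootstrap resamples in which `select` keeps each cytokine."""
    rng = random.Random(seed)
    n = len(samples)
    counts = Counter()
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]  # draw n rows with replacement
        counts.update(select([samples[i] for i in idx], [labels[i] for i in idx]))
    return {name: counts[name] / n_boot for name in samples[0]}

def mean_gap_selector(samples, labels, min_gap=1.0):
    """Stand-in rule: keep cytokines whose group means differ by more than min_gap."""
    pos = [s for s, y in zip(samples, labels) if y == 1]
    neg = [s for s, y in zip(samples, labels) if y == 0]
    if not pos or not neg:
        return []
    kept = []
    for name in samples[0]:
        gap = abs(sum(s[name] for s in pos) / len(pos)
                  - sum(s[name] for s in neg) / len(neg))
        if gap > min_gap:
            kept.append(name)
    return kept

# Toy data: cytokine_A is elevated in MODS samples, cytokine_B is not.
samples = ([{"cytokine_A": 3.0 + 0.1 * (i % 3), "cytokine_B": float(i % 2)} for i in range(10)]
           + [{"cytokine_A": 0.1 * (i % 3), "cytokine_B": float(i % 2)} for i in range(10)])
labels = [1] * 10 + [0] * 10

freq = bootstrap_inclusion_frequency(samples, labels, mean_gap_selector)
preset_value = 0.5  # hypothetical threshold
target_cytokines = [name for name, f in freq.items() if f > preset_value]
```

Fixing the seed makes the resamples reproducible, which matters when the inclusion frequencies are compared against a fixed preset value.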
Further, constructing a target MODS risk assessment model based on the target cytokines, comprising:
calculating target likelihood estimates of each target cytokine by using a maximum likelihood estimation method based on a plurality of sample sets, wherein the target likelihood estimates are used as target model parameters of the corresponding target cytokines;
and constructing a target MODS risk assessment model based on the target cytokines and the target model parameters corresponding to each target cytokine.
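As a rough sketch of this final step, assuming the target model is a multivariable logistic regression over the target cytokines, the parameters can be estimated by maximizing the log-likelihood with batch gradient ascent and then used to score a new sample. The optimizer, learning rate, function names and toy data are illustrative assumptions, not details given in the application.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(X, y, lr=0.5, epochs=2000):
    """Maximum likelihood fit by batch gradient ascent; returns [b0, b1, ..., bk]."""
    n, k = len(X), len(X[0])
    beta = [0.0] * (k + 1)
    for _ in range(epochs):
        grad = [0.0] * (k + 1)
        for row, yi in zip(X, y):
            err = yi - sigmoid(beta[0] + sum(b * v for b, v in zip(beta[1:], row)))
            grad[0] += err
            for j, v in enumerate(row, start=1):
                grad[j] += err * v
        # Step along the mean log-likelihood gradient.
        beta = [b + lr * g / n for b, g in zip(beta, grad)]
    return beta

def mods_risk(beta, row):
    """Predicted probability of MODS for one sample of target cytokine values."""
    return sigmoid(beta[0] + sum(b * v for b, v in zip(beta[1:], row)))

# Toy standardized values for two hypothetical target cytokines.
X = [[-1.0, 0.5], [-1.0, -0.5], [-0.5, 0.0], [-0.5, 0.0],
     [0.5, 0.0], [0.5, 0.0], [1.0, 0.5], [1.0, -0.5]]
y = [0, 0, 0, 1, 0, 1, 1, 1]

beta = fit_logistic(X, y)
high = mods_risk(beta, [1.0, 0.0])
low = mods_risk(beta, [-1.0, 0.0])
```

In this toy data only the first cytokine carries signal, so its fitted coefficient is clearly positive while the second stays near zero.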
Based on the same inventive concept, the present disclosure further provides a device for constructing an MODS risk assessment model, including:
the acquisition module is used for acquiring a preprocessed original data set, wherein the original data set comprises a plurality of MODS samples, and each MODS sample comprises a plurality of cytokines and corresponding concentration values thereof;
a first screening module for analyzing the relationship between each cytokine and MODS and determining a plurality of characteristic cytokines based on the preprocessed raw data set;
the initial model construction module is used for constructing an initial MODS risk assessment model based on a plurality of the characteristic cytokines;
the second screening module is used for training the initial MODS risk assessment model, calculating model inclusion frequencies of a plurality of characteristic cytokines in the training process, and taking the characteristic cytokines with the model inclusion frequencies higher than a preset value as target cytokines;
and the target model construction module is used for constructing a target MODS risk assessment model based on the target cytokines.
Based on the same inventive concept, the present disclosure also provides an electronic device comprising a memory, a processor and a computer program stored on the memory and executable by the processor, the processor implementing the method as described above when executing the computer program.
Based on the same inventive concept, the present disclosure also provides a non-transitory computer-readable storage medium storing computer instructions for causing a computer to perform the method as described above.
From the above, it can be seen that the method for constructing the MODS risk assessment model provided by the present application screens cytokines in the peripheral blood of patients with severe trauma through multiple successive levels. A plurality of characteristic cytokines with significant differences are first screened out, and an initial MODS risk assessment model is constructed based on the characteristic cytokines. The initial MODS risk assessment model is then trained on an expanded data set; during training on this large amount of data, a plurality of target cytokines are screened from the characteristic cytokines, and a target MODS risk assessment model is constructed based on the target cytokines. The risk assessment model can accurately and rapidly predict MODS caused by an inflammatory factor storm in the early stage of severe trauma, which helps improve the success rate of treating severe trauma.
In addition, the initial MODS risk assessment model is a Logistic regression model. Compared with target cytokines screened by using a random forest model, a Lasso Logistic regression model or a GBDT model, the target cytokines screened by the Logistic regression model provide better MODS early-warning performance.
Drawings
In order to more clearly illustrate the technical solutions of the present application or related art, the drawings that are required to be used in the description of the embodiments or related art will be briefly described below, and it is apparent that the drawings in the following description are only embodiments of the present application, and other drawings may be obtained according to these drawings without inventive effort to those of ordinary skill in the art.
FIG. 1 is a flow chart of a method for constructing a MODS risk assessment model according to an embodiment of the present application;
FIG. 2 is a ROC graph of Example 2 of the present application;
FIG. 3 is a ROC graph of Example 3 of the present application;
FIG. 4 is a ROC graph of Example 4 of the present application;
FIG. 5 is a ROC graph of Example 5 of the present application;
FIG. 6 is a ROC graph of target cytokines in the examples of the present application;
FIG. 7 is a differential heat map of the 42 characteristic cytokines involved in Example 2 of the present application;
FIG. 8 is a schematic structural diagram of a MODS risk assessment model construction device according to an embodiment of the present application;
FIG. 9 is a schematic diagram of an electronic device according to an embodiment of the present application.
Detailed Description
For the purposes of making the objects, technical solutions and advantages of the present application more apparent, the present application will be further described in detail below with reference to the accompanying drawings.
It should be noted that unless otherwise defined, technical or scientific terms used in the embodiments of the present application should be given the ordinary meaning as understood by one of ordinary skill in the art to which the present application belongs. The terms "first," "second," and the like, as used in embodiments of the present application, do not denote any order, quantity, or importance, but rather are used to distinguish one element from another. The word "comprising" or "comprises", and the like, means that the element or item preceding the word covers the elements or items listed after the word and their equivalents, but does not exclude other elements or items. The terms "connected" or "coupled," and the like, are not limited to physical or mechanical connections, but may include electrical connections, whether direct or indirect. "Upper", "lower", "left", "right", etc. are used merely to indicate relative positional relationships, which may change accordingly when the absolute position of the described object changes.
The immune system response after trauma is complex. The body's immune response to blood loss and tissue damage differs from its response to infection, and it is thought that immune system abnormalities lead to SIRS and, in turn, to MODS. Early after trauma, necrotic cells at the damaged site release damage-associated molecular patterns (DAMPs) under the action of mechanical injury factors; these activate immune cells such as neutrophils and monocytes and initiate defensive protection mechanisms. In some cases, however, large amounts of cytokines and inflammatory mediators are produced and released almost instantaneously, triggering an inflammatory cascade. The resulting immune disorder causes MODS and even death, which greatly hinders treatment.
An inflammatory factor storm, or cytokine release syndrome, is a life-threatening systemic inflammatory response involving elevated levels of circulating cytokines and excessive activation of immune cells. It can be triggered by various therapies, pathogens, cancers, autoimmune diseases, and the like. In essence it is an excessive immune reaction of the body to an external stimulus: cytokines are released in uncontrolled, massive amounts, leading to systemic inflammation. In recent years, a great deal of research has sought to explain SIRS-induced MODS by exploring the inflammatory factor storm in related diseases. However, the inflammatory factor storm currently lacks grading standards and definite clinical indices, and there is no related research on the inflammatory factor storm after severe trauma/hemorrhagic shock. The cytokine storm caused by excessive cytokine release after severe trauma/hemorrhagic shock is the pathophysiological basis for inducing SIRS and remote organ injury, and is an important factor in causing SIRS and MODS after trauma.
Based on the above, the application screens inflammatory factor compositions composed of a plurality of target cytokines based on cytokines in peripheral blood of a patient with severe trauma, and builds a MODS risk assessment model so as to be capable of early warning of MODS possibly caused by inflammatory factor storm in early stage of severe trauma.
Embodiments of the present application are described in detail below with reference to the accompanying drawings.
Referring to fig. 1, the application provides a method for constructing a MODS risk assessment model, which specifically includes the following steps:
S100, acquiring a preprocessed original data set, wherein the original data set comprises a plurality of MODS samples, and each MODS sample comprises a plurality of cytokines and corresponding concentration values thereof.
As an exemplary embodiment, each MODS sample contains the cytokines in the peripheral blood of one study subject and their corresponding concentration values. The concentration value of each cytokine in the subject's peripheral blood can be measured with a flow analyzer, and the samples of a plurality of subjects together form the original data set.
As an exemplary embodiment, the preprocessing of the original data set may include removing noisy data, averaging multiple concentration values of the same cytokine in the same sample, correcting cytokine concentration values, and the like.
S200, analyzing the relation between each cytokine and MODS based on the preprocessed original data set, and determining a plurality of characteristic cytokines.
In this step, the cytokines most strongly related to MODS are screened out based on the preprocessed original data set; the screened-out cytokines are called characteristic cytokines.
Illustratively, the screening process may include classification of cytokines, screening based on single cytokines, screening based on multiple cytokines, and the like. The screening process may also be regarded as ranking the plurality of cytokines by importance according to their relationship with the occurrence of MODS, and taking the top-ranked cytokines as characteristic cytokines.
S300, constructing an initial MODS risk assessment model based on a plurality of the characteristic cytokines.
In the step, a risk assessment model constructed based on the characteristic cytokines screened in the step S200 is only used as an initial assessment model because of the limited number of samples of the original data set, and a basis is provided for training with samples with large data volume in the subsequent step S400.
S400, training the initial MODS risk assessment model, calculating model inclusion frequencies of a plurality of characteristic cytokines in the training process, and taking the characteristic cytokines with the model inclusion frequencies higher than a preset value as target cytokines.
In this step, a batch of sample sets may be obtained by resampling with replacement, where each sample set includes a plurality of MODS samples, and the initial MODS risk assessment model is trained on this batch of sample sets. Each training run takes the factor levels of a subset of the characteristic cytokines as model variables and, based on the sample data of one sample set, calculates the relationship between that subset and the occurrence of MODS. The characteristic cytokines that appear as model variables across the multiple training runs can then be counted to give the model inclusion frequency of each characteristic cytokine. The higher the model inclusion frequency, the stronger the relationship between the cytokine and the occurrence of MODS, so characteristic cytokines whose model inclusion frequency is higher than a preset value can be taken as target cytokines.
S500, constructing a target MODS risk assessment model based on the target cytokines.
In this step, a target risk assessment model is constructed based on the target cytokines screened in step S400. The assessment model takes the concentration values of the target cytokines as variables and outputs the probability of MODS occurring. The target MODS risk assessment model can be used when analyzing the peripheral blood of subsequent trauma patients: the concentration value of each cytokine in the patient's peripheral blood is measured with a flow analyzer, and the concentration values of the target cytokines are extracted and substituted into the target MODS risk assessment model to obtain the probability of the patient developing MODS, which provides a useful basis for the patient's subsequent treatment.
In some embodiments, step S100 obtains a preprocessed raw data set, comprising:
acquiring an original data set;
standardizing the concentration value of each cytokine in the original data set to obtain a preprocessed data set.
Wherein the concentration value of each cytokine is standardized according to the following formula:
$$z_i = \frac{x_i - \mu_i}{\sigma_i}$$
wherein $x_i$ is the original measurement of the ith cytokine, $\mu_i$ and $\sigma_i$ are the mean and standard deviation of the concentration values of the ith cytokine, respectively, and $z_i$ is the standard concentration value of the ith cytokine.
In addition, before the concentration values are standardized, noisy data may first be removed from them, for example by removing maxima and/or minima from the concentration data.
It should be noted that, to ensure the accuracy of the detection data, the peripheral blood of each subject is tested several times in parallel, so a cytokine may have several concentration values in one sample. To prevent this from affecting the subsequent steps, the concentration values of each cytokine in each sample are standardized.
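The preprocessing described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the applicant's implementation: replicate readings are assumed to be averaged within a sample before standardization, the population standard deviation is assumed (the text does not say which form is used), and the function names and toy concentrations are hypothetical.

```python
from statistics import mean, pstdev

def average_replicates(sample):
    """Collapse parallel readings of each cytokine into one value per sample."""
    return {name: mean(values) for name, values in sample.items()}

def standardize(samples):
    """Z-score each cytokine across samples: (value - mean) / standard deviation."""
    names = list(samples[0].keys())
    stats = {}
    for name in names:
        column = [s[name] for s in samples]
        # Assumption: population standard deviation; a constant column would divide by zero.
        stats[name] = (mean(column), pstdev(column))
    return [
        {name: (s[name] - stats[name][0]) / stats[name][1] for name in names}
        for s in samples
    ]

# Toy data (hypothetical): three samples, two parallel readings per cytokine.
raw = [
    {"IL_6": [10.0, 12.0], "IL_8": [5.0, 5.0]},
    {"IL_6": [20.0, 22.0], "IL_8": [7.0, 9.0]},
    {"IL_6": [30.0, 32.0], "IL_8": [11.0, 11.0]},
]
averaged = [average_replicates(s) for s in raw]
standardized = standardize(averaged)
```

After this step each cytokine has mean 0 and unit standard deviation across the samples, so concentrations measured on very different scales become comparable as model variables.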
In some embodiments, step S200 analyzes the relationship between each cytokine and MODS based on the pre-processed raw dataset, and determines a plurality of characteristic cytokines, including:
S201, determining an initial probability relation between each cytokine and MODS by using a descriptive statistical method based on the preprocessed original data set, and determining a plurality of first cytokines based on the initial probability relation.
For example, the descriptive statistical method classifies the plurality of cytokines according to whether each is an inflammatory factor. Because the occurrence of MODS is closely related to inflammatory factors, the inflammatory factors among the plurality of cytokines can be marked. Here, whether a cytokine is an inflammatory factor serves as the initial probability relationship between that cytokine and the occurrence of MODS: inflammatory factors have the greater initial probability relationship, and the inflammatory factors among the plurality of cytokines are selected as the first cytokines.
Illustratively, the descriptive statistical method describes the difference in cytokine levels between the patients and the healthy volunteers among the study subjects. The greater this difference for a cytokine, the greater its initial probability relationship with MODS; that is, the difference is the initial probability relationship in this step, and cytokines whose difference is greater than a preset value are determined as the first cytokines.
For example, the screening for inflammatory factors and the description of differences in cytokine levels described above may be combined to determine the plurality of first cytokines.
S202, analyzing a first probability relation between each first cytokine and MODS by applying a univariate logistic regression model, and determining a plurality of second cytokines based on the first probability relation.
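Illustratively, the univariate logistic regression model is as follows:
$$\log\frac{p}{1-p} = \beta_0 + \beta_i z_i$$
where $p$ is the probability of MODS occurring at a given cytokine level, $z_i$ is the standard concentration value of the ith cytokine, $\beta_0$ is a model parameter, and $\beta_i$ is the model parameter of the ith cytokine; in this step, $\beta_0$ and $\beta_i$ may be random numbers.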
The univariate logistic regression model is executed in this step as follows: the standard concentration value of the first cytokine in each preprocessed sample is substituted into the model to obtain the probability of MODS associated with that first cytokine. The probabilities for the plurality of first cytokines are then ranked, and the first cytokines whose probability is higher than a preset value are determined as the second cytokines; that is, this step screens out, from the plurality of first cytokines, the second cytokines more strongly related to MODS.
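As a rough sketch of how a univariate logistic model of the form log(p/(1-p)) = b0 + b1*x might be fitted for a single cytokine, the example below uses plain gradient ascent on the log-likelihood. The optimizer, learning rate, step count and toy data are assumptions made for illustration; the text does not specify the fitting routine.

```python
import math

def sigmoid(t):
    return 1.0 / (1.0 + math.exp(-t))

def fit_univariate_logistic(x, y, lr=0.1, steps=5000):
    """Fit log(p/(1-p)) = b0 + b1*x by gradient ascent on the mean log-likelihood."""
    b0, b1 = 0.0, 0.0
    n = len(x)
    for _ in range(steps):
        g0 = g1 = 0.0
        for xi, yi in zip(x, y):
            err = yi - sigmoid(b0 + b1 * xi)  # gradient contribution of one sample
            g0 += err
            g1 += err * xi
        b0 += lr * g0 / n
        b1 += lr * g1 / n
    return b0, b1

# Toy standardized concentrations of one cytokine and MODS outcomes (1 = MODS).
x = [-1.5, -1.0, -0.5, 0.0, 0.3, 0.8, 1.2, 1.6]
y = [0, 0, 0, 1, 0, 1, 1, 1]
b0, b1 = fit_univariate_logistic(x, y)
p_high = sigmoid(b0 + b1 * 1.0)    # estimated MODS probability at a high level
p_low = sigmoid(b0 + b1 * -1.0)    # estimated MODS probability at a low level
```

Repeating such a fit for every first cytokine gives one probability curve per cytokine, which could then be ranked against a preset value as the step describes.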
S203, analyzing a second probability relation between a plurality of second cytokines and MODS by using a multivariate logistic regression model, and determining a plurality of characteristic cytokines based on the second probability relation.
This step S203 may be further described as:
S2031, establishing a plurality of second cytokine subsets among a plurality of second cytokines in each of said MODS samples;
The step is to select part of all the second cytokines in the preprocessed original data set to form a plurality of second cytokine subsets.
Illustratively, there are 90 second cytokines in total, some of which may be randomly selected to form a second cytokine subset; the number of second cytokines in each subset may be the same or different.
S2032, substituting the factor level of each second cytokine subset into the multivariate logistic regression model for calculation, and obtaining a plurality of process calculation models, wherein each process calculation model is used for representing a second probability relation between a second cytokine subset and MODS;
In this step, the standard concentration values of the second cytokine subsets in each MODS sample are substituted into the multivariate logistic regression model for calculation. Each second cytokine subset yields one process calculation model, from which the probability of MODS associated with that subset can be obtained.
S2033, screening an optimization process calculation model from a plurality of process calculation models based on a second probability relation between each second cytokine subset and MODS, calculating a model inclusion frequency of the second cytokines related in the plurality of optimization process calculation models, and taking the second cytokines with the model inclusion frequency higher than a preset value as characteristic cytokines.
For example, all the process calculation models may be ranked by the occurrence probability of MODS, the process calculation models whose occurrence probability of MODS is higher than a preset value may be taken as the optimization process calculation models, and the second cytokines whose model inclusion frequency across all the optimization process calculation models is higher than a preset value may be taken as the characteristic cytokines.
Illustratively, the multivariate logistic regression model is as follows:
$$\log\frac{p}{1-p} = \beta_0 + \sum_{i} \beta_i z_i$$
where $p$ is the probability of MODS occurring at given cytokine levels, $z_i$ is the standard concentration value of the ith cytokine in the subset, $\beta_0$ is a model parameter, and $\beta_i$ is the model parameter of the ith cytokine; $\beta_0$ and $\beta_i$ may be random numbers.
In this embodiment, the plurality of cytokines in the original data set are screened level by level based on their relationship with the occurrence of MODS: first cytokines are determined from the plurality of cytokines by a descriptive statistical method, second cytokines are then determined from the first cytokines by a univariate logistic regression model, and characteristic cytokines are finally determined from the second cytokines by a multivariate logistic regression model.
In some embodiments, the constructing an initial MODS risk assessment model based on a plurality of the characteristic cytokines of step S300 comprises:
S301, calculating an initial likelihood estimate for each characteristic cytokine by using a maximum likelihood estimation method based on the preprocessed original data set, and taking the initial likelihood estimate as the initial model parameter of that characteristic cytokine.
For example, the risk assessment model to be built in this embodiment is a multivariate logistic regression model. A plurality of characteristic cytokines have been determined in step S203, and their standard concentration values (or "factor levels") can be used as the model variables; this step estimates the model parameters by maximum likelihood.
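Illustratively, the initial model parameters of each characteristic cytokine are solved by Maximum Likelihood Estimation (MLE) as follows:
$$L(\beta) = \prod_{i=1}^{n} p_i^{\,y_i} (1 - p_i)^{1 - y_i}, \qquad \hat{\beta} = \arg\max_{\beta} L(\beta)$$
where $L(\beta)$ is the likelihood function of the model parameter $\beta$, $\hat{\beta}$ is the maximum likelihood estimate of the model parameter $\beta$ (i.e. the initial likelihood estimate), $p_i$ is the probability of MODS given by the model for the ith sample, $y_i$ is the actual outcome of the ith sample, and $n$ is the total number of samples.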
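To make the quantity being maximized concrete, the sketch below evaluates the Bernoulli log-likelihood of a logistic model for two candidate parameter vectors. The data and parameter values are hypothetical, and the function name is illustrative; maximum likelihood estimation searches for the parameter vector that makes this value as large as possible.

```python
import math

def log_likelihood(beta, X, y):
    """Bernoulli log-likelihood of a logistic model; beta[0] is the intercept."""
    total = 0.0
    for row, yi in zip(X, y):
        eta = beta[0] + sum(b * v for b, v in zip(beta[1:], row))
        p = 1.0 / (1.0 + math.exp(-eta))
        total += yi * math.log(p) + (1 - yi) * math.log(1.0 - p)
    return total

# Toy standardized levels of two cytokines (rows = samples) and MODS outcomes.
X = [[-1.0, 0.2], [-0.4, -0.8], [0.5, 0.1], [1.2, 0.9]]
y = [0, 0, 1, 1]
ll_null = log_likelihood([0.0, 0.0, 0.0], X, y)    # every probability equals 0.5
ll_better = log_likelihood([0.0, 2.0, 0.0], X, y)  # weight on the first cytokine
```

Here `ll_better` exceeds `ll_null`, so the second parameter vector explains the toy outcomes better than a model that ignores the cytokine levels.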
S302, constructing an initial MODS risk assessment model based on a plurality of characteristic cytokines and initial model parameters corresponding to each characteristic cytokine.
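Illustratively, when there are i characteristic cytokines, the initial MODS risk assessment model is as follows:
$$\log\frac{p}{1-p} = \hat{\beta}_0 + \hat{\beta}_1 z_1 + \hat{\beta}_2 z_2 + \cdots + \hat{\beta}_i z_i$$
where $p$ is the probability of MODS occurring, $z_1, \ldots, z_i$ are the standard concentration values of the characteristic cytokines, and $\hat{\beta}_0, \ldots, \hat{\beta}_i$ are the initial model parameters.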
in this embodiment, the risk assessment model constructed based on the characteristic cytokines screened in step S200 can only be used as an initial assessment model due to the limited number of samples in the original data set, and provides a basis for training with samples with large data volume in the subsequent step S400.
In some embodiments, the training the initial MODS risk assessment model in step S400, and calculating a model inclusion frequency of a plurality of the characteristic cytokines in the training process, and taking the characteristic cytokines with the model inclusion frequency higher than a preset value as target cytokines includes:
S401, resampling the preprocessed original data set with replacement to form a plurality of sample sets, wherein each sample set comprises a plurality of MODS samples;
For example, the Bootstrap method may be used to resample the data with replacement. Specifically, the preprocessed original data set may be resampled 1000 times to generate 1000 Bootstrap sample sets, where each sample set includes a plurality of MODS samples, and the number of MODS samples in each sample set may be the same or different.
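A minimal sketch of Bootstrap resampling with replacement is given below. It assumes, for simplicity, that every sample set has the same size as the original data set (the text allows the sizes to differ), and the function name, seed and integer stand-ins for MODS samples are hypothetical.

```python
import random

def bootstrap_sample_sets(samples, n_sets=1000, seed=0):
    """Draw n_sets resamples with replacement, each as large as the original data."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(samples)
    return [[rng.choice(samples) for _ in range(n)] for _ in range(n_sets)]

samples = list(range(35))  # stand-ins for 35 MODS samples (sample indices)
sample_sets = bootstrap_sample_sets(samples)
```

Because sampling is with replacement, a given sample set typically contains some MODS samples more than once and omits others, which is what makes the repeated training runs differ from one another.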
S402, establishing a plurality of characteristic cytokine subsets in a plurality of characteristic cytokines in each MODS sample.
The step is to select part of all characteristic cytokines in a Bootstrap sample set to form a plurality of characteristic cytokine subsets.
Illustratively, there are 44 characteristic cytokines, some of which may be randomly selected to form a characteristic cytokine subset; the number of characteristic cytokines in each subset may be the same or different.
S403, substituting the factor level of each characteristic cytokine subset into the initial MODS risk assessment model for training, and obtaining a plurality of process training models, wherein each process training model is used for representing the relation between a characteristic cytokine subset and MODS;
In this step, the standard concentration values of the characteristic cytokine subsets in each MODS sample of each sample set are substituted into the initial MODS risk assessment model for training, yielding a plurality of process training models and the probability of MODS associated with each characteristic cytokine subset. In other words, training the initial MODS risk assessment model on each sample set yields a plurality of process training models.
S404, screening optimization process training models from a plurality of process training models based on the relation between the characteristic cytokine subset and the MODS, calculating model inclusion frequency of the characteristic cytokines related in the plurality of optimization process training models, and taking the characteristic cytokines with the model inclusion frequency higher than a preset value as the target cytokines.
For example, a process training model with a probability of occurrence of MODS higher than a preset value may be extracted in each sample set as an optimization process training model, and a plurality of characteristic cytokines with a frequency of model inclusion higher than a preset value in all the optimization process training models extracted in all the sample sets may be taken as target cytokines.
For example, the multiple process training models in all sample sets may be ranked in terms of occurrence probability of MODS, the process training model with occurrence probability of MODS higher than a preset value is used as an optimization process training model, and multiple characteristic cytokines with frequency higher than the preset value are included in all the optimization process training models as target cytokines.
The model inclusion frequency can be understood as the probability of occurrence in all optimization process training models.
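The model inclusion frequency can be illustrated with a short sketch: it is computed here as the fraction of retained models in which a cytokine appears as a variable. The variable sets, the preset value of 0.6 and the function name are hypothetical and chosen only for illustration.

```python
from collections import Counter

def inclusion_frequency(selected_models):
    """Fraction of retained models in which each cytokine appears as a variable."""
    counts = Counter(name for model in selected_models for name in set(model))
    total = len(selected_models)
    return {name: counts[name] / total for name in counts}

# Hypothetical variable sets of four retained (optimization) process training models.
models = [
    {"IL_6", "IL_8", "BLC"},
    {"IL_6", "IL_13"},
    {"IL_6", "IL_8"},
    {"IL_8", "BLC", "IL_13"},
]
freq = inclusion_frequency(models)
preset_value = 0.6  # assumed threshold, not taken from the text
targets = sorted(name for name, f in freq.items() if f > preset_value)
```

In this toy case IL_6 and IL_8 each appear in three of the four models (frequency 0.75) and would be kept as target cytokines, while BLC and IL_13 (frequency 0.5) would not.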
In this embodiment, a batch of sample sets is obtained by resampling with replacement, each sample set including a plurality of MODS samples, and the initial MODS risk assessment model is trained on this batch of sample sets. Each training run takes the factor levels of a subset of the characteristic cytokines as model variables and, based on the sample data of one sample set, calculates the relationship between that subset and the occurrence of MODS. The characteristic cytokines that appear as model variables across the multiple training runs can then be counted to give the model inclusion frequency of each characteristic cytokine. The higher the model inclusion frequency, the stronger the relationship between the cytokine and the occurrence of MODS, so characteristic cytokines whose model inclusion frequency is higher than a preset value can be taken as target cytokines. The model training process is thus also the process of screening the target cytokines; because the sample size in this embodiment is large and the screening is always based on the relationship between subsets of characteristic cytokines and MODS, the target cytokines screened out can be more accurate.
In some embodiments, constructing a target MODS risk assessment model based on the target cytokine comprises:
calculating target likelihood estimates of each target cytokine by using a maximum likelihood estimation method based on a plurality of sample sets, wherein the target likelihood estimates are used as target model parameters of the corresponding target cytokines;
Illustratively, the target likelihood estimate of each target cytokine is calculated on the Bootstrap sample sets as follows:
$$\hat{\beta} = \arg\max_{\beta} L_b(\beta)$$
where $L_b(\beta)$ is the likelihood function of the b-th Bootstrap sample set, and $\hat{\beta}$ is the maximum likelihood estimate of the model parameter $\beta$ (i.e., the target likelihood estimate).
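The target MODS risk assessment model is then constructed based on the target cytokines and the target model parameter corresponding to each target cytokine.
Illustratively, when there are 6 target cytokines, the target MODS risk assessment model is as follows:
$$\log\frac{p}{1-p} = \hat{\beta}_0 + \hat{\beta}_1 z_1 + \hat{\beta}_2 z_2 + \cdots + \hat{\beta}_6 z_6$$
where $p$ is the probability of MODS occurring, $z_1, \ldots, z_6$ are the standard concentration values of the 6 target cytokines, and $\hat{\beta}_0, \ldots, \hat{\beta}_6$ are the target model parameters.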
In this embodiment, a target risk assessment model is constructed based on the target cytokines screened in step S400. The assessment model takes the concentration values of the target cytokines as variables and outputs the probability of MODS occurring. The target MODS risk assessment model can be used when analyzing the peripheral blood of subsequent trauma patients: the concentration value of each cytokine in the patient's peripheral blood is measured with a flow analyzer, and the concentration values of the target cytokines are extracted and substituted into the target MODS risk assessment model to obtain the probability of the patient developing MODS, which provides a useful basis for the patient's subsequent treatment.
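How a fitted logistic model might be applied to a new patient can be sketched as follows. The coefficients, intercept, cutoff and patient values below are purely hypothetical placeholders, not the parameters reported in the tables of this application, and only three cytokines are used to keep the example short.

```python
import math

def mods_probability(levels, intercept, coefficients):
    """Logistic model: p = 1 / (1 + exp(-(b0 + sum_j b_j * x_j)))."""
    eta = intercept + sum(coefficients[name] * levels[name] for name in coefficients)
    return 1.0 / (1.0 + math.exp(-eta))

# Hypothetical parameters and cutoff, chosen only to illustrate the calculation.
coefficients = {"IL_6": 0.9, "IL_8": 0.7, "BLC": -0.4}
intercept = -0.2
cutoff = 0.5

patient = {"IL_6": 1.1, "IL_8": 0.6, "BLC": -0.3}  # standardized concentrations
p = mods_probability(patient, intercept, coefficients)
high_risk = p > cutoff
```

With these placeholder values the linear predictor is 1.33 and the predicted probability is roughly 0.79, so this hypothetical patient would be flagged as higher risk.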
It should be noted that the present application constructs the target MODS risk assessment model in the following four ways: (1) using a Logistic regression model combined with the Bootstrap method; (2) using a random forest method combined with the Bootstrap method; (3) using a Lasso logistic regression model; (4) using a GBDT model combined with the Bootstrap method. The construction process of way (1) is the process described in the above embodiments. The following examples describe the construction result of way (1) and the construction processes and results of the other three ways, and evaluate the MODS prediction performance of the target MODS risk assessment models constructed in the four ways.
Example 1 inclusion criteria and exclusion criteria for study subjects
The inclusion criteria for the study subjects mentioned in the previous embodiments were:
(1) Multiple-trauma patients with ISS > 15;
(2) Signed informed consent.
The exclusion criteria for the study subjects were:
(1) Definitively diagnosed craniocerebral injury with GCS >9 or AIS > 3;
(2) A definite focus of infection or an infectious disease, with a high risk of infection;
(3) Pregnant or lactating women, and patients with cardiovascular or cerebrovascular disease, acute episodes of respiratory disease, HIV, etc.;
(4) Serious primary diseases affecting survival (e.g., unresectable tumors, hematological diseases, etc.);
(5) Use of immunosuppressants and/or cytotoxic drugs within the last 6 months;
(6) Patients with mental disorders;
(7) Enrollment in another clinical trial within the previous 30 days;
(8) Patients judged by the investigator to be unable to complete, or unsuitable to participate in, the trial.
The demographic characteristics of the selected subjects are shown in table 1 based on the inclusion criteria and exclusion criteria of the subjects above.
Table 1 demographic characteristics of study subjects
The follow-up plan for the patients among the study subjects was: once daily for the first 7 days after admission, then once at day 14 and once at day 28. The follow-up content includes: (1) whether MODS occurs, where the clinical diagnostic criterion for organ dysfunction in trauma patients is a Sequential Organ Failure Assessment (SOFA) score of 5 points or more on two or more consecutive days (excluding the first 48 hours); (2) 28-day survival rate; (3) mechanical ventilation time, length of ICU stay, hospitalization costs, etc.
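The SOFA-based criterion for MODS stated above can be written as a small labelling function. This sketch assumes one SOFA score per day starting on the day of admission, and maps "the first 48 hours" to the first two daily scores; both are simplifying assumptions, and the function name is hypothetical.

```python
def has_mods(daily_sofa, threshold=5, min_run=2, skip_days=2):
    """True if SOFA >= threshold on min_run consecutive days after the first skip_days days."""
    run = 0
    for score in daily_sofa[skip_days:]:
        run = run + 1 if score >= threshold else 0
        if run >= min_run:
            return True
    return False

# Hypothetical daily SOFA scores for three patients.
patient_a = [8, 7, 3, 5, 6, 2]   # days 4 and 5 both reach 5 or more
patient_b = [8, 7, 3, 5, 4, 6]   # high scores after 48 h are never consecutive
patient_c = [6, 6, 5, 2, 1]      # high scores fall mostly within the first 48 h
labels = [has_mods(p) for p in (patient_a, patient_b, patient_c)]
```

Only the first hypothetical patient meets the criterion, which gives the binary outcome used as the dependent variable in the regression models.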
As shown in Table 1, peripheral blood was collected on the day of admission from each of 35 patients who met the inclusion criteria, and 17 healthy volunteers served as controls. In this step, the cytokines and their concentration values in the peripheral blood of the 35 patients on the day of admission and of the 17 healthy volunteers were collected as MODS samples to form the original data set.
Example 2 Target MODS risk assessment model constructed using a Logistic regression model combined with the Bootstrap method
The construction process, i.e., the process described in the above embodiment, can also be described in the following manner:
1. data preparation and preprocessing:
Each cytokine concentration measurement was standardized:
$$z_i = \frac{x_i - \mu_i}{\sigma_i}$$
wherein $x_i$ is the original measurement of the ith cytokine, $\mu_i$ and $\sigma_i$ are the mean and standard deviation of the concentration values of the ith cytokine, respectively, and $z_i$ is the standard concentration value of the ith cytokine.
2. Primary screening:
Descriptive statistical methods were used for preliminary analysis, and 120 first cytokines were screened out.
The relationship between each first cytokine and MODS was analyzed using a univariate Logistic regression model, and 90 second cytokines were screened out:
$$\log\frac{p}{1-p} = \beta_0 + \beta_i z_i$$
where $p$ is the probability of MODS occurring at a given cytokine level, $z_i$ is the standard concentration value of the ith cytokine, $\beta_0$ is a model parameter, and $\beta_i$ is the model parameter of the ith cytokine.
3. Characteristic cytokine selection:
The relationship between the plurality of second cytokines and the occurrence of MODS was analyzed using a multivariate Logistic regression model, and 44 characteristic cytokines were screened out, as shown in Table 2:
$$\log\frac{p}{1-p} = \beta_0 + \sum_{i} \beta_i z_i$$
where $p$ is the probability of MODS occurring at given cytokine levels, $z_i$ is the standard concentration value of the ith cytokine, $\beta_0$ is a model parameter, and $\beta_i$ is the model parameter of the ith cytokine.
The Wilcoxon-Mann-Whitney test was also used to compare whether the difference in the factor levels of these 44 cytokines between healthy volunteers and severe trauma patients was statistically significant. The results indicate that, of the 44 cytokines in Table 2, the differences for the 42 cytokines other than IL_3 and IL_33 were statistically significant; the differential heat map is shown in FIG. 7.
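The statistic underlying the Wilcoxon-Mann-Whitney test can be sketched in a few lines. The example computes only the U statistic and the corresponding probability that a patient value exceeds a control value; it does not compute a p-value, and the concentrations are hypothetical.

```python
def mann_whitney_u(a, b):
    """U statistic for group a: number of pairs (x in a, y in b) with x > y; ties count 0.5."""
    u = 0.0
    for x in a:
        for y in b:
            if x > y:
                u += 1.0
            elif x == y:
                u += 0.5
    return u

patients = [12.1, 15.3, 9.8, 20.4, 18.0]  # hypothetical levels of one cytokine
controls = [5.2, 7.9, 6.1, 9.8]
u = mann_whitney_u(patients, controls)
effect = u / (len(patients) * len(controls))  # share of pairs where the patient is higher
```

For this toy data U is 19.5 out of a possible 20. In practice a library routine such as `scipy.stats.mannwhitneyu` would normally be used to obtain the p-value for each cytokine.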
4. Establishing an initial MODS risk assessment model:
a Logistic regression model (i.e., an initial MODS risk assessment model) was constructed using the selected 44 characteristic cytokines as variables.
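The model parameters were solved using the Maximum Likelihood Estimation (MLE) method:
$$L(\beta) = \prod_{i=1}^{n} p_i^{\,y_i} (1 - p_i)^{1 - y_i}, \qquad \hat{\beta} = \arg\max_{\beta} L(\beta)$$
where $L(\beta)$ is the likelihood function of the model parameter $\beta$, $\hat{\beta}$ is the maximum likelihood estimate of the model parameter $\beta$, $p_i$ is the probability of MODS given by the model for the ith sample, $y_i$ is the actual outcome of the ith sample, and $n$ is the total number of samples.
The initial MODS risk assessment model was constructed as follows:
$$\log\frac{p}{1-p} = \hat{\beta}_0 + \hat{\beta}_1 z_1 + \hat{\beta}_2 z_2 + \cdots + \hat{\beta}_{44} z_{44}$$
where $p$ is the probability of MODS occurring and $z_1, \ldots, z_{44}$ are the standard concentration values of the 44 characteristic cytokines.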
5. training and optimizing an initial MODS risk assessment model:
the original data set is resampled 1000 times by using the Bootstrap method, and 1000 Bootstrap sample sets are generated. The initial MODS risk assessment model was trained based on a Bootstrap sample set, and 6 target cytokines were selected from the 44 characteristic cytokines, see Table 2 in detail.
6. Construction of target MODS risk assessment model
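A Logistic regression model was constructed using the selected 6 target cytokines as variables, and the parameters of the logistic regression model were calculated using the Bootstrap sample sets:
$$\hat{\beta} = \arg\max_{\beta} L_b(\beta)$$
where $L_b(\beta)$ is the likelihood function of the b-th Bootstrap sample set, and $\hat{\beta}$ is the maximum likelihood estimate of the model parameter $\beta$.
The target MODS risk assessment model was constructed as follows:
$$\log\frac{p}{1-p} = \hat{\beta}_0 + \hat{\beta}_1 z_1 + \hat{\beta}_2 z_2 + \cdots + \hat{\beta}_6 z_6$$
where $p$ is the probability of MODS occurring and $z_1, \ldots, z_6$ are the standard concentration values of the 6 target cytokines; the numerical values of the model parameters are shown in Table 3.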
ROC curve analysis of the early-warning efficacy of the 6 selected target cytokines for MODS: the standard concentration values of the 6 target cytokines of a subject (IL_6, IL_8, IFN_gama, sTNF_RII, BLC and IL_1RA) are substituted into the target MODS risk assessment model to calculate a predicted value. The cutoff for this predicted value is 0.8208166; that is, a patient whose predicted value is above the cutoff is at higher risk of developing MODS, and a patient whose predicted value is below the cutoff is at lower risk.
ROC curves were plotted based on the standard concentration values of the 6 target cytokines of the subjects (IL_6, IL_8, IFN_gama, sTNF_RII, BLC and IL_1RA), as shown in FIG. 2. The results show an area under the ROC curve of 0.9722 for the 6 target cytokines, an agreement rate of 0.9118 (95% CI: 0.7632, 0.9814), a kappa of 0.8247, a specificity of 1.0000 and a sensitivity of 0.8333. The high area under the ROC curve, together with the good specificity and sensitivity, indicates that the combined predictive performance of the 6 factors is good, i.e. the model has good early-warning efficacy for MODS.
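The evaluation metrics reported above can be computed as in the following sketch. It assumes that "agreement rate" means overall accuracy and that kappa is Cohen's kappa between predicted and observed outcomes; the scores, outcomes and cutoff are hypothetical and unrelated to the study data.

```python
def roc_auc(y_true, y_score):
    """AUC as the fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for t, s in zip(y_true, y_score) if t == 1]
    neg = [s for t, s in zip(y_true, y_score) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def cutoff_metrics(y_true, y_score, cutoff):
    """Sensitivity, specificity, agreement rate and Cohen's kappa at a given cutoff."""
    pred = [1 if s > cutoff else 0 for s in y_score]
    tp = sum(1 for t, q in zip(y_true, pred) if t == 1 and q == 1)
    tn = sum(1 for t, q in zip(y_true, pred) if t == 0 and q == 0)
    fp = sum(1 for t, q in zip(y_true, pred) if t == 0 and q == 1)
    fn = sum(1 for t, q in zip(y_true, pred) if t == 1 and q == 0)
    n = tp + tn + fp + fn
    agreement = (tp + tn) / n
    # Agreement expected by chance, from the marginal totals of the 2x2 table.
    expected = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / n ** 2
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "agreement": agreement,
        "kappa": (agreement - expected) / (1 - expected),
    }

y_true = [1, 1, 1, 0, 0, 0, 1, 0]                     # hypothetical MODS outcomes
y_score = [0.9, 0.8, 0.4, 0.3, 0.6, 0.2, 0.7, 0.1]    # hypothetical predicted values
auc = roc_auc(y_true, y_score)
metrics = cutoff_metrics(y_true, y_score, cutoff=0.5)
```

On this toy data the AUC is 0.9375, and at the cutoff of 0.5 the sensitivity, specificity and agreement rate are all 0.75 with a kappa of 0.5.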
TABLE 2 model inclusion frequencies for 44 characteristic cytokines in example 2
TABLE 3 model parameters table in the target MODS risk assessment model of example 2
Example 3 construction of a severe post-traumatic inflammatory factor storm MODS prediction model Using random forest method in combination with Bootstrap method
1. Data preparation and preprocessing:
Each cytokine concentration measurement was standardized:
$$z_i = \frac{x_i - \mu_i}{\sigma_i}$$
wherein $x_i$ is the original measurement of the ith cytokine, $\mu_i$ and $\sigma_i$ are the mean and standard deviation of the concentration values of the ith cytokine, respectively, and $z_i$ is the standard concentration value of the ith cytokine.
2. Feature selection and primary screening:
A random forest algorithm was applied to score feature importance, and 44 features (i.e., characteristic cytokines) were selected. Feature importance may be calculated from the average decrease in impurity or the average increase in accuracy.
3. Random forest model construction (i.e., construction of an initial MODS risk assessment model):
a plurality of training data sets are created using a Bootstrap sampling method. For each training dataset, a decision tree is created using the following steps:
a subset of features is randomly selected from the 44 features.
The best segmentation feature and segmentation point are chosen to maximize the Information Gain (IG) or minimize the Gini impurity:
$$IG(D_p, f) = I(D_p) - \sum_{j} \frac{N_j}{N_p}\, I(D_j), \qquad I_{Gini}(D) = 1 - \sum_{k} p_k^2$$
wherein $I(\cdot)$ is the impurity of a node, $D_p$ is the data set of the parent node, $f$ is the segmentation feature, $D_j$ is the data set of the j-th child node, $N_j$ and $N_p$ are the numbers of samples of the child and parent nodes respectively, and $p_k$ is the proportion of samples of the k-th class.
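The split criterion can be illustrated numerically with the sketch below, which computes the Gini impurity of a node and the impurity decrease obtained by a candidate split. The labels and the split are hypothetical, and the function names are illustrative.

```python
from collections import Counter

def gini(labels):
    """Gini impurity: 1 minus the sum of squared class proportions."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def impurity_decrease(parent, children):
    """Parent impurity minus the size-weighted impurity of the child nodes."""
    n = len(parent)
    return gini(parent) - sum(len(ch) / n * gini(ch) for ch in children)

# Hypothetical MODS labels at a parent node and after splitting on one cytokine.
parent = [1, 1, 1, 0, 0, 0, 0, 1]
left, right = [1, 1, 1, 0], [0, 0, 0, 1]
gain = impurity_decrease(parent, [left, right])
```

Here the parent impurity is 0.5, each child has impurity 0.375, and the split therefore reduces impurity by 0.125; the tree-building step would choose the feature and threshold that give the largest such decrease.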
4. Training and aggregation of random forest models:
A number of decision trees are trained and their predictions are aggregated to form a random forest model. The predictions may be aggregated by majority voting:
$$\hat{y} = \operatorname{mode}\{\hat{y}_1, \hat{y}_2, \ldots, \hat{y}_B\}$$
where $B$ is the number of decision trees and $\hat{y}_b$ is the prediction result of the b-th tree.
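Majority voting over the trees can be sketched as follows. The tree outputs are hypothetical, and the tie-breaking behaviour (the class seen first among equally frequent classes wins) is a property of this sketch rather than something stated in the text.

```python
from collections import Counter

def majority_vote(tree_predictions):
    """Return the class predicted by the largest number of trees."""
    return Counter(tree_predictions).most_common(1)[0][0]

tree_predictions = [1, 0, 1, 1, 0]  # hypothetical outputs of B = 5 decision trees
forest_prediction = majority_vote(tree_predictions)
vote_share = tree_predictions.count(1) / len(tree_predictions)
```

Three of the five hypothetical trees predict MODS, so the forest predicts MODS; the vote share (0.6 here) is sometimes used as a rough score when a graded output is wanted.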
5. Feature selection optimization:
and calculating the feature importance of the random forest trained by each Bootstrap sample set, and recording the selection frequency of each feature.
The cytokines ranked in the top 6 by importance across the multiple Bootstrap sample sets were selected as the final features (i.e., target cytokines), as shown in Table 4.
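6. Constructing a Logistic regression model, namely the target MODS risk assessment model, based on the factor levels and model parameters of the 6 target cytokines, wherein the model parameters can be calculated from the Bootstrap sample sets by the maximum likelihood method.
The target MODS risk assessment model was constructed as follows:
$$\log\frac{p}{1-p} = \hat{\beta}_0 + \hat{\beta}_1 z_1 + \hat{\beta}_2 z_2 + \cdots + \hat{\beta}_6 z_6$$
where $p$ is the probability of MODS occurring and $z_1, \ldots, z_6$ are the standard concentration values of the 6 target cytokines.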
the values of the model parameters are shown in Table 5.
7. ROC curve analysis of early warning efficacy of selected 6 target cytokines against MODS:
The standard concentration values of the 6 target cytokines of a subject (sTNF_RII, IL_8, Fractalkine, IL_23, IL_13 and IGFBP_4) are substituted into the target MODS risk assessment model to calculate a predicted value. The cutoff for this predicted value is 0.4332870: a patient whose predicted value is above the cutoff is at higher risk of developing MODS, and a patient whose predicted value is below the cutoff is at lower risk.
ROC curves were plotted based on the standard concentration values of the 6 target cytokines of the subjects (sTNF_RII, IL_8, Fractalkine, IL_23, IL_13 and IGFBP_4), as shown in FIG. 3. The results show an area under the ROC curve of 0.8090 for these 6 target cytokines, an agreement rate of 0.7941 (95% CI: 0.6210, 0.9130), a kappa of 0.5854, a specificity of 0.7500 and a sensitivity of 0.8333.
Therefore, the area under the ROC curve and the specificity in this example are lower than those of Example 2 (the sensitivity is the same), which indicates that the MODS early-warning performance of this model is not as good as that of Example 2.
TABLE 4 model inclusion frequencies for 44 characteristic cytokines in example 3
TABLE 5 model parameters table in the target MODS risk assessment model of example 3
Example 4 target MODS Risk assessment model constructed Using Lasso logistic regression model
1. Data preparation and preprocessing:
Each cytokine concentration measurement was standardized:
$$z_i = \frac{x_i - \mu_i}{\sigma_i}$$
where $x_i$ is the original measurement of the i-th cytokine, $\mu_i$ and $\sigma_i$ are the mean and standard deviation of the concentration values of the i-th cytokine, respectively, and $z_i$ is the standard concentration value of the i-th cytokine.
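A minimal sketch of this standardization for one cytokine is given below. The use of the sample standard deviation is an assumption (the text does not say whether the sample or population form is used), and the input values are illustrative only.

```python
import statistics


def standardize(values):
    """Convert raw concentrations of one cytokine to standard concentration values."""
    mean = statistics.mean(values)
    # sample standard deviation; swap in statistics.pstdev for the population form
    sd = statistics.stdev(values)
    return [(v - mean) / sd for v in values]


# Illustrative raw measurements of a single cytokine across three samples
raw = [1.0, 2.0, 3.0]
standard_values = standardize(raw)
```

Each cytokine is standardized separately, using its own mean and standard deviation across the samples.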
2. Lasso model construction (i.e., construction of initial MODS risk assessment model):
The relationship of cytokines to the occurrence of MODS was analyzed using the Lasso model.
The Lasso model performs feature selection and regularization by adding an L1 regularization term to the ordinary least squares estimate.
Optimization problem of Lasso regression:
$$\min_{\beta}\; \frac{1}{2N}\sum_{i=1}^{N}\Big(y_i - \sum_{j=1}^{p} x_{ij}\beta_j\Big)^2 + \lambda\sum_{j=1}^{p}|\beta_j|$$
where $y_i$ is the response variable, $x_{ij}$ is the predictor variable, $\beta_j$ is the regression coefficient, N is the number of samples, p is the number of features, and λ is the regularization parameter.
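To show how the L1 term drives some coefficients exactly to zero, the sketch below minimizes a Lasso objective of the form (1/(2N)) * sum of squared residuals + λ * sum of |β_j| by coordinate descent. The 1/(2N) scaling, the absence of an intercept, the solver and the toy data are all assumptions made for illustration; only λ = 0.1 comes from this embodiment.

```python
def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold; return 0 inside [-threshold, threshold]."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def lasso_coordinate_descent(X, y, lam, n_iter=100):
    """Minimize (1/(2N)) * ||y - X b||^2 + lam * ||b||_1 by cyclic coordinate descent."""
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    for _ in range(n_iter):
        for j in range(p):
            rho = 0.0   # correlation of feature j with the partial residual
            norm = 0.0  # squared norm of feature j
            for i in range(n):
                pred_without_j = sum(X[i][k] * beta[k] for k in range(p) if k != j)
                rho += X[i][j] * (y[i] - pred_without_j)
                norm += X[i][j] ** 2
            beta[j] = soft_threshold(rho / n, lam) / (norm / n) if norm > 0 else 0.0
    return beta


# Toy data: two orthogonal features; y depends strongly on the first one only
X_toy = [[1.0, 1.0], [-1.0, 1.0], [1.0, -1.0], [-1.0, -1.0]]
y_toy = [2.05, -1.95, 1.95, -2.05]
beta_hat = lasso_coordinate_descent(X_toy, y_toy, lam=0.1)
```

On this toy data the weak second feature is removed (its coefficient becomes exactly 0) while the first is shrunk from 2.0 to about 1.9, which is the feature-selection behaviour the text relies on.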
3. Parameter optimization and model training:
Cross-validation was used to determine the optimal regularization parameter λ, which was set to 0.1.
The Lasso model is trained and performs feature selection at the same time, that is, 6 target cytokines are selected from the plurality of characteristic cytokines.
4. A Logistic regression model, namely the target MODS risk assessment model, is constructed based on the factor levels of the 6 target cytokines and the model parameters, wherein the model parameters can be calculated from the preprocessed original data set by the maximum likelihood method.
The target MODS risk assessment model was constructed as follows:
。
the values of the model parameters are shown in Table 6.
5. ROC curve analysis of early warning efficacy of selected 6 target cytokines against MODS:
The standard concentration values of the 6 target cytokines of a subject (IL_8, IL_6, IL_13, IFN_gama, BLC and Fractalkine) are substituted into the target MODS risk assessment model to calculate a predicted value. The cut-off value is 0.4787222. If the predicted value is greater than the cut-off value, the patient is at higher risk of developing MODS; if it is less than the cut-off value, the patient is at lower risk of developing MODS.
ROC curves were plotted based on the standard concentration values of the 6 target cytokines (IL_8, IL_6, IL_13, IFN_gama, BLC and Fractalkine) in the subjects, as shown in Fig. 4. The results show that the area under the ROC curve for these 6 target cytokines is 0.8438, with agreement rate 0.7941 (95% CI: 0.6210, 0.9130), kappa 0.5911, specificity 0.7222 and sensitivity 0.8750.
Therefore, the area under the ROC curve, the specificity and the sensitivity in this embodiment are lower than those of Embodiment 2, which indicates that the MODS early warning performance of this model is not as good as that of Embodiment 2.
TABLE 6 model parameters table in the target MODS risk assessment model of example 4
Example 5 target MODS Risk assessment model constructed by GBDT model in combination with Bootstrap method
1. Data preparation and preprocessing:
Each cytokine concentration measurement was standardized:
$$z_i = \frac{x_i - \mu_i}{\sigma_i}$$
where $x_i$ is the original measurement of the i-th cytokine, $\mu_i$ and $\sigma_i$ are the mean and standard deviation of the concentration values of the i-th cytokine, respectively, and $z_i$ is the standard concentration value of the i-th cytokine.
2. GBDT model construction (target MODS risk assessment model):
The relationship between cytokines and MODS was analyzed using the GBDT model:
The GBDT model is initialized with a decision tree as the base learner.
Decision trees are added one at a time, and each tree learns the residual of the previous step.
GBDT iterative update formula:
$$F_m(x) = F_{m-1}(x) + \nu \sum_{j=1}^{J_m} \gamma_{jm}\, \mathbb{1}(x \in R_{jm})$$
where $F_m(x)$ is the model at the m-th step, $\nu$ is the learning rate, $\gamma_{jm}$ is the value of the j-th leaf node of the m-th tree, $R_{jm}$ is the region of the j-th leaf node of the m-th tree, and $J_m$ is the number of leaf nodes of the m-th tree.
The GBDT model in this embodiment comprises a tree model with 5 layers and 32 leaf nodes in total. Data flows from the top vertex down to the final leaf nodes. Taking the first layer as an example, when the standard concentration value of a cytokine passes through a node of that layer, the threshold obtained by training at that node determines which branch is taken to the next layer, and so on. When the data reaches the last layer, the outputs of the leaf nodes of that layer are accumulated, and the accumulated value is used as the predicted value of the model.
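The top-down traversal and the additive update can be sketched as follows. The trees, thresholds, leaf values, learning rate and feature names (`cyt_a`, `cyt_b`) are hypothetical, and the trees are only one level deep for brevity; the trained 5-layer model of this embodiment is not reproduced.

```python
def tree_output(node, x):
    """Walk one decision tree from the root to a leaf and return the leaf value."""
    while "leaf" not in node:
        # the trained threshold at each node decides which branch is taken
        branch = "left" if x[node["feature"]] <= node["threshold"] else "right"
        node = node[branch]
    return node["leaf"]


def gbdt_predict(initial_value, trees, learning_rate, x):
    """Additive model: F_m(x) = F_{m-1}(x) + learning_rate * (output of tree m)."""
    score = initial_value
    for tree in trees:
        score += learning_rate * tree_output(tree, x)
    return score


# Two hypothetical one-split trees over standardized cytokine levels
toy_trees = [
    {"feature": "cyt_a", "threshold": 0.0,
     "left": {"leaf": -1.0}, "right": {"leaf": 1.0}},
    {"feature": "cyt_b", "threshold": 0.5,
     "left": {"leaf": -0.5}, "right": {"leaf": 0.5}},
]
sample = {"cyt_a": 0.3, "cyt_b": 0.2}
score = gbdt_predict(0.0, toy_trees, learning_rate=0.1, x=sample)
```

For the sample above the first tree contributes 0.1 * 1.0 and the second 0.1 * (-0.5), giving a raw score of 0.05; a real model would map this score to a probability before comparing it with the cut-off.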
3. Feature selection optimization:
The original data set is resampled 1000 times using the Bootstrap method to generate 1000 Bootstrap sample sets. The initial MODS risk assessment model is trained on the Bootstrap sample sets, and 6 target cytokines are selected from the 24 characteristic cytokines, as shown in Table 7.
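Bootstrap resampling draws, with replacement, as many samples as the original data set contains, and repeats this to obtain the sample sets. A minimal sketch is shown below; the placeholder sample identifiers, the fixed seed and the function name are assumptions.

```python
import random


def bootstrap_sample_sets(dataset, n_sets, seed=0):
    """Draw n_sets Bootstrap sample sets, each the same size as the dataset."""
    rng = random.Random(seed)  # fixed seed only to make the sketch reproducible
    n = len(dataset)
    return [
        [dataset[rng.randrange(n)] for _ in range(n)]  # sampling with replacement
        for _ in range(n_sets)
    ]


# Placeholder data set: identifiers standing in for MODS samples
toy_dataset = [f"sample_{i}" for i in range(10)]
sample_sets = bootstrap_sample_sets(toy_dataset, n_sets=1000)
```

Because sampling is with replacement, a given sample can appear several times in one set and not at all in another, which is what makes the per-set selection frequencies informative.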
4. ROC curve analysis of early warning efficacy of selected 6 target cytokines against MODS:
The standard concentration values of the 6 target cytokines of a subject (Fractalkine, IL-12p70, sFasL, IL-3, IL-23 and Myoglobin) are substituted into the target MODS risk assessment model to calculate a predicted value. The cut-off value is 0.4332870. If the predicted value is greater than the cut-off value, the patient is at higher risk of developing MODS; if it is less than the cut-off value, the patient is at lower risk of developing MODS.
ROC curves were plotted based on the standard concentration values of the 6 target cytokines (Fractalkine, IL-12p70, sFasL, IL-3, IL-23 and Myoglobin) in the subjects, as shown in Fig. 5. The results show that the area under the ROC curve for the 6 target cytokines is 0.9635, with agreement rate 0.90844 (96% CI: 0.7314, 0.9575), kappa 0.7902, specificity 1.0000 and sensitivity 0.9473.
Therefore, the area under the ROC curve in this embodiment is lower than that in Embodiment 2, which indicates that the MODS early warning performance of this model is not as good as that of Embodiment 2.
TABLE 7 model inclusion frequencies for 24 characteristic cytokines in example 5
From the description of Embodiments 2-5 above, when the initial MODS risk assessment model is the GBDT model, the Logistic regression model, the random forest model and the Lasso regression model, the areas under the ROC curves corresponding to the 6 target cytokines selected during model training are 0.9635, 0.9722, 0.8090 and 0.8438, respectively. In summary, the Logistic regression model, which has the highest predictive efficacy, is selected as the initial MODS risk assessment model, and the 6 cytokines screened during training (IL_6, IL_8, IFN_gama, sTNF_RII, BLC and IL_1RA) are used to construct an early warning model for inflammatory factor storm MODS after severe injury, which has better MODS early warning efficacy.
Example 6 Application of the detection ligand composition for the 6 target cytokines of Example 2 in a rapid combined diagnostic kit
This example detects the concentrations of the 6 target cytokines by a multiplex detection method in which magnetic-bead-coupled antibodies (i.e., the ligands) bind the small molecules. The detection method comprises the following steps:
1. early preparation
(1) Resuspending the mixed beads: the pre-mixed beads for detecting the concentrations of IL_6, IL_8, IFN_gama, sTNF_RII, BLC and IL_1RA were sonicated for 1 minute and shaken for 1 minute.
(2) Preparing the wash buffer: the 20X wash solution was left at room temperature until the salts were fully dissolved. 2.5 ml of 20X wash solution was diluted in 47.5 ml of water; the diluted buffer can be stored at 4 ℃ for up to 1 month.
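(3) Preparing the standard: the standard was dissolved in 250 ul of system buffer.
The tube was allowed to stand at room temperature for 10 minutes and used as C7, which was then serially diluted to give C6-C1; C0 was system buffer only. At each step, 25 ul of the previous level was added to 75 ul of buffer (1 volume into 3 volumes, i.e. a 1:4 dilution): 100 ul of C7 was taken, 25 ul of it was added to 75 ul of buffer as C6, and so on.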
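The concentrations of the standard curve points follow directly from the transfer volumes. The sketch below assumes that each level is made by adding 25 ul of the previous level to 75 ul of buffer, as stated above; the top concentration (4096.0, in arbitrary units) is a hypothetical value, since the actual concentration of C7 depends on the kit lot.

```python
def serial_dilution(top_concentration, n_steps, transfer_ul=25.0, diluent_ul=75.0):
    """Return the concentration of each standard level, from C{n_steps} down to C0."""
    # fraction of the previous level's concentration kept at each step
    factor = transfer_ul / (transfer_ul + diluent_ul)
    levels = {}
    concentration = top_concentration
    for step in range(n_steps, 0, -1):  # C7, C6, ..., C1
        levels[f"C{step}"] = concentration
        concentration *= factor
    levels["C0"] = 0.0  # C0 is buffer only
    return levels


# Hypothetical top standard concentration in arbitrary units
standards = serial_dilution(top_concentration=4096.0, n_steps=7)
```

With 25 ul transferred into 75 ul, each level is one quarter of the previous one, so C6 is 1024.0 and C1 is 1.0 for the hypothetical top value used here.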
(4) 5 ml of system buffer was added to Matrix B (glass bottle), mixed by shaking and left at room temperature for 15 minutes. Reconstituted Matrix B is stored below -70 ℃ for up to 1 month.
2. Sample collection
0.1ml of peripheral blood from the patient was collected using a procoagulant tube, allowed to stand at 4℃for 40 minutes, centrifuged at 3000 rpm for 15 minutes, and the supernatant carefully aspirated.
3. Experimental procedure
(1) Allow all reagents to reach room temperature (20-25 ℃).
In the experimental process, attention is paid to avoiding liquid scattering and illumination, and a matched flat plate is used.
(2) 25ul of serum was removed and mixed with 25ul of system buffer, and then with 150ul Matrix B. Standard 25ul was added to 25ul matrix B and 25ul assay buffer was added to each 25ul of the experimental group.
(3) Shake the beads for 30 s and add 25 ul of beads to the above mixture. During sample addition, shake the beads from time to time to prevent them from settling.
(4) Incubate with shaking at 25℃for 3 hours, 650 rpm.
(5) Centrifuge at 400 g for 5 minutes.
(6) Carefully and quickly aspirate the supernatant with a pipette; do not pour it off.
(7) Wash with 200 ul of wash buffer, resuspend and let stand for one minute; repeat the above two steps 2 times.
(8) 25ul detection antibody was added to each well.
(9) The plate was sealed with fresh aluminum foil and incubated with shaking at room temperature for 1.5 hours at 650 rpm.
(10) 25ul SA-PE was added directly without washing.
(11) Sealing with new aluminum foil paper, shaking at 650 rpm for 45 min, centrifuging, carefully sucking off hybridization solution, adding 200 ul wash buffer, and storing overnight.
(12) Centrifuge at 400 g for 5 minutes and remove the supernatant.
(13) Resuspend in 120 ul of wash buffer.
(14) Read the plate, preferably on the same day. Make sure the beads are resuspended by shaking before each plate read.
4. Data analysis
Flow analysis was performed using a flow analyzer, and the cytokine concentrations were then quantified from the data using the online or downloadable LEGENDplex software.
The verification example of the target MODS risk assessment model in example 2 using the 6 target cytokines measured by the kit of this example is as follows:
(1) Model verification example one:
A 42-year-old female patient sustained multiple injuries, with an ISS score of 29 and a SOFA score of 8; MODS occurred within one week after injury.
Early post-traumatic serum levels of IL_6, IL_8, IFN_gama, sTNF_RII, BLC and IL_1RA (in pg/ml) were 311.87, 139.59, 0.0001, 1592.10, 62.399 and 2536.3, respectively.
Substituting these values into the target MODS risk assessment model of Example 2 gives a predicted value greater than the cut-off value 0.8208166, indicating an extremely high risk of inflammatory factor storm MODS after trauma; the positive case was correctly predicted.
(2) Model application example two:
A 50-year-old female patient sustained multiple injuries, with an ISS score of 26 and a SOFA score of 6; no MODS occurred after injury.
Early post-traumatic serum levels of IL_6, IL_8, IFN_gama, sTNF_RII, BLC and IL_1RA (in pg/ml) were 102.11, 166.63, 0.002, 1207.96, 40.014 and 28914.7, respectively.
Substituting these values into the above formula gives a predicted value less than the cut-off value 0.8208166, indicating an extremely low risk of inflammatory factor storm MODS after trauma; the negative case was correctly predicted.
In addition, the present application also studied the characteristics and biological significance of each of IL_6, IL_8, IFN_gama, sTNF_RII, BLC and IL_1RA, and compared the predictive efficacy of the single factors to further confirm the scientific basis of the present invention. First, IFN_gama is generally considered a pro-inflammatory cytokine, also called macrophage activating factor, produced mainly by natural killer cells (NK cells) and CD4+/CD8+ T cells. The present study found that IFN_gama was significantly reduced in healthier patients following trauma. The reduction of IFN_gama is reasonable given the massive reduction of CD4+ T cells after severe trauma, and it also suggests increased susceptibility to infection induced by immunosuppression in the later stage of severe trauma. IL_6 is a key mediator of the acute response and the most classical pro-inflammatory factor, involved in almost all types of inflammatory responses; it acts more like a "positive control" in the model, supporting the reasonableness of the model. IL_6 is significantly elevated after severe trauma. IL_8, also called chemokine CXCL8, is a cytokine secreted by macrophages, epithelial cells and other cells. IL_8 binds to the chemokine receptors interleukin-8 receptor alpha (IL8RA, also known as CXCR1) and interleukin-8 receptor beta (IL8RB, also known as CXCR2) and exerts a chemotactic effect on neutrophils, thereby modulating the inflammatory response; IL_8 is significantly elevated after severe trauma. B-lymphocyte chemokine (BLC), secreted primarily by helper T cells, induces the recruitment of B cells and is significantly reduced after severe trauma. Although BLC has not been reported in trauma-related studies, the significant reduction of B cells and T cells after severe trauma has been demonstrated, so this result is scientifically reasonable.
Interleukin 1 receptor antagonist (IL_1RA) is a protein produced in humans that antagonizes IL-1 beta and IL-1 alpha, binds primarily to the IL-1 receptor, and can be produced by a variety of immune cells; IL_1RA is significantly elevated after severe trauma. sTNF_RII is a receptor for TNF-α which, after binding TNF-α, acts primarily in a pro-inflammatory manner; sTNF_RII rises significantly following severe trauma. At present, sTNF_RII has not been reported as a trauma detection indicator.
Further, the individual predictive power of the 6 target cytokines was also evaluated, as shown in Fig. 6. The areas under the ROC curves for IL_6, IL_8, IFN_gama, sTNF_RII, BLC and IL_1RA were 0.764, 0.733, 0.618, 0.661, 0.747 and 0.490, respectively, each lower than the combined predictive power of 0.972, suggesting that the combined diagnostic power of the 6 factors is optimal.
From the pathophysiological point of view, the 6 factors are "evenly" distributed on different positions (neutrophils, monocytes/macrophages, T cells, B cells) in the innate immune process and play different functions (activation, pro-inflammatory, recruitment, anti-inflammatory) and do not substantially coincide with each other. This demonstrates to some extent that our screening mechanism is scientific and rational, covering the whole process of excessive immune response after severe trauma, and as such, has excellent performance after combination.
Moreover, the occurrence of MODS is essentially the striking of systemic inflammatory reaction, namely inflammatory factor storm, on distant organs, so the invention can well predict the occurrence of MODS by only finding out wound-specific inflammatory factors without paying special attention to the influence caused by age, sex, complications and the like.
In conclusion, the invention can rapidly and accurately predict the occurrence of inflammatory factor storm MODS of a patient with severe wounds by using the inflammatory factor composition, and can greatly improve the success rate of clinic treatment for severe wounds.
It should be noted that, the method of the embodiments of the present application may be performed by a single device, for example, a computer or a server. The method of the embodiment can also be applied to a distributed scene, and is completed by mutually matching a plurality of devices. In the case of such a distributed scenario, one of the devices may perform only one or more steps of the methods of embodiments of the present application, and the devices may interact with each other to complete the methods.
It should be noted that some embodiments of the present application are described above. Other embodiments are within the scope of the following claims. In some cases, the actions or steps recited in the claims may be performed in a different order than in the embodiments described above and still achieve desirable results. In addition, the processes depicted in the accompanying figures do not necessarily require the particular order shown, or sequential order, to achieve desirable results. In some embodiments, multitasking and parallel processing are also possible or may be advantageous.
Based on the same inventive concept, the application also provides a MODS risk assessment model construction device corresponding to the method of any embodiment.
Referring to Fig. 8, the MODS risk assessment model construction apparatus includes:
an obtaining module 601, configured to obtain a preprocessed raw data set, where the raw data set includes a plurality of MODS samples, and each MODS sample includes a plurality of cytokines and concentration values corresponding to the cytokines;
a first screening module 602, configured to analyze a relationship between each cytokine and MODS based on the preprocessed raw data set and determine a plurality of characteristic cytokines;
an initial model construction module 603, configured to construct an initial MODS risk assessment model based on a plurality of the characteristic cytokines;
a second screening module 604, configured to train the initial MODS risk assessment model, calculate model inclusion frequencies of the plurality of characteristic cytokines in the training process, and take the characteristic cytokines with model inclusion frequencies higher than a preset value as target cytokines;
a target model construction module 605, configured to construct a target MODS risk assessment model based on the target cytokines.
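The data flow between modules 601 to 605 can be sketched as a simple pipeline. This is only an illustration of how the modules hand results to one another; the class name `ModsModelBuilder`, the callable signatures and the stub behaviours are assumptions and do not reflect any particular implementation of the apparatus.

```python
from dataclasses import dataclass
from typing import Any, Callable, List


@dataclass
class ModsModelBuilder:
    """Chains the five modules; each field is a hypothetical callable."""
    acquire: Callable[[], Any]                      # obtaining module 601
    first_screen: Callable[[Any], List[str]]        # first screening module 602
    build_initial: Callable[[List[str]], Any]       # initial model construction module 603
    second_screen: Callable[[Any, Any], List[str]]  # second screening module 604
    build_target: Callable[[List[str], Any], Any]   # target model construction module 605

    def run(self) -> Any:
        data = self.acquire()                        # preprocessed raw data set
        features = self.first_screen(data)           # characteristic cytokines
        initial_model = self.build_initial(features)
        targets = self.second_screen(initial_model, data)  # target cytokines
        return self.build_target(targets, data)


# Stub callables standing in for the real modules
builder = ModsModelBuilder(
    acquire=lambda: {"samples": []},
    first_screen=lambda data: ["f1", "f2", "f3"],
    build_initial=lambda features: {"features": features},
    second_screen=lambda model, data: model["features"][:2],
    build_target=lambda targets, data: {"targets": targets},
)
result = builder.run()
```

Keeping each module behind a plain callable mirrors the statement that the modules may be implemented in the same or in separate pieces of software and/or hardware.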
For convenience of description, the above devices are described as being functionally divided into various modules, respectively. Of course, the functions of each module may be implemented in the same piece or pieces of software and/or hardware when implementing the present application.
The device of the foregoing embodiment is configured to implement the corresponding method for constructing the MODS risk assessment model in any of the foregoing embodiments, and has the beneficial effects of the corresponding method embodiment, which is not described herein.
Based on the same inventive concept, the application also provides an electronic device corresponding to the method of any embodiment, which comprises a memory, a processor and a computer program stored on the memory and capable of running on the processor, wherein the processor realizes the method for constructing the MODS risk assessment model according to any embodiment when executing the program.
Fig. 9 shows a more specific hardware architecture of an electronic device according to this embodiment, where the device may include: a processor 1010, a memory 1020, an input/output interface 1030, a communication interface 1040, and a bus 1050. Wherein processor 1010, memory 1020, input/output interface 1030, and communication interface 1040 implement communication connections therebetween within the device via a bus 1050.
The processor 1010 may be implemented by a general-purpose CPU (Central Processing Unit), microprocessor, application specific integrated circuit (Application Specific Integrated Circuit, ASIC), or one or more integrated circuits, etc. for executing relevant programs to implement the technical solutions provided in the embodiments of the present disclosure.
The memory 1020 may be implemented in the form of ROM (Read Only Memory), RAM (Random Access Memory), static storage device, dynamic storage device, or the like. Memory 1020 may store an operating system and other application programs, and when the embodiments of the present specification are implemented in software or firmware, the associated program code is stored in memory 1020 and executed by processor 1010.
The input/output interface 1030 is used to connect with an input/output module for inputting and outputting information. The input/output module may be configured as a component in a device (not shown) or may be external to the device to provide corresponding functionality. Wherein the input devices may include a keyboard, mouse, touch screen, microphone, various types of sensors, etc., and the output devices may include a display, speaker, vibrator, indicator lights, etc.
Communication interface 1040 is used to connect communication modules (not shown) to enable communication interactions of the present device with other devices. The communication module may implement communication through a wired manner (such as USB, network cable, etc.), or may implement communication through a wireless manner (such as mobile network, WIFI, bluetooth, etc.).
Bus 1050 includes a path for transferring information between components of the device (e.g., processor 1010, memory 1020, input/output interface 1030, and communication interface 1040).
It should be noted that although the above-described device only shows processor 1010, memory 1020, input/output interface 1030, communication interface 1040, and bus 1050, in an implementation, the device may include other components necessary to achieve proper operation. Furthermore, it will be understood by those skilled in the art that the above-described apparatus may include only the components necessary to implement the embodiments of the present description, and not all the components shown in the drawings.
The electronic device of the foregoing embodiment is configured to implement the corresponding method for constructing the MODS risk assessment model in any of the foregoing embodiments, and has the beneficial effects of the corresponding method embodiment, which is not described herein.
Based on the same inventive concept, corresponding to any of the above embodiments of the method, the present application further provides a non-transitory computer readable storage medium storing computer instructions for causing the computer to execute the MODS risk assessment model building method according to any of the above embodiments.
The computer readable media of the present embodiments, including both permanent and non-permanent, removable and non-removable media, may be used to implement information storage by any method or technology. The information may be computer readable instructions, data structures, modules of a program, or other data. Examples of storage media for a computer include, but are not limited to, phase change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read only memory (ROM), electrically erasable programmable read only memory (EEPROM), flash memory or other memory technology, compact disc read only memory (CD-ROM), digital versatile discs (DVD) or other optical storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other non-transmission medium, which can be used to store information that can be accessed by a computing device.
The storage medium of the foregoing embodiments stores computer instructions for causing the computer to execute the method for constructing the MODS risk assessment model according to any one of the foregoing embodiments, and has the beneficial effects of the corresponding method embodiments, which are not described herein.
Those of ordinary skill in the art will appreciate that: the discussion of any of the embodiments above is merely exemplary and is not intended to suggest that the scope of the application (including the claims) is limited to these examples; the technical features of the above embodiments or in the different embodiments may also be combined within the idea of the present application, the steps may be implemented in any order, and there are many other variations of the different aspects of the embodiments of the present application as described above, which are not provided in detail for the sake of brevity.
Additionally, well-known power/ground connections to Integrated Circuit (IC) chips and other components may or may not be shown within the provided figures, in order to simplify the illustration and discussion, and so as not to obscure the embodiments of the present application. Furthermore, the devices may be shown in block diagram form in order to avoid obscuring the embodiments of the present application, and this also takes into account the fact that specifics with respect to implementation of such block diagram devices are highly dependent upon the platform on which the embodiments of the present application are to be implemented (i.e., such specifics should be well within purview of one skilled in the art). Where specific details (e.g., circuits) are set forth in order to describe example embodiments of the application, it should be apparent to one skilled in the art that embodiments of the application can be practiced without, or with variation of, these specific details. Accordingly, the description is to be regarded as illustrative in nature and not as restrictive.
While the present application has been described in conjunction with specific embodiments thereof, many alternatives, modifications, and variations of those embodiments will be apparent to those skilled in the art in light of the foregoing description. For example, other memory architectures (e.g., dynamic RAM (DRAM)) may use the embodiments discussed.
The present embodiments are intended to embrace all such alternatives, modifications and variances which fall within the broad scope of the appended claims. Accordingly, any omissions, modifications, equivalents, improvements and/or the like which are within the spirit and principles of the embodiments are intended to be included within the scope of the present application.
Claims (9)
1. The MODS risk assessment model construction method is characterized by comprising the following steps of:
acquiring a preprocessed original data set, wherein the original data set comprises a plurality of MODS samples, and each MODS sample comprises a plurality of cytokines and corresponding concentration values thereof;
analyzing the relationship between each cytokine and MODS based on the preprocessed raw data set, and determining a plurality of characteristic cytokines;
constructing an initial MODS risk assessment model based on a plurality of the characteristic cytokines;
Training the initial MODS risk assessment model, calculating model inclusion frequencies of a plurality of characteristic cytokines in the training process, and taking the characteristic cytokines with the model inclusion frequencies higher than a preset value as target cytokines;
constructing a target MODS risk assessment model based on the target cytokines;
wherein the training of the initial MODS risk assessment model, calculating a model inclusion frequency of a plurality of the characteristic cytokines in the training process, and taking the characteristic cytokines with the model inclusion frequency higher than a preset value as target cytokines, comprises:
resampling the preprocessed original data set with replacement to form a plurality of sample sets, wherein each sample set comprises a plurality of MODS samples;
establishing a plurality of characteristic cytokine subsets among the plurality of characteristic cytokines in each of the MODS samples;
substituting the factor level of each characteristic cytokine subset into the initial MODS risk assessment model for training, and obtaining a plurality of process training models, wherein each process training model is used for representing the relation between a characteristic cytokine subset and MODS;
and screening optimization process training models from the plurality of process training models based on the relation between the characteristic cytokine subset and the MODS, calculating the model inclusion frequency of the characteristic cytokines involved in the plurality of optimization process training models, and taking the characteristic cytokines with the model inclusion frequency higher than a preset value as the target cytokines.
2. The method of claim 1, wherein the acquiring the preprocessed raw data set comprises:
acquiring an original data set;
and (3) carrying out standardization processing on the concentration value of each cytokine in the original data set to obtain a preprocessed data set.
3. The method of claim 1, wherein analyzing the relationship between each cytokine and MODS and determining a plurality of characteristic cytokines based on the pre-processed raw dataset comprises:
based on the preprocessed original data set, determining an initial probability relation between each cytokine and MODS by using a descriptive statistical method, and determining a plurality of first cytokines based on the initial probability relation;
analyzing a first probability relation between each first cytokine and MODS by applying a univariate logistic regression model, and determining a plurality of second cytokines based on the first probability relation;
and analyzing a second probability relation between the plurality of second cytokines and the MODS by using a multivariate logistic regression model, and determining a plurality of characteristic cytokines based on the second probability relation.
4. The method of claim 3, wherein analyzing the second probability relationship of the plurality of second cytokines to the occurrence of MODS using the multivariate logistic regression model and determining the plurality of characteristic cytokines based on the second probability relationship comprises:
establishing a plurality of second cytokine subsets among the plurality of second cytokines in each of the MODS samples;
substituting the factor level of each second cytokine subset into the multivariate logistic regression model for calculation, and obtaining a plurality of process calculation models, wherein each process calculation model is used for representing a second probability relation between a second cytokine subset and MODS;
screening an optimization process calculation model from a plurality of process calculation models based on a second probability relation between each second cytokine subset and MODS, calculating a model inclusion frequency of the second cytokines involved in the plurality of optimization process calculation models, and taking the second cytokines with the model inclusion frequency higher than a preset value as characteristic cytokines.
5. The method of claim 3, wherein said constructing an initial MODS risk assessment model based on a plurality of said characteristic cytokines comprises:
based on the preprocessed original data set, calculating initial likelihood estimation of each characteristic cell factor by adopting a maximum likelihood estimation method, and taking the initial likelihood estimation as an initial model parameter of each characteristic cell factor;
And constructing an initial MODS risk assessment model based on the plurality of characteristic cytokines and initial model parameters corresponding to each characteristic cytokine.
6. The method of claim 1, wherein constructing a target MODS risk assessment model based on the target cytokine comprises:
calculating target likelihood estimates of each target cytokine by using a maximum likelihood estimation method based on a plurality of sample sets, wherein the target likelihood estimates are used as target model parameters of the corresponding target cytokines;
and constructing the target MODS risk assessment model based on the target cytokines and the target model parameters corresponding to each target cytokine.
7. A MODS risk assessment model construction apparatus, characterized by comprising:
an acquisition module configured to acquire a preprocessed original data set, wherein the original data set comprises a plurality of MODS samples, and each MODS sample comprises a plurality of cytokines and their corresponding concentration values;
a first screening module configured to analyze the relationship between each cytokine and MODS based on the preprocessed original data set and to determine a plurality of characteristic cytokines;
an initial model construction module configured to construct an initial MODS risk assessment model based on the plurality of characteristic cytokines;
a second screening module configured to train the initial MODS risk assessment model, calculate the model inclusion frequencies of the plurality of characteristic cytokines during training, and take the characteristic cytokines whose model inclusion frequency is higher than a preset value as target cytokines;
a target model construction module configured to construct a target MODS risk assessment model based on the target cytokines;
wherein the second screening module is further configured to:
resampling the preprocessed original data set with replacement to form a plurality of sample sets, wherein each sample set comprises a plurality of MODS samples;
establishing a plurality of characteristic cytokine subsets from the plurality of characteristic cytokines in each of the MODS samples;
substituting the cytokine levels of each characteristic cytokine subset into the initial MODS risk assessment model for training to obtain a plurality of process training models, wherein each process training model represents the relationship between one characteristic cytokine subset and MODS;
and screening optimization process training models from the plurality of process training models based on the relationship between each characteristic cytokine subset and MODS, calculating the model inclusion frequency of the characteristic cytokines involved in the optimization process training models, and taking the characteristic cytokines whose model inclusion frequency is higher than a preset value as the target cytokines.
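The second screening step (resample with replacement, enumerate cytokine subsets, keep the best model per sample set, count how often each cytokine is kept) can be sketched as below. This is an illustration under stated assumptions, not the patented procedure: the claims here do not say how the optimization process training model is screened, so the sketch accepts any `score` callable, and `placeholder_score` is a deliberately simple stand-in (difference in group means minus a penalty) rather than a fitted risk model. The cytokine names, sample values, threshold of 0.5, and all function names are hypothetical.

```python
import random
from collections import Counter
from itertools import combinations
from statistics import mean

def all_subsets(cytokines):
    """Every non-empty subset of the characteristic cytokines."""
    return [s for r in range(1, len(cytokines) + 1)
            for s in combinations(cytokines, r)]

def placeholder_score(subset, sample_set, penalty=1.0):
    """Stand-in criterion (assumption): reward group separation, penalize size."""
    cases = [s for s in sample_set if s["mods"] == 1]
    controls = [s for s in sample_set if s["mods"] == 0]
    return sum(abs(mean(s[c] for s in cases) - mean(s[c] for s in controls))
               - penalty for c in subset)

def inclusion_frequencies(samples, cytokines, score, n_sets=200, seed=0):
    rng = random.Random(seed)
    counts, used = Counter(), 0
    for _ in range(n_sets):
        # Resample with replacement, same size as the original data set.
        sample_set = rng.choices(samples, k=len(samples))
        if len({s["mods"] for s in sample_set}) < 2:
            continue  # skip resamples that lost one outcome class
        best = max(all_subsets(cytokines), key=lambda s: score(s, sample_set))
        counts.update(best)
        used += 1
    return {c: counts[c] / max(used, 1) for c in cytokines}

# Hypothetical samples: concentrations plus outcome label (1 = MODS).
samples = [
    {"IL-6": 10.0, "IL-10": 8.0, "IL-8": 5.0, "mods": 1},
    {"IL-6": 11.0, "IL-10": 9.0, "IL-8": 5.2, "mods": 1},
    {"IL-6": 12.0, "IL-10": 8.5, "IL-8": 5.1, "mods": 1},
    {"IL-6": 10.5, "IL-10": 9.5, "IL-8": 5.4, "mods": 1},
    {"IL-6": 1.0, "IL-10": 2.0, "IL-8": 5.3, "mods": 0},
    {"IL-6": 2.0, "IL-10": 1.5, "IL-8": 5.0, "mods": 0},
    {"IL-6": 3.0, "IL-10": 2.5, "IL-8": 5.5, "mods": 0},
    {"IL-6": 1.5, "IL-10": 1.0, "IL-8": 5.2, "mods": 0},
]
cytokines = ["IL-6", "IL-10", "IL-8"]
freq = inclusion_frequencies(samples, cytokines, placeholder_score)
target = [c for c in cytokines if freq[c] > 0.5]  # 0.5 = assumed preset value
```

Exhaustive subset enumeration grows as 2^n in the number of characteristic cytokines, so it is only practical after the first screening has reduced the candidate list.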
8. An electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the processor, when executing the program, implements the method of any one of claims 1 to 6.
9. A non-transitory computer readable storage medium storing computer instructions for causing a computer to perform the method of any one of claims 1 to 6.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202410041125.2A CN117558452B (en) | 2024-01-11 | 2024-01-11 | MODS risk assessment model construction method, device, equipment and medium |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202410041125.2A CN117558452B (en) | 2024-01-11 | 2024-01-11 | MODS risk assessment model construction method, device, equipment and medium |
Publications (2)
Publication Number | Publication Date |
---|---|
CN117558452A (en) | 2024-02-13 |
CN117558452B (en) | 2024-03-26 |
Family ID: 89813241
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202410041125.2A Active CN117558452B (en) | 2024-01-11 | 2024-01-11 | MODS risk assessment model construction method, device, equipment and medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN117558452B (en) |
Citations (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN105653450A (en) * | 2015-12-28 | 2016-06-08 | 中国石油大学(华东) | Software defect data feature selection method based on combination of modified genetic algorithm and Adaboost |
WO2021064195A1 (en) * | 2019-10-02 | 2021-04-08 | Faisal Aldo | Systems and methods for monitoring the state of a disease using a biomarker, systems and methods for identifying a biomarker of interest for a disease |
CN113537600A (en) * | 2021-07-20 | 2021-10-22 | 浙江省水利水电勘测设计院 | Medium-and-long-term rainfall forecast modeling method based on whole-process coupled machine learning |
CN113838577A (en) * | 2021-11-08 | 2021-12-24 | 北京航空航天大学 | Convenient layered old people MODS early death risk assessment model, device and establishment method |
CN114023440A (en) * | 2021-11-08 | 2022-02-08 | 中国人民解放军总医院 | Model and device capable of explaining layered old people MODS early death risk assessment and establishing method thereof |
CN114096845A (en) * | 2019-07-12 | 2022-02-25 | 贝克曼库尔特有限公司 | Systems and methods for assessing immune response to infection |
CN115337000A (en) * | 2022-10-19 | 2022-11-15 | 之江实验室 | Machine learning method for evaluating brain aging caused by diseases based on brain structure images |
CN116364268A (en) * | 2022-11-01 | 2023-06-30 | 山东大学 | Novel breast cancer prediction method based on punishment COX regression |
WO2023235768A2 (en) * | 2022-06-01 | 2023-12-07 | Children's Hospital Medical Center | Biomarker-based risk model to predict death and persistent multiple organ dysfunction syndrome in pediatric septic shock |
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US9696321B2 (en) * | 2012-06-06 | 2017-07-04 | National University Corporation Okayama University | Therapeutic agent, method of treatment and method for predicting the severity of systemic inflammatory response syndrome (SIRS), diseases caused or accompanied by neutrophil activation |
Non-Patent Citations (2)
Title |
---|
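Characteristics and Risk Factors of Myocardial Injury after Traumatic Hemorrhagic Shock; Chang Panpan et al.; Clinical Medicine; 2022-08-17; full text * |
Study on dynamic changes of serum cytokines in patients with systemic inflammatory response syndrome; Mei Xue; Li Chunsheng; Wang Shuo; Chinese Critical Care Medicine; 2006-02-10 (02); full text * |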
Also Published As
Publication number | Publication date |
---|---|
CN117558452A (en) | 2024-02-13 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Fortino et al. | Machine-learning–driven biomarker discovery for the discrimination between allergic and irritant contact dermatitis | |
JP2024104300A (en) | Cancer classifier models, machine learning systems, and methods of use | |
CN111640509A (en) | Cervical cancer postoperative recurrence risk prediction method and system | |
JP2020507838A (en) | Systems and methods for using supervised learning to predict subject-specific pneumonia transcription | |
RU2017129983A (en) | The method of applying information about a complex group of biomarkers for the diagnosis of a lung cancer in a subject and the diagnostic kit and computer system using it. | |
JP2023526241A (en) | Clinical predictor based on multiple machine learning models | |
CN112735592B (en) | Construction method and application method of lung cancer prognosis model and electronic equipment | |
CN111640518A (en) | Cervical cancer postoperative survival prediction method, system, equipment and medium | |
CN113270188A (en) | Method and device for constructing prognosis prediction model of patient after esophageal squamous carcinoma radical treatment | |
CN105209631A (en) | A method for improving disease diagnosis using measured analytes | |
Cai et al. | Predicting acute kidney injury risk in acute myocardial infarction patients: an artificial intelligence model using medical information mart for intensive care databases | |
US20230263477A1 (en) | Universal pan cancer classifier models, machine learning systems and methods of use | |
Weekes et al. | Development and validation of a prognostic tool: pulmonary embolism short-term clinical outcomes risk estimation (PE-SCORE) | |
Yang et al. | Explainable ensemble machine learning model for prediction of 28-day mortality risk in patients with sepsis-associated acute kidney injury | |
Xue et al. | Machine learning for the prediction of acute kidney injury in patients after cardiac surgery | |
Jamal et al. | A biomarker based severity progression indicator for COVID-19: the Kuwait prognosis indicator score | |
CN117558452B (en) | MODS risk assessment model construction method, device, equipment and medium | |
CN117554628B (en) | Inflammatory factor composition, model and kit for early warning of MODS | |
WO2023278601A1 (en) | Methods and systems for machine learning analysis of inflammatory skin diseases | |
Qin et al. | Refining empiric subgroups of pediatric sepsis using machine-learning techniques on observational data | |
CN114188014A (en) | Patient hospital unhealthy prognosis prediction model construction method, system and application | |
CN117438097A (en) | Method and system for predicting recurrence risk after early liver cancer operation | |
CN114540485B (en) | Method, system and application for predicting ACLF occurrence or prognosis of HBV related liver disease patient | |
Wang et al. | Systemic lupus erythematosus with high disease activity identification based on machine learning | |
KR102305806B1 (en) | Method for prodicting prognosis in lung cancer patient using clinical information and gene polymorphism information |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||