CN116994653A - Sepsis diagnosis model construction method, compound screening method and electronic equipment - Google Patents
- Publication number
- CN116994653A (CN202311247147.6A)
- Authority
- CN
- China
- Prior art keywords
- sepsis
- genes
- gene
- obtaining
- diagnostic model
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16B—BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
- G16B30/00—ICT specially adapted for sequence analysis involving nucleotides or amino acids
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/27—Regression, e.g. linear or logistic regression
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N20/00—Machine learning
- G06N20/10—Machine learning using kernel methods, e.g. support vector machines [SVM]
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16B—BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
- G16B40/00—ICT specially adapted for biostatistics; ICT specially adapted for bioinformatics-related machine learning or data mining, e.g. knowledge discovery or pattern finding
- G16B40/20—Supervised data analysis
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16H—HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
- G16H50/00—ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
- G16H50/20—ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for computer-aided diagnosis, e.g. based on medical expert systems
Abstract
The application provides a sepsis diagnostic model construction method, a compound screening method and an electronic device. The sepsis diagnostic model construction method comprises the following steps: obtaining a sepsis gene expression dataset and an aging gene dataset; analyzing the sepsis gene expression dataset and the aging gene dataset by Gene Ontology (GO) enrichment analysis and Kyoto Encyclopedia of Genes and Genomes (KEGG) enrichment analysis to obtain sepsis-related genes; analyzing the sepsis-related genes by LASSO regression analysis and a support vector machine to obtain sepsis-related hub genes; and constructing a sepsis diagnostic model according to multivariate logistic regression and the sepsis-related hub genes. The method can screen out, from a large number of genes, core genes closely related to the development of sepsis and construct a sepsis diagnostic model from them; these genes play an important role in the pathogenesis and clinical manifestation of sepsis.
Description
Technical Field
The application relates to the technical field of health monitoring, and in particular to a sepsis diagnostic model construction method, a compound screening method and an electronic device.
Background
Sepsis is a severe, infection-induced syndrome with far-reaching and potentially fatal effects. The systemic inflammatory response triggered by sepsis may lead to reduced blood pressure, tissue and organ damage, and multiple organ dysfunction syndrome (MODS). Because the early symptoms of sepsis are often vague and difficult to diagnose accurately in time, patients often seek treatment only when the illness is already serious, which delays treatment.
The widely used diagnostic standard Sepsis-3, although reasonably accurate, involves examinations of multiple organ systems. These take a long time, which may delay treatment and affect the prognosis of the patient.
Disclosure of Invention
In view of the foregoing, it is desirable to provide a sepsis diagnostic model construction method, a compound screening method, and an electronic device that can overcome at least one of the above drawbacks.
In a first aspect, an embodiment of the present application provides a method for constructing a sepsis diagnostic model, including: obtaining a sepsis gene expression dataset and an aging gene dataset; analyzing the sepsis gene expression dataset and the aging gene dataset by Gene Ontology (GO) enrichment analysis and Kyoto Encyclopedia of Genes and Genomes (KEGG) enrichment analysis to obtain sepsis-related genes; analyzing the sepsis-related genes by LASSO regression analysis and a support vector machine to obtain sepsis-related hub genes; and constructing a sepsis diagnostic model according to multivariate logistic regression and the sepsis-related hub genes.
According to one embodiment of the present application, after the obtaining of the sepsis gene expression dataset and the aging gene dataset, the method further comprises: processing the sepsis gene expression dataset with linear models for microarray data (limma) so as to standardize and normalize the expression matrix of the sepsis gene expression dataset.
According to one embodiment of the application, the analyzing of the sepsis gene expression dataset to obtain sepsis-related genes includes: analyzing the sepsis gene expression dataset by weighted gene co-expression network analysis (WGCNA) to obtain the sepsis-related genes, wherein the sepsis-related genes comprise genes positively correlated with sepsis and genes negatively correlated with sepsis.
According to one embodiment of the present application, the sepsis diagnostic model construction method further includes: obtaining a sepsis gene expression validation set; and validating the sepsis diagnostic model according to the sepsis gene expression validation set.
According to one embodiment of the application, the sepsis-related hub genes comprise: BCL6, ETS1, ETS2, FOS, MAPK14 and MYC.
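As an illustration only, a multivariate logistic regression over the six hub genes listed above has the following general form. The symbols are assumptions introduced here: x_g denotes the normalized expression level of gene g, and β_0 and β_g denote fitted coefficients, whose values are not given in this part of the application.

```latex
\[
P(\text{sepsis} \mid x) = \frac{1}{1 + \exp\!\left(-\left(\beta_0 + \sum_{g \in G} \beta_g x_g\right)\right)},
\qquad
G = \{\text{BCL6},\ \text{ETS1},\ \text{ETS2},\ \text{FOS},\ \text{MAPK14},\ \text{MYC}\}
\]
```

Under this form, a sample would be classified as sepsis when the predicted probability exceeds a chosen threshold; the threshold is likewise not specified here.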
In a second aspect, embodiments of the present application provide a compound screening method, comprising: obtaining the main compounds in Salvia miltiorrhiza (Danshen, red sage root); obtaining the target proteins of the main compounds; obtaining the genes corresponding to the target proteins; obtaining a sepsis diagnostic model constructed according to the sepsis diagnostic model construction method of the first aspect; obtaining the sepsis-related hub genes according to the sepsis diagnostic model; obtaining common genes, related to sepsis, of Salvia miltiorrhiza according to the genes corresponding to the target proteins and the sepsis-related hub genes; and obtaining sepsis-related compounds in Salvia miltiorrhiza according to the common genes.
According to one embodiment of the application, the common genes include: MYC, FOS, and MAPK14.
According to one embodiment of the application, the sepsis-related compounds comprise: isoprostol I, wu Ermei acid, dihydro Wu Ermei ketone, danshenxinkun B, danshenxinkun A, 2-(4-hydroxy-3-methoxyphenyl)-5-(3-hydroxypropyl)-7-methoxy-3-benzofurancarbaldehyde, tanshinone IIA and methyl rosmarinate.
According to one embodiment of the application, the compound screening method further comprises: preparing a compound for treating sepsis using at least one of the sepsis-related compounds.
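The gene-overlap step of the second aspect can be sketched as a set intersection. This is a minimal illustration, not the application's implementation: the hub genes are those listed in the first aspect, while the compound names (`compound_1` etc.), the extra targets (`GENE_X`, `GENE_Y`) and the function name `screen_compounds` are hypothetical placeholders.

```python
# Hub genes as listed in the first aspect of this application.
HUB_GENES = {"BCL6", "ETS1", "ETS2", "FOS", "MAPK14", "MYC"}

# HYPOTHETICAL compound -> target-gene mapping (placeholder names only;
# the real mapping would come from a compound-target database).
COMPOUND_TARGETS = {
    "compound_1": {"MYC", "GENE_X"},
    "compound_2": {"FOS", "MAPK14"},
    "compound_3": {"GENE_Y"},
}


def screen_compounds(compound_targets, hub_genes):
    """Return (common genes, compounds that target at least one common gene)."""
    # Union of every gene targeted by any compound.
    all_targets = set().union(*compound_targets.values())
    # Common genes: targeted by the herb's compounds AND among the hub genes.
    common = all_targets & hub_genes
    # Keep only compounds that hit at least one common gene.
    related = {
        name: sorted(targets & common)
        for name, targets in compound_targets.items()
        if targets & common
    }
    return common, related


common_genes, related_compounds = screen_compounds(COMPOUND_TARGETS, HUB_GENES)
print(sorted(common_genes))
print(related_compounds)
```

With these placeholder inputs the common genes happen to match the three genes named in the embodiment (MYC, FOS and MAPK14), and `compound_3` is dropped because none of its targets is a hub gene.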
In a third aspect, an embodiment of the present application provides an electronic device, including: a processor; a memory for storing processor-executable instructions; wherein the processor is configured to implement the sepsis diagnostic model building method of the first aspect or the compound screening method of the second aspect when executing the instructions.
The sepsis diagnostic model construction method, the compound screening method and the electronic device provided by the embodiments of the application can screen out, from a large number of genes, core genes closely related to the development of sepsis and construct a sepsis diagnostic model from these genes. These genes play an important role in the pathogenesis and clinical manifestation of sepsis, and the model is of significance for early diagnosis and prognosis evaluation.
Drawings
Fig. 1 is a flowchart of a sepsis diagnostic model construction method according to an embodiment of the present application.
Fig. 2 is a schematic diagram of parameters of a clustering module according to an embodiment of the present application.
Fig. 3a is a schematic diagram of Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) enrichment analysis according to an embodiment of the present application.
Fig. 3b is a schematic diagram of GO and KEGG enrichment analysis according to another embodiment of the present application.
Fig. 4 is a schematic diagram of a protein-protein interaction network according to an embodiment of the present application.
Fig. 5 is a schematic diagram of a core network according to an embodiment of the present application.
Fig. 6 is a schematic diagram of a LASSO analysis according to an embodiment of the present application.
Fig. 7 is a Venn diagram according to an embodiment of the present application.
Fig. 8 is a schematic diagram of a sepsis diagnostic model according to an embodiment of the present application.
Fig. 9 is a flowchart of a compound screening method according to an embodiment of the present application.
Fig. 10 is a Venn diagram according to another embodiment of the present application.
Fig. 11 is a schematic diagram of a compound network according to an embodiment of the present application.
Fig. 12 is a schematic diagram of an electronic device according to an embodiment of the present application.
Description of main reference numerals: 20, electronic device; 21, processor; 22, memory.
Detailed Description
The following description of the embodiments of the present application will be made clearly and fully with reference to the accompanying drawings, in which it is evident that the embodiments described are some, but not all embodiments of the present application.
It should be noted that, in the embodiments of the present application, "at least one" refers to one or more, and a plurality refers to two or more. Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this application belongs. The terminology used in the description of the application herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the application.
It should be noted that, in the embodiments of the present application, the terms "first," "second," and the like are used for distinguishing between the descriptions and not necessarily for indicating or implying a relative importance, or for indicating or implying a sequence. Features defining "first", "second" may include one or more of the stated features, either explicitly or implicitly. In describing embodiments of the present application, words such as "exemplary" or "such as" are used to mean serving as examples, illustrations, or descriptions. Any embodiment or design described herein as "exemplary" or "e.g." in an embodiment should not be taken as preferred or advantageous over other embodiments or designs. Rather, the use of words such as "exemplary" or "such as" is intended to present related concepts in a concrete fashion.
All other embodiments, which can be made by those skilled in the art based on the embodiments of the present application without any inventive effort, are intended to be within the scope of the present application.
Sepsis is a severe, infection-induced syndrome with far-reaching and potentially fatal effects. The systemic inflammatory response triggered by sepsis may lead to reduced blood pressure, tissue and organ damage, and multiple organ dysfunction syndrome (MODS). Because the early symptoms of sepsis are often vague and difficult to diagnose accurately in time, patients often seek treatment only when the illness is already serious, which delays treatment.
The widely used diagnostic standard Sepsis-3, although reasonably accurate, involves examinations of multiple organ systems. These take a long time, which may delay treatment and affect the prognosis of the patient.
In view of the above, the application provides a sepsis diagnostic model construction method, a compound screening method and an electronic device. The sepsis diagnostic model construction method combines a sepsis gene expression dataset with an aging gene dataset and exploits the association between aging and sepsis to improve the accuracy of sepsis prediction and diagnosis. Genes associated with sepsis are identified by analyzing the sepsis gene expression dataset, and hub genes associated with sepsis are then determined through Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) enrichment analysis. These hub genes may play an important role in the development and progression of sepsis, so incorporating them into the sepsis diagnostic model is expected to improve the prediction accuracy and stability of the model. The sepsis-related hub genes are combined with other clinical indicators by multivariate logistic regression to build the sepsis diagnostic model, which gives clinicians an effective tool for diagnosing and treating sepsis earlier and thereby improving patient survival and treatment outcomes.
Some embodiments of the application are described in detail below with reference to the accompanying drawings. The following embodiments and features of the embodiments may be combined with each other without conflict.
Fig. 1 is a flowchart of a sepsis diagnostic model construction method according to an embodiment of the present application. As shown in Fig. 1, the sepsis diagnostic model construction method comprises at least the following steps:
S100: obtaining a sepsis gene expression dataset and an aging gene dataset;
S200: analyzing the sepsis gene expression dataset and the aging gene dataset by Gene Ontology (GO) enrichment analysis and Kyoto Encyclopedia of Genes and Genomes (KEGG) enrichment analysis to obtain sepsis-related genes;
S300: analyzing the sepsis-related genes by LASSO regression analysis and a support vector machine to obtain sepsis-related hub genes;
S400: constructing a sepsis diagnostic model according to multivariate logistic regression and the sepsis-related hub genes.
S100: obtaining a sepsis gene expression dataset and an aging gene dataset.
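To illustrate why an L1 (LASSO-type) penalty can act as a gene filter in step S300, the following minimal sketch fits an L1-penalized logistic regression by proximal gradient descent and keeps only the genes whose coefficient is non-zero. It is a stand-in under stated assumptions, not the application's implementation: the gene names `GENE_A` and `GENE_B`, the toy data, and the parameters `lam`, `lr` and `n_iter` are all hypothetical, and the support vector machine part of S300 is not shown.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def soft_threshold(value, threshold):
    """Proximal operator of the L1 penalty: shrink towards zero."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def fit_l1_logistic(X, y, lam=0.1, lr=0.5, n_iter=500):
    """L1-penalized logistic regression via proximal gradient descent."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(n_iter):
        # Prediction error (predicted probability minus label) per sample.
        errors = [
            sigmoid(b + sum(wj * xj for wj, xj in zip(w, row))) - yi
            for row, yi in zip(X, y)
        ]
        grad_w = [sum(e * row[j] for e, row in zip(errors, X)) / n for j in range(d)]
        grad_b = sum(errors) / n
        # Gradient step followed by soft-thresholding (bias is not penalized).
        w = [soft_threshold(wj - lr * gj, lr * lam) for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b


# HYPOTHETICAL toy data: GENE_A separates the classes, GENE_B is uninformative.
GENES = ["GENE_A", "GENE_B"]
X = [[2.0, 1.0], [1.0, -1.0], [2.0, -1.0], [1.0, 1.0],
     [-2.0, 1.0], [-1.0, -1.0], [-2.0, -1.0], [-1.0, 1.0]]
y = [1, 1, 1, 1, 0, 0, 0, 0]

weights, bias = fit_l1_logistic(X, y)
selected = [gene for gene, w in zip(GENES, weights) if w != 0.0]
print(selected)
```

In this toy setting the uninformative gene keeps a coefficient of exactly zero and is dropped, while the informative gene is retained. In practice the penalty strength would normally be tuned by cross-validation.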
In the embodiment of the application, in the sepsis diagnostic model construction method, a sepsis gene expression data set and an aging gene data set are first acquired in step S100.
Specifically, the sepsis diagnostic model construction method provided by the embodiment of the application obtains sepsis-related gene expression datasets from the Gene Expression Omnibus (GEO) database, namely GSE26440, GSE13904 and GSE32707. GSE26440 includes 32 control samples and 98 sepsis samples, GSE13904 includes 18 control samples and 158 sepsis samples, and GSE32707 includes 34 control samples and 89 sepsis samples. In an embodiment of the application, GSE26440 is used as the training set and GSE13904 and GSE32707 are used as validation sets.
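The cohort layout described above can be captured in a small configuration, which makes the training/validation split and the class balance explicit. The sample counts and roles are taken from the text; the dictionary layout and the helper name `summarize` are assumptions made for illustration.

```python
# Sample counts and roles as stated in the description.
COHORTS = {
    "GSE26440": {"control": 32, "sepsis": 98, "role": "training"},
    "GSE13904": {"control": 18, "sepsis": 158, "role": "validation"},
    "GSE32707": {"control": 34, "sepsis": 89, "role": "validation"},
}


def summarize(cohorts, role):
    """Total control/sepsis sample counts over all cohorts with the given role."""
    selected = [c for c in cohorts.values() if c["role"] == role]
    control = sum(c["control"] for c in selected)
    sepsis = sum(c["sepsis"] for c in selected)
    return {"control": control, "sepsis": sepsis, "total": control + sepsis}


training_summary = summarize(COHORTS, "training")
validation_summary = summarize(COHORTS, "validation")
print(training_summary)
print(validation_summary)
```

The training set thus holds 130 samples, of which 98 (about 75%) are sepsis samples, and the two validation sets together hold 299 samples.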
In the embodiment of the application, the sepsis diagnostic model construction method further comprises the step of processing the sepsis gene expression data set by using a microarray data linear model so as to normalize and normalize an expression matrix of the sepsis gene expression data set.
Specifically, a linear model (linear models for microarray data, limma) software package of microarray data in R language may be applied to implement normalization processing and normalization processing for data sets GSE26440, GSE13904, and GSE32707.
It will be appreciated that preprocessing the raw data is an important step that can improve data quality and accuracy. Standardizing and normalizing the expression matrix of each dataset with the limma software package can eliminate technical variation among the different datasets, ensure that the data are consistent and comparable, and provide a more reliable basis for subsequent analysis and mining.
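The normalization idea can be illustrated with a minimal Python sketch. It is only an analogue under stated assumptions: the embodiment uses the limma package in R and does not say which normalization is applied, so the log2 transform, the quantile normalization, the function name `quantile_normalize` and the toy intensity values are all assumptions made for illustration.

```python
import math

def quantile_normalize(matrix):
    """Quantile-normalize the columns (samples) of a genes x samples matrix.

    Assumption: ties are broken by row order rather than averaged.
    """
    n_genes = len(matrix)
    n_samples = len(matrix[0])
    columns = [[row[j] for row in matrix] for j in range(n_samples)]
    # Mean of the k-th smallest value across all samples.
    sorted_cols = [sorted(col) for col in columns]
    rank_means = [sum(col[k] for col in sorted_cols) / n_samples
                  for k in range(n_genes)]
    result = [[0.0] * n_samples for _ in range(n_genes)]
    for j, col in enumerate(columns):
        # Give the gene ranked k-th in this sample the k-th rank mean.
        order = sorted(range(n_genes), key=lambda i: col[i])
        for rank, i in enumerate(order):
            result[i][j] = rank_means[rank]
    return result

# Hypothetical raw intensities: 4 genes (rows) x 3 samples (columns).
raw = [[5.0, 4.0, 3.0],
       [2.0, 1.0, 4.0],
       [3.0, 4.0, 6.0],
       [4.0, 2.0, 8.0]]
log_expr = [[math.log2(v + 1.0) for v in row] for row in raw]
normalized = quantile_normalize(log_expr)
```

After this step every sample column has the same distribution of values, which is one way of making samples from different arrays comparable.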
It will be appreciated that the sepsis diagnostic model construction method provided by the embodiments of the present application further includes obtaining an aging gene dataset from a database (e.g., CellAge). The aging gene dataset includes 183 non-tumor senescence-associated genes. A non-tumor senescence-associated gene is a gene that plays an important role in the senescence process but is not associated with tumor progression. These genes play a key role in physiological processes such as cell aging and the decline of tissue function. Because sepsis itself does not involve a tumor, selecting non-tumor senescence-associated genes allows the association between senescence and sepsis to be better analyzed.
It is understood that aging is a known risk factor for sepsis. With age, the function of immune cells gradually declines, resulting in a defective immune response. In particular, in severe infectious diseases such as sepsis, damage to the immune system may leave the body unable to eliminate the source of infection effectively, thereby exacerbating the condition. Incorporating aging-related genes into the sepsis diagnostic model construction method can improve prediction accuracy and risk assessment accuracy, enhance the understanding of the mechanism by which sepsis develops, and provide more personalized medical guidance for individuals.
S200: analyzing the sepsis gene expression dataset and the aging gene dataset according to gene ontology enrichment analysis and Kyoto Encyclopedia of Genes and Genomes enrichment analysis to obtain sepsis-related genes.
In the embodiment of the application, the method for constructing the sepsis diagnostic model further comprises, in step S200, analyzing the sepsis gene expression dataset and the aging gene dataset according to gene ontology enrichment analysis and Kyoto Encyclopedia of Genes and Genomes enrichment analysis to obtain sepsis-related genes.
Specifically, the weighted gene co-expression network analysis (WGCNA) package in the R language was used to analyze the training set GSE26440 and construct an undirected network, wherein the topology fitting index (Topological Overlap Measure, TOM) of the undirected network was 0.8 and the soft threshold was 10.
It will be appreciated that constructing an undirected network is one method used in weighted gene co-expression network analysis (WGCNA). In this process, a gene co-expression network is first constructed by calculating the correlation between genes, which can be measured by different correlation indexes.
It will be appreciated that the topology fitting index is an indicator in WGCNA that measures similarity between genes. It measures the topological similarity between genes by taking into account the pattern of connections of the genes in the network. The value range of the topology fitting index is -1 to 1; the higher the value, the tighter the connection between genes and the stronger the correlation. Setting the topology fitting index to 0.8 means that the focus is on highly similar genes when the undirected network is constructed.
It will be appreciated that the similarity between nodes in an undirected network is often expressed in terms of weights. The soft threshold is a parameter used to adjust node similarity weights, and it controls the connection density in the network. Setting the soft threshold to 10 means that relatively strong connections are retained when the network is constructed; edges will only be formed between pairs of nodes that have a similarity above the soft threshold.
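As a rough illustration of how correlation, soft thresholding and topological overlap fit together, the following Python sketch computes an unsigned adjacency by raising the absolute Pearson correlation to a power and then derives a topological overlap matrix from it. It is not the WGCNA implementation used in the embodiment; the formulas are the commonly described unsigned variants (whose values fall between 0 and 1), and the function names and the three toy expression profiles are assumptions for illustration.

```python
def pearson(x, y):
    """Pearson correlation of two equal-length expression profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

def soft_threshold_adjacency(expr, power):
    """Unsigned adjacency a_ij = |cor(gene_i, gene_j)| ** power, diagonal 0."""
    g = len(expr)
    return [[0.0 if i == j else abs(pearson(expr[i], expr[j])) ** power
             for j in range(g)] for i in range(g)]

def topological_overlap(adj):
    """TOM_ij = (sum_u a_iu * a_uj + a_ij) / (min(k_i, k_j) + 1 - a_ij)."""
    g = len(adj)
    k = [sum(row) for row in adj]  # connectivity of each gene
    tom = [[1.0] * g for _ in range(g)]
    for i in range(g):
        for j in range(g):
            if i != j:
                shared = sum(adj[i][u] * adj[u][j] for u in range(g))
                tom[i][j] = (shared + adj[i][j]) / (min(k[i], k[j]) + 1.0 - adj[i][j])
    return tom

# Hypothetical expression profiles: 3 genes x 5 samples.
expr = [[1.0, 2.0, 3.0, 4.0, 5.0],     # gene A
        [2.0, 4.0, 6.0, 8.0, 10.5],    # gene B, nearly proportional to A
        [5.0, 1.0, 4.0, 2.0, 3.0]]     # gene C, weakly related to A and B
adjacency = soft_threshold_adjacency(expr, power=10)  # soft threshold of 10
tom = topological_overlap(adjacency)
```

With a power of 10, the weak correlations involving gene C are pushed close to zero while the strong A-B relationship is largely preserved, which is the practical effect of a high soft threshold.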
In an embodiment of the application, gene ontology (Gene Ontology, GO) enrichment analysis and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment analysis may be performed on the senescence-associated genes related to sepsis by using the clusterProfiler software package in the R language. GO analysis is mainly used to identify the enrichment of genes in biological processes, cellular components and molecular functions, so as to reveal the role of these genes in cellular functions. KEGG pathway analysis is mainly used to identify sets of genes associated with particular biological processes and signaling pathways.
Referring also to fig. 2, fig. 2 is a schematic diagram of parameters of a clustering module according to an embodiment of the application.
In the present example, by setting the p-value threshold of the GO analysis to 0.05 and using the GO enrichment analysis (enrichGO) function, the gene sets significantly enriched in aging and sepsis were identified. The results of the GO analysis are then presented using a graphical tool in the R language, for example the ggplot2 software package. It will be appreciated that one skilled in the art may select different graphical tools to present the results of the analysis as desired, and the application is not limited in this regard.
In an embodiment of the application, fig. 2 shows the correlation of each module with sepsis together with the corresponding p value. The genes in the black module show the highest positive correlation with sepsis (0.54), and the genes in the green module show the highest negative correlation with sepsis (-0.57); the p values for both modules are given in fig. 2.
In the embodiment of the application, by analyzing each module in the undirected network, the two modules with the highest correlation with sepsis, namely the black module and the green module, are obtained. The black module is the positive correlation module and has the maximum positive correlation with sepsis. The green module is the negative correlation module and has the maximum negative correlation with sepsis. The positive correlation module includes 1953 genes and the negative correlation module includes 2823 genes.
Referring also to fig. 3a and 3b, fig. 3a and 3b are schematic diagrams of the Kyoto Encyclopedia of Genes and Genomes enrichment analysis and the gene ontology enrichment analysis according to an embodiment of the present application.
In embodiments of the present application, KEGG pathway analysis is primarily used to identify gene sets associated with specific biological processes and signaling pathways. KEGG pathway enrichment analysis was performed on the gene set by setting the p-value threshold to 0.05 and using the KEGG enrichment analysis (enrichKEGG) function. As shown in fig. 3a and 3b, the aging gene dataset shares 22 genes with the black module and 35 genes with the green module. That is, 57 sepsis-related genes were obtained in total by KEGG pathway analysis.
It will be appreciated that in embodiments of the application, not only genes associated with sepsis may be identified by KEGG pathway analysis, but these genes may also be correlated with known biological pathways and processes. The key pathways in the pathogenesis of sepsis can be further analyzed, and a basis is provided for constructing a sepsis diagnosis model.
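The p-value threshold of 0.05 used for the GO and KEGG analyses can be made concrete with a small Python sketch of an over-representation test. It assumes the usual hypergeometric formulation of such a test; the function name `enrichment_p_value` and every number in the example other than the 57 selected genes (universe size, term size, overlap) are hypothetical and are not results from the embodiment.

```python
from math import comb

def enrichment_p_value(universe, annotated, selected, overlap):
    """Upper-tail hypergeometric probability P(X >= overlap).

    universe:  number of background genes
    annotated: background genes annotated to the term or pathway
    selected:  size of the gene list being tested
    overlap:   selected genes that are annotated to the term
    """
    total = comb(universe, selected)
    upper = min(annotated, selected)
    tail = sum(comb(annotated, k) * comb(universe - annotated, selected - k)
               for k in range(overlap, upper + 1))
    return tail / total

# Hypothetical example: 57 selected genes, 8 of which fall in a term that
# annotates 200 of 20000 background genes.
p_example = enrichment_p_value(universe=20000, annotated=200, selected=57, overlap=8)
is_enriched = p_example < 0.05
```

A term is reported as significantly enriched only when its p value falls below the chosen threshold; in practice a multiple-testing correction is usually applied as well, which this sketch omits.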
S300: analyzing the sepsis-related genes by using lasso regression analysis and a support vector machine to obtain sepsis-related hub genes.
In the embodiment of the application, the method for constructing the sepsis diagnostic model further includes, in step S300, analyzing the sepsis-related genes by using lasso regression analysis and a support vector machine to obtain sepsis-related hub genes.
Specifically, in an embodiment of the present application, a protein-protein network is first constructed using the interacting gene/protein search tool (Search Tool for the Retrieval of Interacting Genes/Proteins, STRING), and the threshold for the high confidence score is set to 0.7. It will be appreciated that the high confidence score threshold is used to decide which protein interactions in the protein-protein network are considered reliable. In this embodiment, interactions below this threshold are excluded; an interaction is included in the constructed protein-protein interaction (PPI) network only if its confidence score is 0.7 or higher. Setting the threshold ensures that the interaction information contained in the constructed PPI network has higher reliability, which facilitates subsequent analysis.
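The effect of the confidence threshold can be sketched in a few lines of Python. The gene names below appear in this document, but the interaction scores are invented purely for illustration and do not come from STRING; the function name `filter_interactions` is likewise hypothetical.

```python
def filter_interactions(edges, min_score=0.7):
    """Keep only interactions whose confidence score is at least min_score."""
    return [(a, b, score) for a, b, score in edges if score >= min_score]

# Hypothetical (protein, protein, confidence score) records.
edges = [
    ("MYC", "FOS", 0.95),
    ("FOS", "MAPK14", 0.82),
    ("BCL6", "ETS1", 0.45),   # below the threshold, so it is dropped
    ("ETS1", "ETS2", 0.70),   # exactly at the threshold, so it is kept
]
ppi_network = filter_interactions(edges, min_score=0.7)
```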
Referring to fig. 4, fig. 4 is a schematic diagram of a protein-protein network according to an embodiment of the application.
The data in the protein-protein network is then imported into the Cytoscape software, and the core network in the protein-protein network is extracted according to the molecular complex detection (Molecular Complex Detection, MCODE) algorithm.
It will be appreciated that when the core network is extracted using the MCODE algorithm, the default settings of the MCODE algorithm may be used, i.e. a node degree threshold of 2, a k-core threshold of 2, and a maximum depth of 100.
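The role of the k-core threshold can be illustrated with a short Python sketch that repeatedly removes nodes having fewer than k neighbours. This shows only the k-core filtering idea; the full MCODE algorithm also weights vertices and grows clusters from seed nodes, which is not reproduced here. The function name `k_core` and the toy graph are assumptions.

```python
def k_core(adjacency, k):
    """Return the node set of the k-core of an undirected graph.

    adjacency maps each node to an iterable of its neighbours.
    """
    graph = {node: set(nbrs) for node, nbrs in adjacency.items()}
    changed = True
    while changed:
        changed = False
        # Nodes whose current degree is below k cannot belong to the k-core.
        for node in [n for n, nbrs in graph.items() if len(nbrs) < k]:
            for nbr in graph.pop(node):
                graph[nbr].discard(node)
            changed = True
    return set(graph)

# Hypothetical network: a triangle A-B-C with a pendant node D attached to C.
toy_network = {
    "A": ["B", "C"],
    "B": ["A", "C"],
    "C": ["A", "B", "D"],
    "D": ["C"],
}
core_nodes = k_core(toy_network, k=2)
```

With k set to 2, the pendant node is discarded and only the densely connected triangle remains.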
Referring to fig. 5, fig. 5 is a schematic diagram of a core network according to an embodiment of the application. As shown in fig. 5, the core network contains 13 genes in total.
Subsequently, the 13 genes in the core network were analyzed using lasso regression analysis (Least Absolute Shrinkage and Selection Operator, LASSO). Specifically, ten-fold cross-validation was applied to select the lambda value in the lasso analysis.
Referring to fig. 6, fig. 6 is a schematic diagram of the lasso analysis according to an embodiment of the present application. As shown in fig. 6, there are 11 genes with non-zero coefficients.
It will be appreciated that the lasso analysis may automatically make feature selection by adjusting the regularization parameters (lambda values) to bring coefficients of certain features to zero, thereby excluding features that contribute less or are not relevant to the model. Non-zero coefficients refer to coefficients that remain and are not zero after lasso analysis regularization. By preserving non-zero coefficients, lasso analysis can find the most important features, thereby improving the predictive performance of the model.
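How the regularization parameter drives some coefficients to exactly zero can be shown with a minimal coordinate-descent sketch in Python. For brevity it fits a linear model without an intercept rather than the classification setting of the embodiment, and it uses a fixed lambda instead of ten-fold cross-validation; the function names and the toy data are assumptions.

```python
def soft_threshold(value, lam):
    """Shrink value toward zero by lam; values within [-lam, lam] become 0."""
    if value > lam:
        return value - lam
    if value < -lam:
        return value + lam
    return 0.0

def lasso_coordinate_descent(X, y, lam, n_iter=200):
    """Minimize (1 / (2n)) * ||y - X b||^2 + lam * ||b||_1 (no intercept)."""
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    for _ in range(n_iter):
        for j in range(p):
            rho = 0.0   # correlation of feature j with the partial residual
            norm = 0.0  # squared norm of feature j
            for i in range(n):
                others = sum(X[i][k] * beta[k] for k in range(p) if k != j)
                rho += X[i][j] * (y[i] - others)
                norm += X[i][j] ** 2
            beta[j] = soft_threshold(rho / n, lam) / (norm / n)
    return beta

# Hypothetical data: y depends strongly on feature 0 and weakly on feature 1.
X = [[-1.0, -1.0], [-1.0, 1.0], [1.0, -1.0], [1.0, 1.0]]
y = [2.0 * a + 0.2 * b for a, b in X]
coefficients = lasso_coordinate_descent(X, y, lam=0.5)
selected = [j for j, b in enumerate(coefficients) if b != 0.0]
```

In this toy case the weak feature receives a coefficient of exactly zero and is dropped, while the strong feature is kept with a shrunken coefficient.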
Subsequently, 13 genes in the core network were analyzed using a support vector machine (Support Vector Machine, SVM) and 8 high-correlation genes were obtained.
Finally, a Venn diagram tool (Venn Diagram Plotter) is used to take the intersection of the high-correlation genes screened by the support vector machine and the non-zero coefficient genes from the lasso regression analysis, so as to obtain the sepsis-related hub genes.
Referring also to fig. 7, fig. 7 is a Venn diagram according to an embodiment of the application. As shown in fig. 7, a total of 6 sepsis-related hub genes were screened.
In the embodiment of the application, the finally obtained sepsis-related hub genes are BCL6, ETS1, ETS2, FOS, MAPK14 and MYC.
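The Venn intersection amounts to a set intersection, sketched below in Python. Only the six hub genes are named in this document; the remaining members of the lasso list (11 genes) and of the support vector machine list (8 genes) are not disclosed, so they are represented by clearly labeled placeholder names.

```python
hub_genes = {"BCL6", "ETS1", "ETS2", "FOS", "MAPK14", "MYC"}

# Placeholders stand in for the undisclosed genes selected by only one method.
lasso_genes = hub_genes | {f"LASSO_ONLY_{i}" for i in range(1, 6)}   # 11 genes
svm_genes = hub_genes | {f"SVM_ONLY_{i}" for i in range(1, 3)}       # 8 genes

# Genes retained by both feature-selection methods.
intersection = lasso_genes & svm_genes
```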
It is understood that BCL6 is an abbreviation for B cell lymphoma 6 protein, a transcriptional repressor that plays a key role in regulating immune response and cell proliferation. ETS1 and ETS2 are members of a transcription factor family that plays an important role in cell proliferation, differentiation and immune regulation. FOS is a member of the FOS family of transcription factors, which are involved in the regulation of the cell cycle and in inflammatory responses. MAPK14 is mitogen-activated protein kinase 14, a member of the MAPK family, which plays a key role in cell signaling and inflammatory responses. MYC is a transcription factor that has important effects on cell growth and differentiation by regulating cell proliferation and apoptosis.
S400: constructing a sepsis diagnostic model according to multi-factor logistic regression and the sepsis-related hub genes.
In the embodiment of the application, the method for constructing the sepsis diagnostic model further comprises, in step S400, constructing the sepsis diagnostic model according to multi-factor logistic regression and the sepsis-related hub genes.
Specifically, in embodiments of the present application, the regression modeling strategies (Regression Modeling Strategies, rms) software package in the R language may be used to implement the multi-factor logistic regression and ultimately obtain the sepsis diagnostic model.
Referring to fig. 8, fig. 8 is a schematic diagram of a sepsis diagnosis model according to an embodiment of the present application.
The method of using the sepsis diagnostic model is described below. The first row of the nomogram is the points scale for the individual items; each tick on it represents a score for an individual item. Rows 2 to 7 list, in order, the names of the individual items (for example BCL6 and ETS1) together with their coefficient scales. For each item, the corresponding point on its coefficient scale is located, and a vertical line is drawn upward from that point until it intersects the first row; the intersection gives the score of that item. The scores of all individual items are added to obtain a total score. The scale point corresponding to the total score is then located in the 8th row, and the risk in the 9th row corresponding to that scale point is the risk that the user suffers from sepsis at that total score.
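The reading procedure of the nomogram can be mirrored numerically. The following Python sketch assumes a standard logistic nomogram in which the item with the widest contribution span defines the 0 to 100 points scale; the intercept, coefficients, expression ranges and patient values are hypothetical and are not the fitted values of the embodiment.

```python
import math

def nomogram(intercept, coefs, ranges):
    """Return (points, risk) functions for a logistic-regression nomogram.

    coefs:  {gene: coefficient}
    ranges: {gene: (lowest, highest)} expression values shown on each scale
    """
    # Smallest contribution of each gene to the linear predictor.
    floor = {g: min(b * ranges[g][0], b * ranges[g][1]) for g, b in coefs.items()}
    # The widest contribution span is mapped to 100 points.
    span = max(abs(b) * (ranges[g][1] - ranges[g][0]) for g, b in coefs.items())

    def points(gene, value):
        return 100.0 * (coefs[gene] * value - floor[gene]) / span

    def risk(total_points):
        linear_predictor = intercept + sum(floor.values()) + total_points * span / 100.0
        return 1.0 / (1.0 + math.exp(-linear_predictor))

    return points, risk

# Hypothetical model parameters and patient expression values.
intercept = -5.0
coefs = {"BCL6": 0.8, "ETS1": -0.5, "ETS2": 0.3, "FOS": 0.4, "MAPK14": 1.0, "MYC": -0.6}
ranges = {gene: (4.0, 12.0) for gene in coefs}
patient = {"BCL6": 9.0, "ETS1": 6.5, "ETS2": 8.0, "FOS": 10.0, "MAPK14": 9.5, "MYC": 5.0}

points, risk = nomogram(intercept, coefs, ranges)
item_points = {gene: points(gene, value) for gene, value in patient.items()}
total_points = sum(item_points.values())
predicted_risk = risk(total_points)
```

Adding the per-item points and reading the risk from the total reproduces the probability that the underlying logistic regression would give directly, which is why the graphical procedure is equivalent to evaluating the model.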
In the embodiment of the application, the hub genes are used as the diagnostic basis of the sepsis diagnostic model, so that the accuracy and reliability of the model are improved and more accurate support is provided for the early diagnosis and treatment of sepsis patients.
In an embodiment of the present application, the method for constructing a sepsis diagnostic model further includes verifying the effect of the sepsis diagnostic model on the validation sets GSE13904 and GSE32707.
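The text does not state which metric is used for verification. One common choice for a diagnostic model, assumed here purely for illustration, is the area under the ROC curve (AUC), which can be computed in Python as the probability that a sepsis sample receives a higher predicted risk than a control sample. The labels and scores below are hypothetical.

```python
def roc_auc(labels, scores):
    """AUC via pairwise comparison of sepsis (1) and control (0) samples.

    Ties between a sepsis score and a control score count as half a win.
    """
    positives = [s for label, s in zip(labels, scores) if label == 1]
    negatives = [s for label, s in zip(labels, scores) if label == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in positives for n in negatives)
    return wins / (len(positives) * len(negatives))

# Hypothetical validation labels and model-predicted risks.
validation_labels = [0, 0, 1, 1]
validation_scores = [0.10, 0.40, 0.35, 0.80]
auc = roc_auc(validation_labels, validation_scores)
```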
Fig. 9 is a schematic flowchart of a compound screening method according to another embodiment of the present application. The compound screening method shown in fig. 9 comprises at least the following steps: S110: obtaining main compounds in red sage root; S210: obtaining targeting proteins from the main compounds; S310: obtaining genes corresponding to the targeting proteins; S410: obtaining a sepsis diagnostic model constructed according to the sepsis diagnostic model construction method; S510: obtaining sepsis-related hub genes according to the sepsis diagnostic model; S610: obtaining common genes related to sepsis in red sage root according to the genes corresponding to the targeting proteins and the sepsis-related hub genes; S710: obtaining sepsis-related compounds in red sage root according to the common genes.
S110: obtaining main compounds in red sage root.
According to the compound screening method provided by the embodiment of the present application, the main compounds of red sage root are first obtained in step S110.
It is understood that red sage root (scientific name: Salvia miltiorrhiza), also known as red root or danshen, is a common herb belonging to the Labiatae family. It is an important traditional Chinese herbal medicine with a long history and wide application. The active ingredients of red sage root have antioxidant and anti-inflammatory effects, can relieve oxidative stress and inflammatory reactions, and are used for preventing and improving cardiovascular diseases and nervous system diseases. In practice, red sage root has a good therapeutic effect on patients with sepsis, but its composition is complex, and in particular it is not clear which of its compounds have therapeutic effects on sepsis.
In the embodiment of the application, the main components of red sage root are first obtained, 202 compounds in total. The main components are then analyzed, and 65 compounds with potential effects are screened out. Specifically, the 65 compounds can be screened from the 202 compounds by requiring oral bioavailability (Oral Bioavailability, OB) > 30 and drug-likeness (Drug-likeness, DL) > 0.18.
It is understood that oral bioavailability refers to the percentage of a drug that is absorbed and produces a biological effect in the body after oral administration. Drug-likeness refers to how similar the chemical structure of a compound is to known drug molecules. By calculating and screening on oral bioavailability and drug-likeness, compounds with higher oral bioavailability and drug-likeness can be selected, and these are more likely to be potential drug candidates.
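The OB and DL screening rule is a simple threshold filter, sketched below in Python. The thresholds are those stated above; the compound records and the function name `screen_compounds` are hypothetical and do not reproduce the 202 compounds of the embodiment.

```python
def screen_compounds(compounds, min_ob=30.0, min_dl=0.18):
    """Return names of compounds with OB > min_ob and DL > min_dl."""
    return [c["name"] for c in compounds
            if c["ob"] > min_ob and c["dl"] > min_dl]

# Hypothetical compound records (OB as a percentage, DL as a unitless score).
candidates = [
    {"name": "compound_A", "ob": 45.2, "dl": 0.40},  # passes both filters
    {"name": "compound_B", "ob": 12.0, "dl": 0.55},  # fails the OB filter
    {"name": "compound_C", "ob": 60.1, "dl": 0.10},  # fails the DL filter
    {"name": "compound_D", "ob": 30.0, "dl": 0.30},  # OB not strictly above 30
]
potential_compounds = screen_compounds(candidates)
```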
S210: obtaining targeting proteins from the main compounds.
According to the compound screening method provided by the embodiment of the application, step S210 includes obtaining the targeting proteins according to the main compounds obtained in step S110.
In the examples of the present application, 354 targeting proteins were obtained in total by analyzing the potential targeting proteins of these 65 compounds. Specifically, the targeting proteins corresponding to a compound can be queried with the interacting gene/protein search tool (Search Tool for the Retrieval of Interacting Genes/Proteins, STRING).
S310: obtaining genes corresponding to the targeting proteins.
According to the compound screening method provided by the embodiment of the application, step S310 includes obtaining the corresponding genes according to the targeting proteins acquired in step S210.
In the embodiment of the application, the genes corresponding to the 354 targeting proteins can be obtained through the interacting gene/protein search tool.
S410: obtaining a sepsis diagnostic model constructed according to a sepsis diagnostic model construction method.
According to the compound screening method provided by the embodiment of the present application, step S410 further includes obtaining a sepsis diagnosis model constructed by the sepsis diagnosis model construction method, and the specific obtaining manner is shown in fig. 1 to 8 and corresponding descriptions thereof, which are not repeated here.
S510: obtaining sepsis-related hub genes according to the sepsis diagnostic model.
According to the compound screening method provided by the embodiment of the present application, step S510 includes obtaining the sepsis-related hub genes from the sepsis diagnostic model; the specific manner of obtaining them is shown in fig. 1 to 8 and the corresponding descriptions, which are not repeated here.
S610: obtaining common genes related to sepsis in red sage root according to the genes corresponding to the targeting proteins and the sepsis-related hub genes.
According to the compound screening method provided by the embodiment of the application, step S610 includes obtaining the common genes related to sepsis in red sage root according to the genes corresponding to the targeting proteins and the sepsis-related hub genes.
Specifically, referring also to fig. 10, fig. 10 is a Venn diagram according to another embodiment of the present application. As shown in fig. 10, the genes corresponding to the targeting proteins and the sepsis-related hub genes share three common genes. In the embodiment of the application, 3 common genes (MYC, FOS and MAPK14) are found by taking the Venn diagram intersection of the genes of the 354 targeting proteins and the 6 sepsis-related hub genes obtained in the sepsis diagnostic model construction method. This suggests that red sage root may act on sepsis patients through these three genes. These common genes may play a key role in the onset and progression of sepsis, and the compounds of red sage root may exert a therapeutic effect by modulating the expression and function of these genes.
S710: obtaining sepsis-related compounds in red sage root according to the common genes.
According to the compound screening method provided by the embodiment of the application, step S710 includes obtaining the sepsis-related compounds in red sage root according to the common genes.
In the embodiment of the application, a network diagram of the three genes and their corresponding compounds is drawn with Cytoscape, so as to better understand the relationship between the red sage root compounds and the sepsis-related genes.
It will be appreciated that the network diagram is a graphical representation of the interaction relationship between genes and compounds in the form of nodes (genes and compounds) and edges (interactions). In this network diagram, genes and compounds are represented as nodes, respectively, and the targeting relationship between compounds and genes is represented as edges.
Referring to fig. 11, fig. 11 is a schematic diagram of a compound network according to an embodiment of the application. In the embodiment of the application, a total of nine compounds in red sage root that may be related to the treatment of sepsis were screened out. These compounds are: isotanshinone I (Isotanshinone I), ursolic acid (Ursolic Acid), dihydro Wu Ermei ketone (dihydroxosothashinone), danshen-neoquinone B (dan-shixinkum B), danshen-neoquinone A (dan-shixinkum a), 2-(4-hydroxy-3-methoxyphenyl)-5-(3-hydroxypropyl)-7-methoxy-3-benzofurancarboxaldehyde, apigenin, tanshinone IIA (Tanshinone IIA), and methyl rosmarinate (Methyl rosmarinate).
In an embodiment of the application, the method of compound screening further comprises the step of preparing a compound for treating sepsis using at least one of the sepsis-related compounds.
It will be appreciated that the presence of these compounds in salvia miltiorrhiza makes salvia miltiorrhiza a potential natural drug source for sepsis treatment. They may play the following roles in sepsis treatment: antiinflammatory, antioxidant, antibacterial, cardiovascular protecting and antitumor effects. These compounds may provide new potential therapeutic directions and drug candidates for the treatment of sepsis.
Fig. 12 is a schematic diagram of an electronic device 20 according to an embodiment of the present application. As shown in fig. 12, the electronic device 20 includes at least a processor 21 and a memory 22.
In an embodiment of the application, the memory 22 is used for storing instructions executable by the processor 21, the processor 21 being configured to implement a sepsis diagnostic model building method as shown in fig. 1 or a compound screening method as shown in fig. 9 when executing the instructions.
In an embodiment of the application, a computer readable storage medium comprises instructions instructing a device to perform the sepsis diagnostic model building method according to the first aspect. For example, the instructions instruct the device to perform a sepsis diagnostic model building method as shown in fig. 1 or a compound screening method as shown in fig. 9.
The program executed in the electronic device 20 according to an embodiment of the present application may be a program (a program for causing a computer to function) that controls a central processing unit (Central Processing Unit, CPU) or the like so as to realize the functions of the above-described embodiment according to an aspect of the present application. Information processed by these devices is temporarily stored in a random access memory (Random Access Memory, RAM) while the processing is performed, is then stored in various ROMs such as a flash read-only memory (Flash ROM) or in a hard disk drive (Hard Disk Drive, HDD), and is read, corrected, and written by the CPU as necessary.
Note that, a part of the electronic device 20 of the above embodiment may be implemented by a computer. In this case, the program for realizing the control function may be recorded on a computer-readable recording medium, and the program recorded on the recording medium may be read into a computer system and executed.
The term "computer system" as used herein refers to a computer system built into the electronic device 20, and uses hardware including an OS and peripheral devices. The term "computer-readable recording medium" refers to a removable medium such as a flexible disk, a magneto-optical disk, a ROM, or a CD-ROM, and a storage device such as a hard disk incorporated in a computer system.
Also, the "computer-readable recording medium" may include: a medium for dynamically storing a program in a short time, such as a communication line in the case of transmitting the program via a network such as the internet or a communication line such as a telephone line; a medium storing a program for a fixed time, such as a volatile memory in a computer system, which is a server or a client in this case. The program may be a program for realizing a part of the functions described above, or may be a program capable of realizing the functions described above by being combined with a program recorded in a computer system.
The electronic device 20 in the above embodiment may be realized as an aggregate (device group) composed of a plurality of devices. Each device constituting the device group may include a part or all of each function or each functional block of the electronic apparatus 20 according to the above embodiment. The device group may have all the functions or functional blocks of the electronic apparatus 20.
It can be appreciated that the sepsis diagnostic model construction method, the compound screening method and the electronic device 20 provided in the embodiments of the present application combine the sepsis gene expression dataset with the aging gene dataset, and make full use of the association between aging and sepsis to improve the accuracy of sepsis prediction and diagnosis. Genes associated with sepsis are identified by analyzing the sepsis gene expression datasets, and the hub genes associated with sepsis are then determined by gene ontology enrichment analysis and Kyoto Encyclopedia of Genes and Genomes enrichment analysis. These hub genes may play an important role in the development and progression of sepsis, so incorporating these key genes into the sepsis diagnostic model is expected to improve the prediction accuracy and stability of the model. The sepsis-related hub genes are combined with other clinical indexes through multi-factor logistic regression to build the sepsis diagnostic model, which provides clinicians with an effective tool, helps them diagnose and treat sepsis earlier, and improves the survival rate and treatment outcome of patients.
It will be appreciated by persons skilled in the art that the above embodiments have been provided for the purpose of illustrating the application and are not to be construed as limiting the application, and that suitable modifications and variations of the above embodiments are within the scope of the application as claimed.
Claims (10)
1. A sepsis diagnostic model construction method, characterized by comprising the following steps:
obtaining a sepsis gene expression dataset and an aging gene dataset;
analyzing the sepsis gene expression dataset and the aging gene dataset according to gene ontology enrichment analysis and Kyoto Encyclopedia of Genes and Genomes enrichment analysis to obtain sepsis-related genes;
analyzing the sepsis-related genes by using lasso regression analysis and a support vector machine to obtain sepsis-related hub genes;
and constructing a sepsis diagnostic model according to multi-factor logistic regression and the sepsis-related hub genes.
2. The sepsis diagnostic model construction method according to claim 1, further comprising, after the obtaining of the sepsis gene expression dataset and the aging gene dataset:
processing the sepsis gene expression dataset by using a linear model for microarray data, so as to standardize and normalize an expression matrix of the sepsis gene expression dataset.
3. The sepsis diagnostic model construction method according to claim 2, wherein analyzing the sepsis gene expression dataset and the aging gene dataset according to gene ontology enrichment analysis and Kyoto Encyclopedia of Genes and Genomes enrichment analysis to obtain sepsis-related genes comprises:
analyzing and processing the sepsis gene expression dataset according to a weighted gene co-expression network to obtain the sepsis-related genes, wherein the sepsis-related genes comprise genes positively correlated with sepsis and genes negatively correlated with sepsis.
4. A method of constructing a diagnostic model of sepsis according to claim 3, wherein the method of constructing a diagnostic model of sepsis further comprises:
obtaining a sepsis gene expression validation set;
validating the sepsis diagnostic model according to the sepsis gene expression validation set.
5. The sepsis diagnostic model construction method according to claim 1, wherein the sepsis-related hub genes comprise: BCL6, ETS1, ETS2, FOS, MAPK14 and MYC.
6. A method of screening a compound, comprising:
obtaining main compounds in red sage root;
obtaining targeting proteins from the main compounds;
obtaining genes corresponding to the targeting proteins;
obtaining a sepsis diagnostic model constructed according to the sepsis diagnostic model construction method of any one of claims 1 to 5;
obtaining sepsis-related hub genes according to the sepsis diagnostic model;
obtaining common genes related to sepsis in the red sage root according to the genes corresponding to the targeting proteins and the sepsis-related hub genes;
obtaining sepsis-related compounds in the red sage root according to the common genes.
7. The method of screening compounds according to claim 6, wherein the common gene comprises: MYC, FOS, and MAPK14.
8. The compound screening method according to claim 6, wherein the sepsis-related compounds comprise: isotanshinone I, ursolic acid, dihydro Wu Ermei ketone, danshen-neoquinone B, danshen-neoquinone A, 2-(4-hydroxy-3-methoxyphenyl)-5-(3-hydroxypropyl)-7-methoxy-3-benzofurancarboxaldehyde, tanshinone IIA and methyl rosmarinate.
9. The method of compound screening according to claim 8, wherein the method of compound screening further comprises: a compound for treating sepsis is prepared using at least one of the sepsis-related compounds.
10. An electronic device, comprising:
a processor;
a memory for storing processor-executable instructions;
wherein the processor is configured to implement the sepsis diagnostic model building method of any one of claims 1 to 5 or the compound screening method of any one of claims 6 to 8 when executing the instructions.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202311247147.6A CN116994653A (en) | 2023-09-26 | 2023-09-26 | Sepsis diagnosis model construction method, compound screening method and electronic equipment |
Publications (1)
Publication Number | Publication Date |
---|---|
CN116994653A (en) | 2023-11-03 |
Family
ID=88534114
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202311247147.6A Pending CN116994653A (en) | 2023-09-26 | 2023-09-26 | Sepsis diagnosis model construction method, compound screening method and electronic equipment |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN116994653A (en) |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106103744A (en) * | 2014-02-11 | 2016-11-09 | UK Ministry of Defence | Apparatus, kit and method for predicting the onset of sepsis |
CN110218792A (en) * | 2019-05-31 | 2019-09-10 | Jiangsu Cancer Hospital | Marker for breast cancer diagnosis and prognosis and preparation method thereof |
CN113610845A (en) * | 2021-09-09 | 2021-11-05 | Cancer Hospital of Shantou University Medical College | Tumor local control prediction model construction method, prediction method and electronic equipment |
CN115044665A (en) * | 2022-06-08 | 2022-09-13 | Naval Medical University of the Chinese People's Liberation Army | Application of ARG1 in preparation of a reagent or kit for sepsis diagnosis, severity judgment or prognosis evaluation |
Non-Patent Citations (1)
Title |
---|
He, Shasha et al.: "Alterations in the gut microbiome and metabolome profiles of septic mice treated with Shen FuHuang formula", Frontiers in Microbiology, vol. 14, pages 1-10 * |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||