CN112259219A - System, equipment and storage medium for predicting diseases based on upper gastrointestinal hemorrhage
- Publication number
- CN112259219A (application CN202011059143.1A)
- Authority
- CN
- China
- Prior art keywords
- case
- clustering
- upper gastrointestinal
- gastrointestinal hemorrhage
- diagnosed
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16H—HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
- G16H50/00—ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
- G16H50/20—ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for computer-aided diagnosis, e.g. based on medical expert systems
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/22—Matching criteria, e.g. proximity measures
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/23—Clustering techniques
- G06F18/232—Non-hierarchical techniques
- G06F18/2321—Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions
- G06F18/23213—Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions with fixed number of clusters, e.g. K-means clustering
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/004—Artificial life, i.e. computing arrangements simulating life
- G06N3/006—Artificial life, i.e. computing arrangements simulating life based on simulated virtual individual or collective life forms, e.g. social simulations or particle swarm optimisation [PSO]
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16H—HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
- G16H50/00—ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
- G16H50/70—ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for mining of medical data, e.g. analysing previous cases of other patients
Abstract
The invention discloses a system, a device and a storage medium for predicting diseases based on upper gastrointestinal hemorrhage. The system comprises: a data acquisition module, which extracts from an electronic case library the case samples in which upper gastrointestinal hemorrhage is the clinical chief complaint; a feature extraction module, which extracts the upper gastrointestinal hemorrhage symptom information, the related biological index data and the corresponding disease name from each case sample to form case sample feature vectors; a disease clustering module, which clusters the case sample feature vectors using a K-means clustering algorithm optimized by an improved bird swarm algorithm; and a joint diagnosis module, which assigns a case to be diagnosed to a cluster and establishes a joint diagnosis model based on the upper gastrointestinal hemorrhage symptom information and the biological index data. Building on the improved clustering algorithm, the invention performs joint diagnosis and screening from the symptom information and effective biological indices of upper gastrointestinal hemorrhage, and can accurately predict the cause and outcome of upper gastrointestinal hemorrhage.
Description
Technical Field
The invention relates to the field of auxiliary disease diagnosis equipment, and in particular to a system, a device and a storage medium for predicting diseases based on upper gastrointestinal hemorrhage.
Background
Upper gastrointestinal hemorrhage refers to bleeding from the esophagus, stomach, duodenum or upper jejunum. It is a common clinical emergency and typically manifests as hematemesis, dark stool and anemia. Its most common causes are peptic ulcer, erosive gastritis, portal hypertension due to liver cirrhosis, esophageal variceal bleeding, gastric cancer, and liver trauma. The clinical manifestations depend on the nature, location, volume and rate of bleeding of the lesion, and are closely related to the general condition of the patient at the time of bleeding.
In current clinical practice, the diagnosis of upper gastrointestinal hemorrhage relies mainly on clinical symptoms and lacks objective, effective biological diagnostic indices. Relying on clinical diagnosis alone makes it difficult to accurately predict the cause and outcome of upper gastrointestinal hemorrhage.
Disclosure of Invention
In view of the above, the present invention provides a system, a device, and a storage medium for predicting diseases based on upper gastrointestinal hemorrhage, which combine the symptom information of upper gastrointestinal hemorrhage with objective, effective biological indices for joint diagnosis and screening, so as to accurately analyze the cause and outcome of upper gastrointestinal hemorrhage.
In a first aspect of the invention, a system for predicting diseases based on upper gastrointestinal hemorrhage is disclosed, the system comprising:
a data acquisition module: used for extracting, from the electronic case library, case samples in which upper gastrointestinal hemorrhage is the clinical chief complaint;
a feature extraction module: used for extracting the upper gastrointestinal hemorrhage symptom information, the related biological index data and the corresponding disease name from each case sample to form case sample feature vectors;
a disease clustering module: used for determining the number of cluster categories and clustering the case sample feature vectors using a K-means clustering algorithm optimized by an improved bird swarm algorithm;
a joint diagnosis module: used for acquiring a case to be diagnosed, assigning it to a cluster, and establishing a joint diagnosis model that combines the semantic similarity of the upper gastrointestinal hemorrhage symptom information and the Euclidean distance of the biological index data between the case to be diagnosed and the case samples in that cluster.
Preferably, in the feature extraction module, the feature items of the upper gastrointestinal hemorrhage symptom information in each case sample are extracted by TF-IDF to build a symptom information feature vector; the related biological index data comprise pulse, blood pressure, bowel sounds, urine volume, liver function change data and fiberscope examination data, and are normalized to form a biological index vector; the case sample feature vector consists of the symptom information feature vector and the biological index vector.
Preferably, the disease clustering module specifically comprises:
an initialization unit: initializing the population, calculating the fitness values, and selecting the initial global optimum;
a population updating unit: generating a random number a in (0,1) and comparing it with a constant P; when a > P, the bird forages; otherwise it stays vigilant;
when a > P, the foraging position is updated with nonlinear weight adjustment, and the position updating formula is as follows:
where the position terms denote the position of the i-th bird in the j-th dimension at time t; C1 and C2 are the perception coefficient and the social evolution coefficient, respectively; Pi,j is the best position the i-th bird has visited; and Gi,j is the best position of the population.
When a < P, the bird stays vigilant, and the position updating formula is as follows:
where a1, a2 ∈ [0,2]; k ∈ [1, N] with k ≠ i; fi and fk are the fitness values of the i-th and k-th birds, respectively; sumf is the sum of the fitness values of the whole population; ε is a constant; and meanj is the average fitness value of the population;
the bird swarm migrates periodically, dividing into producers and scroungers, and the positions of the producers and scroungers are updated;
an evaluation unit: calculating the new fitness values, updating the historical optimum, and returning to the population updating unit until the set number of iterations is reached, then outputting the optimal position as the cluster center points.
Preferably, in the disease clustering module, the fitness function of the evaluation unit is the sum of the intra-class distances, that is, $f = \sum_{j=1}^{K} \sum_{X_i \in \text{class } j} d(X_i, C_j)$, where K is the number of cluster categories and $d(X_i, C_j)$ is the distance from each data object $X_i$ in class j to the corresponding cluster center point $C_j$.
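As an illustrative sketch only, the fitness function described above can be written in Python as follows. Two assumptions are made that the text does not state: the distance d is taken to be Euclidean, and each sample is taken to belong to the class whose center is nearest. The names `euclidean` and `fitness` are hypothetical.

```python
import math

def euclidean(x, c):
    """Euclidean distance between a sample and a cluster center (assumed metric)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, c)))

def fitness(samples, centers):
    """Sum of intra-class distances.

    Assumption: each sample is assigned to its nearest center, so its
    contribution is the minimum distance over all centers.
    """
    return sum(min(euclidean(x, c) for c in centers) for x in samples)

# Three toy samples and two candidate centers (made-up numbers).
samples = [[0.0, 0.0], [0.0, 2.0], [10.0, 0.0]]
centers = [[0.0, 1.0], [10.0, 0.0]]
total = fitness(samples, centers)  # 1 + 1 + 0
```

A smaller value means tighter clusters, so an optimizer searching for cluster centers would minimize this quantity.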
Preferably, the joint diagnosis module specifically comprises:
a clustering unit: performing feature extraction on the acquired case to be diagnosed to obtain its feature vector, calculating the Euclidean distance from this feature vector to each cluster center point, and selecting the cluster with the smallest Euclidean distance;
a calculation unit: calculating the cosine similarity α between the symptom information feature vectors of the case to be diagnosed and of each case sample in the cluster, and calculating the Euclidean distance β between the biological index vectors of the case to be diagnosed and of each case sample in the cluster;
a judging unit: calculating the joint similarity s of each case sample, $s = w_1\alpha + w_2(1-\beta)$ with $w_1 + w_2 = 1$, and taking the disease type of the case sample with the largest joint similarity as the disease prediction result for the case to be diagnosed.
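The three units above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the function names, the equal weights w1 = w2 = 0.5, and the toy data are all hypothetical. The text does not say how β is bounded, so β is used as-is here; with normalized index vectors of several dimensions it can exceed 1, and rescaling it would be a separate design choice.

```python
import math

def cosine_similarity(u, v):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def euclidean(u, v):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def predict(case, centers, clusters, w1=0.5, w2=0.5):
    """case: (symptom_vec, index_vec).
    clusters[j]: list of (symptom_vec, index_vec, disease_name) samples.
    Returns (predicted disease, joint similarity)."""
    sym, idx = case
    full = sym + idx  # concatenated feature vector of the case
    # Clustering unit: pick the cluster with the nearest center.
    j = min(range(len(centers)), key=lambda c: euclidean(full, centers[c]))
    # Calculation and judging units: score every sample in that cluster.
    best_disease, best_s = None, float("-inf")
    for s_sym, s_idx, disease in clusters[j]:
        alpha = cosine_similarity(sym, s_sym)
        beta = euclidean(idx, s_idx)
        s = w1 * alpha + w2 * (1 - beta)
        if s > best_s:
            best_disease, best_s = disease, s
    return best_disease, best_s

# Toy data: 2-dimensional symptom vectors, 1-dimensional index vectors.
centers = [[1.0, 0.0, 0.2], [0.0, 1.0, 0.8]]
clusters = [
    [([1.0, 0.0], [0.2], "peptic ulcer"), ([0.8, 0.6], [0.1], "erosive gastritis")],
    [([0.0, 1.0], [0.8], "liver cirrhosis")],
]
disease, score = predict(([1.0, 0.0], [0.25]), centers, clusters)
```

Restricting the comparison to one cluster is what narrows the screening range: only the samples in the selected cluster are scored.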
In a second aspect of the present invention, an electronic device is disclosed, comprising: at least one processor, at least one memory, a communication interface, and a bus;
the processor, the memory and the communication interface communicate with one another via the bus;
the memory stores program instructions executable by the processor, and the processor invokes these instructions to implement the system of the first aspect of the invention.
In a third aspect of the invention, a computer-readable storage medium is disclosed, which stores computer instructions that cause a computer to implement the system of the first aspect of the invention.
Compared with the prior art, the invention has the following beneficial effects:
1) The improved bird swarm algorithm is combined with the K-means algorithm for case clustering. The population positions are updated with nonlinear weight adjustment, which strengthens learning from the best individuals during iteration, lets the population search efficiently in the early foraging stage, and speeds up convergence. Subdividing disease categories based on the clustering result narrows the disease screening range and speeds up disease prediction.
2) The invention combines the symptom information of upper gastrointestinal hemorrhage with objective, effective biological indices, and performs joint diagnosis and screening through a semantic similarity calculation model for the symptom information and a diagnosis model built on the biological indices. It can accurately predict the outcome of upper gastrointestinal hemorrhage and can analyze its cause based on the big data in the electronic case library, providing a simple and practical auxiliary diagnosis device.
Drawings
To illustrate the embodiments of the present invention and the technical solutions of the prior art more clearly, the drawings used in their description are briefly introduced below. The drawings described below show only some embodiments of the present invention; those skilled in the art can derive other drawings from them without creative effort.
FIG. 1 is a schematic diagram of the system for predicting disease based on upper gastrointestinal hemorrhage according to the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be obtained by a person skilled in the art without any inventive step based on the embodiments of the present invention, are within the scope of the present invention.
As shown in FIG. 1, the present invention provides a system for predicting diseases based on upper gastrointestinal hemorrhage, which comprises a data acquisition module 100, a feature extraction module 200, a disease clustering module 300, and a joint diagnosis module 400.
The data acquisition module 100: used for extracting, from the electronic case library, case samples in which upper gastrointestinal hemorrhage is the clinical chief complaint.
Diseases that cause upper gastrointestinal hemorrhage mainly include: 1) esophageal diseases, including esophagitis, esophageal ulcer, esophageal tumor, esophageal-cardia mucosal tear, and physical or chemical injury; 2) gastric and duodenal diseases, including peptic ulcer, esophagogastric variceal rupture, portal hypertensive gastropathy, acute hemorrhagic erosive gastritis, gastric vascular abnormality (arteriovenous malformation), gastric cancer and other tumors, acute gastric dilatation, duodenitis and diverticulitis, diaphragmatic hernia, gastric volvulus, ancylostomiasis, and jejunal ulcer or anastomotic ulcer after gastrointestinal anastomosis; 3) liver and gallbladder diseases: localized chronic intrahepatic infection, liver abscess, liver cancer, rupture of a hepatic hemangioma, and traumatic rupture of the liver parenchyma can all cause intrahepatic biliary hemorrhage, as can damage to the bile duct itself; 4) diseases of organs or tissues adjacent to the upper gastrointestinal tract, including pancreatic disease involving the duodenum, thoracic or abdominal aortic aneurysm rupturing into the digestive tract, and mediastinal tumor or abscess rupturing into the esophagus; 5) systemic diseases that manifest as bleeding in the gastrointestinal tract, including hematologic disorders (leukemia, aplastic anemia, hemophilia), vascular diseases, connective tissue disease and vasculitis, stress-related gastric mucosal injury, acute infectious diseases (epidemic hemorrhagic fever, leptospirosis), uremia, etc.
Different diseases have different symptom information and biological index data. For example, the clinical symptoms of erosive gastritis generally include upper abdominal pain; repeated hematemesis can lead to severe conditions such as anemia, syncope and hemorrhagic shock; less prominent symptoms include abdominal distension, anorexia, a feeling of fullness in the upper abdomen, acid regurgitation and dark stool. In the routine blood test, the white blood cell count is normal or elevated, and the proportion of neutrophils or lymphocytes is increased. In patients with systemic infection, blood culture may be positive for bacteria. The stool is tested for occult red blood cells or hemoglobin, and a positive fecal occult blood test is an indicator of gastrointestinal hemorrhage. The electronic case library covers the various diseases associated with hemorrhage in the digestive tract, and the cause and outcome of a disease are predicted and analyzed from the big data in the electronic case library.
The feature extraction module 200: used for extracting the upper gastrointestinal hemorrhage symptom information, the related biological index data and the corresponding disease name from each case sample to form case sample feature vectors.
Upper gastrointestinal hemorrhage is usually acute and severe. It is generally assessed from the medical history, the hematemesis and bloody stool, and changes in pulse, blood pressure, bowel sounds, urine volume and liver function, usually in combination with fiberscope examination, selective arteriography and the like. The clinical manifestations of upper gastrointestinal hemorrhage are mainly hematemesis and dark stool; the specific presentation depends on the rate and volume of bleeding, the time the blood stays in the digestive tract, and the bleeding site. Hemorrhagic shock may also occur, sometimes accompanied by symptoms such as anemia and fever. The biological index data mainly include: 1) Routine blood test. Early in the bleeding there is no significant change in red blood cell count, hemoglobin level or hematocrit. After 3-4 hours, hemoglobin and red blood cell values fall through dilution, because of volume expansion treatment or the compensatory movement of tissue fluid into the blood vessels to replenish plasma volume. Within 2-5 hours after massive upper gastrointestinal hemorrhage, the white blood cell count can rise to 10-20 × 10^9/L, and it returns to normal two to three days after the bleeding stops; 2) Renal function. Because of hemoglobin breakdown and the reduced glomerular filtration rate, blood urea nitrogen rises, peaks within 24-48 hours, and generally returns to normal within 3-4 days. A blood urea nitrogen to blood creatinine ratio greater than 25:1 indicates upper gastrointestinal hemorrhage; 3) Liver function. Some patients show elevated bilirubin and transaminase levels; 4) Routine stool test. A positive fecal occult blood test directly indicates gastrointestinal hemorrhage. In a specific implementation, objective data for a subset of the biological indices (such as changes in pulse, blood pressure, bowel sounds, urine volume and liver function) can be selected as the related biological index data of the invention.
The feature items of the upper gastrointestinal hemorrhage symptom information in each case sample are extracted by TF-IDF to build the symptom information feature vector A; the biological index data are normalized to form the biological index vector B; the case sample feature vector C consists of the symptom information feature vector and the biological index vector, that is, C = [A, B].
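The construction of C = [A, B] can be sketched in Python as follows. This is a hedged illustration: the text does not specify the TF-IDF variant or the normalization method, so a plain term-frequency times log inverse-document-frequency weighting and min-max scaling are assumed here. The function names, the tokenized symptom lists and the index values are hypothetical.

```python
import math
from collections import Counter

def tfidf_vectors(docs):
    """Return (vocabulary, TF-IDF vectors) for tokenized symptom descriptions."""
    vocab = sorted({term for doc in docs for term in doc})
    n_docs = len(docs)
    # Document frequency: number of case samples that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        counts = Counter(doc)
        vec = []
        for term in vocab:
            tf = counts[term] / len(doc) if doc else 0.0
            idf = math.log(n_docs / df[term])  # unsmoothed IDF (assumed variant)
            vec.append(tf * idf)
        vectors.append(vec)
    return vocab, vectors

def min_max_normalize(rows):
    """Scale each biological index column to [0, 1]; constant columns map to 0."""
    cols = list(zip(*rows))
    lows = [min(c) for c in cols]
    highs = [max(c) for c in cols]
    return [
        [(v - lo) / (hi - lo) if hi > lo else 0.0
         for v, lo, hi in zip(row, lows, highs)]
        for row in rows
    ]

# Two toy case samples: symptom tokens and (pulse, systolic pressure) values.
symptoms = [["hematemesis", "dark_stool"], ["dark_stool", "abdominal_pain"]]
indices = [[110.0, 90.0], [80.0, 120.0]]

vocab, A = tfidf_vectors(symptoms)
B = min_max_normalize(indices)
C = [a + b for a, b in zip(A, B)]  # C = [A, B] for each case sample
```

Note that a term occurring in every sample, such as `dark_stool` here, receives a weight of zero under this unsmoothed variant, which is why TF-IDF emphasizes the symptoms that distinguish one case from another.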
The disease clustering module 300: used to determine the number of cluster categories and to cluster the case sample feature vectors using an improved K-means clustering algorithm optimized by a bird swarm algorithm;
diseases causing upper gastrointestinal hemorrhage generally fall into five major categories: 1) esophageal diseases; 2) gastric and duodenal diseases; 3) liver and gallbladder diseases; 4) diseases of organs or tissues adjacent to the upper gastrointestinal tract; 5) systemic diseases. Therefore, in the invention the number of cluster categories K is 5, and the disease clustering module specifically comprises:
an initialization unit: initializing the population, calculating the fitness value of each individual, and selecting the initial global optimal value; each individual in the population represents one case sample.
A population updating unit: generating a random number a in (0,1) and setting a constant P; when a > P, the bird forages, otherwise it stays vigilant;
when a > P, the foraging position is updated with nonlinear weight adjustment, and the position update formula is as follows:
where the position term denotes the position of the i-th bird in the j-th dimension at time t, $C_1$ and $C_2$ are the perception coefficient and the social evolution coefficient respectively, $P_{i,j}$ is the best position visited by the i-th bird, and $G_{i,j}$ is the best position of the population.
When a < P, the flock stays vigilant and the position update formula is:
where $a_1, a_2 \in [0, 2]$; $k \in [1, N]$ and $k \neq i$; $f_i$ and $f_k$ are the fitness values of the i-th and k-th birds respectively; sumf is the sum of the fitness values of the whole population, $\varepsilon$ is a constant, and $mean_j$ is the average fitness value of the population;
the flock is allowed to migrate periodically, which divides the birds into producers and scroungers, and the positions of the producers and scroungers are updated;
the position update formula for the producers and scroungers is as follows:
where randn(0,1) denotes a Gaussian distribution with mean 0 and variance 1, and $FL \in [0, 2]$.
An evaluation unit: calculating a new fitness value, the fitness function being the sum of the intra-class distances, i.e. $f = \sum_{j=1}^{K} \sum_{X_i \in \text{class } j} d(X_i, C_j)$, where K is the number of cluster categories and $d(X_i, C_j)$ is the distance from each data object $X_i$ in class j to its corresponding cluster center point $C_j$.
After the fitness is calculated, the position with the best fitness is selected and the historical optimal value is updated; processing returns to the population updating unit until the set number of iterations is reached, and the optimal position is output as the cluster center points.
The improved bird swarm algorithm is combined with the K-means algorithm to cluster the cases. Updating the population positions with nonlinear weight adjustment strengthens the learning capacity of the best individuals during iteration, so that the population can search efficiently in the initial foraging stage and convergence is faster; subdividing the disease categories based on the clustering result narrows the disease screening range and speeds up disease prediction.
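The evaluation step can be sketched as follows. This is a minimal illustration under two assumptions not stated in the text: a candidate solution is represented as a list of K cluster centers, and each sample is counted in the class of its nearest center. The helper names and toy data are hypothetical.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def assign_clusters(samples, centers):
    """Index of the nearest center for each sample."""
    return [
        min(range(len(centers)), key=lambda j: euclidean(x, centers[j]))
        for x in samples
    ]


def fitness(samples, centers):
    """Sum of intra-class distances: each sample to its nearest center."""
    labels = assign_clusters(samples, centers)
    return sum(euclidean(x, centers[j]) for x, j in zip(samples, labels))


def best_candidate(samples, candidates):
    """Pick the candidate center set with the smallest fitness (lower is better)."""
    return min(candidates, key=lambda centers: fitness(samples, centers))


# Hypothetical toy data with two obvious groups (K = 2 for brevity).
samples = [[0.0, 0.0], [0.0, 2.0], [10.0, 0.0], [10.0, 2.0]]
candidate_a = [[0.0, 1.0], [10.0, 1.0]]
candidate_b = [[5.0, 1.0], [5.0, 1.0]]
chosen = best_candidate(samples, [candidate_a, candidate_b])
```

Because the fitness is a sum of distances, a smaller value means tighter clusters, which is why the sketch minimizes it when selecting the best position.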
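The behavior selection and foraging update can be sketched as below. The update formulas are not reproduced in this text, so the sketch is an assumption rather than the patented formula: it uses a hypothetical quadratically decreasing weight and a foraging step that moves a bird toward its own best position (coefficient C1) and the population best (coefficient C2). All function names and default values are illustrative.

```python
import random


def nonlinear_weight(t, max_iter, w_max=0.9, w_min=0.4):
    """Hypothetical nonlinearly decreasing weight; the source does not give its form."""
    return w_max - (w_max - w_min) * (t / max_iter) ** 2


def choose_behavior(p, rng):
    """Draw a in [0, 1); forage when a > P, otherwise stay vigilant."""
    a = rng.random()
    return "forage" if a > p else "vigilance"


def forage_step(x, personal_best, global_best, t, max_iter, c1, c2, rng):
    """Assumed foraging update: weighted position plus pulls toward both bests."""
    w = nonlinear_weight(t, max_iter)
    return [
        w * xj + c1 * rng.random() * (pj - xj) + c2 * rng.random() * (gj - xj)
        for xj, pj, gj in zip(x, personal_best, global_best)
    ]


rng = random.Random(0)
behavior = choose_behavior(0.2, rng)
new_position = forage_step([1.0, 2.0], [0.5, 1.5], [0.0, 1.0],
                           t=10, max_iter=100, c1=1.5, c2=1.5, rng=rng)
```

Under this assumed schedule the weight starts near w_max and falls slowly at first, so early iterations keep more of the current position and explore widely, while later iterations are dominated by the pull toward the best known positions.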
The joint diagnosis module 400: used to acquire a case to be diagnosed, assign it to a cluster, and establish a joint diagnosis model that combines the semantic similarity of the upper gastrointestinal hemorrhage symptom information and the Euclidean distance of the biological index data between the case to be diagnosed and the case samples in the cluster. The joint diagnosis module specifically comprises:
a clustering unit: performing feature extraction on the acquired case to be diagnosed to obtain its feature vector, calculating the Euclidean distance from this feature vector to each cluster center point, and selecting the cluster with the minimum Euclidean distance;
a calculation unit: calculating the cosine similarity α between the symptom feature vectors of the case to be diagnosed and each case sample in the cluster; calculating the Euclidean distance β between the biological index vectors of the case to be diagnosed and each case sample in the cluster;
a judging unit: calculating the joint similarity s of each case sample, $s = w_1\alpha + w_2(1-\beta)$, where $w_1$ and $w_2$ are weight coefficients with $w_1 + w_2 = 1$, and taking the disease type corresponding to the case sample with the maximum joint similarity as the disease prediction result of the case to be diagnosed.
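The three units above can be sketched end to end. This is a minimal illustration, assuming each case is stored as a dictionary with a symptom vector, a biological index vector and (for samples) a disease label; the data layout, function names, equal weights and disease labels are all hypothetical.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def cosine_similarity(u, v):
    """Cosine of the angle between u and v; 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def predict_disease(case, cluster_centers, cluster_samples, w1=0.5, w2=0.5):
    """Return (disease, joint similarity) of the best-matching sample."""
    query = case["symptom"] + case["bio"]  # C = [A, B]
    # Clustering unit: pick the cluster whose center is nearest to the query.
    nearest = min(range(len(cluster_centers)),
                  key=lambda j: euclidean(query, cluster_centers[j]))
    best_sample, best_score = None, float("-inf")
    for sample in cluster_samples[nearest]:
        alpha = cosine_similarity(case["symptom"], sample["symptom"])
        beta = euclidean(case["bio"], sample["bio"])
        s = w1 * alpha + w2 * (1 - beta)  # joint similarity
        if s > best_score:
            best_sample, best_score = sample, s
    return best_sample["disease"], best_score


# Hypothetical toy data: two clusters, 2-D symptom vectors, 1-D biological index.
centers = [[1.0, 0.0, 0.2], [0.0, 1.0, 0.8]]
samples_by_cluster = [
    [{"symptom": [1.0, 0.0], "bio": [0.2], "disease": "gastric ulcer"},
     {"symptom": [0.8, 0.2], "bio": [0.6], "disease": "esophageal varices"}],
    [{"symptom": [0.0, 1.0], "bio": [0.8], "disease": "cirrhosis"}],
]
new_case = {"symptom": [1.0, 0.0], "bio": [0.2]}
disease, score = predict_disease(new_case, centers, samples_by_cluster)
```

Note that the term 1 − β only behaves like a similarity when β stays within [0, 1]; with several normalized indexes the Euclidean distance can exceed 1, so an implementation would probably rescale β, which the sketch leaves as written in the text.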
The invention combines the symptom information of upper gastrointestinal hemorrhage with objective biological index data, and performs joint diagnosis and screening by establishing a semantic similarity calculation model for the feature information together with a diagnosis model built on the biological indexes. It can accurately predict the disease underlying the upper gastrointestinal hemorrhage, and can analyze the cause of the hemorrhage based on the big data of the electronic case library.
The present invention also discloses an electronic device, comprising: at least one processor, at least one memory, a communication interface, and a bus;
the processor, the memory and the communication interface complete mutual communication through the bus;
the memory stores program instructions executable by the processor, and the processor calls the program instructions to realize a data acquisition module, a feature extraction module, a disease clustering module and a joint diagnosis module in the system.
The invention also discloses a computer-readable storage medium storing computer instructions, wherein the computer instructions cause a computer to implement the data acquisition module, the feature extraction module, the disease clustering module and the joint diagnosis module of the system. The storage medium includes: a USB flash drive, a removable hard disk, ROM, RAM, a magnetic disk, an optical disk, etc.
The above-described system embodiments are merely illustrative: the units described as separate parts may or may not be physically separate, and the parts shown as units may or may not be physical units, i.e. they may be distributed over a plurality of network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the embodiment. One of ordinary skill in the art can understand and implement it without inventive effort.
The above description is only for the purpose of illustrating the preferred embodiments of the present invention and is not to be construed as limiting the invention, and any modifications, equivalents, improvements and the like that fall within the spirit and principle of the present invention are intended to be included therein.
Claims (7)
1. A system for predicting a disease based on upper gastrointestinal bleeding, the system comprising:
a data acquisition module: used to extract, from an electronic case library, case samples in which upper gastrointestinal hemorrhage is the chief clinical complaint;
a feature extraction module: used to extract the upper gastrointestinal hemorrhage symptom information, the related biological index data and the corresponding disease name from each case sample to form case sample feature vectors;
a disease clustering module: used to determine the number of cluster categories and to cluster the case sample feature vectors using an improved K-means clustering algorithm optimized by a bird swarm algorithm;
a joint diagnosis module: used to acquire a case to be diagnosed, assign it to a cluster, establish a joint diagnosis model that combines the semantic similarity of the upper gastrointestinal hemorrhage symptom information and the Euclidean distance of the biological index data between the case to be diagnosed and the case samples in the cluster, and predict the disease of the case to be diagnosed.
2. The system according to claim 1, wherein the feature extraction module extracts feature items of the symptom information of upper gastrointestinal hemorrhage in each case sample through TF-IDF to establish feature vectors of the symptom information of upper gastrointestinal hemorrhage; the relevant biological index data comprises blood routine, liver function, kidney function and feces routine, and the biological index data is normalized to form a biological index vector; the case sample feature vector is composed of a feature vector of symptom information of upper gastrointestinal hemorrhage and a biological index vector.
3. The system for predicting disease based on upper gastrointestinal hemorrhage of claim 1, wherein the disease clustering module specifically comprises:
an initialization unit: initializing the population, calculating the fitness values, and selecting the initial global optimal value;
a population updating unit: generating a random number a in (0,1) and setting a constant P; when a > P, the bird forages, otherwise it stays vigilant;
when a > P, the foraging position is updated with nonlinear weight adjustment, and the position update formula is as follows:
where the position term denotes the position of the i-th bird in the j-th dimension at time t, $C_1$ and $C_2$ are the perception coefficient and the social evolution coefficient respectively, $P_{i,j}$ is the best position visited by the i-th bird, and $G_{i,j}$ is the best position of the population.
When a < P, the flock stays vigilant and the position update formula is:
where $a_1, a_2 \in [0, 2]$; $k \in [1, N]$ and $k \neq i$; $f_i$ and $f_k$ are the fitness values of the i-th and k-th birds respectively; sumf is the sum of the fitness values of the whole population, $\varepsilon$ is a constant, and $mean_j$ is the average fitness value of the population;
the flock is allowed to migrate periodically, which divides the birds into producers and scroungers, and the positions of the producers and scroungers are updated;
an evaluation unit: calculating a new fitness value, updating a historical optimal value, returning to the population updating unit until a set iteration number is reached, and outputting an optimal position as a clustering center point.
4. The system of claim 3, wherein the fitness function of the evaluation unit in the disease clustering module is the sum of the intra-class distances, $f = \sum_{j=1}^{K} \sum_{X_i \in \text{class } j} d(X_i, C_j)$, where K is the number of cluster categories and $d(X_i, C_j)$ is the distance from each data object $X_i$ in class j to its corresponding cluster center point $C_j$.
5. The system for predicting a disease based on upper gastrointestinal bleeding according to claim 1, wherein the joint diagnosis module specifically comprises:
a clustering unit: performing feature extraction on the acquired case to be diagnosed to obtain a case feature vector to be diagnosed, calculating Euclidean distances from the case feature vector to be diagnosed to each cluster central point, and selecting a cluster with the minimum Euclidean distance;
a calculation unit: calculating cosine similarity alpha between characteristic vectors of symptom information of upper gastrointestinal hemorrhage between a case to be diagnosed and case samples in the cluster; calculating the Euclidean distance beta of a biological index vector between a case to be diagnosed and a case sample in a cluster;
a judging unit: calculating the joint similarity s of each case sample, $s = w_1\alpha + w_2(1-\beta)$, with $w_1 + w_2 = 1$, and taking the disease type corresponding to the case sample with the maximum joint similarity as the disease prediction result of the case to be diagnosed.
6. An electronic device, comprising: at least one processor, at least one memory, a communication interface, and a bus;
the processor, the memory and the communication interface complete mutual communication through the bus;
the memory stores program instructions executable by the processor, the processor invoking the program instructions to implement the system of any one of claims 1-5.
7. A computer readable storage medium storing computer instructions which cause a computer to implement the system of any one of claims 1 to 5.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202011059143.1A CN112259219B (en) | 2020-09-30 | 2020-09-30 | System, equipment and storage medium for predicting diseases based on upper gastrointestinal bleeding |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202011059143.1A CN112259219B (en) | 2020-09-30 | 2020-09-30 | System, equipment and storage medium for predicting diseases based on upper gastrointestinal bleeding |
Publications (2)
Publication Number | Publication Date |
---|---|
CN112259219A (en) | 2021-01-22 |
CN112259219B (en) | 2024-02-02 |
Family
ID=74233950
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202011059143.1A Active CN112259219B (en) | 2020-09-30 | 2020-09-30 | System, equipment and storage medium for predicting diseases based on upper gastrointestinal bleeding |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN112259219B (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113487327A (en) * | 2021-07-27 | 2021-10-08 | 中国银行股份有限公司 | Transaction parameter setting method and device based on clustering algorithm |
Citations (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2002008755A2 (en) * | 2000-07-24 | 2002-01-31 | Yeda Research And Development Co. Ltd. | Identifying antigen clusters for monitoring a global state of an immune system |
JP2005509127A (en) * | 2000-11-27 | 2005-04-07 | インテリジェント メディカル ディバイシーズ エル.エル. シー. | Clinically intelligent diagnostic apparatus and method |
CN104915560A (en) * | 2015-06-11 | 2015-09-16 | 万达信息股份有限公司 | Method for disease diagnosis and treatment scheme based on generalized neural network clustering |
CN106446603A (en) * | 2016-09-29 | 2017-02-22 | 福州大学 | Gene expression data clustering method based on improved PSO algorithm |
WO2017059003A1 (en) * | 2015-09-29 | 2017-04-06 | Crescendo Bioscience | Biomarkers and methods for assessing psoriatic arthritis disease activity |
KR20190080331A (en) * | 2017-12-28 | 2019-07-08 | 경북대학교 산학협력단 | Extended BSA Coverage Path Searching Device, Method and Recording Medium thereof |
CN110648088A (en) * | 2019-11-26 | 2020-01-03 | 国网江西省电力有限公司电力科学研究院 | Electric energy quality disturbance source judgment method based on bird swarm algorithm and SVM |
CN111597878A (en) * | 2020-04-02 | 2020-08-28 | 南京农业大学 | BSA-IA-BP-based colony total number prediction method |
CN111653359A (en) * | 2020-05-30 | 2020-09-11 | 吾征智能技术(北京)有限公司 | Intelligent prediction model construction method and prediction system for hemorrhagic diseases |
Non-Patent Citations (3)
Title |
---|
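OMRAN, MG, et al.: "A Color Image Quantization Algorithm Based on Particle Swarm Optimization", INFORMATICA-JOURNAL OF COMPUTING AND INFORMATICS, vol. 29, no. 03, pages 261 - 269 * |
LI Deyu, et al.: "A concept drift detection algorithm based on sequence alignment", Journal of Shanxi University (Natural Science Edition), vol. 39, no. 03, pages 415 - 422 * |
WANG Jincheng, et al.: "Mean-based cloud adaptive bird swarm optimization algorithm", Science Technology and Engineering, no. 11, pages 167 - 172 * |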
Also Published As
Publication number | Publication date |
---|---|
CN112259219B (en) | 2024-02-02 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||