CN112259219B - System, equipment and storage medium for predicting diseases based on upper gastrointestinal bleeding - Google Patents
- Publication number
- CN112259219B (application CN202011059143.1A)
- Authority
- CN
- China
- Prior art keywords
- case
- upper gastrointestinal
- clustering
- diagnosed
- disease
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16H—HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
- G16H50/00—ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
- G16H50/20—ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for computer-aided diagnosis, e.g. based on medical expert systems
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/22—Matching criteria, e.g. proximity measures
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/23—Clustering techniques
- G06F18/232—Non-hierarchical techniques
- G06F18/2321—Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions
- G06F18/23213—Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions with fixed number of clusters, e.g. K-means clustering
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/004—Artificial life, i.e. computing arrangements simulating life
- G06N3/006—Artificial life, i.e. computing arrangements simulating life based on simulated virtual individual or collective life forms, e.g. social simulations or particle swarm optimisation [PSO]
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16H—HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
- G16H50/00—ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
- G16H50/70—ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for mining of medical data, e.g. analysing previous cases of other patients
Abstract
The invention discloses a system, a device and a storage medium for predicting diseases based on upper gastrointestinal bleeding. The system comprises: a data acquisition module, used to extract from an electronic case library the case samples whose first clinical chief complaint is upper gastrointestinal bleeding; a feature extraction module, used to extract from each case sample the symptom information of upper gastrointestinal bleeding, the related biological index data and the corresponding disease name to form a case sample feature vector; a disease clustering module, used to cluster the case sample feature vectors with a K-means clustering algorithm optimized by an improved bird swarm algorithm; and a joint diagnosis module, used to assign the case to be diagnosed to a cluster and to establish a joint diagnosis model based on the symptom information and biological index data of upper gastrointestinal bleeding. By combining symptom information with effective biological indexes for joint diagnosis and screening on top of the improved clustering algorithm, the invention can accurately predict the cause and outcome of upper gastrointestinal bleeding.
Description
Technical Field
The invention relates to the field of disease auxiliary diagnosis equipment, in particular to a system, equipment and storage medium for predicting diseases based on upper gastrointestinal bleeding.
Background
Upper gastrointestinal bleeding refers to bleeding from the esophagus, stomach, duodenum or upper jejunum. It is a common clinical emergency whose manifestations include hematemesis, black stool and anemia. The most common causes are peptic ulcer, erosive gastritis, bleeding from ruptured esophageal varices due to cirrhosis-related portal hypertension, gastric cancer and liver trauma. The clinical manifestations depend on the nature and location of the bleeding lesion and on the volume and rate of blood loss, and are closely related to the patient's general condition at the time of bleeding.
In clinical practice, the diagnosis of upper gastrointestinal bleeding relies mainly on clinical symptoms and lacks objective, effective biological diagnostic indexes, so the cause and outcome of the bleeding are difficult to predict accurately from clinical diagnosis alone.
Disclosure of Invention
In view of the above, the present invention provides a system, a device and a storage medium for predicting diseases based on upper gastrointestinal bleeding, which combine the symptom feature information of upper gastrointestinal bleeding with objective, effective biological indexes for joint diagnosis and screening, so as to accurately analyze the cause and outcome of upper gastrointestinal bleeding.
In a first aspect of the invention, a system for predicting disease based on upper gastrointestinal bleeding is disclosed, the system comprising:
a data acquisition module: used to extract from an electronic case library the case samples whose first clinical chief complaint is upper gastrointestinal bleeding;
a feature extraction module: used to extract from each case sample the symptom information of upper gastrointestinal bleeding, the related biological index data and the corresponding disease name to form a case sample feature vector;
a disease clustering module: used to determine the cluster categories and to cluster the case sample feature vectors with a K-means clustering algorithm optimized by an improved bird swarm algorithm;
a joint diagnosis module: used to acquire the case to be diagnosed, assign it to a cluster, and establish a joint diagnosis model that combines the semantic similarity of the symptom information of upper gastrointestinal bleeding with the Euclidean distance of the biological index data between the case to be diagnosed and the case samples inside the cluster.
Preferably, in the feature extraction module, the feature items of the symptom information of upper gastrointestinal bleeding in each case sample are extracted by TF-IDF to build a symptom feature vector; the related biological index data comprise pulse, blood pressure, borborygmus (bowel sounds), urine volume, liver function changes and fiberoptic endoscopy data, and are normalized to form a biological index vector; the case sample feature vector consists of the symptom feature vector and the biological index vector.
Preferably, the disease clustering module specifically includes:
an initializing unit: initializes the population, calculates the fitness values, and selects the initial global optimum;
a population updating unit: generates a random number a and a constant P in (0, 1); a bird forages when a > P and otherwise keeps vigilance;
when a > P, the foraging position is updated with nonlinear weight adjustment, and the position update formula is as follows:
[formula not reproduced in the source text]
where the position terms denote the position of the i-th bird in the j-th dimension at time t; $C_1$ and $C_2$ are the perception coefficient and the social evolution coefficient, respectively; $P_{i,j}$ is the best position of the i-th bird; and $G_{i,j}$ is the best position of the group.
When a < P, the flock keeps vigilance, and the position update formula is:
[formula not reproduced in the source text]
where $a_1, a_2 \in [0, 2]$; k is a random integer in [1, N] with $k \neq i$; $f_i$ and $f_k$ are the fitness values of the i-th and k-th birds, respectively; sumf is the sum of the fitness values of the whole population; $\varepsilon$ is a constant; and $mean_j$ is the average fitness value of the population;
the flock migrates periodically, producers and scroungers are generated, and the positions of the producers and scroungers are updated;
an evaluation unit: calculates the new fitness values, updates the historical optimum, and returns to the population updating unit until the set number of iterations is reached, then outputs the optimal position as the cluster center point.
Preferably, in the disease clustering module, the fitness function of the evaluation unit is the sum of the intra-class distances, namely
$$J = \sum_{j=1}^{K} \sum_{X_i \in \text{class } j} d(X_i, C_j)$$
where K is the number of cluster categories and $d(X_i, C_j)$ is the distance from each data object $X_i$ in class j to the corresponding cluster center point $C_j$.
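The fitness described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the distance d is taken to be Euclidean, each sample is assigned to its nearest center, and the function names `euclidean` and `intra_class_distance_sum` are hypothetical, not names from the patent.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def intra_class_distance_sum(samples, centers):
    """Sum of distances from each sample to its nearest cluster center.

    Returns (total, labels), where labels[i] is the index of the center
    that sample i was assigned to. A smaller total means tighter clusters.
    Assumption: nearest-center assignment with Euclidean distance.
    """
    total = 0.0
    labels = []
    for x in samples:
        dists = [euclidean(x, c) for c in centers]
        j = min(range(len(centers)), key=dists.__getitem__)
        labels.append(j)
        total += dists[j]
    return total, labels


if __name__ == "__main__":
    samples = [(0.0, 0.0), (0.0, 2.0), (10.0, 0.0)]
    centers = [(0.0, 1.0), (10.0, 0.0)]
    print(intra_class_distance_sum(samples, centers))
```

Because the value is a sum of distances, a swarm optimizer using it as fitness would treat lower values as better.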
Preferably, the joint diagnosis module specifically includes:
a clustering unit: extracts features from the acquired case to be diagnosed to obtain its feature vector, calculates the Euclidean distance from this feature vector to each cluster center point, and selects the cluster with the smallest Euclidean distance;
a calculation unit: calculates the cosine similarity $\alpha$ between the symptom feature vectors of the case to be diagnosed and of each case sample in the cluster, and the Euclidean distance $\beta$ between their biological index vectors;
a judging unit: calculates the joint similarity s of each case sample, $s = w_1 \alpha + w_2 (1 - \beta)$ with $w_1 + w_2 = 1$, and takes the disease type of the case sample with the largest joint similarity as the disease prediction result for the case to be diagnosed.
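The three units above can be illustrated with a short Python sketch. It is a minimal example rather than the patented implementation: the function names, the tuple layout of a case sample, and the default weight `w1=0.5` are assumptions made for illustration, and the biological index vectors are assumed to be normalized so that $1 - \beta$ stays meaningful.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def cosine_similarity(u, v):
    """Cosine similarity; defined as 0.0 when either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    if nu == 0.0 or nv == 0.0:
        return 0.0
    return dot / (nu * nv)


def nearest_cluster(feature_vector, centers):
    """Clustering unit: index of the center closest to the feature vector."""
    return min(range(len(centers)),
               key=lambda j: euclidean(feature_vector, centers[j]))


def predict_disease(query_symptoms, query_bio, cluster_cases, w1=0.5):
    """Calculation and judging units for one cluster.

    cluster_cases is a list of (symptom_vector, bio_vector, disease_name)
    tuples (hypothetical layout). Returns (disease_name, joint_similarity)
    of the case sample with the largest s = w1*alpha + w2*(1 - beta).
    """
    w2 = 1.0 - w1  # enforces w1 + w2 = 1
    best_name, best_s = None, -math.inf
    for symptoms, bio, name in cluster_cases:
        alpha = cosine_similarity(query_symptoms, symptoms)
        beta = euclidean(query_bio, bio)
        s = w1 * alpha + w2 * (1.0 - beta)
        if s > best_s:
            best_name, best_s = name, s
    return best_name, best_s


if __name__ == "__main__":
    cases = [
        ([1.0, 0.0], [0.2, 0.2], "peptic ulcer"),
        ([0.0, 1.0], [0.9, 0.9], "esophageal varices"),
    ]
    print(predict_disease([1.0, 0.0], [0.2, 0.3], cases))
```

Restricting the comparison to one cluster is what narrows the screening range: only the case samples that share the nearest center are scored.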
In a second aspect of the present invention, an electronic device is disclosed, comprising: at least one processor, at least one memory, a communication interface, and a bus;
the processor, the memory and the communication interface complete communication with each other through the bus;
the memory stores program instructions executable by the processor that the processor invokes to implement the system according to the first aspect of the present invention.
In a third aspect of the present invention, a computer-readable storage medium is disclosed, the computer-readable storage medium storing computer instructions that cause the computer to implement the system according to the first aspect of the present invention.
Compared with the prior art, the invention has the following beneficial effects:
1) The invention combines the improved bird swarm algorithm with the K-means algorithm to cluster cases. The population positions are updated with nonlinear weight adjustment, which strengthens learning from the best individuals during iteration, so that the population optimizes efficiently in the early foraging stage and converges faster. Subdividing the disease categories on the basis of the clustering result narrows the disease screening range and speeds up disease prediction.
2) The invention combines the symptom information of upper gastrointestinal bleeding with objective, effective biological indexes, and performs joint diagnosis and screening through a semantic similarity model of the symptom feature information together with a diagnosis model built on the biological indexes. It can therefore accurately predict the disease outcome of upper gastrointestinal bleeding and analyze its cause from the big data of the electronic case library, and serves as a simple and practical auxiliary diagnosis device.
Drawings
In order to more clearly illustrate the embodiments of the invention or the technical solutions in the prior art, the drawings that are required in the embodiments or the description of the prior art will be briefly described, it being obvious that the drawings in the following description are only some embodiments of the invention, and that other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
FIG. 1 is a schematic diagram of the system for predicting disease based on upper gastrointestinal bleeding according to the present invention.
Detailed Description
The following description of the embodiments of the present invention will clearly and fully describe the technical aspects of the embodiments of the present invention, and it is apparent that the described embodiments are only some embodiments of the present invention, not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the present invention without making any inventive effort, are intended to fall within the scope of the present invention.
As shown in fig. 1, the present invention proposes a system for predicting a disease based on upper gastrointestinal bleeding, the system comprising a data acquisition module 100, a feature extraction module 200, a disease clustering module 300, and a joint diagnosis module 400;
the data acquisition module 100: used to extract from an electronic case library the case samples whose first clinical chief complaint is upper gastrointestinal bleeding;
the diseases that cause upper gastrointestinal bleeding are mainly:
1) Esophageal diseases, including esophagitis, esophageal ulcers, esophageal tumors, mucosal tears of the esophagus and gastric cardia, and physical or chemical injury;
2) Gastric and duodenal diseases, including peptic ulcers, ruptured esophageal and gastric fundal varices, portal hypertensive gastropathy, acute hemorrhagic erosive gastritis, gastric vascular abnormalities (arteriovenous malformations), gastric cancer and other gastric tumors, acute gastric dilatation, duodenitis and diverticulitis, diaphragmatic hernia, gastric volvulus, ancylostomiasis (hookworm disease), and jejunal ulcers and anastomotic ulcers after gastrointestinal anastomosis;
3) Liver and gallbladder diseases: localized chronic infection in the liver, liver abscess, liver cancer, ruptured hepatic hemangioma, and central rupture of the liver parenchyma caused by trauma can all lead to intrahepatic biliary bleeding; injury to the bile duct itself is also included;
4) Diseases of organs or tissues adjacent to the upper digestive tract, including pancreatic diseases involving the duodenum, a thoracic or abdominal aortic aneurysm rupturing into the digestive tract, and a mediastinal tumor or abscess rupturing into the esophagus;
5) Systemic diseases that manifest as gastrointestinal bleeding, including hematological disorders (leukemia, aplastic anemia, hemophilia), vascular diseases, connective tissue diseases and vasculitis, stress-related gastric mucosal lesions, acute infectious diseases (epidemic hemorrhagic fever, leptospirosis), uremia, and the like.
Different diseases have different characteristic information and biological index data. For example, the clinical symptoms of erosive gastritis typically include upper abdominal pain; repeated hematemesis can lead to anemia, syncope, hemorrhagic shock and other severe conditions; further symptoms include abdominal distension, anorexia, upper abdominal fullness, acid regurgitation and black stool. A routine blood test may show a normal or elevated white blood cell count with an elevated proportion of neutrophils or lymphocytes, and blood cultures of patients with concurrent systemic infection may be positive. A fecal occult blood test detects hidden red blood cells or hemoglobin in the stool, and a positive result is an indicator of gastrointestinal bleeding. The electronic case library contains the various diseases associated with upper gastrointestinal bleeding, and the invention relies on the big data in the library to analyze and predict disease etiology and outcome.
Feature extraction module 200: used to extract from each case sample the symptom information of upper gastrointestinal bleeding, the related biological index data and the corresponding disease name to form a case sample feature vector;
upper gastrointestinal bleeding is usually a severe acute condition. It is assessed mainly from the medical history, hematemesis, hematochezia, pulse, blood pressure, borborygmus (bowel sounds), urine volume, liver function and so on, usually in combination with fiberoptic endoscopy and selective angiography. The clinical manifestations are mainly hematemesis and black stool; depending on the bleeding rate, the bleeding volume, the time the blood remains in the gastrointestinal tract and the bleeding site, hemorrhagic shock, anemia and fever may also occur. The biological index data mainly include:
1) Routine blood test. Early in bleeding, the red blood cell count, hemoglobin level and hematocrit show no significant change. After 3-4 hours, as plasma volume is replenished by volume expansion therapy or by compensatory movement of interstitial fluid into the blood vessels, hemoglobin and red blood cell values fall through dilution. The white blood cell count can rise to 10-20 × 10^9/L within 2-5 hours after upper gastrointestinal bleeding and returns to normal two to three days after the bleeding stops;
2) Kidney function. Because of hemoglobin breakdown and a reduced glomerular filtration rate, blood urea nitrogen may rise, peaking at 24-48 hours and generally returning to normal within 3-4 days. A ratio of blood urea nitrogen to blood creatinine greater than 25:1 suggests upper gastrointestinal bleeding;
3) Liver function. Some patients show elevated bilirubin and transaminases;
4) Routine stool test. A positive fecal occult blood test directly indicates gastrointestinal bleeding.
In a specific implementation, objective index data for some of the biological indexes (such as changes in pulse, blood pressure, borborygmus, urine volume and liver function) can be selected as the related biological index data of the invention.
The feature items of the symptom information of upper gastrointestinal bleeding in each case sample are extracted by TF-IDF to build the symptom feature vector A; the biological index data are normalized to form the biological index vector B; the case sample feature vector is the concatenation of the two, C = [A, B].
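As a rough illustration of how C = [A, B] might be assembled, the sketch below builds TF-IDF vectors from tokenized symptom descriptions and min-max normalizes the biological indexes. Several details are assumptions not stated in the text: the TF-IDF variant (term frequency times log(N / document frequency)), min-max scaling as the normalization, pre-tokenized input, and all function names.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """docs: list of token lists. Returns (vocabulary, TF-IDF vectors).

    Assumed variant: tf = count / document length, idf = log(N / df).
    """
    vocab = sorted({t for d in docs for t in d})
    n = len(docs)
    df = Counter(t for d in docs for t in set(d))
    vectors = []
    for d in docs:
        counts = Counter(d)
        row = []
        for t in vocab:
            if not d:
                row.append(0.0)  # empty document contributes nothing
            else:
                row.append((counts[t] / len(d)) * math.log(n / df[t]))
        vectors.append(row)
    return vocab, vectors


def min_max_normalize(rows):
    """Scale each column of a numeric table into [0, 1]."""
    cols = list(zip(*rows))
    lo = [min(c) for c in cols]
    hi = [max(c) for c in cols]
    return [[(v - l) / (h - l) if h > l else 0.0
             for v, l, h in zip(r, lo, hi)] for r in rows]


def case_feature_vectors(symptom_tokens, bio_rows):
    """Concatenate symptom vector A and biological index vector B per case."""
    _, a_vectors = tfidf_vectors(symptom_tokens)
    b_vectors = min_max_normalize(bio_rows)
    return [a + b for a, b in zip(a_vectors, b_vectors)]


if __name__ == "__main__":
    symptoms = [["hematemesis", "melena"], ["melena", "abdominal_pain"]]
    bio = [[90.0, 110.0], [120.0, 80.0]]  # e.g. pulse, systolic pressure
    for c in case_feature_vectors(symptoms, bio):
        print(c)
```

Terms that occur in every case (here `melena`) receive a weight of zero under this variant, so only discriminating symptoms contribute to A.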
Disease clustering module 300: used to determine the cluster categories and to cluster the case sample feature vectors with a K-means clustering algorithm optimized by an improved bird swarm algorithm;
the diseases that cause upper gastrointestinal bleeding are generally divided into five broad categories: 1) esophageal diseases; 2) gastric and duodenal diseases; 3) liver and gallbladder diseases; 4) diseases of organs or tissues adjacent to the upper digestive tract; 5) systemic diseases. The number of cluster categories in the invention is therefore K = 5. The disease clustering module specifically includes:
an initializing unit: initializing the population, calculating the fitness value of each individual, and selecting the initial global optimal value; each individual in the population represents one case sample.
Population updating unit: generating a random number a in (0, 1) and a constant P in (0, 1); the bird forages when a > P and otherwise stays vigilant;
when a > P, the foraging position is updated using nonlinear weight adjustment, and the position update formula is:
wherein the position terms denote the position of the i-th bird in the j-th dimension at time t, $C_1$ and $C_2$ are respectively the perception coefficient and the social evolution coefficient, $P_{i,j}$ is the best position of the i-th bird, and $G_{i,j}$ is the best position of the flock.
When a < P, the flock stays vigilant, and the position update formula is:
wherein $a_1, a_2 \in [0, 2]$; $k$ is a random integer in $[1, N]$ with $k \neq i$; $f_i$ and $f_k$ are the fitness values of the i-th and k-th birds, respectively; sumf is the sum of the fitness values of the whole population; $\varepsilon$ is a constant; and $\mathrm{mean}_j$ is the average fitness value of the population;
the flock migrates periodically, dividing into producers and scroungers, and the positions of the producers and scroungers are updated;
the position update formulas for producers and scroungers are:
where randn(0, 1) denotes a Gaussian distribution with mean 0 and variance 1, and FL ∈ [0, 2].
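For orientation, the sketch below shows the producer and eater moves in the form commonly published for the basic bird swarm algorithm, where eaters are usually called scroungers. It is an assumption that the patent's formulas take this form; the function names and sample values are hypothetical. A producer perturbs each coordinate by Gaussian noise scaled by the coordinate itself, and a scrounger steps toward another bird by a fraction FL × rand(0, 1) of the gap.

```python
import random

def update_producer(position, rng):
    """Producer move (assumed standard form): x + randn(0, 1) * x per coordinate."""
    return [x + rng.gauss(0.0, 1.0) * x for x in position]

def update_scrounger(position, leader, fl, rng):
    """Scrounger move (assumed standard form): x + (x_k - x) * FL * rand(0, 1)."""
    return [x + (xk - x) * fl * rng.random() for x, xk in zip(position, leader)]

rng = random.Random(0)  # seeded only to make the sketch repeatable
producer = update_producer([1.0, 2.0], rng)
scrounger = update_scrounger([0.0, 0.0], [4.0, 4.0], fl=1.0, rng=rng)
```

With FL = 1 the scrounger lands somewhere between its old position and the bird it follows; larger FL values within [0, 2] allow it to overshoot.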
Evaluation unit: calculating a new fitness value, where the fitness function is the sum of the intra-class distances, namely $\sum_{j=1}^{K} \sum_{X_i \in \text{class } j} d(X_i, C_j)$, where $K$ is the number of cluster categories and $d(X_i, C_j)$ is the distance from each data object $X_i$ in class $j$ to the corresponding cluster center point $C_j$.
After the fitness values are calculated, the position with the best fitness is selected, the historical optimal value is updated, and control returns to the population updating unit until the set number of iterations is reached; the optimal position is then output as the cluster center point.
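The fitness evaluation can be sketched as below. This is an illustrative reading, not the patented code: it assumes Euclidean distance for d and assigns each sample to its nearest candidate center, which the text does not spell out. Function names and sample data are hypothetical. A smaller sum means tighter clusters, so the best position is the one with the lowest value.

```python
import math

def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def intra_class_distance_sum(samples, centers):
    """Fitness: assign each sample to its nearest center and sum those distances."""
    return sum(min(euclidean(x, c) for c in centers) for x in samples)

# Hypothetical two-dimensional samples and two candidate sets of centers.
samples = [[0.0, 0.0], [0.0, 2.0], [10.0, 0.0]]
good_centers = [[0.0, 1.0], [10.0, 0.0]]
poor_centers = [[5.0, 5.0], [6.0, 6.0]]

good_fitness = intra_class_distance_sum(samples, good_centers)
poor_fitness = intra_class_distance_sum(samples, poor_centers)
```

In an optimization loop, this value would be computed for every candidate position in each iteration, and the historical best would be replaced whenever a lower sum is found.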
The invention combines the improved bird swarm algorithm with the K-means algorithm for case clustering. The population positions are updated using nonlinear weight adjustment, which strengthens the learning ability of the best individuals during iterative learning, so that the population can optimize efficiently in the early foraging stage and the convergence rate improves. Subdividing the disease categories further on the basis of the clustering result narrows the disease screening range and speeds up disease prediction.
Joint diagnosis module 400: used to acquire a case to be diagnosed, assign it to a cluster, and build a joint diagnosis model that combines the semantic similarity of the upper gastrointestinal bleeding symptom information with the Euclidean distance of the biological index data between the case to be diagnosed and the case samples within that cluster. The joint diagnosis module specifically includes:
clustering unit: extracting features from the acquired case to be diagnosed to obtain its feature vector, calculating the Euclidean distance from that feature vector to each cluster center point, and selecting the cluster with the smallest Euclidean distance;
a calculation unit: calculating the cosine similarity α between the symptom feature vector of the case to be diagnosed and that of each case sample in the cluster, and the Euclidean distance β between their biological index vectors;
a judging unit: calculating the joint similarity s of each case sample, $s = w_1\alpha + w_2(1-\beta)$, where $w_1$ and $w_2$ are weight coefficients with $w_1 + w_2 = 1$, and taking the disease type of the case sample with the largest joint similarity as the disease prediction result for the case to be diagnosed.
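The three units of the joint diagnosis module can be sketched as follows. This is a minimal illustration under stated assumptions: equal weights w1 = w2 = 0.5, hypothetical function names, and made-up sample vectors and disease names. The score itself, s = w1·α + w2·(1 − β), follows the text.

```python
import math

def cosine_similarity(u, v):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def nearest_cluster(query_c, centers):
    """Clustering unit: index of the center closest to the query vector C."""
    return min(range(len(centers)), key=lambda j: euclidean(query_c, centers[j]))

def predict_disease(query_a, query_b, cluster_samples, w1=0.5, w2=0.5):
    """Judging unit: return (disease, s) for the sample with the largest s.

    cluster_samples is a list of (A vector, B vector, disease name) tuples.
    """
    best_name, best_s = None, -math.inf
    for a, b, name in cluster_samples:
        alpha = cosine_similarity(query_a, a)  # symptom similarity
        beta = euclidean(query_b, b)           # biological index distance
        s = w1 * alpha + w2 * (1.0 - beta)
        if s > best_s:
            best_name, best_s = name, s
    return best_name, best_s

# Hypothetical cluster contents and query case.
cluster = [
    ([1.0, 0.0], [0.2, 0.2], "esophageal varices"),
    ([0.0, 1.0], [0.8, 0.8], "peptic ulcer"),
]
cluster_index = nearest_cluster([0.9, 0.1], [[0.0, 0.0], [1.0, 0.0]])
disease, score = predict_disease([1.0, 0.0], [0.2, 0.2], cluster)
```

Note that β is an unbounded distance, so 1 − β can be negative when the biological index vectors differ strongly; the ranking by s still works, but s is only confined to [0, 1] if β is scaled accordingly.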
The invention combines the symptom information of upper gastrointestinal bleeding with objective data from effective biological indexes, and builds a diagnosis model from the semantic similarity calculation on the feature information and from the biological indexes for joint diagnosis and screening. The disease underlying upper gastrointestinal bleeding can thus be predicted accurately, and its cause can be analyzed on the basis of the big data in the electronic case library.
The invention also discloses an electronic device, comprising: at least one processor, at least one memory, a communication interface, and a bus;
the processor, the memory and the communication interface complete communication with each other through the bus;
the memory stores program instructions executable by the processor, and the processor invokes the program instructions to implement a data acquisition module, a feature extraction module, a disease clustering module, and a joint diagnosis module in the system of the present invention.
The invention also discloses a computer-readable storage medium storing computer instructions that cause a computer to implement the data acquisition module, the feature extraction module, the disease clustering module and the joint diagnosis module of the system. The storage medium includes: a USB flash drive, a removable hard disk, ROM, RAM, a magnetic disk, an optical disc, or any other medium capable of storing program code.
The system embodiments described above are merely illustrative. Units described as separate components may or may not be physically separate, and components shown as units may or may not be physical units, i.e., they may be distributed over a plurality of network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of this embodiment. Those of ordinary skill in the art can understand and implement the invention without creative effort.
The foregoing description of the preferred embodiments of the invention is not intended to be limiting, but rather is intended to cover all modifications, equivalents, alternatives, and improvements that fall within the spirit and scope of the invention.
Claims (6)
1. A system for predicting disease based on upper gastrointestinal bleeding, the system comprising:
a data acquisition module: used to extract, from an electronic case library, case samples in which gastrointestinal bleeding is the first clinical chief complaint;
a feature extraction module: used to extract the upper gastrointestinal bleeding symptom information, the relevant biological index data and the corresponding disease name from each case sample to form a case sample feature vector;
a disease clustering module: used to determine the number of cluster categories and to cluster the case sample feature vectors with a K-means clustering algorithm optimized by an improved bird swarm algorithm;
the disease clustering module specifically comprises:
an initializing unit: initializing a population, calculating a fitness value, and selecting an initialized global optimal value;
population updating unit: generating a random number a in (0, 1) and a constant P in (0, 1); the bird forages when a > P and otherwise stays vigilant;
when a > P, the foraging position is updated using nonlinear weight adjustment, and the position update formula is:
wherein the position terms denote the position of the i-th bird in the j-th dimension at time t, $C_1$ and $C_2$ are respectively the perception coefficient and the social evolution coefficient, $P_{i,j}$ is the best position of the i-th bird, and $G_{i,j}$ is the best position of the flock;
when a < P, the flock stays vigilant, and the position update formula is:
wherein $a_1, a_2 \in [0, 2]$; $k$ is a random integer in $[1, N]$ with $k \neq i$; $f_i$ and $f_k$ are the fitness values of the i-th and k-th birds, respectively; sumf is the sum of the fitness values of the whole population; $\varepsilon$ is a constant; and $\mathrm{mean}_j$ is the average fitness value of the population;
the flock migrates periodically, dividing into producers and scroungers, and the positions of the producers and scroungers are updated;
evaluation unit: calculating a new fitness value, updating a historical optimal value, returning to the population updating unit until the set iteration times are reached, and outputting an optimal position as a clustering center point;
a joint diagnosis module: used to acquire a case to be diagnosed, assign it to a cluster, build a joint diagnosis model that combines the semantic similarity of the upper gastrointestinal bleeding symptom information with the Euclidean distance of the biological index data between the case to be diagnosed and the case samples within that cluster, and make a disease prediction for the case to be diagnosed.
2. The system for predicting disease based on upper gastrointestinal bleeding according to claim 1, wherein the feature extraction module extracts feature items of symptom information of upper gastrointestinal bleeding in each case sample by TF-IDF, and creates feature vectors of symptom information of upper gastrointestinal bleeding; the related biological index data comprise blood routine, liver function, kidney function and fecal routine, and the biological index data are normalized to form a biological index vector; the characteristic vector of the case sample consists of a characteristic vector of symptom information of upper gastrointestinal bleeding and a biological index vector.
3. The system for predicting diseases based on upper gastrointestinal bleeding as set forth in claim 1, wherein the fitness function of the evaluation unit in the disease clustering module is the sum of the intra-class distances, namely $\sum_{j=1}^{K} \sum_{X_i \in \text{class } j} d(X_i, C_j)$, where $K$ is the number of cluster categories and $d(X_i, C_j)$ is the distance from each data object $X_i$ in class $j$ to the corresponding cluster center point $C_j$.
4. The system for predicting disease based on upper gastrointestinal bleeding of claim 1, wherein the joint diagnosis module specifically comprises:
clustering unit: extracting features of the obtained case to be diagnosed to obtain a feature vector of the case to be diagnosed, calculating Euclidean distances from the feature vector of the case to be diagnosed to each clustering center point, and selecting a cluster with the smallest Euclidean distance;
a calculation unit: calculating cosine similarity alpha between feature vectors of symptom information of upper gastrointestinal bleeding between a case to be diagnosed and case samples in the cluster; calculating Euclidean distance beta of biological index vectors between the case to be diagnosed and case samples in the cluster;
a judging unit: calculating the joint similarity s of each case sample, $s = w_1\alpha + w_2(1-\beta)$, with $w_1 + w_2 = 1$, and taking the disease type corresponding to the case sample with the largest joint similarity as the disease prediction result of the case to be diagnosed.
5. An electronic device, comprising: at least one processor, at least one memory, a communication interface, and a bus;
the processor, the memory and the communication interface complete communication with each other through the bus;
the memory stores program instructions executable by the processor, the processor invoking the program instructions to implement the system of any of claims 1-4.
6. A computer readable storage medium storing computer instructions that cause the computer to implement the system of any one of claims 1-4.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202011059143.1A CN112259219B (en) | 2020-09-30 | 2020-09-30 | System, equipment and storage medium for predicting diseases based on upper gastrointestinal bleeding |
Publications (2)
Publication Number | Publication Date |
---|---|
CN112259219A CN112259219A (en) | 2021-01-22 |
CN112259219B CN112259219B (en) | 2024-02-02 |
Family
ID=74233950
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202011059143.1A Active CN112259219B (en) | 2020-09-30 | 2020-09-30 | System, equipment and storage medium for predicting diseases based on upper gastrointestinal bleeding |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN112259219B (en) |
Citations (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2002008755A3 (en) * | 2000-07-24 | 2003-09-12 | Yeda Res & Dev | Identifying antigen clusters for monitoring a global state of an immune system |
JP2005509127A (en) * | 2000-11-27 | 2005-04-07 | インテリジェント メディカル ディバイシーズ エル.エル. シー. | Clinically intelligent diagnostic apparatus and method |
CN104915560A (en) * | 2015-06-11 | 2015-09-16 | 万达信息股份有限公司 | Method for disease diagnosis and treatment scheme based on generalized neural network clustering |
CN106446603A (en) * | 2016-09-29 | 2017-02-22 | 福州大学 | Gene expression data clustering method based on improved PSO algorithm |
WO2017059003A1 (en) * | 2015-09-29 | 2017-04-06 | Crescendo Bioscience | Biomarkers and methods for assessing psoriatic arthritis disease activity |
KR20190080331A (en) * | 2017-12-28 | 2019-07-08 | 경북대학교 산학협력단 | Extended BSA Coverage Path Searching Device, Method and Recording Medium thereof |
CN110648088A (en) * | 2019-11-26 | 2020-01-03 | 国网江西省电力有限公司电力科学研究院 | Electric energy quality disturbance source judgment method based on bird swarm algorithm and SVM |
CN111597878A (en) * | 2020-04-02 | 2020-08-28 | 南京农业大学 | BSA-IA-BP-based colony total number prediction method |
CN111653359A (en) * | 2020-05-30 | 2020-09-11 | 吾征智能技术(北京)有限公司 | Intelligent prediction model construction method and prediction system for hemorrhagic diseases |
Non-Patent Citations (3)
Title |
---|
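A Color Image Quantization Algorithm Based on Particle Swarm Optimization; Omran, MG, et al.; INFORMATICA-JOURNAL OF COMPUTING AND INFORMATICS; Vol. 29 (No. 03); pp. 261-269 * |
A concept drift detection algorithm based on sequence alignment; Li Deyu, et al.; Journal of Shanxi University (Natural Science Edition); Vol. 39 (No. 03); pp. 415-422 * |
A mean-based cloud adaptive bird swarm optimization algorithm; Wang Jincheng, et al.; Science Technology and Engineering (No. 11); pp. 167-172 * |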
Also Published As
Publication number | Publication date |
---|---|
CN112259219A (en) | 2021-01-22 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||