CN112002415A - Intelligent cognitive disease system based on human excrement
- Publication number
- CN112002415A (application number CN202010853281.0A)
- Authority
- CN
- China
- Prior art keywords
- characteristic
- words
- characteristic information
- word
- similarity
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16H—HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
- G16H50/00—ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
- G16H50/20—ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for computer-aided diagnosis, e.g. based on medical expert systems
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/33—Querying
- G06F16/3331—Query processing
- G06F16/334—Query execution
- G06F16/3344—Query execution using natural language analysis
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/35—Clustering; Classification
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/279—Recognition of textual entities
- G06F40/289—Phrasal analysis, e.g. finite state techniques or chunking
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02A—TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
- Y02A90/00—Technologies having an indirect contribution to adaptation to climate change
- Y02A90/10—Information and communication technologies [ICT] supporting adaptation to climate change, e.g. for weather forecasting or climate simulation
Abstract
The invention provides a system for intelligently recognizing diseases based on human excrement. The system comprises: a characteristic word acquisition module, which acquires an initial text and extracts keywords from it as reference characteristic words through a TF-IDF characteristic word positioning algorithm; a characteristic information base establishing module, which generates a characteristic information classification library by a classification and clustering method; an identification model establishing module, which generates an identification model from the disease symptom characteristic information and the characteristic information classification library; and a recognition module, which acquires the characteristic words of the human feces to be recognized, calculates the similarity between those characteristic words and the reference characteristic words in the identification model using a cosine similarity algorithm, and generates a corresponding cognitive report. By using the TF-IDF characteristic word positioning algorithm and the cosine similarity algorithm to establish the identification model, the invention can recognize the diseases corresponding to human excrement quickly and accurately, improving recognition accuracy.
Description
Technical Field
The invention relates to the technical field of computers, in particular to a system for intelligently recognizing diseases based on human excrement.
Background
Stool has many characteristics: first, its shape and hardness; second, its color; third, its contents; and fourth, its odor. Under pathological conditions the color, odor, form, and other characteristics of the stool change, and these changes can indicate the corresponding pathology.
In recent years, many institutions have researched intelligent disease recognition based on human stool, but most start from image recognition of stool characteristics and analyze the physical condition of the subject from the image recognition result. This approach has low efficiency and a low recognition rate, and in most cases a doctor must still make a manual judgment. An intelligent disease recognition system based on human excrement that can recognize diseases efficiently and accurately is therefore urgently needed.
The above is only for the purpose of assisting understanding of the technical aspects of the present invention, and does not represent an admission that the above is prior art.
Disclosure of Invention
In view of the above, the invention provides a system for intelligently recognizing diseases based on human body feces, and aims to solve the technical problem that the prior art does not use a TF-IDF feature word positioning algorithm and a cosine similarity algorithm to improve the efficiency and accuracy of disease recognition.
The technical scheme of the invention is realized as follows:
in one aspect, the present invention provides a system for intelligently recognizing diseases based on human body feces, comprising:
the feature word acquisition module is used for acquiring an initial text, establishing a TF-IDF feature word positioning algorithm, acquiring a keyword from the initial text through the TF-IDF feature word positioning algorithm, and taking the keyword as a reference feature word;
the characteristic information base establishing module is used for dividing the reference characteristic words into different categories by a classification and clustering method and generating a corresponding characteristic information classification base according to the categories;
the identification model establishing module is used for acquiring corresponding disease symptom characteristic information according to the reference characteristic words of the characteristic information classification library and generating an identification model by combining the disease symptom characteristic information and the characteristic information classification library;
and the recognition module is used for establishing a cosine similarity algorithm, acquiring the characteristic words of the human body feces to be recognized, calculating the similarity between the characteristic words of the human body feces to be recognized and the reference characteristic words in the recognition model by using the cosine similarity algorithm, and generating a corresponding cognitive report.
On the basis of the above technical solution, preferably, the feature word obtaining module includes an initial text screening module, configured to obtain an initial text associated with the stool from the network, obtain a local term database, screen out a text description associated with the stool from the initial text according to the local term database, and use the text description as a text to be calculated.
On the basis of the above technical solution, preferably, the feature word acquisition module comprises a keyword extraction module for establishing a TF-IDF feature word positioning algorithm, calculating the TF-IDF values of all words in the text to be calculated through the TF-IDF feature word positioning algorithm, sorting all the words by TF-IDF value, and selecting the words whose TF-IDF values rank in the top 10 as the reference feature words.
On the basis of the above technical solution, preferably, the characteristic information base establishing module further includes a characteristic information base module, configured to obtain the category of each reference characteristic word and classify the reference characteristic words accordingly, where the categories of the reference characteristic words include the odor, form, and color of the excrement; and to generate sets of different categories from the categories and their corresponding reference characteristic words, the sets serving as the characteristic information classification library.
On the basis of the above technical solution, preferably, the identification model establishing module includes an information matching module, which is used for searching the local disease symptom characteristic information base according to the reference characteristic words in the characteristic information classification base, finding out the disease symptom characteristic information corresponding to the reference characteristic words, and generating the identification model by combining the characteristic information classification base and the corresponding disease symptom characteristic information.
On the basis of the above technical solution, preferably, the recognition module includes a calculation recognition module for establishing a cosine similarity algorithm to obtain the feature words to be recognized of the human body feces to be recognized, calculating the similarity between the feature words to be recognized and the reference feature words by the cosine similarity algorithm, and generating the corresponding cognitive report according to the similarity.
On the basis of the above technical solution, preferably, the recognition module includes a report generation module for setting a similarity threshold, comparing the similarity with the similarity threshold, when the similarity is greater than the similarity threshold, searching for disease symptom feature information corresponding to the reference feature word, and generating a corresponding cognitive report; and when the similarity is smaller than the similarity threshold value, reselecting the similarity for comparison.
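The similarity calculation and threshold comparison described for the recognition module can be sketched as follows. This is only a minimal Python illustration under stated assumptions, not the implementation of the invention: it assumes the feature words are compared as bag-of-words count vectors, and the names `cosine_similarity` and `generate_report`, the layout of the `model` dictionary, and the threshold value 0.5 are all hypothetical.

```python
import math
from collections import Counter


def cosine_similarity(words_a, words_b):
    """Cosine similarity between two word lists treated as count vectors."""
    vec_a, vec_b = Counter(words_a), Counter(words_b)
    # Only words present in both lists contribute to the dot product.
    dot = sum(vec_a[w] * vec_b[w] for w in vec_a.keys() & vec_b.keys())
    norm_a = math.sqrt(sum(c * c for c in vec_a.values()))
    norm_b = math.sqrt(sum(c * c for c in vec_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty word list is similar to nothing
    return dot / (norm_a * norm_b)


def generate_report(observed_words, model, threshold=0.5):
    """Return (disease info, similarity) pairs whose similarity exceeds the threshold.

    `model` is an assumed layout: disease symptom information mapped to
    the list of reference feature words associated with it.
    """
    report = []
    for info, reference_words in model.items():
        similarity = cosine_similarity(observed_words, reference_words)
        if similarity > threshold:
            report.append((info, similarity))
        # Below the threshold: skip this entry and compare the next one.
    report.sort(key=lambda item: item[1], reverse=True)
    return report


# Illustrative data only; the word lists are not taken from a real database.
model = {
    "upper gastrointestinal bleeding": ["black", "tarry", "glossy"],
    "cholera": ["rice-water", "watery", "white"],
}
report = generate_report(["black", "tarry"], model)
```

Sorting the report by similarity is a design choice of this sketch, so that the best-matching disease symptom information appears first.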
Still further preferably, the human stool-based intelligent cognitive disease device comprises:
the feature word acquisition unit is used for acquiring an initial text, establishing a TF-IDF feature word positioning algorithm, acquiring a keyword from the initial text through the TF-IDF feature word positioning algorithm, and taking the keyword as a reference feature word;
the characteristic information base establishing unit is used for dividing the reference characteristic words into different categories by a classification and clustering method and generating a corresponding characteristic information classification base according to the categories;
the identification model establishing unit is used for acquiring corresponding disease symptom characteristic information according to the reference characteristic words of the characteristic information classification library and generating an identification model by combining the disease symptom characteristic information and the characteristic information classification library;
and the recognition unit is used for establishing a cosine similarity algorithm, acquiring the characteristic words of the human body feces to be recognized, calculating the similarity between the characteristic words of the human body feces to be recognized and the reference characteristic words in the recognition model by using the cosine similarity algorithm, and generating a corresponding cognitive report.
Compared with the prior art, the intelligent cognitive disease system based on the human body feces has the following beneficial effects:
(1) by establishing a TF-IDF characteristic word positioning algorithm, keywords can be obtained from an initial text, the recognition and identification precision of the whole disease is improved, and the system can accurately inquire the corresponding keywords conveniently;
(2) by utilizing a cosine similarity algorithm, the similarity between the characteristic words of the human body feces to be recognized and the reference characteristic words in the recognition model can be calculated, the recognition accuracy of the system is improved, and the recognition speed is improved.
Drawings
In order to illustrate the embodiments of the present invention or the technical solutions in the prior art more clearly, the drawings used in the description of the embodiments or the prior art are briefly described below. Obviously, the drawings described below show only some embodiments of the present invention, and those skilled in the art can obtain other drawings from them without creative effort.
FIG. 1 is a block diagram of a first embodiment of the intelligent cognitive disease system based on human feces according to the present invention;
FIG. 2 is a structural diagram of a second embodiment of the intelligent cognitive disease system based on human feces according to the present invention;
FIG. 3 is a block diagram of a third embodiment of the intelligent cognitive disease system based on human feces according to the present invention;
FIG. 4 is a block diagram of a fourth embodiment of the intelligent cognitive disease system based on human feces according to the present invention;
FIG. 5 is a block diagram of a fifth embodiment of the intelligent cognitive disease system based on human feces according to the present invention;
FIG. 6 is a block diagram of the structure of the intelligent cognitive disease equipment based on human feces.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be obtained by a person skilled in the art without any inventive step based on the embodiments of the present invention, are within the scope of the present invention.
As shown in fig. 1, fig. 1 is a block diagram illustrating a first embodiment of a system for intelligent cognitive diseases based on human feces according to the present invention. Wherein, the intelligent cognitive disease system based on human excrement comprises: the system comprises a feature word acquisition module 10, a feature information base establishment module 20, a recognition model establishment module 30 and a recognition module 40.
The feature word obtaining module 10 is configured to obtain an initial text, establish a TF-IDF feature word positioning algorithm, obtain a keyword from the initial text through the TF-IDF feature word positioning algorithm, and use the keyword as a reference feature word;
the characteristic information base establishing module 20 is used for dividing the reference characteristic words into different categories by a classification and clustering method and generating a corresponding characteristic information classification base according to the categories;
the identification model establishing module 30 is configured to obtain corresponding disease symptom feature information according to the reference feature words of the feature information classification library, and generate an identification model by combining the disease symptom feature information and the feature information classification library;
the recognition module 40 is configured to establish a cosine similarity algorithm, obtain the feature words of the human body feces to be recognized, calculate the similarity between the feature words of the human body feces to be recognized and the reference feature words in the recognition model by using the cosine similarity algorithm, and generate a corresponding cognitive report.
It should be understood that the execution subject of this embodiment may be a processor or a controller in a cognitive state processing terminal used by a patient or a doctor, or the like.
It should be understood that the present embodiment also includes another stool-aware disease prediction device, including:
the text description unit is used for identifying the characteristic information of the excrement to be diagnosed and acquiring the text characteristic information description corresponding to the characteristic information of the excrement to be diagnosed;
the database construction unit is used for identifying characteristic information data of human body excrement characters (including color, smell, characters, hardness, excrement content and the like) and corresponding disease data based on a TF-IDF characteristic word positioning algorithm, and establishing a corresponding database according to the excrement characteristic information data and the corresponding disease knowledge and disease symptom characteristic information data;
the excrement cognitive learning model building unit is used for dividing the different excrement characteristic information data into different categories (including color, smell, property, hardness, excrement content and the like) and sets, establishing a cosine similarity algorithm classifier, and acquiring an excrement characteristic information cognitive learning model by using the cosine similarity algorithm classifier through the different sets;
and the screening and diagnosing unit is used for screening and diagnosing the similarity of the text description of the fecal characteristic information corresponding to the fecal characteristic information to be diagnosed through the fecal characteristic information learning model.
Further, as shown in fig. 2, a structural block diagram of a second embodiment of the system for intelligent cognitive diseases based on human stool according to the present invention is proposed based on the above embodiments, in this embodiment, the feature word obtaining module 10 further includes:
the initial text screening module 101 is configured to acquire an initial text associated with the feces from the network, acquire a local term database, screen a text description associated with the feces from the initial text according to the local term database, and use the text description as a text to be calculated;
the keyword extraction module 102 is used for establishing a TF-IDF characteristic word positioning algorithm, calculating the TF-IDF values of all words in the text to be calculated through the TF-IDF characteristic word positioning algorithm, sorting all the words by TF-IDF value, and selecting the words whose TF-IDF values rank in the top 10 as reference characteristic words;
it should be understood that the system first obtains initial text associated with feces from the network, obtains a local term database, and screens out text description associated with feces from the initial text according to the local term database to serve as the text to be calculated.
It should be understood that the initial text associated with the stool is text that the system downloads from various websites and forums by searching on stool-related terms. The initial text can also be based on a template entered in advance by an administrator, in which case the system searches for and obtains the initial text according to that template.
It should be understood that the local term database consists of stool-related terms preset by the administrator, such as:
loose, pasty, juice-like, or watery stool, commonly seen in various infectious or non-infectious diarrhea and enteritis;
yellow-green watery stool containing membranous material, which may indicate pseudomembranous enteritis;
rice-water stool (resembling the white water left after washing rice), commonly seen in cholera and paracholera, which are serious infectious diseases requiring isolation and treatment as soon as possible;
stool containing a large amount of mucus visible to the naked eye, mostly caused by inflammation of the small intestine or proctitis;
stool containing pus and blood visible to the naked eye, called purulent bloody stool, commonly seen in dysentery, ulcerative colitis, colon or rectal cancer, regional enteritis, and the like;
fresh blood, usually seen with hemorrhoids or anal fissure and mostly adhering to the surface of hard, constipated stool;
black stool, also called tarry stool, which resembles tar, is soft and glossy, mostly results from upper gastrointestinal bleeding of various causes, and gives a strongly positive occult blood test, whereas black stool caused by taking medicines has no gloss and gives a negative occult blood test;
stool after barium meal radiography, which may temporarily turn yellowish white;
thin strip-shaped or flat strip-shaped stool, which indicates a narrowed rectum and is mostly seen in rectal cancer;
dry stool, usually in hard balls or resembling sheep droppings, seen in patients with constipation or in elderly people with weak defecation.
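The screening of the initial text against the local term database can be illustrated with a short Python sketch. The patent does not specify how the screening is performed, so the sketch assumes a simple approach: split the text into sentences and keep those that contain at least one term from the database. The function name `screen_text`, the sentence-splitting rule, and the sample terms are hypothetical.

```python
import re


def screen_text(initial_text, term_database):
    """Keep only the sentences that mention at least one term in the database."""
    # Assumed splitting rule: sentences end at '.', ';', '!' or '?'.
    sentences = [s.strip() for s in re.split(r"[.;!?]\s*", initial_text) if s.strip()]
    terms = [t.lower() for t in term_database]
    return [s for s in sentences if any(t in s.lower() for t in terms)]


# Illustrative input; the sentences paraphrase examples given in the description.
initial_text = (
    "Black tarry stool suggests upper gastrointestinal bleeding. "
    "The weather was fine today. "
    "Rice-water stool is commonly seen in cholera."
)
term_database = {"tarry stool", "rice-water stool"}
text_to_calculate = screen_text(initial_text, term_database)
```

Case-insensitive substring matching is used here only for brevity; a production system would more likely tokenize the text first.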
It should be understood that the system also establishes a TF-IDF feature word positioning algorithm, calculates the TF-IDF values of all words in the text to be calculated, sorts all the words by TF-IDF value, and selects the words whose TF-IDF values rank in the top 10 as reference feature words. The TF-IDF feature word positioning algorithm finds the reference feature words quickly and accurately, which improves system efficiency and saves time.
It should be understood that extracting the keywords of an article with TF-IDF mainly involves calculating the TF (term frequency) of each word in the article, then calculating the IDF (inverse document frequency, which acts as a weight) of the word, and finally multiplying TF by IDF and sorting the results to obtain the top N keywords of the article. These keywords are the key characteristic information in the description of human stool characteristics.
The specific implementation is as follows. First, calculate the word frequency of the text description of the human stool characteristics. Word frequency is the total number of times a word appears in an article. To eliminate the effect of differing article lengths and make different articles comparable, the word frequency is normalized:
$$\mathrm{TF} = \frac{\text{number of times the word appears in the article}}{\text{total number of words in the article}}$$
or, alternatively:
$$\mathrm{TF} = \frac{\text{number of times the word appears in the article}}{\text{number of occurrences of the most frequent word in the article}}$$
Next, calculate the inverse document frequency (IDF). To avoid a denominator of 0, 1 is added to the denominator:
$$\mathrm{IDF} = \log\frac{\text{total number of documents in the corpus}}{\text{number of documents containing the word} + 1}$$
As the expression shows, the more documents contain the word, the smaller the IDF value and the less important the word; conversely, the fewer documents contain it, the more important the word. The IDF thus acts as a weight assigned to TF.
Then the TF-IDF value is calculated: TF-IDF = TF × IDF. A high word frequency within a particular document, combined with a low document frequency for that word across the document collection, yields a high TF-IDF weight. TF-IDF therefore tends to filter out common words and retain important ones.
Finally, the keywords are determined. After the TF-IDF value of each word in the article is calculated, the words are sorted by value and those with the highest values are selected as keywords.
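The four steps above (word frequency, inverse document frequency, their product, and ranking) can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `tf_idf_keywords`, the toy English corpus, and the assumption that the text has already been split into words are not taken from the patent.

```python
import math
from collections import Counter


def tf_idf_keywords(doc_tokens, corpus, top_n=10):
    """Return the top_n (word, score) pairs for one tokenized document.

    doc_tokens: list of words of the document being analysed.
    corpus: list of tokenized documents (each a list of words).
    """
    counts = Counter(doc_tokens)
    total_words = len(doc_tokens)
    n_docs = len(corpus)
    scores = {}
    for word, count in counts.items():
        # TF normalized by document length (first variant in the text).
        tf = count / total_words
        # Number of corpus documents that contain the word.
        doc_freq = sum(1 for doc in corpus if word in doc)
        # IDF with +1 in the denominator, as described in the text.
        idf = math.log(n_docs / (doc_freq + 1))
        scores[word] = tf * idf
    ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    return ranked[:top_n]


# Hypothetical toy corpus of already-tokenized stool descriptions.
corpus = [
    ["black", "tarry", "stool", "odor"],
    ["watery", "stool", "diarrhea"],
    ["hard", "stool", "constipation"],
    ["bright", "red", "stool", "blood"],
]
keywords = tf_idf_keywords(corpus[0], corpus, top_n=3)
```

In this toy corpus the word "stool" appears in every document, so with the +1 smoothing its IDF is slightly negative and it falls to the bottom of the ranking, which is the filtering effect the text describes.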
Further, fig. 3 shows a structural block diagram of a third embodiment of the system for intelligent cognitive diseases based on human stool according to the present invention, proposed based on the above embodiments. In this embodiment, the characteristic information base establishing module 20 further includes:
the characteristic information base module 201, configured to obtain the category of each reference characteristic word and classify the reference characteristic words by category, where the categories include the odor, form and color of the excrement; and to generate a set for each category from the categories and their corresponding reference characteristic words, the sets together serving as the characteristic information classification library.
It should be understood that the system then classifies and clusters the abnormal characteristic information of human feces according to the reference characteristic words into several categories and levels, such as odor, form, color, hardness and stool content. Classification and clustering are common methods and are not explained in detail here. A set is generated for each category from the categories and their corresponding reference characteristic words, and these sets serve as the characteristic information classification library. Taking stool color as an example, there are green, bright red, purplish red, dull red, coffee-colored, yellow, black, blackish brown, bright bloody, silver and red stools, and the like. Taking stool form as an example, there are hard stool, soft stool, loose stool, watery stool, pasty or juice-like stool, rice-water stool, jelly-like stool, white clay-like stool, thin strip-like stool and the like; different forms indicate different conditions. Taking stool odor as an example, there are sour, burnt, rancid, fishy and unusual odors; different odors likewise indicate different conditions.
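A minimal sketch of how reference characteristic words might be grouped into a characteristic information classification library is given below. The patent does not specify the classification and clustering method, so this sketch swaps in a simple lexicon lookup; `CATEGORY_LEXICON` and `build_classification_library` are hypothetical names, and the English words are stand-ins for the examples listed above.

```python
from collections import defaultdict

# Hypothetical category lexicon; a real system would derive categories
# with whatever classification or clustering method it uses.
CATEGORY_LEXICON = {
    "color": {"green", "bright red", "black", "clay-colored"},
    "form": {"hard", "watery", "pasty", "thin strip"},
    "odor": {"sour", "burnt", "rancid", "fishy"},
}


def build_classification_library(reference_words):
    """Group reference characteristic words into one set per category."""
    library = defaultdict(set)
    for word in reference_words:
        for category, members in CATEGORY_LEXICON.items():
            if word in members:
                library[category].add(word)
    # Words that match no category are simply left out in this sketch.
    return dict(library)


library = build_classification_library(
    ["black", "watery", "sour", "hard", "unknown"]
)
```

The result maps each category name to the set of reference words assigned to it, which is one straightforward way to represent "a set of different categories".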
Further, fig. 4 shows a structural block diagram of a fourth embodiment of the system for intelligent cognitive diseases based on human stool according to the present invention, proposed based on the above embodiments. In this embodiment, the identification model building module 30 includes:
the information matching module 301 is configured to search the local disease symptom feature information base according to the reference feature words in the feature information classification base, find disease symptom feature information corresponding to the reference feature words, and generate an identification model by combining the feature information classification base and the corresponding disease symptom feature information.
It should be understood that the system then extracts the reference feature words for each of these categories and levels and matches them continuously against the candidate disease symptom characteristic information base, so as to generate an information base linking abnormal stool characteristics to the corresponding candidate diseases and disease symptoms. The disease symptom characteristic information base is a database created locally in advance and populated by an administrator. For example: the stool of patients with constipation forms hard, ball-shaped lumps; porridge-like or watery stool occurs in diarrhea of various causes; rice-water stool is seen in cholera and also in paracholera; dark or tarry stool usually indicates bleeding from the esophagus, stomach or duodenum; bloody stool that is bright red, close to the color of blood, indicates bleeding in the large intestine or hemorrhoids; loose stool with poor appetite and abdominal distension indicates spleen yang deficiency, and so on. As a further example, if the stool is a clay-like grayish white, this suggests jaundice or biliary obstruction caused by stones, tumors, ascarids and the like, which prevents bile pigment from being excreted with the stool. If the person has not eaten pig blood or taken medicines that blacken the stool, black stool generally indicates upper gastrointestinal bleeding. Usually the stomach or duodenum bleeds, and as the blood passes through the intestine, the iron in it is converted to iron sulfide by intestinal bacteria and turns black. In patients with upper gastrointestinal bleeding, about 50% of cases are due to ulcer disease, most of them bleeding duodenal ulcers.
In addition to ulcers, gastritis, cirrhosis with ruptured esophageal or gastric fundal varices, and gastric cancer are also common causes of upper gastrointestinal bleeding. Bright-red bloody stool mostly comes from bleeding in the lower digestive tract, which includes the jejunum, ileum, rectum and colon; because the path is short and the blood undergoes little chemical change, bleeding from these parts turns the stool bright red. If the upper gastrointestinal tract bleeds heavily, the blood does not remain in the intestine for long, and the discharged stool is also red. As another example, normal stool is yellow and banana-shaped and is passed about once a day. It is reported that a gradual change in stool color from brown to green is considered normal, but other colors may suggest a serious disease.
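The matching of reference feature words against a locally maintained disease symptom characteristic information base can be sketched as a dictionary lookup. Everything named here (`SYMPTOM_BASE`, `build_recognition_model`) is hypothetical, and the handful of entries only paraphrase examples from the text above for illustration; they are not a medical data set.

```python
# Hypothetical local symptom base, filled in advance by an administrator.
SYMPTOM_BASE = {
    "black": ["upper gastrointestinal bleeding"],
    "bright red": ["lower gastrointestinal bleeding", "hemorrhoids"],
    "watery": ["diarrhea"],
    "clay-colored": ["biliary obstruction"],
    "hard": ["constipation"],
}


def build_recognition_model(classification_library, symptom_base):
    """Attach candidate disease information to each categorized word.

    classification_library: dict mapping category -> set of words.
    Returns a dict mapping (category, word) -> list of candidates.
    """
    model = {}
    for category, words in classification_library.items():
        for word in sorted(words):
            # Words without an entry in the symptom base are skipped.
            if word in symptom_base:
                model[(category, word)] = list(symptom_base[word])
    return model


model = build_recognition_model(
    {"color": {"black", "green"}, "form": {"watery"}},
    SYMPTOM_BASE,
)
```

Keying the model by (category, word) keeps the classification library and the symptom information combined in one structure, which is how the text describes the identification model.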
Further, fig. 5 shows a block diagram of a fifth embodiment of the intelligent cognitive disease system based on human stool according to the present invention, proposed based on the above embodiments. In this embodiment, the identification module 40 includes:
the calculation and recognition module 401 is configured to establish a cosine similarity algorithm, obtain a feature word to be recognized of human body stool to be recognized, calculate a similarity between the feature word to be recognized and a reference feature word by using the cosine similarity algorithm, and generate a corresponding cognitive report according to the similarity.
A report generating module 402, configured to set a similarity threshold, compare the similarity with the similarity threshold, find disease symptom feature information corresponding to the reference feature word when the similarity is greater than the similarity threshold, and generate a corresponding cognitive report; and when the similarity is smaller than the similarity threshold value, reselecting the similarity for comparison.
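The threshold comparison performed by the report generating module can be sketched as follows. The threshold value 0.8, the function name `generate_report`, and the reading of "reselecting the similarity for comparison" as moving on to the next candidate are all assumptions made for illustration.

```python
def generate_report(similarities, symptom_info, threshold=0.8):
    """Return a report for the best match above the threshold, or None.

    similarities: dict mapping reference word -> similarity score.
    symptom_info: dict mapping reference word -> list of candidates.
    """
    # Try candidates from the highest similarity downwards.
    ranked = sorted(similarities.items(), key=lambda kv: kv[1], reverse=True)
    for word, score in ranked:
        if score > threshold and word in symptom_info:
            return {
                "reference_word": word,
                "similarity": score,
                "candidates": symptom_info[word],
            }
    # No similarity exceeded the threshold: nothing to report.
    return None


info = {"black": ["upper gastrointestinal bleeding"], "watery": ["diarrhea"]}
report = generate_report({"black": 0.92, "watery": 0.40}, info)
no_report = generate_report({"watery": 0.40}, info)
```

Returning `None` when nothing clears the threshold leaves it to the caller to decide whether to request more input or lower the threshold.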
It should be understood that the system finally uses cosine similarity to calculate how similar the texts are. Specifically, the characteristic information of the abnormal human stool characteristics to be identified is input into the model, and the cosine similarity algorithm is used to calculate and analyze the similarity of the weights of the related abnormal characteristic information. The steps are: calculate the keywords of the article to be identified that describes the stool characteristics; select the same number of keywords from each article and merge them into one set; calculate each article's word frequency for the words in the set to generate the word frequency vector of each article; and then compute the similarity of the two vectors by Euclidean distance or cosine distance. For cosine similarity, the larger the value, the more similar the two texts. The disease corresponding to the stool is thereby recognized, and a corresponding recognition report is generated.
It should be understood that, in this embodiment, the specific method is as follows. The two vectors can be imagined as two line segments in space, both starting from the origin [0, 0, ...] and pointing in different directions. The two line segments form an included angle. If the angle is 0 degrees, the directions are the same and the segments overlap; if the angle is 90 degrees, they form a right angle and the directions are completely dissimilar; if the angle is 180 degrees, the directions are exactly opposite. The similarity of the vectors can therefore be judged from the size of the included angle: the smaller the angle, the more similar the vectors. Equivalently, the closer the cosine value is to 1, the closer the angle is to 0 degrees and the more similar the two vectors are. Suppose A and B are two n-dimensional vectors, A = [A_1, A_2, ..., A_n] and B = [B_1, B_2, ..., B_n]. The cosine of the included angle θ between A and B is:

$$\cos\theta = \frac{\sum_{i=1}^{n} A_i B_i}{\sqrt{\sum_{i=1}^{n} A_i^{2}}\,\sqrt{\sum_{i=1}^{n} B_i^{2}}}$$

The disease corresponding to the stool can be recognized quickly and accurately from the cosine of the included angle, which improves the efficiency of the whole system.
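The word frequency vectors and the cosine of their included angle can be sketched as follows. The helper names `term_frequency_vectors` and `cosine_similarity` and the sample keyword lists are hypothetical; returning 0.0 for a zero vector is a design choice of this sketch, not something the text specifies.

```python
import math
from collections import Counter


def term_frequency_vectors(words_a, words_b):
    """Build two aligned word-count vectors over the merged keyword set."""
    vocabulary = sorted(set(words_a) | set(words_b))
    counts_a, counts_b = Counter(words_a), Counter(words_b)
    vec_a = [counts_a[w] for w in vocabulary]
    vec_b = [counts_b[w] for w in vocabulary]
    return vec_a, vec_b


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length numeric vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        # Undefined for a zero vector; treated as "not similar" here.
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical keyword lists for a text to identify and a reference text.
vec_a, vec_b = term_frequency_vectors(
    ["black", "tarry", "stool"],
    ["black", "stool", "odor"],
)
similarity = cosine_similarity(vec_a, vec_b)
```

Here the two lists share two of their three words, so the vectors are [1, 0, 1, 1] and [1, 1, 1, 0] over the merged vocabulary and the cosine value is 2/3.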
The above description is only for illustrative purposes and does not limit the technical solutions of the present application in any way.
Through the above description, it can be easily found that the present embodiment provides a system for intelligently recognizing diseases based on human feces, which includes: the characteristic word acquisition module is used for acquiring an initial text and acquiring a keyword from the initial text as a reference characteristic word through a TF-IDF characteristic word positioning algorithm; the characteristic information base establishing module is used for generating a characteristic information classification base by a classification and clustering method; the identification model establishing module is used for generating an identification model according to the disease symptom characteristic information and the characteristic information classification library; and the recognition module is used for acquiring the characteristic words of the human body feces to be recognized, calculating the similarity between the characteristic words of the human body feces to be recognized and the reference characteristic words in the recognition model by utilizing a cosine similarity algorithm, and generating a corresponding cognitive report. In the embodiment, the TF-IDF feature word positioning algorithm and the cosine similarity algorithm are utilized to establish the corresponding recognition model, so that diseases corresponding to human feces are quickly and accurately recognized, and the recognition accuracy is improved.
In addition, the embodiment of the invention also provides equipment for intelligently recognizing the diseases based on the human body excrement. As shown in fig. 6, the intelligent cognitive disease device based on human stool comprises: a feature word obtaining unit 10, a feature information base establishing unit 20, a recognition model establishing unit 30, and a recognition unit 40.
The feature word obtaining unit 10 is configured to obtain an initial text, establish a TF-IDF feature word positioning algorithm, obtain a keyword from the initial text through the TF-IDF feature word positioning algorithm, and use the keyword as a reference feature word;
the characteristic information base establishing unit 20 is configured to divide the reference characteristic words into different categories by a classification and clustering method, and generate a corresponding characteristic information classification base according to the categories;
the identification model establishing unit 30 is configured to obtain corresponding disease symptom feature information according to the reference feature words of the feature information classification library, and generate an identification model by combining the disease symptom feature information and the feature information classification library;
the recognition unit 40 is configured to establish a cosine similarity algorithm, obtain the feature words of the human body feces to be recognized, calculate the similarity between the feature words of the human body feces to be recognized and the reference feature words in the recognition model by using the cosine similarity algorithm, and generate a corresponding cognitive report.
In addition, it should be noted that the above-described embodiments of the apparatus are merely illustrative, and do not limit the scope of the present invention, and in practical applications, a person skilled in the art may select some or all of the modules to implement the purpose of the embodiments according to actual needs, and the present invention is not limited herein.
In addition, the technical details that are not described in detail in this embodiment can be referred to the system for intelligently recognizing the disease based on human stool provided in any embodiment of the present invention, and are not described herein again.
The above description is only for the purpose of illustrating the preferred embodiments of the present invention and is not to be construed as limiting the invention, and any modifications, equivalents, improvements and the like that fall within the spirit and principle of the present invention are intended to be included therein.
Claims (8)
1. An intelligent cognitive disease system based on human excrement, characterized by comprising:
the feature word acquisition module is used for acquiring an initial text, establishing a TF-IDF feature word positioning algorithm, acquiring a keyword from the initial text through the TF-IDF feature word positioning algorithm, and taking the keyword as a reference feature word;
the characteristic information base establishing module is used for dividing the reference characteristic words into different categories by a classification and clustering method and generating a corresponding characteristic information classification base according to the categories;
the identification model establishing module is used for acquiring corresponding disease symptom characteristic information according to the reference characteristic words of the characteristic information classification library and generating an identification model by combining the disease symptom characteristic information and the characteristic information classification library;
and the recognition module is used for establishing a cosine similarity algorithm, acquiring the characteristic words of the human body feces to be recognized, calculating the similarity between the characteristic words of the human body feces to be recognized and the reference characteristic words in the recognition model by using the cosine similarity algorithm, and generating a corresponding cognitive report.
2. The human stool based intelligent cognitive disorder system of claim 1, wherein: the characteristic word obtaining module comprises an initial text screening module which is used for obtaining an initial text related to the excrement from the network, obtaining a local word database, screening a text description related to the excrement from the initial text according to the local word database, and taking the text description as a text to be calculated.
3. The human stool based intelligent cognitive disorder system of claim 2, wherein: the characteristic word obtaining module comprises a keyword extracting module, which is used for establishing a TF-IDF characteristic word positioning algorithm, calculating the TF-IDF values of all words in the text to be calculated through the TF-IDF characteristic word positioning algorithm, sorting all the words according to their TF-IDF values, and selecting the words whose TF-IDF values rank in the top 10 as reference characteristic words.
4. The human stool based intelligent cognitive disorder system of claim 3, wherein: the characteristic information base establishing module further comprises a characteristic information base module, which is used for obtaining the category of each reference characteristic word and classifying the reference characteristic words by category, wherein the categories of the reference characteristic words comprise the odor, the form and the color of the excrement; and for generating a set for each category according to the categories of the reference characteristic words and the corresponding reference characteristic words, and taking the sets as the characteristic information classification library.
5. The human stool based intelligent cognitive disorder system of claim 4, wherein: the identification model establishing module comprises an information matching module which is used for searching the local disease symptom characteristic information base according to the reference characteristic words in the characteristic information classification base, finding out the disease symptom characteristic information corresponding to the reference characteristic words and generating the identification model by combining the characteristic information classification base and the corresponding disease symptom characteristic information.
6. The human stool based intelligent cognitive disorder system of claim 5, wherein: the recognition module comprises a calculation recognition module used for establishing a cosine similarity algorithm, acquiring the characteristic words to be recognized of the human body feces to be recognized, calculating the similarity between the characteristic words to be recognized and the reference characteristic words through the cosine similarity algorithm, and generating a corresponding cognition report according to the similarity.
7. The human stool based intelligent cognitive disorder system of claim 6, wherein: the recognition module comprises a report generation module, which is used for setting a similarity threshold, comparing the similarity with the similarity threshold, searching for the disease symptom characteristic information corresponding to the reference characteristic words when the similarity is greater than the similarity threshold, and generating a corresponding cognitive report; and, when the similarity is smaller than the similarity threshold, reselecting a similarity for comparison.
8. An intelligent cognitive disease device based on human excrement, characterized in that the device comprises:
the feature word acquisition unit is used for acquiring an initial text, establishing a TF-IDF feature word positioning algorithm, acquiring a keyword from the initial text through the TF-IDF feature word positioning algorithm, and taking the keyword as a reference feature word;
the characteristic information base establishing unit is used for dividing the reference characteristic words into different categories by a classification and clustering method and generating a corresponding characteristic information classification base according to the categories;
the identification model establishing unit is used for acquiring corresponding disease symptom characteristic information according to the reference characteristic words of the characteristic information classification library and generating an identification model by combining the disease symptom characteristic information and the characteristic information classification library;
and the recognition unit is used for establishing a cosine similarity algorithm, acquiring the characteristic words of the human body feces to be recognized, calculating the similarity between the characteristic words of the human body feces to be recognized and the reference characteristic words in the recognition model by using the cosine similarity algorithm, and generating a corresponding cognitive report.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010853281.0A CN112002415B (en) | 2020-08-23 | 2020-08-23 | Intelligent cognitive disease system based on human excrement |
Publications (2)
Publication Number | Publication Date |
---|---|
CN112002415A (en) | 2020-11-27 |
CN112002415B (en) | 2024-03-01 |
Family
ID=73473492
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202010853281.0A Active CN112002415B (en) | 2020-08-23 | 2020-08-23 | Intelligent cognitive disease system based on human excrement |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN112002415B (en) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112786191A (en) * | 2021-01-18 | 2021-05-11 | 吾征智能技术(北京)有限公司 | Disease cognition system, equipment and storage medium based on stool convention |
CN113593696A (en) * | 2021-07-12 | 2021-11-02 | 金世柱 | Excrement self-screening diagnosis system |
Citations (11)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106372439A (en) * | 2016-09-21 | 2017-02-01 | 北京大学 | Method for acquiring and processing disease symptoms and weight knowledge thereof based on case library |
JP2017091270A (en) * | 2015-11-11 | 2017-05-25 | 大日本印刷株式会社 | Information processing device, information processing system, and program |
CN108628825A (en) * | 2018-04-10 | 2018-10-09 | 平安科技(深圳)有限公司 | Text message Similarity Match Method, device, computer equipment and storage medium |
CN109243618A (en) * | 2018-09-12 | 2019-01-18 | 腾讯科技(深圳)有限公司 | Construction method, disease label construction method and the smart machine of medical model |
CN109977422A (en) * | 2019-04-18 | 2019-07-05 | 中国石油大学(华东) | A kind of case history key message extraction model based on participle technique |
CN109977406A (en) * | 2019-03-26 | 2019-07-05 | 浙江大学 | A kind of Chinese medicine state of an illness text key word extracting method based on sick position |
CN110222713A (en) * | 2019-05-05 | 2019-09-10 | 深圳先进技术研究院 | A kind of infant's excrement sampled images specification processing system and method |
CN110335653A (en) * | 2019-06-30 | 2019-10-15 | 浙江大学 | Non-standard case history analytic method based on openEHR case history format |
WO2020007028A1 (en) * | 2018-07-04 | 2020-01-09 | 平安科技(深圳)有限公司 | Medical consultation data recommendation method, device, computer apparatus, and storage medium |
CN110705247A (en) * | 2019-08-30 | 2020-01-17 | 山东科技大学 | Based on x2-C text similarity calculation method |
CN111341437A (en) * | 2020-02-21 | 2020-06-26 | 山东大学齐鲁医院 | Digestive tract disease judgment auxiliary system based on tongue image |
Non-Patent Citations (3)
Title |
---|
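YUNG-CHI SHEN et al.: "Discovering the potential opportunities of scientific advancement and technological innovation: A case study of smart health monitoring technology", TECHNOLOGICAL FORECASTING & SOCIAL CHANGE, pages 281 - 283 * |
武永亮; 赵书良; 李长镜; 魏娜娣; 王子晏: "Text classification method based on TF-IDF and cosine similarity", Journal of Chinese Information Processing (中文信息学报), no. 05, pages 143 - 150 * |
聂秀萍 et al.: "Analysis of research hotspot topics of foreign agricultural research projects based on text mining", Acta Agriculturae Jiangxi (江西农业学报), 22 July 2018 (2018-07-22), pages 102 - 106 * |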
Also Published As
Publication number | Publication date |
---|---|
CN112002415B (en) | 2024-03-01 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||