CN112017774B - Method and system for constructing disease prediction model based on halitosis accompanying symptoms - Google Patents

Info

Publication number
CN112017774B
Authority
CN
China
Prior art keywords
halitosis
disease
knowledge database
symptoms
symptom
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010901806.3A
Other languages
Chinese (zh)
Other versions
CN112017774A (en
Inventor
杜乐
杜登斌
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Wuzheng Intelligent Technology Beijing Co ltd
Original Assignee
Wuzheng Intelligent Technology Beijing Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Wuzheng Intelligent Technology Beijing Co ltd filed Critical Wuzheng Intelligent Technology Beijing Co ltd
Priority to CN202010901806.3A priority Critical patent/CN112017774B/en
Publication of CN112017774A publication Critical patent/CN112017774A/en
Application granted granted Critical
Publication of CN112017774B publication Critical patent/CN112017774B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H50/00 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
    • G16H50/20 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for computer-aided diagnosis, e.g. based on medical expert systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/3331 Query processing
    • G06F16/334 Query execution
    • G06F16/3344 Query execution using natural language analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/35 Clustering; Classification
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/23 Clustering techniques
    • G06F18/232 Non-hierarchical techniques
    • G06F18/2321 Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions
    • G06F18/23213 Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions with fixed number of clusters, e.g. K-means clustering
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/279 Recognition of textual entities
    • G06F40/289 Phrasal analysis, e.g. finite state techniques or chunking
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/30 Semantic analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N5/00 Computing arrangements using knowledge-based models
    • G06N5/04 Inference or reasoning models
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02A TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
    • Y02A90/00 Technologies having an indirect contribution to adaptation to climate change
    • Y02A90/10 Information and communication technologies [ICT] supporting adaptation to climate change, e.g. for weather forecasting or climate simulation

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • Health & Medical Sciences (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Databases & Information Systems (AREA)
  • Evolutionary Computation (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Evolutionary Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Biomedical Technology (AREA)
  • Medical Informatics (AREA)
  • Public Health (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Pathology (AREA)
  • Primary Health Care (AREA)
  • Epidemiology (AREA)
  • Probability & Statistics with Applications (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Medical Treatment And Welfare Office Work (AREA)

Abstract

The application relates to a method and a system for constructing a disease prediction model based on halitosis accompanying symptoms, wherein the construction method comprises the following steps: on the premise of establishing halitosis as a first clinical complaint symptom, establishing a first disease knowledge database of halitosis accompanying symptoms and corresponding disease information; classifying the first disease knowledge database into pathological and non-pathological halitosis symptoms by a clustering method until each centroid no longer changes, so as to obtain a second disease knowledge database; extracting feature vectors from the second disease knowledge database, and establishing a feature vector set; and calculating the semantic similarity between the symptom features and the corresponding diseases to obtain a model. The application mines the disease database of halitosis accompanying symptoms in depth through k-means clustering, feature extraction and semantic similarity, makes full use of the value of medical data, provides ordinary people with a simple self-checking channel for halitosis accompanying symptoms, and reduces the consultation pressure on traditional medical institutions.

Description

Method and system for constructing disease prediction model based on halitosis accompanying symptoms
Technical Field
The application relates to the technical fields of intelligent medical treatment and medical information, relates to a method and a system for constructing a disease prediction model, and particularly relates to a method and a system for constructing a disease prediction model based on halitosis accompanying symptoms.
Background
Bad breath (halitosis) refers to malodor emanating from the mouth or other air-filled cavities such as the nose, sinuses and pharynx, and is a common condition in daily life. The accompanying symptoms of halitosis generally include: thick and greasy tongue coating, dry mouth, bitter taste, shortness of breath, chest distress, gastrointestinal discomfort, abdominal distension, frequent urination, constipation, loose stool, soreness of waist and knees, limb numbness and pain, proneness to internal heat (especially during the menstrual period), sweaty palms and soles, frequent fever, easy fatigue, frequent colds, dysphoria, insomnia, listlessness, dizziness, dry hair, tinnitus and the like. In fact, halitosis is not an independent disease, but a warning signal sent by the body.
Currently, methods for examining halitosis generally include: visual inspection, microbiological examination, renal function tests, blood sugar tests, X-ray film examination, X-ray barium meal examination, gastroscopy, rhinoscopy and the like. These methods are time-consuming, laborious and complex, and are not necessarily accurate.
Because the causes of halitosis are complicated, the underlying disease may be serious or minor, and patients often either pay too little attention or worry too much. An accurate diagnosis requires detailed examination, which places time and economic pressure on ordinary patients; on the other hand, if a large number of patients with mild conditions go to medical institutions for examination, medical resources are strained.
Disclosure of Invention
In order to solve the problem of disease prediction of the symptom accompanied by halitosis and reduce the time and economic pressure of patients, the application provides a method and a system for constructing a disease prediction model based on the symptom accompanied by halitosis, wherein the construction method comprises the following steps: on the premise of establishing halitosis as a first clinical complaint symptom, establishing a first disease knowledge database of halitosis accompanying symptoms and corresponding disease information; classifying the first disease knowledge database according to pathological halitosis symptoms, non-pathological halitosis symptoms and a clustering method until each clustering center is not changed any more, so as to obtain a second disease knowledge database; extracting feature vectors from the second disease knowledge database, and establishing a feature vector set containing symptom features and diseases corresponding to the symptom features; and calculating the semantic similarity between symptom features and the corresponding diseases, and sequencing the feature vector sets according to the semantic similarity.
In some embodiments of the present application, the classifying the first disease knowledge database according to pathological and non-pathological halitosis symptoms and clustering methods until each cluster center is no longer changed, and obtaining the second disease knowledge database includes the steps of:
classifying the first disease knowledge database according to the pathological halitosis symptoms and the non-pathological halitosis symptoms by a k-means clustering method and Euclidean distance until each centroid is not changed, and obtaining a second disease knowledge database.
In some embodiments of the application, said extracting feature vectors from said second disease knowledge database is performed by TF-IDF algorithm.
In some embodiments of the application, the semantic similarity between the symptom feature and its corresponding disease is characterized by a cosine distance.
The system based on the disease prediction model of halitosis accompanying symptoms comprises an acquisition module, a storage module, a prediction model and a calculation module, wherein the acquisition module is used for acquiring the halitosis accompanying symptoms and corresponding disease information of the person to be tested, as well as the user's feedback on the prediction result of the prediction model; the storage module is used for storing the halitosis accompanying symptoms, the corresponding disease information and the first disease knowledge database; the prediction model is used for matching, according to the halitosis accompanying symptom information of the person to be tested, the disease feature vector set corresponding to that information; the calculation module is used for calculating the distance between the halitosis accompanying symptom information and the corresponding disease feature vector, and outputting a disease prediction result according to the user's requirements.
In some embodiments of the application, the calculation module calculates the distance between the halitosis-associated symptom information and its corresponding disease feature vector by cosine distance.
In some embodiments of the present application, the prediction model includes a model constructed by a method for constructing a disease prediction model based on symptoms associated with halitosis according to the object of the first aspect of the present application.
Further, the storage module updates the first disease knowledge database according to user feedback.
In a third aspect of the present application, there is provided a computer-readable storage medium storing a computer program which, when executed by a processor, implements the steps of the method for constructing a disease prediction model based on symptoms associated with halitosis according to the first aspect of the present application.
The beneficial effects of the application are as follows: according to the application, the disease database of the halitosis syndrome is classified by a k-means clustering method, and then the disease database of the halitosis syndrome is further mined by feature extraction and semantic similarity, so that the value of medical data is fully exerted, a simple self-checking channel of the halitosis syndrome is provided for common people, and the pressure of inquiry of a traditional medical institution and the economic burden of patients are reduced.
Drawings
FIG. 1 shows a method of constructing a disease prediction model based on halitosis accompanying symptoms in some embodiments of the application;
FIG. 2 is a basic block diagram of a system based on a disease prediction model of halitosis accompanying symptoms in some embodiments of the application.
Detailed Description
The principles and features of the present application are described below with reference to the drawings. The examples are given for the purpose of illustrating the application and are not to be construed as limiting its scope.
The application provides a method for constructing a disease prediction model based on halitosis accompanying symptoms, which comprises the following steps: S101, on the premise of establishing halitosis as a first clinical complaint symptom, establishing a first disease knowledge database of halitosis accompanying symptoms and corresponding disease information; S102, classifying the first disease knowledge database according to pathological halitosis symptoms, non-pathological halitosis symptoms and a clustering method until each cluster center no longer changes, and obtaining a second disease knowledge database; S103, extracting feature vectors from the second disease knowledge database, and establishing a feature vector set containing symptom features and the diseases corresponding to the symptom features; S104, calculating the semantic similarity between symptom features and their corresponding diseases, and sorting the feature vector set by semantic similarity.
It should be noted that common clustering algorithms include K-Means, DBSCAN, BIRCH and MeanShift. Preferably, the method classifies the samples of the first disease knowledge database by the k-means clustering method, using the Euclidean distance as the distance measure between samples.
The specific calculation formula is as follows:

$$d(X, Y) = \sqrt{\sum_{i=1}^{n} (x_i - y_i)^2}$$

wherein $X$ and $Y$ represent two samples of halitosis accompanying symptoms in the first disease knowledge database, $x_i$ and $y_i$ represent the $i$-th symptom of $X$ and $Y$ respectively, and $n$ is the number of symptoms in each sample.
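As a minimal sketch of the Euclidean distance used above, the following Python function compares two symptom records. The binary encoding (1 if a symptom is present, 0 otherwise) and the chosen symptom columns are assumptions made for illustration; the application does not specify how symptoms are encoded numerically.

```python
import math


def euclidean_distance(x, y):
    """Euclidean distance between two equal-length numeric vectors."""
    if len(x) != len(y):
        raise ValueError("vectors must have the same length")
    return math.sqrt(sum((xi - yi) ** 2 for xi, yi in zip(x, y)))


# Hypothetical encoding: one column per symptom, 1 = present, 0 = absent.
# Assumed columns: dry mouth, bitter taste, gingival bleeding, constipation
record_a = [1, 1, 0, 1]
record_b = [1, 0, 1, 1]

# The records differ in two symptoms, so the distance is sqrt(2).
print(euclidean_distance(record_a, record_b))
```

Records that share more symptoms lie closer together under this measure, which is what the clustering step relies on.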
The symptoms of non-pathological halitosis can be improved, relieved or even eliminated. Classifying the accompanying symptoms into pathological and non-pathological halitosis therefore relieves the psychological and economic burden of patients with mild conditions and, on the other hand, reminds patients with pathological halitosis in time so that they take the condition seriously.
Specifically, common pathological diseases of halitosis are: gingivitis, periodontitis, periodontal abscess, dental caries, pulpitis, stomatocace, suppurative tonsillitis, suppurative sinusitis, gastritis, gastric cancer, pyloric obstruction, uremia, diabetic ketoacidosis, etc. In periodontitis, the early symptoms are not obvious. As the disease progresses, halitosis may appear together with periodontal pockets, periodontal pyorrhea and loose teeth, accompanied by symptoms such as weak bite, dull pain and gingival bleeding. In gingivitis, in addition to bad breath, gingival bleeding may occur when brushing the teeth or biting hard objects. The free gingiva and the gingival papilla are locally bright red or dark red, and in severe cases the inflammatory congestion may extend to the attached gingiva. In addition, the gingival tissue is swollen, the gingival margin is thickened, the interdental papilla becomes round and blunt, the free gingiva and the gingival papilla no longer cling to the tooth surface, the stippling disappears, and the surface is shiny. The gums become soft, fragile and inelastic, or become firm and hypertrophic. The probing depth of the gingival sulcus can exceed 3 mm. Gentle probing of the gingival sulcus causes bleeding, and the exudate in the gingival sulcus increases. In suppurative tonsillitis, besides halitosis, the main symptom is tonsil swelling with severe sore throat. The sore throat begins on one side and then develops into evident pain on both sides of the pharynx, which is exacerbated by swallowing. The pain may radiate to the ear. Patients may have symptoms such as aversion to cold, high fever, headache, poor appetite, limb muscle soreness, fatigue and weakness, general discomfort, constipation and the like.
First), manifestations of halitosis symptoms caused by immune or visceral dysfunction: in addition to the obvious sign of bad breath, one or more of the following symptoms may appear, depending on the individual patient: thick and greasy tongue coating, dry mouth, bitter taste, shortness of breath, chest distress, gastrointestinal discomfort, abdominal distension, frequent urination, constipation, loose stool, soreness of waist and knees, limb numbness and pain, proneness to internal heat (especially during the menstrual period), sweaty palms and soles, frequent fever, easy fatigue, frequent colds, dysphoria, insomnia, listlessness, dizziness, dry hair, tinnitus and other symptoms. Specific diseases of this class are generally: gastrointestinal diseases such as peptic ulcer, chronic gastritis and functional dyspepsia; helicobacter pylori infection; reduced salivary secretion caused by dieting for weight loss, by the inability of the elderly to eat, or by endocrine disorders in women during menstruation, which favors the growth of anaerobic bacteria and thus generates halitosis; and ovarian dysfunction in some adolescent females, in whom a low sex hormone level reduces the resistance of the oral tissues so that bacterial infection occurs easily and halitosis is generated. Second), manifestations of simple oral malodor disorder (non-pathological halitosis symptoms): besides obvious bad breath, there may be swelling and pain of the mouth and gums, local fever and the like. Specific diseases are generally: dental caries, gingivitis, periodontitis, oral mucositis, periodontal disease and other oral diseases, in which bacteria, especially anaerobic bacteria, grow easily in the oral cavity; the sulfides produced by bacterial decomposition give off an unpleasant odor, thereby generating halitosis.
In step S102 of some embodiments of the present application, the clustering method is a k-means clustering method.
In step S102 of some embodiments of the present application, the classifying the first disease knowledge database according to pathological halitosis symptoms and non-pathological halitosis symptoms and clustering methods until each clustering center is not changed, and obtaining a second disease knowledge database includes the steps of:
classifying the first disease knowledge database according to the pathological halitosis symptoms and the non-pathological halitosis symptoms by a k-means clustering method and Euclidean distance until each centroid is not changed, and obtaining a second disease knowledge database.
Specifically, the k-means clustering method comprises the following steps:
1) First, determine the value of k, i.e., the number of sets (clusters) that the dataset is to be divided into;
2) Randomly select k data points from the first disease dataset as the initial centroids;
3) For each point in the dataset (corresponding to one halitosis accompanying symptom record), calculate its distance to each centroid and assign it to the set of the nearest centroid;
4) Once all data points have been assigned, there are k sets in total; then recompute the centroid of each set;
5) If the distance between each newly calculated centroid and the original centroid is less than a set threshold (indicating that the position of the recalculated centroid changes little, stabilizes, or converges), the clustering is considered to have reached the desired result and the algorithm terminates;
6) If the distance between the new centroid and the original centroid is still large, repeat steps 3) to 5).
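The k-means steps described above can be sketched in plain Python as follows. This is an illustrative sketch, not the applicant's implementation: the function names, the tolerance `tol`, the iteration cap `max_iter`, the fixed random seed and the toy two-dimensional data are all assumptions.

```python
import math
import random


def euclidean(p, q):
    """Euclidean distance between two points."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def mean_point(points):
    """Coordinate-wise mean of a non-empty list of points."""
    n = len(points)
    return [sum(coords) / n for coords in zip(*points)]


def k_means(points, k, tol=1e-6, max_iter=100, seed=0):
    """Return (centroids, labels) for the given points and cluster count k."""
    rng = random.Random(seed)
    # Step 2: pick k data points at random as the initial centroids.
    centroids = [list(p) for p in rng.sample(points, k)]
    labels = []
    for _ in range(max_iter):
        # Step 3: assign every point to its nearest centroid.
        clusters = [[] for _ in range(k)]
        labels = []
        for p in points:
            idx = min(range(k), key=lambda c: euclidean(p, centroids[c]))
            clusters[idx].append(p)
            labels.append(idx)
        # Step 4: recompute centroids (an empty cluster keeps its old one).
        new_centroids = [
            mean_point(members) if members else centroids[i]
            for i, members in enumerate(clusters)
        ]
        # Step 5: stop once no centroid moved by more than the threshold.
        shift = max(euclidean(a, b) for a, b in zip(centroids, new_centroids))
        centroids = new_centroids
        if shift < tol:
            break
        # Step 6: otherwise loop back to the assignment step.
    return centroids, labels


# Toy data (hypothetical): two well-separated groups of records.
points = [[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10]]
centroids, labels = k_means(points, k=2)
print(labels)
```

With k = 2, the two resulting clusters would play the role of the pathological and non-pathological groups, although which cluster index corresponds to which group still has to be decided from the labelled disease information.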
In step S103 of some embodiments of the present application, the extracting of feature vectors from the second disease knowledge database is implemented by the TF-IDF algorithm. The main idea of TF-IDF is: if a word or phrase appears with a high term frequency (TF) in one article and rarely appears in other articles, it is considered to have good category discrimination and to be suitable for classification. In the present application, a word or phrase corresponds to a halitosis accompanying symptom and its corresponding disease.
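The term frequency (TF) represents the frequency with which a term (keyword) appears in a text. This number is usually normalized (typically the term count divided by the total number of terms in the document) to prevent a bias toward long documents.
The formula is:

$$\mathrm{tf}_{i,j} = \frac{n_{i,j}}{\sum_{k} n_{k,j}}$$

where $n_{i,j}$ is the number of occurrences of the term $t_i$ in the file $d_j$, and the denominator is the total number of occurrences of all terms in $d_j$. Namely:

$$\mathrm{TF} = \frac{\text{number of occurrences of the term in the document}}{\text{total number of terms in the document}}$$

For the inverse document frequency (IDF), the fewer documents contain the term t, the larger the IDF and the better the category discrimination of the term. The formula is:

$$\mathrm{idf}_{i} = \log \frac{|D|}{|\{j : t_i \in d_j\}|}$$

where $|D|$ is the total number of files in the corpus and $|\{j : t_i \in d_j\}|$ is the number of files containing the term $t_i$ (i.e., the number of files for which $n_{i,j} \neq 0$). If the term is not in the corpus, the denominator would be zero, so $1 + |\{j : t_i \in d_j\}|$ is typically used. Namely:

$$\mathrm{IDF} = \log \frac{\text{total number of documents in the corpus}}{\text{number of documents containing the term} + 1}$$

The 1 is added to the denominator so that it can never be 0.
A high term frequency within a particular document, together with a low document frequency of that term across the whole document collection, yields a high TF-IDF weight. Thus, TF-IDF tends to filter out common words and preserve important words. The formula is: TF-IDF = TF × IDF.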
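A minimal sketch of the TF-IDF weighting described above, using the variant with 1 added to the denominator of the IDF. Treating each disease record as a "document" and each symptom as a "term" is an assumption made for illustration, as are the function name and the sample records.

```python
import math
from collections import Counter


def tf_idf(documents):
    """documents: list of token lists. Returns one {term: weight} dict per document."""
    n_docs = len(documents)
    # Document frequency: number of documents that contain each term.
    doc_freq = Counter()
    for doc in documents:
        doc_freq.update(set(doc))
    weights = []
    for doc in documents:
        counts = Counter(doc)
        total = len(doc)
        weights.append({
            # TF = count / total terms; IDF = log(N / (1 + document frequency))
            term: (count / total) * math.log(n_docs / (1 + doc_freq[term]))
            for term, count in counts.items()
        })
    return weights


# Hypothetical symptom records, one per disease entry.
records = [
    ["halitosis", "gingival_bleeding", "loose_teeth"],
    ["halitosis", "dry_mouth", "bitter_taste"],
    ["halitosis", "sore_throat", "fever"],
]
weights = tf_idf(records)
print(weights[0])
```

Note that with this smoothed IDF, a term that occurs in every record (here "halitosis") receives a slightly negative weight, so it is ranked below the symptoms that actually distinguish one record from another.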
In step S104 of some embodiments of the present application, the semantic similarity between the symptom feature and its corresponding disease is characterized by a cosine distance.
Specifically, if there are two vectors in n-dimensional space, $a = (a_1, a_2, a_3, \ldots, a_n)$ and $b = (b_1, b_2, b_3, \ldots, b_n)$, then according to the dot product formula:

$$\cos\theta = \frac{a \cdot b}{\|a\|\,\|b\|} = \frac{\sum_{i=1}^{n} a_i b_i}{\sqrt{\sum_{i=1}^{n} a_i^2}\,\sqrt{\sum_{i=1}^{n} b_i^2}}$$

Here the vector a and the vector b correspond to two sets of halitosis accompanying symptom feature vectors, or to two sets of disease feature vectors corresponding to halitosis accompanying symptoms. If the included angle is 0 degrees, the directions are identical; if it is 90 degrees, the vectors are at a right angle and the directions are completely dissimilar; if it is 180 degrees, the directions are exactly opposite. Therefore, the degree of similarity of the vectors can be judged by the size of the included angle: the smaller the angle, the more similar the vectors.
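The cosine measure and the sorting by similarity in step S104 can be sketched as follows. The candidate names, the vectors and the convention of returning 0.0 for a zero vector are assumptions made for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length numeric vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        # Convention chosen here: a zero vector is similar to nothing.
        return 0.0
    return dot / (norm_a * norm_b)


def rank_by_similarity(query, candidates):
    """Sort (name, similarity) pairs from most to least similar to the query."""
    scored = [(name, cosine_similarity(query, vec)) for name, vec in candidates.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Hypothetical feature vectors for a symptom query and three disease entries.
query = [1, 1, 0]
candidates = {
    "disease_a": [1, 1, 0],
    "disease_b": [1, 0, 1],
    "disease_c": [0, 0, 1],
}
ranked = rank_by_similarity(query, candidates)
print(ranked)
```

A larger cosine value means a smaller included angle, so the first entry of the ranked list is the candidate most similar to the query.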
The application provides a system based on a disease prediction model of halitosis accompanying symptoms, which comprises an acquisition module 11, a storage module 12, a prediction model 13 and a calculation module 14, wherein the acquisition module 11 is used for acquiring the halitosis accompanying symptoms and corresponding disease information of the person to be tested, as well as the user's feedback on the prediction result of the prediction model; the storage module 12 is used for storing the halitosis accompanying symptoms, the corresponding disease information and the first disease knowledge database; the prediction model 13 is configured to match, according to the halitosis accompanying symptom information of the person to be tested, the disease feature vector set corresponding to that information; the calculation module 14 is configured to calculate the distance between the halitosis accompanying symptom information and the corresponding disease feature vector, and to output a disease prediction result according to the user's requirements. It should be noted that the above matching may be forward matching, reverse matching, exact matching or fuzzy matching; the acquisition module may offer this choice to the user so that different results are output, meeting the user's individual needs for understanding halitosis accompanying symptoms or the corresponding diseases.
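As a rough sketch of how the modules described above might cooperate, the following Python code stores a small knowledge base, scores diseases against a user's symptoms, and updates the knowledge base from feedback. The class names, method names and sample entries are hypothetical, and the scoring uses the cosine similarity of binary symptom-presence vectors, which for two symptom sets A and B equals |A ∩ B| / sqrt(|A| · |B|).

```python
import math
from dataclasses import dataclass, field


@dataclass
class StorageModule:
    # Maps a disease name to the set of accompanying symptoms known for it.
    knowledge: dict = field(default_factory=dict)

    def update_from_feedback(self, disease, symptoms):
        """Merge user-confirmed symptoms into the knowledge base."""
        self.knowledge.setdefault(disease, set()).update(symptoms)


@dataclass
class PredictionSystem:
    storage: StorageModule

    def predict(self, symptoms, top_n=3):
        """Return up to top_n (disease, score) pairs, best match first."""
        symptoms = set(symptoms)
        scored = []
        for disease, known in self.storage.knowledge.items():
            overlap = len(symptoms & known)
            if overlap:
                # Cosine similarity of the two binary symptom vectors.
                score = overlap / math.sqrt(len(symptoms) * len(known))
                scored.append((disease, score))
        scored.sort(key=lambda item: item[1], reverse=True)
        return scored[:top_n]


# Hypothetical knowledge base entries.
storage = StorageModule({
    "gingivitis": {"gingival bleeding", "gum swelling"},
    "suppurative tonsillitis": {"sore throat", "fever", "headache"},
})
system = PredictionSystem(storage)
before = system.predict(["gingival bleeding", "gum swelling"])

# Feedback adds a new entry, which changes later predictions.
storage.update_from_feedback("periodontitis", {"gingival bleeding", "loose teeth"})
after = system.predict(["gingival bleeding", "gum swelling"])
print(before, after)
```

Keeping the knowledge base in a separate storage object mirrors the described division of modules: feedback only touches storage, and the next prediction picks up the change without retraining anything in this simplified sketch.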
In some embodiments of the present application, the calculation module 14 calculates the distance between the halitosis-associated symptom information and its corresponding disease feature vector by cosine distance.
In some embodiments of the present application, the prediction model 13 includes a model constructed by a method for constructing a disease prediction model based on symptoms associated with halitosis according to the object of the first aspect of the present application.
Further, the storage module 12 updates the first disease knowledge database according to the feedback of the user.
In the several embodiments provided by the present application, it should be understood that the disclosed systems, devices, and methods may be implemented in other manners. For example, the embodiments of the apparatus described above are merely illustrative: the division into units is merely a logical functional division, and other divisions are possible in an actual implementation, e.g., multiple units or components may be combined or integrated into another system, or some features may be omitted or not performed. In addition, the couplings, direct couplings or communication connections shown or discussed may be indirect couplings or communication connections via some interfaces, devices or units, and may be electrical, mechanical or of another form.
The units described above as separate components may or may not be physically separate, and components shown as units may or may not be physical units; they may be located in one place or distributed over a plurality of network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution in this embodiment. In addition, the functional units in the embodiments of the present application may be integrated in one processing unit, each unit may exist alone physically, or two or more units may be integrated in one unit. For example, a current GPU (graphics processing unit) usually has storage and calculation functions, so the acquisition module 11, the storage module 12, the prediction model 13 and the calculation module 14 may be integrated on one GPU or on multiple GPUs, or the functional modules may be carried on different servers.
In a third aspect of the present application, there is provided a computer-readable storage medium storing a computer program which, when executed by a processor, implements the steps of the method for constructing a disease prediction model based on symptoms associated with halitosis according to the first aspect of the present application. For example, on the premise of establishing halitosis as a first clinical complaint symptom, a first disease knowledge database of halitosis-associated symptoms and corresponding disease information is established; classifying the first disease knowledge database according to pathological halitosis symptoms, non-pathological halitosis symptoms and a clustering method until each clustering center is not changed any more, so as to obtain a second disease knowledge database; extracting feature vectors from the second disease knowledge database, and establishing a feature vector set containing symptom features and diseases corresponding to the symptom features; and calculating the semantic similarity between symptom features and the corresponding diseases, and sequencing the feature vector sets according to the semantic similarity.
The foregoing description of the preferred embodiments of the application is not intended to limit the application to the precise form disclosed, and any such modifications, equivalents, and alternatives falling within the spirit and scope of the application are intended to be included within the scope of the application.

Claims (4)

1. The method for constructing the disease prediction model based on the symptom accompanied by halitosis is characterized by comprising the following steps:
on the premise of establishing halitosis as a first clinical complaint symptom, establishing a first disease knowledge database of halitosis accompanying symptoms and corresponding disease information;
classifying the first disease knowledge database according to pathological halitosis symptoms, non-pathological halitosis symptoms and clustering methods until each clustering center is not changed, and obtaining a second disease knowledge database:
classifying the first disease knowledge database according to pathological halitosis symptoms and non-pathological halitosis symptoms by a k-means clustering method and Euclidean distance until each centroid is not changed any more, so as to obtain a second disease knowledge database;
extracting feature vectors from the second disease knowledge database through a TF-IDF algorithm, and establishing a feature vector set containing symptom features and diseases corresponding to the symptom features;
calculating the semantic similarity between symptom features and the corresponding diseases, and sequencing the feature vector sets according to the semantic similarity, wherein the semantic similarity between symptom features and the corresponding diseases is characterized by cosine distance.
2. A system based on a disease prediction model of halitosis accompanying symptoms, characterized by comprising an acquisition module, a storage module, a prediction model and a calculation module, wherein:
the acquisition module is used for acquiring the halitosis accompanying symptoms and corresponding disease information of a subject, and the user's feedback on the prediction results of the prediction model;
the storage module is used for storing the halitosis accompanying symptoms, the corresponding disease information and a first disease knowledge database;
the prediction model is used for matching, according to the halitosis accompanying symptom information of the subject, the disease feature vector set corresponding to that information; the prediction model is constructed by the following steps:
taking halitosis as the first clinical chief complaint, establishing the first disease knowledge database of halitosis accompanying symptoms and corresponding disease information;
classifying the first disease knowledge database according to pathological halitosis symptoms and non-pathological halitosis symptoms by a clustering method until no cluster center changes any further, thereby obtaining a second disease knowledge database, specifically: classifying the first disease knowledge database according to pathological halitosis symptoms and non-pathological halitosis symptoms by a k-means clustering method using Euclidean distance until no centroid changes any further, thereby obtaining the second disease knowledge database;
extracting feature vectors from the second disease knowledge database with a TF-IDF algorithm, and establishing a feature vector set containing symptom features and the diseases corresponding to those symptom features;
calculating the semantic similarity between the symptom features and the corresponding diseases, and sorting the feature vector set by semantic similarity, wherein the semantic similarity between the symptom features and the corresponding diseases is characterized by cosine distance;
the calculation module is used for performing calculations on the subject's halitosis accompanying symptom information and the corresponding disease feature vectors, and outputting a disease prediction result according to the user's requirements.
3. The system of claim 2, wherein the storage module updates the first disease knowledge database based on user feedback.
4. A computer-readable storage medium storing a computer program, wherein the computer program, when executed by a processor, implements the steps of the method for constructing a disease prediction model based on halitosis accompanying symptoms as claimed in claim 1.
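
The steps recited in claim 1 (k-means clustering with Euclidean distance, TF-IDF feature extraction, and ranking by cosine similarity) can be illustrated with a minimal Python sketch. It is not the patented implementation. The toy records, the function names (`fit_idf`, `vectorize`, `kmeans`, `rank_diseases`), the smoothed IDF formula, the choice of two seed centroids, and the decision to vectorize records with TF-IDF before clustering are all assumptions made for illustration; the claims do not specify how records are represented numerically for k-means. The sketch reports cosine similarity; the cosine distance named in the claims is 1 minus that value.

```python
import math
from collections import Counter

# Hypothetical toy records (disease, accompanying symptoms); illustrative only, not clinical data.
RECORDS = [
    ("periodontitis", ["gum_bleeding", "loose_teeth", "gum_swelling"]),
    ("gingivitis", ["gum_bleeding", "gum_swelling"]),
    ("gastroesophageal_reflux", ["acid_reflux", "heartburn"]),
    ("gastritis", ["acid_reflux", "stomach_pain"]),
]

def fit_idf(docs):
    """Learn a sorted vocabulary and smoothed IDF weights (assumed formula)."""
    vocab = sorted({term for doc in docs for term in doc})
    df = Counter(term for doc in docs for term in set(doc))  # document frequency
    n = len(docs)
    idf = {t: math.log((1 + n) / (1 + df[t])) + 1.0 for t in vocab}
    return vocab, idf

def vectorize(doc, vocab, idf):
    """TF-IDF vector of one symptom list; terms outside the vocabulary are ignored."""
    counts = Counter(doc)
    total = len(doc) or 1  # avoid division by zero for an empty list
    return [counts[t] / total * idf[t] for t in vocab]

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def kmeans(points, initial_centroids, max_iter=100):
    """Plain k-means: iterate until no centroid changes (or max_iter is hit)."""
    centroids = [list(c) for c in initial_centroids]
    labels = []
    for _ in range(max_iter):
        # Assign each point to its nearest centroid by Euclidean distance.
        labels = [min(range(len(centroids)), key=lambda k: euclidean(p, centroids[k]))
                  for p in points]
        new_centroids = []
        for k, old in enumerate(centroids):
            members = [p for p, lab in zip(points, labels) if lab == k]
            # An empty cluster keeps its previous centroid.
            new_centroids.append(
                [sum(col) / len(members) for col in zip(*members)] if members else old)
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return labels, centroids

def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def rank_diseases(query_vec, diseases, vectors):
    """Return (similarity, disease) pairs sorted from most to least similar."""
    scored = [(cosine_similarity(query_vec, v), d) for d, v in zip(diseases, vectors)]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)

diseases = [d for d, _ in RECORDS]
symptom_lists = [s for _, s in RECORDS]
vocab, idf = fit_idf(symptom_lists)
vectors = [vectorize(s, vocab, idf) for s in symptom_lists]

# Two clusters, seeded with the first and third records (an arbitrary choice).
labels, centroids = kmeans(vectors, [vectors[0], vectors[2]])

# Rank the stored records against one subject's reported accompanying symptoms.
query = vectorize(["gum_bleeding", "loose_teeth"], vocab, idf)
ranking = rank_diseases(query, diseases, vectors)
print(labels)
print([(disease, round(score, 3)) for score, disease in ranking])
```

In this sketch the clustering step only groups similar records, and the ranking step then orders candidate diseases for a query. A production system would additionally need a curated knowledge database and a principled centroid initialization.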

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010901806.3A CN112017774B (en) 2020-08-31 2020-08-31 Method and system for constructing disease prediction model based on halitosis accompanying symptoms

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010901806.3A CN112017774B (en) 2020-08-31 2020-08-31 Method and system for constructing disease prediction model based on halitosis accompanying symptoms

Publications (2)

Publication Number Publication Date
CN112017774A (en) 2020-12-01
CN112017774B (en) 2023-10-03

Family

ID=73516429

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010901806.3A Active CN112017774B (en) 2020-08-31 2020-08-31 Method and system for constructing disease prediction model based on halitosis accompanying symptoms

Country Status (1)

Country Link
CN (1) CN112017774B (en)

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103902793A (en) * 2012-12-26 2014-07-02 深圳循证医学信息技术有限公司 Intelligent disease diagnosis and treatment device and system
CN106372439A (en) * 2016-09-21 2017-02-01 北京大学 Method for acquiring and processing disease symptoms and weight knowledge thereof based on case library
CN106951719A (en) * 2017-04-10 2017-07-14 荣科科技股份有限公司 The construction method and constructing system of clinical diagnosis model, clinical diagnosing system
KR101788030B1 (en) * 2016-06-15 2017-11-15 주식회사 카이아이컴퍼니 System and method for risk diagnosis on oral disease and oral care
KR101875306B1 (en) * 2017-01-11 2018-07-05 전북대학교산학협력단 System for providing disease information using cluster of medicine teminologies
CN108699586A (en) * 2016-03-07 2018-10-23 优比欧迈公司 For characterizing and the method and system of mouth associated disease
CN111063434A (en) * 2019-12-26 2020-04-24 北京中润普达信息技术有限公司 Venereal disease diagnosis system based on clinical symptom characteristics
CN111599463A (en) * 2020-05-09 2020-08-28 吾征智能技术(北京)有限公司 Intelligent auxiliary diagnosis system based on sound cognition model

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8504343B2 (en) * 2007-01-31 2013-08-06 University Of Notre Dame Du Lac Disease diagnoses-bases disease prediction
US10246753B2 (en) * 2015-04-13 2019-04-02 uBiome, Inc. Method and system for characterizing mouth-associated conditions
US20190155993A1 (en) * 2017-11-20 2019-05-23 ThinkGenetic Inc. Method and System Supporting Disease Diagnosis

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
"Association Between Halitosis Diagnosed by a Questionnaire and Halimeter and Symptoms of Gastroesophageal Reflux Disease";Lee, Hyo-Jung,et al.;《Association Between Halitosis Diagnosed by a Questionnaire and Halimeter and Symptoms of Gastroesophageal Reflux Disease》;第20卷(第4期);483-490 *
"New Progress in the Etiology and Diagnostic Methods of Halitosis"; 叶玮 (Ye Wei); 《Proceedings of the 19th China International Dental Equipment Exhibition and Symposium》; 59 *

Also Published As

Publication number Publication date
CN112017774A (en) 2020-12-01

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant