CN112269880A - Sweet text classification matching system based on linear function - Google Patents
Sweet text classification matching system based on linear function
- Publication number
- CN112269880A (application CN202011217922.XA)
- Authority
- CN
- China
- Prior art keywords
- sweet
- feature vector
- vector set
- matching
- classification
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/35—Clustering; Classification
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/22—Matching criteria, e.g. proximity measures
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/10—Text processing
- G06F40/194—Calculation of difference between files
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/205—Parsing
- G06F40/216—Parsing using statistical methods
Abstract
The invention provides a linear function-based classification and matching system for sweet text. The system comprises: an acquisition module for acquiring feature information of the sweet taste and establishing a sweet feature vector set from that information; a classification module for establishing a linear function classification method, classifying the sweet feature vector set with it to establish a traditional Chinese medicine sweet feature vector set and a western medicine sweet feature vector set, and merging the two sets into a sweet feature vector matching model; a calculation module for establishing a TF-IDF algorithm, acquiring the sweet text information to be matched, selecting sweet feature words from it, and establishing a sweet feature vector set to be matched; and a matching module for calculating the similarity between the sweet feature vector matching model and the sweet feature vector set to be matched by means of the Jaccard similarity coefficient and generating a matching report according to the similarity. By combining the linear function classification method, the TF-IDF algorithm and the Jaccard similarity coefficient, the text information can be matched accurately, which improves the accuracy of the whole matching process.
Description
Technical Field
The invention relates to the field of artificial intelligence, in particular to a sweet text classification matching system based on a linear function.
Background
As the saying goes, "the nose smells odors and the tongue tastes the five flavors." Sour, sweet, bitter, spicy and salty taste information is picked up by the small papillae densely distributed on the surface of the tongue and by the taste cells known as taste buds; it then excites the taste center of the cerebral cortex, and the whole taste analysis activity is completed through the feedback loop of the neurohumoral system. Some people, however, notice an abnormal taste in the mouth while eating or even when not eating, which often indicates that a disease may be present.
At present, sweet text information is matched with the corresponding disease information by having a clinician fill in and collect the sweet text and then select it by operating a computer. Such prior-art approaches usually traverse and match a large amount of information, which consumes considerable resources and takes a long time, so the existing scheme urgently needs improvement.
The above is only for the purpose of assisting understanding of the technical aspects of the present invention, and does not represent an admission that the above is prior art.
Disclosure of Invention
In view of this, the invention provides a linear function-based sweet text classification matching system, aiming to solve the technical problem that the prior art cannot classify sweet text information by means of a linear function and therefore cannot reduce the resources consumed by data processing.
The technical scheme of the invention is realized as follows:
In one aspect, the present invention provides a linear function-based sweet text classification matching system, including:
the acquisition module is used for acquiring the characteristic information of the sweet, and establishing a sweet characteristic vector set according to the characteristic information of the sweet;
the classification module is used for establishing a linear function classification method, classifying the sweet feature vector set according to the linear function classification method, establishing a traditional Chinese medicine sweet feature vector set and a western medicine sweet feature vector set, and merging the traditional Chinese medicine sweet feature vector set and the western medicine sweet feature vector set into a sweet feature vector matching model;
the calculating module is used for establishing a TF-IDF algorithm, acquiring the sweet text information to be matched, selecting sweet feature words from the sweet text information to be matched through the TF-IDF algorithm, and establishing a sweet feature vector set to be matched;
and the matching module is used for calculating the similarity between the sweet feature vector matching model and the sweet feature vector set to be matched through the Jaccard similarity coefficient and generating a matching report according to the similarity.
On the basis of the above technical solution, preferably, the acquisition module includes a processing module configured to acquire the feature information of the sweet taste, where this feature information is the feature information of the symptoms accompanying the sweet taste; to establish a feature information integrity verification rule; to verify the feature information of the accompanying symptoms against that rule; and, when the verification passes, to establish the sweet feature vector set from the feature information of the accompanying symptoms.
On the basis of the above technical solution, preferably, the acquisition module includes an adding module configured to acquire historical feature information of the symptoms accompanying the sweet taste, compare it with the current feature information of the accompanying symptoms, screen out the historical feature information that is not duplicated, and add that non-duplicated historical feature information to the sweet feature vector set.
On the basis of the above technical solution, preferably, the classification module includes a classification calculation module configured to establish a linear classification function and set two classification categories, traditional Chinese medicine sweet taste and western medicine sweet taste. The sweet feature vector set is used as the function vectors and the classification categories are used as the classification labels; the linear classification function is used to establish the traditional Chinese medicine sweet feature vector set and the western medicine sweet feature vector set, and the two sets are merged into the sweet feature vector matching model.
On the basis of the above technical solution, preferably, the calculation module includes an algorithm module configured to establish the TF-IDF algorithm, acquire the sweet text information to be matched, calculate the word frequency of each word in that text through the TF-IDF algorithm, and take the words whose word frequency has been calculated as the words to be screened.
On the basis of the above technical solution, preferably, the calculation module includes a feature word processing module configured to set a common-word bank and a word frequency threshold, screen the words to be screened against the common-word bank, and, after the common words have been removed, compare the word frequency of the remaining words with the word frequency threshold, select the words that meet the threshold as the sweet feature words, and establish the sweet feature vector set to be matched.
On the basis of the above technical solution, preferably, the matching module includes a matching report generating module configured to establish the Jaccard similarity coefficient, calculate with it the similarity between the sweet feature vector matching model and the sweet feature vector set to be matched, and generate a corresponding matching report according to the similarity.
Still further preferably, the linear function-based sweet text classification matching device comprises:
the acquiring unit is used for acquiring the characteristic information of the sweet and the characteristic information of the disease, and respectively establishing a sweet characteristic vector set and a disease characteristic vector set according to the characteristic information of the sweet and the characteristic information of the disease;
the classification unit is used for establishing a linear function classification method, classifying the sweet feature vector set according to the linear function classification method, establishing a traditional Chinese medicine sweet feature vector set and a western medicine sweet feature vector set, and merging the traditional Chinese medicine sweet feature vector set and the western medicine sweet feature vector set into a sweet feature vector matching model;
the calculating unit is used for establishing a TF-IDF algorithm, acquiring the sweet text information to be matched, selecting sweet feature words from the sweet text information to be matched through the TF-IDF algorithm, and establishing a sweet feature vector set to be matched;
and the matching unit is used for calculating the similarity between the sweet feature vector matching model and the sweet feature vector set to be matched through the Jaccard similarity coefficient and generating a matching report according to the similarity.
Compared with the prior art, the sweet text classification matching system based on the linear function has the following beneficial effects:
(1) using the linear function classification method together with the TF-IDF algorithm to extract the feature words improves the accuracy of the extracted feature words and facilitates subsequent information matching; meanwhile, classifying the feature vector set with the linear function classification method greatly reduces the resources consumed during information matching and increases the matching speed;
(2) calculating the similarity of the information text with the Jaccard similarity coefficient improves both the accuracy and the speed of information matching.
Drawings
In order to illustrate the embodiments of the present invention or the technical solutions in the prior art more clearly, the drawings used in the description of the embodiments or the prior art are briefly introduced below. Obviously, the drawings described below show only some embodiments of the present invention, and a person skilled in the art can derive other drawings from them without creative effort.
FIG. 1 is a block diagram of a first embodiment of the linear function-based sweet text classification matching system according to the present invention;
FIG. 2 is a block diagram of a second embodiment of the linear function-based sweet text classification matching system according to the present invention;
FIG. 3 is a block diagram of a third embodiment of the linear function-based sweet text classification matching system according to the present invention;
FIG. 4 is a block diagram of a fourth embodiment of the linear function-based sweet text classification matching system according to the present invention;
FIG. 5 is a block diagram of a fifth embodiment of the linear function-based sweet text classification matching system according to the present invention;
FIG. 6 is a block diagram of the linear function-based sweet text classification matching device according to the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be obtained by a person skilled in the art without any inventive step based on the embodiments of the present invention, are within the scope of the present invention.
Referring to FIG. 1, FIG. 1 is a block diagram of a first embodiment of the linear function-based sweet text classification matching system according to the present invention. The system comprises an acquisition module 10, a classification module 20, a calculation module 30 and a matching module 40.
The acquisition module 10 is configured to acquire the characteristics information of the sweet, and establish a sweet characteristic vector set according to the characteristics information of the sweet;
the classification module 20 is configured to establish a linear function classification method, classify the sweet feature vector set according to the linear function classification method, establish a traditional Chinese medicine sweet feature vector set and a western medicine sweet feature vector set, and merge the traditional Chinese medicine sweet feature vector set and the western medicine sweet feature vector set into a sweet feature vector matching model;
the calculating module 30 is used for establishing a TF-IDF algorithm, acquiring the sweet text information to be matched, selecting sweet feature words from the sweet text information to be matched through the TF-IDF algorithm, and establishing a sweet feature vector set to be matched;
and the matching module 40 is used for calculating the similarity between the sweet feature vector matching model and the sweet feature vector set to be matched through the Jaccard similarity coefficient, and generating a matching report according to the similarity.
Further, as shown in FIG. 2, a second embodiment of the linear function-based sweet text classification matching system is provided on the basis of the above embodiment. In this embodiment, the acquisition module 10 further includes:
the processing module 101 is configured to acquire the feature information of the sweet taste, where this feature information is the feature information of the symptoms accompanying the sweet taste, establish a feature information integrity verification rule, verify the feature information of the accompanying symptoms against that rule, and, when the verification passes, establish the sweet feature vector set from the feature information of the accompanying symptoms;
the adding module 102 is configured to acquire historical feature information of the symptoms accompanying the sweet taste, compare it with the current feature information of the accompanying symptoms, screen out the historical feature information that is not duplicated, and add that non-duplicated historical feature information to the sweet feature vector set.
It should be understood that, in this embodiment, the system acquires the feature information of the sweet taste, namely the feature information of the symptoms accompanying the sweet taste, establishes a feature information integrity verification rule, and verifies the feature information against that rule. When the verification passes, the sweet feature vector set is established from the feature information of the accompanying symptoms. Checking the feature words in advance in this way ensures that the information can be matched directly later and avoids matching failures caused by incomplete feature information.
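The integrity check described above could be sketched as follows. This is only an illustration: the patent does not say what the verification rule contains, so the required field names (`symptom`, `description`) and the function names are assumptions made for the example.

```python
from typing import Dict, List, Sequence

# Hypothetical rule: these field names are assumed, not taken from the patent.
REQUIRED_FIELDS = ("symptom", "description")


def is_complete(record: Dict[str, str],
                required: Sequence[str] = REQUIRED_FIELDS) -> bool:
    """Return True when every required field is present and not blank."""
    return all((record.get(field) or "").strip() for field in required)


def build_feature_set(records: List[Dict[str, str]]) -> List[Dict[str, str]]:
    """Keep only the records that pass the integrity verification."""
    return [record for record in records if is_complete(record)]


records = [
    {"symptom": "dry mouth", "description": "sweet taste with little drinking"},
    {"symptom": "fatigue", "description": "   "},   # blank description
    {"description": "abdominal distention"},        # missing symptom
]
verified = build_feature_set(records)
```

Only the first record survives here, so an incomplete entry never reaches the feature vector set and cannot cause a later matching failure.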
It is understood that the symptoms accompanying a sweet taste in the mouth generally include a dry, sweet mouth with little desire to drink, shortness of breath, tiredness, poor appetite, abdominal distension, and dry or soft stools. Taste recovery takes at least 10 days, because taste bud cells are renewed from the surrounding epithelial cells. Treatment should nevertheless be sought early, preferably within a month of the onset of the taste disturbance.
It should be understood that, in this embodiment, historical feature information of the symptoms accompanying the sweet taste is also acquired and compared with the current feature information; the historical feature information that is not duplicated is screened out and added to the sweet feature vector set. This further enlarges the sweet feature vector set and thereby increases the reliability of information matching.
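A minimal sketch of this de-duplicating merge is given below, assuming each piece of feature information can be represented as a plain string; the function name `add_history` is hypothetical.

```python
from typing import List


def add_history(current: List[str], history: List[str]) -> List[str]:
    """Append historical features that are not already present, keeping order."""
    seen = set(current)
    merged = list(current)
    for item in history:
        if item not in seen:      # skip anything duplicated
            seen.add(item)
            merged.append(item)
    return merged


current_features = ["dry mouth", "fatigue"]
historical_features = ["fatigue", "poor appetite", "poor appetite"]
feature_set = add_history(current_features, historical_features)
```

A set is used for the membership test so that the comparison stays cheap even when the historical record is large, while the list preserves the original ordering.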
It should be understood that, in this embodiment, the feature information of all diseases and disease symptoms corresponding to the feature information of the symptoms accompanying the sweet taste is also acquired, and a vector set of that disease and disease-symptom feature information is established. For example, traditional Chinese medicine (TCM) holds that a sweet taste in the mouth is mostly caused by gastric dysfunction. Clinically, it is divided into sweet taste due to heat in the spleen and stomach and sweet taste due to deficiency of both qi and yin of the spleen and stomach. The former is mostly caused by excessive intake of pungent, spicy and rich food that generates internal heat, or by exogenous pathogenic heat accumulating in the spleen and stomach; it mostly presents as damp-heat in the spleen and stomach and is seen in diabetes patients who like sweet, fatty and rich food. Its symptoms include a sweet taste with thirst, a preference for drinking water, polyphagia with frequent hunger, sores on the lips and tongue, dry stool, a red tongue with a dry coating, and a rapid, forceful pulse. The latter is caused by impairment of the spleen and stomach through aging or chronic disease, which damages both qi and yin, generates endogenous deficiency heat and scorches the spleen fluid; it commonly presents as a dry mouth with little desire to drink, shortness of breath, fatigue, poor appetite, abdominal distention, and dry or soft stools.
Further, as shown in FIG. 3, a third embodiment of the linear function-based sweet text classification matching system is provided on the basis of the above embodiments. In this embodiment, the classification module 20 further includes:
the classification calculation module 201 is configured to establish a linear classification function and set two classification categories, traditional Chinese medicine sweet taste and western medicine sweet taste. The sweet feature vector set is used as the function vectors and the classification categories are used as the classification labels; the linear classification function is used to establish the traditional Chinese medicine sweet feature vector set and the western medicine sweet feature vector set, and the two sets are merged into the sweet feature vector matching model.
It should be understood that, in this embodiment, a linear classification function is established to divide the sweet taste into two categories according to its cause of onset, and the diseases are classified according to the feature information of the symptoms. The two categories are traditional Chinese medicine and western medicine (for example, diabetes). Each sample consists of a vector (the text feature vector) and a label (indicating which category the sample belongs to). The classification categories are then used as classification labels, the linear classification function is used to establish the traditional Chinese medicine sweet feature vector set and the western medicine sweet feature vector set, and the two sets are merged into the sweet feature vector matching model.
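The patent does not specify how the linear classification function is obtained. As one possible illustration, the sketch below trains a perceptron (a technique swapped in for the example, not named in the source) on labelled vectors and then splits the vectors into the two category sets. The two-dimensional toy features, the +1/-1 label encoding and all function names are assumptions.

```python
from typing import Dict, List, Sequence, Tuple

Vector = Sequence[float]


def score(x: Vector, w: Sequence[float], b: float) -> float:
    """Linear function f(x) = w . x + b."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def train_perceptron(samples: List[Vector], labels: List[int],
                     epochs: int = 20, lr: float = 1.0) -> Tuple[List[float], float]:
    """Fit w and b with the classic perceptron update; labels are +1 or -1."""
    w = [0.0] * len(samples[0])
    b = 0.0
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            if y * score(x, w, b) <= 0:          # misclassified or on the boundary
                w = [wi + lr * y * xi for wi, xi in zip(w, x)]
                b += lr * y
    return w, b


def build_matching_model(vectors: List[Vector], w: Sequence[float],
                         b: float) -> Dict[str, List[Vector]]:
    """Split vectors by the sign of f(x) and merge both sets into one model."""
    model: Dict[str, List[Vector]] = {"tcm": [], "western": []}
    for x in vectors:
        model["tcm" if score(x, w, b) >= 0 else "western"].append(x)
    return model


# Toy data (assumed): +1 = traditional Chinese medicine, -1 = western medicine.
samples = [[2.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 2.0]]
labels = [1, 1, -1, -1]
w, b = train_perceptron(samples, labels)
model = build_matching_model(samples + [[3.0, 0.0], [0.0, 3.0]], w, b)
```

Because the model keeps the two sets under separate keys, a later matching step can compare a query against one category at a time instead of traversing every stored vector.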
Further, as shown in FIG. 4, a fourth embodiment of the linear function-based sweet text classification matching system is provided on the basis of the above embodiments. In this embodiment, the calculating module 30 includes:
the algorithm module 301 is configured to establish the TF-IDF algorithm, acquire the sweet text information to be matched, calculate the word frequency of each word in that text through the TF-IDF algorithm, and take the words whose word frequency has been calculated as the words to be screened;
the feature word processing module 302 is configured to set a common-word bank and a word frequency threshold, screen the words to be screened against the common-word bank, and, after the common words have been removed, compare the word frequency of the remaining words with the word frequency threshold, select the words that meet the threshold as the sweet feature words, and establish the sweet feature vector set to be matched.
It should be understood that, in this embodiment, a TF-IDF algorithm is also established: the sweet text information to be matched is acquired, the word frequency of each word in it is calculated through the TF-IDF algorithm, and the words whose word frequency has been calculated are used as the words to be screened.
It should be understood that the main idea of TF-IDF is this: if a word or phrase appears with a high term frequency (TF) in one article and rarely appears in other articles, it is considered to have good discriminating power and to be suitable for classification. The term frequency (TF) is the frequency with which a term (keyword) appears in a text. It is usually normalized (typically the term count divided by the total word count of the article) to prevent a bias towards long documents.
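The idea can be made concrete with a short sketch. The term frequency follows the normalization described above (count divided by the total word count). The source does not give an IDF formula, so the smoothed variant used here, ln((1 + N) / (1 + df)) + 1, is an assumption, as is the premise that the text has already been segmented into tokens.

```python
import math
from collections import Counter
from typing import Dict, List


def term_frequency(tokens: List[str]) -> Dict[str, float]:
    """TF: occurrences of each term divided by the total word count."""
    total = len(tokens)
    if total == 0:
        return {}
    return {term: count / total for term, count in Counter(tokens).items()}


def inverse_document_frequency(corpus: List[List[str]]) -> Dict[str, float]:
    """Smoothed IDF (assumed variant): ln((1 + N) / (1 + df)) + 1."""
    n_docs = len(corpus)
    doc_freq = Counter(term for doc in corpus for term in set(doc))
    return {term: math.log((1 + n_docs) / (1 + df)) + 1.0
            for term, df in doc_freq.items()}


def tf_idf(tokens: List[str], corpus: List[List[str]]) -> Dict[str, float]:
    """Weight each term of one document by TF times IDF."""
    idf = inverse_document_frequency(corpus)
    unseen = math.log(1 + len(corpus)) + 1.0   # IDF for a term with df = 0
    return {term: tf * idf.get(term, unseen)
            for term, tf in term_frequency(tokens).items()}


corpus = [["sweet", "mouth", "dry"], ["sweet", "thirst"], ["sweet", "fatigue"]]
weights = tf_idf(corpus[0], corpus)
```

In this toy corpus "sweet" occurs in every document, so it receives the lowest IDF, while "mouth" and "dry" occur in only one document and are weighted higher.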
It should be understood that, in order to select the feature words, the system further sets a common-word bank and a word frequency threshold and screens the words to be screened against the common-word bank. The common-word bank contains items such as conjunctions, modal particles and punctuation marks. After the common words have been removed, the word frequency of the remaining words is compared with the word frequency threshold, the words that meet the threshold are selected as the sweet feature words, and the sweet feature vector set to be matched is established.
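The screening step might look like the sketch below. The contents of the common-word bank, the threshold value, and the reading of "meets the threshold" as "greater than or equal to" are all assumptions for illustration; the word frequency is normalized by the total word count of the text.

```python
from collections import Counter
from typing import List, Set

# Hypothetical common-word bank: conjunctions, particles, punctuation.
COMMON_WORDS: Set[str] = {"and", "the", "of", ",", "."}


def select_feature_words(tokens: List[str], common_words: Set[str],
                         threshold: float) -> List[str]:
    """Drop common words, then keep words whose frequency meets the threshold."""
    total = len(tokens)
    if total == 0:
        return []
    counts = Counter(t for t in tokens if t not in common_words)
    # Assumption: "meets the threshold" means frequency >= threshold.
    return sorted(t for t, c in counts.items() if c / total >= threshold)


tokens = ["sweet", "mouth", "and", "sweet", "thirst", ",", "the", "sweet"]
feature_words = select_feature_words(tokens, COMMON_WORDS, threshold=0.2)
```

With eight tokens, "sweet" has a frequency of 3/8 and is the only word that passes a threshold of 0.2; lowering the threshold to 0.1 would also admit "mouth" and "thirst".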
Further, as shown in FIG. 5, a fifth embodiment of the linear function-based sweet text classification matching system is provided on the basis of the above embodiments. In this embodiment, the matching module 40 includes:
the matching report generating module 401 is configured to establish the Jaccard similarity coefficient, calculate with it the similarity between the sweet feature vector matching model and the sweet feature vector set to be matched, and generate a corresponding matching report according to the similarity.
It should be understood that, finally, the system establishes the Jaccard similarity coefficient, calculates with it the similarity between the sweet feature vector matching model and the sweet feature vector set to be matched, sets a corresponding similarity range, compares the calculated similarity with that range, and generates the corresponding matching report. For example, the report may state that a sweet taste in the mouth is usually caused by diabetes, or possibly by dysfunction of the spleen and stomach; that the sensation is more pronounced in the morning in particular; and that even plain boiled water tastes sweet.
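The Jaccard similarity coefficient of two sets A and B is |A ∩ B| / |A ∪ B|. The sketch below applies it to sets of feature words and grades the result against similarity ranges. The range boundaries (0.5 and 0.2), the grade labels, the report layout and the convention of returning 0.0 for two empty sets are assumptions, since the patent leaves them unspecified.

```python
from typing import Dict, Iterable, List, Set, Tuple


def jaccard(a: Iterable[str], b: Iterable[str]) -> float:
    """Jaccard coefficient |A & B| / |A | B|; 0.0 when both sets are empty."""
    set_a, set_b = set(a), set(b)
    union = set_a | set_b
    if not union:
        return 0.0
    return len(set_a & set_b) / len(union)


# Hypothetical similarity ranges: (lower bound, grade), highest first.
RANGES: List[Tuple[float, str]] = [(0.5, "high"), (0.2, "medium"), (0.0, "low")]


def grade(similarity: float) -> str:
    """Map a similarity value onto the first range whose lower bound it reaches."""
    for lower, label in RANGES:
        if similarity >= lower:
            return label
    return RANGES[-1][1]


def match_report(model: Dict[str, Set[str]],
                 query: Set[str]) -> List[Tuple[str, float, str]]:
    """Return (category, similarity, grade) rows, best match first."""
    rows = [(category, jaccard(words, query)) for category, words in model.items()]
    rows.sort(key=lambda row: row[1], reverse=True)
    return [(category, sim, grade(sim)) for category, sim in rows]


model = {
    "tcm": {"sweet", "fatigue", "poor appetite", "abdominal distention"},
    "western": {"sweet", "thirst", "polyphagia"},
}
query = {"sweet", "thirst", "polyphagia", "dry stool"}
report = match_report(model, query)
```

Here the query shares three of four distinct words with the "western" set (similarity 0.75) and one of seven with the "tcm" set, so the report ranks the western-medicine category first.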
The above description is only for illustrative purposes and does not limit the technical solutions of the present application in any way.
As is clear from the above description, this embodiment provides a linear function-based sweet text classification matching system comprising: an acquisition module for acquiring the feature information of the sweet taste and establishing a sweet feature vector set from that information; a classification module for establishing a linear function classification method, classifying the sweet feature vector set with it to establish a traditional Chinese medicine sweet feature vector set and a western medicine sweet feature vector set, and merging the two sets into a sweet feature vector matching model; a calculating module for establishing a TF-IDF algorithm, acquiring the sweet text information to be matched, selecting sweet feature words from it through the TF-IDF algorithm, and establishing a sweet feature vector set to be matched; and a matching module for calculating the similarity between the sweet feature vector matching model and the sweet feature vector set to be matched through the Jaccard similarity coefficient and generating a matching report according to the similarity. Through the linear function classification method, the TF-IDF algorithm and the Jaccard similarity coefficient, this embodiment can match the text information accurately and improves the accuracy of the whole matching process.
In addition, the embodiment of the invention also provides a linear function-based sweet text classification matching device. As shown in FIG. 6, the linear function-based sweet text classification matching device includes: an acquisition unit 10, a classification unit 20, a calculation unit 30 and a matching unit 40.
The acquisition unit 10 is configured to acquire the characteristic information of the sweet and the characteristic information of the disease, and to establish a sweet feature vector set and a disease feature vector set according to the characteristic information of the sweet and the characteristic information of the disease, respectively;
the classification unit 20 is configured to establish a linear function classification method, classify the sweet feature vector set according to the linear function classification method, establish a traditional Chinese medicine sweet feature vector set and a western medicine sweet feature vector set, and merge the traditional Chinese medicine sweet feature vector set and the western medicine sweet feature vector set into a sweet feature vector matching model;
the calculation unit 30 is configured to establish a TF-IDF algorithm, acquire the sweet text information to be matched, select sweet feature words from the sweet text information to be matched through the TF-IDF algorithm, and establish a sweet feature vector set to be matched;
and the matching unit 40 is configured to calculate the similarity between the sweet feature vector matching model and the sweet feature vector set to be matched through the Jaccard similarity coefficient, and to generate a matching report according to the similarity.
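The role of the classification unit can be illustrated with a small linear classifier. The source does not say how the linear classification function is obtained, so the perceptron rule used here is a swapped-in assumption; the four-dimensional feature layout, the labels (+1 for traditional Chinese medicine, -1 for western medicine), and all names are hypothetical.

```python
def train_linear_classifier(samples, labels, epochs=20, lr=1.0):
    """Fit w, b of f(x) = w.x + b with the perceptron rule; labels are +1 / -1."""
    dim = len(samples[0])
    w, b = [0.0] * dim, 0.0
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            score = sum(wi * xi for wi, xi in zip(w, x)) + b
            if y * score <= 0:  # misclassified: shift the boundary toward x
                w = [wi + lr * y * xi for wi, xi in zip(w, x)]
                b += lr * y
    return w, b


def classify(x, w, b):
    """Assign a feature vector to one of the two classification categories."""
    score = sum(wi * xi for wi, xi in zip(w, x)) + b
    return "tcm" if score > 0 else "western"


# Hypothetical feature order:
# [spleen-stomach term, greasy tongue coating, blood glucose, polyuria]
samples = [[1, 1, 0, 0], [1, 0, 0, 0], [0, 0, 1, 1], [0, 0, 1, 0]]
labels = [1, 1, -1, -1]
w, b = train_linear_classifier(samples, labels)

# Split the feature vectors by class, then merge both sets into one model.
matching_model = {"tcm": [], "western": []}
for x in samples:
    matching_model[classify(x, w, b)].append(x)
print(matching_model)
```

Keeping the two class-specific sets as separate keys of one dictionary is just one way to represent the merged matching model; the source does not prescribe a data structure.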
In addition, it should be noted that the above-described embodiments of the apparatus are merely illustrative, and do not limit the scope of the present invention, and in practical applications, a person skilled in the art may select some or all of the modules to implement the purpose of the embodiments according to actual needs, and the present invention is not limited herein.
In addition, for technical details not described in this embodiment, refer to the linear function-based sweet text classification matching system provided in any embodiment of the present invention; they are not repeated here.
The above description is only for the purpose of illustrating the preferred embodiments of the present invention and is not to be construed as limiting the invention, and any modifications, equivalents, improvements and the like that fall within the spirit and principle of the present invention are intended to be included therein.
Claims (8)
1. A linear function based sweet text classification matching system, comprising:
the acquisition module is used for acquiring the characteristic information of the sweet, and establishing a sweet characteristic vector set according to the characteristic information of the sweet;
the classification module is used for establishing a linear function classification method, classifying the sweet feature vector set according to the linear function classification method, establishing a traditional Chinese medicine sweet feature vector set and a western medicine sweet feature vector set, and merging the traditional Chinese medicine sweet feature vector set and the western medicine sweet feature vector set into a sweet feature vector matching model;
the calculating module is used for establishing a TF-IDF algorithm, acquiring the sweet text information to be matched, selecting sweet feature words from the sweet text information to be matched through the TF-IDF algorithm, and establishing a sweet feature vector set to be matched;
and the matching module is used for calculating the similarity between the sweet feature vector matching model and the sweet feature vector set to be matched through the Jaccard similarity coefficient and generating a matching report according to the similarity.
2. The linear function-based sweet text classification matching system of claim 1, characterized in that: the acquisition module comprises a processing module, which is used for acquiring the characteristic information of the sweet, wherein the characteristic information of the sweet is characteristic information of the symptoms accompanying the sweet, establishing a characteristic information integrity verification rule, verifying the characteristic information of the symptoms accompanying the sweet according to the characteristic information integrity verification rule, and establishing a sweet feature vector set according to the characteristic information of the symptoms accompanying the sweet when the verification is passed.
3. The linear function-based sweet text classification matching system of claim 2, characterized in that: the acquisition module comprises an adding module, which is used for acquiring historical sweet accompanying symptom characteristic information, comparing the historical sweet accompanying symptom characteristic information with the sweet accompanying symptom characteristic information, screening out the non-repeated historical sweet accompanying symptom characteristic information, and adding the non-repeated historical sweet accompanying symptom characteristic information to the sweet feature vector set.
4. The linear function-based sweet text classification matching system of claim 3, characterized in that: the classification module comprises a classification calculation module, which is used for establishing a linear classification function and setting two classification categories, namely traditional Chinese medicine sweet taste and western medicine sweet taste; the sweet taste feature vector set is used as the function vector, the classification categories are used as classification marks, the traditional Chinese medicine sweet taste feature vector set and the western medicine sweet taste feature vector set are established by utilizing the linear classification function, and the traditional Chinese medicine sweet taste feature vector set and the western medicine sweet taste feature vector set are combined into a sweet taste feature vector matching model.
5. The linear function-based sweet text classification matching system of claim 4, characterized in that: the calculation module comprises an algorithm module, which is used for establishing a TF-IDF algorithm, acquiring the sweet text information to be matched, calculating the word frequency of each word in the sweet text information to be matched through the TF-IDF algorithm, and taking the words with calculated word frequencies as the words to be screened.
6. The linear function-based sweet text classification matching system of claim 5, characterized in that: the calculation module comprises a feature word processing module, which is used for setting a common word bank and a word frequency threshold, screening the words to be screened according to the common word bank, comparing, after the common words are screened out, the word frequency of the remaining words to be screened with the word frequency threshold, selecting the words to be screened that meet the word frequency threshold as the feature words of the sweet taste, and establishing a feature vector set of the sweet taste to be matched.
7. The linear function-based sweet text classification matching system of claim 6, characterized in that: the matching module comprises a matching report generating module, which is used for establishing the Jaccard similarity coefficient, calculating the similarity between the sweet feature vector matching model and the sweet feature vector set to be matched through the Jaccard similarity coefficient, and generating a corresponding matching report according to the similarity.
8. A linear function-based sweet text classification matching device is characterized by comprising:
the acquiring unit is used for acquiring the characteristic information of the sweet and the characteristic information of the disease, and respectively establishing a sweet characteristic vector set and a disease characteristic vector set according to the characteristic information of the sweet and the characteristic information of the disease;
the classification unit is used for establishing a linear function classification method, classifying the sweet feature vector set according to the linear function classification method, establishing a traditional Chinese medicine sweet feature vector set and a western medicine sweet feature vector set, and merging the traditional Chinese medicine sweet feature vector set and the western medicine sweet feature vector set into a sweet feature vector matching model;
the calculating unit is used for establishing a TF-IDF algorithm, acquiring the sweet text information to be matched, selecting sweet feature words from the sweet text information to be matched through the TF-IDF algorithm, and establishing a sweet feature vector set to be matched;
and the matching unit is used for calculating the similarity between the sweet feature vector matching model and the sweet feature vector set to be matched through the Jaccard similarity coefficient and generating a matching report according to the similarity.
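The integrity verification and history-merging steps recited in claims 2 and 3 can be sketched as below. The source does not define the verification rule or the record layout, so the required fields, the dictionary-based records, and the function names are hypothetical stand-ins chosen only for illustration.

```python
REQUIRED_FIELDS = ("symptom", "accompanying_symptoms")  # hypothetical rule


def is_complete(record):
    """Integrity check: every required field is present and non-empty."""
    return all(record.get(field) for field in REQUIRED_FIELDS)


def record_key(record):
    """Order-independent identity used to detect repeated records."""
    return (record["symptom"], tuple(sorted(record["accompanying_symptoms"])))


def build_feature_set(current, history):
    """Keep verified current records, then add non-repeated historical ones."""
    features = [r for r in current if is_complete(r)]
    seen = {record_key(r) for r in features}
    for record in history:
        if is_complete(record) and record_key(record) not in seen:
            seen.add(record_key(record))
            features.append(record)
    return features


# Illustrative records only.
current = [
    {"symptom": "sweet taste", "accompanying_symptoms": ["thirst"]},
    {"symptom": "sweet taste", "accompanying_symptoms": []},  # fails the check
]
history = [
    {"symptom": "sweet taste", "accompanying_symptoms": ["thirst"]},  # repeated
    {"symptom": "sweet taste", "accompanying_symptoms": ["fatigue", "thirst"]},
]
feature_set = build_feature_set(current, history)
print(len(feature_set))
```

Applying the same completeness check to historical records is an extra assumption; the claims only require it for the newly acquired information.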
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202011217922.XA CN112269880B (en) | 2020-11-04 | 2020-11-04 | Sweet text classification matching system based on linear function |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202011217922.XA CN112269880B (en) | 2020-11-04 | 2020-11-04 | Sweet text classification matching system based on linear function |
Publications (2)
Publication Number | Publication Date |
---|---|
CN112269880A (en) | 2021-01-26 |
CN112269880B (en) | 2024-02-09 |
Family
ID=74346045
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202011217922.XA (Active; CN112269880B (en)) | 2020-11-04 | 2020-11-04 | Sweet text classification matching system based on linear function |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN112269880B (en) |
Citations (13)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102622373A (en) * | 2011-01-31 | 2012-08-01 | 中国科学院声学研究所 | Statistic text classification system and statistic text classification method based on term frequency-inverse document frequency (TF*IDF) algorithm |
WO2012134180A2 (en) * | 2011-03-28 | 2012-10-04 | 가톨릭대학교 산학협력단 | Emotion classification method for analyzing inherent emotions in a sentence, and emotion classification method for multiple sentences using context information |
CN105046273A (en) * | 2015-07-07 | 2015-11-11 | 南京邮电大学 | Epilepsia electrocorticogram signal classification method based on multiscale sample entropy |
CN105205090A (en) * | 2015-05-29 | 2015-12-30 | 湖南大学 | Web page text classification algorithm research based on web page link analysis and support vector machine |
CN106548134A (en) * | 2016-10-17 | 2017-03-29 | 沈阳化工大学 | GA optimizes palmmprint and the vena metacarpea fusion identification method that SVM and normalization combine |
CN108733733A (en) * | 2017-04-21 | 2018-11-02 | 为朔生物医学有限公司 | Categorization algorithms for biomedical literatures, system based on machine learning and storage medium |
CN109145097A (en) * | 2018-06-11 | 2019-01-04 | 人民法院信息技术服务中心 | A kind of judgement document's classification method based on information extraction |
CN109215754A (en) * | 2018-09-10 | 2019-01-15 | 平安科技(深圳)有限公司 | Medical record data processing method, device, computer equipment and storage medium |
CN109902223A (en) * | 2019-01-14 | 2019-06-18 | 中国科学院信息工程研究所 | A kind of harmful content filter method based on multi-modal information feature |
CN110298032A (en) * | 2019-05-29 | 2019-10-01 | 西南电子技术研究所(中国电子科技集团公司第十研究所) | Text classification corpus labeling training system |
WO2020007028A1 (en) * | 2018-07-04 | 2020-01-09 | 平安科技(深圳)有限公司 | Medical consultation data recommendation method, device, computer apparatus, and storage medium |
CN111415740A (en) * | 2020-02-12 | 2020-07-14 | 东北大学 | Method and device for processing inquiry information, storage medium and computer equipment |
CN111816321A (en) * | 2020-07-09 | 2020-10-23 | 武汉东湖大数据交易中心股份有限公司 | System, apparatus and storage medium for intelligent infectious disease identification based on legal diagnostic criteria |
Non-Patent Citations (1)
Title |
---|
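Wang Ding (王丁): "Automatic disease diagnosis system based on Chinese text classification" (基于中文文本分类的自动诊病系统), China Excellent Doctoral and Master's Theses Full-text Database (Master's), Information Science and Technology series, no. 03, pages 140 - 584 * |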
Also Published As
Publication number | Publication date |
---|---|
CN112269880B (en) | 2024-02-09 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Lohner et al. | Non‐nutritive sweeteners for diabetes mellitus | |
Iatridi et al. | Reconsidering the classification of sweet taste liker phenotypes: A methodological review | |
CN107919161A (en) | A kind of method, electronic equipment and the storage medium of prescription authorization | |
Bahia et al. | A systematic review of the physiological effects of the effortful swallow maneuver in adults with normal and disordered swallowing | |
Molfenter et al. | The swallowing profile of healthy aging adults: comparing noninvasive swallow tests to videofluoroscopic measures of safety and efficiency | |
CN109102899A (en) | Chinese medicine intelligent assistance system and method based on machine learning and big data | |
Islam et al. | Deep learning of facial depth maps for obstructive sleep apnea prediction | |
CN114220514B (en) | Internet hospital patient diagnosis and treatment data analysis processing method, equipment and storage medium | |
Abouelenien et al. | Gender-based multimodal deception detection | |
Simpson et al. | Analysing differences in clinical outcomes between hospitals | |
Steele et al. | Exploration of the utility of a brief swallow screening protocol with comparison to concurrent videofluoroscopy. | |
Qiu et al. | Egocentric image captioning for privacy-preserved passive dietary intake monitoring | |
TW201040756A (en) | Chinese medicine intelligent formulary system | |
Goshvarpour et al. | Asymmetry of lagged Poincare plot in heart rate signals during meditation | |
CN112002419B (en) | Disease auxiliary diagnosis system, equipment and storage medium based on clustering | |
Dupuy-McCauley et al. | A comparison of 2 visual methods for classifying obstructive vs central hypopneas | |
CN112269880A (en) | Sweet text classification matching system based on linear function | |
Bergström et al. | Dysphagia management: Does structured training improve the validity and reliability of cervical auscultation? | |
Donohue et al. | Establishing reference values for temporal kinematic swallow events across the lifespan in healthy community dwelling adults using high-resolution cervical auscultation | |
Werden Abrams et al. | The adverse effects and events of thickened liquid use in adults: A systematic review | |
CN112185571B (en) | Disease auxiliary diagnosis system, equipment and storage medium based on orotic acid | |
CN111973155B (en) | Disease cognition self-learning system based on abnormal change of human taste | |
CN105930646B (en) | A kind of data processing system and method for assessing heart aging degree | |
Li et al. | Understanding the impact of fluid restriction on growth outcomes in infants following cardiac surgery | |
CN112270186B (en) | Mouth based on entropy model peppery text information matching system |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||