CN116186262A - Menstrual disorder typing system, menstrual disorder typing method, electronic device, and recording medium - Google Patents
- Publication number
- CN116186262A (application CN202310168401.7A)
- Authority
- CN
- China
- Prior art keywords
- menstrual disorder
- text
- description text
- case
- case description
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/35—Clustering; Classification
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/36—Creation of semantic tools, e.g. ontology or thesauri
- G06F16/374—Thesaurus
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16H—HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
- G16H50/00—ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
- G16H50/20—ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for computer-aided diagnosis, e.g. based on medical expert systems
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16H—HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
- G16H50/00—ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
- G16H50/70—ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for mining of medical data, e.g. analysing previous cases of other patients
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02A—TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
- Y02A90/00—Technologies having an indirect contribution to adaptation to climate change
- Y02A90/10—Information and communication technologies [ICT] supporting adaptation to climate change, e.g. for weather forecasting or climate simulation
Landscapes
- Engineering & Computer Science (AREA)
- Data Mining & Analysis (AREA)
- Health & Medical Sciences (AREA)
- Databases & Information Systems (AREA)
- Public Health (AREA)
- Medical Informatics (AREA)
- Theoretical Computer Science (AREA)
- Biomedical Technology (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Pathology (AREA)
- Physics & Mathematics (AREA)
- Epidemiology (AREA)
- General Health & Medical Sciences (AREA)
- Primary Health Care (AREA)
- Computational Linguistics (AREA)
- Medical Treatment And Welfare Office Work (AREA)
Abstract
An embodiment of the invention discloses a menstrual disorder typing system, a menstrual disorder typing method, an electronic device and a storage medium. The menstrual disorder typing method comprises the following steps: constructing a text description corpus from menstrual disorder case description texts; extracting the main characteristic information of each menstrual disorder case description text in the text description corpus and constructing a standard database; extracting the feature vectors corresponding to the main characteristic information in the standard database to obtain a feature vector set corresponding to each menstrual disorder case description text; extracting the feature vector of the case description text to be typed and calculating, with the cosine measure, its similarity to each element of the feature vector set; and performing case matching and typing on the text to be typed according to the similarity to obtain the disease type and prescription information corresponding to the case description text to be typed. The method addresses the problem that, in the prior art, traditional Chinese medicine menstrual disorder types cannot be identified intelligently.
Description
Technical Field
The present invention relates to the field of computer technology, and in particular, to a menstrual disorder typing system, a menstrual disorder typing method, an electronic device, and a storage medium.
Background
Menstrual disorder is a common gynecological disease, manifested as an abnormal menstrual cycle or abnormal bleeding volume; common types include menorrhagia, hypomenorrhea and thin (diluted) menstruation. Traditional Chinese medicine divides menstrual disorder into different types according to its cause, symptoms and pulse condition, and selects the corresponding formula for treatment according to the type.
At present, the type of menstrual disorder is determined by in-person diagnosis by a traditional Chinese medicine practitioner, and the symptoms of menstrual disorder patients cannot be analyzed intelligently.
Disclosure of Invention
The embodiment of the invention aims to provide a menstrual disorder typing system, a menstrual disorder typing method, electronic equipment and a storage medium, which are used for solving the problem that the traditional Chinese medicine menstrual disorder type cannot be intelligently identified in the prior art.
To achieve the above object, an embodiment of the present invention provides a menstrual disorder typing method, the method specifically including:
collecting a certain number of menstrual disorder case descriptive texts;
constructing a text description corpus based on the menstrual disorder case description text;
extracting main characteristic information of each menstrual disorder case description text in the text description corpus by using a TextRank algorithm, and constructing a standard database;
extracting feature vectors corresponding to main feature information in the standard database to obtain feature vector sets corresponding to the description text of each menstrual disorder case;
extracting feature vectors of the case description text to be typed, and respectively calculating the similarity between the feature vectors of the case description text to be typed and elements in the feature vector set by using cosine measurement;
and performing case matching and typing on the text to be typed according to the similarity to obtain the disease type and prescription information corresponding to the case description text to be typed.
Based on the technical scheme, the invention can also be improved as follows:
Further, extracting the main characteristic information of each menstrual disorder case description text by the TextRank algorithm and constructing the standard database comprises:
calculating the main characteristic information of each menstrual disorder case description text by formula 1:
$$WS(V_i) = (1 - d) + d \sum_{V_j \in In(V_i)} \frac{w_{ji}}{\sum_{V_k \in Out(V_j)} w_{jk}} \, WS(V_j) \quad \text{(formula 1)}$$
wherein $d$ is the damping coefficient, generally set to 0.85; $V_i$ is any node in the graph; $In(V_i)$ is the set of vertices pointing to vertex $V_i$; $Out(V_j)$ is the set of vertices that vertex $V_j$ points to; $w_{ij}$ is the connection weight between vertices $V_i$ and $V_j$; and $WS(V_i)$ is the final ranking weight of vertex $V_i$.
Further, the extracting the feature vector corresponding to the main feature information in the standard database to obtain a feature vector set corresponding to each menstrual disorder case description text includes:
calculating the term frequency of each word in the menstrual disorder case description text by formula 2:
$$tf_{i,j} = \frac{n_{i,j}}{\sum_k n_{k,j}} \quad \text{(formula 2)}$$
wherein $n_{i,j}$ is the number of times the word $t_i$ appears in the menstrual disorder case description text $d_j$, and $\sum_k n_{k,j}$ is the total number of occurrences of all words in $d_j$;
calculating the inverse document frequency by formula 3:
$$idf_i = \log \frac{|D|}{|\{j : t_i \in d_j\}|} \quad \text{(formula 3)}$$
wherein $|D|$ is the total number of menstrual disorder case description texts in the text description corpus, and $|\{j : t_i \in d_j\}|$ is the number of menstrual disorder case description texts containing the word $t_i$; if the word does not appear in the text description corpus, the denominator would be zero, so $1 + |\{j : t_i \in d_j\}|$ is typically used instead;
calculating the TF-IDF value by formula 4:
$$\text{TF-IDF}_{i,j} = tf_{i,j} \times idf_i \quad \text{(formula 4)}$$
after the TF-IDF value of each word in the menstrual disorder case description text has been calculated, the words are sorted in descending order of TF-IDF value, the words whose TF-IDF values exceed a set threshold are selected as keywords, and a feature vector is constructed from the keywords and their TF-IDF values, so that a feature vector set corresponding to each menstrual disorder case description text is obtained.
Further, extracting the feature vector of the case description text to be typed and calculating, with the cosine measure, the similarity between that feature vector and each element of the feature vector set comprises:
calculating the similarity between the feature vector of the case description text to be typed and each element of the feature vector set by formula 5:
$$sim(x, y) = \frac{x \cdot y}{\|x\| \, \|y\|} \quad \text{(formula 5)}$$
wherein $\|x\|$ is the Euclidean norm of the vector $x = (x_1, x_2, x_3, \ldots, x_p)$, defined as $\sqrt{x_1^2 + x_2^2 + \cdots + x_p^2}$; conceptually, it is the length of the vector $x$.
A menstrual disorder typing system comprising:
the acquisition module is used for acquiring a certain number of menstrual disorder case description texts;
a first construction module for constructing a text description corpus based on the menstrual disorder case description text;
the first extraction module is used for extracting main characteristic information of each menstrual disorder case description text in the text description corpus through a TextRank algorithm;
the second construction module is used for constructing a standard database;
the second extraction module is used for extracting the feature vector of the case description text to be typed;
the similarity calculation module is used for calculating the similarity between the feature vector of the case description text to be typed and the elements in the feature vector set by using cosine measurement;
and the typing module is used for performing case matching and typing on the text to be typed according to the similarity, to obtain the disease type and prescription information corresponding to the case description text to be typed.
Further, the first extraction module is further configured to:
calculating the main characteristic information of each menstrual disorder case description text by formula 1:
$$WS(V_i) = (1 - d) + d \sum_{V_j \in In(V_i)} \frac{w_{ji}}{\sum_{V_k \in Out(V_j)} w_{jk}} \, WS(V_j) \quad \text{(formula 1)}$$
wherein $d$ is the damping coefficient, generally set to 0.85; $V_i$ is any node in the graph; $In(V_i)$ is the set of vertices pointing to vertex $V_i$; $Out(V_j)$ is the set of vertices that vertex $V_j$ points to; $w_{ij}$ is the connection weight between vertices $V_i$ and $V_j$; and $WS(V_i)$ is the final ranking weight of vertex $V_i$.
Further, the second extraction module is further configured to:
calculating the term frequency of each word in the menstrual disorder case description text by formula 2:
$$tf_{i,j} = \frac{n_{i,j}}{\sum_k n_{k,j}} \quad \text{(formula 2)}$$
wherein $n_{i,j}$ is the number of times the word $t_i$ appears in the menstrual disorder case description text $d_j$, and $\sum_k n_{k,j}$ is the total number of occurrences of all words in $d_j$;
calculating the inverse document frequency by formula 3:
$$idf_i = \log \frac{|D|}{|\{j : t_i \in d_j\}|} \quad \text{(formula 3)}$$
wherein $|D|$ is the total number of menstrual disorder case description texts in the text description corpus, and $|\{j : t_i \in d_j\}|$ is the number of menstrual disorder case description texts containing the word $t_i$; if the word does not appear in the text description corpus, the denominator would be zero, so $1 + |\{j : t_i \in d_j\}|$ is typically used instead;
calculating the TF-IDF value by formula 4:
$$\text{TF-IDF}_{i,j} = tf_{i,j} \times idf_i \quad \text{(formula 4)}$$
after the TF-IDF value of each word in the menstrual disorder case description text has been calculated, the words are sorted in descending order of TF-IDF value, the words whose TF-IDF values exceed a set threshold are selected as keywords, and a feature vector is constructed from the keywords and their TF-IDF values, so that a feature vector set corresponding to each menstrual disorder case description text is obtained.
Further, the similarity calculation module is further configured to:
calculate the similarity between the feature vector of the case description text to be typed and each element of the feature vector set by formula 5:
$$sim(x, y) = \frac{x \cdot y}{\|x\| \, \|y\|} \quad \text{(formula 5)}$$
wherein $\|x\|$ is the Euclidean norm of the vector $x = (x_1, x_2, x_3, \ldots, x_p)$, defined as $\sqrt{x_1^2 + x_2^2 + \cdots + x_p^2}$; conceptually, it is the length of the vector $x$.
An electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, the processor implementing the steps of the method when the computer program is executed.
A non-transitory computer readable medium having stored thereon a computer program which, when executed by a processor, implements the steps of the method.
The embodiment of the invention has the following advantages:
The menstrual disorder typing method of the invention collects a certain number of menstrual disorder case description texts; constructs a text description corpus based on the menstrual disorder case description texts; extracts the main characteristic information of each menstrual disorder case description text in the text description corpus by the TextRank algorithm and constructs a standard database; extracts the feature vectors corresponding to the main characteristic information in the standard database to obtain a feature vector set corresponding to each menstrual disorder case description text; extracts the feature vector of the case description text to be typed and calculates, with the cosine measure, its similarity to each element of the feature vector set; and performs case matching and typing on the text to be typed according to the similarity to obtain the disease type and prescription information corresponding to the case description text to be typed. The method thereby addresses the problem that, in the prior art, traditional Chinese medicine menstrual disorder types cannot be identified intelligently.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below. It will be apparent to those of ordinary skill in the art that the drawings in the following description are exemplary only and that other implementations can be obtained from the extensions of the drawings provided without inventive effort.
The structures, proportions, sizes, etc. shown in the present specification are shown only for the purposes of illustration and description, and are not intended to limit the scope of the invention, which is defined by the claims, so that any structural modifications, changes in proportions, or adjustments of sizes, which do not affect the efficacy or the achievement of the present invention, should fall within the ambit of the technical disclosure.
FIG. 1 is a flow chart of a method of typing menstrual disorder according to the present invention;
FIG. 2 is a block diagram of a menstrual disorder typing system of the present invention;
FIG. 3 is a schematic diagram of the physical structure of an electronic device according to the present invention.
Wherein the reference numerals are as follows:
the system comprises an acquisition module 10, a first construction module 20, a first extraction module 30, a second construction module 40, a second extraction module 50, a similarity calculation module 60, a typing module 70, an electronic device 80, a processor 801, a memory 802 and a bus 803.
Detailed Description
Other advantages and effects of the present invention will become apparent to those skilled in the art from the following detailed description. The embodiments described are some, but not all, embodiments of the invention. All other embodiments obtained by those skilled in the art based on the embodiments of the invention without inventive effort fall within the scope of the invention.
Examples
Fig. 1 is a flowchart of an embodiment of a menstrual disorder typing method according to the present invention, as shown in fig. 1, wherein the menstrual disorder typing method according to the embodiment of the present invention includes the steps of:
S101, acquiring a certain number of menstrual disorder case description texts;
Specifically, traditional Chinese medicine holds that menstrual blood originates from the kidneys, and that irregular menstruation in women is related to kidney function, the spleen, the liver, qi and blood, the thoroughfare and conception vessels, and the uterus. The disease is mainly caused by the seven emotions or the six exogenous pathogenic factors, or by congenital kidney qi deficiency, excessive sexual activity and overstrain, which impair the qi of the viscera, disturb the function of the kidney, liver and spleen, and unbalance qi and blood, leading to damage of the thoroughfare and conception vessels and hence to irregular menstruation. Menorrhagia is mostly caused by internal heat, blood deficiency or blood stasis, so that the blood boils and overflows or cannot be contained and flows recklessly; treatment is usually differentiated by observing the patient's condition and the menstrual color. Hypomenorrhea is mostly caused by blood deficiency, blood stasis, phlegm dampness and qi stagnation, which block the passage of qi and blood so that the blood does not flow smoothly.
A menstrual disorder case description text includes the etiology, menstrual cycle, menstrual period, menstrual blood color, menstrual blood volume, duration, complications, and the like. For example, normal menstrual blood is dark red, is mixed with small fragments of shed endometrium, cervical mucus and vaginal epithelial cells, and contains no blood clots; menstrual blood that is thin like water, only pale pink, or black and purple is abnormal. Menstrual blood that consists entirely of clotted blood is also abnormal and may indicate bleeding elsewhere, so medical care should be sought early. As another example, a typical menstrual cycle is 28 to 30 days, although some women have a cycle of 40 days; both are normal as long as the cycle is regular. In addition, menstruation is easily affected by various factors, so a period that arrives 3 to 5 days early or late is normal. If one cycle is 20 days and the next is 40 days, and this happens often, or if a period lasts only 1 to 2 days and bleeding recurs for 1 to 2 days more than 10 days later, the condition is irregular menstruation. At menarche, ovarian function is not yet fully developed, so disorder and irregularity can occur; this is not a pathological phenomenon.
S102, constructing a text description corpus based on menstrual disorder case description text;
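The fields listed above and the corpus construction of step S102 can be sketched as a simple record type. This is only an illustrative sketch: the class name `CaseDescription`, its field names, the `to_text` helper and `build_corpus` are assumptions introduced here and do not appear in the patent.

```python
from dataclasses import dataclass


@dataclass
class CaseDescription:
    """One menstrual disorder case description (hypothetical field names)."""
    etiology: str
    menstrual_cycle: str
    menstrual_period: str
    blood_color: str
    blood_volume: str
    duration: str
    complications: str = ""
    disease_type: str = ""   # label of the case, used later for typing
    prescription: str = ""   # prescription information attached to the case

    def to_text(self) -> str:
        """Join the non-empty descriptive fields into one description text."""
        parts = [self.etiology, self.menstrual_cycle, self.menstrual_period,
                 self.blood_color, self.blood_volume, self.duration,
                 self.complications]
        return "; ".join(p for p in parts if p)


def build_corpus(cases):
    """Return the text description corpus: one description text per case."""
    return [case.to_text() for case in cases]
```

Keeping the disease type and prescription on the same record as the description text is one possible design; it lets the later matching step return them directly once the most similar case has been found.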
S103, extracting main characteristic information of each menstrual disorder case description text in the text description corpus by using the TextRank algorithm, and constructing a standard database;
Specifically, the main characteristic information of each menstrual disorder case description text is calculated by formula 1:
$$WS(V_i) = (1 - d) + d \sum_{V_j \in In(V_i)} \frac{w_{ji}}{\sum_{V_k \in Out(V_j)} w_{jk}} \, WS(V_j) \quad \text{(formula 1)}$$
wherein $d$ is the damping coefficient, generally set to 0.85; $V_i$ is any node in the graph; $In(V_i)$ is the set of vertices pointing to vertex $V_i$; $Out(V_j)$ is the set of vertices that vertex $V_j$ points to; $w_{ij}$ is the connection weight between vertices $V_i$ and $V_j$; and $WS(V_i)$ is the final ranking weight of vertex $V_i$.
1) Blood deficiency type irregular menstruation: delayed menstrual period, scanty flow, light red color, no clots, possibly with lower abdominal pain; or dizziness, dim eyesight, palpitation, insomnia, pale or sallow complexion, pale red tongue and weak pulse;
2) Blood cold type irregular menstruation: delayed menstrual period, scanty flow, dark red color or blood clots, cold pain relieved by warmth, aversion to cold with cold limbs, white tongue coating, and deep and tight pulse;
3) Blood heat type irregular menstruation: heavy flow, bright red or deep red color, thick and viscous quality, possibly with small blood clots, accompanied by vexation, thirst, yellow urine, constipation, red tongue, yellow coating, and slippery and rapid pulse; and so on.
TextRank is a graph-based ranking algorithm whose idea derives from Google's PageRank algorithm. A text is divided into constituent units (words or sentences), a graph model is built over them, and the important components of the text are ranked by a voting mechanism, so that keyword extraction can be performed using only the information in a single document.
TextRank applies the principle of voting: each word votes for its neighbors, and the weight of each vote depends on the number of votes the word itself has received. Treating each word as a vertex, all the words form a network in which each vertex has edges pointing to other vertices and edges from other vertices pointing to it. The weight of a vertex is obtained by summing the weighted contributions of the vertices that point to it.
A main issue in TextRank is the choice of initial values; for simplicity of the subsequent calculation, each vertex is assigned a non-zero initial value. A damping coefficient is also introduced, which represents the probability of moving from a given vertex to any other vertex.
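The voting scheme described above can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the patent's implementation: the text is assumed to be already segmented into a list of words, edges are undirected and weighted by co-occurrence counts within a sliding window, and the function name `textrank_keywords` and the parameters `window`, `iterations` and `top_k` are introduced here for illustration only.

```python
from collections import defaultdict


def textrank_keywords(words, window=2, d=0.85, iterations=50, top_k=5):
    """Rank the words of one tokenized text with a weighted, undirected TextRank."""
    # weights[a][b] = number of times a and b co-occur within the window.
    weights = defaultdict(lambda: defaultdict(float))
    for i, a in enumerate(words):
        for b in words[i + 1:i + 1 + window]:
            if a != b:
                weights[a][b] += 1.0
                weights[b][a] += 1.0

    # Every vertex starts from the same non-zero score.
    scores = {w: 1.0 for w in weights}
    # Total outgoing edge weight of each vertex (the inner sum of formula 1).
    out_sum = {w: sum(nbrs.values()) for w, nbrs in weights.items()}

    for _ in range(iterations):
        new_scores = {}
        for vi in weights:
            # In an undirected graph, In(V_i) and Out(V_i) are both the neighbours.
            vote = sum(weights[vj][vi] / out_sum[vj] * scores[vj]
                       for vj in weights[vi])
            new_scores[vi] = (1 - d) + d * vote
        scores = new_scores

    ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    return ranked[:top_k]
```

A fixed number of iterations is used here for simplicity; stopping once the scores change by less than a small tolerance would be an equally reasonable choice. For Chinese case descriptions, a word segmentation step would be needed before calling the function.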
And S104, extracting feature vectors corresponding to the main feature information in the standard database to obtain feature vector sets corresponding to the description text of each menstrual disorder case.
Specifically, the term frequency of each word in the menstrual disorder case description text is calculated by formula 2:
$$tf_{i,j} = \frac{n_{i,j}}{\sum_k n_{k,j}} \quad \text{(formula 2)}$$
wherein $n_{i,j}$ is the number of times the word $t_i$ appears in the menstrual disorder case description text $d_j$, and $\sum_k n_{k,j}$ is the total number of occurrences of all words in $d_j$;
the inverse document frequency is calculated by formula 3:
$$idf_i = \log \frac{|D|}{|\{j : t_i \in d_j\}|} \quad \text{(formula 3)}$$
wherein $|D|$ is the total number of menstrual disorder case description texts in the text description corpus, and $|\{j : t_i \in d_j\}|$ is the number of menstrual disorder case description texts containing the word $t_i$; if the word does not appear in the text description corpus, the denominator would be zero, so $1 + |\{j : t_i \in d_j\}|$ is typically used instead;
the TF-IDF value is calculated by formula 4:
$$\text{TF-IDF}_{i,j} = tf_{i,j} \times idf_i \quad \text{(formula 4)}$$
After the TF-IDF value of each word in the menstrual disorder case description text has been calculated, the words are sorted in descending order of TF-IDF value, the words whose TF-IDF values exceed a set threshold are selected as keywords, and a feature vector is constructed from the keywords and their TF-IDF values, so that a feature vector set corresponding to each menstrual disorder case description text is obtained.
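As an illustration of formulas 2 to 4 and the keyword selection, the following sketch builds one sparse feature vector per document. It assumes each document is already a list of words and uses the smoothed denominator mentioned above; the function name `tfidf_vectors` and the `threshold` parameter are hypothetical names chosen for this example.

```python
import math
from collections import Counter


def tfidf_vectors(docs, threshold=0.0):
    """docs: list of token lists. Returns one {word: tf-idf} dict per document."""
    n_docs = len(docs)
    # Document frequency: number of documents that contain each word.
    df = Counter(word for doc in docs for word in set(doc))
    vectors = []
    for doc in docs:
        counts = Counter(doc)
        total = sum(counts.values())
        vec = {}
        for word, n in counts.items():
            tf = n / total                            # formula 2
            idf = math.log(n_docs / (1 + df[word]))   # formula 3, smoothed denominator
            score = tf * idf                          # formula 4
            if score > threshold:                     # keep only keywords above the threshold
                vec[word] = score
        # Store the keywords in descending order of TF-IDF value.
        vectors.append(dict(sorted(vec.items(), key=lambda kv: kv[1], reverse=True)))
    return vectors
```

With the smoothed denominator, a word that occurs in all or all but one of the documents gets a non-positive IDF, so with the default threshold of 0 such words are dropped from the feature vector.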
S105, extracting feature vectors of the case description text to be typed, and respectively calculating the similarity between the feature vectors of the case description text to be typed and elements in the feature vector set by using cosine measurement.
Specifically, the similarity between the feature vector of the case description text to be typed and each element in the feature vector set is calculated by formula 5.
Let x and y be the two vectors to be compared, using the cosine metric as the similarity function:
$$sim(x, y) = \frac{x \cdot y}{\|x\| \, \|y\|} \tag{5}$$
where $\|x\|$ is the Euclidean norm of the vector $x = (x_1, x_2, x_3, \ldots, x_p)$, defined as $\sqrt{x_1^2 + x_2^2 + \cdots + x_p^2}$; conceptually, it is the length of the vector x;
The cosine of a 0-degree angle is 1, the cosine of any other angle is no greater than 1, and its minimum value is -1. The cosine of the angle between two vectors therefore indicates whether the two vectors point in approximately the same direction. When the two vectors have the same direction, the cosine similarity is 1; when the angle between them is 90 degrees, it is 0; when they point in diametrically opposite directions, it is -1. The result is independent of the lengths of the vectors and depends only on their directions. Cosine similarity is usually used in positive space, where it gives values between 0 and 1.
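The cosine metric of formula 5 can be sketched in Python as follows. The sketch assumes that feature vectors are stored sparsely as keyword-to-weight dictionaries; the function name `cosine_similarity` and the handling of an empty vector are hypothetical choices that the description does not specify.

```python
import math

def cosine_similarity(x, y):
    """Cosine of the angle between two sparse vectors stored as {keyword: weight}."""
    # Keywords missing from y contribute zero to the dot product.
    dot = sum(weight * y.get(word, 0.0) for word, weight in x.items())
    norm_x = math.sqrt(sum(v * v for v in x.values()))
    norm_y = math.sqrt(sum(v * v for v in y.values()))
    if norm_x == 0.0 or norm_y == 0.0:
        return 0.0  # assumption: an empty vector is treated as dissimilar to everything
    return dot / (norm_x * norm_y)

# Same direction, different length: similarity is 1 up to rounding.
same = cosine_similarity({"heat": 1.0, "red": 2.0}, {"heat": 2.0, "red": 4.0})
# No shared keywords: the vectors are orthogonal.
orthogonal = cosine_similarity({"heat": 1.0}, {"pale": 1.0})
```

Because TF-IDF weights above a non-negative threshold are positive, the similarities obtained this way fall between 0 and 1.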
S106, performing case matching and typing on the case description text to be typed according to the similarity to obtain the disease type and prescription information corresponding to the case description text to be typed.
Specifically, the disease types include the blood-heat type, the liver-depression-transforming-into-heat type, the qi deficiency type, and the blood deficiency type:
1) Blood-heat type. Symptoms: menoxenia; red, purple, or deep red menstrual blood of sticky and thick texture; vexation of the heart and chest; dry mouth and throat; flushed face; yellow urine and stool; red tongue with yellow tongue fur. Treatment: clear heat and cool the blood; pills, capsules, and the like may be taken.
2) Liver-depression-transforming-into-heat type. Symptoms: menoxenia; obstructed menstruation; distention and pain of the chest, hypochondrium, breasts, and lower abdomen; chest distress; irritability or frequent sighing; belching; anorexia; red or purple menstrual blood; red tongue edges; bitter taste; dry throat; thin, yellow tongue fur. Treatment: soothe the liver, relieve depression, and clear heat; this can be used for treating menoxenia, leukorrhagia, and other diseases.
3) Qi deficiency type. Symptoms: early or prolonged menstruation with multiple colors and thin, clear quality; listlessness; debilitation; palpitation; shortness of breath; loose stool; an empty sensation in the lower abdomen; pale tongue with thin coating. Treatment: invigorate qi and blood; a pill for invigorating the middle-jiao and qi or a pill for invigorating the spleen may be taken.
4) Blood deficiency type. Symptoms: delayed menstruation with hypofunction and thin quality; dizziness; palpitation; insomnia; dreaminess; sallow complexion; pale tongue with little coating. Treatment: replenish blood and qi to replenish the body fluid; tablets, FUNING pill, BAZHENYIMU pill, radix Angelicae sinensis blood replenishing paste, BABAKUNSHU pill, SHIZHENXIANGFU pill, ning Kun ZHIBAODAN, JIAWEIYIMU paste, FUYANGSHIWEI tablet, ANKUNZANYU pill, SHENRONGBAIFENG pill, and the like may be administered.
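The matching and typing of S106 can be sketched in Python as a nearest-neighbor lookup. This is a hedged illustration: the description does not state the decision rule, so returning the single most similar standard case is an assumption, and the names `type_case` and `standard_db`, the record fields, the weights, and the prescription labels are hypothetical placeholders rather than clinical data.

```python
import math

def cosine(x, y):
    """Cosine similarity of two sparse {keyword: weight} vectors."""
    dot = sum(v * y.get(k, 0.0) for k, v in x.items())
    nx = math.sqrt(sum(v * v for v in x.values()))
    ny = math.sqrt(sum(v * v for v in y.values()))
    return dot / (nx * ny) if nx and ny else 0.0

def type_case(query_vector, standard_db):
    """Return (record, similarity) for the standard case most similar to the query."""
    best = max(standard_db, key=lambda rec: cosine(query_vector, rec["vector"]))
    return best, cosine(query_vector, best["vector"])

# Hypothetical standard database records; all values are placeholders.
standard_db = [
    {"type": "blood-heat", "prescription": "prescription A",
     "vector": {"red": 0.4, "vexation": 0.3, "yellow": 0.2}},
    {"type": "qi deficiency", "prescription": "prescription B",
     "vector": {"pale": 0.4, "debilitation": 0.3, "palpitation": 0.2}},
]
query = {"red": 0.5, "yellow": 0.1, "debilitation": 0.1}
match, score = type_case(query, standard_db)
```

Here `match["type"]` and `match["prescription"]` play the role of the disease type and prescription information returned for the case description text to be typed. A minimum similarity threshold could be added to reject queries that match no standard case well.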
FIG. 2 is a flow chart of an embodiment of a menstrual disorder typing system according to the present invention; as shown in FIG. 2, the menstrual disorder typing system according to an embodiment of the present invention includes the following modules:
an acquisition module 10 for acquiring a number of menstrual disorder case descriptive texts;
a first construction module 20 for constructing a text description corpus based on the menstrual disorder case description text;
a first extracting module 30, configured to extract main feature information of each of the menstrual disorder case description texts in the text description corpus through a TextRank algorithm;
the first extraction module 30 is further configured to:
calculating the main feature information of each menstrual disorder case description text by formula 1:
$$WS(V_i) = (1 - d) + d \sum_{V_j \in In(V_i)} \frac{\omega_{ji}}{\sum_{V_k \in Out(V_j)} \omega_{jk}} WS(V_j) \tag{1}$$
where $d$ represents the damping coefficient, generally set to 0.85; $V_i$ represents any node in the graph; $In(V_i)$ represents the set of vertices pointing to vertex $V_i$; $Out(V_j)$ represents the set of vertices that vertex $V_j$ points to; $\omega_{ji}$ represents the connection weight between vertices $V_j$ and $V_i$; and $WS(V_i)$ represents the final ranking weight of vertex $V_i$.
A second construction module 40 for constructing a standard database;
a second extracting module 50, configured to extract feature vectors of the case description text to be typed;
the second extraction module 50 is further configured to:
calculating the term frequency of each word in a menstrual disorder case description text by formula 2:
$$tf_{i,j} = \frac{n_{i,j}}{\sum_{k} n_{k,j}} \tag{2}$$
where $n_{i,j}$ is the number of times the word $t_i$ appears in the menstrual disorder case description text $d_j$, and $\sum_{k} n_{k,j}$ is the total number of occurrences of all words in $d_j$;
calculating the inverse document frequency by formula 3:
$$idf_i = \log \frac{|D|}{|\{j : t_i \in d_j\}|} \tag{3}$$
where $|D|$ is the total number of menstrual disorder case description texts in the text description corpus, and $|\{j : t_i \in d_j\}|$ is the number of menstrual disorder case description texts containing the word $t_i$; if the word does not appear in the text description corpus, the denominator would be zero, so $1 + |\{j : t_i \in d_j\}|$ is typically used as the denominator;
calculating the TF-IDF value by formula 4:
$$\text{TF-IDF}_{i,j} = tf_{i,j} \times idf_i \tag{4}$$
after the TF-IDF value of each word in a menstrual disorder case description text is calculated, sorting the words in descending order of TF-IDF value, selecting the words whose TF-IDF values exceed a set threshold as keywords, and constructing a feature vector from the keywords and their TF-IDF values, so that the feature vector set corresponding to the menstrual disorder case description texts is obtained.
A similarity calculating module 60, configured to calculate similarities between feature vectors of the case description text to be typed and elements in the feature vector set using cosine metrics, respectively;
the similarity calculation module 60 further includes:
calculating the similarity between the feature vector of the case description text to be typed and each element in the feature vector set by formula 5:
$$sim(x, y) = \frac{x \cdot y}{\|x\| \, \|y\|} \tag{5}$$
where x and y are the two vectors to be compared, and $\|x\|$ is the Euclidean norm of the vector $x = (x_1, x_2, x_3, \ldots, x_p)$, defined as $\sqrt{x_1^2 + x_2^2 + \cdots + x_p^2}$; conceptually, it is the length of the vector x.
a typing module 70 for performing case matching and typing on the case description text to be typed according to the similarity to obtain the disease type and prescription information corresponding to the case description text to be typed.
In the menstrual disorder typing system, the acquisition module 10 collects a certain number of menstrual disorder case description texts; the first construction module 20 constructs a text description corpus based on the menstrual disorder case description texts; the first extraction module 30 extracts the main feature information of each menstrual disorder case description text in the text description corpus; the second construction module 40 constructs a standard database; the second extraction module 50 extracts the feature vector of the case description text to be typed; the similarity calculation module 60 uses the cosine metric to calculate the similarity between the feature vector of the case description text to be typed and each element in the feature vector set; and the typing module 70 performs case matching and typing on the case description text to be typed according to the similarity, so that the disease type and prescription information corresponding to the case description text to be typed are obtained. The standard database, which records the association between the different case typing standards and their corresponding main feature information, is constructed based on the current traditional Chinese medicine syndrome differentiation and typing standards for female menstrual disorder. At the same time, a vector space model is applied to feature extraction, which effectively addresses the problems of typing and feature extraction, greatly reduces the loss of important feature information, and greatly improves feature recognition and the accuracy of intelligent diagnosis.
FIG. 3 is a schematic diagram of the physical structure of an electronic device according to an embodiment of the present invention. As shown in FIG. 3, the electronic device 80 includes: a processor 801, a memory 802, and a bus 803;
the processor 801 and the memory 802 communicate with each other through the bus 803;
the processor 801 is configured to invoke program instructions in the memory 802 to perform the methods provided by the above-described method embodiments, including, for example: collecting a certain number of menstrual disorder case descriptive texts; constructing a text description corpus based on the menstrual disorder case description text; extracting main characteristic information of each menstrual disorder case description text in the text description corpus by using a TextRank algorithm, and constructing a standard database; extracting feature vectors corresponding to main feature information in the standard database to obtain feature vector sets corresponding to the description text of each menstrual disorder case; extracting feature vectors of the case description text to be typed, and respectively calculating the similarity between the feature vectors of the case description text to be typed and elements in the feature vector set by using cosine measurement; and performing case matching and typing on the text to be typed according to the similarity to obtain the disease type and prescription information corresponding to the case description text to be typed.
The present embodiment provides a non-transitory computer readable medium storing computer instructions that cause a computer to perform the methods provided by the above-described method embodiments, for example, including: collecting a certain number of menstrual disorder case descriptive texts; constructing a text description corpus based on the menstrual disorder case description text; extracting main characteristic information of each menstrual disorder case description text in the text description corpus by using a TextRank algorithm, and constructing a standard database; extracting feature vectors corresponding to main feature information in the standard database to obtain feature vector sets corresponding to the description text of each menstrual disorder case; extracting feature vectors of the case description text to be typed, and respectively calculating the similarity between the feature vectors of the case description text to be typed and elements in the feature vector set by using cosine measurement; and performing case matching and typing on the text to be typed according to the similarity to obtain the disease type and prescription information corresponding to the case description text to be typed.
Those of ordinary skill in the art will appreciate that: all or part of the steps for implementing the above method embodiments may be implemented by hardware associated with program instructions, where the foregoing program may be stored in a computer readable medium, and when executed, the program performs steps including the above method embodiments; and the aforementioned medium includes: various media that can store program code, such as ROM, RAM, magnetic or optical disks.
The apparatus embodiments described above are merely illustrative, wherein elements illustrated as separate elements may or may not be physically separate, and elements shown as elements may or may not be physical elements, may be located in one place, or may be distributed over a plurality of network elements. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment. Those of ordinary skill in the art will understand and implement the present invention without undue burden.
From the above description of the embodiments, it will be apparent to those skilled in the art that the embodiments may be implemented by means of software plus necessary general hardware platforms, or of course may be implemented by means of hardware. Based on this understanding, the foregoing technical solution may be embodied essentially or in a part contributing to the prior art in the form of a software product, which may be stored in a computer readable medium such as ROM/RAM, a magnetic disk, an optical disk, etc., including several instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) to execute the method of the respective embodiments or parts of the embodiments.
While the invention has been described in detail in the foregoing general description and specific examples, it will be apparent to those skilled in the art that modifications and improvements can be made thereto. Accordingly, such modifications or improvements may be made without departing from the spirit of the invention and are intended to be within the scope of the invention as claimed.
Claims (10)
1. A method for typing menstrual disorder, the method comprising:
collecting a certain number of menstrual disorder case descriptive texts;
constructing a text description corpus based on the menstrual disorder case description text;
extracting main characteristic information of each menstrual disorder case description text in the text description corpus by using a TextRank algorithm, and constructing a standard database;
extracting feature vectors corresponding to main feature information in the standard database to obtain feature vector sets corresponding to the description text of each menstrual disorder case;
extracting feature vectors of the case description text to be typed, and respectively calculating the similarity between the feature vectors of the case description text to be typed and elements in the feature vector set by using cosine measurement;
and performing case matching and typing on the text to be typed according to the similarity to obtain the disease type and prescription information corresponding to the case description text to be typed.
2. The menstrual disorder typing method according to claim 1, wherein the extracting of main feature information of each menstrual disorder case description text in the text description corpus by a TextRank algorithm and constructing a standard database comprises:
calculating the main feature information of each menstrual disorder case description text by formula 1:
$$WS(V_i) = (1 - d) + d \sum_{V_j \in In(V_i)} \frac{\omega_{ji}}{\sum_{V_k \in Out(V_j)} \omega_{jk}} WS(V_j) \tag{1}$$
where $d$ represents the damping coefficient, generally set to 0.85; $V_i$ represents any node in the graph; $In(V_i)$ represents the set of vertices pointing to vertex $V_i$; $Out(V_j)$ represents the set of vertices that vertex $V_j$ points to; $\omega_{ji}$ represents the connection weight between vertices $V_j$ and $V_i$; and $WS(V_i)$ represents the final ranking weight of vertex $V_i$.
3. The menstrual disorder typing method according to claim 1, wherein the extracting feature vectors corresponding to the main feature information in the standard database to obtain a feature vector set corresponding to each menstrual disorder case description text comprises:
calculating the term frequency of each word in a menstrual disorder case description text by formula 2:
$$tf_{i,j} = \frac{n_{i,j}}{\sum_{k} n_{k,j}} \tag{2}$$
where $n_{i,j}$ is the number of times the word $t_i$ appears in the menstrual disorder case description text $d_j$, and $\sum_{k} n_{k,j}$ is the total number of occurrences of all words in $d_j$;
calculating the inverse document frequency by formula 3:
$$idf_i = \log \frac{|D|}{|\{j : t_i \in d_j\}|} \tag{3}$$
where $|D|$ is the total number of menstrual disorder case description texts in the text description corpus, and $|\{j : t_i \in d_j\}|$ is the number of menstrual disorder case description texts containing the word $t_i$; if the word does not appear in the text description corpus, the denominator would be zero, so $1 + |\{j : t_i \in d_j\}|$ is typically used as the denominator;
calculating the TF-IDF value by formula 4:
$$\text{TF-IDF}_{i,j} = tf_{i,j} \times idf_i \tag{4}$$
after the TF-IDF value of each word in a menstrual disorder case description text is calculated, sorting the words in descending order of TF-IDF value, selecting the words whose TF-IDF values exceed a set threshold as keywords, and constructing a feature vector from the keywords and their TF-IDF values, so that the feature vector set corresponding to the menstrual disorder case description texts is obtained.
4. The method for typing menstrual disorder according to claim 1, wherein the extracting feature vectors of the case description text to be typed, calculating the similarity between the feature vectors of the case description text to be typed and elements in the feature vector set, respectively, using cosine measures, comprises:
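calculating the similarity between the feature vector of the case description text to be typed and each element in the feature vector set by formula 5:
$$sim(x, y) = \frac{x \cdot y}{\|x\| \, \|y\|} \tag{5}$$
where x and y are the two vectors to be compared, and $\|x\|$ is the Euclidean norm of the vector x.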
5. A menstrual disorder typing system, comprising:
the acquisition module is used for acquiring a certain number of menstrual disorder case description texts;
a first construction module for constructing a text description corpus based on the menstrual disorder case description text;
the first extraction module is used for extracting main characteristic information of each menstrual disorder case description text in the text description corpus through a TextRank algorithm;
the second construction module is used for constructing a standard database;
the second extraction module is used for extracting the feature vector of the case description text to be typed;
the similarity calculation module is used for calculating the similarity between the feature vector of the case description text to be typed and the elements in the feature vector set by using cosine measurement;
and the typing module is used for performing case matching and typing on the case description text to be typed according to the similarity to obtain the disease type and prescription information corresponding to the case description text to be typed.
6. The menstrual disorder typing system according to claim 5, wherein the first extraction module is further configured to:
calculating the main feature information of each menstrual disorder case description text by formula 1:
$$WS(V_i) = (1 - d) + d \sum_{V_j \in In(V_i)} \frac{\omega_{ji}}{\sum_{V_k \in Out(V_j)} \omega_{jk}} WS(V_j) \tag{1}$$
where $d$ represents the damping coefficient, generally set to 0.85; $V_i$ represents any node in the graph; $In(V_i)$ represents the set of vertices pointing to vertex $V_i$; $Out(V_j)$ represents the set of vertices that vertex $V_j$ points to; $\omega_{ji}$ represents the connection weight between vertices $V_j$ and $V_i$; and $WS(V_i)$ represents the final ranking weight of vertex $V_i$.
7. The menstrual disorder typing system according to claim 5, wherein the second extraction module is further configured to:
calculating the term frequency of each word in a menstrual disorder case description text by formula 2:
$$tf_{i,j} = \frac{n_{i,j}}{\sum_{k} n_{k,j}} \tag{2}$$
where $n_{i,j}$ is the number of times the word $t_i$ appears in the menstrual disorder case description text $d_j$, and $\sum_{k} n_{k,j}$ is the total number of occurrences of all words in $d_j$;
calculating the inverse document frequency by formula 3:
$$idf_i = \log \frac{|D|}{|\{j : t_i \in d_j\}|} \tag{3}$$
where $|D|$ is the total number of menstrual disorder case description texts in the text description corpus, and $|\{j : t_i \in d_j\}|$ is the number of menstrual disorder case description texts containing the word $t_i$; if the word does not appear in the text description corpus, the denominator would be zero, so $1 + |\{j : t_i \in d_j\}|$ is typically used as the denominator;
calculating the TF-IDF value by formula 4:
$$\text{TF-IDF}_{i,j} = tf_{i,j} \times idf_i \tag{4}$$
after the TF-IDF value of each word in a menstrual disorder case description text is calculated, sorting the words in descending order of TF-IDF value, selecting the words whose TF-IDF values exceed a set threshold as keywords, and constructing a feature vector from the keywords and their TF-IDF values, so that the feature vector set corresponding to the menstrual disorder case description texts is obtained.
8. The menstrual disorder typing system according to claim 5, wherein the similarity calculation module further comprises:
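calculating the similarity between the feature vector of the case description text to be typed and each element in the feature vector set by formula 5:
$$sim(x, y) = \frac{x \cdot y}{\|x\| \, \|y\|} \tag{5}$$
where x and y are the two vectors to be compared, and $\|x\|$ is the Euclidean norm of the vector x.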
9. An electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, characterized in that the processor implements the steps of the method according to any one of claims 1 to 4 when the computer program is executed.
10. A non-transitory computer readable medium, on which a computer program is stored, characterized in that the computer program, when being executed by a processor, implements the steps of the method according to any of claims 1 to 4.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310168401.7A CN116186262A (en) | 2023-02-27 | 2023-02-27 | Menstrual disorder typing system, menstrual disorder typing method, electronic device, and recording medium |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310168401.7A CN116186262A (en) | 2023-02-27 | 2023-02-27 | Menstrual disorder typing system, menstrual disorder typing method, electronic device, and recording medium |
Publications (1)
Publication Number | Publication Date |
---|---|
CN116186262A (en) | 2023-05-30 |
Family
ID=86452006
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310168401.7A Pending CN116186262A (en) | 2023-02-27 | 2023-02-27 | Menstrual disorder typing system, menstrual disorder typing method, electronic device, and recording medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN116186262A (en) |
2023
- 2023-02-27 CN CN202310168401.7A patent/CN116186262A/en active Pending
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Hassantabar et al. | CovidDeep: SARS-CoV-2/COVID-19 test based on wearable medical sensors and efficient neural networks | |
CN110929511B (en) | Intelligent matching method for personalized traditional Chinese medicine diagnosis and treatment information and traditional Chinese medicine information based on semantic similarity | |
Lin et al. | Nonparametric estimation of the gap time distribution for serial events with censored data | |
CN109102899A (en) | Chinese medicine intelligent assistance system and method based on machine learning and big data | |
CN110246577B (en) | Method for assisting gestational diabetes genetic risk prediction based on artificial intelligence | |
CN109325942A (en) | Eye fundus image Structural Techniques based on full convolutional neural networks | |
CN110335684A (en) | The intelligent dialectical aid decision-making method of Chinese medicine based on topic model technology | |
CN104915561A (en) | Intelligent disease attribute matching method | |
CN111985246B (en) | Disease cognitive system based on main symptoms and accompanying symptom words | |
CN111563891B (en) | Disease prediction system based on color cognition | |
CN112289441B (en) | Medical biological feature information matching system based on multiple modes | |
CN110348019A (en) | A kind of medical bodies vector method for transformation based on attention mechanism | |
Wu et al. | Diagnosis of sleep disorders in traditional Chinese medicine based on adaptive neuro-fuzzy inference system | |
CN109065174A (en) | Consider the case history theme acquisition methods and device of similar constraint | |
CN118335292A (en) | Interactive auxiliary system of special prescription for special diseases of traditional Chinese medicine | |
Tang et al. | Deep adaptation network for subject-specific sleep stage classification based on a single-lead ECG | |
CN112182168A (en) | Medical record text analysis method and device, electronic equipment and storage medium | |
CN113345574B (en) | Traditional Chinese medicine stomachache health preserving scheme obtaining device based on BERT language model and CNN model | |
CN112002419B (en) | Disease auxiliary diagnosis system, equipment and storage medium based on clustering | |
CN112259220B (en) | System, equipment and storage medium for predicting diseases based on nasal bleeding accompanying symptoms | |
CN116913475A (en) | Traditional Chinese medicine curative effect evaluation system and method for gout | |
CN116186262A (en) | Menstrual disorder typing system, menstrual disorder typing method, electronic device, and recording medium | |
Hui et al. | Extraction and classification of tcm medical records based on bert and bi-lstm with attention mechanism | |
CN116501837A (en) | Retrieval method, system, equipment and storage medium based on double-tower recall | |
Karthik et al. | Virtual doctor: an artificial medical diagnostic system based on hard and soft inputs |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||