CN113707330A - Mongolian medicine syndrome differentiation model construction method, system and method - Google Patents

Info

Publication number
CN113707330A
Authority
CN
China
Prior art keywords
sample
correlation
label
neighborhood
medical record
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202110872486.8A
Other languages
Chinese (zh)
Other versions
CN113707330B (en
Inventor
陈永波
刘勇国
张云
朱嘉静
杨尚明
李巧勤
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
University of Electronic Science and Technology of China
Original Assignee
University of Electronic Science and Technology of China
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by University of Electronic Science and Technology of China filed Critical University of Electronic Science and Technology of China
Priority to CN202110872486.8A priority Critical patent/CN113707330B/en
Publication of CN113707330A publication Critical patent/CN113707330A/en
Application granted granted Critical
Publication of CN113707330B publication Critical patent/CN113707330B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G: PHYSICS
    • G16: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H: HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H50/00: ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
    • G16H50/70: ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for mining of medical data, e.g. analysing previous cases of other patients
    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02A: TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
    • Y02A90/00: Technologies having an indirect contribution to adaptation to climate change
    • Y02A90/10: Information and communication technologies [ICT] supporting adaptation to climate change, e.g. for weather forecasting or climate simulation

Landscapes

  • Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Health & Medical Sciences (AREA)
  • Medical Informatics (AREA)
  • Public Health (AREA)
  • Biomedical Technology (AREA)
  • Databases & Information Systems (AREA)
  • Pathology (AREA)
  • Epidemiology (AREA)
  • General Health & Medical Sciences (AREA)
  • Primary Health Care (AREA)
  • Medical Treatment And Welfare Office Work (AREA)

Abstract

The invention discloses a construction method for a Mongolian medicine syndrome differentiation model, together with a syndrome differentiation system and method. The construction method comprises the following steps: S1, medical record data preprocessing: the different symptoms in the medical record set are acquired and expressed as a symptom set F, the different syndromes in the medical record set are acquired and expressed as a syndrome set Y, and each medical record in the medical record set is a sample; S2, constructing a neighborhood feature-label correlation calculation model of the sample; S3, constructing a label correlation calculation model of the sample; S4, constructing a calculation model of the interaction coefficient; and S5, constructing a gravity calculation model between the samples in the medical record set based on the interaction coefficient calculation model obtained in step S4 and the heterogeneous overlapping Euclidean metric distance between the samples, calculating positive and negative discrimination scores, and performing label prediction. The invention considers not only the correlation between symptoms and syndromes but also the correlation among syndromes, which improves the accuracy of syndrome differentiation results and provides decision support for doctors during diagnosis and treatment.

Description

Mongolian medicine syndrome differentiation model construction method, system and method
Technical Field
The invention relates to the technical field of medical information management, in particular to a construction method, a system and a method of a Mongolian medicine syndrome differentiation model.
Background
Mongolian medicine is an important component of national medicine, with a history of more than 2,000 years, a unique theoretical system and good clinical efficacy. Syndrome differentiation is the basis of Mongolian medicine diagnosis and treatment: the pathology, location and nature of a disease are analyzed through the three diagnostic methods of inspection, inquiry and palpation in order to determine the syndrome (label) of the disease.
In the traditional Mongolian medicine syndrome differentiation process, a doctor obtains the patient's symptoms and physical signs through observation and the patient's own account, and then, drawing on personal knowledge and experience, analyzes them together with factors such as the patient's diet, daily life and temperament to reach a syndrome differentiation result. The result is therefore subjective and depends to some extent on the doctor's personal experience and level of knowledge.
At present, there are two types of methods for objectifying Mongolian medicine syndrome differentiation. The first takes a holistic view and studies syndrome diagnosis quantitatively with rating scales, which helps make clinical syndrome diagnosis objective and standardized; however, most of these methods rely on expert consensus and remain subjective. The second takes a local view and analyzes, for a specific disease, the relationship between indices (also called features) and syndrome types (also called labels).
Most existing studies of Mongolian medicine syndrome differentiation explore only the relationship between symptoms and syndromes and rarely address the relationship among syndromes. Consider a disease in which two or three syndromes are present at the same time, such as 'Mengkri' disease, which comprises three syndromes: the Heryi preponderance type, the Hira preponderance type and the Ba Da gan preponderance type. If only the relationship between these three syndromes and routine blood indices is considered, and the relationship among the three syndromes themselves is ignored, the accuracy of the syndrome differentiation result is poor.
Disclosure of Invention
The invention aims to provide a construction method and a system of a Mongolian medicine syndrome differentiation model.
The invention is realized by the following technical scheme:
A construction method of a Mongolian medicine syndrome differentiation model comprises the following steps:
S1, medical record data preprocessing:
acquiring the different symptoms in a medical record set and expressing them as a symptom set F, and acquiring the different syndromes in the medical record set and expressing them as a syndrome set Y, wherein each medical record in the medical record set is a sample, the symptoms and syndromes in each sample are coded with 0-1 coding, the symptoms in a sample are its features, and the syndromes are its labels;
S2, constructing a neighborhood feature-label correlation calculation model of the sample:
constructing a calculation model of the correlation between each feature and the label set based on the mutual information correlation between the features and the labels in the neighborhood of the sample and the average precision correlation between the features and the labels in the neighborhood of the sample; summing the correlations of all features with the label set to obtain the neighborhood feature-label correlation calculation model of the sample;
S3, constructing a label correlation calculation model of the sample:
constructing, for each pair of labels, a correlation calculation model based on the number of samples in the neighborhood set of the sample that have both labels, and taking the maximum of these pairwise correlations as the label correlation of the sample;
S4, constructing a calculation model of the interaction coefficient based on the neighborhood feature-label correlation calculation model of the sample obtained in step S2, the pairwise label correlation calculation model obtained in step S3, and a balance parameter of the two correlations;
S5, based on the interaction coefficient calculation model obtained in step S4 and the heterogeneous overlapping Euclidean metric distance between the samples, constructing a gravity calculation model between the samples in the medical record set, calculating the gravity based on this model and summing it, constructing a positive discrimination score calculation model and a negative discrimination score calculation model respectively, and judging whether a sample belongs to a given syndrome in the syndrome set Y by comparing the values calculated by the two models.
The invention objectifies the symptoms and syndromes of the original medical records through 0-1 coding, expressing them in a format convenient for computer processing. It uses mutual information, average precision and probability to measure the feature-label correlation and the label correlation of each sample, and from these calculates the sample interaction coefficient. A gravity formula between samples is then constructed, and the positive and negative discrimination scores are calculated and compared to obtain the final syndrome differentiation result.
The construction method of the invention fully considers the correlation between symptoms and syndromes and the correlation between syndromes, can highlight the effect of symptoms which have great influence on the syndrome differentiation result, is beneficial to improving the accuracy of the syndrome differentiation result, has interpretability of the result, and can provide auxiliary decision for traditional Chinese medicine syndrome differentiation.
Further, in step S2, the calculation model of the correlation between each feature and the label set is shown as follows:
Figure BDA0003189268850000021
where $MIR_h(N_i)$ denotes the mutual information correlation between feature $f_h$ and the labels in the neighborhood of sample i, $APR_h(N_i)$ denotes the average precision correlation between feature $f_h$ and the labels in the neighborhood of sample i, and $N_i$ denotes the neighborhood of sample i.
Further, in step S4, the calculation model of the interaction coefficient is shown as follows:
$IC_i = \alpha M_i + (1-\alpha) R_i$
where $\alpha$ denotes the balance parameter of the two correlations, $M_i$ denotes the neighborhood feature-label correlation of sample i, and $R_i$ denotes the label correlation of sample i.
Further, in step S5,
the gravity calculation model is shown as follows:
Figure BDA0003189268850000031
where $IC_j$ is the interaction coefficient of sample j, which acts as the gravitational coefficient between sample j and other samples, and $d_F(i, j)$ is the heterogeneous overlapping Euclidean metric distance between sample i and sample j.
Further, in step S1, the 0-1 coding specifically comprises:
in each medical record, a symptom of the symptom set F that is present is coded as 1 and a symptom that is absent is coded as 0; likewise, a syndrome of the syndrome set Y that is present is coded as 1 and a syndrome that is absent is coded as 0.
A Mongolian medicine syndrome differentiation method comprises the following steps:
S1, medical record data preprocessing:
acquiring the different symptoms in a medical record set and expressing them as a symptom set F, and acquiring the different syndromes in the medical record set and expressing them as a syndrome set Y, wherein each medical record in the medical record set is a sample, the symptoms and syndromes in each sample are coded with 0-1 coding, the symptoms in a sample are its features, and the syndromes are its labels;
S2, feature-label correlation analysis:
calculating the correlation between each feature and the label set based on the mutual information correlation between the features and the labels in the neighborhood of the sample and the average precision correlation between the features and the labels in the neighborhood of the sample, and summing the correlations of all features with the label set to obtain the neighborhood feature-label correlation of the sample;
S3, analyzing the label correlation of the sample:
calculating the correlation of each pair of labels based on the number of samples in the neighborhood set of the sample that have both labels, and taking the maximum of these pairwise correlations as the label correlation of the sample;
S4, calculation of the interaction coefficient:
calculating the interaction coefficient of the sample based on the neighborhood feature-label correlation of the sample obtained in step S2, the label correlation of the sample obtained in step S3, and a balance parameter of the two correlations;
S5, label prediction:
based on the interaction coefficient obtained in step S4 and the heterogeneous overlapping Euclidean metric distance between the samples, calculating the gravitational force between the samples in the medical record set; calculating the gravitational force of each sample in the nearest neighbor set of the sample that has the label, and summing to obtain the positive discrimination score of the sample for the label; calculating the gravitational force of each sample in the nearest neighbor set of the sample that does not have the label, and summing to obtain the negative discrimination score of the sample for the label; and comparing the positive discrimination score with the negative discrimination score to judge whether the sample belongs to a given syndrome in the syndrome set Y.
Further, the positive discrimination score and the negative discrimination score are compared, if the positive discrimination score is larger than the negative discrimination score, the syndrome exists in the sample, and if the positive discrimination score is smaller than the negative discrimination score, the syndrome does not exist in the sample.
A Mongolian medicine syndrome differentiation system, comprising:
a data preprocessing module, used for acquiring the different symptoms in a medical record set and expressing them as a symptom set F, and acquiring the different syndromes in the medical record set and expressing them as a syndrome set Y, wherein each medical record in the medical record set is a sample, and the symptoms and syndromes in each sample are coded with 0-1 coding;
a feature-label correlation analysis module, used for calculating the correlation between each feature and the label set based on the mutual information correlation between the features and the labels in the neighborhood of a sample and the average precision correlation between the features and the labels in the neighborhood of the sample; summing the correlations of all features with the label set to obtain the neighborhood feature-label correlation of the sample; and constructing the correlation matrix of neighborhood features and labels from the neighborhood feature-label correlation of each sample;
a label correlation analysis module, used for calculating the correlation of each pair of labels based on the number of samples in the neighborhood set of the sample that have both labels, taking the maximum of these pairwise correlations as the label correlation of the sample, and constructing the label correlation matrix from the label correlation of each sample;
an interaction coefficient calculation module, used for calculating the interaction coefficient of the sample from the neighborhood feature-label correlation calculated by the feature-label correlation analysis module, the label correlation calculated by the label correlation analysis module, and the balance parameter of the two correlations;
a label prediction module, used for obtaining the interaction coefficient calculated by the interaction coefficient calculation module and calculating the gravitational force between the samples in the medical record set in combination with the heterogeneous overlapping Euclidean metric distance between the samples; calculating the gravitational force of each sample in the nearest neighbor set of the sample that has the label, and summing to obtain the positive discrimination score of the sample for the label; calculating the gravitational force of each sample in the nearest neighbor set that does not have the label, and summing to obtain the negative discrimination score of the sample for the label; and comparing the positive discrimination score with the negative discrimination score to judge whether the sample belongs to a given syndrome in the syndrome set Y.
Compared with the prior art, the invention has the following advantages and beneficial effects:
the syndrome differentiation method of the invention fully considers the correlation between symptoms and syndromes and the correlation between syndromes, can highlight the effect of symptoms which have great influence on the syndrome differentiation result, is beneficial to improving the accuracy of the syndrome differentiation result, has interpretability of the result, and can provide auxiliary decision for traditional Chinese medicine syndrome differentiation.
Drawings
The accompanying drawings, which are included to provide a further understanding of the embodiments of the invention and are incorporated in and constitute a part of this application, illustrate embodiment(s) of the invention and together with the description serve to explain the principles of the invention. In the drawings:
FIG. 1 is a block flow diagram of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is further described in detail below with reference to examples and accompanying drawings, and the exemplary embodiments and descriptions thereof are only used for explaining the present invention and are not meant to limit the present invention.
Example 1:
This example takes "gastric Hiragana" disease as its subject.
As shown in fig. 1, a Mongolian medicine syndrome differentiation method includes the following steps:
s1, medical record data preprocessing:
Mongolian medicine diagnosis mainly involves three syndromes, Ba Da gan, Xila and He Yi; this embodiment targets diseases in which two or three of these syndromes are present at the same time.
All the different symptoms in the medical record set are expressed as a symptom set $F = [f_1, \dots, f_k, \dots, f_d]$, where $f_k$ denotes the k-th symptom in the symptom set and d denotes the number of different symptoms; all the different syndromes are expressed as the syndrome set $Y = \{y_1, y_2, y_3\}$.
All medical records are expressed as the set $X = [X_1, \dots, X_i, \dots, X_n]$, where $X_i$ denotes the i-th medical record in the set, $X_i = \{x_{i1}, \dots, x_{ik}, \dots, x_{id}\}$, and $x_{ik}$ denotes the k-th feature of the i-th medical record; the syndromes of $X_i$ are represented by the syndrome vector $Y_i = [y_{i1}, y_{i2}, y_{i3}]$. The symptoms and syndromes in each medical record are coded with 0-1 coding: $x_{ik} = 0$ means that symptom $f_k$ is absent from the record and $x_{ik} = 1$ means that symptom $f_k$ is present; if $X_i$ has the syndrome $y_j$ then $y_{ij} = 1$, otherwise $y_{ij} = 0$, where $j \in \{1, 2, 3\}$.
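The 0-1 coding of step S1 can be sketched in a few lines of Python. This is a minimal illustration, not the patent's implementation: the function name `encode_records`, the input layout (one pair of sets per medical record) and the symptom names are hypothetical, and sorting is used only to give F and Y a fixed order.

```python
def encode_records(records):
    """Encode medical records as 0-1 symptom and syndrome vectors.

    `records` is a list of (symptoms, syndromes) pairs, each a set of strings.
    This input layout is an assumption made for the example.
    """
    # Symptom set F and syndrome set Y: all distinct values, in a fixed order.
    F = sorted({s for symptoms, _ in records for s in symptoms})
    Y = sorted({y for _, syndromes in records for y in syndromes})
    # x_ik = 1 if symptom f_k is present in record i, otherwise 0.
    X = [[1 if f in symptoms else 0 for f in F] for symptoms, _ in records]
    # y_ij = 1 if record i has syndrome y_j, otherwise 0.
    labels = [[1 if y in syndromes else 0 for y in Y] for _, syndromes in records]
    return F, Y, X, labels


# Hypothetical symptom names; the syndrome names are those of this embodiment.
records = [
    ({"nausea", "bloating"}, {"Ba Da gan"}),
    ({"bloating", "bitter taste"}, {"Xila", "He Yi"}),
]
F, Y, X, labels = encode_records(records)
```

Each row of `X` is one sample $X_i$ and each row of `labels` is the corresponding syndrome vector $Y_i$.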
S2, feature-tag correlation analysis:
In the Mongolian medicine syndrome differentiation process, different features influence the syndrome differentiation result to different degrees, i.e. the correlation between features and labels differs from feature to feature. The scheme constructs a feature-label correlation matrix M to measure the correlation between the features and the labels.
In the neighborhood of sample i, the correlation between feature $f_h$ and the label set is expressed as
Figure BDA0003189268850000051
The calculation is as follows:
Figure BDA0003189268850000052
where $MIR_h(N_i)$ denotes the mutual information correlation between feature $f_h$ and the labels in the neighborhood of sample i, and $APR_h(N_i)$ denotes the average precision correlation between feature $f_h$ and the labels in the neighborhood of sample i. $N_i$ denotes the neighborhood of sample i: the distances between sample i and the other samples in the training set are calculated with the heterogeneous overlapping Euclidean metric and sorted in ascending order, and the first k samples are selected, giving the neighborhood $N_i = (i_1, i_2, \dots, i_k)$, where k denotes the number of neighbors of sample i; the invention takes the empirical value k = 10. The heterogeneous overlapping Euclidean metric distance between sample i and sample j is defined as:
Figure BDA0003189268850000053
where F is the feature set and $x_{if}$ is the f-th feature of sample i.
For discrete feature values:
Figure BDA0003189268850000054
For continuous feature values:
Figure BDA0003189268850000061
where $|\cdot|$ denotes the absolute value, and max(f) and min(f) are the maximum and minimum values of feature f, respectively.
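The neighborhood construction above can be sketched as follows. Because the formula images are not reproduced in this text, the sketch assumes the widely used form of the heterogeneous Euclidean-overlap metric: the per-feature difference is 0 or 1 for discrete values (equal or not), the range-normalized absolute difference for continuous values, and the distance is the square root of the sum of squared per-feature differences. The function names and the `ranges` argument are hypothetical.

```python
import math


def heom_distance(x_i, x_j, ranges=None):
    """Heterogeneous overlapping Euclidean distance between two samples.

    `ranges` maps the index of each continuous feature to (min, max);
    features not listed are treated as discrete. Assumed form, see lead-in.
    """
    ranges = ranges or {}
    total = 0.0
    for f, (a, b) in enumerate(zip(x_i, x_j)):
        if f in ranges:
            lo, hi = ranges[f]
            # Continuous feature: |a - b| / (max(f) - min(f)).
            delta = abs(a - b) / (hi - lo) if hi > lo else 0.0
        else:
            # Discrete feature: overlap difference, 0 if equal and 1 otherwise.
            delta = 0.0 if a == b else 1.0
        total += delta ** 2
    return math.sqrt(total)


def neighborhood(i, X, k=10, ranges=None):
    """Indices of the k samples closest to sample i (sample i itself excluded)."""
    dists = [(heom_distance(X[i], X[j], ranges), j)
             for j in range(len(X)) if j != i]
    dists.sort()  # ascending distance, ties broken by sample index
    return [j for _, j in dists[:k]]
```

With 0-1 coded symptoms every feature is discrete, so the distance reduces to the square root of the number of symptoms on which two medical records differ.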
$MIR_h(N_i)$ in formula (1) is calculated as follows:
Figure BDA0003189268850000062
where g is the number of different labels in the label set (3 in this embodiment), $p(y_j)$ is the probability that label $y_j$ is present in the neighborhood data set, and $NMI(f_h, y_j)$ is the normalized mutual information of feature $f_h$ and label $y_j$, calculated as follows:
Figure BDA0003189268850000063
where $H(f_h)$ and $H(y_j)$ denote the information entropy of feature $f_h$ and of label $y_j$ respectively, and $MI(f_h; y_j)$ denotes the mutual information of feature $f_h$ and label $y_j$. $H(f_h)$, $H(y_j)$ and $MI(f_h; y_j)$ are calculated respectively as follows:
Figure BDA0003189268850000064
Figure BDA0003189268850000065
Figure BDA0003189268850000066
where t is the number of different values of feature $f_h$ (2 in this case), $p(f_{hq}, y_j)$ is the probability that value q of feature $f_h$ and label $y_j$ occur together, and $p(f_{hq} \mid y_j)$ is the probability that value q of feature $f_h$ occurs given that label $y_j$ is present.
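The entropy and mutual information terms can be estimated from frequency counts over the neighborhood, as in the sketch below. It uses the textbook definitions with base-2 logarithms, which is an assumption since the formula images are not reproduced; the joint-probability form used here is equivalent to the form written with the conditional probability $p(f_{hq} \mid y_j)$. The normalization that turns MI into NMI is not recoverable from the text and is deliberately left out.

```python
import math
from collections import Counter


def entropy(values):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in Counter(values).values())


def mutual_information(feature, label):
    """Mutual information (in bits) between two equally long 0-1 sequences.

    Uses sum over (q, y) of p(q, y) * log2(p(q, y) / (p(q) * p(y))),
    with all probabilities estimated by relative frequencies.
    """
    n = len(feature)
    joint = Counter(zip(feature, label))
    p_f = Counter(feature)
    p_y = Counter(label)
    mi = 0.0
    for (q, y), c in joint.items():
        p_qy = c / n
        mi += p_qy * math.log2(p_qy / ((p_f[q] / n) * (p_y[y] / n)))
    return mi
```

Here `feature` would hold the values of $f_h$ and `label` the values of $y_j$ for the samples in $N_i$; a feature that always co-occurs with a label gives the maximum value, and an independent one gives 0.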
$APR_h(N_i)$ in formula (1) is calculated as follows: with average precision as the evaluation index, feature h of the samples in the neighborhood and the corresponding label set form a new classification data set; five-fold cross validation is carried out with the multi-label K-nearest neighbor algorithm (ML-KNN), and the result is recorded as $APR_h(N_i)$.
The neighborhood feature-label correlation for sample i is calculated as follows:
Figure BDA0003189268850000067
The correlation matrix $M = [M_1, M_2, \dots, M_n]$ of neighborhood features and labels for all samples can be obtained according to formulas (1)-(10), where $M_i$ denotes the neighborhood feature-label correlation of sample i.
S3, analyzing label correlation of samples
The labels of a sample are correlated with one another, and exploiting this correlation helps improve the accuracy of the syndrome differentiation result. The present scheme uses the label correlation matrix R to measure the label correlation of each sample.
The correlation $r_{i1}$ between label $y_1$ and label $y_2$ of sample i is calculated as follows:
Figure BDA0003189268850000071
where $\mathrm{count}(y_1 = 1, y_2 = 1, N_i)$ denotes the number of samples in the neighborhood set $N_i$ of sample i that have both label $y_1$ and label $y_2$.
The correlation $r_{i2}$ between label $y_2$ and label $y_3$ of sample i is calculated as follows:
Figure BDA0003189268850000072
where $\mathrm{count}(y_2 = 1, y_3 = 1, N_i)$ denotes the number of samples in the neighborhood set $N_i$ of sample i that have both label $y_2$ and label $y_3$.
The correlation $r_{i3}$ between label $y_1$ and label $y_3$ of sample i is calculated as follows:
Figure BDA0003189268850000073
where $\mathrm{count}(y_1 = 1, y_3 = 1, N_i)$ denotes the number of samples in the neighborhood set $N_i$ of sample i that have both label $y_1$ and label $y_3$.
The label correlation $R_i$ of sample i is calculated as follows:
$R_i = \max(r_{i1}, r_{i2}, r_{i3})$ (14)
where $\max(r_{i1}, r_{i2}, r_{i3})$ denotes the maximum of $r_{i1}$, $r_{i2}$ and $r_{i3}$.
The label correlation matrix $R = [R_1, R_2, \dots, R_n]$ of all samples can be obtained according to formulas (11)-(14), where $R_i$ denotes the label correlation of sample i.
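The label correlation of a sample can be sketched as below. The co-occurrence counts follow the text, but the denominator of formulas (11)-(13) is not recoverable from it, so dividing by the neighborhood size is purely an assumption of this example; the function name and input layout are hypothetical as well.

```python
from itertools import combinations


def label_correlation(neighbor_labels):
    """Label correlation R_i from the 0-1 label vectors of the samples in N_i.

    Returns (R_i, pairwise), where pairwise[(a, b)] is the correlation of
    labels a and b. Normalizing by the neighborhood size is an assumption.
    """
    k = len(neighbor_labels)
    g = len(neighbor_labels[0])
    pairwise = {}
    for a, b in combinations(range(g), 2):
        # count(y_a = 1, y_b = 1, N_i): neighbors that have both labels.
        both = sum(1 for y in neighbor_labels if y[a] == 1 and y[b] == 1)
        pairwise[(a, b)] = both / k
    # R_i is the maximum of the pairwise correlations, as in formula (14).
    return max(pairwise.values()), pairwise
```

For the three syndromes of this embodiment the dictionary holds three pairs, corresponding to $r_{i1}$, $r_{i2}$ and $r_{i3}$.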
S4, calculation of interaction coefficient
The interaction coefficient $IC_i$ of a sample is designed to measure the magnitude of the interaction between sample i and other samples, and is calculated as follows:
$IC_i = \alpha M_i + (1-\alpha) R_i$ (15)
where $\alpha$ denotes the balance parameter of the two correlations, which the present invention sets to 0.5; $M_i$ denotes the neighborhood feature-label correlation of sample i, calculated as shown in formula (10); and $R_i$ denotes the label correlation of sample i, calculated as shown in formula (14).
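Formula (15) is a convex combination of the two correlations and translates directly into code. The function name and the range check on the balance parameter are additions made for this sketch.

```python
def interaction_coefficient(M_i, R_i, alpha=0.5):
    """IC_i = alpha * M_i + (1 - alpha) * R_i, as in formula (15)."""
    # Guard added for the example: alpha is a balance weight between 0 and 1.
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    return alpha * M_i + (1.0 - alpha) * R_i
```

With `alpha=1.0` only the feature-label correlation counts, and with `alpha=0.0` only the label correlation counts; the default of 0.5 is the value stated in the text.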
S5, label prediction
(1) Calculation of gravitational forces between samples
The invention calculates the gravitation between samples by using the idea of classical universal gravitation, and the formula of the gravitation between a sample j and a sample i is as follows:
Figure BDA0003189268850000074
(2) Generating a sample label
For each sample in the nearest neighbor set $N_i$ of sample i that has label y, the gravitational force is calculated, and the values are summed to obtain the positive discrimination score DS of sample i for label y, as follows:
Figure BDA0003189268850000075
For each sample in the nearest neighbor set $N_i$ of sample i that does not have label y, the gravitational force is calculated, and the values are summed to obtain the negative discrimination score DS' of sample i for label y, as follows:
Figure BDA0003189268850000081
DS and DS' are then compared to determine the syndrome differentiation result. If DS(i) > DS'(i), sample i belongs to the y-th label, i.e. sample i has the y-th syndrome. If DS(i) ≤ DS'(i), sample i does not belong to the y-th label, i.e. sample i does not have the y-th syndrome.
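The decision rule can be sketched as follows. The exact gravity formula is not reproduced in this text, so the sketch assumes an inverse-square form by analogy with classical gravitation, namely the interaction coefficient of neighbor j divided by the squared distance to it. The function name, the argument layout and the small constant `eps`, which avoids division by zero for identical records, are also assumptions.

```python
def predict_label(distances, ics, neighbor_has_label, eps=1e-9):
    """Decide whether sample i has label y from its nearest neighbors.

    distances[n]          distance d_F(i, j) to the n-th neighbor j
    ics[n]                interaction coefficient IC_j of that neighbor
    neighbor_has_label[n] True if that neighbor has label y
    Returns (has_label, DS, DS').
    """
    ds_pos = 0.0  # positive discrimination score DS
    ds_neg = 0.0  # negative discrimination score DS'
    for d, ic, has in zip(distances, ics, neighbor_has_label):
        force = ic / (d * d + eps)  # assumed inverse-square gravity
        if has:
            ds_pos += force
        else:
            ds_neg += force
    # Sample i has label y only if DS is strictly greater than DS'.
    return ds_pos > ds_neg, ds_pos, ds_neg
```

The rule is applied once per syndrome in Y, so a sample can be assigned two or three syndromes at the same time.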
Example 2:
A Mongolian medicine syndrome differentiation system, comprising:
a data preprocessing module, used for acquiring the different symptoms in a medical record set and expressing them as a symptom set F, and acquiring the different syndromes in the medical record set and expressing them as a syndrome set Y, wherein each medical record in the medical record set is a sample, and the symptoms and syndromes in each sample are coded with 0-1 coding;
a feature-label correlation analysis module, used for calculating the correlation between each feature and the label set based on the mutual information correlation between the features and the labels in the neighborhood of a sample and the average precision correlation between the features and the labels in the neighborhood of the sample; summing the correlations of all features with the label set to obtain the neighborhood feature-label correlation of the sample; and constructing the correlation matrix of neighborhood features and labels from the neighborhood feature-label correlation of each sample;
a label correlation analysis module, used for calculating the correlation of each pair of labels based on the number of samples in the neighborhood set of the sample that have both labels, taking the maximum of these pairwise correlations as the label correlation of the sample, and constructing the label correlation matrix from the label correlation of each sample;
an interaction coefficient calculation module, used for calculating the interaction coefficient of the sample from the neighborhood feature-label correlation calculated by the feature-label correlation analysis module, the label correlation calculated by the label correlation analysis module, and the balance parameter of the two correlations;
a label prediction module, used for obtaining the interaction coefficient calculated by the interaction coefficient calculation module and calculating the gravitational force between the samples in the medical record set in combination with the heterogeneous overlapping Euclidean metric distance between the samples; calculating the gravitational force of each sample in the nearest neighbor set of the sample that has the label, and summing to obtain the positive discrimination score of the sample for the label; calculating the gravitational force of each sample in the nearest neighbor set that does not have the label, and summing to obtain the negative discrimination score of the sample for the label; and comparing the positive discrimination score with the negative discrimination score to judge whether the sample belongs to a given syndrome in the syndrome set Y.
The above-mentioned embodiments are intended to illustrate the objects, technical solutions and advantages of the present invention in further detail, and it should be understood that the above-mentioned embodiments are merely exemplary embodiments of the present invention, and are not intended to limit the scope of the present invention, and any modifications, equivalent substitutions, improvements and the like made within the spirit and principle of the present invention should be included in the scope of the present invention.

Claims (8)

1. A construction method of a Mongolian medicine syndrome differentiation model, characterized by comprising the following steps:
S1, medical record data preprocessing:
acquiring the different symptoms in a medical record set and expressing them as a symptom set F, and acquiring the different syndromes in the medical record set and expressing them as a syndrome set Y, wherein each medical record in the medical record set is a sample, the symptoms and syndromes in each sample are coded with 0-1 coding, the symptoms in a sample are its features, and the syndromes are its labels;
S2, constructing a neighborhood feature-label correlation calculation model of the sample:
constructing a calculation model of the correlation between each feature and the label set based on the mutual information correlation between the features and the labels in the neighborhood of the sample and the average precision correlation between the features and the labels in the neighborhood of the sample; summing the correlations of all features with the label set to obtain the neighborhood feature-label correlation calculation model of the sample;
S3, constructing a label correlation calculation model of the sample:
constructing, for each pair of labels, a correlation calculation model based on the number of samples in the neighborhood set of the sample that have both labels, and taking the maximum of these pairwise correlations as the label correlation of the sample;
S4, constructing a calculation model of the interaction coefficient based on the neighborhood feature-label correlation calculation model of the sample obtained in step S2, the pairwise label correlation calculation model obtained in step S3, and a balance parameter of the two correlations;
S5, based on the interaction coefficient calculation model obtained in step S4 and the heterogeneous overlapping Euclidean metric distance between the samples, constructing a gravity calculation model between the samples in the medical record set, calculating the gravity based on this model and summing it, constructing a positive discrimination score calculation model and a negative discrimination score calculation model respectively, and judging whether a sample belongs to a given syndrome in the syndrome set Y by comparing the values calculated by the two models.
2. The method of claim 1, wherein in step S2, the calculation model of the correlation between each feature and the label set is represented by the following formula:
Figure FDA0003189268840000011
where MIR_h(N_i) represents the mutual information correlation between the feature f_h and the labels in the neighborhood of sample i; APR_h(N_i) represents the average precision correlation between the feature f_h and the labels in the neighborhood of sample i; and N_i represents the neighborhood of sample i.
3. The method of claim 1, wherein in step S4, the interaction coefficient is calculated as follows:
IC_i = α·M_i + (1 - α)·R_i
where α represents the balance parameter for the two correlations; M_i represents the neighborhood feature-label correlation of sample i; and R_i represents the label correlation of sample i.
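As a minimal sketch of the formula in claim 3, the interaction coefficient is a convex combination of the two correlations. The function name and the check that α lies in [0, 1] are assumptions made for illustration; the claim itself only calls α a balance parameter and does not state its range.

```python
def interaction_coefficient(m_i: float, r_i: float, alpha: float) -> float:
    """Compute IC_i = alpha * M_i + (1 - alpha) * R_i.

    m_i   -- neighborhood feature-label correlation of sample i (M_i)
    r_i   -- label correlation of sample i (R_i)
    alpha -- balance parameter; the [0, 1] range is an assumption
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha is assumed to lie in [0, 1]")
    return alpha * m_i + (1.0 - alpha) * r_i


# With alpha = 1 only M_i contributes; with alpha = 0 only R_i contributes.
ic_features_only = interaction_coefficient(0.8, 0.4, alpha=1.0)
ic_labels_only = interaction_coefficient(0.8, 0.4, alpha=0.0)
ic_balanced = interaction_coefficient(0.8, 0.4, alpha=0.5)
```

Under this reading, α close to 1 weights the feature-label correlation more heavily, and α close to 0 weights the label correlation more heavily.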
4. The method as claimed in claim 1, wherein in step S5,
the gravity calculation model is shown as follows:
Figure FDA0003189268840000021
where IC_j is the interaction coefficient between sample j and the other samples, and d_F(i, j) is the heterogeneous overlapping Euclidean metric distance between sample i and sample j.
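The distance d_F(i, j) can be sketched as follows, assuming that "heterogeneous overlapping Euclidean metric distance" refers to the commonly used heterogeneous Euclidean-overlap metric, in which a nominal attribute contributes 0 when two values match and 1 when they differ. With the 0-1 coded symptoms of step S1, this reduces to the square root of the number of differing symptoms. The function name is hypothetical, and the gravitational force formula itself is not reproduced here because it appears only as a figure.

```python
import math
from typing import Sequence


def overlap_euclidean_distance(x: Sequence[int], y: Sequence[int]) -> float:
    """Distance between two 0-1 coded samples.

    Assumption: every feature is nominal (0 or 1), so the per-feature
    overlap distance is 0 for equal values and 1 for different values,
    and the overall distance is the square root of the summed squares.
    """
    if len(x) != len(y):
        raise ValueError("samples must have the same number of features")
    mismatches = sum(1 for a, b in zip(x, y) if a != b)
    return math.sqrt(mismatches)


# Two samples that differ in two of four symptoms.
d = overlap_euclidean_distance([1, 0, 1, 0], [1, 1, 0, 0])
```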
5. The method for constructing a Mongolian medicine syndrome differentiation model according to any one of claims 1 to 4, wherein in step S1, the 0-1 coding is specifically:
for each medical record, a symptom in the symptom set F that is present in the record is coded as 1, and a symptom that is absent is coded as 0; a syndrome in the syndrome set Y that is present in the record is coded as 1, and a syndrome that is absent is coded as 0.
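The 0-1 coding of step S1 can be illustrated with a short sketch. The record layout (a pair of string sets per medical record), the function name, and the placeholder symptom and syndrome names are assumptions made for this example; sorting the sets is only a convenient way to fix a column order.

```python
from typing import List, Set, Tuple

Record = Tuple[Set[str], Set[str]]  # (symptoms, syndromes) of one medical record


def encode_records(records: List[Record]):
    """Build the symptom set F and syndrome set Y, then 0-1 code each record."""
    # Distinct symptoms and syndromes across all records, in a fixed order.
    symptom_set = sorted({s for symptoms, _ in records for s in symptoms})
    syndrome_set = sorted({y for _, syndromes in records for y in syndromes})
    # 1 if the symptom/syndrome is present in the record, otherwise 0.
    features = [[1 if s in symptoms else 0 for s in symptom_set]
                for symptoms, _ in records]
    labels = [[1 if y in syndromes else 0 for y in syndrome_set]
              for _, syndromes in records]
    return symptom_set, syndrome_set, features, labels


# Placeholder names; real records would use actual symptom and syndrome terms.
records = [
    ({"symptom_a", "symptom_b"}, {"syndrome_x"}),
    ({"symptom_b", "symptom_c"}, {"syndrome_x", "syndrome_y"}),
]
F, Y, features, labels = encode_records(records)
```

Each row of `features` is the feature vector of one sample and each row of `labels` is its label vector, which is the multi-label form used by the later steps.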
6. A Mongolian medicine syndrome differentiation method, characterized by comprising the following steps:
S1, medical record data preprocessing:
acquiring the distinct symptoms in a medical record set and representing them as a symptom set F, and acquiring the distinct syndromes in the medical record set and representing them as a syndrome set Y, wherein each medical record in the medical record set is a sample, the symptoms and syndromes in each sample are 0-1 coded, the symptoms of a sample are its features, and the syndromes are its labels;
S2, feature-label correlation analysis:
calculating the correlation between each feature and the label set based on the mutual information correlation between the features and the labels in the neighborhood of the sample and the average precision correlation between the features and the labels in the neighborhood of the sample, and summing the correlations of the features with the label set to obtain the neighborhood feature-label correlation of the sample;
S3, label correlation analysis of the sample:
calculating the correlation of each pair of labels of the sample based on the number of samples in the neighborhood set of the sample that carry two different labels, and taking the maximum of these pairwise correlations as the label correlation of the sample;
S4, calculation of the interaction coefficient:
calculating the interaction coefficient of the sample based on the neighborhood feature-label correlation of the sample obtained in step S2, the label correlation of the sample obtained in step S3, and a balance parameter for the two correlations;
S5, label prediction:
calculating the gravitational force between the samples in the medical record set based on the interaction coefficient obtained in step S4 and the heterogeneous overlapping Euclidean metric distance between the samples; calculating the magnitude of the gravitational force for each sample in the sample's nearest-neighbor set that belongs to the label, and summing these magnitudes to obtain the positive discrimination score of the sample for the label; calculating the magnitude of the gravitational force for each sample in the nearest-neighbor set that does not belong to the label, and summing these magnitudes to obtain the negative discrimination score of the sample for the label; and comparing the positive discrimination score with the negative discrimination score to determine whether the sample belongs to a given syndrome in the syndrome set Y.
7. The Mongolian medicine syndrome differentiation method according to claim 6, wherein,
in comparing the positive discrimination score with the negative discrimination score, the syndrome is present in the sample if the positive discrimination score is greater than the negative discrimination score, and the syndrome is absent from the sample if the positive discrimination score is less than the negative discrimination score.
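The decision rule of claims 6 and 7 can be sketched as below. The sketch takes the gravitational forces between the sample and its nearest neighbors as already computed inputs, since the force formula is given only as a figure. The function name is hypothetical, and treating a tie as "syndrome absent" is an assumption, because the claim does not cover equal scores.

```python
from typing import Sequence, Tuple


def predict_label(forces: Sequence[float],
                  neighbor_has_label: Sequence[bool]) -> Tuple[bool, float, float]:
    """Decide whether a sample carries one label (syndrome).

    forces             -- gravitational force for each nearest neighbor
    neighbor_has_label -- True where that neighbor belongs to the label
    Returns (prediction, positive_score, negative_score).
    """
    if len(forces) != len(neighbor_has_label):
        raise ValueError("one force value is required per neighbor")
    # Positive score: summed forces of neighbors that belong to the label.
    positive = sum(f for f, has in zip(forces, neighbor_has_label) if has)
    # Negative score: summed forces of neighbors that do not belong to it.
    negative = sum(f for f, has in zip(forces, neighbor_has_label) if not has)
    # Assumption: a tie is treated as "syndrome absent".
    return positive > negative, positive, negative


# Three neighbors, two of which carry the syndrome.
present, pos_score, neg_score = predict_label([0.5, 0.2, 0.1], [True, False, True])
```

Repeating this comparison for every syndrome in the syndrome set Y yields the full multi-label prediction for a sample.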
8. A Mongolian medicine syndrome differentiation system, comprising:
a data preprocessing module: configured to acquire the distinct symptoms in a medical record set and represent them as a symptom set F, and acquire the distinct syndromes in the medical record set and represent them as a syndrome set Y, wherein each medical record in the medical record set is a sample, and the symptoms and syndromes in each sample are 0-1 coded;
a feature-label correlation analysis module: configured to calculate the correlation between each feature and the label set based on the mutual information correlation between the features and the labels in the neighborhood of a sample and the average precision correlation between the features and the labels in the neighborhood of the sample; sum the correlations of the features with the label set to obtain the neighborhood feature-label correlation of the sample; and construct a neighborhood feature-label correlation matrix from the neighborhood feature-label correlation of each sample;
a label correlation analysis module of the sample: configured to calculate the correlation of each pair of labels of a sample based on the number of samples in the neighborhood set of the sample that carry two different labels, and take the maximum of these pairwise correlations as the label correlation of the sample; and construct a label correlation matrix from the label correlation of each sample;
an interaction coefficient calculation module: configured to calculate the interaction coefficient of the sample from the neighborhood feature-label correlation of the sample calculated by the feature-label correlation analysis module, the label correlation of the sample calculated by the label correlation analysis module, and a balance parameter for the two correlations;
a label prediction module: configured to obtain the interaction coefficient calculated by the interaction coefficient calculation module; calculate the gravitational force between samples in the medical record set by combining the interaction coefficient with the heterogeneous overlapping Euclidean metric distance between the samples; calculate the magnitude of the gravitational force for each sample in the sample's nearest-neighbor set that belongs to the label, and sum these magnitudes to obtain the positive discrimination score of the sample for the label; calculate the magnitude of the gravitational force for each sample in the nearest-neighbor set that does not belong to the label, and sum these magnitudes to obtain the negative discrimination score of the sample for the label; and compare the positive discrimination score with the negative discrimination score to determine whether the sample belongs to a given syndrome in the syndrome set Y.
CN202110872486.8A 2021-07-30 2021-07-30 Construction method of syndrome differentiation model of Mongolian medicine, syndrome differentiation system and method of Mongolian medicine Active CN113707330B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110872486.8A CN113707330B (en) 2021-07-30 2021-07-30 Construction method of syndrome differentiation model of Mongolian medicine, syndrome differentiation system and method of Mongolian medicine

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110872486.8A CN113707330B (en) 2021-07-30 2021-07-30 Construction method of syndrome differentiation model of Mongolian medicine, syndrome differentiation system and method of Mongolian medicine

Publications (2)

Publication Number Publication Date
CN113707330A (en) 2021-11-26
CN113707330B (en) 2023-04-28

Family

ID=78650981

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110872486.8A Active CN113707330B (en) 2021-07-30 2021-07-30 Construction method of syndrome differentiation model of Mongolian medicine, syndrome differentiation system and method of Mongolian medicine

Country Status (1)

Country Link
CN (1) CN113707330B (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116525100A (en) * 2023-04-26 2023-08-01 脉景(杭州)健康管理有限公司 Traditional Chinese medicine prescription reverse verification method and system based on label system

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102298663A (en) * 2010-06-24 2011-12-28 上海中医药大学 Method for automatically identifying syndrome type in traditional Chinese medical science
CN105701013A (en) * 2016-01-04 2016-06-22 中国石油大学(华东) Software defect data feature selection method based on mutual information
US20180039911A1 (en) * 2016-08-05 2018-02-08 Yandex Europe Ag Method and system of selecting training features for a machine learning algorithm
CN108875795A (en) * 2018-05-28 2018-11-23 哈尔滨工程大学 A kind of feature selecting algorithm based on Relief and mutual information
CN109190678A (en) * 2018-08-08 2019-01-11 北京工商大学 A kind of attack of terrorism type evaluation method based on MLKNN and Gravity Models
CN110335684A (en) * 2019-06-14 2019-10-15 电子科技大学 The intelligent dialectical aid decision-making method of Chinese medicine based on topic model technology

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
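AMIN HASHEMI et al.: "MFS-MCDM: Multi-label feature selection using multi-criteria decision making" *
LIWEN PENG et al.: "Gravitation theory based model for multi-label classification" *
彭黎文: "Research on computer-aided decision-making for traditional Chinese medicine syndrome differentiation of chronic kidney disease" *
王纪超: "Research on classification algorithms based on multi-label learning" *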

Also Published As

Publication number Publication date
CN113707330B (en) 2023-04-28

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant