CN109087702B - A four-diagnosis representation information fusion method for TCM health status analysis - Google Patents
- Publication number
- CN109087702B
- Authority
- CN
- China
- Prior art keywords
- information
- tester
- representation
- health status
- diagnosis
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16H—HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
- G16H50/00—ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
- G16H50/20—ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for computer-aided diagnosis, e.g. based on medical expert systems
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/22—Matching criteria, e.g. proximity measures
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/25—Fusion techniques
- G06F18/253—Fusion techniques of extracted features
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16H—HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
- G16H50/00—ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
- G16H50/70—ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for mining of medical data, e.g. analysing previous cases of other patients
Landscapes
- Engineering & Computer Science (AREA)
- Data Mining & Analysis (AREA)
- Health & Medical Sciences (AREA)
- Public Health (AREA)
- Theoretical Computer Science (AREA)
- Medical Informatics (AREA)
- Biomedical Technology (AREA)
- Physics & Mathematics (AREA)
- Bioinformatics & Computational Biology (AREA)
- Bioinformatics & Cheminformatics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Evolutionary Biology (AREA)
- Artificial Intelligence (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Databases & Information Systems (AREA)
- Evolutionary Computation (AREA)
- Pathology (AREA)
- Life Sciences & Earth Sciences (AREA)
- Epidemiology (AREA)
- General Health & Medical Sciences (AREA)
- Primary Health Care (AREA)
- Medical Treatment And Welfare Office Work (AREA)
- Measuring And Recording Apparatus For Diagnosis (AREA)
Abstract
A four-diagnosis representation information fusion method for TCM health status analysis. Information from the four diagnostic methods (inspection, listening and smelling, inquiry, and palpation) is collected from clinical patients to generate a multi-source information representation of each patient, and the syndrome types to which the patient belongs are labeled. The feature representation of each information source and its category information are used to analyze the health status of a test subject separately, yielding auxiliary decision information from multiple information sources. An information fusion model is built to maximize decision consistency and return an optimized health status analysis result. The performance of the proposed algorithm is evaluated by comparing each test subject's actual health status with the corresponding prediction. The method can detect a test subject's current health status and the nature of the pathological change, so that the subject understands his or her own constitution, and it provides a reference for formulating intervention plans. It can provide high-precision health status analysis results as a basis for health care. By fusing the four-diagnosis representation information of clinical patients, it obtains more accurate and reliable status analysis results.
Description
Technical field
The invention relates to multi-label learning, and in particular to a four-diagnosis representation information fusion method for TCM health status analysis.
Background
State is the logical starting point of the TCM theory of health cognition. Health status is the combined state of the human body's morphological structure, physiological function, psychological state, and ability to adapt to the external environment within a unit of time; it reflects the condition and trend of health. Health status analysis is grounded in TCM theory: the information collected through inspection, listening and smelling, inquiry, and palpation is expressed as data, with emphasis on objectively evaluating the body's health status and the nature of the pathological change, and on giving a summary judgment of the disease and syndrome involved (Li Candong. State Theory of Traditional Chinese Medicine (中医状态学) [M]. Beijing: China Press of Traditional Chinese Medicine, 2016).
Multi-label learning handles real-world objects that carry several meanings at once, and it has been widely studied and applied in automatic image annotation, bioinformatics, information retrieval, and recommender systems. In clinical practice, a patient often presents several syndrome types at the same time. Multi-label learning is therefore introduced into TCM health status analysis as an artificial intelligence approach to the problem.
Under the TCM principle of "combined analysis of the four diagnostic methods", status analysis is built on four-diagnosis information. Because different information sources contribute to the prediction to different degrees and are correlated with one another, the overall information of a clinical patient is collected through the four diagnostic methods, and an information fusion model is then built to analyze the patient's health status.
TCM health big data is multi-modal and multi-label, which poses serious challenges to the validity, accuracy, and computability of traditional data analysis theories, methods, and techniques. Studying a four-diagnosis representation information fusion method for TCM health status analysis therefore helps build a more accurate and reliable identification model, and helps artificial intelligence technology promote the joint development of the interdisciplinary fields involved.
Summary of the invention
The purpose of the present invention is to provide a four-diagnosis representation information fusion method for TCM health status analysis, addressing the facts that clinical patients often present several states at once and that diagnostic information comes from diverse sources.
The present invention includes the following steps:
1) Collect information from the four diagnostic methods (inspection, listening and smelling, inquiry, and palpation) for clinical patients, generate a multi-source information representation of each patient, and label the syndrome types to which the patient belongs;
2) Use the feature representation of each information source and its category information to analyze the health status of the test subject separately, obtaining auxiliary decision information for the test subject from multiple information sources;
3) Build an information fusion model that maximizes decision consistency and returns the optimized health status analysis result;
4) Evaluate the performance of the proposed algorithm by comparing the test subject's actual health status with the corresponding prediction.
In step 1), the specific method for collecting the four-diagnosis information of clinical patients, generating the multi-source information representation, and labeling the syndrome types may be:
(1) Extract the four-diagnosis representation information of clinical patients from electronic medical records to form information source A; capture the patient's tongue image with an inspection instrument, segment the tongue image with a U-Net network model, and then apply HSV, LAB, and RGB descriptors to obtain several feature representations of the tongue image, which form information source B, information source C, and information source D respectively;
(2) A doctor labels the health status of each clinical patient, denoted {l_1, l_2, ..., l_q}, 1≤j≤q, where l_j is the j-th syndrome type of the clinical patient and q is the total number of category labels;
(3) Validate the algorithm with ten-fold cross-validation: the preprocessed, standardized data are split into training data and test data at a ratio of 9:1.
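The ten-fold split in step (3) can be sketched as follows. This is a minimal illustration, not the patent's implementation: the function name `ten_fold_splits`, the fixed random seed, and the round-robin fold assignment are assumptions, and the sample count 729 is taken from the embodiment described later in the text.

```python
import random

def ten_fold_splits(n_samples, n_folds=10, seed=0):
    """Yield (train_idx, test_idx) index lists, one pair per fold."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)   # fixed seed so the folds are repeatable
    # Round-robin assignment gives folds whose sizes differ by at most one.
    folds = [indices[k::n_folds] for k in range(n_folds)]
    for k in range(n_folds):
        test_idx = folds[k]                # one fold held out (about 10%)
        train_idx = [i for j, fold in enumerate(folds) if j != k for i in fold]
        yield train_idx, test_idx

splits = list(ten_fold_splits(729))
```

Each pair holds out one fold for testing and trains on the remaining nine, which gives the 9:1 ratio stated above; metrics would then be averaged over the ten folds.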
In step 2), the specific method for analyzing the test subject's health status with the feature representation and category information of each information source, and obtaining auxiliary decision information from multiple information sources, may be:
(1) Predict the test subject's health status with a support vector machine (SVM); the calculation formula is:
Here, the formula gives the prediction result of the i-th test subject for the j-th syndrome type on data source A, computed from the feature representation of the i-th test subject on data source A;
(2) Combine the feature representation with the corresponding prediction information to search the training set for the test subject's top-k nearest neighbors. Neighbor selection is based on the similarity between the test subject and the training samples; the calculation formula is:
Here, one term is the similarity between the test subject and the training samples in the feature space, computed with cosine similarity; the other is their similarity in the label space, computed with Jaccard similarity; β is a threshold with values in [0,1];
(3) Use the similarity relation simA to model the correlations between syndrome types and reconstruct the test subject's label space:
Here, the result is the status analysis result of the i-th test subject for the j-th syndrome type on data source A, and Y_zj is the actual value of the z-th neighbor of the i-th test subject on the j-th syndrome type;
(4) Repeat steps (1) to (3) to obtain the status analysis results based on information sources B to D.
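Steps (2) and (3) can be illustrated with a short sketch. The exact formulas are not reproduced in this text, so the code assumes one plausible form: the combined similarity is a convex mix `beta * cosine + (1 - beta) * jaccard`, and the reconstructed score for each syndrome type is the similarity-weighted average of the neighbors' actual labels. The function names, the choice `k=2`, and the toy data are all hypothetical.

```python
import math

def cosine(u, v):
    """Cosine similarity between two feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def jaccard(a, b):
    """Jaccard similarity between two binary label vectors."""
    set_a = {j for j, x in enumerate(a) if x}
    set_b = {j for j, x in enumerate(b) if x}
    union = set_a | set_b
    return len(set_a & set_b) / len(union) if union else 0.0

def reconstruct_labels(x_test, y_pred, X_train, Y_train, k=2, beta=0.5):
    """Score each syndrome type from the top-k most similar training samples."""
    # Assumed combination: beta weights feature space against label space.
    sims = [beta * cosine(x_test, x) + (1 - beta) * jaccard(y_pred, y)
            for x, y in zip(X_train, Y_train)]
    top = sorted(range(len(sims)), key=lambda z: sims[z], reverse=True)[:k]
    total = sum(sims[z] for z in top)
    if total == 0:
        return [0.0] * len(y_pred)
    # Similarity-weighted average of the neighbors' actual labels Y_zj.
    return [sum(sims[z] * Y_train[z][j] for z in top) / total
            for j in range(len(y_pred))]

X_train = [[1, 0], [0, 1], [1, 1]]
Y_train = [[1, 0, 0], [0, 1, 0], [1, 0, 1]]
scores = reconstruct_labels([1, 0], [1, 0, 0], X_train, Y_train)
```

In this toy case the first and third training samples are selected as neighbors, so the third syndrome type receives a nonzero score even though the initial prediction `y_pred` did not include it. That is the sense in which neighbor labels can carry label correlations into the reconstructed result.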
In step 3), the specific method for building the information fusion model that maximizes decision consistency and returns the optimized health status analysis result may be:
(1) Use the multiple status results predicted from the four-diagnosis representation information of clinical patients to obtain the test subject's final result, by constructing and solving the following optimization objective:
Here, the optimized result of the i-th test subject on the j-th syndrome type is obtained by fusing the decision information from multiple sources; W = {w_1, w_2, ..., w_M} is the weight distribution over the M information sources, with M = 4; in addition, c_m denotes the set of index pairs (i, j) that satisfy a threshold condition, where α is the threshold with values in [0,1];
(2) Initialize the weights by setting:
(3) Fix W and solve for Y* by gradient descent; the calculation formula is:
(4) Fix Y* and solve for W with the method of Lagrange multipliers; the calculation formula is:
(5) Repeat steps (3) and (4) until the optimization objective converges, then return the optimized result Y* of the test subject's health status.
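The alternating scheme in steps (2) to (5) can be sketched under an assumed objective, since the patent's objective function is not reproduced in this text. The sketch minimizes the sum over sources of w_m times the squared distance between Y* and that source's result, plus an entropy term `gamma * sum(w_m * ln w_m)`, subject to the weights summing to 1. The entropy term, `gamma`, the step size, and the fixed iteration counts are assumptions made for illustration; the set c_m and the threshold α from the source are not modeled.

```python
import math

def fuse_sources(predictions, gamma=0.1, n_outer=20, n_inner=30, lr=0.25):
    """predictions: list of M score matrices (n x q nested lists), one per source."""
    m = len(predictions)
    n, q = len(predictions[0]), len(predictions[0][0])
    weights = [1.0 / m] * m                          # uniform initial weights
    fused = [[sum(p[i][j] for p in predictions) / m for j in range(q)]
             for i in range(n)]                      # start from the plain average
    for _ in range(n_outer):
        # Fix W: gradient descent on the fused matrix Y*.
        for _ in range(n_inner):
            for i in range(n):
                for j in range(q):
                    grad = 2.0 * sum(w * (fused[i][j] - p[i][j])
                                     for w, p in zip(weights, predictions))
                    fused[i][j] -= lr * grad
        # Fix Y*: setting the Lagrangian gradient to zero under sum(w) = 1
        # gives weights proportional to exp(-distance / gamma).
        dists = [sum((fused[i][j] - p[i][j]) ** 2
                     for i in range(n) for j in range(q)) for p in predictions]
        raw = [math.exp(-d / gamma) for d in dists]
        weights = [r / sum(raw) for r in raw]
    return fused, weights

# Three sources roughly agree; the fourth disagrees strongly.
sources = [[[0.9, 0.1]], [[0.8, 0.2]], [[0.85, 0.15]], [[0.1, 0.9]]]
fused, weights = fuse_sources(sources)
```

Under this assumed objective, a source that disagrees with the consensus accumulates a large distance and is down-weighted, which is one way to read "maximizing decision consistency". A production version would stop on a convergence test of the objective, as step (5) describes, rather than after a fixed number of rounds.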
In step 4), the specific method for evaluating the performance of the proposed algorithm by comparing the test subject's actual health status with the corresponding prediction may be:
The proposed method predicts the category labels of the test subjects in the test data, and the following five metrics are used to evaluate the algorithm's performance:
(1) Hamming loss: measures how often a sample is misclassified on a single label; smaller is better;
(2) One-error: measures how often the top-ranked label in a sample's ranked label sequence is not in the set of relevant labels; smaller is better;
(3) Coverage: measures how far down a sample's ranked label sequence one must search to cover all relevant labels; smaller is better;
(4) Ranking loss: measures how often label pairs are ordered incorrectly in a sample's ranked label sequence; smaller is better;
(5) Average precision: measures, in a sample's ranked label sequence, how often the labels ranked ahead of a relevant label are themselves relevant; larger is better.
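The five metrics can be computed as in the sketch below, which follows the definitions commonly used in the multi-label learning literature rather than formulas given in this text. The function names, the 0.5 threshold used to binarize scores for Hamming loss, and the toy data are assumptions; the sketch also assumes every sample has at least one relevant and one irrelevant label.

```python
def rank_positions(scores):
    """Map each label index to its 1-based rank (highest score gets rank 1)."""
    order = sorted(range(len(scores)), key=lambda j: -scores[j])
    return {label: pos + 1 for pos, label in enumerate(order)}

def evaluate(Y_true, Y_score, threshold=0.5):
    n, q = len(Y_true), len(Y_true[0])
    hamming = one_error = coverage = rank_loss = avg_prec = 0.0
    for y, s in zip(Y_true, Y_score):
        rel = [j for j in range(q) if y[j] == 1]
        irr = [j for j in range(q) if y[j] == 0]
        rank = rank_positions(s)
        # Fraction of labels whose thresholded prediction differs from the truth.
        hamming += sum((s[j] >= threshold) != (y[j] == 1) for j in range(q)) / q
        # 1 if the top-ranked label is not relevant.
        one_error += 0.0 if y[max(range(q), key=lambda j: s[j])] == 1 else 1.0
        # Depth needed to cover all relevant labels, counted from zero.
        coverage += max(rank[j] for j in rel) - 1
        # Fraction of (relevant, irrelevant) pairs ordered incorrectly.
        rank_loss += sum(s[a] <= s[b] for a in rel for b in irr) / (len(rel) * len(irr))
        # For each relevant label: share of labels at or above it that are relevant.
        avg_prec += sum(sum(rank[k] <= rank[j] for k in rel) / rank[j]
                        for j in rel) / len(rel)
    return {"hamming_loss": hamming / n, "one_error": one_error / n,
            "coverage": coverage / n, "ranking_loss": rank_loss / n,
            "average_precision": avg_prec / n}

metrics = evaluate([[1, 0, 1], [0, 1, 0]],
                   [[0.9, 0.4, 0.2], [0.3, 0.8, 0.1]])
```

Only average precision improves as it grows; the other four are losses, which matches the "smaller is better" notes in the list above.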
Compared with the prior art, the present invention can detect the test subject's current health status and the nature of the pathological change, so that the subject understands his or her own constitution, and it provides a reference for formulating an intervention plan.
The present invention can provide high-precision health status analysis results as a basis for health care.
The present invention can fuse the four-diagnosis representation information of clinical patients and thereby obtain more accurate and reliable status analysis results.
Description of drawings
Figure 1 is a schematic diagram of tongue image segmentation.
Detailed description
To make the objectives, technical solutions, and advantages of the present invention clearer, the invention is described in further detail below with reference to the accompanying drawing and an embodiment. The specific embodiment described here only explains the present invention and does not limit it.
This embodiment includes the following steps:
1) Collect the four-diagnosis information (inspection, listening and smelling, inquiry, and palpation) of 729 clinical patients, generate a multi-source information representation of each patient, and label the syndrome types to which the patient belongs, for a total of 339 syndrome categories;
(1) Extract the four-diagnosis representation information of clinical patients from electronic medical records to form information source A; capture the patient's tongue image with an inspection instrument and segment the tongue image with a U-Net network model, as shown in Figure 1. Then apply HSV, LAB, and RGB descriptors to obtain several feature representations of the tongue image, which form information source B, information source C, and information source D respectively;
(2) A doctor labels the health status of each clinical patient, denoted {l_1, l_2, ..., l_q} (1≤j≤q), where l_j is the j-th syndrome type of the clinical patient and q is the total number of category labels;
(3) Validate the algorithm with ten-fold cross-validation: the preprocessed, standardized data are split into training data and test data at a ratio of 9:1.
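How a color descriptor might turn a segmented tongue region into a feature vector can be sketched as below. The text does not specify the descriptor, so per-channel normalized histograms are an assumption, as are the function name `channel_histograms` and the bin count. Only RGB and HSV are shown, because the standard library's `colorsys` module has no LAB conversion; the input is assumed to be the list of pixels that fall inside the segmentation mask.

```python
import colorsys

def channel_histograms(pixels, bins=4):
    """pixels: (r, g, b) tuples in 0..255 from the segmented tongue region.

    Returns one flattened, normalized 3 x bins histogram per color space.
    """
    n = len(pixels)
    converters = (("rgb", lambda p: p),
                  ("hsv", lambda p: colorsys.rgb_to_hsv(*p)))
    features = {}
    for name, convert in converters:
        hist = [[0.0] * bins for _ in range(3)]
        for r, g, b in pixels:
            values = convert((r / 255.0, g / 255.0, b / 255.0))
            for c, v in enumerate(values):
                # Values lie in [0, 1]; clamp v == 1.0 into the last bin.
                hist[c][min(int(v * bins), bins - 1)] += 1.0 / n
        features[name] = [x for channel in hist for x in channel]
    return features

pixels = [(255, 0, 0), (255, 0, 0), (0, 0, 255), (0, 255, 0)]
feats = channel_histograms(pixels)
```

Histograms are used here instead of channel means because hue is circular, so averaging hue values near red would be misleading. Each color space then yields its own feature vector, in the same way that the text assigns one information source per descriptor.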
2) Use the feature representation of each information source and its category information to analyze the test subject's health status separately, obtaining auxiliary decision information for the test subject from multiple information sources;
(1) Predict the test subject's health status with an SVM; the calculation formula is:
Here, the formula gives the prediction result of the i-th test subject for the j-th syndrome type on data source A, computed from the feature representation of the i-th test subject on data source A;
(2) Combine the feature representation with the corresponding prediction information to search the training set for the test subject's top-k nearest neighbors. Neighbor selection is based on the similarity between the test subject and the training samples; the calculation formula is:
Here, one term is the similarity between the test subject and the training samples in the feature space, computed with cosine similarity; the other is their similarity in the label space, computed with Jaccard similarity; β is a threshold with values in [0,1];
(3) Use the similarity relation simA to model the correlations between syndrome types and reconstruct the test subject's label space:
Here, the result is the status analysis result of the i-th test subject for the j-th syndrome type on data source A, and Y_zj is the actual value of the z-th neighbor of the i-th test subject on the j-th syndrome type;
(4) The prediction results on information source A are compared with those of BSVM (M. R. Boutell, J. Luo, X. Shen, C. M. Brown, Learning multi-label scene classification, Pattern Recognition, 2004, 37(9): 1757–1771) and LIFT (M. Zhang, L. Wu, LIFT: Multi-label learning with label-specific features, IEEE Transactions on Pattern Analysis and Machine Intelligence, 2015, 37(1): 107–120); the experimental results are shown in Table 1. Algorithm 1 is the algorithm proposed in the present invention, Algorithm 2 is LIFT, and Algorithm 3 is BSVM. As Table 1 shows, by taking label correlations into account, the present invention outperforms the other algorithms on most of the evaluation metrics.
Table 1
(5) Repeat steps (1) to (3) to obtain the health status analysis results based on information sources B to D.
3) Build an information fusion model that maximizes decision consistency and returns the optimized health status analysis result;
(1) Use the multiple status identification results predicted from the four-diagnosis representation information of clinical patients to obtain the test subject's final result, by constructing and solving the following optimization objective:
Here, the optimized result of the i-th test subject on the j-th syndrome type is obtained by fusing the decision information from multiple sources. W = {w_1, w_2, ..., w_M} is the weight distribution over the M information sources (here M = 4); in addition, c_m denotes the set of index pairs (i, j) that satisfy a threshold condition, where α is the threshold with values in [0,1];
(2) Initialize the weights by setting:
(3) Fix W and solve for Y* by gradient descent; the calculation formula is:
(4) Fix Y* and solve for W with the method of Lagrange multipliers; the calculation formula is:
(5) Repeat steps (3) and (4) until the optimization objective converges, then return the optimized result Y* of the test subject's health status.
4) Use the proposed method to analyze the health status of the test subjects in the test data;
The proposed algorithm is compared with the prediction results of each individual information source, as shown in Table 2. As Table 2 shows, by fusing information sources A to D the proposed algorithm achieves the best result on most of the evaluation metrics.
Table 2
The proposed algorithm is also compared with other fusion algorithms, as shown in Table 3. Algorithm 1 is the algorithm proposed in the present invention; Algorithm 2 averages the predictions of all information sources; Algorithm 3 takes a vote over the predictions of all information sources; Algorithm 4 concatenates all information sources and then classifies with an SVM. As Table 3 shows, the algorithm proposed in the present invention achieves the best results.
Table 3
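The two simplest baselines in this comparison, averaging (Algorithm 2) and voting (Algorithm 3), can be sketched as follows. The text does not give their details, so the function names, the 0.5 threshold applied before voting, the tie rule, and the toy scores are assumptions.

```python
def average_fusion(predictions):
    """Mean of the per-source scores for every subject and syndrome type."""
    m = len(predictions)
    n, q = len(predictions[0]), len(predictions[0][0])
    return [[sum(p[i][j] for p in predictions) / m for j in range(q)]
            for i in range(n)]

def vote_fusion(predictions, threshold=0.5):
    """Assign a label when a strict majority of sources score it at or above threshold."""
    m = len(predictions)
    n, q = len(predictions[0]), len(predictions[0][0])
    # A tie (exactly half of the sources) does not assign the label.
    return [[1 if 2 * sum(p[i][j] >= threshold for p in predictions) > m else 0
             for j in range(q)] for i in range(n)]

# Four sources, one subject, two syndrome types (hypothetical scores).
sources = [[[0.9, 0.2]], [[0.6, 0.4]], [[0.7, 0.6]], [[0.2, 0.1]]]
avg = average_fusion(sources)
votes = vote_fusion(sources)
```

Both baselines treat every source equally, whereas the fusion model in step 3) learns a weight per information source; that difference is what the comparison in Table 3 is meant to probe.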
The present invention first preprocesses the information captured by the four-diagnosis acquisition instrument, then analyzes the prediction results of each information source separately to judge the test subject's health status, and finally fuses the prediction results from the multiple feature representations to maximize the consistency of status identification, thereby providing an accurate and reliable reference for formulating the test subject's intervention plan.
Claims (3)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810878380.7A CN109087702B (en) | 2018-08-03 | 2018-08-03 | A four-diagnosis representation information fusion method for TCM health status analysis |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810878380.7A CN109087702B (en) | 2018-08-03 | 2018-08-03 | A four-diagnosis representation information fusion method for TCM health status analysis |
Publications (2)
Publication Number | Publication Date |
---|---|
CN109087702A CN109087702A (en) | 2018-12-25 |
CN109087702B true CN109087702B (en) | 2021-07-16 |
Family
ID=64833581
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201810878380.7A Active CN109087702B (en) | 2018-08-03 | 2018-08-03 | A four-diagnosis representation information fusion method for TCM health status analysis |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN109087702B (en) |
Families Citing this family (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111260619A (en) * | 2020-01-14 | 2020-06-09 | 浙江中医药大学 | Tongue body automatic segmentation method based on U-net model |
CN112200091A (en) * | 2020-10-13 | 2021-01-08 | 深圳市悦动天下科技有限公司 | Tongue region identification method and device and computer storage medium |
CN112530584A (en) * | 2020-12-15 | 2021-03-19 | 贵州小宝健康科技有限公司 | Medical diagnosis assisting method and system |
CN113409938A (en) * | 2021-06-30 | 2021-09-17 | 海南医学院 | Modeling method and system of traditional Chinese medicine syndrome type prediction model of systemic lupus erythematosus |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101129261A (en) * | 2007-02-09 | 2008-02-27 | 北京中医药大学 | A device and method for acquiring pulse recognition information and tongue diagnosis information |
CN101647696A (en) * | 2008-08-13 | 2010-02-17 | 上海经路通中医药科技发展有限公司 | Intelligent system for health diagnosis and treatment |
CN104766068A (en) * | 2015-04-20 | 2015-07-08 | 江西中医药大学 | Random walk tongue image extraction method based on multi-rule fusion |
CN105528529A (en) * | 2016-02-20 | 2016-04-27 | 成都中医药大学 | Data processing method of traditional Chinese medicine clinical skill evaluation system based on big data analysis |
CN106874655A (en) * | 2017-01-16 | 2017-06-20 | 西北工业大学 | Traditional Chinese medical science disease type classification Forecasting Methodology based on Multi-label learning and Bayesian network |
CN108198621A (en) * | 2018-01-18 | 2018-06-22 | 中山大学 | A kind of database data synthesis dicision of diagnosis and treatment method based on neural network |
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US11235062B2 (en) * | 2009-03-06 | 2022-02-01 | Metaqor Llc | Dynamic bio-nanoparticle elements |
Non-Patent Citations (2)
Title |
---|
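Computational drug repositioning using collaborative filtering via multi-source fusion; Jia Zhang et al.; Expert Systems With Applications; 2017-10-30; Vol. 84; full text * |
Research progress on cognition-oriented multi-source data learning theory and algorithms; Yang Liu et al.; Journal of Software (软件学报); 2017-11-30; Vol. 28, No. 11; full text * |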
Also Published As
Publication number | Publication date |
---|---|
CN109087702A (en) | 2018-12-25 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN109087702B (en) | | A four-diagnosis representation information fusion method for TCM health status analysis |
Wang et al. | | The devil of face recognition is in the noise |
Zhuang et al. | | An effective WSSENet-based similarity retrieval method of large lung CT image databases |
Sainju et al. | | Automated bleeding detection in capsule endoscopy videos using statistical features and region growing |
Rajan et al. | | Fog computing employed computer aided cancer classification system using deep neural network in internet of things based healthcare system |
Huan et al. | | Deep convolutional neural networks for classifying body constitution based on face image |
CN109102899A (en) | | Chinese medicine intelligent assistance system and method based on machine learning and big data |
Anwar et al. | | Automatic breast cancer classification from histopathological images |
CN116705300A (en) | | Medical decision assistance method, system and storage medium based on sign data analysis |
Weng et al. | | Multi-label symptom analysis and modeling of TCM diagnosis of hypertension |
Chiwariro et al. | | Comparative analysis of deep learning convolutional neural networks based on transfer learning for pneumonia detection |
Nandakumar et al. | | A novel graph neural network to localize eloquent cortex in brain tumor patients from resting-state fMRI connectivity |
CN112256754A (en) | | Ultrasonic detection and analysis system and method based on standard model |
CN114399634B (en) | | Three-dimensional image classification method, system, equipment and medium based on weak supervision learning |
Sun et al. | | Liver tumor segmentation and subsequent risk prediction based on Deeplabv3+ |
Zhang et al. | | Weighted hashing with multiple cues for cell-level analysis of histopathological images |
CN118193770B (en) | | Medical image retrieval method and system based on deep learning |
CN113990454A (en) | | Malicious behavior identification method based on federated learning and feature extraction |
Nyon et al. | | Durian species recognition system based on global shape representations and k-nearest neighbors |
CN118429680A (en) | | Method and system for identifying and predicting tongue picture full-class label |
CN113255718B (en) | | Cervical cell auxiliary diagnosis method based on deep learning cascade network method |
CN116309465A (en) | | Tongue image detection and positioning method based on improved YOLOv5 in natural environment |
Mohapatra et al. | | Automated invasive cervical cancer disease detection at early stage through deep learning |
Ornek et al. | | Classification of Medical Thermograms Belonging Neonates by Using Segmentation, Feature Engineering and Machine Learning Algorithms. |
Cheslerean-Boghiu et al. | | Transformer-based interpretable multi-modal data fusion for skin lesion classification |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||