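WO2018161435A1 - Method and device for syndrome differentiation of traditional Chinese medicine syndrome elements - Google Patents

Method and device for syndrome differentiation of traditional Chinese medicine syndrome elements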

Info

Publication number
WO2018161435A1
Authority
WO
WIPO (PCT)
Prior art keywords
syndrome
information
syndrome information
syndromes
classifier
Prior art date
Application number
PCT/CN2017/085353
Other languages
English (en)
French (fr)
Inventor
李坚强
朱灿杰
颜果开
邓根强
Original Assignee
深圳大学
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 深圳大学
Publication of WO2018161435A1 (zh)

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 - Pattern recognition
    • G06F18/20 - Analysing
    • G06F18/24 - Classification techniques
    • G06F18/243 - Classification techniques relating to the number of classes
    • G06F18/24323 - Tree-organised classifiers

Definitions

  • The invention belongs to the field of computer technology, and in particular relates to a method and a device for syndrome differentiation of traditional Chinese medicine (TCM) syndrome elements.
  • In TCM, a syndrome (证) summarizes the stage-specific pathological nature of the body's overall response during a disease.
  • The patient's symptoms and signs, such as headache, dizziness, yellow urine and aversion to cold, are called symptoms (证候).
  • The disease location and disease nature of the symptoms are called syndrome elements (证素).
  • Syndrome elements mainly include disease locations such as the heart, liver and spleen, and disease natures such as blood deficiency, fluid depletion and blood stasis.
  • Syndrome differentiation in TCM analyzes the symptoms to determine the syndrome elements corresponding to them.
  • The syndrome elements are then combined into a syndrome name (证名); for example, the phlegm-dampness element and the blood-stasis element can be combined into the syndrome of intermingled phlegm and blood stasis (痰瘀互结证), which is one such syndrome name.
  • The object of the present invention is to provide a method and a device for syndrome differentiation of TCM syndrome elements, aiming to solve the problem that prior-art methods have difficulty processing high-dimensional data, require feature selection, generalize weakly and lack correlation between syndrome elements, which leads to poor efficiency and effectiveness of syndrome-element differentiation.
  • In one aspect, the present invention provides a method for syndrome differentiation of TCM syndrome elements, the method comprising the following steps:
  • receiving the symptom information to be classified, the symptom information including a plurality of symptoms;
  • classifying the symptom information with a first number of trained symptom classifiers and obtaining the first number of classification results, so as to make an initial prediction of whether the symptom information belongs to the syndrome elements in preset syndrome element information, where each symptom classifier is a classifier chain model with random forests as base classifiers;
  • making a secondary prediction of whether the symptom information belongs to the syndrome elements in the syndrome element information according to the first number of classification results and a preset confidence threshold, and determining from the result of the secondary prediction all the syndrome elements to which the symptom information belongs;
  • synthesizing and outputting the syndrome name corresponding to the symptom information according to all the syndrome elements to which the symptom information belongs.
  • In another aspect, the present invention provides a device for syndrome differentiation of TCM syndrome elements, the device comprising:
  • a symptom information receiving module, configured to receive the symptom information to be classified, the symptom information including a plurality of symptoms;
  • an initial prediction module, configured to classify the symptom information with a first number of trained symptom classifiers and obtain the first number of classification results, so as to make an initial prediction of whether the symptom information belongs to the syndrome elements in preset syndrome element information, where each symptom classifier is a classifier chain model with random forests as base classifiers;
  • a secondary prediction module, configured to make a secondary prediction of whether the symptom information belongs to the syndrome elements in the syndrome element information according to the first number of classification results and a preset confidence threshold, and to determine from the result of the secondary prediction all the syndrome elements to which the symptom information belongs; and
  • a syndrome name synthesizing module, configured to synthesize and output the syndrome name corresponding to the symptom information according to all the syndrome elements to which the symptom information belongs.
  • The invention receives the symptom information to be classified, classifies it with a first number of trained symptom classifiers, and obtains the first number of classification results as an initial prediction of whether the symptom information belongs to the syndrome elements in the preset syndrome element information.
  • Based on the first number of classification results and the preset confidence threshold, a secondary prediction is made of whether the symptom information belongs to each syndrome element; all the syndrome elements to which the symptom information belongs are determined from the result of the secondary prediction and are synthesized into the syndrome name corresponding to the symptom information.
  • Because each symptom classifier is a classifier chain model with random forests as base classifiers, the initial prediction and the subsequent secondary prediction improve high-dimensional data handling, generalization ability and the correlation between syndrome elements in syndrome-element differentiation (a multi-label classification task).
  • Because a random forest performs feature selection by itself, no separate feature selection on the symptoms is needed during differentiation, which effectively improves the efficiency and effectiveness of the differentiation process.
  • FIG. 1 is a flowchart of the implementation of the method for syndrome differentiation of TCM syndrome elements provided by Embodiment 1 of the present invention;
  • FIG. 2 is a table showing the secondary prediction of whether the symptom information belongs to the syndrome elements in the syndrome element information, in the method provided by Embodiment 1 of the present invention;
  • FIG. 3 is a preset correspondence table between syndrome names and syndrome elements in the method provided by Embodiment 1 of the present invention;
  • FIG. 4 is a schematic structural diagram of the device for syndrome differentiation of TCM syndrome elements provided by Embodiment 2 of the present invention; and
  • FIG. 5 is a schematic diagram of a preferred structure of the device for syndrome differentiation of TCM syndrome elements provided by Embodiment 2 of the present invention.
  • Embodiment 1:
  • FIG. 1 shows the implementation flow of the method for syndrome differentiation of TCM syndrome elements provided by Embodiment 1 of the present invention. For convenience of description, only the parts related to the embodiment of the present invention are shown, as detailed below:
  • In step S101, the symptom information to be classified is received.
  • The embodiment of the invention is applicable to a system or platform for TCM syndrome-element differentiation. The system or platform classifies the symptom information, determines the syndrome elements corresponding to the symptom information (that is, the elements to which it belongs), and then synthesizes these syndrome elements into a syndrome name, which is the syndrome name corresponding to the symptoms.
  • The symptom information to be classified is received and classified to determine all the syndrome elements corresponding to it.
  • A symptom is the reaction state of the intrinsically linked disease location, cause, nature, tendency and strength of the body's disease resistance at a certain stage of the disease process.
  • A case record includes the corresponding symptom information, which may contain multiple reaction states, that is, multiple symptoms.
  • For example, the symptom information of the wind-cold exterior excess syndrome is "fever with aversion to cold, headache, body pain, absence of sweating, floating and tight pulse, thin white tongue coating", and wind-cold exterior excess is the syndrome element corresponding to this symptom information.
  • In step S102, the symptom information is classified by the first number of trained symptom classifiers, and the first number of classification results are obtained as an initial prediction of whether the symptom information belongs to the syndrome elements in the preset syndrome element information.
  • A first number of symptom classifiers are trained in advance. Each symptom classifier is a classifier chain model with random forests as base classifiers, and each symptom classifier yields one classification result.
  • The classification result is the correspondence between the symptom information and each syndrome element in the syndrome element information, that is, the initial prediction of whether the symptom information belongs to each syndrome element; the classification result of the i-th symptom classifier is a prediction vector with one entry per syndrome element.
  • For example, the classification result of the i-th symptom classifier may indicate that the symptom information does not belong to the first syndrome element in the syndrome element information,
  • and that the symptom information belongs to the third syndrome element in the syndrome element information.
  • Each symptom classifier includes a second number of random forests, where the second number equals the number of syndrome elements in the syndrome element information. Each random forest is trained in advance to determine whether the symptom information belongs to its corresponding syndrome element; for example, the i-th random forest determines whether the symptom information belongs to the i-th syndrome element.
  • Therefore, when a symptom classifier classifies the symptom information, its second number of random forests classify the symptom information in turn. When the current random forest determines that the symptom information belongs to its corresponding syndrome element, that syndrome element is added to the symptom information as a symptom, and the next random forest classifies the symptom information with the new symptom added, which realizes the correlation between syndrome elements.
  • Each of the first number of symptom classifiers can be trained by the following steps:
  • (1) A sample set of the same size as the training data set is randomly selected from the training data set.
  • The training data set includes a plurality of pre-collected case records; one case record corresponds to one sample, and each case record includes symptom information and the syndrome elements corresponding to that symptom information.
  • (2) From the sample set, the classifier chain model with random forests as base classifiers is trained.
  • According to a preset third number, a third number of subsample sets are randomly selected from the sample set, each the same size as the sample set.
  • Each subsample set is used to build one decision tree, so a third number of decision trees are obtained; all the decision trees form a random forest, and the sample set is classified by this random forest.
  • Preferably, when training the symptom classifiers, adjusting the number of symptom classifiers (the first number) and the number of decision trees in each random forest (the third number) yields better differentiation efficiency and effectiveness.
  • In step S103, according to the first number of classification results and the preset confidence threshold, a secondary prediction is made of whether the symptom information belongs to the syndrome elements in the syndrome element information, and all the syndrome elements to which the symptom information belongs are determined from the result of the secondary prediction.
  • The confidence vector of the symptom information for each syndrome element in the syndrome element information is calculated.
  • When the confidence vector exceeds the preset confidence threshold, it is determined that the symptom information belongs to the syndrome element corresponding to that confidence vector.
  • The confidence vector is calculated from the first number of classification results.
  • As an example, the secondary prediction process is shown in the table in FIG. 2, where y_i is the i-th syndrome element in the syndrome element information.
  • For the i-th symptom classifier, the prediction vector (or classification result) records, for each syndrome element, whether the symptom information is predicted to belong to it: 0 means it does not belong and 1 means it belongs.
  • In step S104, the syndrome name corresponding to the symptom information is synthesized and output based on all the syndrome elements to which the symptom information belongs.
  • A correspondence table between syndrome names and syndrome elements is set in advance; looking up in this table the syndrome name that contains all the determined syndrome elements gives the syndrome name corresponding to the symptom information.
  • For example, FIG. 3 shows a correspondence table between syndrome names and syndrome elements.
  • In the embodiment of the present invention, the symptom information is classified by the trained symptom classifiers (classifier chain models with random forests as base classifiers) to make an initial prediction of whether the symptom information belongs to the syndrome elements in the syndrome element information. A secondary prediction is then made according to the classification results and the preset confidence threshold to obtain all the syndrome elements to which the symptom information belongs, and these syndrome elements are combined into the syndrome name corresponding to the symptom information.
  • Syndrome-element differentiation of the symptom information is thereby achieved. Combining random forests with classifier chains effectively improves high-dimensional data handling, generalization ability and the correlation between syndrome elements; at the same time, no feature selection on the symptoms is needed during differentiation, which effectively improves differentiation efficiency and effectiveness.
  • Embodiment 2:
  • FIG. 4 shows the structure of the device for syndrome differentiation of TCM syndrome elements provided by Embodiment 2 of the present invention. For convenience of description, only the parts related to the embodiment of the present invention are shown, including:
  • the symptom information receiving module 41, configured to receive the symptom information to be classified;
  • the initial prediction module 42, configured to classify the symptom information with the first number of trained symptom classifiers and obtain the first number of classification results, so as to make an initial prediction of whether the symptom information belongs to the syndrome elements in the preset syndrome element information;
  • the secondary prediction module 43, configured to make a secondary prediction of whether the symptom information belongs to the syndrome elements in the syndrome element information according to the first number of classification results and the preset confidence threshold, and to determine from the result of the secondary prediction all the syndrome elements to which the symptom information belongs; and
  • the syndrome name synthesizing module 44, configured to synthesize and output the syndrome name corresponding to the symptom information according to all the syndrome elements to which the symptom information belongs.
  • Preferably, as shown in FIG. 5, the initial prediction module 42 includes a random forest classification module 521, wherein:
  • the random forest classification module 521 is configured to take, in turn, each of the first number of symptom classifiers as the current symptom classifier for classifying the symptom information, and to determine through the second number of random forests in the current symptom classifier whether the symptom information belongs to the corresponding syndrome elements in the syndrome element information.
  • Preferably, the secondary prediction module 43 includes a confidence vector calculation module 531 and a threshold comparison module 532, wherein:
  • the confidence vector calculation module 531 is configured to calculate, according to the first number of classification results, the confidence vector of the symptom information for each syndrome element in the syndrome element information;
  • the calculation uses the prediction vector with which the k-th symptom classifier predicts whether the symptom information belongs to the j-th syndrome element in the syndrome element information; and
  • the threshold comparison module 532 is configured to compare all the confidence vectors with the confidence threshold and, when a confidence vector exceeds the confidence threshold, to determine that the symptom information belongs to the syndrome element corresponding to that confidence vector.
  • Preferably, the syndrome name synthesizing module 44 includes a syndrome name synthesizing sub-module 541, wherein:
  • the syndrome name synthesizing sub-module 541 is configured to determine, according to the preset correspondence table between syndrome elements and syndrome names, the syndrome name obtained by synthesizing all the syndrome elements to which the symptom information belongs, and to output that syndrome name.
  • Preferably, the random forest classification module 521 includes a random forest classification chain module, wherein:
  • the random forest classification chain module is configured to take, in turn, each of the second number of random forests as the current random forest for classifying the symptom information, to determine through the current random forest whether the symptom information belongs to the corresponding syndrome element in the syndrome element information, and, when it does, to set the corresponding syndrome element as a symptom in the symptom information.
  • In the embodiment of the present invention, the symptom information is classified by the trained symptom classifiers (classifier chain models with random forests as base classifiers) to make an initial prediction of whether the symptom information belongs to the syndrome elements in the syndrome element information. A secondary prediction is then made according to the classification results and the preset confidence threshold to obtain all the syndrome elements to which the symptom information belongs, and these syndrome elements are combined into the syndrome name corresponding to the symptom information.
  • Syndrome-element differentiation of the symptom information is thereby achieved. Combining random forests with classifier chains effectively improves high-dimensional data handling, generalization ability and the correlation between syndrome elements; at the same time, no feature selection on the symptoms is needed during differentiation, which effectively improves differentiation efficiency and effectiveness.
  • Each module of the device for syndrome differentiation of TCM syndrome elements can be implemented by a corresponding hardware or software module.
  • The modules can be independent software or hardware modules, or can be integrated into a single software or hardware module; this does not limit the present invention.
  • For the specific implementation of each module, refer to the description of each step in Embodiment 1 above; it is not repeated here.

Landscapes

  • Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

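A method and device for syndrome differentiation of traditional Chinese medicine (TCM) syndrome elements. The method comprises: receiving the symptom information to be classified (S101); classifying the symptom information with a first number of trained symptom classifiers and obtaining the first number of classification results, so as to make an initial prediction of whether the symptom information belongs to the syndrome elements in preset syndrome element information (S102); making a secondary prediction of whether the symptom information belongs to the syndrome elements in the syndrome element information according to the first number of classification results and a preset confidence threshold, and determining from the result of the secondary prediction all the syndrome elements to which the symptom information belongs (S103); and synthesizing and outputting the syndrome name corresponding to the symptom information according to all the syndrome elements to which the symptom information belongs (S104). This effectively improves high-dimensional data handling, generalization ability and the correlation between syndrome elements in syndrome-element differentiation; at the same time, no feature selection on the symptoms is needed during differentiation, which effectively improves differentiation efficiency and effectiveness.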

Description

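Method and device for syndrome differentiation of traditional Chinese medicine syndrome elements
Technical Field
The invention belongs to the field of computer technology, and in particular relates to a method and a device for syndrome differentiation of traditional Chinese medicine (TCM) syndrome elements.
Background
In TCM, a syndrome (证) is a summary of the stage-specific pathological nature of the body's overall response during a disease. A patient's symptoms and signs, such as headache, dizziness, yellow urine and aversion to cold, are called symptoms (证候). The disease location and disease nature of the symptoms are called syndrome elements (证素); syndrome elements mainly include disease locations such as the heart, liver and spleen, and disease natures such as blood deficiency, fluid depletion and blood stasis. Syndrome differentiation in TCM analyzes the symptoms to determine the corresponding syndrome elements and then combines the syndrome elements into a syndrome name (证名). For example, the phlegm-dampness element and the blood-stasis element can be combined into the syndrome of intermingled phlegm and blood stasis (痰瘀互结证), which is one such syndrome name.
Since syndrome-element differentiation was proposed, many researchers have studied syndrome-element classification on clinical data using fuzzy mathematics, statistical analysis, neural networks and other methods. However, because of the shortcomings and limitations of these learning methods, such as difficulty handling high-dimensional data, the need for feature selection, weak generalization ability and lack of label correlation, good differentiation efficiency and effectiveness could not be obtained.
Summary of the Invention
The object of the present invention is to provide a method and a device for syndrome differentiation of TCM syndrome elements, aiming to solve the problem that prior-art methods have difficulty processing high-dimensional data, require feature selection, generalize weakly and lack correlation between syndrome elements, which leads to poor efficiency and effectiveness of syndrome-element differentiation.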
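In one aspect, the present invention provides a method for syndrome differentiation of TCM syndrome elements, the method comprising the following steps:
receiving the symptom information to be classified, the symptom information including a plurality of symptoms;
classifying the symptom information with a first number of trained symptom classifiers and obtaining the first number of classification results, so as to make an initial prediction of whether the symptom information belongs to the syndrome elements in preset syndrome element information, each symptom classifier being a classifier chain model with random forests as base classifiers;
making a secondary prediction of whether the symptom information belongs to the syndrome elements in the syndrome element information according to the first number of classification results and a preset confidence threshold, and determining, from the result of the secondary prediction, all the syndrome elements to which the symptom information belongs;
synthesizing and outputting the syndrome name corresponding to the symptom information according to all the syndrome elements to which the symptom information belongs.
In another aspect, the present invention provides a device for syndrome differentiation of TCM syndrome elements, the device comprising:
a symptom information receiving module, configured to receive the symptom information to be classified, the symptom information including a plurality of symptoms;
an initial prediction module, configured to classify the symptom information with a first number of trained symptom classifiers and obtain the first number of classification results, so as to make an initial prediction of whether the symptom information belongs to the syndrome elements in preset syndrome element information, each symptom classifier being a classifier chain model with random forests as base classifiers;
a secondary prediction module, configured to make a secondary prediction of whether the symptom information belongs to the syndrome elements in the syndrome element information according to the first number of classification results and a preset confidence threshold, and to determine, from the result of the secondary prediction, all the syndrome elements to which the symptom information belongs; and
a syndrome name synthesizing module, configured to synthesize and output the syndrome name corresponding to the symptom information according to all the syndrome elements to which the symptom information belongs.
The invention receives the symptom information to be classified, classifies it with a first number of trained symptom classifiers, and obtains the first number of classification results as an initial prediction of whether the symptom information belongs to the syndrome elements in the preset syndrome element information. According to the first number of classification results and the preset confidence threshold, it makes a secondary prediction of whether the symptom information belongs to each syndrome element, determines from the result of the secondary prediction all the syndrome elements to which the symptom information belongs, and synthesizes all these syndrome elements into the syndrome name corresponding to the symptom information. Each symptom classifier is a classifier chain model with random forests as base classifiers. The initial prediction by these classifier chain models, together with the subsequent secondary prediction, improves high-dimensional data handling, generalization ability and the correlation between syndrome elements in syndrome-element differentiation (a multi-label classification task). Because a random forest performs feature selection by itself, no separate feature selection on the symptoms is needed during differentiation, which effectively improves the efficiency and effectiveness of the differentiation process.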
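Brief Description of the Drawings
FIG. 1 is a flowchart of the implementation of the method for syndrome differentiation of TCM syndrome elements provided by Embodiment 1 of the present invention;
FIG. 2 is a table showing the secondary prediction of whether the symptom information belongs to the syndrome elements in the syndrome element information, in the method provided by Embodiment 1 of the present invention;
FIG. 3 is a preset correspondence table between syndrome names and syndrome elements in the method provided by Embodiment 1 of the present invention;
FIG. 4 is a schematic structural diagram of the device for syndrome differentiation of TCM syndrome elements provided by Embodiment 2 of the present invention; and
FIG. 5 is a schematic diagram of a preferred structure of the device for syndrome differentiation of TCM syndrome elements provided by Embodiment 2 of the present invention.
Detailed Description
To make the objects, technical solutions and advantages of the present invention clearer, the present invention is described in further detail below with reference to the drawings and embodiments. It should be understood that the specific embodiments described here are only intended to explain the present invention and are not intended to limit it.
The specific implementation of the present invention is described in detail below with reference to specific embodiments:
Embodiment 1:
FIG. 1 shows the implementation flow of the method for syndrome differentiation of TCM syndrome elements provided by Embodiment 1 of the present invention. For convenience of description, only the parts related to the embodiment of the present invention are shown, as detailed below: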
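In step S101, the symptom information to be classified is received.
The embodiment of the present invention is applicable to a system or platform for TCM syndrome-element differentiation. The system or platform classifies the symptom information, determines the syndrome elements corresponding to the symptom information (that is, the elements to which it belongs), and then synthesizes these syndrome elements into a syndrome name, which is the syndrome name corresponding to the symptoms.
In the embodiment of the present invention, the symptom information to be classified is received and classified to determine all the syndrome elements corresponding to it. A symptom is the reaction state of the intrinsically linked disease location, cause, nature, tendency and strength of the body's disease resistance at a certain stage of the disease process. A case record includes the corresponding symptom information, which may contain multiple reaction states, that is, multiple symptoms. For example, the symptom information of the wind-cold exterior excess syndrome is "fever with aversion to cold, headache, body pain, absence of sweating, floating and tight pulse, thin white tongue coating", and wind-cold exterior excess is the syndrome element corresponding to this symptom information.
In step S102, the symptom information is classified by the first number of trained symptom classifiers, and the first number of classification results are obtained as an initial prediction of whether the symptom information belongs to the syndrome elements in the preset syndrome element information.
In the embodiment of the present invention, a first number of symptom classifiers are trained in advance. Each symptom classifier is a classifier chain model with random forests as base classifiers, and each symptom classifier yields one classification result. The classification result is the correspondence between the symptom information and each syndrome element in the syndrome element information, that is, the initial prediction of whether the symptom information belongs to each syndrome element. For example, the classification result of the i-th symptom classifier is [Figure PCTCN2017085353-appb-000001], where [Figure PCTCN2017085353-appb-000002] is the prediction vector of the i-th symptom classifier's initial prediction for the symptom information, [Figure PCTCN2017085353-appb-000003] indicates that, in the classification result of the i-th symptom classifier, the symptom information does not belong to the first syndrome element in the syndrome element information, and [Figure PCTCN2017085353-appb-000004] indicates that, in the classification result of the i-th symptom classifier, the symptom information belongs to the third syndrome element in the syndrome element information.
In the embodiment of the present invention, each symptom classifier includes a second number of random forests, where the second number equals the number of syndrome elements in the syndrome element information. Each random forest is trained in advance to determine whether the symptom information belongs to its corresponding syndrome element; for example, the i-th random forest determines whether the symptom information belongs to the i-th syndrome element in the syndrome element information. Therefore, when a symptom classifier classifies the symptom information, its second number of random forests classify the symptom information in turn. When the current random forest determines that the symptom information belongs to its corresponding syndrome element, that syndrome element is added to the symptom information as a symptom, and the next random forest classifies the symptom information with the new symptom added, which realizes the correlation between syndrome elements.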
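The chained prediction described above can be sketched as follows. This is a minimal illustration, not the patented implementation: plain Python functions stand in for the trained random forests, symptoms are assumed to be encoded as a 0/1 vector, and the names `chain_predict` and `toy_chain` are hypothetical. The sketch appends every prediction (0 or 1) to the feature vector, which is the usual classifier-chain convention; the text itself only describes adding a syndrome element when it is predicted as present.

```python
from typing import Callable, List, Sequence

# A base classifier maps a feature vector to 0 or 1. In the patent each base
# classifier is a trained random forest; here plain functions stand in for them.
BaseClassifier = Callable[[Sequence[int]], int]


def chain_predict(symptoms: Sequence[int], chain: List[BaseClassifier]) -> List[int]:
    """Run one classifier chain; return one 0/1 prediction per syndrome element."""
    features = list(symptoms)  # copy so the caller's input is left untouched
    predictions = []
    for classify in chain:  # one base classifier per syndrome element
        label = classify(features)
        predictions.append(label)
        # Feed the prediction forward so later links can use it (assumption:
        # both 0 and 1 are appended, keeping the feature length predictable).
        features.append(label)
    return predictions


# Hypothetical stand-ins: element 0 fires on symptom 0; element 1 fires only
# when symptom 1 is present AND element 0 was predicted (feature index 3).
toy_chain = [
    lambda f: int(f[0] == 1),
    lambda f: int(f[1] == 1 and f[3] == 1),
]

print(chain_predict([1, 1, 0], toy_chain))  # [1, 1]
print(chain_predict([0, 1, 0], toy_chain))  # [0, 0]
```

The second call shows the point of the chain: the same symptom 1 no longer triggers element 1 once element 0 is ruled out.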
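In the embodiment of the present invention, each of the first number of symptom classifiers can be trained by the following steps:
(1) Randomly select from the training data set a sample set of the same size as the training data set.
Specifically, the training data set includes a plurality of pre-collected case records; one case record corresponds to one sample, and each case record includes symptom information and the syndrome elements corresponding to that symptom information.
(2) From the sample set, train the classifier chain model with random forests as base classifiers.
Specifically, according to a preset third number, a third number of subsample sets are randomly selected from the sample set, each the same size as the sample set. Each subsample set is used to build one decision tree, so a third number of decision trees are obtained; all the decision trees form a random forest, and the sample set is classified by this random forest.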
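The two sampling stages in steps (1) and (2) can be sketched as below. The text only says the sets are drawn "randomly" and have the same size as their source; sampling with replacement (bootstrap sampling) is an assumption made here because it is the usual way to obtain a same-size random sample. The function names are hypothetical, and tree building itself is left out.

```python
import random
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def bootstrap(samples: Sequence[T], rng: random.Random) -> List[T]:
    """Draw len(samples) items with replacement (assumed sampling scheme)."""
    return [rng.choice(samples) for _ in samples]


def draw_subsample_sets(training_set: Sequence[T], n_trees: int, seed: int = 0) -> List[List[T]]:
    """Step (1): one sample set per classifier; step (2): one subsample set per tree."""
    rng = random.Random(seed)
    sample_set = bootstrap(training_set, rng)  # same size as the training set
    # Each subsample set would be used to grow one decision tree of the forest.
    return [bootstrap(sample_set, rng) for _ in range(n_trees)]


cases = list(range(10))  # placeholder case records
subsets = draw_subsample_sets(cases, n_trees=5)
print(len(subsets), [len(s) for s in subsets])
```

Repeating `draw_subsample_sets` with a different seed for each symptom classifier gives every chain its own view of the training data, which is what makes the later averaging across classifiers meaningful.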
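Preferably, when training the symptom classifiers, adjusting the number of symptom classifiers (the first number) and the number of decision trees in each random forest (the third number) yields better syndrome-element differentiation efficiency and effectiveness.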
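In step S103, according to the first number of classification results and the preset confidence threshold, a secondary prediction is made of whether the symptom information belongs to the syndrome elements in the syndrome element information, and all the syndrome elements to which the symptom information belongs are determined from the result of the secondary prediction.
In the embodiment of the present invention, the confidence vector of the symptom information for each syndrome element in the syndrome element information is calculated; when the confidence vector exceeds the preset confidence threshold, it is determined that the symptom information belongs to the syndrome element corresponding to that confidence vector. Specifically, the confidence vector is calculated as [Figure PCTCN2017085353-appb-000005], where [Figure PCTCN2017085353-appb-000006] is the confidence vector and [Figure PCTCN2017085353-appb-000007] is the prediction vector of the k-th symptom classifier's initial prediction of whether the symptom information belongs to the j-th syndrome element in the syndrome element information.
As an example, the secondary prediction process is shown in the table in FIG. 2, where y_i is the i-th syndrome element in the syndrome element information, [Figure PCTCN2017085353-appb-000008] is the prediction vector (or classification result) of the i-th symptom classifier's initial prediction (or classification) of the symptom information, [Figure PCTCN2017085353-appb-000009] is the i-th symptom classifier's prediction of whether the symptom information belongs to the j-th syndrome element in the syndrome element information, [Figure PCTCN2017085353-appb-000010] equal to 0 means it does not belong and equal to 1 means it belongs, [Figure PCTCN2017085353-appb-000011] is the threshold function, t is the confidence threshold, and [Figure PCTCN2017085353-appb-000012] is the result of the secondary prediction, where 1 means the symptom information belongs to the corresponding syndrome element and 0 means it does not.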
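A minimal sketch of the secondary prediction follows. The confidence formula itself is only available as an image in this record, so the sketch assumes the common ensemble choice: the confidence for syndrome element j is the mean of the 0/1 predictions for j across the classifiers, and the threshold function returns 1 when that confidence strictly exceeds t. The function name and the sample numbers are hypothetical.

```python
from typing import List, Sequence


def secondary_prediction(results: Sequence[Sequence[int]], t: float) -> List[int]:
    """Combine per-classifier 0/1 prediction vectors into final 0/1 labels.

    results[k][j] is classifier k's initial prediction for syndrome element j.
    Assumption: confidence_j = mean over k of results[k][j].
    """
    m = len(results)      # the "first number" of classifiers
    n = len(results[0])   # number of syndrome elements
    confidence = [sum(r[j] for r in results) / m for j in range(n)]
    # Threshold function: 1 if the confidence exceeds t, otherwise 0.
    return [1 if c > t else 0 for c in confidence]


votes = [
    [0, 1, 1],
    [0, 1, 0],
    [1, 1, 0],
]
print(secondary_prediction(votes, t=0.5))  # [0, 1, 0]
```

With three classifiers and t = 0.5, an element is kept only when at least two of the three chains predicted it, so isolated votes such as the single 1 in the first and third columns are discarded.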
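In step S104, the syndrome name corresponding to the symptom information is synthesized and output according to all the syndrome elements to which the symptom information belongs.
In the embodiment of the present invention, a correspondence table between syndrome names and syndrome elements is set in advance; looking up in this table the syndrome name that contains all the determined syndrome elements gives the syndrome name corresponding to the symptom information. For example, FIG. 3 shows a correspondence table between syndrome names and syndrome elements.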
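The table lookup can be sketched as below. The contents of FIG. 3 are not available in this record, so the table is illustrative: the first row reuses the phlegm and blood-stasis example from the Background, and the second row is invented for the demonstration. Picking the matching row with the fewest elements when several rows contain all the predicted elements is also an assumption; the text only says to find a name that contains them all.

```python
from typing import Dict, FrozenSet, Iterable, Optional

# Hypothetical correspondence table: set of syndrome elements -> syndrome name.
NAME_TABLE: Dict[FrozenSet[str], str] = {
    frozenset({"phlegm-dampness", "blood stasis"}): "intermingled phlegm and blood stasis",
    frozenset({"phlegm-dampness", "blood stasis", "spleen"}): "illustrative three-element syndrome",
}


def synthesize_name(elements: Iterable[str],
                    table: Dict[FrozenSet[str], str] = NAME_TABLE) -> Optional[str]:
    """Return a syndrome name whose element set contains all predicted elements."""
    wanted = frozenset(elements)
    if not wanted:
        return None  # nothing was predicted, so there is nothing to name
    # Keep rows that contain every predicted element; prefer the smallest row.
    candidates = [(len(row), name) for row, name in table.items() if wanted <= row]
    return min(candidates)[1] if candidates else None


print(synthesize_name(["blood stasis", "phlegm-dampness"]))
print(synthesize_name(["kidney"]))
```

Using `frozenset` keys makes the lookup independent of the order in which the syndrome elements were predicted.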
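In the embodiment of the present invention, the symptom information is classified by the trained symptom classifiers (classifier chain models with random forests as base classifiers) to make an initial prediction of whether the symptom information belongs to the syndrome elements in the syndrome element information. A secondary prediction is then made according to the classification results and the preset confidence threshold to obtain all the syndrome elements to which the symptom information belongs, and these syndrome elements are combined into the syndrome name corresponding to the symptom information, thereby achieving syndrome-element differentiation of the symptom information. Combining random forests with classifier chains effectively improves high-dimensional data handling, generalization ability and the correlation between syndrome elements; at the same time, no feature selection on the symptoms is needed during differentiation, which effectively improves differentiation efficiency and effectiveness.
Those of ordinary skill in the art will understand that all or part of the steps of the method in the above embodiment can be completed by a program instructing the relevant hardware. The program can be stored in a computer-readable storage medium, such as ROM/RAM, a magnetic disk or an optical disc.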
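Embodiment 2:
FIG. 4 shows the structure of the device for syndrome differentiation of TCM syndrome elements provided by Embodiment 2 of the present invention. For convenience of description, only the parts related to the embodiment of the present invention are shown, including:
a symptom information receiving module 41, configured to receive the symptom information to be classified;
an initial prediction module 42, configured to classify the symptom information with the first number of trained symptom classifiers and obtain the first number of classification results, so as to make an initial prediction of whether the symptom information belongs to the syndrome elements in the preset syndrome element information;
a secondary prediction module 43, configured to make a secondary prediction of whether the symptom information belongs to the syndrome elements in the syndrome element information according to the first number of classification results and the preset confidence threshold, and to determine, from the result of the secondary prediction, all the syndrome elements to which the symptom information belongs; and
a syndrome name synthesizing module 44, configured to synthesize and output the syndrome name corresponding to the symptom information according to all the syndrome elements to which the symptom information belongs.
Preferably, as shown in FIG. 5, the initial prediction module 42 includes a random forest classification module 521, wherein:
the random forest classification module 521 is configured to take, in turn, each of the first number of symptom classifiers as the current symptom classifier for classifying the symptom information, and to determine through the second number of random forests in the current symptom classifier whether the symptom information belongs to the corresponding syndrome elements in the syndrome element information.
Preferably, the secondary prediction module 43 includes a confidence vector calculation module 531 and a threshold comparison module 532, wherein:
the confidence vector calculation module 531 is configured to calculate, according to the first number of classification results, the confidence vector of the symptom information for each syndrome element in the syndrome element information. The confidence vector is calculated as [Figure PCTCN2017085353-appb-000013], where [Figure PCTCN2017085353-appb-000014] is the confidence vector and [Figure PCTCN2017085353-appb-000015] is the prediction vector with which the k-th symptom classifier predicts whether the symptom information belongs to the j-th syndrome element in the syndrome element information; and
the threshold comparison module 532 is configured to compare all the confidence vectors with the confidence threshold and, when a confidence vector exceeds the confidence threshold, to determine that the symptom information belongs to the syndrome element corresponding to that confidence vector.
Preferably, the syndrome name synthesizing module 44 includes a syndrome name synthesizing sub-module 541, wherein:
the syndrome name synthesizing sub-module 541 is configured to determine, according to the preset correspondence table between syndrome elements and syndrome names, the syndrome name obtained by synthesizing all the syndrome elements to which the symptom information belongs, and to output that syndrome name.
Preferably, the random forest classification module 521 includes a random forest classification chain module, wherein:
the random forest classification chain module is configured to take, in turn, each of the second number of random forests as the current random forest for classifying the symptom information, to determine through the current random forest whether the symptom information belongs to the corresponding syndrome element in the syndrome element information, and, when it does, to set the corresponding syndrome element as a symptom in the symptom information.
In the embodiment of the present invention, the symptom information is classified by the trained symptom classifiers (classifier chain models with random forests as base classifiers) to make an initial prediction of whether the symptom information belongs to the syndrome elements in the syndrome element information. A secondary prediction is then made according to the classification results and the preset confidence threshold to obtain all the syndrome elements to which the symptom information belongs, and these syndrome elements are combined into the syndrome name corresponding to the symptom information, thereby achieving syndrome-element differentiation of the symptom information. Combining random forests with classifier chains effectively improves high-dimensional data handling, generalization ability and the correlation between syndrome elements; at the same time, no feature selection on the symptoms is needed during differentiation, which effectively improves differentiation efficiency and effectiveness.
In the embodiment of the present invention, each module of the device for syndrome differentiation of TCM syndrome elements can be implemented by a corresponding hardware or software module. The modules can be independent software or hardware modules, or can be integrated into a single software or hardware module; this does not limit the present invention. For the specific implementation of each module, refer to the description of each step in Embodiment 1 above; it is not repeated here.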
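One possible software arrangement of modules 41 to 44 is sketched below as a single class whose methods mirror the four modules. Everything here is illustrative: the class and field names are hypothetical, each classifier chain is represented by a callable that returns a 0/1 vector, and the mean-and-threshold rule in module 43 is an assumption because the confidence formula is only available as an image.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Sequence

# A trained classifier chain, seen from outside: symptoms in, 0/1 labels out.
Chain = Callable[[Sequence[int]], List[int]]


@dataclass
class DifferentiationDevice:
    chains: List[Chain]                                 # used by module 42
    threshold: float                                    # used by module 43
    name_lookup: Callable[[List[int]], Optional[str]]   # used by module 44

    def receive(self, symptom_info: Sequence[int]) -> List[int]:
        """Module 41: accept the symptom information to be classified."""
        return list(symptom_info)

    def initial_prediction(self, x: List[int]) -> List[List[int]]:
        """Module 42: one classification result per classifier chain."""
        return [chain(x) for chain in self.chains]

    def secondary_prediction(self, results: List[List[int]]) -> List[int]:
        """Module 43: mean confidence per element, then threshold (assumed rule)."""
        m = len(results)
        return [int(sum(column) / m > self.threshold) for column in zip(*results)]

    def synthesize(self, labels: List[int]) -> Optional[str]:
        """Module 44: map the final labels to a syndrome name."""
        return self.name_lookup(labels)

    def run(self, symptom_info: Sequence[int]) -> Optional[str]:
        x = self.receive(symptom_info)
        return self.synthesize(self.secondary_prediction(self.initial_prediction(x)))


# Hypothetical fixed-output chains and lookup, just to exercise the wiring.
device = DifferentiationDevice(
    chains=[lambda x: [1, 0], lambda x: [1, 1], lambda x: [1, 0]],
    threshold=0.5,
    name_lookup=lambda labels: "name-A" if labels == [1, 0] else None,
)
print(device.run([1, 0, 1]))  # name-A
```

Keeping each module as a separate method matches the remark above that the modules may be independent or integrated: the methods can be split into separate classes or services without changing the data passed between them.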
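The above are only preferred embodiments of the present invention and are not intended to limit it. Any modification, equivalent replacement or improvement made within the spirit and principles of the present invention shall fall within the scope of protection of the present invention.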

Claims (10)

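  1. A method for syndrome differentiation of traditional Chinese medicine (TCM) syndrome elements, characterized in that the method comprises the following steps:
    receiving the symptom information to be classified, the symptom information including a plurality of symptoms;
    classifying the symptom information with a first number of trained symptom classifiers and obtaining the first number of classification results, so as to make an initial prediction of whether the symptom information belongs to the syndrome elements in preset syndrome element information, each symptom classifier being a classifier chain model with random forests as base classifiers;
    making a secondary prediction of whether the symptom information belongs to the syndrome elements in the syndrome element information according to the first number of classification results and a preset confidence threshold, and determining, from the result of the secondary prediction, all the syndrome elements to which the symptom information belongs;
    synthesizing and outputting the syndrome name corresponding to the symptom information according to all the syndrome elements to which the symptom information belongs.
  2. The method according to claim 1, characterized in that the step of classifying the symptom information with a first number of trained symptom classifiers and obtaining the first number of classification results comprises:
    taking, in turn, each of the first number of symptom classifiers as the current symptom classifier for classifying the symptom information, and determining through a second number of random forests in the current symptom classifier whether the symptom information belongs to the corresponding syndrome elements in the syndrome element information;
    the second number being equal to the number of syndrome elements in the syndrome element information.
  3. The method according to claim 2, characterized in that the step of determining through the second number of random forests in the current symptom classifier whether the symptom information belongs to the corresponding syndrome elements in the syndrome element information comprises:
    taking, in turn, each of the second number of random forests as the current random forest for classifying the symptom information, determining through the current random forest whether the symptom information belongs to the corresponding syndrome element in the syndrome element information, and, when the symptom information belongs to the corresponding syndrome element, setting the corresponding syndrome element as a symptom in the symptom information.
  4. The method according to claim 1, characterized in that the step of making a secondary prediction of whether the symptom information belongs to the syndrome elements in the syndrome element information according to the first number of classification results and a preset confidence threshold comprises:
    calculating, according to the first number of classification results, the confidence vector of the symptom information for each syndrome element in the syndrome element information, the confidence vector being calculated as [Figure PCTCN2017085353-appb-100001], where [Figure PCTCN2017085353-appb-100002] is the confidence vector and [Figure PCTCN2017085353-appb-100003] is the prediction vector of the k-th symptom classifier's initial prediction of whether the symptom information belongs to the j-th syndrome element in the syndrome element information;
    comparing all the confidence vectors with the confidence threshold and, when a confidence vector exceeds the confidence threshold, determining that the symptom information belongs to the syndrome element corresponding to that confidence vector.
  5. The method according to claim 1, characterized in that the step of synthesizing and outputting the syndrome name corresponding to the symptom information according to all the syndrome elements to which the symptom information belongs comprises:
    determining, according to a preset correspondence table between syndrome elements and syndrome names, the syndrome name obtained by synthesizing all the syndrome elements to which the symptom information belongs, and outputting the syndrome name.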
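  6. A device for syndrome differentiation of traditional Chinese medicine (TCM) syndrome elements, characterized in that the device comprises:
    a symptom information receiving module, configured to receive the symptom information to be classified, the symptom information including a plurality of symptoms;
    an initial prediction module, configured to classify the symptom information with a first number of trained symptom classifiers and obtain the first number of classification results, so as to make an initial prediction of whether the symptom information belongs to the syndrome elements in preset syndrome element information, each symptom classifier being a classifier chain model with random forests as base classifiers;
    a secondary prediction module, configured to make a secondary prediction of whether the symptom information belongs to the syndrome elements in the syndrome element information according to the first number of classification results and a preset confidence threshold, and to determine, from the result of the secondary prediction, all the syndrome elements to which the symptom information belongs; and
    a syndrome name synthesizing module, configured to synthesize and output the syndrome name corresponding to the symptom information according to all the syndrome elements to which the symptom information belongs.
  7. The device according to claim 6, characterized in that the initial prediction module comprises:
    a random forest classification module, configured to take, in turn, each of the first number of symptom classifiers as the current symptom classifier for classifying the symptom information, and to determine through a second number of random forests in the current symptom classifier whether the symptom information belongs to the corresponding syndrome elements in the syndrome element information;
    the second number being equal to the number of syndrome elements in the syndrome element information.
  8. The device according to claim 7, characterized in that the random forest classification module comprises:
    a random forest classification chain module, configured to take, in turn, each of the second number of random forests as the current random forest for classifying the symptom information, to determine through the current random forest whether the symptom information belongs to the corresponding syndrome element in the syndrome element information, and, when the symptom information belongs to the corresponding syndrome element, to set the corresponding syndrome element as a symptom in the symptom information.
  9. The device according to claim 7, characterized in that the secondary prediction module comprises:
    a confidence vector calculation module, configured to calculate, according to the first number of classification results, the confidence vector of the symptom information for each syndrome element in the syndrome element information, the confidence vector being calculated as [Figure PCTCN2017085353-appb-100004], where [Figure PCTCN2017085353-appb-100005] is the confidence vector and [Figure PCTCN2017085353-appb-100006] is the prediction vector of the k-th symptom classifier's initial prediction of whether the symptom information belongs to the j-th syndrome element in the syndrome element information; and
    a threshold comparison module, configured to compare all the confidence vectors with the confidence threshold and, when a confidence vector exceeds the confidence threshold, to determine that the symptom information belongs to the syndrome element corresponding to that confidence vector.
  10. The device according to claim 6, characterized in that the syndrome name synthesizing module comprises:
    a syndrome name synthesizing sub-module, configured to determine, according to a preset correspondence table between syndrome elements and syndrome names, the syndrome name obtained by synthesizing all the syndrome elements to which the symptom information belongs, and to output the syndrome name.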
PCT/CN2017/085353 2017-03-10 2017-05-22 Method and device for syndrome differentiation of traditional Chinese medicine syndrome elements WO2018161435A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201710141468.6A CN107122583A (zh) 2017-03-10 2017-03-10 Method and device for syndrome differentiation of traditional Chinese medicine syndrome elements
CN201710141468.6 2017-03-10

Publications (1)

Publication Number Publication Date
WO2018161435A1 true WO2018161435A1 (zh) 2018-09-13

Family

ID=59717958

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2017/085353 WO2018161435A1 (zh) 2017-03-10 2017-05-22 Method and device for syndrome differentiation of traditional Chinese medicine syndrome elements

Country Status (2)

Country Link
CN (1) CN107122583A (zh)
WO (1) WO2018161435A1 (zh)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
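CN110853764B (zh) * 2019-11-28 2023-11-14 成都中医药大学 Diabetes syndrome prediction system
CN111128375B (zh) * 2020-01-10 2021-11-02 电子科技大学 Tibetan medicine diagnosis aid device based on multi-label learning
CN113555086B (zh) * 2021-07-26 2024-05-10 平安科技(深圳)有限公司 Machine-learning-based syndrome differentiation analysis method, apparatus, device and medium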

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
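CN102222153A (zh) * 2010-01-27 2011-10-19 洪文学 Quantitative syndrome-differentiation diagnosis method for TCM machine inquiry
CN102831404A (zh) * 2012-08-15 2012-12-19 深圳先进技术研究院 Gesture detection method and system
CN103426008A (zh) * 2013-08-29 2013-12-04 北京大学深圳研究生院 Visual human hand tracking method and system based on online machine learning
CN105608441A (zh) * 2016-01-13 2016-05-25 浙江宇视科技有限公司 Vehicle type recognition method and system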
US20160170492A1 (en) * 2014-12-15 2016-06-16 Aaron DeBattista Technologies for robust two-dimensional gesture recognition

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
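CN105701336B (zh) * 2015-12-31 2018-09-04 深圳先进技术研究院 TCM syndrome differentiation and typing system based on EEG data, and method and system for model building
CN105608476B (zh) * 2016-02-16 2019-03-15 北京小米移动软件有限公司 Classification method and device based on a random forest classifier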

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
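CN102222153A (zh) * 2010-01-27 2011-10-19 洪文学 Quantitative syndrome-differentiation diagnosis method for TCM machine inquiry
CN102831404A (zh) * 2012-08-15 2012-12-19 深圳先进技术研究院 Gesture detection method and system
CN103426008A (zh) * 2013-08-29 2013-12-04 北京大学深圳研究生院 Visual human hand tracking method and system based on online machine learning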
US20160170492A1 (en) * 2014-12-15 2016-06-16 Aaron DeBattista Technologies for robust two-dimensional gesture recognition
CN105608441A (zh) * 2016-01-13 2016-05-25 浙江宇视科技有限公司 Vehicle type recognition method and system

Also Published As

Publication number Publication date
CN107122583A (zh) 2017-09-01

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 17900138

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

32PN Ep: public notification in the ep bulletin as address of the adressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 09/12/2019)

122 Ep: pct application non-entry in european phase

Ref document number: 17900138

Country of ref document: EP

Kind code of ref document: A1