CN114822875B

CN114822875B - Disease and medicine matching method based on naive Bayesian network

Info

Publication number: CN114822875B
Application number: CN202210639583.7A
Authority: CN
Inventors: 李海滨; 杨金凤; 包散丹
Original assignee: Inner Mongolia University of Technology
Current assignee: Inner Mongolia University of Technology
Priority date: 2022-06-08
Filing date: 2022-06-08
Publication date: 2024-09-17
Anticipated expiration: 2042-06-08
Also published as: CN114822875A

Abstract

The invention discloses a method for matching symptoms and drugs based on a naive Bayesian network. Symptoms (x^1, x^2, ..., x^i, ..., x^I) are used as input nodes, each representing one symptom; drugs (y^1, y^2, ..., y^j, ..., y^J) are used as output nodes, each representing one drug; the connection between an input node and an output node represents the mapping between that symptom and that drug. Each connection between a symptom node and a drug node is characterized by a conditional probability, which reflects the strength of association between the input node and the output node. The greater the conditional probability, the stronger the correlation between the symptom and the drug at the two ends of the connection, that is, the more suitable the drug is for treating the symptom. With the invention, the doctor only needs to determine the patient's symptoms; the method then provides a reference basis for determining the treatment plan, recommends reasonable, effective and efficient drugs to the doctor, and reduces irrational drug use.

Description

Matching method between disease and medicine based on naive Bayesian network

Technical field:

The present invention relates to the field of biomedical technology, and in particular to a method for matching symptoms and drugs based on a naive Bayesian network.

Background technology:

In traditional Chinese and Mongolian medicine, doctors prescribe for patients based on what they have learned. However, because symptoms and the efficacy of medicinal materials are complex and diverse, combining drugs into a prescription depends heavily on the doctor's experience, and drugs are sometimes used irrationally. This directly affects the treatment outcome and may even harm the patient's health.

At present, there are many methods for studying the relationship between the drugs in a prescription and the symptoms they treat. Most of them are prescription-decomposition methods, such as drug-pair studies and single-drug studies. Such methods can only vary one or two drugs while holding the other drugs fixed, so they examine the symptoms treated by a prescription one or two drugs at a time. They cannot accurately and comprehensively study, from an overall perspective, the nonlinear mapping between the symptoms a prescription treats and its individual drugs. As a result, they cannot provide a reference for doctors writing prescriptions, nor can they efficiently and effectively reduce irrational drug use or the dependence on doctors' experience.

Summary of the invention:

The purpose of the present invention is to provide a method for matching symptoms and drugs based on a naive Bayesian network, which can provide a reference for determining a treatment plan, recommend reasonable, effective and efficient drugs to doctors, reduce irrational drug use, reduce the dependence on doctors' experience, and improve doctors' work efficiency.

The present invention is implemented by the following technical solution:

A method for matching symptoms and drugs based on a naive Bayesian network, comprising the following steps:

S1. Collect a number of traditional prescriptions and pair the symptoms X treated by each prescription with the drugs Y it uses, forming a prescription sample set Z;

S2. Take part of the prescriptions in the prescription sample set Z established in S1 as a training prescription set, binarize the training prescription set to obtain a training set, train the naive Bayesian network model on the training set, and calculate the probability P(x^i) of each symptom, the probability P(y^j) of each drug, and the conditional probability P(x^i = t | y^j = l) between symptoms and drugs in the training set, where t = 0, 1 denotes that symptom x^i is absent or present, and l = 0, 1 denotes that drug y^j is not used or used;

S3. Take the symptoms X' of the case to be prescribed as the input and, from the symptom probabilities P(x^i), the drug probabilities P(y^j) and the conditional probabilities P(x^i = t | y^j = l) calculated in S2, calculate for each drug the posterior probability P(y^j = 1 | X) that it is used and the posterior probability P(y^j = 0 | X) that it is not used, where X = (x^1, x^2, ..., x^i, ..., x^I) represents the patient's symptoms and each x^i (i = 1, 2, ..., I) takes the value 1 or 0;

S4. Compare P(y^j = 1 | X) with P(y^j = 0 | X). If P(y^j = 1 | X) ≥ P(y^j = 0 | X), the drug is judged to match the symptoms X' of the case. Combining all drugs that match the symptoms X' of the case gives the matched drug result Y = (y^1, y^2, ..., y^J).

Preferably, in S1,

the prescription sample set is Z = {(X_1, Y_1), (X_2, Y_2), ..., (X_n, Y_n), ..., (X_N, Y_N)},

where N is the total number of prescriptions in the prescription sample set;

the symptom features of the n-th prescription are recorded as the vector X_n = (x_n^1, x_n^2, ..., x_n^I), in which x_n^i is the i-th symptom of the n-th prescription;

the drug combination of the n-th prescription is recorded as the vector Y_n = (y_n^1, y_n^2, ..., y_n^J), in which y_n^j is the j-th drug of the n-th prescription.

Preferably, in S2, the training prescription set is binarized as follows: within one prescription, if symptom x^i is present, it is assigned the value 1; if symptom x^i is absent, it is assigned the value 0. Similarly, if the prescription contains drug y^j, the drug is assigned the value 1; if the prescription does not contain drug y^j, the drug is assigned the value 0.
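The binarization rule can be sketched in a few lines of Python. This is only an illustration: the helper name `binarize`, the short vocabularies and the example prescription are hypothetical and are not taken from Tables 1 to 4.

```python
def binarize(present_items, vocabulary):
    """Return a 0/1 list over `vocabulary`: 1 if the item occurs, else 0."""
    present = set(present_items)
    return [1 if item in present else 0 for item in vocabulary]


# Hypothetical vocabularies (the embodiment uses 46 symptoms and 78 drugs).
symptoms = ["palpitations", "cough", "chest pain"]
drugs = ["clove", "safflower"]

# One hypothetical prescription: the symptoms it treats and the drugs it uses.
x_row = binarize(["cough", "palpitations"], symptoms)
y_row = binarize(["safflower"], drugs)
```

Each prescription then contributes one binary symptom row and one binary drug row to the training set.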

Preferably, S2 comprises the following steps:

(1) Calculate the probability P(x^i) of each symptom and the probability P(y^j) of each drug in the training set. Specifically,

in formula (1), the count term represents the cumulative frequency of x^i = t in the training set; t = 0, 1 denotes the two cases in which symptom x^i is absent or present, that is, t = 0 means symptom x^i is absent from the n-th prescription and t = 1 means symptom x^i is present in the n-th prescription; N is the total number of prescriptions in the prescription sample set;

in formula (2), the count term represents the cumulative frequency of y^j = l in the training set; l = 0, 1 denotes the two cases in which drug y^j is not used or used, that is, l = 0 means drug y^j is not used in the n-th prescription and l = 1 means drug y^j is used in the n-th prescription; N is the total number of prescriptions in the prescription sample set;

(2) Calculate the probability P(x^i = t, y^j = l) that symptom x^i = t and drug y^j = l occur together in the training set,

in formula (3), the count term represents the cumulative number of prescriptions in the training set with symptom x^i = t and drug y^j = l;

(3) Calculate the conditional probability P(x^i = t | y^j = l) between symptoms and drugs.

Substituting formulas (2) and (3) into formula (4) gives the conditional probability.
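The formula images for (1) to (4) are not reproduced in this text. Read against the definitions above, they plausibly take the following form. The indicator notation I(·) and the per-prescription entries x_n^i and y_n^j are assumptions introduced here for illustration, not notation confirmed by the source:

```latex
\begin{align}
P(x^i = t) &= \frac{\sum_{n=1}^{N} I(x_n^i = t)}{N} \tag{1} \\
P(y^j = l) &= \frac{\sum_{n=1}^{N} I(y_n^j = l)}{N} \tag{2} \\
P(x^i = t,\, y^j = l) &= \frac{\sum_{n=1}^{N} I(x_n^i = t,\; y_n^j = l)}{N} \tag{3} \\
P(x^i = t \mid y^j = l) &= \frac{P(x^i = t,\, y^j = l)}{P(y^j = l)} \tag{4}
\end{align}
```

Under this reading, the factor 1/N cancels in (4), so the conditional probability is simply the share of prescriptions with y^j = l in which x^i = t.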

Preferably, in S3, the posterior probability P(y^j = l | X) that each drug is used or not used is calculated as follows:

In formula (5), X = (x^1, x^2, ..., x^i, ..., x^I) represents the patient's symptoms, and each x^i (i = 1, 2, ..., I) takes the value 1 or 0; l = 0, 1 denotes the two cases in which drug y^j is not used or used, that is, P(y^j = 0 | X) is the posterior probability that drug y^j is not used and P(y^j = 1 | X) is the posterior probability that drug y^j is used.
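The image of formula (5) is likewise missing here. Under the naive Bayes assumption that symptoms are conditionally independent given the drug, the posterior would be expected to have the shape below. This is a hedged reconstruction up to the normalizing denominator, whose exact form in the source cannot be recovered from this text:

```latex
P(y^j = l \mid X) \;\propto\; P(y^j = l) \prod_{i=1}^{I} P(x^i \mid y^j = l), \qquad l = 0, 1
```

Whatever the denominator is, it is the same for l = 0 and l = 1, so it does not change the outcome of the comparison between P(y^j = 1 | X) and P(y^j = 0 | X) in S4.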

Advantages of the present invention:

The present invention uses symptoms (x^1, x^2, ..., x^i, ..., x^I) as input nodes, each representing one symptom, and drugs (y^1, y^2, ..., y^j, ..., y^J) as output nodes, each representing one drug; the connection between an input node and an output node represents the mapping between that symptom and that drug. Each connection between a symptom node and a drug node is characterized by a conditional probability, which reflects the strength of association between the input node and the output node. The greater the conditional probability, the stronger the correlation between the symptom and the drug at the two ends of the connection, that is, the more suitable the drug is for treating the symptom.

With the present invention, the doctor only needs to determine the patient's symptoms and enter them to obtain the matched drug result. This provides a reference for determining the treatment plan, recommends reasonable, effective and efficient drugs to the doctor, reduces irrational drug use, reduces the dependence on doctors' experience, and improves doctors' work efficiency.

Description of the drawings:

To illustrate the embodiments of the present invention or the technical solutions in the prior art more clearly, the drawings needed for describing the embodiments or the prior art are briefly introduced below. The drawings described below are only some embodiments of the present invention; a person of ordinary skill in the art can derive other drawings from them without creative effort.

FIG. 1 is a flow chart of the method for matching symptoms and drugs based on a naive Bayesian network according to the present invention;

FIG. 2 is a structural model diagram of the method for matching symptoms and drugs based on a naive Bayesian network according to the present invention.

Detailed description of the embodiments:

The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the drawings. The described embodiments are only some of the embodiments of the present invention, not all of them. All other embodiments obtained by a person of ordinary skill in the art on the basis of these embodiments without creative effort fall within the scope of protection of the present invention.

Embodiment 1:

The method for matching symptoms and drugs based on a naive Bayesian network, as shown in FIG. 1 and FIG. 2, comprises the following steps:

S1. Collect 24 traditional Mongolian medicine prescriptions and pair the symptoms X treated by each prescription with the drugs Y it uses, forming the prescription sample set

Z = {(X_1, Y_1), (X_2, Y_2), ..., (X_n, Y_n), ..., (X_24, Y_24)}, as shown in Table 1.

Table 1 Prescription sample set

Table 1 involves 46 symptoms and 78 drugs in total. The symptoms x^i (i = 1, 2, ..., 46) and drugs y^j (j = 1, 2, ..., 78) involved are listed in Table 2.

Table 2 Symptoms and drugs

S2. The first 22 prescriptions in the prescription sample set Z established in S1 are used as the training prescription set, as shown in Table 3; the last 2 are used as the test prescription set.

Table 3 Training prescription set

The symptoms x^i and drugs y^j in the training prescription set of Table 3 are binarized. Within one prescription, if symptom x^i is present, it is assigned the value 1; if symptom x^i is absent, it is assigned the value 0. Writing x_n^i for the value of the i-th symptom in the n-th prescription, x_n^i = 1 means that the i-th symptom is present in the n-th prescription, that is, the drug combination of the n-th prescription can treat that symptom; x_n^i = 0 means that the i-th symptom is absent from the n-th prescription, that is, the drug combination of the n-th prescription does not treat that symptom. Similarly, if the prescription contains drug y^j, the drug is assigned the value 1; if it does not, the drug is assigned the value 0. Writing y_n^j for the value of the j-th drug in the n-th prescription, y_n^j = 1 means that the j-th drug is used in the n-th prescription, and y_n^j = 0 means that it is not used. The result is the training set shown in Table 4.

Table 4 Training set

The naive Bayesian network model is trained on the training set to calculate the probability P(x^i) of each symptom, the probability P(y^j) of each drug, and the conditional probability P(x^i = t | y^j = l) between symptoms and drugs in the training set, where t = 0, 1 denotes that symptom x^i is absent or present, and l = 0, 1 denotes that drug y^j is not used or used. Specifically, this comprises the following steps:

In formula (1), the count term represents the cumulative frequency of x^i = t in the training set; t = 0, 1 denotes the two cases in which symptom x^i is absent or present, that is, t = 0 means symptom x^i is absent from the n-th prescription and t = 1 means symptom x^i is present in the n-th prescription; N is the total number of prescriptions in the prescription sample set. Formula (1) gives the absence probability P(x^i = 0) and the presence probability P(x^i = 1) of each symptom.

In formula (2), the count term represents the cumulative frequency of y^j = l in the training set; l = 0, 1 denotes the two cases in which drug y^j is not used or used, that is, l = 0 means drug y^j is not used in the n-th prescription and l = 1 means drug y^j is used in the n-th prescription; N is the total number of prescriptions in the prescription sample set. Formula (2) gives the non-use probability P(y^j = 0) and the use probability P(y^j = 1) of each drug.

Formula (3) gives the joint probabilities of each drug and each symptom in the training set for the four possible cases: P(x^i = 1, y^j = 1), P(x^i = 1, y^j = 0), P(x^i = 0, y^j = 1) and P(x^i = 0, y^j = 0).

Substituting P(y^j = 1) and P(x^i = 1, y^j = 1) into formula (4) gives the conditional probability P(x^i = 1 | y^j = 1) that the symptom is present given that the drug is used. Substituting P(y^j = 1) and P(x^i = 0, y^j = 1) into formula (4) gives the conditional probability P(x^i = 0 | y^j = 1) that the symptom is absent given that the drug is used. Substituting P(y^j = 0) and P(x^i = 1, y^j = 0) into formula (4) gives the conditional probability P(x^i = 1 | y^j = 0) that the symptom is present given that the drug is not used. Substituting P(y^j = 0) and P(x^i = 0, y^j = 0) into formula (4) gives the conditional probability P(x^i = 0 | y^j = 0) that the symptom is absent given that the drug is not used.
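The training step can be sketched as plain frequency counting. This is a minimal illustration under stated assumptions, not the patent's implementation: the function name `train`, the nested-list layout and the toy data are hypothetical, N is taken to be the number of rows passed in, and returning 0.0 when P(y^j = l) is zero is a choice made here because the source does not say how that case is handled.

```python
def train(X, Y):
    """Estimate the probabilities of formulas (1) to (4) by counting.

    X: list of N binary symptom rows, Y: list of N binary drug rows.
    Returns p_x[i][t], p_y[j][l] and cond[i][j][t][l] = P(x^i=t | y^j=l).
    """
    N = len(X)
    I, J = len(X[0]), len(Y[0])
    # Formulas (1) and (2): relative frequencies of each value.
    p_x = [[sum(1 for r in X if r[i] == t) / N for t in (0, 1)] for i in range(I)]
    p_y = [[sum(1 for r in Y if r[j] == l) / N for l in (0, 1)] for j in range(J)]
    cond = [[[[0.0, 0.0], [0.0, 0.0]] for _ in range(J)] for _ in range(I)]
    for i in range(I):
        for j in range(J):
            for t in (0, 1):
                for l in (0, 1):
                    # Formula (3): joint relative frequency.
                    joint = sum(
                        1 for xr, yr in zip(X, Y) if xr[i] == t and yr[j] == l
                    ) / N
                    # Formula (4); 0.0 if the drug value never occurs (assumption).
                    cond[i][j][t][l] = joint / p_y[j][l] if p_y[j][l] > 0 else 0.0
    return p_x, p_y, cond


# Toy data: 4 hypothetical prescriptions, 2 symptoms, 1 drug.
X = [[1, 0], [1, 1], [0, 1], [0, 0]]
Y = [[1], [1], [0], [0]]
p_x, p_y, cond = train(X, Y)
```

With only 22 training prescriptions, many of these counts can be zero, which makes the corresponding products in the posterior zero as well; the source does not mention smoothing, so none is applied in this sketch.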

The posterior probability P(y^j = l | X) that each drug is used or not used is calculated as follows:

In formula (5),

X = (x^1, x^2, ..., x^i, ..., x^I) represents the patient's symptoms, and each x^i (i = 1, 2, ..., I) takes the value 1 or 0. l = 0, 1 denotes the two cases in which drug y^j is not used or used, that is, P(y^j = 0 | X) is the posterior probability that drug y^j is not used and P(y^j = 1 | X) is the posterior probability that drug y^j is used.
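A posterior of this kind can be sketched for one drug as follows. The sketch assumes the usual naive Bayes product of the prior and the per-symptom conditional probabilities, and it normalizes over l = 0, 1 so that the two posteriors sum to 1; the exact denominator of formula (5) is not recoverable from this text, so that normalization is an assumption. The function name `posterior` and all numbers are hypothetical.

```python
from math import prod


def posterior(x, p_y_j, cond_j):
    """Return [P(y^j=0 | X), P(y^j=1 | X)] for one drug.

    x: observed 0/1 symptom vector.
    p_y_j: [P(y^j=0), P(y^j=1)].
    cond_j[i][t][l]: P(x^i=t | y^j=l) for this drug.
    """
    scores = [
        p_y_j[l] * prod(cond_j[i][x[i]][l] for i in range(len(x)))
        for l in (0, 1)
    ]
    total = sum(scores)
    if total == 0:
        # Both products vanished; no evidence either way (assumption).
        return [0.0, 0.0]
    return [s / total for s in scores]


# Hypothetical probabilities for one drug and two symptoms.
p_y_j = [0.5, 0.5]
cond_j = [
    [[0.8, 0.2], [0.2, 0.8]],  # symptom 1: rows t=0, t=1; columns l=0, l=1
    [[0.5, 0.5], [0.5, 0.5]],  # symptom 2 carries no information here
]
post = posterior([1, 0], p_y_j, cond_j)
```

Because the normalizer is shared by both values of l, the comparison between the two posteriors depends only on the unnormalized scores.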

Experimental Example 1:

For the symptoms of the 2 prescriptions in the test prescription set of Embodiment 1, the method of the present invention was used to obtain the matching drugs, in order to check how consistent the drug combinations of the traditional prescriptions are with those given by the method of the present invention. The symptoms and drug combinations are listed in Table 5.

Table 5 Test prescription set

The symptoms x^i and drugs y^j in the test prescription set of Table 5 are binarized to form the test set.

Taking the symptoms in the test set as input and following the method of Embodiment 1, the symptom probabilities P(x^i), the drug probabilities P(y^j) and the conditional probabilities P(x^i = t | y^j = l) between symptoms and drugs are calculated first, and then the posterior probability P(y^j = 1 | X) of using each drug and the posterior probability P(y^j = 0 | X) of not using it. For each drug, P(y^j = 0 | X) and P(y^j = 1 | X) are compared: if P(y^j = 1 | X) ≥ P(y^j = 0 | X), the drug is judged to be used for these symptoms; otherwise it is judged not to be used. Finally, all drugs judged to be used for these symptoms are combined to give the matched drug result Y = (y^1, y^2, ..., y^J) obtained with the method of the present invention. The results are listed in Table 6.
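The decision rule is a per-drug comparison of the two posteriors. A minimal sketch, with a hypothetical helper name `match_drugs` and made-up posterior values (not the values behind Table 6):

```python
def match_drugs(posteriors, drug_names):
    """Keep every drug whose P(y^j=1 | X) is at least P(y^j=0 | X).

    posteriors: list of (P(y^j=0 | X), P(y^j=1 | X)) pairs, one per drug.
    """
    return [
        name
        for name, (p_unused, p_used) in zip(drug_names, posteriors)
        if p_used >= p_unused
    ]


# Hypothetical posteriors for three drugs.
drug_names = ["clove", "safflower", "gardenia"]
posteriors = [(0.2, 0.8), (0.5, 0.5), (0.9, 0.1)]
matched = match_drugs(posteriors, drug_names)
```

Because the rule uses ≥, a drug whose two posteriors are exactly equal is included in the matched result.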

Table 6 Matched drug results obtained with the method of the present invention

Comparing the traditional drug combinations in Table 5 with the matched drug results in Table 6 obtained with the method of the present invention shows the following. For No. 2, the two drug combinations are identical. For No. 1, the traditional prescription in Table 5 contains 9 drugs, while the result in Table 6 gives 10 drugs, with gardenia, toosendan fruit and safflower in Table 6 replacing rabbit heart and clove in Table 5.

For the symptoms of No. 2, the traditional prescription and the matched drug result obtained by the present invention differ, so the following analysis is performed:

According to formula (5) in Embodiment 1, the posterior probability P(x^i = 1 | y^j = 1) of using a single drug for a single symptom can be obtained, which reflects the therapeutic effect of the drug on the symptom. Substituting P(x^i = 1), P(y^j = 1) and P(x^i = 1 | y^j = 1) into formula (5) gives the posterior probability P(y^j = 1 | x^i = 1) between a single symptom and a single drug, that is, whether each single drug is used or not used. The results are shown in Table 7.

Table 7 Use and non-use of single drugs

The symptoms listed in Table 7 are those that appear in No. 2, and the table shows the therapeutic effect of each drug on each symptom. Table 7 shows that, for palpitations, clove and rabbit heart from the traditional drug combination are more effective than gardenia, toosendan fruit and safflower, whereas for stabbing chest pain, gardenia and toosendan fruit are more effective than clove and rabbit heart. For cough and profuse white frothy sputum, safflower is more effective than clove and rabbit heart. Therefore, for patients with severe stabbing chest pain, cough and profuse white frothy sputum, the drug combination in Table 5 can be used; for patients with severe palpitations, the drug combination in Table 4 can be used.

In summary, the method of the present invention for matching drugs to symptoms is reliable.

The above is only a preferred embodiment of the present invention and is not intended to limit it. Any modification, equivalent substitution or improvement made within the spirit and principle of the present invention shall fall within its scope of protection.
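For a single symptom and a single drug, the three quantities named above combine through Bayes' theorem. The sketch below shows that arithmetic with hypothetical numbers; it does not reproduce any value from Table 7, and the function name is an illustration only.

```python
def single_symptom_posterior(p_x1_given_y1, p_y1, p_x1):
    """Bayes' theorem: P(y^j=1 | x^i=1) = P(x^i=1 | y^j=1) * P(y^j=1) / P(x^i=1)."""
    if p_x1 <= 0:
        raise ValueError("P(x^i = 1) must be positive")
    return p_x1_given_y1 * p_y1 / p_x1


# Hypothetical inputs: the symptom appears in half of the prescriptions that
# use the drug, the drug is used in 20% of prescriptions, and the symptom
# appears in 25% of prescriptions.
p = single_symptom_posterior(p_x1_given_y1=0.5, p_y1=0.2, p_x1=0.25)
```

A larger value for one drug than for another, given the same symptom, is what the comparison of drugs in Table 7 would rest on under this reading.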

Claims

1. A method for matching symptoms and drugs based on a naive Bayesian network, comprising the following steps:

S1, collecting a plurality of traditional prescriptions, and pairing the symptoms X treated by each prescription with the drugs Y it uses, to form a prescription sample set Z;

S2, taking part of the prescriptions in the prescription sample set Z established in S1 as a training prescription set, binarizing the training prescription set to obtain a training set, training a naive Bayesian network model with the training set, and calculating the probability P(x^i) of each symptom in the training set, the probability P(y^j) of each drug, and the conditional probability P(x^i = t | y^j = l) between symptoms and drugs, wherein t = 0, 1 denotes that symptom x^i is absent or present, and l = 0, 1 denotes that drug y^j is not used or used;

S3, taking the symptoms X' of a case to be prescribed as the input, and calculating, from the symptom probabilities P(x^i), the drug probabilities P(y^j) and the conditional probabilities P(x^i = t | y^j = l) calculated in S2, the posterior probability P(y^j = 1 | X) that each drug is used and the posterior probability P(y^j = 0 | X) that each drug is not used, wherein X = (x^1, x^2, ..., x^i, ..., x^I) represents the symptoms of the patient, and each x^i (i = 1, 2, ..., I) takes the value 1 or 0;

S4, comparing P(y^j = 1 | X) with P(y^j = 0 | X); if P(y^j = 1 | X) ≥ P(y^j = 0 | X), judging that the drug matches the symptoms X' of the case; and combining all drugs that match the symptoms X' of the case to obtain the matched drug result Y = (y^1, y^2, ..., y^J).

2. The method for matching symptoms and drugs based on a naive Bayesian network according to claim 1, wherein in S1,

the prescription sample set is Z = {(X_1, Y_1), (X_2, Y_2), ..., (X_n, Y_n), ..., (X_N, Y_N)},

wherein N is the total number of prescriptions in the prescription sample set;

the symptom features of the n-th prescription are recorded as the vector X_n = (x_n^1, x_n^2, ..., x_n^I), in which x_n^i is the i-th symptom of the n-th prescription;

the drug combination of the n-th prescription is recorded as the vector Y_n = (y_n^1, y_n^2, ..., y_n^J), in which y_n^j is the j-th drug of the n-th prescription.

3. The method for matching symptoms and drugs based on a naive Bayesian network according to claim 2, wherein S2 comprises the following step: the training prescription set is binarized such that, within one prescription, if symptom x^i is present, symptom x^i is assigned the value 1; if symptom x^i is absent, symptom x^i is assigned the value 0; similarly, if the prescription contains drug y^j, drug y^j is assigned the value 1; if the prescription does not contain drug y^j, drug y^j is assigned the value 0.

4. The method for matching symptoms and drugs based on a naive Bayesian network according to claim 2, wherein S2 comprises the following steps:

(1) calculating the probability P(x^i) of each symptom and the probability P(y^j) of each drug in the training set, specifically,

in formula (1), the count term represents the cumulative frequency of x^i = t in the training set; t = 0, 1 denotes the two cases in which symptom x^i is absent or present, that is, t = 0 means symptom x^i is absent from the n-th prescription and t = 1 means symptom x^i is present in the n-th prescription; N is the total number of prescriptions in the prescription sample set;

in formula (2), the count term represents the cumulative frequency of y^j = l in the training set; l = 0, 1 denotes the two cases in which drug y^j is not used or used, that is, l = 0 means drug y^j is not used in the n-th prescription and l = 1 means drug y^j is used in the n-th prescription; N is the total number of prescriptions in the prescription sample set;

(2) calculating the probability P(x^i = t, y^j = l) that symptom x^i = t and drug y^j = l occur together in the training set,

in formula (3), the count term represents the cumulative number of prescriptions in the training set with symptom x^i = t and drug y^j = l;

(3) calculating the conditional probability P(x^i = t | y^j = l) between symptoms and drugs,

wherein the conditional probability is obtained by substituting formulas (2) and (3) into formula (4).

5. The method for matching symptoms and drugs based on a naive Bayesian network according to claim 4, wherein in S3 the posterior probability P(y^j = l | X) that each drug is used or not used is calculated as follows:

in formula (5), X = (x^1, x^2, ..., x^i, ..., x^I) represents the symptoms of the patient, and each x^i (i = 1, 2, ..., I) takes the value 1 or 0; l = 0, 1 denotes the two cases in which drug y^j is not used or used, that is, P(y^j = 0 | X) is the posterior probability that drug y^j is not used, and P(y^j = 1 | X) is the posterior probability that drug y^j is used.