CN114121212A - Traditional Chinese medicine prescription generation method based on knowledge graph and group representation learning
- Publication number
- CN114121212A (application CN202111402132.3A)
- Authority
- CN
- China
- Prior art keywords
- entity
- symptom
- group
- representation
- chinese medicine
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16H—HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
- G16H20/00—ICT specially adapted for therapies or health-improving plans, e.g. for handling prescriptions, for steering therapy or for monitoring patient compliance
- G16H20/10—ICT specially adapted for therapies or health-improving plans, e.g. for handling prescriptions, for steering therapy or for monitoring patient compliance relating to drugs or medications, e.g. for ensuring correct administration to patients
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N5/00—Computing arrangements using knowledge-based models
- G06N5/02—Knowledge representation; Symbolic representation
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16H—HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
- G16H20/00—ICT specially adapted for therapies or health-improving plans, e.g. for handling prescriptions, for steering therapy or for monitoring patient compliance
- G16H20/90—ICT specially adapted for therapies or health-improving plans, e.g. for handling prescriptions, for steering therapy or for monitoring patient compliance relating to alternative medicines, e.g. homeopathy or oriental medicines
Abstract
The invention discloses a traditional Chinese medicine (TCM) prescription generation method based on knowledge graph and group representation learning, comprising the following steps in sequence. Step 1: construct a TCM knowledge graph centered on herbs, encoding herb properties such as nature, taste, channel tropism, and efficacy as triples, and add the treatment relationships between symptoms and herbs from the prescription data set, finally forming the TCM knowledge graph. Step 2: update the embedded representation of each entity through propagation and aggregation of neighborhood information in the knowledge graph. Step 3: using the entity embeddings obtained in step 2, regard the symptom combination of each prescription sample as a group, let the group representation interact with the herb entities in the TCM knowledge graph, and output the herbs most suitable for the symptom combination as a TCM prescription. The invention uses data mining to simulate the process of "treatment based on syndrome differentiation" in TCM diagnosis and treatment, generating prescriptions from symptoms to assist clinical treatment.
Description
Technical Field
The invention relates to a traditional Chinese medicine prescription generation method, in particular to a traditional Chinese medicine prescription generation method based on knowledge graph and group representation learning.
Background
TCM prescription generation produces a set of herbs for treating a patient's symptoms by analyzing the interaction between symptoms and herbs. In actual TCM diagnosis and treatment, the doctor infers the syndrome from the patient's symptoms and then prescribes according to the syndrome. Treatment based on syndrome differentiation is the basic principle by which TCM understands and treats disease, and is a method of studying and treating disease that is particular to TCM. A syndrome is the nature of a disease as revealed by a number of symptoms of primary and secondary importance. The symptoms in a group play different roles in syndrome induction: the syndrome is not obtained by simply accumulating the symptoms, but by weighing them together according to which are primary and which are secondary.
A knowledge graph is a large semantic network of entities and relations, and can supplement the relations between users and items in a recommendation task. Embedding-based methods and path-based methods are the two common types of knowledge graph recommendation. Embedding-based methods mainly use the information in the graph to better represent entities and relations; the Trans family of models is representative of this type. Path-based methods exploit connectivity patterns in the knowledge graph, such as meta-paths or meta-graphs, to make recommendations. However, most such methods depend on domain knowledge and are difficult to apply in practice. Building on these, hybrid methods combine the two by integrating entity embedding with connectivity information, mainly refining the entity embeddings with the path connectivity information of the knowledge graph. Group recommendation builds on ordinary recommendation: it aggregates users into groups with a specific aggregation method, learns a representation of each group, and models the interaction between user groups and items to predict how much each group prefers each item.
Disclosure of Invention
Purpose of the invention: existing prescription generation methods cannot reasonably simulate the actual TCM diagnosis and treatment process. To address this problem, the invention provides a TCM prescription generation method based on knowledge graph and group representation learning (KGAPG). The invention treats prescription generation as a group recommendation task: the symptom set of one patient is regarded as a group, and the different influence of each symptom in the group is taken into account, so that multiple symptoms are aggregated into a single group representation by an attention mechanism. In TCM theory, this group corresponds to the "syndrome", so the interaction between syndromes and herbs can be modeled. First, a TCM knowledge graph is constructed, and the semantic relations among herbs are learned from it. Then the influence weight of each symptom in a symptom group is learned by an attention mechanism, and the representation of the symptom group, i.e. the representation of the syndrome, is obtained by aggregation. Finally, the syndrome representation interacts with the herb representations to output predicted scores of how suitable each herb is for the symptom group.
The technical scheme is as follows: in order to achieve the above purpose, the method for generating a traditional Chinese medicine prescription based on knowledge graph and group representation learning of the invention sequentially comprises the following steps:
Further, step 1 specifically comprises: take the existing triple (ephedra, hasEffect, sweating) in the TCM knowledge graph as an example, where "ephedra" is a herb entity in the knowledge graph, "sweating" is an efficacy entity in the knowledge graph, and "hasEffect" denotes the semantic relationship between "ephedra" and "sweating", which can be read as "ephedra has the efficacy of sweating". Denote the triple as $(e_h, r, e_t)$, where $e_h$, $r$, and $e_t$ are the head entity (ephedra), the relation (hasEffect), and the tail entity (sweating), respectively. First, the entities in the $d$-dimensional entity space are projected by the matrix $W_r \in \mathbb{R}^{k \times d}$ into the $k$-dimensional relation space of relation $r$, giving the relation-space embedding $e_h^r$ of entity $e_h$ and the relation-space embedding $e_t^r$ of entity $e_t$. The embeddings are then optimized under the translation principle $e_h^r + e_r \approx e_t^r$, where $e_r$ is the embedded representation of relation $r$ in the $k$-dimensional relation space. This yields the initial embedded representations of the two entities "ephedra" and "sweating" in the TCM knowledge graph. In the same way, every entity in the TCM knowledge graph obtains an initial embedded representation after training with the TransR model.
Further, step 2 specifically comprises:
Step 21: let the initial embedding of entity $e_h$ in the knowledge graph, obtained from the TransR embedding of step 1, be denoted $e_h$. The entities directly connected to $e_h$ are called its direct neighbors. Let $\mathcal{N}_h$ denote the direct neighborhood of $e_h$, i.e. the set of triples $(e_h, r, e_t)$ with $e_h$ as head entity. The aggregated representation $e_{\mathcal{N}_h}$ of the neighbor entities is defined as shown in formula (1):
$$e_{\mathcal{N}_h} = \sum_{(e_h, r, e_t) \in \mathcal{N}_h} \pi(e_h, r, e_t)\, e_t \tag{1}$$
where $\pi(e_h, r, e_t)$ is the weight of the neighbor entity $e_t$ of $e_h$ in the aggregated representation, which can also be understood as the importance of relation $r$ to entity $e_h$. The weight depends on $e_h$ and $e_t$ in the space of relation $r$, and is defined as shown in formula (2):
where the weight matrix ($d$ denotes the embedding dimension) is trainable and $e_r$ is the embedded representation of relation $r$. Finally, the weights are normalized with the softmax function.
Step 22: after the neighbor aggregated representation $e_{\mathcal{N}_h}$ of entity $e_h$ is obtained, it is used to update the original embedding $e_h$ of the entity. The representation of entity $e_h$ after aggregation and update with direct-neighbor information is $e_h^{(1)} = f_{agg}(e_h, e_{\mathcal{N}_h})$, where $f_{agg}(\cdot)$ is an aggregation function defined as shown in formula (3):
where the weight matrix ($d$ denotes the embedding dimension) is trainable, $\odot$ denotes the element-wise product, and LeakyReLU is the activation function.
Step 23: on the basis of steps 21 and 22, more propagation layers are stacked to obtain a high-order neighbor aggregated representation of each entity. The entity embeddings are updated recursively through an $l$-layer network, and information propagates node by node in the knowledge graph. In short, after $l$ layers of propagation, the last-layer representation of entity $e_h$ contains information from the high-order neighbor entities that $e_h$ can reach within $l$ steps.
Further, step 3 specifically comprises:
Step 31: define the symptoms of each prescription sample as a group $S_p = \{s_i \mid s_i \in S\}$, where $s_i$ is the $i$-th symptom and $S$ is the set of all symptoms in the data set. The syndrome induction process uses an attention mechanism to learn the influence of each symptom on its symptom group, i.e. the weight each symptom carries in the group. The weight is learned by an attention network; the weight $\alpha(S_p, s_i)$ of each symptom $s_i$ in symptom set $S_p$ is defined as shown in formula (4):
where the parameters of the attention network are trainable, and $s_i$ is the embedded representation of symptom $s_i$ obtained in step 2. After the weight of each symptom in a symptom set is obtained, it is normalized with the softmax function to give the final influence score $\tilde{\alpha}(S_p, s_i)$ of each symptom in the group, as shown in formula (5):
$$\tilde{\alpha}(S_p, s_i) = \frac{\exp(\alpha(S_p, s_i))}{\sum_{s_j \in S_p} \exp(\alpha(S_p, s_j))} \tag{5}$$
Then, using the nonlinear capacity of a single-layer MLP, a more expressive syndrome representation is learned, as defined by formula (7):
$$s_d = \mathrm{ReLU}(W_{mlp} \cdot s_d + b_{mlp}) \tag{7}$$
where $W_{mlp}$ and $b_{mlp}$ are learnable parameters and ReLU is the activation function. At this point, the attention-based symptom aggregation method has produced a latent syndrome representation for each prescription sample, in keeping with the basic process of TCM diagnosis and treatment. The process is adaptive.
Step 33: the latent syndrome of each prescription sample obtained in the above steps interacts with the herbs to predict how likely each herb is to be suitable for treating the symptom set. The predicted score is computed with the inner product, as shown in formula (8):
$$\hat{y}(S_p, h) = s_d^{\top} h \tag{8}$$
where $\hat{y}(S_p, h)$ is the probability score that herb $h$ is suitable for treating symptom group $S_p$, i.e. the latent syndrome $s_d$, and $h$ is the embedded representation of the herb entity in the knowledge graph. The top $N$ herbs with the highest probability scores are output as the prescription for the input symptom combination.
Advantageous effects:
The invention provides a TCM prescription generation method based on knowledge graph and group representation learning, which converts the TCM prescription generation problem into a group recommendation task. The model first uses TCM domain knowledge to construct a TCM knowledge graph and learns the embedded representation of each entity through the high-order connectivity of the knowledge graph. In addition, the symptom combination of each input sample is regarded as a group, the influence weight of each symptom in the group is learned with an attention mechanism, and the embedded representations of the symptoms are aggregated into the embedded representation of the group, i.e. the embedded representation of the latent syndrome reflected by the symptom combination. This simulates the process of "treatment based on syndrome differentiation" in TCM diagnosis and treatment. The advantages are as follows:
(1) The group recommendation method is introduced into the prescription generation task, emphasizing the primary and secondary influence of different symptoms in the syndrome induction process. The attention mechanism is used during group aggregation in the syndrome induction stage to learn the influence scores of the different symptoms in the symptom group of a prescription sample.
(2) The sparsity problem in the recommendation task is alleviated by the rich item structure of the knowledge graph, and the latent semantic relations between herbs and their various attributes and symptoms are captured through the high-order connectivity of the knowledge graph.
Drawings
FIG. 1 is a block diagram of the overall framework of the KGAPG model of the present invention;
FIG. 2 is a flow chart of a method of the present invention;
FIG. 3 is a schematic view of the traditional Chinese medicine knowledge graph;
FIG. 4 is a verification graph comparing neighbor information propagation and aggregation layer depth;
FIG. 5 is a comparison verification diagram of knowledge-graph entity embedding dimension.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is further described below with reference to specific embodiments and illustrative drawings, it being understood that the preferred embodiments described herein are for the purpose of illustration and explanation only and are not intended to limit the present invention.
The invention relates to a traditional Chinese medicine prescription generation method based on knowledge graph and group representation learning, which sequentially comprises the following steps of:
Step 1: knowledge graph construction and initial embedding layer.
An entity in the knowledge graph is a composite of multiple attributes; different relations attend to different attributes of an entity, and different relations have different semantic spaces. Take the existing triple (ephedra, hasEffect, sweating) in the TCM knowledge graph as an example, where "ephedra" is a herb entity in the knowledge graph, "sweating" is an efficacy entity in the knowledge graph, and "hasEffect" denotes the semantic relationship between "ephedra" and "sweating", which can be read as "ephedra has the efficacy of sweating". Denote the triple as $(e_h, r, e_t)$, where $e_h$, $r$, and $e_t$ are the head entity (ephedra), the relation (hasEffect), and the tail entity (sweating), respectively. First, the entities in the $d$-dimensional entity space are projected by the matrix $W_r \in \mathbb{R}^{k \times d}$ into the $k$-dimensional relation space of relation $r$, giving the relation-space embedding $e_h^r$ of entity $e_h$ and the relation-space embedding $e_t^r$ of entity $e_t$. The embeddings are then optimized under the translation principle $e_h^r + e_r \approx e_t^r$, where $e_r$ is the embedded representation of relation $r$ in the $k$-dimensional relation space. For a given triple $(e_h, r, e_t)$, the plausibility score (or energy score) $g(e_h, r, e_t)$ is defined as shown in formula (1):
A lower score $g(e_h, r, e_t)$ indicates that the triple is more likely to be true; a higher score indicates that it is more likely to be false. Training of TransR considers the relative order between real and false triples and distinguishes them with the following pairwise ranking loss function, formula (2):
where $(e_h, r, e_t)$ is a real triple in the knowledge graph and $(e_h, r, e_t')$ is a false triple constructed by randomly replacing one of the entities in a real triple; $\sigma(\cdot)$ is the sigmoid function.
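The scoring and loss described above can be sketched as follows. This is a minimal pure-Python sketch, not the patent's implementation: it assumes the standard TransR score $\lVert W_r e_h + e_r - W_r e_t \rVert_2^2$ and a pairwise loss of the form $-\ln \sigma(g_{false} - g_{true})$. The function names and the toy vectors are hypothetical.

```python
import math

def matvec(W, x):
    """Multiply a k x d matrix (list of rows) by a d-dimensional vector."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

def transr_score(W_r, e_h, e_r, e_t):
    """Assumed TransR score ||W_r e_h + e_r - W_r e_t||^2; lower means more plausible."""
    h_r = matvec(W_r, e_h)  # head entity projected into the relation space
    t_r = matvec(W_r, e_t)  # tail entity projected into the relation space
    return sum((h + r - t) ** 2 for h, r, t in zip(h_r, e_r, t_r))

def pairwise_ranking_loss(score_true, score_false):
    """-ln(sigmoid(score_false - score_true)); small when the real triple scores lower."""
    return math.log(1.0 + math.exp(-(score_false - score_true)))

# Toy values (assumed): d = 3 entity dimensions, k = 2 relation dimensions.
W_r = [[1.0, 0.0, 0.0],
       [0.0, 1.0, 0.0]]
ephedra = [0.0, 0.0, 5.0]
has_effect = [1.0, 1.0]
sweating = [1.0, 1.0, -2.0]      # real tail: the translation holds exactly
corrupted_tail = [4.0, -3.0, 0.0]  # randomly substituted tail entity

g_true = transr_score(W_r, ephedra, has_effect, sweating)
g_false = transr_score(W_r, ephedra, has_effect, corrupted_tail)
loss = pairwise_ranking_loss(g_true, g_false)
```

With these toy values the real triple gets a score of 0 and the corrupted one a score of 25, so the loss is close to zero; when both scores are equal the loss is ln 2.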
Step 2: neighbor information propagation and aggregation layer.
Let the initial embedding of entity $e_h$ in the knowledge graph, obtained from the TransR embedding of step 1, be denoted $e_h$. The entities directly connected to $e_h$ are called its direct neighbors. The embedded representations of the direct neighbors go through an information propagation process and are aggregated into entity $e_h$, so that the representation of $e_h$ is fused with the aggregated representation of its direct neighbors to obtain a higher-order representation of $e_h$. Let $\mathcal{N}_h$ denote the direct neighborhood of $e_h$, i.e. the set of triples $(e_h, r, e_t)$ with $e_h$ as head entity. The aggregated representation $e_{\mathcal{N}_h}$ of the neighbor entities is defined as shown in formula (3):
$$e_{\mathcal{N}_h} = \sum_{(e_h, r, e_t) \in \mathcal{N}_h} \pi(e_h, r, e_t)\, e_t \tag{3}$$
where $\pi(e_h, r, e_t)$ is the weight of the neighbor entity $e_t$ of $e_h$ in the aggregated representation, which can also be understood as the importance of relation $r$ to entity $e_h$. The weight depends on $e_h$ and $e_t$ in the space of relation $r$, and is defined by formula (4):
where the weight matrix ($d$ is the embedding dimension) is trainable and $e_r$ is the embedded representation of relation $r$. After the aggregation weights of all direct neighbors of entity $e_h$ are obtained, they are normalized with the softmax function, as shown in formula (5):
$$\pi(e_h, r, e_t) \leftarrow \frac{\exp(\pi(e_h, r, e_t))}{\sum_{(e_h, r', e_t') \in \mathcal{N}_h} \exp(\pi(e_h, r', e_t'))} \tag{5}$$
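The weighting, normalization, and aggregation of direct neighbors can be sketched as below. The unnormalized weight is assumed here to take the form $(W_r e_t)^{\top} \tanh(W_r e_h + e_r)$, a form used in knowledge graph attention networks such as KGAT; the text above does not spell out formula (4), so this choice, the function names, and the toy values are all assumptions.

```python
import math

def matvec(W, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

def attention_weight(W_r, e_h, e_r, e_t):
    """Assumed form: (W_r e_t)^T tanh(W_r e_h + e_r)."""
    h_r = matvec(W_r, e_h)
    t_r = matvec(W_r, e_t)
    return sum(t * math.tanh(h + r) for t, h, r in zip(t_r, h_r, e_r))

def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def aggregate_neighbors(e_h, neighbors):
    """neighbors is a list of (W_r, e_r, e_t); returns (weights, weighted sum of tails)."""
    raw = [attention_weight(W_r, e_h, e_r, e_t) for W_r, e_r, e_t in neighbors]
    weights = softmax(raw)
    aggregated = [
        sum(w * e_t[i] for w, (_, _, e_t) in zip(weights, neighbors))
        for i in range(len(e_h))
    ]
    return weights, aggregated

# Toy values (assumed): one head entity with two neighbors under the same relation.
identity = [[1.0, 0.0], [0.0, 1.0]]
e_h = [1.0, 0.0]
neighbors = [
    (identity, [0.0, 0.0], [1.0, 0.0]),
    (identity, [0.0, 0.0], [-1.0, 0.0]),
]
weights, e_neigh = aggregate_neighbors(e_h, neighbors)
```

In this toy case the first neighbor, whose projected embedding is aligned with the head, receives the larger normalized weight, and the weights sum to one.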
After the neighbor aggregated representation $e_{\mathcal{N}_h}$ of entity $e_h$ is obtained, it is used to update the original embedding $e_h$ of the entity. The representation of entity $e_h$ after aggregation and update with direct-neighbor information is $e_h^{(1)} = f_{agg}(e_h, e_{\mathcal{N}_h})$, where $f_{agg}(\cdot)$ is an aggregation function defined as shown in formula (6):
where the weight matrices ($d'$ and $d$ are embedding dimensions) are trainable, $\odot$ denotes the element-wise product, and LeakyReLU is the activation function. Through the propagation and aggregation of direct neighbors, each entity in the knowledge graph contains not only its own information but also the information flowing to it from its direct neighbors along first-order connectivity.
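One aggregation function consistent with this description (trainable matrices, an element-wise product, LeakyReLU) is the Bi-Interaction aggregator named in the experiments. The sketch below assumes the form $\mathrm{LeakyReLU}(W_1(e_h + e_{\mathcal{N}_h})) + \mathrm{LeakyReLU}(W_2(e_h \odot e_{\mathcal{N}_h}))$; the matrix names `W1` and `W2`, the negative slope of 0.01, and the toy values are assumptions rather than details taken from the text.

```python
def matvec(W, x):
    """Multiply a d' x d matrix (list of rows) by a d-dimensional vector."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

def leaky_relu(x, slope=0.01):
    """LeakyReLU with an assumed negative slope of 0.01."""
    return x if x > 0 else slope * x

def bi_interaction(W1, W2, e_h, e_neigh):
    """Assumed Bi-Interaction form: one branch on the sum, one on the element-wise product."""
    summed = [a + b for a, b in zip(e_h, e_neigh)]
    product = [a * b for a, b in zip(e_h, e_neigh)]
    branch_sum = [leaky_relu(v) for v in matvec(W1, summed)]
    branch_prod = [leaky_relu(v) for v in matvec(W2, product)]
    return [a + b for a, b in zip(branch_sum, branch_prod)]

# Toy values (assumed): identity weights so the arithmetic is easy to follow.
identity = [[1.0, 0.0], [0.0, 1.0]]
updated = bi_interaction(identity, identity, [1.0, -2.0], [0.5, 1.0])
```

Here the sum branch sees [1.5, -1.0] and the product branch sees [0.5, -2.0], so the updated embedding is approximately [2.0, -0.03].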
Building on this propagation and aggregation of direct-neighbor information, more propagation layers are stacked to obtain the high-order neighbor information of each entity. The entity embeddings are updated recursively through an $l$-layer network, and information propagates node by node in the knowledge graph. In short, after $l$ layers of propagation, the last-layer representation of entity $e_h$ contains information from the high-order neighbor entities that $e_h$ can reach within $l$ steps.
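To illustrate what "reachable within $l$ steps" means, the sketch below computes which entities can contribute to an entity's representation after a given number of propagation layers. It is a plain breadth-first traversal over a toy graph; apart from ephedra and sweating, the entity names and edges are hypothetical, and edges are treated as undirected for illustration.

```python
def receptive_field(adjacency, start, layers):
    """Entities whose information can reach `start` after `layers` propagation layers."""
    reached = {start}
    frontier = {start}
    for _ in range(layers):
        # Expand one hop, keeping only entities not seen before.
        frontier = {n for node in frontier for n in adjacency.get(node, ())} - reached
        reached |= frontier
    return reached - {start}

# Toy graph (hypothetical entities except ephedra and sweating).
adjacency = {
    "ephedra": ["sweating", "symptom_a"],
    "sweating": ["ephedra", "herb_b"],
    "symptom_a": ["ephedra"],
    "herb_b": ["sweating", "property_x"],
    "property_x": ["herb_b"],
}
one_hop = receptive_field(adjacency, "ephedra", 1)
two_hop = receptive_field(adjacency, "ephedra", 2)
```

With one layer, ephedra sees only its direct neighbors; with two layers it also sees `herb_b`, which shares the sweating efficacy.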
Step 3: syndrome induction and prediction layer.
The symptoms of each prescription sample are defined as a group $S_p = \{s_i \mid s_i \in S\}$, where $s_i$ is the $i$-th symptom and $S$ is the set of all symptoms in the data set. According to TCM theory, treatment based on syndrome differentiation must weigh the primary and secondary symptoms among all of a patient's symptoms, which affects the practitioner's decision during syndrome induction. In the syndrome induction process, an attention mechanism learns the influence of each symptom on its symptom group, i.e. the weight each symptom carries in the group. The weight is learned by an attention network; the weight $\alpha(S_p, s_i)$ of each symptom $s_i$ in symptom set $S_p$ is defined as shown in formula (7):
where the parameters of the attention network ($d$ is the embedding dimension) are trainable, and $s_i$ is the embedded representation of symptom $s_i$ obtained in step 2. After the weight of each symptom in a symptom set is obtained, it is normalized with the softmax function to give the final influence score $\tilde{\alpha}(S_p, s_i)$ of each symptom in the group, as shown in formula (8):
$$\tilde{\alpha}(S_p, s_i) = \frac{\exp(\alpha(S_p, s_i))}{\sum_{s_j \in S_p} \exp(\alpha(S_p, s_j))} \tag{8}$$
On this basis, the representation of symptom group $S_p$, i.e. the representation $s_d$ of the latent syndrome of each symptom combination, is defined as shown in formula (9):
$$s_d = \sum_{s_i \in S_p} \tilde{\alpha}(S_p, s_i)\, s_i \tag{9}$$
Then, using the nonlinear capacity of a single-layer MLP, a more expressive syndrome representation is learned, as defined by formula (10):
$$s_d = \mathrm{ReLU}(W_{mlp} \cdot s_d + b_{mlp}) \tag{10}$$
where $W_{mlp}$ and $b_{mlp}$ are learnable parameters and ReLU is the activation function. At this point, the attention-based symptom aggregation method has produced a latent syndrome representation for each prescription sample, in keeping with the basic process of TCM diagnosis and treatment. The process is adaptive.
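The syndrome induction step can be sketched end to end as follows. The softmax, weighted sum, and single-layer MLP follow the description above; the scoring form of the attention network is not recoverable from the text, so the sketch assumes $\alpha(S_p, s_i) = v^{\top}\mathrm{ReLU}(W s_i)$ with hypothetical parameters `W_att` and `v_att`. All numeric values are toy assumptions.

```python
import math

def matvec(W, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def induce_syndrome(symptoms, W_att, v_att, W_mlp, b_mlp):
    """Return (influence scores, syndrome vector) for one symptom group."""
    raw = []
    for s in symptoms:
        hidden = [max(0.0, x) for x in matvec(W_att, s)]  # assumed scoring form
        raw.append(sum(v * x for v, x in zip(v_att, hidden)))
    alpha = softmax(raw)  # normalized influence score of each symptom
    # Attention-weighted sum of the symptom embeddings.
    pooled = [sum(a * s[i] for a, s in zip(alpha, symptoms))
              for i in range(len(symptoms[0]))]
    # Single-layer MLP with ReLU activation.
    s_d = [max(0.0, x + b) for x, b in zip(matvec(W_mlp, pooled), b_mlp)]
    return alpha, s_d

# Toy values (assumed): two symptoms with 2-dimensional embeddings.
identity = [[1.0, 0.0], [0.0, 1.0]]
symptoms = [[2.0, 0.0], [0.0, 0.0]]
alpha, s_d = induce_syndrome(symptoms, identity, [1.0, 1.0], identity, [0.0, -1.0])
```

In this toy case the first symptom dominates the group representation because it receives the higher attention score, which mirrors the primary and secondary weighting the text describes.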
Finally, the latent syndrome of each prescription sample obtained in the above steps interacts with the herbs to predict how likely each herb is to be suitable for treating the symptom set. The predicted score is computed with the inner product, as shown in formula (11):
$$\hat{y}(S_p, h) = s_d^{\top} h \tag{11}$$
where $\hat{y}(S_p, h)$ is the probability score that herb $h$ is suitable for treating symptom group $S_p$, i.e. the latent syndrome $s_d$, and $h$ is the embedded representation of the herb entity in the knowledge graph. The top $N$ herbs with the highest probability scores are output as the prescription for the input symptom combination.
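The prediction step amounts to scoring every herb by an inner product and keeping the top $N$. A minimal sketch, with a hypothetical function name and toy embeddings (only "ephedra" comes from the text):

```python
def recommend_herbs(s_d, herb_embeddings, top_n):
    """Score each herb by the inner product with the syndrome vector; return the top N."""
    scores = {
        name: sum(a * b for a, b in zip(s_d, emb))
        for name, emb in herb_embeddings.items()
    }
    ranked = sorted(scores, key=lambda name: scores[name], reverse=True)
    return ranked[:top_n], scores

# Toy values (assumed): a 2-dimensional syndrome vector and three herbs.
s_d = [1.0, 2.0]
herb_embeddings = {
    "ephedra": [1.0, 1.0],
    "herb_b": [0.0, 2.0],
    "herb_c": [-1.0, 0.0],
}
prescription, scores = recommend_herbs(s_d, herb_embeddings, top_n=2)
```

With these values the scores are 3.0, 4.0, and -1.0, so the two-herb prescription is `herb_b` followed by `ephedra`.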
4. Model optimization
Given a prescription sample $P = \langle S_p, H_p \rangle$, where $S_p$ and $H_p$ are the symptom set and the herb set of the sample, the herb set $H_p$ actually used is represented as a multi-hot vector $h_p'$ of dimension $|H|$, and $v(S_p, H)$ is the output probability vector over all herbs given symptom group $S_p$. The weighted mean square error (WMSE) between $h_p'$ and $v(S_p, H)$ is defined as shown in formula (12):
where $h_{pi}'$ and $v(S_p, H)_i$ are the $i$-th elements of the two vectors and $w_i$ is the weight of the $i$-th herb, defined as shown in formula (13):
where $\mathrm{freq}(i)$ is the frequency with which the $i$-th herb appears in all prescription samples. The herb weights balance the contributions of herbs with different frequencies: the more frequent a herb, the lower its weight.
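A weighted error of this kind can be sketched as below. Two details are assumptions, since the text does not spell out formulas (12) and (13): the weight is taken as $w_i = \max_k \mathrm{freq}(k) / \mathrm{freq}(i)$, one simple choice that gives frequent herbs lower weight, and the error is summed over herbs without further normalization. Function names and numbers are hypothetical.

```python
def herb_weights(freq):
    """Assumed weighting: most frequent herb gets weight 1, rarer herbs get more."""
    top = max(freq)
    return [top / f for f in freq]  # requires every frequency to be positive

def wmse(target, predicted, weights):
    """Weighted squared error between a multi-hot target and predicted probabilities."""
    return sum(w * (t - p) ** 2 for w, t, p in zip(weights, target, predicted))

# Toy values (assumed): three herbs with very different frequencies.
freq = [100, 50, 10]
weights = herb_weights(freq)
target = [1.0, 0.0, 1.0]     # multi-hot vector of the herbs actually prescribed
predicted = [0.5, 0.5, 0.0]  # model output probabilities
loss = wmse(target, predicted, weights)
```

Missing the rare third herb contributes 10.0 of the total 10.75, which shows how the weighting keeps infrequent herbs from being ignored.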
The interaction between the latent syndrome and the herbs uses the loss function in formula (14):
Finally, the model is optimized by jointly learning formulas (2) and (14), giving the joint objective function shown in formula (15):
where $\lambda$ controls the $L_2$ regularization term that prevents overfitting, and $\Theta$ is the set of all parameters of the model.
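Under these definitions, and assuming the knowledge graph loss of formula (2) and the prediction loss of formula (14) are simply summed, the joint objective would take a form like the following. The labels $\mathcal{L}_{KG}$ and $\mathcal{L}_{pred}$ are introduced here for illustration and do not appear in the text.

```latex
\mathcal{L} = \mathcal{L}_{KG} + \mathcal{L}_{pred} + \lambda \lVert \Theta \rVert_2^2
```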
Experiment:
To verify the effectiveness of the KGAPG prescription generation model, experiments were carried out on a TCM prescription data set, and parameter studies and ablation analysis were performed to further verify the model. The prescription data set and the TCM knowledge graph data used in the invention are shown in Table 1.
TABLE 1 prescription data set and Chinese medicine knowledge map data
The TCM prescription data set contains 26360 complete prescription samples, involving 360 symptoms and 753 herbs. A triple is the general representation of a knowledge graph: (head, relation, tail), where head and tail are the head entity and the tail entity, and relation is the relationship between the two entities. In most cases, natural language statements can be represented in this form. For example, the triple (ephedra, hasEffect, sweating) indicates that the herb "ephedra" has the efficacy of "sweating". In TCM theory, herbs have the properties of the four natures, the five flavors, and channel tropism. The TCM knowledge graph, with herbs at its core, is constructed on the basis of this background knowledge. FIG. 3 is an illustration of the TCM knowledge graph.
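The triple form can be represented directly in code. The sketch below stores triples and indexes them by head entity; only the (ephedra, hasEffect, sweating) triple comes from the text, while the other relation and entity names are hypothetical placeholders.

```python
from collections import defaultdict, namedtuple

Triple = namedtuple("Triple", ["head", "relation", "tail"])

triples = [
    Triple("ephedra", "hasEffect", "sweating"),   # example from the text
    Triple("ephedra", "hasNature", "nature_x"),   # hypothetical property triple
    Triple("symptom_a", "treatedBy", "ephedra"),  # hypothetical symptom-herb triple
]

def build_neighbors(triples):
    """Index the triples by head entity: head -> list of (relation, tail)."""
    neighbors = defaultdict(list)
    for t in triples:
        neighbors[t.head].append((t.relation, t.tail))
    return dict(neighbors)

neighbors = build_neighbors(triples)
```

Looking up `neighbors["ephedra"]` then returns the relation and tail pairs that an embedding or propagation step would iterate over.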
Table 2 compares model performance with and without the attention mechanism when the syndrome induction and prediction layer aggregates multiple symptoms into a group. The table shows that attention-based syndrome induction outperforms average aggregation without attention. Through the high-order connectivity of the knowledge graph, a symptom entity captures similarity information between herbs, so the embedded representation of a symptom also contains herb background knowledge such as nature, taste, and channel tropism. In the syndrome induction stage, the latent syndrome representation therefore also integrates this herb background knowledge, which brings syndromes and herbs closer together. On this basis, the attention mechanism not only captures the individual influence of each symptom in the symptom group dynamically, but also matches the basic idea of treatment based on syndrome differentiation in TCM.
TABLE 2 influence of the attention mechanism on the process of syndrome induction
Table 3 shows how the choice of aggregation function used to update the entity embeddings in the neighbor information propagation and aggregation layer affects model performance. Three aggregators were compared: a GCN aggregator, a GraphSage aggregator, and a Bi-Interaction aggregator. The table shows that the additional feature interaction of the Bi-Interaction aggregator during information aggregation improves the representation learning of the nodes, which supports the rationality and effectiveness of the Bi-Interaction aggregator.
TABLE 3 Effect of different aggregators
FIG. 4 illustrates the effect of the depth of the neighbor information propagation and aggregation layer on model performance. Depth embodies the high-order connectivity of the knowledge graph and controls how distant the neighbors are from which an entity can aggregate information. It can be observed from the figure that the model achieves the best results when the depth is 2, which suggests that second-order relationships between entities can effectively represent the complexity of herbs. As the depth continues to increase, noise may be introduced, reducing model performance.
FIG. 5 illustrates the effect of the embedding dimension d of entities in the knowledge graph on model performance. The experiments varied the embedding dimension over the values 32, 64, 128, 256 and 512. It can be observed from the figure that the model achieves the best performance when the embedding dimension is 256, which indicates that appropriately increasing the dimension allows the complex herbal information in the knowledge graph to be represented more fully.
Case verification:
In order to verify the rationality and validity of the prescription generation method proposed by the present invention, the proposed KGAPG model was tested on two real prescription cases. Given a set of symptoms, the model generates a set of herbs to treat these symptoms collectively. The two prescription generation cases are shown in Table 4. The herbs in bold in the table are herbs generated by the KGAPG model that also appear in the real herb set.
TABLE 4 Prescription generation cases
It should be noted that the above-mentioned embodiments are only preferred embodiments of the present invention and are not intended to limit its scope; all equivalent replacements or substitutions made on the basis of the above technical solutions fall within the scope of the present invention.
Claims (4)
1. A traditional Chinese medicine prescription generation method based on knowledge graph and group representation learning is characterized by sequentially comprising the following steps:
step 1, knowledge graph construction and initial embedding layer: taking herbs as the core, encapsulating the properties of herbs, such as nature, flavor, meridian tropism and efficacy, into triples, adding the treatment relationships between symptoms and herbs in the prescription data set to the knowledge graph to finally form the traditional Chinese medicine knowledge graph, and initializing the embedded representation of each entity in the knowledge graph through a TransR model;
step 2, neighbor information propagation and aggregation layer: updating the embedded representation of each entity through the propagation and aggregation of high-order neighborhood information in the knowledge graph, and enriching the semantic relations of each entity in the traditional Chinese medicine knowledge graph;
step 3, syndrome induction and prediction layer: based on the entity embedded representations obtained in step 2, regarding the symptom combination corresponding to each prescription sample as a group, the group representing syndrome information in the theory of traditional Chinese medicine, learning the embedded representation of the group by using an attention mechanism, performing interactive learning between the group representation and the Chinese herbal medicine entities in the traditional Chinese medicine knowledge graph, and finally outputting the several Chinese herbal medicines most suitable for the symptom combination to form a Chinese herbal medicine prescription.
2. The method for generating a prescription of traditional Chinese medicine based on knowledge-graph and group representation learning according to claim 1, wherein the step 1 is specifically as follows:
for the triples in the traditional Chinese medicine knowledge graph, let a triple be denoted $(e_h, r, e_t)$, where $e_h$, $r$ and $e_t$ represent the head entity, the relation (for example, hasEffect) and the tail entity of the knowledge graph respectively; first, the entities in the $d$-dimensional entity space are projected by the matrix $W_r \in \mathbb{R}^{k \times d}$ into the $k$-dimensional relation space in which the relation $r$ lies, giving the embedded representation $\mathbf{e}_h^r = W_r \mathbf{e}_h$ of entity $e_h$ in the relation space and the embedded representation $\mathbf{e}_t^r = W_r \mathbf{e}_t$ of entity $e_t$ in the relation space; the embeddings are then optimized according to the translation principle $\mathbf{e}_h^r + \mathbf{r} \approx \mathbf{e}_t^r$, where $\mathbf{r}$ is the embedded representation of the relation $r$ in the $k$-dimensional relation space; in this way, the initial embedded representation of each entity in the traditional Chinese medicine knowledge graph is finally obtained after training with the TransR model.
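The projection and translation principle of step 1 can be sketched as follows. The squared-distance score and the margin-based ranking loss are the ones commonly used with TransR and are assumptions here, since the claim only states that the translation principle is optimized; the names `transr_score`, `margin_loss` and `r_emb`, the margin value and the toy vectors are hypothetical.

```python
def matvec(W, x):
    """Multiply a k x d matrix (list of rows) by a length-d vector."""
    return [sum(w * v for w, v in zip(row, x)) for row in W]


def transr_score(W_r, e_h, r_emb, e_t):
    """Squared distance ||W_r e_h + r - W_r e_t||^2; lower means more plausible."""
    h_r = matvec(W_r, e_h)  # head projected into the relation space
    t_r = matvec(W_r, e_t)  # tail projected into the relation space
    return sum((h + r - t) ** 2 for h, r, t in zip(h_r, r_emb, t_r))


def margin_loss(pos_score, neg_score, margin=1.0):
    """Assumed ranking loss: a true triple should score lower than a corrupted one."""
    return max(0.0, margin + pos_score - neg_score)


# Toy values with d = 3 and k = 2 (hypothetical, for illustration only).
W_r = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
e_h = [1.0, 2.0, 3.0]
r_emb = [1.0, 1.0]
e_t = [2.0, 3.0, 9.0]    # satisfies W_r e_h + r = W_r e_t exactly
e_neg = [0.0, 0.0, 0.0]  # corrupted tail

pos = transr_score(W_r, e_h, r_emb, e_t)
neg = transr_score(W_r, e_h, r_emb, e_neg)
print(pos, neg, margin_loss(pos, neg))
```

In training, the loss would be minimized over many true and corrupted triples so that the resulting entity vectors can serve as the initial embeddings.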
3. The method for generating a prescription of traditional Chinese medicine based on knowledge-graph and group representation learning according to claim 2, wherein the step 2 is specifically as follows:
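step 21, the initial embedding of entity $e_h$ in the knowledge graph, obtained from the TransR embedding of step 1, is denoted $\mathbf{e}_h$, and the other entities directly connected to entity $e_h$ are called the direct neighbors of the entity; $\mathbf{e}_{\mathcal{N}_h}$ is used to denote the aggregate representation of the neighbor entities of $e_h$, as shown in formula (1):

$$\mathbf{e}_{\mathcal{N}_h} = \sum_{(e_h, r, e_t) \in \mathcal{N}_h} \tilde{\pi}(e_h, r, e_t)\, \mathbf{e}_t \tag{1}$$

where $\mathcal{N}_h$ denotes the set of triples whose head entity is $e_h$, and $\tilde{\pi}(e_h, r, e_t)$ is the weight of the neighbor entity $e_t$ in the aggregate representation, which can also be understood as the importance of the relation $r$ to the entity $e_h$; the weight depends on $e_h$ and $e_t$ in the space of the relation $r$ and is defined as shown in formula (2):

$$\pi(e_h, r, e_t) = (W_r \mathbf{e}_t)^{\top} \tanh(W_r \mathbf{e}_h + \mathbf{e}_r) \tag{2}$$

where $W_r \in \mathbb{R}^{k \times d}$ ($d$ denotes the embedding dimension) is a trainable weight matrix and $\mathbf{e}_r$ is the embedded representation of the relation $r$; finally, the weights are normalized by the softmax function to obtain $\tilde{\pi}(e_h, r, e_t) = \dfrac{\exp(\pi(e_h, r, e_t))}{\sum_{(e_h, r', e_{t'}) \in \mathcal{N}_h} \exp(\pi(e_h, r', e_{t'}))}$;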
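step 22, after the neighbor aggregate representation $\mathbf{e}_{\mathcal{N}_h}$ of entity $e_h$ is obtained, it is used to update the original embedding $\mathbf{e}_h$ of the entity; the embedding of entity $e_h$ after aggregation and update with direct neighbor information is expressed as $\mathbf{e}_h^{(1)} = f_{agg}(\mathbf{e}_h, \mathbf{e}_{\mathcal{N}_h})$, where $f_{agg}(\cdot)$ is an aggregation function defined as shown in formula (3):

$$f_{agg}(\mathbf{e}_h, \mathbf{e}_{\mathcal{N}_h}) = \mathrm{LeakyReLU}\big(W_1(\mathbf{e}_h + \mathbf{e}_{\mathcal{N}_h})\big) + \mathrm{LeakyReLU}\big(W_2(\mathbf{e}_h \odot \mathbf{e}_{\mathcal{N}_h})\big) \tag{3}$$

where $W_1, W_2 \in \mathbb{R}^{d \times d}$ ($d$ denotes the embedding dimension) are trainable weight matrices, $\odot$ denotes the element-wise product, and LeakyReLU is an activation function;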
step 23, on the basis of step 21 and step 22, further propagation layers are stacked to obtain a high-order neighbor aggregate representation of each entity; the entity embedded representation is updated recursively in the $l$-th layer of the network, so that information propagates node by node through the knowledge graph; after $l$ layers of propagation, the representation of entity $e_h$ in the $l$-th layer contains information from the high-order neighbor entities that $e_h$ can reach in $l$ steps.
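One propagation layer of step 2 can be sketched as below: formula (2) gives a raw attention score per neighbor, softmax normalizes the scores, the neighbor embeddings are summed with those weights, and an aggregation function updates the entity embedding. The aggregation function is assumed here to be the Bi-Interaction form (sum and element-wise product, each passed through LeakyReLU) named in the experiments; the LeakyReLU slope of 0.2, the helper names and the toy values are hypothetical.

```python
import math


def matvec(W, x):
    return [sum(w * v for w, v in zip(row, x)) for row in W]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def softmax(scores):
    m = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def leaky_relu(x, slope=0.2):  # the slope value is an assumption
    return [v if v > 0 else slope * v for v in x]


def attention_score(W_r, e_h, e_r, e_t):
    """Formula (2): (W_r e_t)^T tanh(W_r e_h + e_r)."""
    gate = [math.tanh(p + r) for p, r in zip(matvec(W_r, e_h), e_r)]
    return dot(matvec(W_r, e_t), gate)


def neighbor_weights(e_h, edges):
    """edges is a list of (W_r, e_r, e_t); returns softmax-normalized weights."""
    return softmax([attention_score(W_r, e_h, e_r, e_t) for W_r, e_r, e_t in edges])


def aggregate_neighbors(e_h, edges):
    """Attention-weighted sum of the neighbor embeddings e_t."""
    weights = neighbor_weights(e_h, edges)
    return [sum(w * e_t[i] for w, (_, _, e_t) in zip(weights, edges))
            for i in range(len(e_h))]


def bi_interaction(e_h, e_n, W1, W2):
    """Assumed aggregator: LeakyReLU(W1 (e_h + e_n)) + LeakyReLU(W2 (e_h * e_n))."""
    added = matvec(W1, [a + b for a, b in zip(e_h, e_n)])
    multiplied = matvec(W2, [a * b for a, b in zip(e_h, e_n)])
    return [a + b for a, b in zip(leaky_relu(added), leaky_relu(multiplied))]


# Toy example with d = k = 2 and two neighbors (hypothetical values).
identity = [[1.0, 0.0], [0.0, 1.0]]
e_h = [1.0, 0.0]
edges = [(identity, [0.0, 0.0], [1.0, 0.0]),
         (identity, [0.0, 0.0], [0.0, 1.0])]

weights = neighbor_weights(e_h, edges)
e_n = aggregate_neighbors(e_h, edges)
e_h_next = bi_interaction(e_h, e_n, identity, identity)
print(weights, e_n, e_h_next)
```

Applying the same update to every entity and repeating it $l$ times corresponds to stacking $l$ propagation layers, so each embedding absorbs information from neighbors up to $l$ steps away.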
4. The method for generating a prescription of traditional Chinese medicine based on knowledge-graph and group representation learning according to claim 3, wherein the step 3 is specifically as follows:
step 31, the multiple symptoms of each prescription sample are defined as a group $S_p = \{s_i \mid s_i \in S\}$, where $s_i$ denotes the $i$-th symptom and $S$ denotes the set of all symptoms in the data set; the syndrome induction process uses an attention mechanism to learn the influence of the different symptoms in each symptom group on the group, that is, the weight of each symptom within a symptom group; the weight is learned by an attention network, and the weight $\alpha(S_p, s_i)$ of each symptom $s_i$ in the symptom group $S_p$ is defined as shown in formula (4):

$$\alpha(S_p, s_i) = \mathbf{h}^{\top} W_{att}\, \mathbf{s}_i \tag{4}$$

where $\mathbf{h}$ and $W_{att}$ are trainable parameters and $\mathbf{s}_i$ is the embedded representation of symptom $s_i$ obtained through step 2; after the weight of each symptom in a symptom group is obtained, the weights are normalized by the softmax function to finally obtain the influence score $\tilde{\alpha}(S_p, s_i)$ of each symptom in the group, as shown in formula (5):

$$\tilde{\alpha}(S_p, s_i) = \frac{\exp(\alpha(S_p, s_i))}{\sum_{s_j \in S_p} \exp(\alpha(S_p, s_j))} \tag{5}$$
step 32, on the basis of step 31, the representation of the symptom group $S_p$, that is, the representation $\mathbf{s}_d$ of the potential syndrome of each symptom combination, is obtained, defined as shown in formula (6):

$$\mathbf{s}_d = \sum_{s_i \in S_p} \tilde{\alpha}(S_p, s_i)\, \mathbf{s}_i \tag{6}$$

where $\tilde{\alpha}(S_p, s_i)$ is the normalized influence score of symptom $s_i$ in the group; then, using the nonlinear processing capability of a single-layer MLP, a more expressive syndrome representation is learned, defined as shown in formula (7):

$$\mathbf{s}_d = \mathrm{ReLU}(W_{mlp} \cdot \mathbf{s}_d + b_{mlp}) \tag{7}$$

where $W_{mlp}$ and $b_{mlp}$ are learnable parameters and ReLU is an activation function; thus, through the attention-based symptom aggregation method, the potential syndrome representation of each prescription sample is obtained;
step 33, the potential syndrome of each prescription sample obtained through the above steps is interacted with the herbs to predict the probability that each herb is suitable for treating the symptom group, where the prediction score is calculated using the inner product, as shown in formula (8):

$$\hat{y}(S_p, h) = \mathbf{s}_d^{\top} \mathbf{h} \tag{8}$$

where $\hat{y}(S_p, h)$ denotes the probability that herb $h$ is suitable for treating the symptom group $S_p$, that is, the potential syndrome $\mathbf{s}_d$, and $\mathbf{h}$ is the embedded representation of the herb entity in the knowledge graph; finally, the top $N$ herbs with the highest probability scores are output as the prescription applicable to the input symptom combination.
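The syndrome induction and prediction steps of step 3 can be sketched together: attention scores from formula (4) are softmax-normalized, the symptom embeddings are pooled with those weights, a single-layer MLP with ReLU from formula (7) refines the result, and herbs are ranked by inner product. This is a minimal sketch under stated assumptions, not the patented implementation; the function names, the herb names `herb_a`, `herb_b`, `herb_c` and all parameter values are hypothetical.

```python
import math


def matvec(W, x):
    return [sum(w * v for w, v in zip(row, x)) for row in W]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def softmax(scores):
    m = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def syndrome_representation(symptom_embs, h_att, W_att, W_mlp, b_mlp):
    """Attention-pool the symptom embeddings, then apply a single-layer MLP."""
    raw = [dot(h_att, matvec(W_att, s)) for s in symptom_embs]  # formula (4)
    weights = softmax(raw)                                      # normalized scores
    dim = len(symptom_embs[0])
    pooled = [sum(w * s[i] for w, s in zip(weights, symptom_embs))
              for i in range(dim)]                              # weighted sum
    hidden = [v + b for v, b in zip(matvec(W_mlp, pooled), b_mlp)]
    return [max(0.0, v) for v in hidden]                        # formula (7), ReLU


def recommend(s_d, herb_embs, top_n):
    """Rank herbs by the inner product with the syndrome representation."""
    scores = {name: dot(s_d, emb) for name, emb in herb_embs.items()}
    return sorted(scores, key=scores.get, reverse=True)[:top_n]


# Toy example with two symptoms and three herbs (all values hypothetical).
identity = [[1.0, 0.0], [0.0, 1.0]]
symptom_embs = [[1.0, 0.0], [0.0, 1.0]]
s_d = syndrome_representation(symptom_embs, h_att=[1.0, 0.0], W_att=identity,
                              W_mlp=identity, b_mlp=[0.0, 0.0])
herb_embs = {"herb_a": [1.0, 0.0], "herb_b": [0.0, 1.0], "herb_c": [-1.0, -1.0]}
prescription = recommend(s_d, herb_embs, top_n=2)
print(s_d, prescription)
```

In this toy setting the first symptom receives the larger attention weight, so the herb aligned with it is ranked first.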
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202111402132.3A CN114121212B (en) | 2021-11-19 | 2021-11-19 | Traditional Chinese medicine prescription generation method based on knowledge graph and group representation learning |
Publications (2)
Publication Number | Publication Date |
---|---|
CN114121212A true CN114121212A (en) | 2022-03-01 |
CN114121212B CN114121212B (en) | 2024-04-02 |
Family
ID=80371712
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202111402132.3A Active CN114121212B (en) | 2021-11-19 | 2021-11-19 | Traditional Chinese medicine prescription generation method based on knowledge graph and group representation learning |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN114121212B (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN116631612A (en) * | 2023-06-09 | 2023-08-22 | 广东工业大学 | Graph convolution herbal medicine recommendation method and computer based on multi-graph fusion |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110334211A (en) * | 2019-06-14 | 2019-10-15 | 电子科技大学 | A kind of Chinese medicine diagnosis and treatment knowledge mapping method for auto constructing based on deep learning |
CN112131399A (en) * | 2020-09-04 | 2020-12-25 | 牛张明 | Old medicine new use analysis method and system based on knowledge graph |
WO2021139247A1 (en) * | 2020-08-06 | 2021-07-15 | 平安科技(深圳)有限公司 | Construction method, apparatus and device for medical domain knowledge map, and storage medium |
WO2021189971A1 (en) * | 2020-10-26 | 2021-09-30 | 平安科技(深圳)有限公司 | Medical plan recommendation system and method based on knowledge graph representation learning |
CN113539412A (en) * | 2021-07-19 | 2021-10-22 | 闽江学院 | Chinese herbal medicine recommendation system based on deep learning |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN116631612A (en) * | 2023-06-09 | 2023-08-22 | 广东工业大学 | Graph convolution herbal medicine recommendation method and computer based on multi-graph fusion |
CN116631612B (en) * | 2023-06-09 | 2024-03-19 | 广东工业大学 | Graph convolution herbal medicine recommendation method and computer based on multi-graph fusion |
Also Published As
Publication number | Publication date |
---|---|
CN114121212B (en) | 2024-04-02 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Khan et al. | Chronic disease prediction using administrative data and graph theory: The case of type 2 diabetes | |
US20200203017A1 (en) | Systems and methods of prediction of injury risk with a training regime | |
Yang et al. | Predicting coronary heart disease using an improved LightGBM model: Performance analysis and comparison | |
Davazdahemami et al. | An explanatory machine learning framework for studying pandemics: The case of COVID-19 emergency department readmissions | |
Sarkar et al. | Selecting informative rules with parallel genetic algorithm in classification problem | |
US20210089965A1 (en) | Data Conversion/Symptom Scoring | |
CN111477337A (en) | Infectious disease early warning method, system and medium based on individual self-adaptive transmission network | |
CN116403730A (en) | Medicine interaction prediction method and system based on graph neural network | |
Ravuri et al. | Learning from the experts: From expert systems to machine-learned diagnosis models | |
CN116992980A (en) | Prognosis prediction early warning model training method, system and equipment based on super network and federal learning | |
CN114121212B (en) | Traditional Chinese medicine prescription generation method based on knowledge graph and group representation learning | |
Hoyos et al. | PRV-FCM: An extension of fuzzy cognitive maps for prescriptive modeling | |
Ma et al. | Construction and evaluation of intelligent medical diagnosis model based on integrated deep neural network | |
Tran et al. | Building interpretable predictive models with context-aware evolutionary learning | |
Zeng et al. | Influential simplices mining via simplicial convolutional network | |
Chen et al. | Personalized expert recommendation systems for optimized nutrition | |
CN116798653A (en) | Drug interaction prediction method, device, electronic equipment and storage medium | |
CN115240811A (en) | Construction method and application of implicit relation drug recommendation model based on graph neural network | |
Xu et al. | Multiple MACE risk prediction using multi-task recurrent neural network with attention | |
Khater et al. | Interpretable models for ml-based classification of obesity | |
Dong et al. | PresRecST: a novel herbal prescription recommendation algorithm for real-world patients with integration of syndrome differentiation and treatment planning | |
Jin et al. | A knowledge-guided and traditional Chinese medicine informed approach for herb recommendation | |
Yale | Privacy preserving synthetic health data generation and evaluation | |
Melek et al. | A theoretic framework for intelligent expert systems in medical encounter evaluation | |
CN117435747B (en) | Few-sample link prediction drug recycling method based on multilevel refinement network |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||