CN114121212A

CN114121212A - Traditional Chinese medicine prescription generation method based on knowledge graph and group representation learning

Info

Publication number: CN114121212A
Application number: CN202111402132.3A
Authority: CN
Inventors: 王伟; 李书晨; 何洁月
Original assignee: Southeast University
Current assignee: Southeast University
Priority date: 2021-11-19
Filing date: 2021-11-19
Publication date: 2022-03-01
Anticipated expiration: 2041-11-19
Also published as: CN114121212B

Abstract

The invention discloses a traditional Chinese medicine (TCM) prescription generation method based on knowledge graph and group representation learning, which comprises the following steps in sequence. Step 1: construct a TCM knowledge graph with herbs as the core, encapsulating herb attributes such as nature, flavor, meridian tropism and efficacy into triples, and add the treatment relationships between symptoms and herbs in the prescription dataset to the knowledge graph, finally forming the TCM knowledge graph. Step 2: update the embedded representation of each entity through the propagation and aggregation of neighborhood information in the knowledge graph. Step 3: based on the entity embeddings obtained in step 2, regard the symptom combination of each prescription sample as a group, let the group representation interact with the herb entities in the TCM knowledge graph, and finally output the several herbs most suitable for the symptom combination to form a TCM prescription. The invention uses data mining to simulate the process of treatment based on syndrome differentiation in TCM diagnosis and treatment, and generates TCM prescriptions from symptoms to assist clinical treatment.

Description

Traditional Chinese medicine prescription generation method based on knowledge graph and group representation learning

Technical Field

The invention relates to a traditional Chinese medicine prescription generation method, in particular to a traditional Chinese medicine prescription generation method based on knowledge graph and group representation learning.

Background

TCM prescription generation produces a set of herbs for treating a patient's symptoms by analyzing the interactions between symptoms and herbs. In actual TCM practice, the doctor infers the syndrome from the patient's symptoms and then prescribes according to the syndrome. Treatment based on syndrome differentiation is the basic principle by which TCM understands and treats disease, and is a research and treatment approach specific to TCM. A syndrome is the nature of a disease as revealed by a number of symptoms of primary and secondary importance. The symptoms in a group play different roles in syndrome induction: the syndrome is not obtained by simply accumulating the symptoms, but by weighing them comprehensively according to which are primary and which are secondary.

A knowledge graph is a large semantic network of entities and relations, and can supplement the relationships between users and items in a recommendation task. Embedding-based methods and path-based methods are the two common types of knowledge-graph recommendation. Embedding-based methods mainly use the information in the graph to better represent entities and relations; the Trans family of models is representative of this type. Path-based methods make recommendations using connectivity patterns in the knowledge graph such as meta-paths or meta-graphs. However, most such methods depend on domain knowledge and are difficult to apply in practice. Building on these, hybrid methods combine the two by integrating entity embeddings with connectivity information, mainly optimizing the entity embeddings with the path connectivity information of the knowledge graph. Group recommendation builds on ordinary recommendation: it aggregates users into groups with a specific aggregation method, learns the group representations, and models the interactions between user groups and items to predict how much each group prefers each item.

Disclosure of Invention

Purpose of the invention: existing prescription generation methods cannot reasonably simulate the actual diagnosis and treatment process of TCM. To address this, the invention provides a TCM prescription generation method based on knowledge graph and group representation learning (KGAPG). The invention treats prescription generation as a group recommendation task: the symptom set of a patient is regarded as a group, and the different influence of each symptom within the group is taken into account, so that multiple symptoms are aggregated into a single group representation by an attention mechanism. In terms of TCM theory, this group is the "syndrome", so the interaction between syndromes and herbs can be modeled. First, a TCM knowledge graph is constructed and the semantic relations among herbs are learned in it. Then the influence weights of the symptoms in a symptom group are learned by an attention mechanism, and the group representation, i.e. the syndrome representation, is obtained by aggregation. Finally, the syndrome representation interacts with the herb representations to output predicted scores for the herbs suited to the symptom group.

The technical scheme is as follows: in order to achieve the above purpose, the method for generating a traditional Chinese medicine prescription based on knowledge graph and group representation learning of the invention sequentially comprises the following steps:

step 1, knowledge graph construction and initial embedding layer: taking herbs as the core, encapsulate herb attributes such as nature, flavor, meridian tropism and efficacy into triples, and add the treatment relationships between symptoms and herbs in the prescription dataset to the knowledge graph, finally forming the TCM knowledge graph; initialize the embedded representation of each entity in the knowledge graph with a TransR model;

step 2, neighbor information propagation and aggregation layer: update the embedded representation of each entity through the propagation and aggregation of high-order neighborhood information in the knowledge graph, enriching the semantic relations of each entity in the TCM knowledge graph;

step 3, syndrome induction and prediction layer: based on the entity embeddings obtained in step 2, regard the symptom combination of each prescription sample as a group, which represents the syndrome information of TCM theory; learn the embedded representation of the group with an attention mechanism; let the group representation interact with the herb entities in the TCM knowledge graph; and finally output the several herbs most suitable for the symptom combination to form a TCM prescription.
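As an illustration of step 1, the sketch below assembles knowledge graph triples from herb attributes and prescription samples. The relation names other than hasEffect (hasNature, hasFlavor, hasMeridian, treats), the direction of the treats relation, and all sample data are assumptions made for this example; the text itself only gives the triple (ephedra, hasEffect, sweating).

```python
# Hypothetical herb attribute records (illustrative data only).
herb_attributes = {
    "ephedra": {
        "hasNature": ["warm"],
        "hasFlavor": ["pungent"],
        "hasMeridian": ["lung"],
        "hasEffect": ["sweating"],
    },
}

# Hypothetical prescription samples: (symptom list, herb list).
prescriptions = [
    (["fever", "cough"], ["ephedra"]),
]


def build_triples(herb_attributes, prescriptions):
    """Return a set of (head, relation, tail) triples centered on herbs."""
    triples = set()
    # Herb attribute triples, e.g. (ephedra, hasEffect, sweating).
    for herb, attrs in herb_attributes.items():
        for relation, values in attrs.items():
            for value in values:
                triples.add((herb, relation, value))
    # Treatment triples between herbs and symptoms that co-occur in a sample.
    for symptoms, herbs in prescriptions:
        for herb in herbs:
            for symptom in symptoms:
                triples.add((herb, "treats", symptom))
    return triples


triples = build_triples(herb_attributes, prescriptions)
```

Storing triples in a set removes the duplicates that arise when the same herb and symptom co-occur in many prescriptions.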

Further, step 1 specifically comprises: take the triple (ephedra, hasEffect, sweating) in the TCM knowledge graph as an example, where "ephedra" is a herb entity in the knowledge graph, "sweating" is an efficacy entity, and "hasEffect" is the semantic relationship between them, read as "ephedra has the efficacy of sweating". Denote the triple by $(e_h, r, e_t)$, where $e_h$, $r$ and $e_t$ are the head entity (ephedra), the relation (hasEffect) and the tail entity (sweating), respectively. First, the entities in the $d$-dimensional entity space are projected by the matrix $W_r \in \mathbb{R}^{k \times d}$ into the $k$-dimensional space of relation $r$, giving the relation-space embeddings $e_h^r = W_r e_h$ and $e_t^r = W_r e_t$. The embeddings are then optimized under the translation principle $e_h^r + e_r \approx e_t^r$, where $e_r$ is the embedded representation of relation $r$ in the $k$-dimensional relation space. This yields the initial embeddings of the two entities "ephedra" and "sweating" in the TCM knowledge graph. In the same way, every entity in the TCM knowledge graph obtains its initial embedding after training with the TransR model.

Further, step 2 specifically comprises:

step 21. Let $e_h$ denote the initial embedding of an entity in the knowledge graph obtained from the TransR embedding of step 1. The entities directly connected to $e_h$ are called its direct neighbors. Let $e_{\mathcal{N}_h}$ denote the aggregate representation of the neighbors of $e_h$, as shown in formula (1):

$$e_{\mathcal{N}_h} = \sum_{(e_h, r, e_t) \in \mathcal{N}_h} \pi(e_h, r, e_t)\, e_t \tag{1}$$

where $\mathcal{N}_h$ is the set of triples with $e_h$ as head entity and $\pi(e_h, r, e_t)$ is the weight of neighbor entity $e_t$ in the aggregate representation, which can also be understood as the importance of relation $r$ to entity $e_h$. The weight depends on the distance between $e_h$ and $e_t$ in the space of relation $r$, and is defined in formula (2):

$$\pi(e_h, r, e_t) = (W_r e_t)^{\top} \tanh(W_r e_h + e_r) \tag{2}$$

where $W_r \in \mathbb{R}^{k \times d}$ ($d$ denotes the embedding dimension) is a trainable weight matrix and $e_r$ is the embedded representation of relation $r$. Finally, the weights are normalized over all direct neighbors of $e_h$ with the softmax function:

$$\pi(e_h, r, e_t) \leftarrow \frac{\exp(\pi(e_h, r, e_t))}{\sum_{(e_h, r', e_{t'}) \in \mathcal{N}_h} \exp(\pi(e_h, r', e_{t'}))}$$

step 22. After the neighbor aggregate representation $e_{\mathcal{N}_h}$ of entity $e_h$ is obtained, it is used to update the original embedding $e_h$. The representation of entity $e_h$ after aggregation and update with direct-neighbor information is $e_h^{(1)} = f_{agg}(e_h, e_{\mathcal{N}_h})$, where $f_{agg}(\cdot)$ is the aggregation function defined in formula (3):

$$f_{agg}(e_h, e_{\mathcal{N}_h}) = \mathrm{LeakyReLU}\big(W_1 (e_h + e_{\mathcal{N}_h})\big) + \mathrm{LeakyReLU}\big(W_2 (e_h \odot e_{\mathcal{N}_h})\big) \tag{3}$$

where $W_1$ and $W_2$ are trainable weight matrices whose size is determined by the embedding dimension $d$, $\odot$ denotes the element-wise product, and LeakyReLU is the activation function.

step 23. On the basis of steps 21 and 22, more propagation layers are stacked to obtain the high-order neighbor aggregate representation of each entity. The entity embeddings are updated recursively in an $l$-layer network, and information propagates node by node through the knowledge graph. In short, after propagation through $l$ layers, the final representation of entity $e_h$ contains the information of the high-order neighbor entities that $e_h$ can reach within $l$ steps.

Further, step 3 specifically comprises:

step 31. The symptoms of each prescription sample are defined as a group $S_p = \{s_i \mid s_i \in S\}$, where $s_i$ denotes the $i$-th symptom and $S$ is the set of all symptoms in the dataset. The syndrome induction process uses an attention mechanism to learn the influence of each symptom on its group, i.e. the weight of each symptom within a symptom group. The weight is learned by an attention network; the weight $\alpha(S_p, s_i)$ of each symptom $s_i$ in the symptom set $S_p$ is defined in formula (4):

$$\alpha(S_p, s_i) = h^{\top} W_{att}\, s_i \tag{4}$$

where $h$ and $W_{att}$ are trainable parameters and $s_i$ is the embedded representation of symptom $s_i$ obtained in step 2. After the weight of each symptom in a symptom group is obtained, the weights are normalized with the softmax function to give the final influence score $\tilde{\alpha}(S_p, s_i)$ of each symptom in the group, as shown in formula (5):

$$\tilde{\alpha}(S_p, s_i) = \frac{\exp(\alpha(S_p, s_i))}{\sum_{s_j \in S_p} \exp(\alpha(S_p, s_j))} \tag{5}$$

step 32. On the basis of step 31, the representation of the symptom group $S_p$, i.e. the representation $s_d$ of the latent syndrome of each symptom combination, is defined in formula (6):

$$s_d = \sum_{s_i \in S_p} \tilde{\alpha}(S_p, s_i)\, s_i \tag{6}$$

Then the nonlinearity of a single-layer MLP is used to learn a more expressive syndrome representation, as defined in formula (7):

$$s_d = \mathrm{ReLU}(W^{mlp} \cdot s_d + b^{mlp}) \tag{7}$$

where $W^{mlp}$ and $b^{mlp}$ are learnable parameters and ReLU is the activation function. At this point, the attention-based symptom aggregation has produced the latent syndrome representation of each prescription sample, which matches the basic process of TCM diagnosis and treatment. The above process is adaptive.

step 33. The latent syndrome of each prescription sample obtained in the above steps interacts with the herbs to predict the likelihood that each herb is suitable for treating the set of symptoms. The predicted score is calculated with the inner product, as shown in formula (8):

$$\hat{y}(S_p, h) = s_d^{\top}\, \mathbf{h} \tag{8}$$

where $\hat{y}(S_p, h)$ is the likelihood that herb $h$ is suitable for treating the symptom group $S_p$, i.e. the latent syndrome $s_d$, and $\mathbf{h}$ is the embedded representation of the herb entity in the knowledge graph. Finally, the top $N$ herbs with the highest scores are output as the prescription for the input symptom combination.
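Step 3 can be sketched in NumPy as follows. This is a minimal sketch under stated assumptions, not the patented implementation: the array shapes, the function names `induce_syndrome` and `recommend`, and the toy values in the usage note are all hypothetical, while the attention score, softmax normalization, single-layer MLP and inner-product scoring follow the description above.

```python
import numpy as np


def softmax(x):
    """Numerically stable softmax over a 1-D array."""
    z = np.exp(x - np.max(x))
    return z / z.sum()


def induce_syndrome(symptom_embs, h_att, W_att, W_mlp, b_mlp):
    """Aggregate symptom embeddings (n x d) into one syndrome vector."""
    # Attention score of each symptom: alpha_i = h^T W_att s_i.
    scores = (symptom_embs @ W_att.T) @ h_att
    # Influence score of each symptom within the group.
    weights = softmax(scores)
    # Weighted sum of symptom embeddings gives the group representation.
    s_d = weights @ symptom_embs
    # Single-layer MLP with ReLU.
    return np.maximum(0.0, W_mlp @ s_d + b_mlp), weights


def recommend(s_d, herb_embs, herb_names, top_n):
    """Score every herb by inner product and return the top-N names."""
    scores = herb_embs @ s_d
    order = np.argsort(-scores)[:top_n]
    return [herb_names[i] for i in order]
```

For example, with two symptom embeddings `[[1, 0], [0, 1]]`, identity matrices for `W_att` and `W_mlp`, and zero vectors for `h_att` and `b_mlp`, both symptoms receive equal weight and the syndrome vector is their mean.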

Has the advantages that:

The invention provides a TCM prescription generation method based on knowledge graph and group representation learning, which converts the TCM prescription generation problem into a group recommendation task. The model first uses TCM domain knowledge to construct a TCM knowledge graph and learns the embedded representation of each entity through the high-order connectivity of the graph. In addition, the symptom combination of each input sample is regarded as a group, the influence weight of each symptom in the group is learned with an attention mechanism, and the symptom embeddings are aggregated into the group embedding, i.e. the embedding of the latent syndrome reflected by the symptom combination, thereby simulating the process of "treatment based on syndrome differentiation" in TCM diagnosis and treatment. The advantages are as follows:

(1) the group recommendation method is introduced into the prescription generation task, emphasizing the primary and secondary influence of different symptoms in syndrome induction. The attention mechanism is used during group aggregation in the syndrome induction stage to learn the influence scores of the symptoms in the symptom group of a prescription sample;

(2) the sparsity problem of the recommendation task is alleviated by the rich item structure of the knowledge graph, and the latent semantic relations between herbs and their attributes and symptoms are obtained through the high-order connectivity of the knowledge graph.

Drawings

FIG. 1 is a block diagram of the overall framework of the KGAPG model of the present invention;

FIG. 2 is a flow chart of a method of the present invention;

FIG. 3 is a schematic view of the TCM knowledge graph;

FIG. 4 is a verification graph comparing neighbor information propagation and aggregation layer depth;

FIG. 5 is a comparison verification diagram of knowledge-graph entity embedding dimension.

Detailed Description

In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is further described below with reference to specific embodiments and illustrative drawings, it being understood that the preferred embodiments described herein are for the purpose of illustration and explanation only and are not intended to limit the present invention.

The invention relates to a traditional Chinese medicine prescription generation method based on knowledge graph and group representation learning, which sequentially comprises the following steps of:

step 1: knowledge graph construction and initial embedding layer.

An entity in the knowledge graph is a composite of multiple attributes; different relations focus on different attributes of the entity, and different relations have different semantic spaces. Take the triple (ephedra, hasEffect, sweating) in the TCM knowledge graph as an example, where "ephedra" is a herb entity in the knowledge graph, "sweating" is an efficacy entity, and "hasEffect" is the semantic relationship between them, read as "ephedra has the efficacy of sweating". Denote the triple by $(e_h, r, e_t)$, where $e_h$, $r$ and $e_t$ are the head entity (ephedra), the relation (hasEffect) and the tail entity (sweating), respectively. First, the entities in the $d$-dimensional entity space are projected by the matrix $W_r \in \mathbb{R}^{k \times d}$ into the $k$-dimensional space of relation $r$, giving the relation-space embeddings $e_h^r = W_r e_h$ and $e_t^r = W_r e_t$. The embeddings are then optimized under the translation principle $e_h^r + e_r \approx e_t^r$, where $e_r$ is the embedded representation of relation $r$ in the $k$-dimensional relation space. For a given triple $(e_h, r, e_t)$, the plausibility score (or energy score) is given by formula (1):

$$g(e_h, r, e_t) = \lVert W_r e_h + e_r - W_r e_t \rVert_2^2 \tag{1}$$

A lower score $g(e_h, r, e_t)$ indicates that the triple is more likely to be true, and a higher score that it is more likely to be false. Training of TransR considers the relative order between real and false triples and distinguishes them with the pairwise ranking loss of formula (2):

$$\mathcal{L}_{KG} = \sum_{(e_h, r, e_t, e_t') \in \mathcal{T}} -\ln \sigma\big(g(e_h, r, e_t') - g(e_h, r, e_t)\big) \tag{2}$$

where $\mathcal{T}$ is the set of training samples, each pairing a real triple $(e_h, r, e_t)$ from the knowledge graph with a false triple $(e_h, r, e_t')$ constructed by randomly replacing one of the entities in the real triple; $\sigma(\cdot)$ is the sigmoid function.
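The TransR scoring and pairwise ranking described above can be sketched in NumPy. The squared L2 form of the score and the $-\ln\sigma(\cdot)$ form of the loss are assumptions here, following the usual TransR and pairwise ranking formulation; the function names and any toy vectors are illustrative only.

```python
import numpy as np


def transr_score(e_h, e_r, e_t, W_r):
    """Plausibility score ||W_r e_h + e_r - W_r e_t||^2; lower is more plausible."""
    diff = W_r @ e_h + e_r - W_r @ e_t
    return float(diff @ diff)


def pairwise_loss(pos_score, neg_score):
    """-ln(sigmoid(neg_score - pos_score)), computed stably via logaddexp."""
    return float(np.logaddexp(0.0, -(neg_score - pos_score)))


def corrupt_tail(triple, entities, rng):
    """Build a false triple by replacing the tail with a different random entity."""
    head, relation, tail = triple
    candidates = [e for e in entities if e != tail]
    return (head, relation, candidates[rng.integers(len(candidates))])
```

The loss is small when the real triple scores well below the corrupted one, and equals ln 2 when the two scores are identical, so minimizing it pushes real triples toward lower scores than false ones.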

Step 2: the neighbor information propagation and aggregation layer,

Let $e_h$ denote the initial embedding of an entity in the knowledge graph obtained from the TransR embedding of step 1. The entities directly connected to $e_h$ are called its direct neighbors. The embeddings of the direct neighbors pass through an information propagation process and are aggregated into entity $e_h$; the embedding of $e_h$ is then fused with the aggregate representation of its direct neighbors to obtain a higher-order representation of $e_h$. Let $e_{\mathcal{N}_h}$ denote the aggregate representation of the neighbors of $e_h$, defined in formula (3):

$$e_{\mathcal{N}_h} = \sum_{(e_h, r, e_t) \in \mathcal{N}_h} \tilde{\pi}(e_h, r, e_t)\, e_t \tag{3}$$

where $\mathcal{N}_h$ is the set of triples with $e_h$ as head entity and $\tilde{\pi}(e_h, r, e_t)$ is the weight of neighbor entity $e_t$ in the aggregate representation, which can also be understood as the importance of relation $r$ to entity $e_h$. The weight depends on the distance between $e_h$ and $e_t$ in the space of relation $r$, and is defined in formula (4):

$$\pi(e_h, r, e_t) = (W_r e_t)^{\top} \tanh(W_r e_h + e_r) \tag{4}$$

where $W_r \in \mathbb{R}^{k \times d}$ ($d$ is the embedding dimension) is a trainable weight matrix and $e_r$ is the embedded representation of relation $r$. After the aggregation weights of all direct neighbors of entity $e_h$ are obtained, they are normalized with the softmax function, as shown in formula (5):

$$\tilde{\pi}(e_h, r, e_t) = \frac{\exp(\pi(e_h, r, e_t))}{\sum_{(e_h, r', e_{t'}) \in \mathcal{N}_h} \exp(\pi(e_h, r', e_{t'}))} \tag{5}$$

After the neighbor aggregate representation $e_{\mathcal{N}_h}$ of entity $e_h$ is obtained, it is used to update the original embedding: $e_h^{(1)} = f_{agg}(e_h, e_{\mathcal{N}_h})$, where $f_{agg}(\cdot)$ is the aggregation function defined in formula (6):

$$f_{agg}(e_h, e_{\mathcal{N}_h}) = \mathrm{LeakyReLU}\big(W_1 (e_h + e_{\mathcal{N}_h})\big) + \mathrm{LeakyReLU}\big(W_2 (e_h \odot e_{\mathcal{N}_h})\big) \tag{6}$$

where $W_1, W_2 \in \mathbb{R}^{d' \times d}$ ($d'$ and $d$ are embedding dimensions) are trainable weight matrices, $\odot$ denotes the element-wise product, and LeakyReLU is the activation function. Through the propagation and aggregation of direct neighbors, each entity in the knowledge graph contains not only its own information but also the information that flows to it from its direct neighbors along first-order connectivity.

Building on this propagation and aggregation of direct-neighbor information, more propagation layers are stacked to obtain the high-order neighbor information of each entity. The entity embeddings are updated recursively in an $l$-layer network, and information propagates node by node through the knowledge graph. In short, after propagation through $l$ layers, the final representation of entity $e_h$ contains the information of the high-order neighbor entities that $e_h$ can reach within $l$ steps.
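One propagation layer for a single entity can be sketched in NumPy as below. This is a sketch under assumptions: the attention score follows $(W_r e_t)^{\top}\tanh(W_r e_h + e_r)$ as stated in the claims, the aggregator is taken to be the Bi-Interaction form (sum and element-wise product branches, each with LeakyReLU) that the experiments section names, and the LeakyReLU slope of 0.2, the dictionary layout and the function names are hypothetical.

```python
import numpy as np


def leaky_relu(x, slope=0.2):
    """LeakyReLU; the slope value is an assumption for this sketch."""
    return np.where(x > 0, x, slope * x)


def softmax(x):
    z = np.exp(x - np.max(x))
    return z / z.sum()


def aggregate_neighbors(e_h, neighbors, W_rel, e_rel):
    """neighbors: list of (relation_name, tail_embedding) for entity e_h."""
    # Unnormalized attention score of every direct neighbor.
    scores = np.array([
        (W_rel[r] @ e_t) @ np.tanh(W_rel[r] @ e_h + e_rel[r])
        for r, e_t in neighbors
    ])
    weights = softmax(scores)
    tails = np.stack([e_t for _, e_t in neighbors])
    # Attention-weighted sum of the neighbor embeddings.
    return weights @ tails, weights


def bi_interaction(e_h, e_N, W1, W2):
    """Fuse an entity embedding with its neighbor aggregate."""
    return leaky_relu(W1 @ (e_h + e_N)) + leaky_relu(W2 @ (e_h * e_N))
```

Applying `aggregate_neighbors` followed by `bi_interaction` to every entity, and repeating the pass $l$ times on the updated embeddings, gives each entity access to neighbors up to $l$ steps away.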

step 3: syndrome induction and prediction layer.

The symptoms of each prescription sample are defined as a group $S_p = \{s_i \mid s_i \in S\}$, where $s_i$ denotes the $i$-th symptom and $S$ is the set of all symptoms in the dataset. According to TCM theory, treatment based on syndrome differentiation must weigh the primary and secondary symptoms among all of a patient's symptoms, which affects the practitioner's decision during syndrome induction. In the syndrome induction process, an attention mechanism learns the influence of each symptom on its group, i.e. the weight of each symptom within a symptom group. The weight is learned by an attention network; the weight $\alpha(S_p, s_i)$ of each symptom $s_i$ in the symptom set $S_p$ is defined in formula (7):

$$\alpha(S_p, s_i) = h^{\top} W_{att}\, s_i \tag{7}$$

where $h$ and $W_{att}$ are trainable parameters whose sizes are determined by the embedding dimension $d$, and $s_i$ is the embedded representation of symptom $s_i$ obtained in step 2. After the weight of each symptom in a symptom group is obtained, the weights are normalized with the softmax function to give the final influence score $\tilde{\alpha}(S_p, s_i)$ of each symptom in the group, as shown in formula (8):

$$\tilde{\alpha}(S_p, s_i) = \frac{\exp(\alpha(S_p, s_i))}{\sum_{s_j \in S_p} \exp(\alpha(S_p, s_j))} \tag{8}$$

On this basis, the representation of the symptom group $S_p$, i.e. the representation $s_d$ of the latent syndrome of each symptom combination, is defined in formula (9):

$$s_d = \sum_{s_i \in S_p} \tilde{\alpha}(S_p, s_i)\, s_i \tag{9}$$

Then the nonlinearity of a single-layer MLP is used to learn a more expressive syndrome representation, as defined in formula (10):

$$s_d = \mathrm{ReLU}(W^{mlp} \cdot s_d + b^{mlp}) \tag{10}$$

where $W^{mlp}$ and $b^{mlp}$ are learnable parameters and ReLU is the activation function. At this point, the attention-based symptom aggregation has produced the latent syndrome representation of each prescription sample, which matches the basic process of TCM diagnosis and treatment. The above process is adaptive.

Finally, the latent syndrome of each prescription sample obtained in the above steps interacts with the herbs to predict the likelihood that each herb is suitable for treating the set of symptoms. The predicted score is calculated with the inner product, as shown in formula (11):

$$\hat{y}(S_p, h) = s_d^{\top}\, \mathbf{h} \tag{11}$$

where $\hat{y}(S_p, h)$ is the likelihood that herb $h$ is suitable for treating the symptom group $S_p$, i.e. the latent syndrome $s_d$, and $\mathbf{h}$ is the embedded representation of the herb entity in the knowledge graph.

4. Model optimization

Given a prescription sample $P = \langle S_p, H_p \rangle$, where $S_p$ and $H_p$ are the symptom set and the herb set of the sample, the herb set $H_p$ actually used is represented as a multi-hot vector $hp'$ of dimension $|H|$, and $v(S_p, H)$ is the output probability vector over all herbs given the symptom group $S_p$. The weighted mean square error (WMSE) between $hp'$ and $v(S_p, H)$ is defined in formula (12), in which $hp'_i$ and $v(S_p, H)_i$ denote the $i$-th elements of the two vectors and $w_i$ is the weight of the $i$-th herb. The weight $w_i$ is defined in formula (13) in terms of freq($i$), the frequency with which the $i$-th herb appears across all prescription samples. The herb weights balance the contributions of herbs of different frequencies: the higher the frequency of a herb, the lower its weight.

The loss function for the interaction between the latent syndrome and the herbs is given by formula (14).

Finally, the model is optimized by jointly learning formulas (2) and (14), which gives the joint objective function of formula (15), where $\lambda_\Theta$ controls the $L_2$ regularization term that prevents overfitting and $\Theta$ is the set of all model parameters.
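A frequency-weighted squared error of the kind described above can be sketched as follows. The exact forms of formulas (12) and (13) are not reproduced in this text, so two assumptions are made and should be read as such: the weight is taken as $w_i = \max_k \mathrm{freq}(k) / \mathrm{freq}(i)$, one common choice that decreases with frequency, and the error is summed over herbs without further normalization. The function names are hypothetical.

```python
import numpy as np


def herb_weights(freq):
    """Assumed weight: max frequency divided by each herb's frequency."""
    freq = np.asarray(freq, dtype=float)
    return freq.max() / freq


def wmse(target_multi_hot, predicted, weights):
    """Weighted squared error between a multi-hot target and predicted scores."""
    diff = np.asarray(target_multi_hot, dtype=float) - np.asarray(predicted, dtype=float)
    return float(np.sum(weights * diff ** 2))


# Toy example with three herbs (illustrative numbers only).
weights = herb_weights([10, 5, 1])
loss = wmse([1, 0, 1], [0.5, 0.5, 0.0], weights)
```

With this weighting, missing a rare herb costs more than missing a frequent one, which is the balancing effect the text describes.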

Experiment:

To verify the effectiveness of the KGAPG prescription generation model, experiments were carried out on a TCM prescription dataset; parameter studies and ablation analysis were also performed to further verify the model. The prescription dataset and the TCM knowledge graph data used in the invention are shown in Table 1.

TABLE 1 Prescription dataset and TCM knowledge graph data

The TCM prescription dataset contains 26360 complete prescription samples, involving 360 symptoms and 753 herbs. A triple is the general representation of a knowledge graph: (head, relation, tail), where head and tail are the head entity and the tail entity, and relation is the relationship between the two. In most cases, natural language statements can be represented in this form. For example, the triple (ephedra, hasEffect, sweating) states that the herb "ephedra" has the efficacy of "sweating". In TCM theory, herbs have the properties of four natures, five flavors and meridian tropism. The TCM knowledge graph with herbs as the core is therefore constructed on this background knowledge. FIG. 3 is an illustration of the TCM knowledge graph.

Table 2 compares the performance of models with and without the attention mechanism when aggregating multiple symptoms into a group in the syndrome induction and prediction layer. The table shows that attention-based syndrome induction outperforms average aggregation without attention. Symptom entities capture similarity information between herbs through the high-order connectivity of the knowledge graph, so the symptom embeddings also contain herb background knowledge such as nature, flavor and meridian tropism. In the syndrome induction stage, the latent syndrome representation therefore also integrates the background knowledge of herbs, which tightens the relationship between syndromes and herbs. On this basis, the attention mechanism not only captures the individual influence of each symptom in the symptom group dynamically, but also agrees with the basic idea of treatment based on syndrome differentiation in TCM.

TABLE 2 Influence of the attention mechanism on the syndrome induction process

Table 3 shows how using different aggregation functions to update the entity embeddings in the neighbor information propagation and aggregation layer affects model performance. Three aggregators were compared: the GCN aggregator, the GraphSage aggregator and the Bi-Interaction aggregator. The table shows that the additional feature interaction of the Bi-Interaction aggregator during information aggregation improves the representation learning of nodes, which supports the rationality and effectiveness of the Bi-Interaction aggregator.

TABLE 3 Effect of different aggregators

FIG. 4 illustrates the effect of the depth of the neighbor information propagation and aggregation layer on model performance. Depth reflects the high-order connectivity of the knowledge graph and controls the range from which an entity can aggregate information. The figure shows that the model achieves the best results at depth 2, indicating that second-order relationships between entities can effectively represent the complexity of the herbs. As the depth increases further, noise may be introduced, reducing model performance.

FIG. 5 illustrates the effect of the entity embedding dimension d in the knowledge graph on model performance. The experiments varied the embedding dimension over {32, 64, 128, 256, 512}. The figure shows that the model achieves the best performance at an embedding dimension of 256, indicating that a suitably larger dimension can represent the complex herb information in the knowledge graph more fully.

Case verification:

To verify the rationality and validity of the proposed prescription generation method, the KGAPG model was tested on two real prescription cases. For each case, the model generates a set of herbs to treat the symptoms jointly. The two prescription generation cases are shown in Table 4. Herbs in bold in the table are herbs generated by the KGAPG model that also appear in the real herb set.

Table 4 prescription Generation cases

It should be noted that the above embodiments are only preferred embodiments of the present invention and are not intended to limit its scope; all equivalent replacements or substitutions made to the above technical solutions fall within the scope of the present invention.

Claims

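1. A traditional Chinese medicine prescription generation method based on knowledge graph and group representation learning, characterized by comprising the following steps in sequence:

step 1, knowledge graph construction and initial embedding layer: taking herbs as the core, encapsulating herb attributes such as nature, flavor, meridian tropism and efficacy into triples, and adding the treatment relationships between symptoms and herbs in the prescription dataset to the knowledge graph, finally forming the traditional Chinese medicine knowledge graph; initializing the embedded representation of each entity in the knowledge graph with a TransR model;

step 2, neighbor information propagation and aggregation layer: updating the embedded representation of each entity through the propagation and aggregation of high-order neighborhood information in the knowledge graph, enriching the semantic relations of each entity in the traditional Chinese medicine knowledge graph;

step 3, syndrome induction and prediction layer: based on the entity embeddings obtained in step 2, regarding the symptom combination of each prescription sample as a group, which represents the syndrome information of traditional Chinese medicine theory; learning the embedded representation of the group with an attention mechanism; letting the group representation interact with the herb entities in the traditional Chinese medicine knowledge graph; and finally outputting the several herbs most suitable for the symptom combination to form a traditional Chinese medicine prescription.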

2. The method for generating a prescription of traditional Chinese medicine based on knowledge-graph and group representation learning according to claim 1, wherein the step 1 is specifically as follows:

for a triple in the traditional Chinese medicine knowledge graph, let the notation of the triple be $(e_h, r, e_t)$, where $e_h$, $r$ and $e_t$ represent the head entity, the relation and the tail entity of the knowledge graph, respectively; first, the entities in the $d$-dimensional entity space are projected by the matrix $W_r \in \mathbb{R}^{k \times d}$ into the $k$-dimensional space of relation $r$, giving the relation-space embeddings $e_h^r = W_r e_h$ and $e_t^r = W_r e_t$; the embeddings are then optimized under the translation principle $e_h^r + e_r \approx e_t^r$, where $e_r$ is the embedded representation of relation $r$ in the $k$-dimensional relation space; in this way, every entity in the traditional Chinese medicine knowledge graph finally obtains its initial embedding after training with the TransR model.

3. The method for generating a prescription of traditional Chinese medicine based on knowledge-graph and group representation learning according to claim 2, wherein the step 2 is specifically as follows:

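step 21. let $e_h$ denote the initial embedding of an entity in the knowledge graph obtained from the TransR embedding of step 1; the entities directly connected to $e_h$ are called its direct neighbors; let $e_{\mathcal{N}_h}$ denote the aggregate representation of the neighbors of $e_h$, as shown in formula (1):

$$e_{\mathcal{N}_h} = \sum_{(e_h, r, e_t) \in \mathcal{N}_h} \pi(e_h, r, e_t)\, e_t \tag{1}$$

where $\mathcal{N}_h$ is the set of triples with $e_h$ as head entity and $\pi(e_h, r, e_t)$ is the weight of neighbor entity $e_t$ in the aggregate representation, which is also understood as the importance of relation $r$ to entity $e_h$; the weight depends on the distance between $e_h$ and $e_t$ in the space of relation $r$, and is defined in formula (2):

$$\pi(e_h, r, e_t) = (W_r e_t)^{\top} \tanh(W_r e_h + e_r) \tag{2}$$

where $W_r \in \mathbb{R}^{k \times d}$ ($d$ denotes the embedding dimension) is a trainable weight matrix and $e_r$ is the embedded representation of relation $r$; finally, the weights are normalized over all direct neighbors of $e_h$ with the softmax function;

step 22. after the neighbor aggregate representation $e_{\mathcal{N}_h}$ of entity $e_h$ is obtained, it is used to update the original embedding $e_h$; the representation of entity $e_h$ after aggregation and update with direct-neighbor information is $e_h^{(1)} = f_{agg}(e_h, e_{\mathcal{N}_h})$, where $f_{agg}(\cdot)$ is the aggregation function defined in formula (3):

$$f_{agg}(e_h, e_{\mathcal{N}_h}) = \mathrm{LeakyReLU}\big(W_1 (e_h + e_{\mathcal{N}_h})\big) + \mathrm{LeakyReLU}\big(W_2 (e_h \odot e_{\mathcal{N}_h})\big) \tag{3}$$

where $W_1$ and $W_2$ are trainable weight matrices whose size is determined by the embedding dimension $d$, $\odot$ denotes the element-wise product, and LeakyReLU is the activation function;

step 23. on the basis of step 21 and step 22, more propagation layers are stacked to obtain the high-order neighbor aggregate representation of each entity; the entity embeddings are updated recursively in an $l$-layer network, and information propagates node by node through the knowledge graph; after propagation through $l$ layers, the final representation of entity $e_h$ contains the information of the high-order neighbor entities that $e_h$ can reach within $l$ steps.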

4. The method for generating a prescription of traditional Chinese medicine based on knowledge-graph and group representation learning according to claim 3, wherein the step 3 is specifically as follows:

step 31. the symptoms of each prescription sample are defined as a group $S_p = \{s_i \mid s_i \in S\}$, where $s_i$ denotes the $i$-th symptom and $S$ is the set of all symptoms in the dataset; the syndrome induction process uses an attention mechanism to learn the influence of each symptom on its group, i.e. the weight of each symptom within a symptom group; the weight is learned by an attention network, and the weight $\alpha(S_p, s_i)$ of each symptom $s_i$ in the symptom set $S_p$ is defined in formula (4):

$$\alpha(S_p, s_i) = h^{\top} W_{att}\, s_i \tag{4}$$

where $h$ and $W_{att}$ are trainable parameters and $s_i$ is the embedded representation of symptom $s_i$ obtained in step 2; after the weight of each symptom in a symptom group is obtained, the weights are normalized with the softmax function to give the final influence score $\tilde{\alpha}(S_p, s_i)$ of each symptom in the group, as shown in formula (5):

$$\tilde{\alpha}(S_p, s_i) = \frac{\exp(\alpha(S_p, s_i))}{\sum_{s_j \in S_p} \exp(\alpha(S_p, s_j))} \tag{5}$$

step 32. on the basis of step 31, the representation of the symptom group $S_p$, i.e. the representation $s_d$ of the latent syndrome of each symptom combination, is defined in formula (6):

$$s_d = \sum_{s_i \in S_p} \tilde{\alpha}(S_p, s_i)\, s_i \tag{6}$$

then the nonlinearity of a single-layer MLP is used to learn a more expressive syndrome representation, as defined in formula (7):

$$s_d = \mathrm{ReLU}(W^{mlp} \cdot s_d + b^{mlp}) \tag{7}$$

where $W^{mlp}$ and $b^{mlp}$ are learnable parameters and ReLU is the activation function; at this point, the attention-based symptom aggregation has produced the latent syndrome representation of each prescription sample;

step 33. the latent syndrome of each prescription sample obtained in the above steps interacts with the herbs to predict the likelihood that each herb is suitable for treating the set of symptoms, the predicted score being calculated with the inner product, as shown in formula (8):

$$\hat{y}(S_p, h) = s_d^{\top}\, \mathbf{h} \tag{8}$$

where $\hat{y}(S_p, h)$ is the likelihood that herb $h$ is suitable for treating the symptom group $S_p$, i.e. the latent syndrome $s_d$, and $\mathbf{h}$ is the embedded representation of the herb entity in the knowledge graph; finally, the top $N$ herbs with the highest scores are output as the prescription for the input symptom combination.