CN114610897A - Medical knowledge graph relation prediction method based on graph attention mechanism - Google Patents
Medical knowledge graph relation prediction method based on graph attention mechanism
- Publication number
- CN114610897A (application CN202210181938.2A)
- Authority
- CN
- China
- Prior art keywords
- entity
- embedding
- matrix
- attention
- vector
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- G06F16/367 — Information retrieval; creation of semantic tools; ontology
- G06F16/288 — Information retrieval of structured data; entity relationship models
Abstract
A medical knowledge graph relation prediction method based on a graph attention mechanism, belonging to the field of electronic information. The invention comprises the following three points: (1) different weights (attention values) are assigned to neighboring nodes, and attention is propagated through iterative, layer-by-layer computation; (2) auxiliary edges are introduced between multi-hop neighbors, enabling effective propagation of knowledge flow between entities, and an embedding model based on graph attention is constructed; (3) ConvKB is applied as a decoder to effectively capture the associations between entities and their neighbors. For the relation prediction task in medical knowledge graphs, the invention extends the graph attention mechanism to build a graph-attention-based embedding model that captures entity and relation features within the multi-hop neighborhood of a given entity, thereby completing the association relations among entities in the medical knowledge graph.
Description
Technical Field
The invention belongs to the field of electronic information and relates to a technology, based on graph neural networks, that can be applied to relation prediction in medical knowledge graphs.
Background
A knowledge graph is a structured representation of real-world information; even the most advanced knowledge graphs are incomplete and under continuous refinement. Relation prediction is a technique for predicting missing facts from the entities already present in a knowledge graph, and can thus complete and enhance it. Recent studies have shown that models based on convolutional neural networks (CNNs) generate rich, expressive feature embeddings and therefore perform well in relation prediction. However, such models process triplets independently and fail to capture the complex, hidden information inherent in the local neighborhood around a triplet.
To address these shortcomings of convolutional models in knowledge graph relation prediction, a feature embedding method based on a graph attention mechanism is proposed, which captures the association relations between entities and their neighborhoods.
Disclosure of Invention
To address the problem that models based on translation distance and convolutional neural networks (CNNs) can only process single triplets independently and thus struggle to capture the relations within the neighborhood of a given entity, the invention provides an attention-based graph embedding method for relation prediction and realizes a more expressive knowledge graph relation prediction technique.
The invention comprises the following three points:
(1) Different weights (attention values) are assigned to neighboring nodes, and attention is propagated through iterative, layer-by-layer computation.
(2) Auxiliary edges are introduced between multi-hop neighbors, enabling effective propagation of knowledge flow between entities, and an embedding model based on graph attention is constructed.
(3) ConvKB is applied as a decoder to effectively capture the associations between an entity and its neighbors.
The core algorithm of the invention is as follows:
(1) graph attention-based graph embedding method for relation prediction
Entities in a knowledge graph play different roles under different relations, whereas Graph Attention Networks (GATs) ignore the role of relations. A novel graph-attention-based graph embedding method is therefore proposed, which incorporates the features of both relations and neighboring nodes into the attention mechanism.
Unlike GATs, the inputs to each layer of the model contain entity embedding matrices and relationship embedding matrices.
The entity embedding matrix H is given by formula (1):
H ∈ R^(Ne×T)  (1)
where H is the feature matrix of the entities, Ne is the total number of entities, and T is the feature dimension of each entity embedding.
The relation embedding matrix G is given by formula (2):
G ∈ R^(Nr×P)  (2)
where G is the feature matrix of the relations, Nr is the number of relations, and P is the feature dimension of the relation embeddings.
Referring to the network architecture of FIG. 1, the updated embedding matrices H′ and G′ are computed from the input matrices H and G as follows:
the method comprises the following steps: defining a set of entities E ═ { E ] in a medical knowledge graph1,…,ei,…,en},eiIs the embedding of the ith entity. First, study with eiRepresentation of each triplet associated to obtain entity ei
New embedding of (2). By targeting specific tripletsThe entity and relationship feature vectors of (3) perform a linear transformation to learn these embeddings as in equation (3).
Wherein,is a tripletIs represented by a vector of (a).Are respectively entity ei、ejAnd relation rkEmbedded representation of W1Is a linear transformation matrix.
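As an illustrative sketch only (the dimensions and random values are assumptions, not the patented configuration), formula (3) amounts to concatenating the three feature vectors of a triplet and applying one linear map:

```python
import numpy as np

rng = np.random.default_rng(0)

T, P, D = 4, 4, 8   # entity dim, relation dim, output dim (assumed values)

# Toy embeddings for one triplet (e_i, r_k, e_j)
h_i = rng.standard_normal(T)
g_k = rng.standard_normal(P)
h_j = rng.standard_normal(T)

W1 = rng.standard_normal((D, T + P + T))   # linear transformation matrix W1

# Formula (3): c_ijk = W1 [h_i ; g_k ; h_j]
c_ijk = W1 @ np.concatenate([h_i, g_k, h_j])
```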
Step two: the importance of each triplet, i.e. the attention coefficient b_ijk, is obtained following the same idea as GATs. As shown in formula (4), a linear transformation is first applied and a non-linear activation function then yields b_ijk, where W2 is a linear transformation matrix:
b_ijk = LeakyReLU(W2 c_ijk)  (4)
The attention coefficients are then normalized using formula (5) to obtain the relative attention value α_ijk:
α_ijk = softmax_jk(b_ijk) = exp(b_ijk) / Σ_{n∈Ni} Σ_{r∈Rin} exp(b_inr)  (5)
where Ni denotes the set of entities adjacent to ei, Rin denotes the set of relations connecting entities ei and en, and b_inr is the attention coefficient of a neighboring entity of ei.
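Formulas (4) and (5) can be sketched in numpy as follows; the LeakyReLU slope 0.2 follows the embodiment, while the sizes and toy values are assumptions:

```python
import numpy as np

rng = np.random.default_rng(1)

def leaky_relu(x, slope=0.2):
    return np.where(x > 0, x, slope * x)

D, n = 8, 5                        # feature dim and number of incident triplets
C = rng.standard_normal((n, D))    # c_ijk vectors from formula (3)
W2 = rng.standard_normal(D)        # W2 maps each c_ijk to a scalar score

b = leaky_relu(C @ W2)             # formula (4): raw attention coefficients

# Formula (5): softmax over all triplets incident on e_i
alpha = np.exp(b - b.max())        # max-shift for numerical stability
alpha /= alpha.sum()
```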
Once the normalized attention coefficients are obtained, the updated embedding vector is computed according to formula (6). The model adopts a multi-head attention mechanism so that it learns relevant information in different representation subspaces, which stabilizes the learning process. In addition, to reduce the output dimension, the final embedding vector is obtained by averaging over the heads:
h_i′ = (1/M) Σ_{m=1}^{M} σ( Σ_{j∈Ni} Σ_{k∈Rij} α_ijk^m c_ijk^m )  (6)
where M is the number of attention heads and σ denotes a non-linear function; j indexes the entities adjacent to entity ei, and k indexes the relations between entities ei and ej.
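A toy sketch of the multi-head average in formula (6); choosing tanh for σ is an assumption, as the description only says σ is a non-linear function:

```python
import numpy as np

rng = np.random.default_rng(2)

M, n, D = 4, 5, 8                  # heads, incident triplets, feature dim

alpha = rng.random((M, n))         # per-head attention weights (toy values)
alpha /= alpha.sum(axis=1, keepdims=True)   # each head is normalized
C = rng.standard_normal((M, n, D)) # per-head c_ijk vectors

sigma = np.tanh                    # non-linearity σ (tanh is an assumption)

# Formula (6): attention-weighted sum per head, non-linearity, head average
per_head = sigma(np.einsum('mn,mnd->md', alpha, C))
h_i_new = per_head.mean(axis=0)    # averaging reduces the output dimension
```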
Step three: as shown in formula (7), a weight matrix WR is used to linearly transform the relation matrix G, giving the new embedding matrix:
G′ = G WR  (7)
where WR ∈ R^(T×T′) and T′ is the dimension of the relation embedding vectors output by this layer.
Step four: an entity weight matrix WE ∈ R^(Ti×Tj) is introduced, where Ti and Tj are the dimensions of the initial and final entity embedding vectors respectively, and Hi is the entity embedding matrix input to the model. According to formula (8), the initial entity embedding Hi is linearly transformed to obtain the transformed entity embedding Ht:
Ht = WE Hi  (8)
Step five: according to formula (9), the transformed entity embedding Ht is added to the entity embedding Hf obtained from the last attention layer, yielding the updated entity embedding H″:
H″ = Ht + Hf  (9)
where Hf ∈ R^(Ne×Tf) is the entity embedding matrix output by the last attention layer, Ne is the total number of entities, and Tf is the feature dimension of the final entity embedding vectors.
In addition, k-hop adjacency (k > 1; the dashed directed segments in FIG. 2) is defined as an auxiliary relation in the knowledge graph, and the embedding of an auxiliary relation is the sum of the embeddings of all relations along the directed path. Thus, for a multi-layer model, the updated embedding vectors at the s-th layer can be computed by aggregating neighbors within s hops.
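The auxiliary-edge construction can be sketched for the 2-hop case as follows; the graph and relation embeddings are made-up toy data:

```python
import numpy as np

# Toy directed graph: edges are (head, relation_id, tail); names are invented
edges = [("a", 0, "b"), ("b", 1, "c"), ("c", 2, "d")]
rel_emb = {0: np.array([1.0, 0.0]),
           1: np.array([0.0, 1.0]),
           2: np.array([1.0, 1.0])}

def two_hop_auxiliary_edges(edges, rel_emb):
    """Add an auxiliary edge for every directed 2-hop path; its embedding is
    the sum of the relation embeddings along the path (the k=2 case)."""
    aux = {}
    for h1, r1, t1 in edges:
        for h2, r2, t2 in edges:
            if t1 == h2:                     # path h1 -r1-> t1 -r2-> t2
                aux[(h1, t2)] = rel_emb[r1] + rel_emb[r2]
    return aux

aux = two_hop_auxiliary_edges(edges, rel_emb)
# a -> b -> c yields the auxiliary edge (a, c), embedded as the sum of
# the embeddings of relations 0 and 1
```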
The graph attention network takes the triplets present in the knowledge graph, t_ijk = (ei, rk, ej), as valid triplets, i.e. positive training examples. Invalid triplets t′_ijk, formed by randomly replacing the head or tail entity of a triplet with another entity, serve as negative training examples.
Meanwhile, the invention learns the embeddings using the idea of a translation scoring function: for a given valid triplet t_ijk = (ei, rk, ej), the embeddings satisfy h_i + g_k ≈ h_j, where ei is an entity, ej is a neighboring entity of ei, and rk is the relation between ei and ej; h_i is the embedding vector of entity ei, g_k is the embedding vector of relation rk, and h_j is the embedding vector of entity ej.
In model training, the entity and relation embeddings are learned by minimizing the L1 dissimilarity norm d_t = ‖h_i + g_k − h_j‖_1, using the margin loss function shown in formula (10):
L = Σ_{t∈S} Σ_{t′∈S′} max( d_t + γ − d_{t′}, 0 )  (10)
where γ > 0 is a margin hyper-parameter, S is the set of correct triplets and S′ is the set of invalid triplets. S′ is constructed according to formula (11) and contains triplets obtained by replacing the head entity as well as triplets obtained by replacing the tail entity:
S′ = { (e′_i, rk, ej) | e′_i ∈ E∖{ei} } ∪ { (ei, rk, e′_j) | e′_j ∈ E∖{ej} }  (11)
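A minimal sketch of the negative sampling of formula (11) and the margin loss of formula (10), on made-up embeddings (γ = 5 follows the embodiment):

```python
import numpy as np

rng = np.random.default_rng(3)

ent = {e: rng.standard_normal(4) for e in "abcd"}   # toy entity embeddings
rel = {0: rng.standard_normal(4)}                   # toy relation embedding

def d(triple):
    """L1 dissimilarity ||h_i + g_k - h_j||_1 of a triplet (h, r, t)."""
    h, r, t = triple
    return np.abs(ent[h] + rel[r] - ent[t]).sum()

def corrupt(triple):
    """Formula (11): replace the head or the tail with another entity."""
    h, r, t = triple
    others = [e for e in ent if e not in (h, t)]
    repl = others[int(rng.integers(len(others)))]
    return (repl, r, t) if rng.random() < 0.5 else (h, r, repl)

def margin_loss(pos, neg, gamma=5.0):
    """Formula (10): hinge loss pushing d(pos) below d(neg) by margin γ."""
    return max(d(pos) + gamma - d(neg), 0.0)

pos = ("a", 0, "b")
neg = corrupt(pos)
loss = margin_loss(pos, neg)
```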
(2) ConvKB decoder based on convolutional neural network
The model uses ConvKB as decoder: convolutional layers analyze the global embedding features of the triplet t_ijk = (ei, rk, ej) across different dimensions, thereby generalizing the model's transition characteristics. During decoding, the model computes multiple feature-map scores according to formula (12):
f(t_ijk) = concat_{q=1..Ω}( g([h_i, g_k, h_j] ∗ ω_q) ) · W  (12)
where ω_q denotes the q-th filter, Ω is the number of filters, ∗ is the convolution operator, g is a non-linear activation function, and W ∈ R^(Ωk×1) is a linear transformation matrix used to compute the final score of the triplet.
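A toy numpy sketch of the ConvKB score in formula (12), under assumed sizes; LeakyReLU with slope 0.2 is used as the activation, following the embodiment:

```python
import numpy as np

rng = np.random.default_rng(4)

k, Omega = 4, 3                    # embedding dim and filter count (assumed)
h_i, g_k, h_j = (rng.standard_normal(k) for _ in range(3))

A = np.stack([h_i, g_k, h_j], axis=1)       # k x 3 "image" of the triplet
filters = rng.standard_normal((Omega, 3))   # each ω_q is a 1 x 3 filter
W = rng.standard_normal(Omega * k)          # linear layer W in R^{Ωk×1}

def leaky_relu(x, slope=0.2):
    return np.where(x > 0, x, slope * x)

# Formula (12): convolve each row of A with every filter, concatenate the
# Ω feature maps (each of length k), then apply the linear layer W.
feature_maps = leaky_relu(A @ filters.T)    # shape (k, Ω)
score = float(feature_maps.T.reshape(-1) @ W)
```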
To improve the generalization ability of the model, the soft-margin loss function shown in formula (13) is used in training to compute the loss:
L = Σ_{t∈S∪S′} log( 1 + exp( l_t · f(t) ) ) + (λ/2)‖W‖²₂  (13)
where l_t is a coefficient indicating positive and negative examples: l_t = 1 when t ∈ S, and l_t = −1 when t ∈ S′; λ is the hyper-parameter of the L2-norm regularization.
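The soft-margin loss of formula (13) can be sketched directly; the scores and the size of W are made-up values:

```python
import numpy as np

def soft_margin_loss(scores, labels, W, lam=1e-5):
    """Formula (13): soft-margin loss over triplet scores f(t) with labels
    l_t (+1 for valid, -1 for invalid), plus L2 regularization of W."""
    scores, labels = np.asarray(scores), np.asarray(labels)
    return np.log1p(np.exp(labels * scores)).sum() + lam / 2 * np.sum(W ** 2)

scores = np.array([-3.0, 2.0])   # f(t) for one valid and one invalid triplet
labels = np.array([1.0, -1.0])   # l_t = 1 for t in S, -1 for t in S'
W = np.ones(12)
loss = soft_margin_loss(scores, labels, W)
```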
Effects of the invention
For the relation prediction task in the medical knowledge graph, an embedding model based on graph attention is constructed by extending the graph attention mechanism, and the entity and relation features within the multi-hop neighborhood of a given entity are captured, thereby completing the association relations among entities in the medical knowledge graph.
Drawings
FIG. 1 is a diagram of a network architecture;
FIG. 2 is a schematic view of an auxiliary relational edge;
FIG. 3 is a schematic illustration of an attention mechanism;
FIG. 4 shows the structure of ConvKB.
Detailed Description
In order to make the objects, technical solutions and advantages of the present application more apparent, the technical solutions of the present application will be described in detail and completely with reference to the following specific embodiments of the present application and the accompanying drawings.
Knowledge graphs constructed at an early stage suffer from limited knowledge sources, so their knowledge coverage is insufficient and a large amount of related knowledge is missing. Therefore, in the medical field, relation prediction is needed to complete the knowledge in medical knowledge graphs.
The embodiment provides a knowledge graph relation completion method flow, which is implemented by the following steps:
1) GATs are trained to encode entity and relationship information.
2) ConvKB is trained as a decoder to perform the relation prediction task.
The relevant definitions of the invention are given below:
Definition 1 (knowledge graph, G): G = (E, R) represents all entities and relations contained in the knowledge graph.
Definition 2 (entity set, E): E = {e1, …, ei, …, en} represents all entity nodes in the knowledge graph, corresponding to the entity set in the knowledge base.
Definition 3 (relation set, R): R = {r1, …, ri, …, rn} represents all relation edges in the knowledge graph, corresponding to the relation set in the knowledge base.
Definition 4 (triplet, t_ijk): t_ijk = (ei, rk, ej), where ei is the head entity, rk the relation and ej the tail entity, with ei, ej ∈ E and rk ∈ R. A triplet is also called a piece of knowledge.
1) First stage
The GAT is trained to encode the information of entities and relations in the graph, yielding better, more expressive embeddings.
S1: and acquiring a medical knowledge map to be processed, and converting all knowledge in the knowledge map into a knowledge base file in a triple storage form. The Neo4j graph database is used to convert the knowledge graph into an RDF repository file in triple storage.
S2: because the computer cannot recognize the triples in text form, it needs to be converted into a word vector space representation before being input to the neural network for relationship determination.
The invention learns the initial embeddings with TransE, following the idea of a translation scoring function: for a given valid triplet t_ijk = (ei, rk, ej), the embeddings satisfy h_i + g_k ≈ h_j, where ei is an entity, ej is a neighboring entity of ei, and rk is the relation between them; h_i, g_k and h_j are the embedding vectors of ei, rk and ej respectively.
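The TransE property h_i + g_k ≈ h_j can be illustrated with toy vectors (the near-fit is constructed by hand here, not trained):

```python
import numpy as np

rng = np.random.default_rng(5)

# Toy TransE-style embeddings: training drives h_i + g_k toward h_j
h_i = rng.standard_normal(4)
g_k = rng.standard_normal(4)
h_j = h_i + g_k + 0.01 * rng.standard_normal(4)   # near-perfect fit (toy)

residual = np.abs(h_i + g_k - h_j).sum()          # the L1 dissimilarity d_t
```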
The entity embedding matrix H is given by formula (1):
H ∈ R^(Ne×T)  (1)
where H is the feature matrix of the entities, Ne is the total number of entities, and T is the feature dimension of each entity embedding.
The relation embedding matrix G is given by formula (2):
G ∈ R^(Nr×P)  (2)
where G is the feature matrix of the relations, Nr is the number of relations, and P is the feature dimension of the relation embeddings.
S3: for each entity, it is necessary to learn the vector representation of all triples associated with it, to obtain a new embedding of the entity, by means of the specific triplesThe entity and relationship feature vectors of (3) perform a linear transformation to learn these embeddings as in equation (3).
Wherein,is a tripletIs represented by a vector.Are respectively entity ei、ejAnd relation rkEmbedded representation of W1Is a linear transformation matrix.
S4: in order to map the input features to the output feature space with higher dimension, a linear transformation is first performed, and then a nonlinear activation function is applied to obtain the attention coefficient bijkRepresenting the importance of each triplet.
W is shown in the formula (4)2Is a linear transformation matrix. In order to avoid the problems of gradient disappearance and the like, LeakyReLU is used as an activation function, and the value of the hyper-parameter of the LeakyReLU is 0.2 through multiple comparison experiments and references.
bijk=LeakyReLU(W2cijk) (4)
S5: applying softmax operation to the original attention scores obtained by all incoming edges of a node, carrying out normalization on the attention coefficient according to formula (5) to obtain a relative attention value alphaijk。
Wherein N isiDenotes all ofiSet of adjacent entities, RinRepresenting a connecting entity ei、enSet of relationships, binrIs represented byiAttention coefficients of neighboring entities.
S6: after the normalized attention coefficient is obtained, the updated embedding vector is calculated according to formula (8).
The model adopts a multi-head attention mechanism, so that the model learns related information in different expression subspaces, and the stability of the learning process is ensured. In addition, in order to reduce the output dimension, a method of calculating an average value is adopted to obtain a final embedded vector.
Where M is the number of attention heads and σ represents a non-linear function. j represents and entity eiAdjacent entities, k representing entity eiWith entity ejThe relationship between them.
In addition, the method applies a dropout operation that discards the activation of neurons with probability 0.3, avoiding problems such as overfitting.
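A sketch of dropout with drop probability 0.3 as stated; rescaling the surviving activations by 1/(1−p) ("inverted" dropout) is a common convention and an assumption here:

```python
import numpy as np

rng = np.random.default_rng(6)

def dropout(x, p=0.3):
    """Inverted dropout: zero each activation with probability p (0.3 here)
    and rescale survivors by 1/(1-p) so the expected value is unchanged."""
    mask = rng.random(x.shape) >= p
    return x * mask / (1.0 - p)

h = np.ones(1000)
h_drop = dropout(h)
# roughly 70% of entries survive, scaled to 1/0.7; the rest are zero
```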
S7: as shown in equation (9), a weight matrix W is usedRLinear transformation of the relation matrix GA new embedded matrix is obtained.
G′=GWR (7)
Wherein, WR∈RT×T′And T' is the dimension of the relational embedding vector of the layer output.
S8: embedding the vector H into the initial entity according to equation (8)iLinear transformation is carried out to obtain a transformed entity embedding vector Ht。
Ht=WEHi (8)
Wherein the entity weight matrix WE( Feature matrix, T, representing a relationshipi、TjDimension representing initial and final entity embedding vectors, respectively), HiVectors are embedded for the entities of the input model.
S9: when learning new embedded vectors, their original embedded information may be lost, so the original entity embedded vector H is based on equation (9)tEntity embedding vector H obtained from last attention layerfThe addition yields the update entity embedding vector H ".
H”=Ht+Hf (9)
Wherein,(feature matrix being entity) embedding matrix for entity output of last attention layer, NeIs the total number of entities, TfIs the characteristic dimension of the final entity embedding vector.
S10: the graph attention network will take triples present in the knowledge graph as valid triplesAs a positive example of training. And randomly replacing a head entity or a tail entity in the triple by the entity to form an invalid triple t'ijAs a negative example of training. In this training phase, the ratio of the number of valid triples to invalid triples is 2: 1.
In model training, the entity and relation embeddings are learned by minimizing the L1 dissimilarity norm d_t = ‖h_i + g_k − h_j‖_1, using the margin loss function shown in formula (10):
L = Σ_{t∈S} Σ_{t′∈S′} max( d_t + γ − d_{t′}, 0 )  (10)
where γ > 0 is a margin hyper-parameter, set to 5; S is the set of correct triplets and S′ the set of invalid triplets. S′ is constructed according to formula (11) and contains triplets obtained by replacing the head entity as well as triplets obtained by replacing the tail entity, where E∖{ei} denotes the entity set E with ei removed:
S′ = { (e′_i, rk, ej) | e′_i ∈ E∖{ei} } ∪ { (ei, rk, e′_j) | e′_j ∈ E∖{ej} }  (11)
In addition, the model is continuously optimized using the Adam optimizer with a learning rate of 0.001.
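A minimal numpy sketch of one Adam update with the stated learning rate 0.001; the β and ε values are the usual defaults, assumed here:

```python
import numpy as np

def adam_step(theta, grad, state, lr=0.001, b1=0.9, b2=0.999, eps=1e-8):
    """One Adam update with the learning rate 0.001 used in training."""
    m, v, t = state
    t += 1
    m = b1 * m + (1 - b1) * grad           # first-moment running average
    v = b2 * v + (1 - b2) * grad ** 2      # second-moment running average
    m_hat = m / (1 - b1 ** t)              # bias corrections
    v_hat = v / (1 - b2 ** t)
    theta = theta - lr * m_hat / (np.sqrt(v_hat) + eps)
    return theta, (m, v, t)

# Minimise f(θ) = ||θ||^2 from θ = (1, 1); the gradient is 2θ
theta = np.array([1.0, 1.0])
state = (np.zeros(2), np.zeros(2), 0)
for _ in range(2000):
    theta, state = adam_step(theta, 2.0 * theta, state)
```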
2) Second stage
The model uses ConvKB as decoder: convolutional layers analyze the global embedding features of the triplet t_ijk = (ei, rk, ej) across different dimensions, thereby generalizing the model's transition characteristics.
S11: the model calculates a plurality of feature mapping scores during decoding according to equation (12).
Wherein, ω isqDenotes the q-th filter, Ω is the number of filters, is the convolution operator, W ∈ RΩk×1Is a linear transformation matrix used to calculate the final scores of the triples.As shown in formula (3), is a tripletIs represented by a vector.
Based on repeated comparison experiments, LeakyReLU with a hyper-parameter value of 0.2 is adopted as the activation function to avoid problems such as vanishing gradients.
S12: in order to improve the generalization capability of the model, the soft boundary loss function shown in formula (13) is adopted in the model training to calculate the loss:
wherein,coefficient representing positive and negative cases whenWhen the temperature of the water is higher than the set temperature,when in useWhen the temperature of the water is higher than the set temperature,is a hyperparameter of L2 norm, and takes the value of 0.00001.
In this training phase, the ratio of the number of valid triples to invalid triples is 4: 1.
Claims (2)
1. A medical knowledge graph relation prediction method based on a graph attention mechanism, characterized by comprising the following steps:
(1) graph attention-based graph embedding method for relation prediction
The input of each layer of the model comprises an entity embedding matrix and a relation embedding matrix; the entity embedding matrix H is given by formula (1):
H ∈ R^(Ne×T)  (1)
where H is the feature matrix of the entities, Ne is the total number of entities, and T is the feature dimension of each entity embedding;
the relation embedding matrix G is given by formula (2):
G ∈ R^(Nr×P)  (2)
where G is the feature matrix of the relations, Nr is the number of relations, and P is the feature dimension of the relation embeddings;
the updated embedding matrices H′ and G′ are calculated from the input matrices H and G according to the following steps:
step one: define the set of entities E = {e1, …, ei, …, en} in the medical knowledge graph, where ei is the i-th entity; first, a vector representation of every triplet associated with ei is learned in order to obtain a new embedding of entity ei; these embeddings are learned by applying a linear transformation to the entity and relation feature vectors of a specific triplet t_ijk = (ei, rk, ej), as in formula (3):
c_ijk = W1 [h_i ; g_k ; h_j]  (3)
where c_ijk is the vector representation of triplet t_ijk; h_i, h_j and g_k are the embedded representations of entities ei, ej and relation rk respectively, and W1 is a linear transformation matrix;
step two: obtain the importance of each triplet, i.e. the attention coefficient b_ijk; as shown in formula (4), a linear transformation is first applied and a non-linear activation function then yields b_ijk, where W2 is a linear transformation matrix;
b_ijk = LeakyReLU(W2 c_ijk)  (4)
the attention coefficients are then normalized using formula (5) to obtain the relative attention value α_ijk:
α_ijk = softmax_jk(b_ijk) = exp(b_ijk) / Σ_{n∈Ni} Σ_{r∈Rin} exp(b_inr)  (5)
where Ni denotes the set of entities adjacent to ei, Rin denotes the set of relations connecting entities ei and en, and b_inr is the attention coefficient of a neighboring entity of ei;
after the normalized attention coefficients are obtained, the updated embedding vector is computed according to formula (6); the model adopts a multi-head attention mechanism so that it learns relevant information in different representation subspaces, stabilizing the learning process; in addition, to reduce the output dimension, the final embedding vector is obtained by averaging over the heads:
h_i′ = (1/M) Σ_{m=1}^{M} σ( Σ_{j∈Ni} Σ_{k∈Rij} α_ijk^m c_ijk^m )  (6)
where M is the number of attention heads and σ denotes a non-linear function; j indexes the entities adjacent to entity ei, and k indexes the relations between entities ei and ej;
step three: as shown in formula (7), a weight matrix WR is used to linearly transform the relation matrix G, giving the new embedding matrix;
G′ = G WR  (7)
where WR ∈ R^(T×T′) and T′ is the dimension of the relation embedding vectors output by this layer;
step four: an entity weight matrix WE ∈ R^(Ti×Tj) is introduced, where Ti and Tj are the dimensions of the initial and final entity embedding vectors respectively, and Hi is the entity embedding matrix input to the model; according to formula (8), the initial entity embedding Hi is linearly transformed to obtain the transformed entity embedding Ht;
Ht = WE Hi  (8)
step five: according to formula (9), the transformed entity embedding Ht is added to the entity embedding Hf obtained from the last attention layer, yielding the updated entity embedding H″;
H″ = Ht + Hf  (9)
where Hf ∈ R^(Ne×Tf) is the entity embedding matrix output by the last attention layer, Ne is the total number of entities, and Tf is the feature dimension of the final entity embedding vectors;
in addition, k-hop adjacency (k > 1) is defined as an auxiliary relation in the knowledge graph, and the embedding of an auxiliary relation is the sum of the embeddings of all relations along the directed path; thus, for a multi-layer model, the updated embedding vectors at the s-th layer can be computed by aggregating neighbors within s hops;
the graph attention network takes the triplets present in the knowledge graph, t_ijk = (ei, rk, ej), as valid triplets, i.e. positive training examples; invalid triplets t′_ijk, formed by randomly replacing the head or tail entity with another entity, serve as negative training examples;
the embeddings are learned using the idea of a translation scoring function, i.e. for a given valid triplet t_ijk = (ei, rk, ej) the embeddings satisfy h_i + g_k ≈ h_j, where ei is an entity, ej is a neighboring entity of ei, and rk is the relation between ei and ej; h_i is the embedding vector of entity ei, g_k is the embedding vector of relation rk, and h_j is the embedding vector of entity ej;
in model training, the entity and relation embeddings are learned by minimizing the L1 dissimilarity norm d_t = ‖h_i + g_k − h_j‖_1, using the margin loss function shown in formula (10);
L = Σ_{t∈S} Σ_{t′∈S′} max( d_t + γ − d_{t′}, 0 )  (10)
where γ > 0 is a margin hyper-parameter; S is the set of correct triplets and S′ the set of invalid triplets; S′ is constructed according to formula (11) and contains triplets obtained by replacing the head entity as well as triplets obtained by replacing the tail entity;
S′ = { (e′_i, rk, ej) | e′_i ∈ E∖{ei} } ∪ { (ei, rk, e′_j) | e′_j ∈ E∖{ej} }  (11)
(2) ConvKB decoder based on convolutional neural network
The model uses ConvKB as decoder: convolutional layers analyze the global embedding features of the triplet t_ijk = (ei, rk, ej) across different dimensions, thereby generalizing the model's transition characteristics; during decoding, the model computes multiple feature-map scores according to formula (12);
f(t_ijk) = concat_{q=1..Ω}( g([h_i, g_k, h_j] ∗ ω_q) ) · W  (12)
where ω_q denotes the q-th filter, Ω is the number of filters, ∗ is the convolution operator, g is a non-linear activation function, and W ∈ R^(Ωk×1) is a linear transformation matrix used to compute the final score of the triplet;
the loss is calculated using the soft-margin loss function shown in formula (13):
L = Σ_{t∈S∪S′} log( 1 + exp( l_t · f(t) ) ) + (λ/2)‖W‖²₂  (13)
where l_t = 1 when t ∈ S and l_t = −1 when t ∈ S′, and λ is the L2-norm regularization hyper-parameter.
2. The method of claim 1, wherein:
the following definitions apply:
definition 1 (knowledge graph, G): G = (E, R) represents all entities and relations contained in the knowledge graph;
definition 2 (entity set, E): E = {e1, …, ei, …, en} represents all entity nodes in the knowledge graph, corresponding to the entity set in the knowledge base;
definition 3 (relation set, R): R = {r1, …, ri, …, rn} represents all relation edges in the knowledge graph, corresponding to the relation set in the knowledge base;
definition 4 (triplet, t_ijk): t_ijk = (ei, rk, ej), where ei is the head entity, rk the relation and ej the tail entity, with ei, ej ∈ E and rk ∈ R; a triplet is also called a piece of knowledge;
first stage
Training the GAT to encode information of entities and relationships in the graph;
s1: obtain the medical knowledge graph to be processed and convert all knowledge in it into a knowledge base file in triplet storage form; the Neo4j graph database is used to convert the knowledge graph into an RDF knowledge base file stored as triplets;
s2: because a computer cannot directly use triplets in text form, they must be converted into a word-vector-space representation before being input to the neural network for relation determination;
following the idea of a translation scoring function, TransE is used to learn the initial embeddings, i.e. for a given valid triplet t_ijk = (ei, rk, ej) the embeddings satisfy h_i + g_k ≈ h_j, where ei is an entity, ej is a neighboring entity of ei, and rk is the relation between them; h_i is the embedding vector of entity ei, g_k is the embedding vector of relation rk, and h_j is the embedding vector of entity ej;
the entity embedding matrix H is given by formula (1):
H ∈ R^(Ne×T)  (1)
where H is the feature matrix of the entities, Ne is the total number of entities, and T is the feature dimension of each entity embedding;
the relation embedding matrix G is given by formula (2):
G ∈ R^(Nr×P)  (2)
where G is the feature matrix of the relations, Nr is the number of relations, and P is the feature dimension of the relation embeddings;
S3: for each entity, the vector representations of all triples associated with it need to be learned to obtain a new embedding of the entity; these embeddings are learned by applying the linear transformation of formula (3) to the entity and relation feature vectors of a specific triple t_ij^k:

c_ijk = W_1 [h_i ; h_j ; g_k]  (3)

wherein c_ijk is the vector representation of the triple t_ij^k; h_i, h_j and g_k are the embedded representations of entity e_i, entity e_j and relation r_k, respectively, and W_1 is a linear transformation matrix;
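A sketch of this linear transformation, formula (3), under assumed toy dimensions (the weights are random placeholders, not learned parameters):

```python
import numpy as np

rng = np.random.default_rng(0)
T, P, D = 4, 4, 6                          # entity dim, relation dim, output dim (toy)
W1 = rng.standard_normal((D, 2 * T + P))   # linear transformation matrix W_1

def triple_vector(h_i, h_j, g_k, W1):
    """c_ijk = W_1 · [h_i ; h_j ; g_k]: concatenate, then map linearly."""
    return W1 @ np.concatenate([h_i, h_j, g_k])

c_ijk = triple_vector(rng.standard_normal(T), rng.standard_normal(T),
                      rng.standard_normal(P), W1)
print(c_ijk.shape)  # (6,)
```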
S4: to map the input features into a higher-dimensional output feature space, a linear transformation is applied, followed by a nonlinear activation function, to obtain the attention coefficient b_ijk that represents the importance of each triple, as in formula (4):

b_ijk = LeakyReLU(W_2 c_ijk)  (4)

wherein W_2 is a linear transformation matrix; to avoid problems such as vanishing gradients, LeakyReLU is used as the activation function, with its hyper-parameter set to 0.2 based on repeated comparison experiments and the literature;
S5: a softmax operation is applied to the raw attention scores of all incoming edges of a node, normalizing the attention coefficients according to formula (5) to obtain the relative attention value α_ijk:

α_ijk = softmax_jk(b_ijk) = exp(b_ijk) / Σ_{n∈N_i} Σ_{r∈R_in} exp(b_inr)  (5)

wherein N_i denotes the set of entities adjacent to e_i, R_in denotes the set of relations connecting entities e_i and e_n, and b_inr is the attention coefficient of a triple involving a neighbor of e_i;
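Steps S4 and S5 together can be sketched as follows (the raw scores are invented; `normalize_attention` is a hypothetical helper implementing formulas (4) and (5) for one node):

```python
import numpy as np

def leaky_relu(x, slope=0.2):        # hyper-parameter 0.2 as in the text
    return np.where(x > 0, x, slope * x)

def normalize_attention(raw_scores):
    """Apply LeakyReLU (formula (4)) then softmax over all incoming
    triples of one node (formula (5))."""
    b = leaky_relu(raw_scores)
    e = np.exp(b - b.max())          # subtract the max for numerical stability
    return e / e.sum()

alpha = normalize_attention(np.array([1.0, -2.0, 0.5]))
print(alpha.argmax())  # 0 — the highest raw score keeps the largest weight
```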
S6: after the normalized attention coefficients are obtained, the updated embedding vector is calculated according to formula (6); the model adopts a multi-head attention mechanism so that it learns related information in different representation subspaces, which keeps the learning process stable; in addition, to limit the output dimensionality, the final embedding vector is obtained by averaging over the heads:

h′_i = σ( (1/M) Σ_{m=1}^{M} Σ_{j∈N_i} Σ_{k∈R_ij} α^m_ijk c^m_ijk )  (6)

wherein M is the number of attention heads and σ denotes a nonlinear function; j indexes the entities adjacent to entity e_i, and k indexes the relations between entity e_i and entity e_j;
furthermore, a dropout operation is applied, discarding the activations of some neurons with a probability of 0.3;
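A sketch of the multi-head averaging of formula (6) (random stand-ins for the per-head weighted sums; tanh is an assumed choice for σ):

```python
import numpy as np

rng = np.random.default_rng(1)
M, D = 3, 4     # number of attention heads, embedding dimension (toy sizes)

# Stand-ins for the per-head weighted sums Σ_j Σ_k α_ijk · c_ijk.
head_outputs = rng.standard_normal((M, D))

def sigma(x):   # the nonlinear function σ; tanh is an assumed choice here
    return np.tanh(x)

# Averaging the heads (instead of concatenating) keeps the output
# dimension at D rather than M·D, as the text describes.
h_new = sigma(head_outputs.mean(axis=0))
print(h_new.shape)  # (4,)
```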
S7: as shown in formula (7), the relation matrix G is linearly transformed with a weight matrix W_R to obtain a new embedding matrix:

G′ = G W_R  (7)

wherein W_R ∈ R^{P×T′}, and T′ is the dimension of the relation embedding vectors output by this layer;
S8: according to formula (8), the initial entity embedding matrix H^i is linearly transformed to obtain the transformed entity embedding matrix H^t:

H^t = W^E H^i  (8)

wherein W^E ∈ R^{T_i×T_f} is the entity weight matrix, T_i and T_f denote the dimensions of the initial and final entity embedding vectors, respectively, and H^i is the entity embedding matrix input to the model;
S9: when the new embedding vectors are learned, the original embedding information may be lost, so according to formula (9) the transformed entity embedding matrix H^t is added to the entity embedding matrix H^f obtained from the last attention layer to obtain the updated entity embedding matrix H″:

H″ = H^t + H^f  (9)

wherein H^f ∈ R^{N_e×T_f} is the entity embedding matrix output by the last attention layer, N_e is the total number of entities, and T_f is the feature dimension of the final entity embedding vectors;
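Steps S8 and S9 amount to a projection plus a residual connection; a minimal sketch with toy shapes (the projection is written here as H_i @ W_E in row-vector convention, which is an assumption about layout):

```python
import numpy as np

rng = np.random.default_rng(2)
N_e, T_i, T_f = 5, 4, 6                 # entities, initial dim, final dim (toy)

H_i = rng.standard_normal((N_e, T_i))   # initial entity embeddings
H_f = rng.standard_normal((N_e, T_f))   # output of the last attention layer
W_E = rng.standard_normal((T_i, T_f))   # entity weight matrix

H_t = H_i @ W_E        # project the initial embeddings, formula (8)
H_pp = H_t + H_f       # residual add preserves original info, formula (9)
print(H_pp.shape)  # (5, 6)
```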
S10: the graph attention network takes the triples present in the knowledge graph as valid triples t_ij^k, which serve as positive training examples, and forms invalid triples t′_ij by randomly replacing the head entity or the tail entity of a triple with another entity, which serve as negative training examples; in this training phase, the ratio of the number of valid triples to invalid triples is 2:1;
in model training, the entity and relation embeddings are learned by minimizing the L1 dissimilarity norm d_{t_ij} = ‖h_i + g_k − h_j‖_1, using the hinge (margin) loss function shown in formula (10):

L = Σ_{t_ij∈S} Σ_{t′_ij∈S′} max{ d_{t_ij} − d_{t′_ij} + γ, 0 }  (10)

wherein γ > 0 is a margin hyper-parameter with a value of 5; S is the set of valid triples and S′ is the set of invalid triples; S′ is constructed according to formula (11) and contains triples obtained by replacing the head entity as well as triples obtained by replacing the tail entity:

S′ = { (e′_i, r_k, e_j) | e′_i ∈ E∖{e_i} } ∪ { (e_i, r_k, e′_j) | e′_j ∈ E∖{e_j} }  (11)

wherein E∖{e_i} denotes the entity set E with entity e_i removed;
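The margin loss of formula (10) for one valid/invalid pair can be sketched as (γ = 5 as in the text; the distance values are invented):

```python
def margin_loss(d_valid: float, d_invalid: float, gamma: float = 5.0) -> float:
    """Hinge loss of formula (10): a valid triple should score at least
    gamma lower (closer) than its corrupted counterpart."""
    return max(d_valid - d_invalid + gamma, 0.0)

# Margin already satisfied: no gradient signal.
print(margin_loss(1.0, 8.0))  # 0.0
# Margin violated by 3: positive loss.
print(margin_loss(1.0, 3.0))  # 3.0
```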
in addition, an Adam optimizer with a learning rate of 0.001 is adopted to continuously optimize the model;
2) Second stage
The model uses ConvKB as the decoder; its convolutional layers analyze the global embedding features of the triple t_ij^k across different dimensions, and the transition characteristics of the model are then generalized;
S11: during decoding, the model computes scores over multiple feature maps according to formula (12):

f(t_ij^k) = ( ∥_{q=1}^{Ω} LeakyReLU([h_i, g_k, h_j] ∗ ω_q) ) · W  (12)

wherein ω_q denotes the q-th filter, Ω is the number of filters, ∗ is the convolution operator, and W ∈ R^{Ωk×1} is a linear transformation matrix used to compute the final score of the triple t_ij^k, whose vector representation was introduced in formula (3);
LeakyReLU with a hyper-parameter value of 0.2 is adopted as the activation function;
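A sketch of this ConvKB-style scoring, formula (12), with toy sizes (random placeholder weights; each 1×3 filter slides over the k rows of the stacked triple matrix):

```python
import numpy as np

rng = np.random.default_rng(3)
k, Omega = 4, 3                     # embedding dimension, number of filters

A = rng.standard_normal((k, 3))     # k×3 matrix stacking [h_i, g_k, h_j]
filters = rng.standard_normal((Omega, 3))  # Ω filters, each of size 1×3
W = rng.standard_normal(Omega * k)  # final linear transformation of (12)

def leaky_relu(x, slope=0.2):
    return np.where(x > 0, x, slope * x)

# Each filter produces a k-length feature map; the Ω maps are
# concatenated and projected down to a single triple score.
feature_maps = np.concatenate(
    [leaky_relu((A * f).sum(axis=1)) for f in filters])
score = float(feature_maps @ W)
print(feature_maps.shape)  # (12,)
```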
S12: to improve the generalization ability of the model, the soft-margin loss function shown in formula (13) is adopted during model training to calculate the loss:

L = Σ_{t∈S∪S′} log(1 + exp(l_t · f(t))) + (λ/2)‖W‖²_2  (13)

wherein l_t = 1 for t ∈ S and l_t = −1 for t ∈ S′, and λ is the L2 regularization coefficient.
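A sketch of a soft-margin loss in the ConvKB style (the label convention l_t = +1 for valid and −1 for invalid triples, and the λ value, are assumptions here):

```python
import numpy as np

def soft_margin_loss(scores, labels, w=None, lam=0.001):
    """Soft-margin loss in the ConvKB style: l_t = +1 for valid triples,
    -1 for invalid ones, so valid triples are pushed toward low
    (negative) scores. lam is an assumed L2 coefficient."""
    loss = np.log1p(np.exp(labels * scores)).sum()
    if w is not None:
        loss += lam / 2.0 * np.sum(w ** 2)   # L2 regularization on W
    return float(loss)

# A valid triple with a strongly negative score costs almost nothing;
# the same magnitude with the wrong sign is heavily penalized.
good = soft_margin_loss(np.array([-10.0]), np.array([1.0]))
bad = soft_margin_loss(np.array([10.0]), np.array([1.0]))
print(good < bad)  # True
```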
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210181938.2A CN114610897A (en) | 2022-02-25 | 2022-02-25 | Medical knowledge map relation prediction method based on graph attention machine mechanism |
Publications (1)
Publication Number | Publication Date |
---|---|
CN114610897A true CN114610897A (en) | 2022-06-10 |
Family
ID=81858174
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202210181938.2A Pending CN114610897A (en) | 2022-02-25 | 2022-02-25 | Medical knowledge map relation prediction method based on graph attention machine mechanism |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN114610897A (en) |
Cited By (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN115861715A (en) * | 2023-02-15 | 2023-03-28 | 创意信息技术股份有限公司 | Knowledge representation enhancement-based image target relation recognition algorithm |
CN117010494A (en) * | 2023-09-27 | 2023-11-07 | 之江实验室 | Medical data generation method and system based on causal expression learning |
CN117435747A (en) * | 2023-12-18 | 2024-01-23 | 中南大学 | Few-sample link prediction drug recycling method based on multilevel refinement network |
CN117610662A (en) * | 2024-01-19 | 2024-02-27 | 江苏天人工业互联网研究院有限公司 | Knowledge graph embedding method for extracting representative sub-graph information through GAT |
CN117747124A (en) * | 2024-02-20 | 2024-03-22 | 浙江大学 | Medical large model logic inversion method and system based on network excitation graph decomposition |
CN117952198A (en) * | 2023-11-29 | 2024-04-30 | 海南大学 | Time sequence knowledge graph representation learning method based on time characteristics and complex evolution |
WO2024138803A1 (en) * | 2022-12-29 | 2024-07-04 | 之江实验室 | Drug repositioning method and system fusing multi-source knowledge graph |
Non-Patent Citations (1)
Title |
---|
DEEPAK NATHANI et al.: "Learning Attention-based Embeddings for Relation Prediction in Knowledge Graphs", arXiv, 30 June 2019 (2019-06-30), pages 1-10 * |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |