CN114610897A - Medical knowledge graph relation prediction method based on graph attention mechanism - Google Patents

Medical knowledge graph relation prediction method based on graph attention mechanism

Info

Publication number: CN114610897A
Application number: CN202210181938.2A
Authority: CN (China)
Prior art keywords: entity, embedding, matrix, attention, vector
Legal status: Pending
Other languages: Chinese (zh)
Inventors: 何坚, 苗宁, 张仰, 陈建辉
Current Assignee: Beijing University of Technology
Original Assignee: Beijing University of Technology
Application filed by Beijing University of Technology
Priority to CN202210181938.2A
Publication of CN114610897A

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/36Creation of semantic tools, e.g. ontology or thesauri
    • G06F16/367Ontology
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/28Databases characterised by their database models, e.g. relational or object models
    • G06F16/284Relational databases
    • G06F16/288Entity relationship models


Abstract

A medical knowledge graph relation prediction method based on a graph attention mechanism belongs to the field of electronic information. The invention comprises the following three points: (1) different weights (attention values) are assigned to nearby nodes, and attention is propagated through iterative, layer-wise computation; (2) auxiliary edges are introduced between multi-hop neighbors so that knowledge flows effectively between entities, and a graph-attention-based embedding model is constructed; (3) ConvKB is applied as the decoder, effectively capturing the associations between entities and their neighbors. For the relation prediction task in medical knowledge graphs, the invention extends the graph attention mechanism to build a graph-attention-based embedding model that captures entity and relation features across the multi-hop neighborhood of a given entity, thereby completing the association relations among entities in the medical knowledge graph.

Description

Medical knowledge graph relation prediction method based on graph attention mechanism
Technical Field
The invention belongs to the field of electronic information and relates to a graph-neural-network-based technique applicable to relation prediction in medical knowledge graphs.
Background
A knowledge graph is a structured representation of real-world information, yet even the most advanced knowledge graphs are incomplete and under continuous refinement. Relation prediction is a technique that infers missing facts from the entities already present in a knowledge graph, and it can complete and enhance the graph. Recent studies have shown that models based on convolutional neural networks (CNNs) generate rich, more expressive feature embeddings and therefore perform well in relation prediction. However, such knowledge graph models process triples independently and fail to capture the complex, hidden information inherent in the local neighborhood around a triple.
To address these problems of convolutional neural network models in knowledge graph relation prediction, a feature embedding method based on the graph attention mechanism is proposed that captures the association relations between entities and their neighborhoods.
Disclosure of Invention
To address the limitation that translation-distance models and convolutional neural networks (CNNs) can only process single triples independently and struggle to capture relations within the neighborhood of a given entity, the invention proposes an attention-based graph embedding method for relation prediction, realizing a more expressive knowledge graph relation prediction technique.
The invention comprises the following three points:
(1) Different weights (attention values) are assigned to nearby nodes, and attention is propagated through iterative, layer-wise computation.
(2) Auxiliary edges are introduced between multi-hop neighbors so that knowledge flows effectively between entities, and a graph-attention-based embedding model is constructed.
(3) ConvKB is applied as the decoder, effectively capturing the associations between an entity and its neighbors.
The core algorithm of the invention is as follows:
(1) graph attention-based graph embedding method for relation prediction
Entities in the knowledge graph play different roles under different relations, yet Graph Attention Networks (GATs) ignore the role of relations in the knowledge graph. A novel graph-attention-based graph embedding method is therefore proposed that incorporates the features of both relations and neighboring nodes into the attention mechanism.
Unlike GATs, the inputs to each layer of the model contain entity embedding matrices and relationship embedding matrices.
The entity embedding matrix H is given by formula (1):

H ∈ R^{N_e × T}  (1)

where H denotes the feature matrix of the entities, N_e is the total number of entities, and T is the embedded feature dimension of each entity.

The relation embedding matrix G is given by formula (2):

G ∈ R^{N_r × P}  (2)

where G denotes the feature matrix of the relations, N_r is the number of relations, and P is the feature dimension of the relation embeddings.
Referring to the network architecture of FIG. 1, the updated embedding matrices H′ and G′ are calculated from the input matrices H and G as follows:
the method comprises the following steps: defining a set of entities E ═ { E ] in a medical knowledge graph1,…,ei,…,en},eiIs the embedding of the ith entity. First, study with eiRepresentation of each triplet associated to obtain entity ei
New embedding of (2). By targeting specific triplets
Figure BDA0003521544920000025
The entity and relationship feature vectors of (3) perform a linear transformation to learn these embeddings as in equation (3).
Figure BDA0003521544920000026
Wherein the content of the first and second substances,
Figure BDA0003521544920000027
is a triplet
Figure BDA0003521544920000028
Is represented by a vector of (a).
Figure BDA0003521544920000029
Are respectively entity ei、ejAnd relation rkEmbedded representation of W1Is a linear transformation matrix.
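A small sketch of formula (3), continuing the tensors H and G defined above; the output dimension T_out and the helper name triple_repr are assumptions for illustration.

```python
import torch
import torch.nn as nn

T_out = 200                                   # output feature dimension (assumed)
W1 = nn.Linear(T + T + P, T_out, bias=False)  # the linear map W_1 of formula (3)

def triple_repr(i: int, j: int, k: int) -> torch.Tensor:
    """c_ijk = W1 [h_i ; h_j ; g_k] for the triple (e_i, r_k, e_j)."""
    return W1(torch.cat([H[i], H[j], G[k]], dim=-1))
```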
Step two: the importance of each triple, i.e. the attention coefficient b_ijk, is obtained following the same idea as GATs. As shown in formula (4), a linear transformation is first performed and a non-linear activation function is then applied to obtain b_ijk, where W_2 is a linear transformation matrix:

b_ijk = LeakyReLU(W_2 c_ijk)  (4)

Furthermore, the attention coefficients are normalized with the softmax of formula (5) to obtain the relative attention values α_ijk:

α_ijk = softmax_jk(b_ijk) = exp(b_ijk) / Σ_{n ∈ N_i} Σ_{r ∈ R_in} exp(b_inr)  (5)

where N_i denotes the set of entities adjacent to e_i, R_in denotes the set of relations connecting entities e_i and e_n, and b_inr is the attention coefficient of a triple involving an entity adjacent to e_i.
After the normalized attention coefficients are obtained, the updated embedding vector is calculated according to formula (6). The model adopts a multi-head attention mechanism so that it learns relevant information in different representation subspaces, which stabilizes the learning process. In addition, to reduce the output dimension, the final embedding vector is obtained by averaging over the heads:

h′_i = (1/M) Σ_{m=1}^{M} σ( Σ_{j ∈ N_i} Σ_{k ∈ R_ij} α^m_ijk c^m_ijk )  (6)

where M is the number of attention heads and σ denotes a non-linear function; j indexes the entities adjacent to entity e_i, and k indexes the relations between entity e_i and entity e_j.
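The following sketch puts formulas (4)-(6) together for a batch of triples, assuming PyTorch. The grouping loop implementing the softmax of formula (5) is written for clarity rather than speed, and the per-head weight lists W1_heads and W2_heads are assumptions, as is the choice of sigmoid for σ.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def attention_layer(H, G, triples, W1_heads, W2_heads, num_heads=2):
    """One graph-attention layer over an (N, 3) LongTensor of
    (head, tail, relation) index triples, per formulas (3)-(6)."""
    h_i, h_j, g_k = H[triples[:, 0]], H[triples[:, 1]], G[triples[:, 2]]
    head_outputs = []
    for m in range(num_heads):
        c = W1_heads[m](torch.cat([h_i, h_j, g_k], dim=-1))              # formula (3)
        b = F.leaky_relu(W2_heads[m](c), negative_slope=0.2).squeeze(-1)  # formula (4)
        alpha = torch.zeros_like(b)              # formula (5): softmax over
        for i in triples[:, 0].unique():         # all triples that share the
            mask = triples[:, 0] == i            # same head entity e_i
            alpha[mask] = F.softmax(b[mask], dim=0)
        out = torch.zeros(H.size(0), c.size(-1))                      # formula (6):
        out.index_add_(0, triples[:, 0], alpha.unsqueeze(-1) * c)     # weighted sum
        head_outputs.append(torch.sigmoid(out))                       # σ non-linearity
    return torch.stack(head_outputs).mean(dim=0)                      # average M heads
```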
Step three: as shown in formula (7), a weight matrix W_R is used to linearly transform the relation matrix G, yielding the new relation embedding matrix:

G′ = G W_R  (7)

where W_R ∈ R^{T × T′} and T′ is the dimension of the relation embedding vectors output by this layer.

Step four: the entity weight matrix W_E ∈ R^{T_i × T_j}, where T_i and T_j denote the dimensions of the initial and final entity embedding vectors respectively, is applied to the entity embedding matrix H_i of the input model. The initial entity embedding matrix H_i is linearly transformed according to formula (8) to obtain the transformed entity embedding matrix H_t:

H_t = W_E H_i  (8)

Step five: according to formula (9), the transformed initial entity embedding matrix H_t is added to the entity embedding matrix H_f obtained from the last attention layer, yielding the updated entity embedding matrix H″:

H″ = H_t + H_f  (9)

where H_f ∈ R^{N_e × T_f} is the entity embedding matrix output by the last attention layer, N_e is the total number of entities, and T_f is the feature dimension of the final entity embedding vectors.
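Steps three to five reduce to two linear maps plus a residual addition; a sketch continuing the names above follows. The output dimension T_prime is assumed, and the attention layer's output dimension must equal it for the addition in formula (9) to be valid.

```python
import torch.nn as nn

T_prime = 200                             # output dimension of this layer (assumed)
W_R = nn.Linear(P, T_prime, bias=False)   # formula (7): G' = G·W_R
W_E = nn.Linear(T, T_prime, bias=False)   # formula (8): H_t = W_E·H_i

G_prime = W_R(G)        # new relation embedding matrix G'
H_t = W_E(H)            # transformed copy of the initial entity embeddings
H_f = attention_layer(H, G, triples, W1_heads, W2_heads)  # last attention layer output
H_pp = H_t + H_f        # formula (9): H'' = H_t + H_f (dimensions must match)
```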
In addition, k-hop adjacency with k > 1 (the dotted directed line segments in FIG. 2) is defined as an auxiliary relation in the knowledge graph, and the embedding of an auxiliary relation is the sum of the embeddings of all relations along the directed path. Thus, for a multi-layer model, the updated embedding vectors at the s-th layer can be calculated by aggregating the neighbors within s hops, as in the sketch below.
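A hedged sketch of building the 2-hop auxiliary edges: given `triples` as (head, tail, relation) index tuples, each 2-hop path contributes one auxiliary triple whose relation embedding is the sum of the relation embeddings along the path. The index conventions here are assumptions for illustration.

```python
from collections import defaultdict

def add_two_hop_edges(triples, G):
    """For every path e_i -r_a-> e_m -r_b-> e_j, add an auxiliary triple
    (i, j, new_relation) whose embedding is g_a + g_b (sum along the path)."""
    out_edges = defaultdict(list)                  # head -> [(tail, relation)]
    for i, j, k in triples:
        out_edges[i].append((j, k))
    aux_triples, aux_embeddings = [], []
    for i, hops in list(out_edges.items()):
        for m, k1 in hops:
            for j, k2 in out_edges.get(m, []):
                aux_triples.append((i, j, len(G) + len(aux_embeddings)))
                aux_embeddings.append(G[k1] + G[k2])   # path-sum embedding
    return aux_triples, aux_embeddings
```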
The graph attention network takes the triples present in the knowledge graph as valid triples t_ijk and uses them as positive training examples. Invalid triples t′_ijk, formed by randomly replacing the head or tail entity of a triple with another entity, serve as negative training examples.

Meanwhile, the invention learns the embeddings using the idea of a translation scoring function: for a given valid triple t_ijk = (e_i, r_k, e_j), it holds that h_i + g_k ≈ h_j, where e_i is an entity, e_j is a neighboring entity of e_i, and r_k is the relation between e_i and e_j; h_i is the embedding vector of entity e_i, g_k is the embedding vector of relation r_k, and h_j is the embedding vector of entity e_j.
In model training, the entity and relation embedding learning minimizes the L1-norm dissimilarity measure d_t_ijk = ‖h_i + g_k − h_j‖_1 and uses the hinge (margin) loss function shown in formula (10):

L(Ω) = Σ_{t_ijk ∈ S} Σ_{t′_ijk ∈ S′} max( d_t_ijk − d_t′_ijk + γ, 0 )  (10)

where γ > 0 is a margin hyper-parameter, S is the set of valid triples, and S′ is the set of invalid triples. S′ is constructed according to formula (11) and comprises triples obtained by replacing the head entity as well as triples obtained by replacing the tail entity:

S′ = { (e′_i, r_k, e_j) | e′_i ∈ E ∖ {e_i} } ∪ { (e_i, r_k, e′_j) | e′_j ∈ E ∖ {e_j} }  (11)
(2) ConvKB decoder based on convolutional neural network
The model uses ConvKB as the decoder: its convolutional layers analyze the global embedding features of the triple t_ijk = (e_i, r_k, e_j) across different dimensions, thereby generalizing the transitional characteristics of the model. During decoding, the model computes feature-map scores according to formula (12):

f(t_ijk) = ( ∥_{q=1}^{Ω} ReLU([h_i, g_k, h_j] ∗ ω^q) ) · W  (12)

where ω^q denotes the q-th filter, Ω is the number of filters, ∗ is the convolution operator, ∥ denotes concatenation, and W ∈ R^{Ωk×1} is a linear transformation matrix used to compute the final score of the triple.
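A sketch of the ConvKB scoring of formula (12) in PyTorch: the triple is stacked into a dim×3 grid, Ω filters of size 1×3 slide over it, and the concatenated feature maps are projected to a scalar score. Class and argument names are assumptions.

```python
import torch
import torch.nn as nn

class ConvKB(nn.Module):
    """Decoder scoring a triple per formula (12)."""
    def __init__(self, dim: int, num_filters: int):
        super().__init__()
        self.conv = nn.Conv2d(1, num_filters, kernel_size=(1, 3))  # the ω^q filters
        self.W = nn.Linear(num_filters * dim, 1, bias=False)       # final projection W

    def forward(self, h_i, g_k, h_j):
        x = torch.stack([h_i, g_k, h_j], dim=-1)    # (batch, dim, 3) triple matrix
        x = torch.relu(self.conv(x.unsqueeze(1)))   # (batch, Ω, dim, 1) feature maps
        return self.W(x.flatten(start_dim=1))       # concatenate, then score
```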
To improve the generalization ability of the model, the soft-margin loss function shown in formula (13) is adopted during training:

L = Σ_{t_ijk ∈ S ∪ S′} log( 1 + exp( l_t_ijk · f(t_ijk) ) ) + (λ/2) ‖W‖²_2  (13)

where l_t_ijk is the label distinguishing positive from negative examples: l_t_ijk = 1 when t_ijk ∈ S, and l_t_ijk = −1 when t_ijk ∈ S′; λ is the hyper-parameter of the L2 regularization term.
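The soft-margin loss of formula (13) is essentially a softplus over label-weighted scores plus L2 regularization of W; a sketch follows, with the label convention (+1 for S, −1 for S′) taken from the reconstruction above.

```python
import torch
import torch.nn.functional as F

def soft_margin_loss(model: ConvKB, scores, labels, lam=1e-5):
    """Formula (13): mean softplus(l · f(t)) plus (λ/2)·||W||²."""
    loss = F.softplus(labels * scores.squeeze(-1)).mean()  # log(1 + exp(l·f))
    reg = model.W.weight.norm(p=2) ** 2                    # L2 penalty on W
    return loss + lam / 2 * reg
```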
Effects of the invention
For the relation prediction task in medical knowledge graphs, a graph-attention-based embedding model is constructed by extending the graph attention mechanism; it captures entity and relation features across the multi-hop neighborhood of a given entity, thereby completing the association relations among entities in the medical knowledge graph.
Drawings
FIG. 1 is a diagram of a network architecture;
FIG. 2 is a schematic view of an auxiliary relational edge;
FIG. 3 is a schematic illustration of an attention mechanism;
FIG. 4 shows the structure of ConvKB.
Detailed Description
To make the objects, technical solutions, and advantages of the present application clearer, the technical solutions of the present application are described in detail and completely below with reference to specific embodiments and the accompanying drawings.
Knowledge graphs constructed in their early stages suffer from limited knowledge sources, so their knowledge coverage is insufficient and a large amount of related knowledge is missing. Therefore, in the medical field, relation prediction is needed to complete the knowledge in medical knowledge graphs.
The embodiment provides a knowledge graph relation completion workflow implemented in the following steps:
1) GATs are trained to encode entity and relation information.
2) ConvKB is trained as a decoder to perform the relation prediction task.
The relevant definitions of the invention are given below:

Definition 1 (knowledge graph, 𝒢): 𝒢 = (E, R) denotes all the entities and relations contained in the knowledge graph.

Definition 2 (entity set, E): E = {e_1, …, e_i, …, e_n} denotes all entity nodes in the knowledge graph, corresponding to the entity set in the knowledge base.

Definition 3 (relation set, R): R = {r_1, …, r_i, …, r_n} denotes all relation edges in the knowledge graph, corresponding to the relation set in the knowledge base.

Definition 4 (triple, t_ijk): t_ijk = (e_i, r_k, e_j), where e_i denotes the head entity, r_k denotes the relation, and e_j denotes the tail entity, with e_i, e_j ∈ E and r_k ∈ R. A triple is also called a piece of knowledge.
1) First stage
The GAT is trained to encode information about entities and relationships in the graph for better, more expressive embedding.
S1: acquire the medical knowledge graph to be processed and convert all knowledge in the knowledge graph into a knowledge base file in triple storage form. The Neo4j graph database is used to convert the knowledge graph into an RDF knowledge base file in triple storage form, as sketched below.
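A hedged sketch of this export using the neo4j Python driver; the connection URI, credentials, the `name` property, and the output file name are placeholders, since the patent does not specify the graph schema.

```python
from neo4j import GraphDatabase

driver = GraphDatabase.driver("bolt://localhost:7687", auth=("neo4j", "password"))
with driver.session() as session:
    # export every relationship as a (head, relation, tail) triple
    records = session.run(
        "MATCH (h)-[r]->(t) RETURN h.name AS head, type(r) AS rel, t.name AS tail"
    )
    with open("medical_kg_triples.tsv", "w", encoding="utf-8") as f:
        for rec in records:
            f.write(f"{rec['head']}\t{rec['rel']}\t{rec['tail']}\n")
driver.close()
```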
S2: because a computer cannot recognize triples in text form, the triples must be converted into a word-vector-space representation before being input into the neural network for relation determination.

Borrowing the idea of a translation scoring function, the invention uses TransE to learn the initial embeddings: for a given valid triple t_ijk = (e_i, r_k, e_j), it holds that h_i + g_k ≈ h_j, where e_i is an entity, e_j is a neighboring entity of e_i, and r_k is the relation between e_i and e_j; h_i, g_k, and h_j are the embedding vectors of e_i, r_k, and e_j respectively.

The entity embedding matrix H is given by formula (1):

H ∈ R^{N_e × T}  (1)

where H denotes the feature matrix of the entities, N_e is the total number of entities, and T is the embedded feature dimension of each entity.

The relation embedding matrix G is given by formula (2):

G ∈ R^{N_r × P}  (2)

where G denotes the feature matrix of the relations, N_r is the number of relations, and P is the feature dimension of the relation embeddings.
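A minimal sketch of this TransE initialization step, reusing the dimensions N_e, T, N_r, and P from above (with P = T so that h_i + g_k − h_j is well defined); the optimizer, learning rate, and margin are illustrative assumptions, not values from the patent.

```python
import torch

entity_emb = torch.nn.Embedding(N_e, T)      # h_i for every entity
relation_emb = torch.nn.Embedding(N_r, P)    # g_k for every relation (P == T here)
opt = torch.optim.SGD(list(entity_emb.parameters())
                      + list(relation_emb.parameters()), lr=0.01)

def transe_step(pos, neg, gamma=1.0):
    """One TransE update on paired (N, 3) positive/negative index tensors."""
    def dist(t):
        return (entity_emb(t[:, 0]) + relation_emb(t[:, 2])
                - entity_emb(t[:, 1])).norm(p=1, dim=-1)
    loss = torch.clamp(dist(pos) - dist(neg) + gamma, min=0).mean()
    opt.zero_grad()
    loss.backward()
    opt.step()
    return loss.item()
```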
S3: for each entity, the vector representations of all triples associated with it must be learned to obtain a new embedding of the entity. These embeddings are learned by applying a linear transformation to the entity and relation feature vectors of a specific triple t_ijk = (e_i, r_k, e_j), as in formula (3):

c_ijk = W_1 [h_i ; h_j ; g_k]  (3)

where c_ijk is the vector representation of triple t_ijk; h_i, h_j, and g_k are the embedded representations of entities e_i, e_j and relation r_k respectively, and W_1 is a linear transformation matrix.
S4: to map the input features into a higher-dimensional output feature space, a linear transformation is first performed and a non-linear activation function is then applied to obtain the attention coefficient b_ijk, which represents the importance of each triple.

In formula (4), W_2 is a linear transformation matrix. To avoid problems such as vanishing gradients, LeakyReLU is used as the activation function; based on multiple comparison experiments and the literature, its negative-slope hyper-parameter is set to 0.2.

b_ijk = LeakyReLU(W_2 c_ijk)  (4)
S5: a softmax operation is applied to the raw attention scores over all incoming edges of a node; the attention coefficients are normalized according to formula (5) to obtain the relative attention values α_ijk:

α_ijk = softmax_jk(b_ijk) = exp(b_ijk) / Σ_{n ∈ N_i} Σ_{r ∈ R_in} exp(b_inr)  (5)

where N_i denotes the set of entities adjacent to e_i, R_in denotes the set of relations connecting entities e_i and e_n, and b_inr is the attention coefficient of a triple involving an entity adjacent to e_i.
S6: after the normalized attention coefficients are obtained, the updated embedding vector is calculated according to formula (6).

The model adopts a multi-head attention mechanism so that it learns relevant information in different representation subspaces, which stabilizes the learning process. In addition, to reduce the output dimension, the final embedding vector is obtained by averaging over the heads:

h′_i = (1/M) Σ_{m=1}^{M} σ( Σ_{j ∈ N_i} Σ_{k ∈ R_ij} α^m_ijk c^m_ijk )  (6)

where M is the number of attention heads and σ denotes a non-linear function; j indexes the entities adjacent to entity e_i, and k indexes the relations between entity e_i and entity e_j.
In addition, the method applies a dropout operation that discards neuron activations with probability 0.3, avoiding problems such as overfitting.
S7: as shown in formula (7), a weight matrix W_R is used to linearly transform the relation matrix G, yielding the new relation embedding matrix:

G′ = G W_R  (7)

where W_R ∈ R^{T × T′} and T′ is the dimension of the relation embedding vectors output by this layer.
S8: the initial entity embedding matrix H_i is linearly transformed according to formula (8) to obtain the transformed entity embedding matrix H_t:

H_t = W_E H_i  (8)

where the entity weight matrix W_E ∈ R^{T_i × T_j}, with T_i and T_j denoting the dimensions of the initial and final entity embedding vectors respectively, and H_i is the entity embedding matrix of the input model.
S9: when new embedding vectors are learned, the original embedding information may be lost, so according to formula (9) the transformed initial entity embedding matrix H_t is added to the entity embedding matrix H_f obtained from the last attention layer, yielding the updated entity embedding matrix H″:

H″ = H_t + H_f  (9)

where H_f ∈ R^{N_e × T_f} is the entity embedding matrix output by the last attention layer, N_e is the total number of entities, and T_f is the feature dimension of the final entity embedding vectors.
S10: the graph attention network takes the triples present in the knowledge graph as valid triples t_ijk and uses them as positive training examples. Invalid triples t′_ijk, formed by randomly replacing the head or tail entity of a triple with another entity, serve as negative training examples. In this training phase, the ratio of the number of valid triples to invalid triples is 2:1.
In model training, the entity and relation embedding learning minimizes the L1-norm dissimilarity measure d_t_ijk = ‖h_i + g_k − h_j‖_1 and uses the hinge (margin) loss function shown in formula (10):

L(Ω) = Σ_{t_ijk ∈ S} Σ_{t′_ijk ∈ S′} max( d_t_ijk − d_t′_ijk + γ, 0 )  (10)

where γ > 0 is a margin hyper-parameter with value 5, S is the set of valid triples, and S′ is the set of invalid triples. S′ is constructed according to formula (11) and comprises triples obtained by replacing the head entity as well as triples obtained by replacing the tail entity, where E ∖ {e_i} denotes the entity set E with e_i removed:

S′ = { (e′_i, r_k, e_j) | e′_i ∈ E ∖ {e_i} } ∪ { (e_i, r_k, e′_j) | e′_j ∈ E ∖ {e_j} }  (11)
In addition, the model is continuously optimized using the Adam optimizer with a learning rate of 0.001, as in the sketch below.
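A sketch of that optimizer setup, assuming the encoder's parameters are collected in a module `gat_model` and reusing `margin_loss` from the earlier sketch; everything except the 0.001 learning rate and the margin value 5 stated above is a placeholder.

```python
import torch

optimizer = torch.optim.Adam(gat_model.parameters(), lr=0.001)

num_epochs = 1000                      # assumed placeholder
for epoch in range(num_epochs):
    optimizer.zero_grad()
    loss = margin_loss(H, G, pos_triples, neg_triples, gamma=5.0)  # formula (10)
    loss.backward()
    optimizer.step()
```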
2) Second stage
The model uses ConvKB as the decoder: its convolutional layers analyze the global embedding features of the triple t_ijk = (e_i, r_k, e_j) across different dimensions, thereby generalizing the transitional characteristics of the model.
S11: during decoding, the model computes feature-map scores according to formula (12):

f(t_ijk) = ( ∥_{q=1}^{Ω} ReLU([h_i, g_k, h_j] ∗ ω^q) ) · W  (12)

where ω^q denotes the q-th filter, Ω is the number of filters, ∗ is the convolution operator, ∥ denotes concatenation, and W ∈ R^{Ωk×1} is a linear transformation matrix used to compute the final score of the triple. Here c_ijk, as defined in formula (3), is the vector representation of the triple t_ijk.

Based on multiple comparison experiments, and to avoid problems such as vanishing gradients, LeakyReLU with a negative-slope hyper-parameter of 0.2 is adopted as the activation function.
S12: to improve the generalization ability of the model, the soft-margin loss function shown in formula (13) is adopted during training:

L = Σ_{t_ijk ∈ S ∪ S′} log( 1 + exp( l_t_ijk · f(t_ijk) ) ) + (λ/2) ‖W‖²_2  (13)

where l_t_ijk is the label distinguishing positive from negative examples: l_t_ijk = 1 when t_ijk ∈ S, and l_t_ijk = −1 when t_ijk ∈ S′; λ is the hyper-parameter of the L2 regularization term and takes the value 0.00001.

In this training phase, the ratio of the number of valid triples to invalid triples is 4:1.

Claims (2)

1. The medical knowledge graph relation prediction method based on the graph attention mechanism is characterized by comprising the following steps:
(1) graph attention-based graph embedding method for relation prediction
The input of each layer of the model comprises an entity embedding matrix and a relation embedding matrix; the entity embedding matrix H is given by formula (1):

H ∈ R^{N_e × T}  (1)

where H denotes the feature matrix of the entities, N_e is the total number of entities, and T is the embedded feature dimension of each entity;

the relation embedding matrix G is given by formula (2):

G ∈ R^{N_r × P}  (2)

where G denotes the feature matrix of the relations, N_r is the number of relations, and P is the feature dimension of the relation embeddings;

according to the input matrices H and G, the updated embedding matrices H′ and G′ are calculated by the following steps:

Step one: define the entity set of the medical knowledge graph E = {e_1, …, e_i, …, e_n}, where e_i is the embedding of the i-th entity; first, a vector representation of each triple associated with e_i is learned to obtain a new embedding of entity e_i; these embeddings are learned by applying a linear transformation to the entity and relation feature vectors of a specific triple t_ijk = (e_i, r_k, e_j), as in formula (3):

c_ijk = W_1 [h_i ; h_j ; g_k]  (3)

where c_ijk is the vector representation of triple t_ijk; h_i, h_j, and g_k are the embedded representations of entities e_i, e_j and relation r_k respectively, and W_1 is a linear transformation matrix;
Step two: obtain the importance of each triple, i.e. the attention coefficient b_ijk; as shown in formula (4), a linear transformation is first performed and a non-linear activation function is then applied to obtain b_ijk, where W_2 is a linear transformation matrix:

b_ijk = LeakyReLU(W_2 c_ijk)  (4)

furthermore, the attention coefficients are normalized with the softmax of formula (5) to obtain the relative attention values α_ijk:

α_ijk = softmax_jk(b_ijk) = exp(b_ijk) / Σ_{n ∈ N_i} Σ_{r ∈ R_in} exp(b_inr)  (5)

where N_i denotes the set of entities adjacent to e_i, R_in denotes the set of relations connecting entities e_i and e_n, and b_inr is the attention coefficient of a triple involving an entity adjacent to e_i;

after the normalized attention coefficients are obtained, the updated embedding vector is calculated according to formula (6); the model adopts a multi-head attention mechanism so that it learns relevant information in different representation subspaces, stabilizing the learning process; in addition, to reduce the output dimension, the final embedding vector is obtained by averaging over the heads:

h′_i = (1/M) Σ_{m=1}^{M} σ( Σ_{j ∈ N_i} Σ_{k ∈ R_ij} α^m_ijk c^m_ijk )  (6)

where M is the number of attention heads and σ denotes a non-linear function; j indexes the entities adjacent to entity e_i, and k indexes the relations between entity e_i and entity e_j;
Step three: as shown in formula (7), a weight matrix W_R is used to linearly transform the relation matrix G, yielding the new relation embedding matrix:

G′ = G W_R  (7)

where W_R ∈ R^{T × T′} and T′ is the dimension of the relation embedding vectors output by this layer;

Step four: the entity weight matrix W_E ∈ R^{T_i × T_j}, where T_i and T_j denote the dimensions of the initial and final entity embedding vectors respectively, is applied to the entity embedding matrix H_i of the input model; the initial entity embedding matrix H_i is linearly transformed according to formula (8) to obtain the transformed entity embedding matrix H_t:

H_t = W_E H_i  (8)

Step five: according to formula (9), the transformed initial entity embedding matrix H_t is added to the entity embedding matrix H_f obtained from the last attention layer, yielding the updated entity embedding matrix H″:

H″ = H_t + H_f  (9)

where H_f ∈ R^{N_e × T_f} is the entity embedding matrix output by the last attention layer, N_e is the total number of entities, and T_f is the feature dimension of the final entity embedding vectors;
in addition, k-hop adjacency with k > 1 is defined as an auxiliary relation in the knowledge graph, and the embedding of an auxiliary relation is the sum of the embeddings of all relations along the directed path; thus, for a multi-layer model, the updated embedding vectors at the s-th layer can be calculated by aggregating the neighbors within s hops;

the graph attention network takes the triples present in the knowledge graph as valid triples t_ijk and uses them as positive training examples; invalid triples t′_ijk, formed by randomly replacing the head or tail entity of a triple with another entity, serve as negative training examples;
the embeddings are learned using the idea of a translation scoring function: for a given valid triple t_ijk = (e_i, r_k, e_j), it holds that h_i + g_k ≈ h_j, where e_i is an entity, e_j is a neighboring entity of e_i, and r_k is the relation between e_i and e_j; h_i is the embedding vector of entity e_i, g_k is the embedding vector of relation r_k, and h_j is the embedding vector of entity e_j;

in model training, the entity and relation embedding learning minimizes the L1-norm dissimilarity measure d_t_ijk = ‖h_i + g_k − h_j‖_1 and uses the hinge (margin) loss function shown in formula (10):

L(Ω) = Σ_{t_ijk ∈ S} Σ_{t′_ijk ∈ S′} max( d_t_ijk − d_t′_ijk + γ, 0 )  (10)

where γ > 0 is a margin hyper-parameter, S is the set of valid triples, and S′ is the set of invalid triples; S′ is constructed according to formula (11) and comprises triples obtained by replacing the head entity as well as triples obtained by replacing the tail entity:

S′ = { (e′_i, r_k, e_j) | e′_i ∈ E ∖ {e_i} } ∪ { (e_i, r_k, e′_j) | e′_j ∈ E ∖ {e_j} }  (11)
(2) ConvKB decoder based on convolutional neural network
The model uses ConvKB as the decoder: its convolutional layers analyze the global embedding features of the triple t_ijk = (e_i, r_k, e_j) across different dimensions, thereby generalizing the transitional characteristics of the model; during decoding, the model computes feature-map scores according to formula (12):

f(t_ijk) = ( ∥_{q=1}^{Ω} ReLU([h_i, g_k, h_j] ∗ ω^q) ) · W  (12)

where ω^q denotes the q-th filter, Ω is the number of filters, ∗ is the convolution operator, ∥ denotes concatenation, and W ∈ R^{Ωk×1} is a linear transformation matrix used to compute the final score of the triple;
the loss is calculated using the soft-margin loss function shown in formula (13):

L = Σ_{t_ijk ∈ S ∪ S′} log( 1 + exp( l_t_ijk · f(t_ijk) ) ) + (λ/2) ‖W‖²_2  (13)

where l_t_ijk is the label distinguishing positive from negative examples: l_t_ijk = 1 when t_ijk ∈ S, and l_t_ijk = −1 when t_ijk ∈ S′; λ is the hyper-parameter of the L2 regularization term.
2. The method of claim 1, characterized by the following relevant definitions:

Definition 1 (knowledge graph, 𝒢): 𝒢 = (E, R) denotes all the entities and relations contained in the knowledge graph;

Definition 2 (entity set, E): E = {e_1, …, e_i, …, e_n} denotes all entity nodes in the knowledge graph, corresponding to the entity set in the knowledge base;

Definition 3 (relation set, R): R = {r_1, …, r_i, …, r_n} denotes all relation edges in the knowledge graph, corresponding to the relation set in the knowledge base;

Definition 4 (triple, t_ijk): t_ijk = (e_i, r_k, e_j), where e_i denotes the head entity, r_k denotes the relation, and e_j denotes the tail entity, with e_i, e_j ∈ E and r_k ∈ R; a triple is also called a piece of knowledge;

First stage
The GAT is trained to encode the information of entities and relations in the graph;

S1: acquire the medical knowledge graph to be processed and convert all knowledge in the knowledge graph into a knowledge base file in triple storage form; the Neo4j graph database is used to convert the knowledge graph into an RDF knowledge base file in triple storage form;

S2: because a computer cannot recognize triples in text form, the triples must be converted into a word-vector-space representation before being input into the neural network for relation determination;

borrowing the idea of a translation scoring function, TransE is used to learn the initial embeddings: for a given valid triple t_ijk = (e_i, r_k, e_j), it holds that h_i + g_k ≈ h_j, where e_i is an entity, e_j is a neighboring entity of e_i, and r_k is the relation between e_i and e_j; h_i, g_k, and h_j are the embedding vectors of e_i, r_k, and e_j respectively;

the entity embedding matrix H is given by formula (1):

H ∈ R^{N_e × T}  (1)

where H denotes the feature matrix of the entities, N_e is the total number of entities, and T is the embedded feature dimension of each entity;

the relation embedding matrix G is given by formula (2):

G ∈ R^{N_r × P}  (2)

where G denotes the feature matrix of the relations, N_r is the number of relations, and P is the feature dimension of the relation embeddings;
S3: for each entity, the vector representations of all triples associated with it must be learned to obtain a new embedding of the entity; these embeddings are learned by applying a linear transformation to the entity and relation feature vectors of a specific triple t_ijk = (e_i, r_k, e_j), as in formula (3):

c_ijk = W_1 [h_i ; h_j ; g_k]  (3)

where c_ijk is the vector representation of triple t_ijk; h_i, h_j, and g_k are the embedded representations of entities e_i, e_j and relation r_k respectively, and W_1 is a linear transformation matrix;
S4: to map the input features into a higher-dimensional output feature space, a linear transformation is performed and a non-linear activation function is then applied to obtain the attention coefficient b_ijk, which represents the importance of each triple;

in formula (4), W_2 is a linear transformation matrix; to avoid problems such as vanishing gradients, LeakyReLU is used as the activation function, and based on multiple comparison experiments and the literature its negative-slope hyper-parameter is set to 0.2;

b_ijk = LeakyReLU(W_2 c_ijk)  (4)

S5: a softmax operation is applied to the raw attention scores over all incoming edges of a node; the attention coefficients are normalized according to formula (5) to obtain the relative attention values α_ijk:

α_ijk = softmax_jk(b_ijk) = exp(b_ijk) / Σ_{n ∈ N_i} Σ_{r ∈ R_in} exp(b_inr)  (5)

where N_i denotes the set of entities adjacent to e_i, R_in denotes the set of relations connecting entities e_i and e_n, and b_inr is the attention coefficient of a triple involving an entity adjacent to e_i;
S6: after the normalized attention coefficients are obtained, the updated embedding vector is calculated according to formula (6);

the model adopts a multi-head attention mechanism so that it learns relevant information in different representation subspaces, stabilizing the learning process; in addition, to reduce the output dimension, the final embedding vector is obtained by averaging over the heads:

h′_i = (1/M) Σ_{m=1}^{M} σ( Σ_{j ∈ N_i} Σ_{k ∈ R_ij} α^m_ijk c^m_ijk )  (6)

where M is the number of attention heads and σ denotes a non-linear function; j indexes the entities adjacent to entity e_i, and k indexes the relations between entity e_i and entity e_j;

furthermore, a dropout operation discards neuron activations with probability 0.3;
S7: as shown in formula (7), a weight matrix W_R is used to linearly transform the relation matrix G, yielding the new relation embedding matrix:

G′ = G W_R  (7)

where W_R ∈ R^{T × T′} and T′ is the dimension of the relation embedding vectors output by this layer;

S8: the initial entity embedding matrix H_i is linearly transformed according to formula (8) to obtain the transformed entity embedding matrix H_t:

H_t = W_E H_i  (8)

where the entity weight matrix W_E ∈ R^{T_i × T_j}, with T_i and T_j denoting the dimensions of the initial and final entity embedding vectors respectively, and H_i is the entity embedding matrix of the input model;

S9: when new embedding vectors are learned, the original embedding information may be lost, so according to formula (9) the transformed initial entity embedding matrix H_t is added to the entity embedding matrix H_f obtained from the last attention layer, yielding the updated entity embedding matrix H″:

H″ = H_t + H_f  (9)

where H_f ∈ R^{N_e × T_f} is the entity embedding matrix output by the last attention layer, N_e is the total number of entities, and T_f is the feature dimension of the final entity embedding vectors;
S10: the graph attention network takes the triples present in the knowledge graph as valid triples t_ijk and uses them as positive training examples; invalid triples t′_ijk, formed by randomly replacing the head or tail entity of a triple with another entity, serve as negative training examples; during this training phase, the ratio of the number of valid triples to invalid triples is 2:1;

in model training, the entity and relation embedding learning minimizes the L1-norm dissimilarity measure d_t_ijk = ‖h_i + g_k − h_j‖_1 and uses the hinge (margin) loss function shown in formula (10):

L(Ω) = Σ_{t_ijk ∈ S} Σ_{t′_ijk ∈ S′} max( d_t_ijk − d_t′_ijk + γ, 0 )  (10)

where γ > 0 is a margin hyper-parameter with value 5, S is the set of valid triples, and S′ is the set of invalid triples; S′ is constructed according to formula (11) and comprises triples obtained by replacing the head entity as well as triples obtained by replacing the tail entity, where E ∖ {e_i} denotes the entity set E with e_i removed:

S′ = { (e′_i, r_k, e_j) | e′_i ∈ E ∖ {e_i} } ∪ { (e_i, r_k, e′_j) | e′_j ∈ E ∖ {e_j} }  (11)
in addition, an Adam optimizer with a learning rate of 0.001 is adopted to continuously optimize the model;
2) Second stage
The model uses ConvKB as the decoder: its convolutional layers analyze the global embedding features of the triple t_ijk = (e_i, r_k, e_j) across different dimensions, thereby generalizing the transitional characteristics of the model;

S11: during decoding, the model computes feature-map scores according to formula (12):

f(t_ijk) = ( ∥_{q=1}^{Ω} ReLU([h_i, g_k, h_j] ∗ ω^q) ) · W  (12)

where ω^q denotes the q-th filter, Ω is the number of filters, ∗ is the convolution operator, ∥ denotes concatenation, and W ∈ R^{Ωk×1} is a linear transformation matrix used to compute the final score of the triple; here c_ijk, as defined in formula (3), is the vector representation of the triple t_ijk;

LeakyReLU with a negative-slope hyper-parameter of 0.2 is adopted as the activation function;
S12: to improve the generalization ability of the model, the soft-margin loss function shown in formula (13) is adopted during training:

L = Σ_{t_ijk ∈ S ∪ S′} log( 1 + exp( l_t_ijk · f(t_ijk) ) ) + (λ/2) ‖W‖²_2  (13)

where l_t_ijk is the label distinguishing positive from negative examples: l_t_ijk = 1 when t_ijk ∈ S, and l_t_ijk = −1 when t_ijk ∈ S′; λ is the hyper-parameter of the L2 regularization term and takes the value 0.00001.
CN202210181938.2A 2022-02-25 2022-02-25 Medical knowledge graph relation prediction method based on graph attention mechanism Pending CN114610897A (en)

Priority Applications (1)

Application Number | Priority Date | Filing Date | Title
CN202210181938.2A | 2022-02-25 | 2022-02-25 | Medical knowledge graph relation prediction method based on graph attention mechanism


Publications (1)

Publication Number | Publication Date
CN114610897A | 2022-06-10

Family

ID=81858174

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210181938.2A Pending CN114610897A (en) 2022-02-25 2022-02-25 Medical knowledge map relation prediction method based on graph attention machine mechanism

Country Status (1)

Country Link
CN (1) CN114610897A (en)


Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115861715A (en) * 2023-02-15 2023-03-28 创意信息技术股份有限公司 Knowledge representation enhancement-based image target relation recognition algorithm
CN117010494A (en) * 2023-09-27 2023-11-07 之江实验室 Medical data generation method and system based on causal expression learning
CN117010494B (en) * 2023-09-27 2024-01-05 之江实验室 Medical data generation method and system based on causal expression learning
CN117435747A (en) * 2023-12-18 2024-01-23 中南大学 Few-sample link prediction drug recycling method based on multilevel refinement network
CN117435747B (en) * 2023-12-18 2024-03-29 中南大学 Few-sample link prediction drug recycling method based on multilevel refinement network
CN117610662A (en) * 2024-01-19 2024-02-27 江苏天人工业互联网研究院有限公司 Knowledge graph embedding method for extracting representative sub-graph information through GAT
CN117747124A (en) * 2024-02-20 2024-03-22 浙江大学 Medical large model logic inversion method and system based on network excitation graph decomposition


Legal Events

Date | Code | Title | Description
| PB01 | Publication |
| SE01 | Entry into force of request for substantive examination |