CN116402133B - Knowledge graph completion method and system based on structure aggregation graph convolutional network


Info

Publication number
CN116402133B
CN116402133B
Authority
CN
China
Prior art keywords
knowledge graph
graph
new
entity
relation
Prior art date
Legal status
Active
Application number
CN202310385458.2A
Other languages
Chinese (zh)
Other versions
CN116402133A (en)
Inventor
Xie Yongfang
Wang Feida
Xie Shiwen
Zhang Bin
Current Assignee
Central South University
Original Assignee
Central South University
Priority date
Filing date
Publication date
Application filed by Central South University
Priority to CN202310385458.2A
Publication of CN116402133A
Application granted
Publication of CN116402133B
Status: Active


Classifications

    • G06N 5/02 — Computing arrangements using knowledge-based models; Knowledge representation; Symbolic representation
    • G06F 16/367 — Information retrieval; Creation of semantic tools, e.g. ontology or thesauri; Ontology
    • G06N 3/0442 — Neural networks; Recurrent networks, e.g. Hopfield networks, characterised by memory or gating, e.g. long short-term memory [LSTM] or gated recurrent units [GRU]
    • G06N 3/0464 — Neural networks; Convolutional networks [CNN, ConvNet]
    • G06N 3/048 — Neural networks; Activation functions
    • G06N 3/084 — Neural networks; Learning methods; Backpropagation, e.g. using gradient descent
    • Y02D 10/00 — Energy efficient computing, e.g. low power processors, power management or thermal management

Abstract

The invention discloses a knowledge graph completion method and system based on a structure aggregation graph convolutional network. Head entity-relation pairs are extracted from the test set and paired with all entities of the test set as tail entities to obtain a new triplet set, and the LSTM network is used again to extract the semantic rule set corresponding to the new triplet set. The test set is input into the trained network, and the graph completion effect is evaluated using a scoring function and a score-adding function. The completion method of the invention achieves better graph completion performance than some existing methods.

Description

Knowledge graph completion method and system based on structure aggregation graph convolutional network
Technical Field
The invention relates to the field of software, in particular to a knowledge graph completion method and system based on a structure aggregation graph convolutional network.
Background
A knowledge graph (KG) is a form of semantic network: a graphical knowledge base for storing semantic knowledge. Its basic unit is the triple, expressed as (h, r, t), where h, r and t are the head entity, the relation and the tail entity respectively. Connections established between entities through relations form a network-structured knowledge base. Although a knowledge graph contains rich knowledge, that knowledge may be incomplete, with some undiscovered hidden relations missing. The purpose of graph completion is to predict the missing entities or relations and to perfect the knowledge triples of the knowledge graph as far as possible. Current mainstream knowledge graph completion methods can be divided into distance models, bilinear models, neural tensor models, matrix factorization models and translation models. Among these, the distance and translation models convert entities and relations from words into vector representations in a continuous low-dimensional space, which is called knowledge graph embedding; after the graph is embedded, the embedding vectors are used to predict potential knowledge triples, as in the TransH, TransR and ComplEx models. The graph convolutional network (GCN) is a convolutional neural network that can act directly on a graph and exploit its structural information. Its convolution operation sums the information of a node and of its neighborhood nodes to obtain a new node representation; through multi-layer convolution, the information of multi-hop neighborhood nodes can be fused, enriching the node information that takes part in the computation. The combination of the relational graph convolutional network (R-GCN) with the DistMult model, applied to the graph completion task on public graph-structured knowledge graph datasets, has to some extent demonstrated the effectiveness of the GCN for graph completion.
However, when the existing GCN is applied to graph completion, the aggregation operation uses only the information of the center node and of its neighborhood nodes, ignoring the important relation vector information that exists between knowledge graph nodes, so the completion effect of the GCN is poor.
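As an illustration of the triple form described above, the following minimal Python sketch represents a small knowledge base as (h, r, t) triples; the entities and relations here are invented for illustration and are not data from the patent:

    # Minimal sketch of the (h, r, t) triple representation; illustrative data only.
    triples = [
        ("Beijing", "capital_of", "China"),
        ("China", "located_in", "Asia"),
        ("Yangtze", "flows_through", "China"),
    ]
    # Relations link entities into a graph-structured knowledge base.
    entities = {e for h, _, t in triples for e in (h, t)}
    relations = {r for _, r, _ in triples}
    # A completion method predicts missing links, e.g. ("Beijing", "located_in", ?).
    print(len(entities), "entities,", len(relations), "relations")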
Disclosure of Invention
The invention aims to provide a knowledge graph completion method and system based on a structure aggregation graph convolutional network, in order to solve the problems described in the background above.
In order to achieve the above purpose, the present invention provides the following technical solutions:
A knowledge graph completion method based on a structure aggregation graph convolutional network comprises the following steps:
Step S201, obtaining a to-be-complemented knowledge graph, wherein the to-be-complemented knowledge graph consists of a plurality of triples, and each triplet consists of an entity-relation-entity;
Step S202, inputting a knowledge graph to be complemented into an LSTM network, and learning semantic connection rules of triples in the knowledge graph by the LSTM network to obtain a first semantic rule feature set S of the knowledge graph to be complemented;
step S203, inputting the knowledge graph to be complemented into a Word2Vec model, and converting the knowledge graph from Word representation to vector representation of a low-dimensional continuous space to obtain a preliminary embedded representation result;
step S204, inputting the preliminary embedding representation result into the structure aggregation graph convolutional network to perform deep embedding, wherein the deep embedding process comprises the following steps: constructing a training set, which is the preliminarily embedded knowledge graph containing the correct labels; inputting the training set into the structure aggregation graph convolutional network and training it, the network comprising L layers, each layer containing a trainable parameter ω, and the loss function being the binary cross-entropy loss:
BCELoss = −(1/N)·Σ_{i=1..N} [ y_i·log(p_i) + (1 − y_i)·log(1 − p_i) ]
where y_i is the true label of the i-th candidate and p_i its predicted score;
Step S2041, extracting, batch by batch, the head entity-relation pairs of the training set triples as the rows of a label matrix, with all entities of the training set as its columns, to obtain a label matrix of size batch × num_ent, where batch is the number of head entity-relation pairs in the training batch and num_ent is the number of training set entities; the element values of the label matrix are assigned according to the triples in the training set: each triple is split into a head entity-relation pair and a tail entity, and the element in the corresponding row and column is set to 1, while elements whose head entity-relation pair and tail entity combination does not occur in the training set are set to 0; at the same time, a candidate new triplet set composed of the head entity-relation pairs and all entities of the training set is obtained;
Step S2042, inputting the semantic representations corresponding to the new triplet set into the LSTM network to extract semantic features, obtaining a second semantic rule feature set S_new;
Step S2043, inputting the new triplet set into the scoring function f(h, r, t), and the first semantic rule feature set S together with the second semantic rule feature set S_new into the score-adding function g(s_new, S); summing the result of the scoring function f(h, r, t) and the result of the score-adding function g(s_new, S) and passing the sum through a Sigmoid function to evaluate the score of each new triplet in the candidate set, the score lying in the range [0, 1]; inputting the obtained scores and the real triplet label matrix of the training set into the BCELoss loss function to compute the error, back-propagating according to the error to update the training parameters, and training until the BCELoss value is smaller than a preset threshold, obtaining the trained structure aggregation graph convolutional network;
Step S205, inputting the knowledge graph to be complemented into the trained structure aggregation graph convolutional network, randomly combining head entity-relation pairs and taking the entities of the knowledge graph in turn as tail entities to form candidate triplet sets, and calculating the final score of the triples in each set; for each candidate triplet set built from a head entity-relation pair, taking the triplet with the highest final score to form a completion triplet set; removing from it the triples already present in the knowledge graph to be complemented to obtain the final completion triplet set; and, after manual verification, adding the correct triples of the final completion triplet set into the knowledge graph to be complemented, thereby realizing completion of the graph.
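As an illustration only, the following Python sketch mirrors the candidate generation and selection of step S205; score_fn is a hypothetical stand-in for the trained network's final score (the sigmoid of f + g), and the manual verification of the returned triples is left outside the code:

    import numpy as np

    def complete_graph(kg_triples, num_ent, score_fn):
        # Step S205 sketch: for each head entity-relation pair, score every
        # entity as tail, keep the highest-scoring candidate, and drop
        # candidates already present in the graph.
        existing = set(kg_triples)
        pairs = {(h, r) for h, r, _ in kg_triples}
        completions = []
        for h, r in pairs:
            scores = [score_fn(h, r, t) for t in range(num_ent)]
            best_t = int(np.argmax(scores))
            if (h, r, best_t) not in existing:
                completions.append((h, r, best_t))
        return completions  # to be added only after manual verification

    # Toy usage with a random scoring stand-in.
    rng = np.random.default_rng(3)
    print(complete_graph([(0, 0, 1), (1, 1, 2)], num_ent=4,
                         score_fn=lambda h, r, t: rng.random()))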
In a further improvement, the LSTM network comprises an input x_t, an output gate o_t, a hidden layer state h_t, a forget gate f_t, an input gate i_t, a cell state C_t and a temporary cell state C̃_t; the final result is obtained through cyclic computation, by memorizing useful information and forgetting useless information;
the forget gate is calculated as:
f_t = σ(W_f·[h_{t-1}, x_t] + b_f) (1)
the input gate:
i_t = σ(W_i·[h_{t-1}, x_t] + b_i) (2)
the new cell state:
C̃_t = tanh(W_c·[h_{t-1}, x_t] + b_c) (3)
C_t = f_t * C_{t-1} + i_t * C̃_t (4)
the output gate:
o_t = σ(W_o·[h_{t-1}, x_t] + b_o) (5)
h_t = o_t * tanh(C_t) (6)
Obtaining a semantic feature set S through LSTM network learning;
where f_t, i_t, o_t are the computation results of the forget gate, input gate and output gate respectively; σ is the activation function; h_{t-1}, h_t are the hidden layer states at the previous and the current time step respectively; W_f, W_i, W_o, W_c are the weight parameters of the forget gate, input gate, output gate and new cell state acting on the current input x_t and the hidden state output h_{t-1} of the previous unit; b_f, b_i, b_o, b_c are the bias vectors corresponding to those weight parameters; tanh is the hyperbolic tangent function; and C_{t-1}, C_t, C̃_t are the cell state at the previous time step, the cell state at the current time step and the temporary cell state at the current time step respectively.
In a further improvement, if the knowledge graph to be complemented has n graph nodes and the embedding dimension is d, the input comprises an n×d node feature matrix H, an n×n adjacency matrix A and an n×n degree matrix D; the structure aggregation graph convolutional network performs the structure aggregation operation on the entity nodes:
N_i^l = σ( ω_ii·N_i^{l-1} + Σ_{j∈v_j} (ω_ij·N_j^{l-1} + c·ω_{R_j}·R_j) / sqrt(deg_i·deg_j) ) (7)
where i is the index of a graph entity node and j is the index of one of the neighbor nodes of the i-th entity node; N_i^l is the vector representation of the center node after the l-th layer convolution aggregation; v_j denotes the set of all neighbor nodes of the center node N_i; N_j^{l-1} is the vector representation of a neighbor node; R_j is the relation vector between the center node N_i and the neighbor node N_j; ω_{R_j} is the weight parameter of the corresponding relation vector R_j; ω_ii is the weight parameter of the center node N_i on its own value; ω_ij is the weight parameter of the center node N_i on the neighbor node N_j; c is an identification variable that takes 1 when the relation to the neighbor node N_j is outgoing (out-degree) and −1 when it is incoming (in-degree); deg_i is the sum of the out-degree and in-degree of the center node, and deg_j is the sum of the out-degree and in-degree of a neighbor node;
In the structure aggregation graph convolutional network, the information of the node part and of the relation part have different degrees of importance and are therefore weighted and summed; the matrix expression is:
H^{(l+1)} = f(H^{(l)}, A) (8)
f(H^{(l)}, A) = σ( D̂^{−1/2}·Â·D̂^{−1/2}·H^{(l)}·W^{(l)} + M·R·W_R^{(l)} ) (9)
where f(H^{(l)}, A) is the node information aggregation function of the l-th convolution layer and σ is the activation function; H^{(l)} is the vector representation matrix of the entity nodes, of size n×d, where n is the number of entity nodes and d is the embedding dimension; A is the n×n adjacency matrix of the entity nodes of the graph, whose entries are 1 where a connection exists between two entities and 0 otherwise; R is the m×d relation matrix, where m is the number of relations and d, the dimension of the relation embedding vectors, is the same as the embedding dimension of the entity nodes; M is an n×m matrix in which each row corresponds to an entity node i and each column to a relation j, its element values recording the number of occurrences of relation j at entity i; Â is obtained as A + I, where I is the identity matrix, so that compared with A the matrix Â adds the self-connection of each entity node and prevents a node's own information from being omitted during aggregation; D̂ is the diagonal matrix whose diagonal elements are the row sums of Â; and W^{(l)}, W_R^{(l)} are the d×d weight parameters for the information aggregation of the entity part and the relation part respectively.
Further, in step S2043, the scoring function is:
f(h,r,t)=similarity(h,r)+similarity(h,t)+similarity(r,t) (10)
Wherein f (h, r, t) is a scoring function, h is a head entity embedding vector, r is a relation embedding vector, and t is a tail entity embedding vector;
The semantic feature set S obtained by the semantic feature learning module takes part in the computation of a score-adding function, which determines whether a new triplet qualifies for the added score; the score-adding function is:
g(s_new, S) = α, if ∃ s_i ∈ S such that sqrt( Σ_{j=1..n} (s_ij − s_newj)² ) < ξ (11)
g(s_new, S) = 0, otherwise (12)
where g(s_new, S) is the score-adding function and the mathematical symbol ∃ indicates existence; s_new is a feature vector of the second semantic feature set S_new; s_ij is the j-th component of the feature vector s_i in the first semantic rule feature set S, and s_newj is the j-th component of the feature vector s_new; n is the length of the feature vectors; α and ξ are the added score value and the score-adding threshold respectively, both set manually;
That is, the Euclidean distance between the feature vector s_new and the vectors of the first semantic rule feature set S is computed, and if some result is smaller than the preset value ξ the added score is granted; the final score is obtained from the scoring result and the score-adding result through an activation function, as follows:
score = sigmoid( f(h, r, t) + g(s_new, S) ) (13).
Also provided is a system for realizing the above knowledge graph completion method based on the structure aggregation graph convolutional network, comprising a word embedding module, a semantic feature learning module, an encoding module and a scoring module;
the word embedding module is used for converting the knowledge graph word representation into a vector representation of a low-dimensional continuous space;
the semantic feature learning module is used for extracting a semantic rule feature set in the knowledge graph triplet;
the encoding module is the structure aggregation graph convolutional network and is used for obtaining vector representations with structural features;
The scoring module consists of a scoring function and a score-adding function; the score-adding function does not take part in triplet scoring during iterative training, while during graph completion the scoring function and the score-adding function jointly score each new triplet to evaluate whether the new triplet holds.
In a further improvement, the word embedding module is a Word2Vec model; the knowledge graph to be complemented is input into the Word2Vec model and converted from word representation into a vector representation of a low-dimensional continuous space, obtaining a preliminary embedding representation result.
In a further improvement, the semantic feature learning module is an LSTM network; the knowledge graph to be complemented is input into the LSTM network, which learns the semantic connection rules of the triples in the knowledge graph to obtain its semantic rule feature set S.
Compared with the prior art, the invention has the beneficial effects that:
The invention takes into account the important relation vector information that exists between knowledge graph nodes, enriches the node vector representation information and improves link prediction performance; its information aggregation is stronger than that of the existing GCN, which helps improve the effect of the GCN on graph completion.
Drawings
FIG. 1 is a diagram of an LSTM network architecture;
FIG. 2 is a schematic diagram of a structural aggregate graph convolutional network of the present invention;
FIG. 3 is a schematic diagram of the graph completion method based on the structure aggregation graph convolutional network according to the present invention.
Detailed Description
The following description of the embodiments of the present invention will be made clearly and completely with reference to the accompanying drawings, in which it is apparent that the embodiments described are only some embodiments of the present invention, but not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
The long short-term memory network (LSTM) is a type of recurrent neural network. The LSTM model comprises, at time t, an input x_t, an output gate o_t, a hidden layer state h_t, a forget gate f_t, an input gate i_t, a cell state C_t and a temporary cell state C̃_t. The final result is obtained through cyclic computation, by memorizing useful information and forgetting useless information; the specific structure is shown in FIG. 1.
The forget gate is calculated as:
f_t = σ(W_f·[h_{t-1}, x_t] + b_f) (1)
The input gate:
i_t = σ(W_i·[h_{t-1}, x_t] + b_i) (2)
The new cell state:
C̃_t = tanh(W_c·[h_{t-1}, x_t] + b_c), C_t = f_t * C_{t-1} + i_t * C̃_t (3)
The output gate:
o_t = σ(W_o·[h_{t-1}, x_t] + b_o) (4)
h_t = o_t * tanh(C_t) (5)
A semantic feature set S is obtained through LSTM network learning.
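For concreteness, the following NumPy sketch implements one LSTM step according to equations (1)-(5) above; the toy dimensions and random weights are illustrative assumptions, and in practice the semantic feature set S would be collected from the hidden states h_t produced over a triple's token sequence:

    import numpy as np

    def sigmoid(x):
        return 1.0 / (1.0 + np.exp(-x))

    def lstm_step(x_t, h_prev, C_prev, W_f, W_i, W_c, W_o, b_f, b_i, b_c, b_o):
        # One LSTM time step; all weights act on the concatenation [h_{t-1}, x_t].
        z = np.concatenate([h_prev, x_t])
        f_t = sigmoid(W_f @ z + b_f)          # (1) forget gate
        i_t = sigmoid(W_i @ z + b_i)          # (2) input gate
        C_tilde = np.tanh(W_c @ z + b_c)      # (3) temporary cell state
        C_t = f_t * C_prev + i_t * C_tilde    # (3) new cell state
        o_t = sigmoid(W_o @ z + b_o)          # (4) output gate
        h_t = o_t * np.tanh(C_t)              # (5) hidden state
        return h_t, C_t

    # Toy usage with random weights: input dimension 4, hidden dimension 3.
    rng = np.random.default_rng(0)
    d_in, d_h = 4, 3
    W = [rng.normal(size=(d_h, d_h + d_in)) * 0.1 for _ in range(4)]
    b = [np.zeros(d_h) for _ in range(4)]
    h, C = np.zeros(d_h), np.zeros(d_h)
    h, C = lstm_step(rng.normal(size=d_in), h, C, *W, *b)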
When the graph convolutional network (GCN) performs the aggregation operation on an entity node, the feature vectors of the node itself and of its neighbor nodes are multiplied by their corresponding parameter weights and summed, and the new node vector representation is obtained after an activation function. The per-node aggregation operation, sketched in code below, is:
N_i^l = σ( ω_ii·N_i^{l-1} + Σ_{j∈v_j} ω_ij·N_j^{l-1} / sqrt(deg_i·deg_j) ) (6)
In the above formula, i is the index of a graph entity node and j is the index of one of the neighbor nodes of the i-th entity node; N_i^l is the vector representation of the center node after the l-th layer convolution aggregation; v_j denotes the set of all neighbor nodes of the center node N_i; N_j^{l-1} is the vector representation of a neighbor node; ω_ii is the weight parameter of the center node N_i on its own value, and ω_ij is the weight parameter of the center node N_i on the neighbor node N_j; deg_i is the sum of the out-degree and in-degree of the center node, and deg_j is the sum of the out-degree and in-degree of a neighbor node.
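This per-node operation can be sketched as follows; scalar weights ω are used for readability where an actual layer would use weight matrices, so this is an assumption-laden illustration of the textbook GCN aggregation, not the patent's reference code:

    import numpy as np

    def gcn_aggregate_node(i, N_prev, neighbors, omega, deg):
        # Weighted sum of the node's own vector and its degree-normalized
        # neighbour vectors, followed by an activation (tanh here).
        agg = omega[i, i] * N_prev[i]
        for j in neighbors[i]:
            agg += omega[i, j] * N_prev[j] / np.sqrt(deg[i] * deg[j])
        return np.tanh(agg)

    # Toy usage: 3 nodes of dimension 2; node 0 is adjacent to nodes 1 and 2.
    N_prev = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
    neighbors = {0: [1, 2], 1: [0], 2: [0]}
    deg = np.array([2.0, 1.0, 1.0])
    omega = np.ones((3, 3))
    print(gcn_aggregate_node(0, N_prev, neighbors, omega, deg))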
In a GCN of network depth L, each layer has a corresponding parameter matrix ω, whose function is to give different weights to the inputs. If the graph has n nodes and the embedding dimension is d, the input of the GCN comprises an n×d node feature matrix H, an n×n adjacency matrix A and an n×n degree matrix D. The matrix expression corresponding to the GCN aggregation operation is:
H^{(l+1)} = f(H^{(l)}, A) = σ( D̂^{−1/2}·Â·D̂^{−1/2}·H^{(l)}·W^{(l)} ) (7)
where f(H^{(l)}, A) is the node information aggregation function of the l-th convolution layer and σ is the activation function; H^{(l)} is the vector representation matrix of the entity nodes, of size n×d, where n is the number of entity nodes and d is the embedding dimension; A is the n×n adjacency matrix of the entity nodes of the graph, whose entries are 1 where a connection exists between two entities and 0 otherwise; Â is obtained as A + I, where I is the identity matrix, so that compared with A the matrix Â adds the self-connection of each entity node and prevents a node's own information from being omitted during aggregation; D̂ is the diagonal matrix whose diagonal elements are the row sums of Â; and W^{(l)} is the d×d weight parameter of the layer.
When the GCN is applied to graph completion, the aggregation operation uses only the information of the center node and of its neighborhood nodes, and the important relation vector information existing between knowledge graph nodes is ignored. In order to enrich the node vector representation information and improve link prediction performance, a structure aggregation operation on the center node is proposed on the basis of the GCN aggregation:
N_i^l = σ( ω_ii·N_i^{l-1} + Σ_{j∈v_j} (ω_ij·N_j^{l-1} + c·ω_{R_j}·R_j) / sqrt(deg_i·deg_j) ) (8)
In the above formula, R_j is the relation vector between the center node and the neighborhood node N_j, ω_{R_j} is the weight parameter of that relation, and c is an identification variable that takes 1 when the relation to the neighborhood node N_j is outgoing (out-degree) and −1 when it is incoming (in-degree). In the structure aggregation, the information of the node part and of the relation part are regarded as having different degrees of importance and are weighted and summed. The matrix expression is:
H^{(l+1)} = f(H^{(l)}, A) = σ( D̂^{−1/2}·Â·D̂^{−1/2}·H^{(l)}·W^{(l)} + M·R·W_R^{(l)} ) (9)
where R is the m×d relation matrix, m being the number of relations, and W_R^{(l)} the corresponding d×d weight parameter; M is an n×m matrix whose entries are, for each entity node, the sums of the identification variables of the occurrences of each relation, i.e. M_ij = Σ c over all edges of relation j incident to entity node i.
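Under the reconstructed matrix expression (9), one structure-aggregation layer can be sketched as follows; treating A as symmetric, using ReLU as the activation σ, and the signed construction of M from the identification variable c are assumptions of this illustration:

    import numpy as np

    def build_A_M(triples, num_ent, num_rel):
        # Each integer-indexed (h, r, t) triple contributes c = +1 at the head
        # (outgoing) and c = -1 at the tail (incoming) to M.
        A = np.zeros((num_ent, num_ent))
        M = np.zeros((num_ent, num_rel))
        for h, r, t in triples:
            A[h, t] = A[t, h] = 1.0
            M[h, r] += 1.0
            M[t, r] -= 1.0
        return A, M

    def sagcn_layer(H, A, M, R, W, W_R):
        # Node part + relation part of equation (9), then an activation.
        n = A.shape[0]
        A_hat = A + np.eye(n)                  # add self-connections
        d_hat = A_hat.sum(axis=1)              # row sums of A_hat
        D_inv_sqrt = np.diag(1.0 / np.sqrt(d_hat))
        node_part = D_inv_sqrt @ A_hat @ D_inv_sqrt @ H @ W
        rel_part = M @ R @ W_R
        return np.maximum(0.0, node_part + rel_part)

    # Toy usage: 3 entities, 2 relations, embedding dimension 4.
    rng = np.random.default_rng(1)
    A, M = build_A_M([(0, 0, 1), (1, 1, 2)], num_ent=3, num_rel=2)
    H = rng.normal(size=(3, 4))
    R = rng.normal(size=(2, 4))
    W, W_R = rng.normal(size=(4, 4)), rng.normal(size=(4, 4))
    H_next = sagcn_layer(H, A, M, R, W, W_R)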
The main function of the GCN in the graph link prediction task is to embed the graph as a whole; after the embedding is completed, a scoring function must be attached to perform link prediction, as is done for link prediction with the GCN + DistMult model. In order to fully utilize the graph embedding representation obtained by the structure aggregation graph convolutional network, the invention proposes to perform the link prediction task using the cosine similarities among the head entity, relation and tail entity vectors as the scoring function, which is:
f(h, r, t) = similarity(h, r) + similarity(h, t) + similarity(r, t) (10)
The semantic feature set S obtained by the semantic feature learning module takes part in the computation of a score-adding function, which determines whether a new triplet qualifies for the added score; the score-adding function is:
g(s_new, S) = α, if ∃ s_i ∈ S such that sqrt( Σ_{j=1..n} (s_ij − s_newj)² ) < ξ; otherwise g(s_new, S) = 0 (11)
That is, the Euclidean distance between the new feature vector s_new and the vectors of the set S is computed, and if some result is smaller than the preset value ξ the added score is granted. The final score is obtained from the scoring result and the score-adding result through an activation function, as follows:
score = sigmoid( f(h, r, t) + g(s_new, S) ) (12).
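The scoring, score-adding and final-score computations of equations (10)-(12) can be sketched as follows; the values of α and ξ are placeholders, since the patent states only that they are set manually:

    import numpy as np

    def cos_sim(a, b):
        return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-9))

    def f_score(h, r, t):
        # Equation (10): sum of pairwise cosine similarities.
        return cos_sim(h, r) + cos_sim(h, t) + cos_sim(r, t)

    def g_score(s_new, S, alpha=1.0, xi=0.5):
        # Equation (11): grant the bonus alpha when some rule vector in S lies
        # within Euclidean distance xi of s_new.
        if any(np.linalg.norm(s_new - s_i) < xi for s_i in S):
            return alpha
        return 0.0

    def final_score(h, r, t, s_new, S):
        # Equation (12): sigmoid of scoring result plus score-adding result.
        x = f_score(h, r, t) + g_score(s_new, S)
        return 1.0 / (1.0 + np.exp(-x))

    # Toy usage with random embeddings and rule vectors.
    rng = np.random.default_rng(2)
    h, r, t = (rng.normal(size=4) for _ in range(3))
    S = [rng.normal(size=8) for _ in range(5)]
    print(final_score(h, r, t, rng.normal(size=8), S))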
According to the graph completion method based on the structure aggregation graph convolutional network, the knowledge graph is first embedded preliminarily by a Word2Vec model; the preliminary embedding result is then encoded by the structure aggregation graph convolutional network to achieve a deeper embedding; all new triples meeting the requirements are found in the embedding result using the scoring function and the score-adding function, where the scoring function is the proposed cosine-similarity computation f(h, r, t) and the score-adding function is the semantic rule discrimination function; finally, the new triples are added to the original graph to complete the graph. The graph completion system based on the structure aggregation graph convolutional network consists of a word embedding module, a semantic feature learning module, an encoding module and a scoring module.
The specific implementation comprises the following steps:
Step S201, obtaining a to-be-complemented knowledge graph, wherein the to-be-complemented knowledge graph consists of a plurality of triples, and each triplet consists of an entity-relation-entity;
Step S202, inputting a knowledge graph to be complemented into an LSTM network, and learning semantic connection rules of triples in the knowledge graph by the LSTM network to obtain a first semantic rule feature set S of the knowledge graph to be complemented;
step S203, inputting the knowledge graph to be complemented into a Word2Vec model, and converting the knowledge graph from Word representation to vector representation of a low-dimensional continuous space to obtain a preliminary embedded representation result;
step S204, inputting the preliminary embedding representation result into the structure aggregation graph convolutional network to perform deep embedding, wherein the deep embedding process comprises the following steps: constructing a training set, which is the preliminarily embedded knowledge graph containing the correct labels; inputting the training set into the structure aggregation graph convolutional network and training it, the network comprising L layers, each layer containing a trainable parameter ω, and the loss function being the binary cross-entropy loss:
BCELoss = −(1/N)·Σ_{i=1..N} [ y_i·log(p_i) + (1 − y_i)·log(1 − p_i) ]
where y_i is the true label of the i-th candidate and p_i its predicted score;
Step S2041, extracting, batch by batch, the head entity-relation pairs of the training set triples as the rows of a label matrix, with all entities of the training set as its columns, to obtain a label matrix of size batch × num_ent, where batch is the number of head entity-relation pairs in the training batch and num_ent is the number of training set entities; the element values of the label matrix are assigned according to the triples in the training set: each triple is split into a head entity-relation pair and a tail entity, and the element in the corresponding row and column is set to 1, while elements whose head entity-relation pair and tail entity combination does not occur in the training set are set to 0; at the same time, a candidate new triplet set composed of the head entity-relation pairs and all entities of the training set is obtained;
Step S2042, inputting the semantic representations corresponding to the new triplet set into the LSTM network to extract semantic features, obtaining a second semantic rule feature set S_new;
Step S2043, inputting the new triplet set into the scoring function f(h, r, t), and the first semantic rule feature set S together with the second semantic rule feature set S_new into the score-adding function g(s_new, S); summing the result of the scoring function f(h, r, t) and the result of the score-adding function g(s_new, S) and passing the sum through a Sigmoid function to evaluate the score of each new triplet in the candidate set, the score lying in the range [0, 1]; inputting the obtained scores and the real triplet label matrix of the training set into the BCELoss loss function to compute the error, back-propagating according to the error to update the training parameters, and training until the BCELoss value is smaller than a preset threshold, obtaining the trained structure aggregation graph convolutional network;
Step S205, inputting the knowledge graph to be complemented into the trained structure aggregation graph convolutional network, randomly combining head entity-relation pairs and taking the entities of the knowledge graph in turn as tail entities to form candidate triplet sets, and calculating the final score of the triples in each set; for each candidate triplet set built from a head entity-relation pair, taking the triplet with the highest final score to form a completion triplet set; removing from it the triples already present in the knowledge graph to be complemented to obtain the final completion triplet set; and, after manual verification, adding the correct triples of the final completion triplet set into the knowledge graph to be complemented, thereby realizing completion of the graph.
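For illustration, the following PyTorch sketch mirrors the label matrix construction of step S2041 and the error backpropagation of step S2043; the random logits stand in for the network output f(h, r, t) + g(s_new, S), so this is a schematic of the training step rather than the patent's implementation:

    import torch

    def build_label_matrix(batch_pairs, train_triples, num_ent):
        # One row per head entity-relation pair in the batch, one column per
        # entity; the entry is 1 if (h, r, t) is a training triple, else 0.
        labels = torch.zeros(len(batch_pairs), num_ent)
        triple_set = set(train_triples)
        for row, (h, r) in enumerate(batch_pairs):
            for t in range(num_ent):
                if (h, r, t) in triple_set:
                    labels[row, t] = 1.0
        return labels

    # Toy batch: 2 head-relation pairs, 3 entities (integer ids assumed).
    pairs = [(0, 0), (1, 0)]
    train_triples = [(0, 0, 1), (1, 0, 2), (0, 0, 2)]
    labels = build_label_matrix(pairs, train_triples, num_ent=3)
    logits = torch.randn(len(pairs), 3, requires_grad=True)
    scores = torch.sigmoid(logits)              # scores in [0, 1]
    loss = torch.nn.BCELoss()(scores, labels)   # binary cross-entropy error
    loss.backward()                             # backpropagation, as in S2043
    print(float(loss))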
The comparison results of the structure aggregation graph convolutional network (Structure Aggregation Graph Convolutional Network, SAGCN) completion method against existing completion methods on the datasets FB15K-237 and YAGO3-10 are as follows:
Although embodiments of the present invention have been shown and described, it will be understood by those skilled in the art that various changes, modifications, substitutions and alterations can be made therein without departing from the principles and spirit of the invention, the scope of which is defined in the appended claims and their equivalents.

Claims (6)

1. A knowledge graph completion method based on a structure aggregation graph convolutional network, characterized by comprising the following steps:
Step S201, obtaining a to-be-complemented knowledge graph, wherein the to-be-complemented knowledge graph consists of a plurality of triples, and each triplet consists of an entity-relation-entity;
Step S202, inputting a knowledge graph to be complemented into an LSTM network, and learning semantic connection rules of triples in the knowledge graph by the LSTM network to obtain a first semantic rule feature set S of the knowledge graph to be complemented;
step S203, inputting the knowledge graph to be complemented into a Word2Vec model, and converting the knowledge graph from Word representation to vector representation of a low-dimensional continuous space to obtain a preliminary embedded representation result;
step S204, inputting the preliminary embedding representation result into the structure aggregation graph convolutional network to perform deep embedding, wherein the deep embedding process comprises the following steps: constructing a training set, which is the preliminarily embedded knowledge graph containing the correct labels; inputting the training set into the structure aggregation graph convolutional network and training it, the network comprising L layers, each layer containing a trainable parameter ω, and the loss function being the binary cross-entropy loss:
BCELoss = −(1/N)·Σ_{i=1..N} [ y_i·log(p_i) + (1 − y_i)·log(1 − p_i) ]
where y_i is the true label of the i-th candidate and p_i its predicted score;
Step S2041: if the knowledge graph to be complemented has n graph nodes and the embedding dimension is d, the input of the network comprises an n×d node feature matrix H, an n×n adjacency matrix A and an n×n degree matrix D; the structure aggregation graph convolutional network performs the structure aggregation operation on the entity nodes:
N_i^l = σ( ω_ii·N_i^{l-1} + Σ_{j∈v_j} (ω_ij·N_j^{l-1} + c·ω_{R_j}·R_j) / sqrt(deg_i·deg_j) ) (7)
where i is the index of a graph entity node and j is the index of one of the neighbor nodes of the i-th entity node; N_i^l is the vector representation of the center node after the l-th layer convolution aggregation; v_j denotes the set of all neighbor nodes of the center node N_i; N_j^{l-1} is the vector representation of a neighbor node; R_j is the relation vector between the center node N_i and the neighbor node N_j; ω_{R_j} is the weight parameter of the corresponding relation vector R_j; ω_ii is the weight parameter of the center node N_i on its own value; ω_ij is the weight parameter of the center node N_i on the neighbor node N_j; c is an identification variable that takes 1 when the relation to the neighbor node N_j is outgoing (out-degree) and −1 when it is incoming (in-degree); deg_i is the sum of the out-degree and in-degree of the center node, and deg_j is the sum of the out-degree and in-degree of a neighbor node;
In the structure aggregation graph convolutional network, the information of the node part and of the relation part have different degrees of importance and are therefore weighted and summed; the matrix expression is:
H^{(l+1)} = f(H^{(l)}, A) (8)
f(H^{(l)}, A) = σ( D̂^{−1/2}·Â·D̂^{−1/2}·H^{(l)}·W^{(l)} + M·R·W_R^{(l)} ) (9)
where f(H^{(l)}, A) is the node information aggregation function of the l-th convolution layer and σ is the activation function; H^{(l)} is the vector representation matrix of the entity nodes, of size n×d, where n is the number of entity nodes and d is the embedding dimension; A is the n×n adjacency matrix of the entity nodes of the graph, whose entries are 1 where a connection exists between two entities and 0 otherwise; R is the m×d relation matrix, where m is the number of relations and d, the dimension of the relation embedding vectors, is the same as the embedding dimension of the entity nodes; M is an n×m matrix in which each row corresponds to an entity node i and each column to a relation j, its element values recording the number of occurrences of relation j at entity i; Â is obtained as A + I, where I is the identity matrix, so that compared with A the matrix Â adds the self-connection of each entity node and prevents a node's own information from being omitted during aggregation; D̂ is the diagonal matrix whose diagonal elements are the row sums of Â; and W^{(l)}, W_R^{(l)} are the d×d weight parameters for the information aggregation of the entity part and the relation part respectively;
Step S2042, extracting, batch by batch, the head entity-relation pairs of the training set triples as the rows of a label matrix, with all entities of the training set as its columns, to obtain a label matrix of size batch × num_ent, where batch is the number of head entity-relation pairs in the training batch and num_ent is the number of training set entities; the element values of the label matrix are assigned according to the triples in the training set: each triple is split into a head entity-relation pair and a tail entity, and the element in the corresponding row and column is set to 1, while elements whose head entity-relation pair and tail entity combination does not occur in the training set are set to 0; at the same time, a candidate new triplet set composed of the head entity-relation pairs and all entities of the training set is obtained;
Step S2043, inputting the semantic representations corresponding to the new triplet set into the LSTM network to extract semantic features, obtaining a second semantic rule feature set S_new;
Step S2044, inputting the new triplet set into the scoring function f(h, r, t), and the first semantic rule feature set S together with the second semantic rule feature set S_new into the score-adding function g(s_new, S); summing the result of the scoring function f(h, r, t) and the result of the score-adding function g(s_new, S) and passing the sum through a Sigmoid function to evaluate the score of each new triplet in the candidate set, the score lying in the range [0, 1]; inputting the obtained scores and the real triplet label matrix of the training set into the BCELoss loss function to compute the error, back-propagating according to the error to update the training parameters, and training until the BCELoss value is smaller than a preset threshold, obtaining the trained structure aggregation graph convolutional network;
Step S205, inputting the knowledge graph to be complemented into the trained structure aggregation graph convolutional network, randomly combining head entity-relation pairs and taking the entities of the knowledge graph in turn as tail entities to form candidate triplet sets, and calculating the final score of the triples in each set; for each candidate triplet set built from a head entity-relation pair, taking the triplet with the highest final score to form a completion triplet set; removing from it the triples already present in the knowledge graph to be complemented to obtain the final completion triplet set; and, after manual verification, adding the correct triples of the final completion triplet set into the knowledge graph to be complemented, thereby realizing completion of the graph.
2. The knowledge graph completion method based on a structure aggregation graph convolutional network of claim 1, wherein the LSTM network comprises an input x_t, an output gate o_t, a hidden layer state h_t, a forget gate f_t, an input gate i_t, a cell state C_t and a temporary cell state C̃_t; the final result is obtained through cyclic computation, by memorizing useful information and forgetting useless information;
the forget gate is calculated as:
f_t = σ(W_f·[h_{t-1}, x_t] + b_f) (1)
the input gate:
i_t = σ(W_i·[h_{t-1}, x_t] + b_i) (2)
the new cell state:
C̃_t = tanh(W_c·[h_{t-1}, x_t] + b_c) (3)
C_t = f_t * C_{t-1} + i_t * C̃_t (4)
the output gate:
o_t = σ(W_o·[h_{t-1}, x_t] + b_o) (5)
h_t = o_t * tanh(C_t) (6)
Obtaining a semantic feature set S through LSTM network learning;
where f_t, i_t, o_t are the computation results of the forget gate, input gate and output gate respectively; σ is the activation function; h_{t-1}, h_t are the hidden layer states at the previous and the current time step respectively; W_f, W_i, W_o, W_c are the weight parameters of the forget gate, input gate, output gate and new cell state acting on the current input x_t and the hidden state output h_{t-1} of the previous unit; b_f, b_i, b_o, b_c are the bias vectors corresponding to those weight parameters; tanh is the hyperbolic tangent function; and C_{t-1}, C_t, C̃_t are the cell state at the previous time step, the cell state at the current time step and the temporary cell state at the current time step respectively.
3. The knowledge graph completion method based on a structure aggregation graph convolutional network of claim 1, wherein in step S2044 the scoring function is:
f(h,r,t)=similarity(h,r)+similarity(h,t)+similarity(r,t) (10)
Wherein f (h, r, t) is a scoring function, h is a head entity embedding vector, r is a relation embedding vector, and t is a tail entity embedding vector;
The semantic feature set S obtained by the semantic feature learning module takes part in the computation of a score-adding function, which determines whether a new triplet qualifies for the added score; the score-adding function is:
g(s_new, S) = α, if ∃ s_i ∈ S such that sqrt( Σ_{j=1..n} (s_ij − s_newj)² ) < ξ (11)
g(s_new, S) = 0, otherwise (12)
where g(s_new, S) is the score-adding function and the mathematical symbol ∃ indicates existence; s_new is a feature vector of the second semantic feature set S_new; s_ij is the j-th component of the feature vector s_i in the first semantic rule feature set S, and s_newj is the j-th component of the feature vector s_new; n is the length of the feature vectors; α and ξ are the added score value and the score-adding threshold respectively, both set manually;
That is, the Euclidean distance between the feature vector s_new and the vectors of the first semantic rule feature set S is computed, and if some result is smaller than the preset value ξ the added score is granted; the final score is obtained from the scoring result and the score-adding result through an activation function, as follows:
score = sigmoid( f(h, r, t) + g(s_new, S) ) (13).
4. A knowledge graph completion system based on a structure aggregation graph convolutional network, characterized in that the system is used for realizing the knowledge graph completion method based on the structure aggregation graph convolutional network of any one of claims 1-3, and comprises a word embedding module, a semantic feature learning module, an encoding module and a scoring module;
the word embedding module is used for converting the knowledge graph word representation into a vector representation of a low-dimensional continuous space;
the semantic feature learning module is used for extracting a semantic rule feature set in the knowledge graph triplet;
the encoding module is the structure aggregation graph convolutional network and is used for obtaining vector representations with structural features;
The scoring module consists of a scoring function and a score-adding function; the score-adding function does not take part in triplet scoring during iterative training, while during graph completion the scoring function and the score-adding function jointly score each new triplet to evaluate whether the new triplet holds.
5. The knowledge graph completion system based on a structure aggregation graph convolutional network of claim 4, wherein the word embedding module is a Word2Vec model; the knowledge graph to be complemented is input into the Word2Vec model and converted from word representation into a vector representation of a low-dimensional continuous space, obtaining a preliminary embedding representation result.
6. The knowledge graph completion system based on a structure aggregation graph convolutional network of claim 4, wherein the semantic feature learning module is an LSTM network; the knowledge graph to be complemented is input into the LSTM network, which learns the semantic connection rules of the triples in the knowledge graph to obtain the semantic rule feature set S of the knowledge graph to be complemented.
CN202310385458.2A 2023-04-12 2023-04-12 Knowledge graph completion method and system based on structure aggregation graph convolutional network Active CN116402133B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310385458.2A CN116402133B (en) 2023-04-12 2023-04-12 Knowledge graph completion method and system based on structure aggregation graph convolutional network

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310385458.2A CN116402133B (en) 2023-04-12 2023-04-12 Knowledge graph completion method and system based on structure aggregation graph convolutional network

Publications (2)

Publication Number Publication Date
CN116402133A CN116402133A (en) 2023-07-07
CN116402133B (en) 2024-04-30

Family

ID=87010007

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310385458.2A Active CN116402133B (en) 2023-04-12 2023-04-12 Knowledge graph completion method and system based on structure aggregation graph convolutional network

Country Status (1)

Country Link
CN (1) CN116402133B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117453923B (en) * 2023-08-30 2024-03-19 广东电白建设集团有限公司 Method for optimizing relation between construction site construction equipment and building facilities
CN117273129B (en) * 2023-10-11 2024-04-05 上海峻思寰宇数据科技有限公司 Behavior pattern creation and generation method and system

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110347847A (en) * 2019-07-22 2019-10-18 西南交通大学 Knowledge mapping complementing method neural network based
CN112818690A (en) * 2021-01-22 2021-05-18 润联软件系统(深圳)有限公司 Semantic recognition method and device combined with knowledge graph entity information and related equipment
CN114444694A (en) * 2022-01-21 2022-05-06 重庆邮电大学 Open world knowledge graph complementing method and device
CN114443862A (en) * 2022-01-28 2022-05-06 齐鲁工业大学 Knowledge graph completion method and system based on weighted graph convolution network
CN114610900A (en) * 2022-03-14 2022-06-10 上海交通大学 Knowledge graph complementing method and system
CN115879547A (en) * 2022-12-14 2023-03-31 山东省计算中心(国家超级计算济南中心) Open world knowledge graph complementing method and system based on LSTM and attention mechanism

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
GRL: Knowledge graph completion with GAN-based reinforcement learning; Qi Wang et al.; Knowledge-Based Systems; 2020-12-31; pp. 1-8 *
A survey of knowledge graph completion using graph neural networks (in Chinese); qiuqiu-offer; https://blog.csdn.net/qq_37492509/article/details/111415443; 2020-12-20; pp. 1-5 *

Also Published As

Publication number Publication date
CN116402133A (en) 2023-07-07

Similar Documents

Publication Publication Date Title
CN116402133B (en) Knowledge graph completion method and system based on structure aggregation graph convolutional network
CN110347847B (en) Knowledge graph complementing method based on neural network
CN111291836B (en) Method for generating student network model
CN108133038B (en) Entity level emotion classification system and method based on dynamic memory network
CN111274800B (en) Inference type reading understanding method based on relational graph convolution network
CN113190688A (en) Complex network link prediction method and system based on logical reasoning and graph convolution
CN113628059B (en) Associated user identification method and device based on multi-layer diagram attention network
CN113610235B (en) Adaptive learning support device and method based on depth knowledge tracking
CN113344053B (en) Knowledge tracking method based on examination question different composition representation and learner embedding
CN112464004A (en) Multi-view depth generation image clustering method
CN113361685B (en) Knowledge tracking method and system based on learner knowledge state evolution expression
CN113190654A (en) Knowledge graph complementing method based on entity joint embedding and probability model
CN114491039B (en) Primitive learning few-sample text classification method based on gradient improvement
CN112116137A (en) Student class dropping prediction method based on mixed deep neural network
CN116383401A (en) Knowledge graph completion method integrating text description and graph convolution mechanism
CN117201122A (en) Unsupervised attribute network anomaly detection method and system based on view level graph comparison learning
CN115544158A (en) Multi-knowledge-point dynamic knowledge tracking method applied to intelligent education system
CN115170874A (en) Self-distillation implementation method based on decoupling distillation loss
CN113283488B (en) Learning behavior-based cognitive diagnosis method and system
CN111985560B (en) Knowledge tracking model optimization method, system and computer storage medium
CN113194493A (en) Wireless network data missing attribute recovery method and device based on graph neural network
CN112329918A (en) Anti-regularization network embedding method based on attention mechanism
CN111126860B (en) Task allocation method, task allocation device and electronic equipment
CN114971066A (en) Knowledge tracking method and system integrating forgetting factor and learning ability
CN114139674A (en) Behavior cloning method, electronic device, storage medium, and program product

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant