CN115795042A - Knowledge graph completion method based on path and graph context - Google Patents

Knowledge graph completion method based on path and graph context

Info

Publication number: CN115795042A
Authority: CN (China)
Prior art keywords: entity, path, graph, relationship, relation
Legal status: Withdrawn
Application number: CN202211169234.XA
Other languages: Chinese (zh)
Inventor
陆佳炜
朱昊天
李家朋
王琪冰
肖刚
李琛
徐俊
程振波
吴俚达
王志鹏
Current Assignee: China Jiliang University
Original Assignee: China Jiliang University
Application filed by China Jiliang University
Priority to: CN202211169234.XA
Publication of: CN115795042A
Legal status: Withdrawn

Landscapes

  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

A knowledge graph completion method based on paths and graph contexts performs knowledge graph completion by link prediction: from the existing triples (e_h, r, e_j), it infers whether a candidate triple (e_h, r, e_t) holds, i.e. whether the entities e_h and e_t can be linked by the relation r. The set of paths P between the head entity e_h and the tail entity e_t is the input, and the probability P(r|e_h, e_t) that e_h and e_t are linked by the relation r is the output. If P(r|e_h, e_t) is greater than the hyper-parameter set for the model, e_h and e_t are considered linkable by the relation r, and the triple (e_h, r, e_t) holds. High-confidence triples are added to the knowledge graph, further completing it. The invention improves the completion result of the knowledge graph.

Description

Knowledge graph completion method based on path and graph context
Technical Field
The invention relates to a knowledge graph completion method based on paths and graph contexts.
Background
On 17 May 2012, Google formally proposed the concept of the Knowledge Graph, originally aimed at optimizing the results returned by search engines and enhancing users' search quality and experience. The knowledge graph is not a completely new concept; the concept of the Semantic Network was proposed in the literature as early as 2006. A knowledge graph is a carrier for storing structured, objective factual information about people, events and objects in the real world. It is generally represented with triples as the basic structure, where each triple (h, r, t) comprises a head entity h, a tail entity t and the relation r between them.
In recent years, researchers have constructed a variety of large-scale knowledge graphs, such as Wikidata and YAGO. Although these have achieved significant performance in many areas, insufficient knowledge coverage remains a troublesome problem in practical applications. Learning new knowledge from the existing knowledge to complete the knowledge graph has therefore become an effective means, and the completion process tests a model's reasoning ability. Knowledge graph completion refers to completing the entities, relations, attributes and attribute values in a knowledge graph, and is also referred to as link prediction on the knowledge graph.
TransE is one of the earliest methods to perform link prediction by vector embedding: entities and relations are projected into a low-dimensional continuous vector space, and the vector representation of a relation is treated as a translation from the head entity vector to the tail entity vector. Experimental results on the datasets FB15K and WN18 show that TransE performs well, particularly on large-scale knowledge graphs. However, the TransE model is too simple and is clearly deficient in modeling complex relation types. Wang et al. proposed the TransH model, which holds that an entity should have different embedded representations under the constraints of different relations. In TransH, the head and tail entities are projected onto the hyperplane where the relation lies, and the relation is treated as a translation vector between the projected vectors, which allows an entity to have different vector representations under different relations.
With the development of knowledge graph technology, more and more researchers have realized that focusing on the triples themselves is not enough, and that paths should occupy an important position in knowledge graph link prediction. In a knowledge graph there are a large number of multi-step relation paths between entities, reflecting the diverse semantic relationships among them. A relation path not only embodies the sequential relationship among the relations in the knowledge graph, but also exhibits complex reasoning patterns. Path-based methods originally used every path in the knowledge graph for link prediction, as in the Path Ranking Algorithm (PRA) proposed by Lao et al. PRA performs random walks on the directed graph between the head entity e_h and the tail entity e_t, recording the link relations between them. After obtaining a relation path, it performs random walks from the head entity e_h along the relations in the path and calculates the probability of reaching e_t. Once PRA has computed the random-walk probabilities of all relation paths, the relation paths with higher probabilities are selected as potential path features. PRA treats each path as an atomic feature, so each classifier needs to train a feature matrix containing millions of paths; as the number of relations grows, the number of relation paths to be trained increases sharply, multiplying the difficulty and complexity of the computation.
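For concreteness, the following minimal Python sketch illustrates the PRA random-walk feature described above; it is an illustration of the idea, not code from the patent, and the graph encoding and names are assumptions.

```python
# Minimal sketch of the PRA idea: the probability of reaching e_t from e_h
# by following a fixed relation sequence (a "path type") is used as a
# feature for that entity pair.
from collections import defaultdict

def path_feature(graph, e_h, e_t, relation_path):
    """graph: dict mapping (entity, relation) -> list of neighbor entities.
    Returns P(reach e_t | start at e_h, follow relation_path uniformly)."""
    probs = {e_h: 1.0}                       # distribution over current entities
    for r in relation_path:
        nxt = defaultdict(float)
        for e, p in probs.items():
            neighbors = graph.get((e, r), [])
            for n in neighbors:              # uniform random step along r
                nxt[n] += p / len(neighbors)
        probs = nxt
    return probs.get(e_t, 0.0)

# toy usage, mirroring the beverage example discussed later in the text
g = {("drinking_water", "-contain"): ["glass_bottle"],
     ("glass_bottle", "contain"): ["red_wine"],
     ("red_wine", "for"): ["drink"]}
print(path_feature(g, "drinking_water", "drink", ["-contain", "contain", "for"]))  # 1.0
```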
Disclosure of Invention
In order to overcome the defects of the prior art, the invention provides a knowledge graph completion method based on paths and graph contexts, which performs knowledge graph completion by link prediction. From the existing triples (e_h, r, e_j), it infers whether a candidate triple (e_h, r, e_t) holds, i.e. whether the entities e_h and e_t can be linked by the relation r. The set of paths P between the head entity e_h and the tail entity e_t is the input, and the probability P(r|e_h, e_t) that e_h and e_t are linked by the relation r is the output. If P(r|e_h, e_t) is greater than the hyper-parameter set for the model, e_h and e_t are considered linkable by the relation r, and the triple (e_h, r, e_t) holds. High-confidence triples are added to the knowledge graph, further completing it.
In order to solve the above technical problems, the invention provides the following technical solution:
a knowledge graph completing method based on paths and graph contexts comprises the following steps:
the first step is as follows: constructing a link prediction model, wherein the completion task is to predict missing link relation in an incomplete knowledge graph;
secondly, performing path embedding expression on an embedding layer in the link prediction model;
thirdly, combining graph contexts in a graph context combination layer in the link prediction model, wherein the knowledge graph is regarded as a directed graph, and the graph context structure is divided into two types for the directed graph: one is a graph context structure formed by neighbor entities and relations entering the target entity, and the other is a graph context structure formed by neighbor entities and relations leaving from the target entity; the two context structures are mapped to a knowledge graph and defined as an entity-relationship upper pair and a relationship-entity lower pair, the upper information and the lower information of a target entity are respectively embodied, and given a knowledge graph G = (E, R, T), E, R and T respectively represent an entity set, a relationship set and a regular three-tuple set;
fourthly, constructing a bidirectional LSTM network in a Bi-LSTM layer in the link prediction model;
fifthly, integrating path modes in an attention layer in a link prediction model, wherein the paths contain all information of entities and relations, but not all paths can be used for representing the relation r needing to be predicted between the entities, so that an attention mechanism is introduced to measure the importance of the paths to path combination, and the path modes are integrated in a weighted summation mode;
sixthly, outputting at the output layer of the link prediction model: since the full-path pattern representation h* covers all paths, contains the relation and entity-type information, and has the right abstraction level, the probability P(r|e_h, e_t) that the head entity e_h and the tail entity e_t can be linked by the relation r is computed from h*;
wherein the probability P(r|e_h, e_t) is calculated by a linear function f_p followed by a sigmoid activation function:
P(r|e_h, e_t) = sigmoid(f_p(h*))
seventhly, defining the objective function, the goal being to minimize the negative logarithm L of the link probability; for each relation r_j to be predicted, the triples in the training set are divided into two parts, a positive-example triple set T and a negative-example triple set F, each triple in the sets contains the relation r_j, and the number of triples in the training set is denoted N; the objective function is:
L = -(1/N) [ Σ_{(e_h, r_j, e_t) ∈ T} log P(r_j | e_h, e_t) + Σ_{(e_h, r_j, e_t) ∈ F} log(1 − P(r_j | e_h, e_t)) ]
the training goal of this objective function is to train the model to yield higher values on correct triples and lower values on missing or incorrect triples while minimizing the overall error.
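As a concrete illustration, a minimal sketch of this objective follows, assuming the model outputs P(r_j | e_h, e_t) for each training triple; the function and variable names are illustrative.

```python
# Sketch of the seventh-step objective: negative log link probability over
# the positive set T (label 1) and negative set F (label 0), averaged over N.
import torch

def link_loss(probs: torch.Tensor, labels: torch.Tensor) -> torch.Tensor:
    """probs: model outputs P(r_j | e_h, e_t); labels: 1 for T, 0 for F."""
    eps = 1e-9                                        # numerical stability
    pos = labels * torch.log(probs + eps)             # triples in T
    neg = (1 - labels) * torch.log(1 - probs + eps)   # triples in F
    return -(pos + neg).mean()

probs = torch.tensor([0.9, 0.2, 0.7])    # model outputs for 3 triples
labels = torch.tensor([1.0, 0.0, 1.0])   # T, F, T
print(link_loss(probs, labels))
```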
Further, in the first step, the link prediction model is divided into the following five layers:
(1) Embedding layer: performs path embedding; one path is selected from the path set P formed by triples, and the entities and relations on the path are input into the embedding layer;
(2) Graph context combination layer: the vector embeddings of the entities and relations are combined into entity-relation above pairs and relation-entity below pairs according to the order of the entities and relations in the path;
(3) Bi-LSTM layer: the results of the combination layer are input into a Bi-LSTM network in sequence, while the current hidden state of each LSTM unit is stored;
(4) Attention layer: the forward LSTM output and the backward LSTM output are combined into a vector matrix;
(5) Output layer: the full-path pattern is input into a linear feed-forward network to obtain the probability P(r|e_h, e_t) that e_h and e_t can be linked by the relation r, where P(r|e_h, e_t) denotes the conditional probability of the relation r given the joint distribution of e_h and e_t.
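The following is a schematic PyTorch sketch of this five-layer architecture. Layer internals, dimensions and pooling choices are illustrative assumptions, not the patent's exact implementation; the below-pair branch of the Bi-LSTM layer is omitted for brevity.

```python
# Schematic sketch of the five-layer link prediction model described above.
import torch
import torch.nn as nn

class PathLinkPredictor(nn.Module):
    def __init__(self, dim: int = 100):
        super().__init__()
        self.f_feature = nn.Sequential(nn.Linear(2 * dim, dim), nn.ReLU())  # pair -> feature
        self.bilstm = nn.LSTM(dim, dim, bidirectional=True, batch_first=True)
        self.f_pattern = nn.Linear(4 * dim, 1)   # similarity of two path patterns
        self.f_p = nn.Linear(2 * dim, 1)         # output layer

    def forward(self, paths, query_path):
        # paths: list of above-pair tensors, each [n_i, 2*dim]; layers (1)-(2)
        # (embedding and pair construction) are assumed to happen upstream.
        patterns = []
        for above in paths:
            fa = self.f_feature(above).unsqueeze(0)
            out, _ = self.bilstm(fa)                                  # layer (3)
            patterns.append(torch.tanh(out).mean(dim=1).squeeze(0))   # pattern rep.
        q_fa = self.f_feature(query_path).unsqueeze(0)
        q = torch.tanh(self.bilstm(q_fa)[0]).mean(dim=1).squeeze(0)
        scores = torch.stack([self.f_pattern(torch.cat([p, q])) for p in patterns])
        alpha = torch.softmax(scores, dim=0)                          # layer (4)
        h_star = (alpha * torch.stack(patterns)).sum(dim=0)           # full-path pattern
        return torch.sigmoid(self.f_p(h_star))                       # layer (5)

model = PathLinkPredictor()
print(model([torch.randn(4, 200), torch.randn(3, 200)], torch.randn(2, 200)))
```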
Still further, the process of the second step is as follows:
step (2.1), given the uncertain triples X in a knowledge graph, where X = {(e_h, r, e_t) | e_h, e_t ∈ E ∧ r ∈ R}, E denotes the entity set in the knowledge graph and R the relation set;
step (2.2), in the knowledge graph there are one or more paths between e_h and e_t; let p* denote the set of paths between e_h and e_t, p* = {p_1, p_2, ..., p_s}, where s denotes the number of paths;
step (2.3), one of the paths p_i can be represented as p_i = {e_1, r_1, e_2, r_2, ..., r_{n-1}, e_n}, where e_1 = e_h, e_n = e_t, and n denotes the number of entities on the path; the length of the path is defined as the number of relations in it, so n−1 denotes the length of the path and r_{n-1} the (n−1)-th relation in the path;
step (2.4), the relation r satisfies the condition that r must be among the relations r_j on the path p_i; each entity e_k in the path p_i = {e_1, r_1, e_2, r_2, ..., r_{n-1}, e_n} has a hierarchical set of abstract types T_k = {t_{k,1}, t_{k,2}, ..., t_{k,l}}, where t_{k,l} denotes the most abstract type of entity e_k and l the height of the abstraction hierarchy;
step (2.5), when selecting a possible type to represent an entity: each entity has a large number of abstract types, and traversing every combination of abstract types exhaustively is difficult and would create a heavy computational demand; a soft attention mechanism is therefore used to obtain the appropriate abstract type of each entity; when selecting information, instead of picking only one item from the set, the weighted average of all input information in the set is computed and then fed into the Bi-LSTM network;
step (2.6), the skip-gram model of Word2vec is used to train each abstract type t_{k,m} in the abstract type set T_k of entity e_k (m denotes the m-th element of the set), yielding pre-trained vector embeddings and the type vector set V_k = {v_k(t_{k,1}), v_k(t_{k,2}), ..., v_k(t_{k,l})} of entity e_k;
step (2.7), the skip-gram model of Word2vec is used to train each relation r_j in the path p_i = {e_1, r_1, e_2, r_2, ..., r_{n-1}, e_n}, yielding the pre-trained relation vector embedding vr_j;
step (2.8), each vector embedding of an abstract type in the set V_k has a weight α_{k,m}, which denotes the likelihood that t_{k,m} properly represents the entity e_k, i.e. the probability that t_{k,m} is the correct abstraction level; for this weight α_{k,m} a trainable relation-dependent vector u* is introduced and computed through a linear feed-forward network f_type, where the relation-dependent vector u* is obtained by vector addition of the previous relation r_{k-1} and the next relation r_k connected to the entity e_k in the path p_i, and exp denotes the exponential function with the natural constant e as base; the calculation formulas are:
e_{k,m} = f_type(v_k(t_{k,m}), u*)
α_{k,m} = exp(e_{k,m}) / Σ_{m'=1}^{l} exp(e_{k,m'})
step (2.9), α_{k,m} is combined with the corresponding type-level vector, and the results are summed to obtain a type context vector c_k that fully expresses every type level while highlighting the correct abstraction level; c_k serves as the vector-embedded representation of entity e_k and is calculated as:
c_k = Σ_{m=1}^{l} α_{k,m} · v_k(t_{k,m})
thus the path vector set V_{p_i} = {c_1, vr_1, c_2, vr_2, ..., vr_{n-1}, c_n} of the path p_i is obtained.
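A minimal sketch of the soft attention over abstract types in steps (2.8)-(2.9) follows; it assumes f_type scores a type embedding against the relation-dependent vector u* = vr_{k-1} + vr_k, and the symbol names follow the reconstruction above.

```python
# Soft attention over an entity's abstract types: score each type embedding
# against u*, normalize with softmax, and form the type context vector c_k.
import torch
import torch.nn as nn

dim = 100
f_type = nn.Linear(2 * dim, 1)                 # the linear feed-forward network

def type_context_vector(type_vecs: torch.Tensor,   # V_k: [l, dim]
                        vr_prev: torch.Tensor,     # vr_{k-1}: [dim]
                        vr_next: torch.Tensor):    # vr_k: [dim]
    u_star = vr_prev + vr_next                     # relation-dependent vector u*
    scores = f_type(torch.cat(
        [type_vecs, u_star.expand_as(type_vecs)], dim=-1))  # e_{k,m}: [l, 1]
    alpha = torch.softmax(scores, dim=0)           # α_{k,m} via exp-normalization
    return (alpha * type_vecs).sum(dim=0)          # c_k: weighted type context

c_k = type_context_vector(torch.randn(7, dim), torch.randn(dim), torch.randn(dim))
print(c_k.shape)  # torch.Size([100])
```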
Further, the process of the third step is as follows:
step (3.1), generating the entity-relation above pairs formed by the neighbor entities and relations entering the target entity;
step (3.2), generating the relation-entity below pairs formed by the neighbor entities and relations leaving the target entity;
step (3.3), the vector embeddings in the path vector set V_{p_i} are combined into context structures, obtaining the set of entity-relation above pairs:
C_ERAP = {(u, u), (c_1, vr_1), ..., (c_{n-1}, vr_{n-1})}
and the set of relation-entity below pairs:
C_REBP = {(vr_1, c_2), ..., (vr_{n-1}, c_n), (u, u)}
since there are no further entities and relations connected before the head entity e_1 and after the tail entity e_n, a zero vector u (a vector of modulus zero) is used for filling, thereby completing the context structures;
step (3.4), the entity-relation above pairs and relation-entity below pairs are input into a feed-forward network to obtain the feature vector of each pair; the feature vectors are calculated as:
a_j = f_feature(c^{ERAP}_j)
b_j = f_feature(c^{REBP}_j)
where c^{ERAP}_j denotes the j-th entity-relation above pair in the set C_ERAP, c^{REBP}_j denotes the j-th relation-entity below pair in the set C_REBP, and the feed-forward network f_feature is a non-linear activation function comprising ReLU operations; this yields the feature vector set A = {a_1, a_2, ..., a_n} of entity-relation above pairs and the feature vector set B = {b_1, b_2, ..., b_n} of relation-entity below pairs.
Preferably, the process of step (3.1) is as follows:
step (3.1.1), entity-relation above pair:
C_ERAP(e) = {(h, r) | (h, r, e) ∈ T}
where r is a relation pointing to the entity e, and e is an entity to which the entity h is directly related through the relation r; that is, in every triple whose tail entity is e, the head entity h and relation r are treated as the entity-relation above pair (h, r);
step (3.1.2), in a path, the entity-relation above pairs appear as a series of consecutive above pairs; the next above pair takes as head entity the tail entity e_i of the triple (e_{i-1}, r_{i-1}, e_i) corresponding to the previous above pair (e_{i-1}, r_{i-1}); following the order of entities and relations along the path, the entity e_i, its next connected relation r_i and the entity e_{i+1} form a new triple, and in the triple (e_i, r_i, e_{i+1}) the head entity e_i and relation r_i form the entity-relation above pair (e_i, r_i) of the tail entity e_{i+1}.
Still preferably, the process of step (3.2) is as follows:
step (3.2.1), relation-entity below pair:
C_REBP(e) = {(r, t) | (e, r, t) ∈ T}
that is, in every triple whose head entity is e, the relation r and tail entity t are treated as the relation-entity below pair (r, t);
step (3.2.2), in a path, the relation-entity below pairs appear as a series of consecutive below pairs; the next below pair takes as head entity the tail entity e_i of the previous below pair; following the order of entities and relations along the path, the entity e_i, its next connected relation r_i and the entity e_{i+1} form a new triple, and in the triple (e_i, r_i, e_{i+1}) the relation r_i and tail entity e_{i+1} form the relation-entity below pair of the head entity e_i.
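A sketch of the graph-context combination layer follows, based on the reconstruction above: entity e_k's above pair is (c_{k-1}, vr_{k-1}) and its below pair is (vr_k, c_{k+1}), zero-padded at the two ends of the path; the concatenation of each pair before f_feature is an assumption.

```python
# Build the C_ERAP and C_REBP structures for one path from the type context
# vectors c_1..c_n and relation embeddings vr_1..vr_{n-1}.
import torch

def build_context_pairs(c: list, vr: list, dim: int = 100):
    """c: entity type-context vectors [c_1..c_n]; vr: relation vectors [vr_1..vr_{n-1}]."""
    zero = torch.zeros(dim)                            # the zero vector u
    above = [(zero, zero)] + list(zip(c[:-1], vr))     # C_ERAP, one pair per entity
    below = list(zip(vr, c[1:])) + [(zero, zero)]      # C_REBP, one pair per entity
    # each pair is concatenated before the feed-forward network f_feature
    above = torch.stack([torch.cat(p) for p in above])  # [n, 2*dim]
    below = torch.stack([torch.cat(p) for p in below])  # [n, 2*dim]
    return above, below

n, dim = 4, 100
c = [torch.randn(dim) for _ in range(n)]
vr = [torch.randn(dim) for _ in range(n - 1)]
above, below = build_context_pairs(c, vr, dim)
print(above.shape, below.shape)  # torch.Size([4, 200]) torch.Size([4, 200])
```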
The process of the fourth step is as follows:
step (4.1), the feature vectors in the set A = {a_1, ..., a_n} are input into the forward LSTM sequence in order, and the feature vectors in the set B = {b_1, ..., b_n} are input into the backward LSTM sequence in reverse order;
step (4.2), the current hidden states h→_j and h←_j in the LSTM are obtained;
step (4.3), after combining the j-th hidden state of the forward LSTM sequence with the corresponding hidden state of the backward LSTM sequence, all the information of the j-th entity on the i-th path p_i is obtained, together with all the information of the two most closely related relations before and after it; that is, h_j represents the entity e_j and its context information; while representing the embedded information of entities and relations, h_j retains the semantic structure of entity e_j:
h_j = [h→_j ; h←_j]
step (4.4), the path pattern is composed of a number of consecutive h_j; each h_j is in fact a fragment of the path pattern, and the vector set combining all entities of the i-th path p_i with their context information is obtained: h_{p_i} = {h_1, h_2, ..., h_{n-1}, h_n}.
The process of step (4.2) is as follows:
step (4.2.1), during the LSTM calculation, the current hidden state of each LSTM unit is retained; the current hidden state of the j-th LSTM unit of the forward LSTM sequence is denoted h→_j, and that of the backward sequence h←_j;
step (4.2.2), when the set A = {a_1, ..., a_n} is input into the forward LSTM sequence in order, the set of current hidden states of the forward sequence is obtained:
H→ = {h→_1, h→_2, ..., h→_n}
while the set B = {b_1, ..., b_n} is input into the backward LSTM sequence in reverse order, i.e. b_n is the first input of the backward LSTM sequence, yielding the set of current hidden states of the backward sequence:
H← = {h←_1, h←_2, ..., h←_n}
the current hidden states h→_j and h←_j are calculated as follows:
h→_j = LSTM(a_j, h→_{j-1})
h←_j = LSTM(b_j, h←_{j+1})
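A minimal sketch of this Bi-LSTM layer follows; separate forward and backward LSTMs are an assumption consistent with the distinct inputs A and B, and dimensions are illustrative.

```python
# Bi-LSTM layer: A feeds the forward direction, B (reversed) feeds the
# backward direction; each h_j concatenates the two aligned hidden states.
import torch
import torch.nn as nn

dim = 100
fwd = nn.LSTM(2 * dim, dim, batch_first=True)   # consumes A in order
bwd = nn.LSTM(2 * dim, dim, batch_first=True)   # consumes B in reverse order

def bilstm_states(above: torch.Tensor, below: torch.Tensor) -> torch.Tensor:
    """above, below: [n, 2*dim] feature matrices. Returns h_{p_i}: [n, 2*dim]."""
    h_fwd, _ = fwd(above.unsqueeze(0))                        # [1, n, dim]
    h_bwd, _ = bwd(torch.flip(below, dims=[0]).unsqueeze(0))  # reversed input
    h_bwd = torch.flip(h_bwd, dims=[1])                       # re-align to positions
    return torch.cat([h_fwd, h_bwd], dim=-1).squeeze(0)       # h_j = [h→_j ; h←_j]

h_pi = bilstm_states(torch.randn(4, 2 * dim), torch.randn(4, 2 * dim))
print(h_pi.shape)  # torch.Size([4, 200])
```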
The process of the fifth step is as follows:
step (5.1), the triple (e_h, r, e_t) to be predicted also enters the model; the path of this triple has only one element, p_r = {e_h, r, e_t};
step (5.2), the vector set combining all entities in the path of step (5.1) with their context information is obtained: h_{p_r} = {h_e, h_t};
step (5.3), the path pattern is formed by combining all entities in the path with their context information; the vector embeddings in the vector set h_{p_i} are spliced into a vector matrix H;
step (5.4), after processing the vector matrix H with the activation function tanh, the path pattern representation ĥ_{p_i} of the path p_i is obtained; the tanh function is a variant of the sigmoid function and a common activation function in neural networks;
step (5.5), in the same way as step (5.4), the path pattern representation ĥ_{p_r} of the test triple (e_h, r, e_t) is obtained;
step (5.6), the path pattern representations ĥ_{p_i} and ĥ_{p_r} are input into a feed-forward network f_pattern, which computes their semantic similarity τ_i, measuring the importance of the path to the model;
step (5.7), all semantic similarities τ_i are integrated by weighting, yielding a full-information representation between the head entity e_h and the tail entity e_t that contains all relations and all entity types with the right abstraction level, i.e. the full-path pattern representation h*; the relevant formulas are as follows:
H = [h_1, h_2, ..., h_{n-1}, h_n]
H_pr = [h_e, h_t]
ĥ_{p_i} = tanh(H)
ĥ_{p_r} = tanh(H_pr)
τ_i = f_pattern(ĥ_{p_i}, ĥ_{p_r})
α*_i = exp(τ_i) / Σ_{j=1}^{s} exp(τ_j)
h* = Σ_{i=1}^{s} α*_i · ĥ_{p_i}
where ĥ_{p_i} denotes the path pattern representation of the i-th path p_i, and α*_i is a path weight representing the model's attention to the path p_i; the weighted-sum operation combines all the information from the multiple paths and, while discarding path information the model does not need, retains the information of the paths it should focus on; H_pr = [h_e, h_t] denotes the vector matrix corresponding to the vector set h_{p_r} = {h_e, h_t}.
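A sketch of this attention layer follows. Pooling the tanh-activated matrix into a single vector per path is an assumption made here so that f_pattern can compare two fixed-size representations; names follow the reconstruction above.

```python
# Attention over path patterns: score each ĥ_{p_i} against the query
# pattern ĥ_{p_r}, softmax-normalize, and merge into h*.
import torch
import torch.nn as nn

dim = 200                               # width of each h_j (Bi-LSTM output)
f_pattern = nn.Linear(2 * dim, 1)       # semantic similarity τ_i

def full_path_pattern(paths: list, h_pr: torch.Tensor) -> torch.Tensor:
    """paths: list of [n_i, dim] matrices H; h_pr: [2, dim] for p_r = {e_h, r, e_t}."""
    q = torch.tanh(h_pr).mean(dim=0)                                  # ĥ_{p_r}, pooled
    reps = torch.stack([torch.tanh(H).mean(dim=0) for H in paths])    # ĥ_{p_i}
    tau = f_pattern(torch.cat([reps, q.expand_as(reps)], dim=-1))     # τ_i: [s, 1]
    alpha = torch.softmax(tau, dim=0)                                 # α*_i
    return (alpha * reps).sum(dim=0)                                  # h*

h_star = full_path_pattern([torch.randn(5, dim), torch.randn(3, dim)],
                           torch.randn(2, dim))
print(h_star.shape)  # torch.Size([200])
```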
The beneficial effects of the invention are: a knowledge graph completion method based on paths and graph contexts is provided. Abstract entity types replace entity information: an attention mechanism selects the appropriate abstract types, and entities are approximately represented by their abstract types, maintaining good generalization and avoiding entity divergence. A graph context structure is introduced into the path pattern to encode the fragmented path pattern representations, and all path patterns are then integrated with an attention mechanism to obtain the probability that the target triple holds. The knowledge graph completion result is improved.
Drawings
FIG. 1 shows a model flow diagram of the present invention.
FIG. 2 shows a comparison of the present invention with other link prediction methods.
FIG. 3 illustrates the comparison of entity type weights for different datasets according to the present invention, wherein (a) represents NELL995 entity type weights, (b) represents FB15K-237 entity type weights, and (c) represents WN18RR entity type weights.
Detailed description of the preferred embodiment 1
The invention is further described below with reference to the accompanying drawings.
Referring to FIGS. 1 to 3, a knowledge graph completion method based on paths and graph contexts includes the following steps:
firstly, path embedding representation is performed at the embedding layer in the link prediction model; the process is as follows:
step (1.1), given the uncertain triples X in a knowledge graph, where X = {(e_h, r, e_t) | e_h, e_t ∈ E ∧ r ∈ R}, E denotes the entity set in the knowledge graph and R the relation set;
step (1.2), in the knowledge graph there are one or more paths between e_h and e_t; let p* denote the set of paths between e_h and e_t, p* = {p_1, p_2, ..., p_s}, where s denotes the number of paths;
step (1.3), one of the paths p_i can be expressed as p_i = {e_1, r_1, e_2, r_2, ..., r_{n-1}, e_n}, where e_1 = e_h, e_n = e_t, and n denotes the number of entities on the path; the length of the path is defined as the number of relations in it, so n−1 denotes the length of the path and r_{n-1} the (n−1)-th relation in the path;
step (1.4), the relation r satisfies the condition that r must be among the relations r_j on the path p_i; each entity e_k in the path p_i = {e_1, r_1, e_2, r_2, ..., r_{n-1}, e_n} has a hierarchical set of abstract types T_k = {t_{k,1}, t_{k,2}, ..., t_{k,l}}, where t_{k,1} denotes the most concrete abstract type of entity e_k, t_{k,l} the most abstract type, and l the height of the abstraction hierarchy;
step (1.5), when selecting a possible type to represent an entity: each entity has a large number of abstract types, and traversing every combination of abstract types exhaustively is difficult and would create a heavy computational demand; a soft attention mechanism (an approach commonly used in the field of computer vision) is therefore used to obtain the appropriate abstract type of each entity; this means that when selecting information, instead of picking only one item from the set T_k, the weighted average of all input information in the set T_k is computed and then fed into the Bi-LSTM network;
step (1.6), the skip-gram model of Word2vec (commonly used in natural language processing to predict context) is used to train each abstract type t_{k,m} in the abstract type set T_k of entity e_k (m denotes the m-th element of the set), yielding pre-trained vector embeddings and the type vector set V_k = {v_k(t_{k,1}), v_k(t_{k,2}), ..., v_k(t_{k,l})} of entity e_k;
step (1.7), the skip-gram model of Word2vec is used to train each relation r_j in the path p_i = {e_1, r_1, e_2, r_2, ..., r_{n-1}, e_n}, yielding the pre-trained relation vector embedding vr_j;
step (1.8), each abstract type in the set V_k has a weight α_{k,m}, which denotes the likelihood that t_{k,m} properly expresses the entity e_k, i.e. the probability that t_{k,m} is the correct abstraction level; for this weight α_{k,m} the invention introduces a trainable relation-dependent vector u*, computed through a linear feed-forward network f_type (each layer's neurons are connected to the neurons of the layer below, with no connections within a layer and no cross-layer connections), where the relation-dependent vector u* is obtained by vector addition (the operation of summing two vectors) of the previous relation r_{k-1} and the next relation r_k connected to the entity e_k in the path p_i; exp is the exponential function with the natural constant e as base;
e_{k,m} = f_type(v_k(t_{k,m}), u*)
α_{k,m} = exp(e_{k,m}) / Σ_{m'=1}^{l} exp(e_{k,m'})
step (1.9), α_{k,m} is combined with the corresponding type-level vector, and the results are summed to obtain a type context vector c_k that fully expresses every type level while highlighting the correct abstraction level; c_k serves as the vector-embedded representation of entity e_k and is calculated as:
c_k = Σ_{m=1}^{l} α_{k,m} · v_k(t_{k,m})
thus the path vector set V_{p_i} = {c_1, vr_1, c_2, vr_2, ..., vr_{n-1}, c_n} of the path p_i is obtained.
Secondly, graph context combination is performed at the graph context combination layer in the link prediction model; the knowledge graph is regarded as a directed graph, and for a directed graph the graph context structure is divided into two types: one is the graph context structure formed by the neighbor entities and relations entering the target entity, the other is the graph context structure formed by the neighbor entities and relations leaving the target entity; the two context structures are mapped onto the knowledge graph and defined as entity-relation above pairs and relation-entity below pairs, embodying the above and below information of the target entity respectively; given a knowledge graph G = (E, R, T), E, R and T denote the entity set, the relation set and the set of positive-example triples respectively; the process is as follows:
step (2.1), generating the entity-relation above pairs formed by the neighbor entities and relations entering the target entity:
step (2.1.1), entity-relation above pair:
C_ERAP(e) = {(h, r) | (h, r, e) ∈ T}
where r is a relation pointing to the entity e, and e is an entity to which the entity h is directly related through the relation r; that is, in every triple whose tail entity is e, the head entity h and relation r are treated as the entity-relation above pair (h, r);
step (2.1.2), in a path, the entity-relation above pairs appear as a series of consecutive above pairs; the next above pair takes as head entity the tail entity e_i of the triple (e_{i-1}, r_{i-1}, e_i) corresponding to the previous above pair (e_{i-1}, r_{i-1}); following the order of entities and relations along the path, the entity e_i, its next connected relation r_i and the entity e_{i+1} form a new triple, and in the triple (e_i, r_i, e_{i+1}) the head entity e_i and relation r_i form the entity-relation above pair (e_i, r_i) of the tail entity e_{i+1};
step (2.2), generating the relation-entity below pairs formed by the neighbor entities and relations leaving the target entity:
step (2.2.1), relation-entity below pair:
C_REBP(e) = {(r, t) | (e, r, t) ∈ T}
that is, in every triple whose head entity is e, the relation r and tail entity t are treated as the relation-entity below pair (r, t);
step (2.2.2), in a path, the relation-entity below pairs appear as a series of consecutive below pairs; the next below pair takes as head entity the tail entity e_i of the previous below pair; following the order of entities and relations along the path, the entity e_i, its next connected relation r_i and the entity e_{i+1} form a new triple, and in the triple (e_i, r_i, e_{i+1}) the relation r_i and tail entity e_{i+1} form the relation-entity below pair of the head entity e_i;
step (2.3), the vector embeddings in the path vector set V_{p_i} are combined into context structures, obtaining the set of entity-relation above pairs:
C_ERAP = {(u, u), (c_1, vr_1), ..., (c_{n-1}, vr_{n-1})}
and the set of relation-entity below pairs:
C_REBP = {(vr_1, c_2), ..., (vr_{n-1}, c_n), (u, u)}
since there are no further entities and relations connected before the head entity e_1 and after the tail entity e_n, a zero vector u (a vector of modulus zero) is used for filling, thereby completing the context structures;
step (2.4), the entity-relation above pairs and relation-entity below pairs are input into a feed-forward network to obtain their feature vectors, calculated as:
a_j = f_feature(c^{ERAP}_j)
b_j = f_feature(c^{REBP}_j)
where c^{ERAP}_j denotes the j-th entity-relation above pair in the set C_ERAP, c^{REBP}_j denotes the j-th relation-entity below pair in the set C_REBP, and the feed-forward network f_feature is a non-linear activation function comprising ReLU operations; this yields the feature vector set A = {a_1, a_2, ..., a_n} of entity-relation above pairs and the feature vector set B = {b_1, b_2, ..., b_n} of relation-entity below pairs.
Thirdly, the bidirectional LSTM network is constructed at the Bi-LSTM layer in the link prediction model; the process is as follows:
step (3.1), the feature vectors in the set A = {a_1, ..., a_n} are input into the forward LSTM sequence in order, and the feature vectors in the set B = {b_1, ..., b_n} are input into the backward LSTM sequence in reverse order;
step (3.2), the current hidden states h→_j and h←_j in the LSTM are obtained:
step (3.2.1), during the LSTM calculation, the current hidden state of each LSTM unit is retained; the current hidden state of the j-th LSTM unit of the forward LSTM sequence is denoted h→_j, and that of the backward sequence h←_j;
step (3.2.2), when the set A is input into the forward LSTM sequence in order, the set of current hidden states of the forward sequence is obtained:
H→ = {h→_1, h→_2, ..., h→_n}
while the set B is input into the backward LSTM sequence in reverse order, i.e. b_n is the first input of the backward LSTM sequence, yielding the set of current hidden states of the backward sequence:
H← = {h←_1, h←_2, ..., h←_n}
the current hidden states h→_j and h←_j are calculated as follows:
h→_j = LSTM(a_j, h→_{j-1})
h←_j = LSTM(b_j, h←_{j+1})
step (3.3), after combining the j-th hidden state of the forward LSTM sequence with the corresponding hidden state of the backward LSTM sequence, all the information of the j-th entity on the i-th path p_i (including its abstract type information) is obtained, together with all the information of the two most closely related relations before and after it; h_j is used to represent the entity e_j and its context information; while representing the embedded information of entities and relations, h_j retains the semantic structure of entity e_j:
h_j = [h→_j ; h←_j]
step (3.4), the path pattern of the invention is composed of a number of consecutive h_j; each h_j is in fact a fragment of the path pattern, from which the vector set combining all entities of the i-th path p_i with their context information is obtained: h_{p_i} = {h_1, h_2, ..., h_{n-1}, h_n};
Fourthly, the path patterns are integrated at the attention layer in the link prediction model; the paths contain all the information of the entities and relations, but not every path can be used to represent the relation r to be predicted between the entities; the invention therefore introduces an attention mechanism to measure the importance of each path to the path combination, and integrates the path patterns by weighted summation; the process is as follows:
step (4.1), the triple (e_h, r, e_t) to be predicted also enters the model; the path of this triple has only one element, p_r = {e_h, r, e_t};
step (4.2), the vector set combining all entities in the path of step (4.1) with their context information is obtained: h_{p_r} = {h_e, h_t};
step (4.3), the path pattern is formed by combining all entities in the path with their context information; the vector embeddings in the vector set h_{p_i} are spliced into a vector matrix H;
step (4.4), after processing the vector matrix H with the activation function tanh, the path pattern representation ĥ_{p_i} of the path p_i is obtained;
step (4.5), in the same way as step (4.4), the path pattern representation ĥ_{p_r} of the test triple (e_h, r, e_t) is obtained;
step (4.6), the path pattern representations ĥ_{p_i} and ĥ_{p_r} are input into a feed-forward network f_pattern, which computes their semantic similarity τ_i, measuring the importance of the path to the model;
step (4.7), all semantic similarities τ_i are integrated by weighting, yielding a full-information representation between the head entity e_h and the tail entity e_t that contains all relations and all entity types with the right abstraction level, i.e. the full-path pattern representation h*; the relevant formulas are as follows:
H = [h_1, h_2, ..., h_{n-1}, h_n]
H_pr = [h_e, h_t]
ĥ_{p_i} = tanh(H)
ĥ_{p_r} = tanh(H_pr)
τ_i = f_pattern(ĥ_{p_i}, ĥ_{p_r})
α*_i = exp(τ_i) / Σ_{j=1}^{s} exp(τ_j)
h* = Σ_{i=1}^{s} α*_i · ĥ_{p_i}
where ĥ_{p_i} denotes the path pattern representation of the i-th path p_i, and α*_i is a path weight representing the model's attention to the path p_i; the weighted-sum operation combines all the information from the multiple paths and, while discarding path information the model does not need, retains the information of the paths it should focus on; H_pr = [h_e, h_t] denotes the vector matrix corresponding to the vector set h_{p_r} = {h_e, h_t}; the tanh function is a variant of the sigmoid function;
Fifthly, output is produced at the output layer of the link prediction model: since the full-path pattern representation h* covers all paths, contains the relation and entity-type information, and has the right abstraction level, the probability P(r|e_h, e_t) that the head entity e_h and the tail entity e_t can be linked by the relation r is computed from h*;
wherein the probability P(r|e_h, e_t) is calculated by a linear function f_p followed by a sigmoid activation function:
P(r|e_h, e_t) = sigmoid(f_p(h*))
Sixthly, the objective function is defined; the goal of the invention is to minimize the negative logarithm L of the link probability; for each relation r_j to be predicted, the triples in the training set are divided into two parts, a positive-example triple set T and a negative-example triple set F, each triple in the sets contains the relation r_j, and the number of triples in the training set is denoted N; the objective function is:
L = -(1/N) [ Σ_{(e_h, r_j, e_t) ∈ T} log P(r_j | e_h, e_t) + Σ_{(e_h, r_j, e_t) ∈ F} log(1 − P(r_j | e_h, e_t)) ]
the training goal of this objective function is to train the model to yield higher values on correct triples and lower values on missing or incorrect triples while minimizing the overall error.
Example: FIG. 1 shows the flow diagram of the link prediction model. The set of paths P between the head entity e_h and the tail entity e_t is the input, and the probability P(r|e_h, e_t) that e_h and e_t are linked by the relation r is the output. If P(r|e_h, e_t) is greater than the hyper-parameter set for the model, e_h and e_t are considered linkable by the relation r, and the triple (e_h, r, e_t) holds. High-confidence triples are added to the knowledge graph, further completing it.
With reference to FIG. 2, the link prediction method of the present invention is compared with the link prediction methods of other models.
The Path Ranking Algorithm (PRA) proposed by Lao et al. is a representative link prediction model that predicts using paths composed of relations. Its main idea is to execute random walks from the head entity to the tail entity on the knowledge graph, construct a feature matrix by recording the relation paths between them, and then train a binary classification method on the feature matrix to infer missing relations between entities. PRA acquires relation paths based on relation sequences and can extract path patterns well, but it easily over-generalizes the inferred entities. As shown in FIG. 2, given the known triple (drinking water, for, drink), link prediction is performed on candidate triples such as (mineral water, for, write); PRA reasons with the relation pattern <-contain, for>, where - denotes the reverse direction of the relation. PRA easily concludes that candidate triples a and b possess the same path pattern as the known triple, and therefore infers that both hold, whereas in real life candidate triple a is not true. Das et al. extended the relation path on the basis of PRA, learning not only the relations in the path but also using entities and entity-type information. However, in the method of Das et al. the path mainly consists of entities and relations; the entity type is treated only as a complement to the entity, entities and relations are represented as vector embeddings in a low-dimensional space, and similar entities and relations possess similar vector representations. As in FIG. 2, the method of Das et al. represents the path pattern of the known triple as <potable water, -contain, glass bottle, contain, red wine, for, drink>, and the path patterns of candidate triples a and b as <potable water, -contain, glass bottle, contain, ink, for, write> and <coconut juice, -contain, paper cup, contain, coffee, for, drink> respectively. The path pattern of the known triple has low semantic similarity to the path patterns of candidate triples a and b, so both are inferred to be false, whereas in the real world candidate triple b is true. The method of Das et al. improves the modeling quality and prediction accuracy of the path, but loses the generalization of the inferred entities, causing the prediction of candidate triple b to fail.
In contrast, as shown in FIG. 2, the method of the present invention uses entity abstract types and relations to compose path patterns; similar entities such as drinking water, red wine and coconut juice can all be represented by the same abstract type: beverage. In the present invention, the path pattern of the known triple is denoted <beverage, -contain, container, contain, beverage, for, swallow>, and the path patterns of candidate triples a and b are denoted <beverage, -contain, container, contain, dye, for, record> and <beverage, -contain, container, contain, beverage, for, swallow>. Candidate triple a has a different path pattern, while candidate triple b has the same path pattern as the known triple; it is therefore easily concluded that candidate triple a does not hold and b holds. At the same time, the method of the invention has good generalization: any entity whose type can be summarized as a beverage can easily be predicted through the path pattern.
To evaluate the model, experiments were performed on three widely used knowledge graph benchmark datasets: NELL995, FB15K-237 and WN18RR. FB15K-237 and WN18RR are derived from the benchmark datasets FB15K and WN18 respectively. FB15K was extracted from FreeBase, a large-scale open knowledge base describing real-world information; it contains 15K entities covering many fields. WN18 comes from the lexical knowledge base WordNet and contains a large number of relations among words, such as semantic hierarchy relations like hypernyms and synonyms. However, scholars have pointed out a serious test-data leakage problem on these two datasets: triples in the test set have corresponding inverted triples in the training set. Because this leakage allows many simple models to achieve good link prediction results, scholars removed certain relation types such as inverse and reciprocal relations from FB15K and WN18 and reconstructed them as the subsets FB15K-237 and WN18RR. These two datasets have also become common benchmark datasets in knowledge graph research. NELL995 is a subset proposed by Xiong et al. for multi-hop inference, based on the results of the 995th iteration of the NELL system. In NELL995, Xiong et al. deleted the two relations generalizations and haswikipediaurl, which occur more than 2 million times in the NELL dataset without any reasoning value; for the remaining relations, they retained only the top 200 relations by frequency of occurrence in the NELL dataset. The entities and relations retained in NELL995 are of good quality and great reasoning value, and the dataset has quickly become a common experimental subject for many scholars. The detailed statistics of these three datasets are shown in Table 1.
Table 1 Experimental dataset statistics [table rendered as an image in the source; contents not reproduced]
In the experiments, a new experimental dataset Z = {[(e_hi, r_j, e_ti), x] | e_hi, e_ti ∈ E, r_j ∈ R, x ∈ {0,1}} is established for each dataset, where x denotes whether the triple (e_hi, r_j, e_ti) holds. The triples in the dataset Z include the positive-example triples in the original dataset and negative-example triples constructed by the self-adversarial negative sampling method of the invention. The experiments of Trouillon et al. show that the number of negative-example triples has a large impact on experimental quality and computational performance, and a balance between quality and performance must be found; therefore 20 negative examples are set for each positive-example triple. Meanwhile, 80% of the triples in the dataset are added to the training set and 20% to the test set. Paths between entities are extracted with a bidirectional Breadth First Search algorithm. Since some researchers have suggested that shorter paths are generally more effective and reliable than longer paths on the link prediction task, different maximum path lengths are set for different datasets: 6 for WN18RR, which has less data; 5 for NELL995; and 3 for the more diverse, more densely connected FB15K-237. For the task of building the entity type hierarchy, in WN18RR the hypernyms defined by Miller et al. for WordNet are used as entity types; in FB15K-237 and NELL995, each entity itself carries several entity types, and NELL995 is moreover the only dataset that explicitly labels entity types, which can be used directly. The number of entity types of each entity is set to at most 7.
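The following sketch illustrates path extraction between an entity pair with the per-dataset depth cap described above; for brevity it uses a plain depth-limited search rather than the bidirectional breadth-first search used in the experiments, and the graph encoding is an assumption.

```python
# Enumerate all paths e_h -> e_t with at most max_len relations, returned as
# alternating entity/relation lists as in p_i = {e_1, r_1, ..., r_{n-1}, e_n}.
def extract_paths(graph, e_h, e_t, max_len):
    """graph: dict entity -> list of (relation, neighbor)."""
    paths = []
    def dfs(entity, path, depth):
        if entity == e_t and depth > 0:
            paths.append(path + [entity])
            return
        if depth == max_len:
            return
        for rel, nxt in graph.get(entity, []):
            if nxt not in path:                  # avoid revisiting entities
                dfs(nxt, path + [entity, rel], depth + 1)
    dfs(e_h, [], 0)
    return paths

g = {"e_h": [("r1", "a"), ("r2", "e_t")],
     "a":   [("r3", "e_t")]}
print(extract_paths(g, "e_h", "e_t", 3))
# [['e_h', 'r1', 'a', 'r3', 'e_t'], ['e_h', 'r2', 'e_t']]
```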
Three commonly used indices are selected to evaluate the performance of the model: mean average precision (MAP), mean reciprocal rank (MRR) and the Hits@n index. These evaluation metrics all depend on the final ranking of the correct entity in the test triples. Let the set of correct entities in all test triples be ET = {e_1, e_2, ..., e_n}; the three indices are calculated as follows:
MAP = (1/|ET|) Σ_{e ∈ ET} AP(e)
MAP computes the mean of the precision scores over the ranking positions of all correct triples, where |ET| denotes the number of elements in the set ET and AP(e) denotes the average precision score of entity e over the ranking positions of the correct triples.
MRR = (1/|ET|) Σ_{e ∈ ET} 1/rank(e)
MRR computes the mean of the reciprocals of the ranking position of the first correct triple in each test sample, where rank(e) denotes the rank of entity e in the set ET.
Hits@n = (1/|ET|) Σ_{e ∈ ET} 1[rank(e) ≤ n]
Hits@n computes the proportion of all test samples whose correct answer is ranked no lower than n; the tests use n = 1 and n = 3. For all these indices, a higher value indicates a better link prediction capability of the model.
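A minimal sketch of these indices follows; the rank inputs are illustrative (1 = best), and AP is computed from a per-query 0/1 relevance list.

```python
# MRR, Hits@n and AP computed from the ranks of the correct entities.
def mrr(ranks):
    return sum(1.0 / r for r in ranks) / len(ranks)

def hits_at(ranks, n):
    return sum(1 for r in ranks if r <= n) / len(ranks)

def average_precision(ranked_hits):
    """ranked_hits: 0/1 relevance list over one query's ranked candidates."""
    hits, total = 0, 0.0
    for i, h in enumerate(ranked_hits, start=1):
        if h:
            hits += 1
            total += hits / i        # precision at each correct position
    return total / max(hits, 1)

ranks = [1, 3, 2, 8]                 # rank of the correct entity per test triple
print(mrr(ranks))                    # 0.4895...
print(hits_at(ranks, 1))             # 0.25
print(hits_at(ranks, 3))             # 0.75
print(average_precision([1, 0, 1]))  # 0.8333...
```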
During testing, test triples generated by replacing head and tail entities may already exist in the original knowledge graph dataset as positive-example triples. For example, the test triple (spoon, for, eat) generated by replacing the head entity of the positive triple (chopsticks, for, eat) is itself a positive triple and already exists in the dataset. Such triples affect the final ranking of the target entity; to eliminate this adverse effect, the test triples are filtered before the experiment, and the already-existing triples are deleted from the test set, validation set and training set. In link prediction experiments, the setting with this filtering operation performed is called "Filter" and the setting without it is called "Raw"; in the present invention all experiments are completed after the filtering operation.
In order to verify the effectiveness of the proposed model, several representative and widely used link prediction methods are selected for comparison. Among the comparison methods, since WN18RR has fewer relations and its knowledge graph is simpler, many link prediction methods mainly use FB15K-237 and NELL995 as datasets. In real life, however, knowledge graphs in many fields, such as tourism and movie knowledge graphs, are similar to WN18RR, with fewer relation types and more entities. In view of this, the link prediction ability of the method on complex knowledge graphs is verified on FB15K-237 and NELL995, and its ability on simple knowledge graphs is verified on WN18RR.
Therefore, the invention compares against models such as PRA, DeepPath, DistMult, ConvE, APR and Jagvaral et al. to verify the link prediction capability of the method on knowledge graphs with complex relations, and selects TransE, DistMult, ComplEx, ConvE and RotatE to verify its capability on simple-relation knowledge graphs. Table 2 reports the link prediction results of the model of the invention and the comparison models on the three benchmark datasets; numbers in bold indicate the best results in the control experiments, underlined numbers indicate suboptimal results, and - indicates that the method has no data. Since the authors of the PRA model verified their experimental results on different datasets, and Jagvaral et al. replicated the experiments on the FB15k-237 and NELL995 datasets respectively, the present invention uses the data of Jagvaral et al.
Table 2 Link prediction results on FB15K-237 and NELL995 [table rendered as an image in the source; contents not reproduced]
As can be seen from Table 2, the model of the invention achieves superior results on the datasets FB15K-237 and NELL995. On the NELL995 dataset in particular, the model achieves the best results on the MAP, MRR and Hits@3 indices, each generally 2% to 3% higher than the suboptimal results. On the one hand, this is because the NELL995 dataset is of high quality, having removed some relations that account for a large proportion of the data but are useless for reasoning; on the other hand, NELL995 is the only dataset that explicitly labels entity types, and since the method uses entity types for link prediction, the explicitly labeled entity types make reasoning easier and ensure that the model keeps good generalization capability on this dataset. To study the influence of the abstraction level on link prediction results, the weight distribution of entity types at different abstraction levels was counted on the three benchmark datasets. As shown in FIG. 3, the proposed model represents entities with all abstraction levels: in the NELL995 dataset, entities are mainly represented by entity types at the three abstraction levels L1, L2 and L6; in FB15K-237, at the three abstraction levels L1, L2 and L5; and in WN18RR, at the three abstraction levels L1, L2 and L3. The two abstraction levels L1 and L2 are used by all entities because concrete abstract types are closer to the entities themselves and describe the entity information better. This is not absolute, however; FIG. 3 also illustrates that not all entities should use the most concrete abstraction level, and that different entities need to be represented by different abstract entity types.

Claims (9)

1. A knowledge graph completion method based on paths and graph contexts, the method comprising the steps of:
the first step: constructing a link prediction model, wherein the completion task is to predict the missing link relations in an incomplete knowledge graph;
secondly, performing path embedding representation at the embedding layer in the link prediction model;
thirdly, combining graph contexts at the graph context combination layer in the link prediction model, wherein the knowledge graph is regarded as a directed graph, and for a directed graph the graph context structure is divided into two types: one is the graph context structure formed by the neighbor entities and relations entering the target entity, the other is the graph context structure formed by the neighbor entities and relations leaving the target entity; the two context structures are mapped onto the knowledge graph and defined as entity-relation above pairs and relation-entity below pairs, embodying the above and below information of the target entity respectively; given a knowledge graph G = (E, R, T), E, R and T denote the entity set, the relation set and the set of positive-example triples respectively;
fourthly, constructing a bidirectional LSTM network at the Bi-LSTM layer in the link prediction model;
fifthly, integrating path patterns at the attention layer in the link prediction model, wherein the paths contain all the information of the entities and relations, but not every path can be used to represent the relation r to be predicted between the entities; an attention mechanism is therefore introduced to measure the importance of each path to the path combination, and the path patterns are integrated by weighted summation;
sixthly, outputting at the output layer of the link prediction model: since the full-path pattern representation h* covers all paths, contains the relation and entity-type information, and has the right abstraction level, the probability P(r|e_h, e_t) that the head entity e_h and the tail entity e_t can be linked by the relation r is computed from h*;
wherein the probability P(r|e_h, e_t) is calculated by a linear function f_p followed by a sigmoid activation function:
P(r|e_h, e_t) = sigmoid(f_p(h*))
seventhly, defining the objective function, the goal being to minimize the negative logarithm L of the link probability; for each relation r_j to be predicted, the triples in the training set are divided into two parts, a positive-example triple set T and a negative-example triple set F, each triple in the sets contains the relation r_j, and the number of triples in the training set is denoted N; the objective function is:
L = -(1/N) [ Σ_{(e_h, r_j, e_t) ∈ T} log P(r_j | e_h, e_t) + Σ_{(e_h, r_j, e_t) ∈ F} log(1 − P(r_j | e_h, e_t)) ]
the training goal of this objective function is to train the model to yield higher values on correct triples and lower values on missing or incorrect triples while minimizing the overall error.
2. The path and graph context based knowledge-graph completion method according to claim 1, wherein in the first step, the link prediction model is divided into the following five layers:
(1) Embedding layer: performs path embedding; a path is selected from the path set P formed by triples, and the entities and relations on the path are input into the embedding layer;
(2) Graph context combination layer: the vector embeddings of entities and relations are combined into entity-relationship above pairs and relation-entity below pairs according to their order in the path;
(3) Bi-LSTM layer: the results of the combination layer are input into the Bi-LSTM network in order, and the current hidden state of each LSTM unit is stored;
(4) Attention layer: the forward LSTM outputs and backward LSTM outputs are combined into a vector matrix;
(5) Output layer: the full path pattern is input into a linear feed-forward network to obtain the probability $P(r \mid e_h, e_t)$ that $e_h$ and $e_t$ can be linked by relation r, where $P(r \mid e_h, e_t)$ denotes the conditional probability of event r under the joint distribution of events $e_h$ and $e_t$.
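The five layers can be sketched end-to-end as follows (PyTorch). Pooling each Bi-LSTM output to a fixed-size path pattern is a simplification of the matrix-valued representation in the later claims, and all module names and sizes are assumptions:

```python
import torch
import torch.nn as nn

class PathGraphContextModel(nn.Module):
    def __init__(self, emb_dim=100, hidden_dim=128):
        super().__init__()
        # (2) graph context combination layer: pair features via ReLU network
        self.f_feature = nn.Sequential(nn.Linear(2 * emb_dim, hidden_dim),
                                       nn.ReLU())
        # (3) Bi-LSTM layer over the sequence of context-pair features
        self.bilstm = nn.LSTM(hidden_dim, hidden_dim, bidirectional=True,
                              batch_first=True)
        # (4) attention layer: bilinear similarity between path patterns
        self.f_pattern = nn.Bilinear(2 * hidden_dim, 2 * hidden_dim, 1)
        # (5) output layer: linear function f_p followed by sigmoid
        self.f_p = nn.Linear(2 * hidden_dim, 1)

    def forward(self, path_pairs, query_pairs):
        # path_pairs: (num_paths, path_len, 2*emb_dim) context pairs from the
        # embedding layer (1); query_pairs: (1, 2, 2*emb_dim) for the
        # candidate triple (e_h, r, e_t) itself.
        h_paths, _ = self.bilstm(self.f_feature(path_pairs))
        h_query, _ = self.bilstm(self.f_feature(query_pairs))
        a_paths = torch.tanh(h_paths).mean(dim=1)   # per-path pattern (pooled)
        a_query = torch.tanh(h_query).mean(dim=1)   # query pattern
        tau = self.f_pattern(a_paths, a_query.expand_as(a_paths))
        alpha = torch.softmax(tau, dim=0)           # path weights
        a_star = (alpha * a_paths).sum(dim=0)       # full path pattern A_p*
        return torch.sigmoid(self.f_p(a_star))      # P(r | e_h, e_t)
```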
3. The path and graph context based knowledge-graph completion method according to claim 1 or 2, wherein the second step is performed by:
step (2.1), given the set X of uncertain triples in a knowledge graph, where $X = \{(e_h, r, e_t) \mid e_h, e_t \in E \wedge r \in R\}$, E denotes the entity set in the knowledge graph and R the relation set;

step (2.2), in the knowledge graph there are one or more paths between $e_h$ and $e_t$; the set $p^*$ denotes the set of multiple paths between $e_h$ and $e_t$, $p^* = \{p_1, p_2, \ldots, p_s\}$, where s denotes the number of paths;

step (2.3), one of the paths $p_i$ can be expressed as $p_i = \{e_1, r_1, e_2, r_2, \ldots, r_{n-1}, e_n\}$, where $e_1 = e_h$, $e_n = e_t$, n denotes the number of entities on the path, the length of the path is defined as the number of relations in it, n-1 denotes the length of the path, and $r_{n-1}$ denotes the (n-1)-th relation in the path;

step (2.4), the relation r satisfies the condition that it must be among the relations $r_j$ on the path $p_i$; in the path $p_i = \{e_1, r_1, e_2, r_2, \ldots, r_{n-1}, e_n\}$, each entity $e_k$ has a hierarchical set of abstract types $T_k = \{t_{k,1}, t_{k,2}, \ldots, t_{k,l}\}$, where $t_{k,l}$ denotes the most abstract type of entity $e_k$ and l denotes the height of the abstraction hierarchy;
step (2.5), when selecting among the possible types of an entity for representation: since each entity has a large number of abstract types, completely traversing every combination of abstract types is difficult and would create a heavy computational demand; a soft attention mechanism is therefore used to obtain the appropriate abstract types of each entity, which, when selecting information, computes the weighted average of all input information in the set instead of selecting only one item from it, and the result is then input into the Bi-LSTM network for computation;
step (2.6), the skip-gram model of Word2vec is used to train each abstract type $t_{k,m}$ in the abstract type set $T_k$ of entity $e_k$, where m denotes the m-th element of the abstract type set, obtaining pre-trained vector embeddings and hence the type vector set $V_k = \{v_k(t_{k,1}), v_k(t_{k,2}), \ldots, v_k(t_{k,l})\}$ of entity $e_k$;

step (2.7), the skip-gram model of Word2vec is used to train each relation $r_j$ in the path $p_i = \{e_1, r_1, e_2, r_2, \ldots, r_{n-1}, e_n\}$, obtaining the pre-trained relation vector embedding $vr_j$;
step (2.8), each vector embedding of an abstract type in the set $V_k$ has a weight $\alpha_{k,m}$, which denotes the likelihood that $t_{k,m}$ can properly express entity $e_k$, i.e. the probability that $t_{k,m}$ is the correct abstraction level; for this weight $\alpha_{k,m}$, a trainable relation dependency vector $u^*$ is introduced, and the weight is computed through a linear feed-forward network $f_{type}$, where the relation dependency vector $u^*$ is obtained by vector addition of the previous relation $r_{k-1}$ and the next relation $r_k$ connected to entity $e_k$ in path $p_i$; exp denotes the exponential function with the natural constant e as base; the calculation formulas are:

$$e_{k,m} = f_{type}\big(v_k(t_{k,m}),\, u^*\big)$$

$$\alpha_{k,m} = \frac{\exp(e_{k,m})}{\sum_{m'=1}^{l} \exp(e_{k,m'})}$$

step (2.9), $\alpha_{k,m}$ is combined with the corresponding type level vector and the results are summed, giving a type context vector $c_k$ that fully expresses every type level while highlighting the correct abstraction level; $c_k$ serves as the vector embedding representation of entity $e_k$ and is computed as:

$$c_k = \sum_{m=1}^{l} \alpha_{k,m}\, v_k(t_{k,m})$$
thus obtaining the path vector set $V_{p_i} = \{c_1, vr_1, c_2, vr_2, \ldots, vr_{n-1}, c_n\}$ of path $p_i$.
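A minimal sketch of the soft attention of steps (2.5)-(2.9), assuming pre-trained type embeddings stacked into a matrix and a linear feed-forward $f_{type}$; all names and dimensions are illustrative:

```python
import torch
import torch.nn as nn

emb_dim = 100
f_type = nn.Linear(2 * emb_dim, 1)  # the linear feed-forward network f_type

def type_context_vector(V_k: torch.Tensor, vr_prev: torch.Tensor,
                        vr_next: torch.Tensor) -> torch.Tensor:
    """V_k: (l, emb_dim) matrix of type embeddings of entity e_k;
    vr_prev, vr_next: relation vectors on either side of e_k in the path."""
    u_star = vr_prev + vr_next                    # relation dependency vector u*
    u = u_star.expand(V_k.size(0), -1)            # repeat u* for each type
    e_km = f_type(torch.cat([V_k, u], dim=-1))    # e_{k,m} = f_type(v_k, u*)
    alpha = torch.softmax(e_km, dim=0)            # alpha_{k,m} via exp/sum(exp)
    return (alpha * V_k).sum(dim=0)               # c_k: weighted type context
```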
4. The path and graph context based knowledge-graph completion method according to claim 1 or 2, wherein the third step is performed as follows:
step (3.1), generating the entity-relationship above pairs formed by the neighbor entities and relations entering the target entity;

step (3.2), generating the relation-entity below pairs formed by the neighbor entities and relations leaving the target entity;
step (3.3), the vector embeddings in the path vector set $V_{p_i}$ are combined into context structures, obtaining the set of entity-relationship above pairs:

$$C_{ERAP} = \{(u, u), (c_1, vr_1), (c_2, vr_2), \ldots, (c_{n-1}, vr_{n-1})\}$$

and the set of relation-entity below pairs:

$$C_{REBP} = \{(vr_1, c_2), (vr_2, c_3), \ldots, (vr_{n-1}, c_n), (u, u)\}$$

since no further entities and relations are connected before the head entity $e_1$ or after the tail entity $e_n$, a zero vector u (a vector of modulus zero) is used as padding, thus completing the context structure;
step (3.4), the entity-relationship above pairs and relation-entity below pairs are input into a feed-forward network to obtain a feature vector for each pair; the feature vectors are computed as:

$$g_j^{ERAP} = f_{feature}\big(c_j^{ERAP}\big), \qquad g_j^{REBP} = f_{feature}\big(c_j^{REBP}\big)$$

where $c_j^{ERAP}$ denotes the j-th entity-relationship above pair in the set $C_{ERAP}$, $c_j^{REBP}$ denotes the j-th relation-entity below pair in the set $C_{REBP}$, and the feed-forward network $f_{feature}$ is a nonlinear activation function containing a ReLU operation; this yields the feature vector set $G_{ERAP} = \{g_1^{ERAP}, g_2^{ERAP}, \ldots, g_n^{ERAP}\}$ of the entity-relationship above pairs and the feature vector set $G_{REBP} = \{g_1^{REBP}, g_2^{REBP}, \ldots, g_n^{REBP}\}$ of the relation-entity below pairs.
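Steps (3.3)-(3.4) can be sketched as follows, assuming the padded pair sets reconstructed above and a single Linear+ReLU layer for $f_{feature}$; names and dimensions are illustrative:

```python
import torch
import torch.nn as nn

emb_dim, feat_dim = 100, 128
f_feature = nn.Sequential(nn.Linear(2 * emb_dim, feat_dim), nn.ReLU())

def context_pair_features(ents, rels):
    """ents: n type-context vectors c_1..c_n; rels: n-1 relation vectors
    vr_1..vr_{n-1}, both in path order, each of shape (emb_dim,)."""
    u = torch.zeros(emb_dim)                         # zero-vector padding
    above = [torch.cat([u, u])] + [torch.cat([c, r])
                                   for c, r in zip(ents[:-1], rels)]
    below = [torch.cat([r, c])
             for r, c in zip(rels, ents[1:])] + [torch.cat([u, u])]
    g_above = f_feature(torch.stack(above))          # G_ERAP: (n, feat_dim)
    g_below = f_feature(torch.stack(below))          # G_REBP: (n, feat_dim)
    return g_above, g_below
```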
5. The path and graph context based knowledge graph completion method according to claim 4, wherein the process of step (3.1) is as follows:
step (3.1.1), the entity-relationship above pairs of an entity e are:

$$C_{ERAP}(e) = \{(h, r) \mid (h, r, e) \in T\}$$

wherein r is a relation pointing to entity e, and e is an entity to which entity h is directly related through relation r; that is, the head entity h and relation r of every triple whose tail entity is e are treated as an entity-relationship above pair (h, r);

step (3.1.2), in one path, the entity-relationship above pairs are represented as a plurality of consecutive above pairs; the next above pair takes the tail entity $e_i$ of the triple $(e_{i-1}, r_{i-1}, e_i)$ corresponding to the previous above pair $(e_{i-1}, r_{i-1})$ as its head entity: following the order of entities and relations on the path, entity $e_i$, the next connected relation $r_i$ and entity $e_{i+1}$ form a new triple; in the triple $(e_i, r_i, e_{i+1})$, the head entity $e_i$ and relation $r_i$ form the entity-relationship above pair $(e_i, r_i)$ of the tail entity $e_{i+1}$.
6. The path and graph context based knowledge graph completion method according to claim 4, wherein the process of step (3.2) is as follows:
step (3.2.1), the relation-entity below pairs of an entity e are:

$$C_{REBP}(e) = \{(r, t) \mid (e, r, t) \in T\}$$

that is, the relation r and tail entity t of every triple whose head entity is e are treated as a relation-entity below pair (r, t);

step (3.2.2), in one path, the relation-entity below pairs are represented as a plurality of consecutive below pairs; the next below pair takes the tail entity $e_i$ of the previous below pair as its head entity: following the order of entities and relations on the path, entity $e_i$, the next connected relation $r_i$ and entity $e_{i+1}$ form a new triple; in the triple $(e_i, r_i, e_{i+1})$, the relation $r_i$ and tail entity $e_{i+1}$ form the relation-entity below pair $(r_i, e_{i+1})$ of the head entity $e_i$.
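As a concrete illustration of steps (3.1.1) and (3.2.1), the following plain-Python sketch enumerates the above pairs and below pairs of an entity directly from a fact triple set; the function names and the tuple encoding of triples are illustrative assumptions, not part of the claimed method:

```python
def above_pairs(e, triples):
    """Entity-relationship above pairs: all (h, r) with (h, r, e) in T."""
    return [(h, r) for (h, r, t) in triples if t == e]

def below_pairs(e, triples):
    """Relation-entity below pairs: all (r, t) with (e, r, t) in T."""
    return [(r, t) for (h, r, t) in triples if h == e]

# Example: T = [("A", "r1", "B"), ("B", "r2", "C")] gives
# above_pairs("B", T) == [("A", "r1")] and below_pairs("B", T) == [("r2", "C")].
```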
7. The path-and-graph-context based knowledge-graph completion method according to claim 1 or 2, characterized in that the fourth step is performed by:
step (4.1), the feature vectors in the set $G_{ERAP}$ are input into the forward LSTM sequence in order, and the feature vectors in the set $G_{REBP}$ are input into the backward LSTM sequence in reverse order;

step (4.2), the current hidden states $\overrightarrow{h_j}$ and $\overleftarrow{h_j}$ in the LSTM are obtained;
step (4.3), after combining the hidden state of the forward LSTM sequence with the corresponding hidden state of the backward LSTM sequence, all the information of the j-th entity on the i-th path $p_i$, together with the two closest relations immediately before and after it, is obtained; that is, $h_j$ represents entity $e_j$ and its context information, and while representing the embedded information of entities and relations, $h_j$ preserves the semantic structure of entity $e_j$:

$$h_j = \big[\,\overrightarrow{h_j}\,;\,\overleftarrow{h_j}\,\big]$$

step (4.4), a path pattern is composed of multiple consecutive $h_j$; each $h_j$ is in effect a fragment of the path pattern, and the vector set obtained by combining all entities of the i-th path $p_i$ with their context information is: $h_{p_i} = \{h_1, h_2, \ldots, h_{n-1}, h_n\}$.
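The fourth step (including the hidden-state bookkeeping detailed in claim 8 below) can be sketched in PyTorch as follows; feeding the above-pair features to the forward LSTM and the below-pair features, in reverse order, to the backward LSTM follows steps (4.1)-(4.2.2), while the module names and sizes are assumptions:

```python
import torch
import torch.nn as nn

feat_dim, hidden_dim = 128, 128
fwd_lstm = nn.LSTM(feat_dim, hidden_dim, batch_first=True)
bwd_lstm = nn.LSTM(feat_dim, hidden_dim, batch_first=True)

def path_hidden_states(g_above: torch.Tensor, g_below: torch.Tensor):
    """g_above, g_below: (1, n, feat_dim) pair-feature sequences of one path."""
    h_fwd, _ = fwd_lstm(g_above)                       # keeps every hidden state
    h_bwd_rev, _ = bwd_lstm(torch.flip(g_below, [1]))  # fed in reverse order
    h_bwd = torch.flip(h_bwd_rev, [1])                 # realign to path order
    return torch.cat([h_fwd, h_bwd], dim=-1)           # h_j = [fwd_j ; bwd_j]
```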
8. The path and graph context based knowledge-graph completion method according to claim 7, wherein the process of step (4.2) is as follows:
step (4.2.1), during the LSTM calculation, the current hidden state of each LSTM unit is retained; the current hidden state of the j-th LSTM unit of the forward LSTM sequence is denoted $\overrightarrow{h_j}$, and that of the backward sequence $\overleftarrow{h_j}$;
step (4.2.2), when the feature vectors $g_1^{ERAP}, g_2^{ERAP}, \ldots, g_n^{ERAP}$ are input into the forward LSTM sequence in order, the current hidden state set of the forward sequence is obtained:

$$\overrightarrow{H} = \{\overrightarrow{h_1}, \overrightarrow{h_2}, \ldots, \overrightarrow{h_n}\}$$

meanwhile the feature vectors $g_1^{REBP}, g_2^{REBP}, \ldots, g_n^{REBP}$ are input into the backward LSTM sequence in reverse order, i.e. $g_n^{REBP}$ is the first input to the backward LSTM sequence, and the current hidden state set of the backward sequence is thus obtained:

$$\overleftarrow{H} = \{\overleftarrow{h_1}, \overleftarrow{h_2}, \ldots, \overleftarrow{h_n}\}$$
the current hidden states $\overrightarrow{h_j}$ and $\overleftarrow{h_j}$ are computed by the LSTM recurrence over the pair features:

$$\overrightarrow{h_j} = \mathrm{LSTM}\big(g_j^{ERAP},\, \overrightarrow{h_{j-1}}\big)$$

$$\overleftarrow{h_j} = \mathrm{LSTM}\big(g_j^{REBP},\, \overleftarrow{h_{j+1}}\big)$$
9. the path-and-graph-context based knowledge-graph completion method according to claim 1 or 2, characterized in that the process of the fifth step is as follows:
step (5.1), the triple $(e_h, r, e_t)$ to be predicted is also input into the model; this triple has only one path, $p_r = \{e_h, r, e_t\}$;

step (5.2), from step (5.1) the vector set formed by combining all entities of this path with their context information is obtained: $h_{p_r} = \{h_e, h_t\}$;

step (5.3), a path pattern is formed by combining all entities in the path with their context information; the vector embeddings in the vector set $h_{p_i}$ are concatenated into a vector matrix H;
step (5.4), after the vector matrix H is processed by the activation function tanh, the path pattern representation $A_{p_i}$ of path $p_i$ is obtained; the tanh function is a variant of the sigmoid function and a common activation function in neural networks;

step (5.5), in the same way as step (5.4), the path pattern representation $A_{p_r}$ of the test triple $(e_h, r, e_t)$ is obtained;
step (5.6), the path pattern representations $A_{p_i}$ and $A_{p_r}$ are input into a feed-forward network $f_{pattern}$ to compute the semantic similarity $\tau_i$ of $A_{p_i}$ and $A_{p_r}$, which measures the importance of the path to the model;
step (5.7), all semantic similarities $\tau_i$ are integrated by weighting, obtaining between head entity $e_h$ and tail entity $e_t$ a full information representation that contains all relations and the entity types at all abstraction levels, i.e. the full path pattern representation $A_{p}^{*}$; the relevant formulas are:

$$H = [h_1, h_2, \ldots, h_{n-1}, h_n]$$

$$H_{p_r} = [h_e, h_t]$$

$$A_{p_i} = \tanh(H)$$

$$A_{p_r} = \tanh(H_{p_r})$$

$$\tau_i = f_{pattern}\big(A_{p_i}, A_{p_r}\big)$$

$$\alpha_i^{*} = \frac{\exp(\tau_i)}{\sum_{i'=1}^{s} \exp(\tau_{i'})}$$

$$A_{p}^{*} = \sum_{i=1}^{s} \alpha_i^{*}\, A_{p_i}$$

wherein $A_{p_i}$ denotes the path pattern representation of the i-th path $p_i$, and $\alpha_i^{*}$ is the path weight representing the model's attention to path $p_i$: it combines all the information from the multiple paths, discarding path information the model does not need while retaining the information on the paths that deserve attention; $H_{p_r} = [h_e, h_t]$ is the vector matrix corresponding to the vector set $h_{p_r} = \{h_e, h_t\}$.
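A minimal PyTorch sketch of the fifth step, under the assumption that $f_{pattern}$ is a bilinear scorer and that each path pattern matrix is pooled to a fixed-size vector before comparison (the claim leaves $f_{pattern}$ and the matrix handling unspecified):

```python
import torch
import torch.nn as nn

dim = 256                             # size of each combined hidden state h_j
f_pattern = nn.Bilinear(dim, dim, 1)  # assumed form of the similarity network

def full_path_pattern(H_list, H_pr):
    """H_list: one matrix h_{p_i} of shape (n_i, dim) per path; H_pr: (2, dim)
    matrix [h_e, h_t] for the candidate triple's own path."""
    a_pr = torch.tanh(H_pr).mean(dim=0)                    # pattern A_pr
    a_paths = torch.stack([torch.tanh(H).mean(dim=0) for H in H_list])
    tau = f_pattern(a_paths, a_pr.expand_as(a_paths))      # similarities tau_i
    alpha = torch.softmax(tau, dim=0)                      # path weights alpha_i*
    return (alpha * a_paths).sum(dim=0)                    # A_p* weighted sum
```

The resulting $A_{p}^{*}$ is then passed through the output layer of claim 1 to obtain $P(r \mid e_h, e_t)$.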