CN115481215A - Partner prediction method and prediction system based on temporal partner knowledge graph - Google Patents


Info

Publication number: CN115481215A
Application number: CN202110606169.1A
Authority: CN (China)
Prior art keywords: temporal, entity, embedding, partner, embedded
Legal status: Pending (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Other languages: Chinese (zh)
Inventors: 金栋栋, 程鹏, 林学民, 陈雷
Assignees: Junshuo Shanghai Information Technology Co., Ltd.; East China Normal University (the listed assignees may be inaccurate; Google has not performed a legal analysis)
Application filed by Junshuo Shanghai Information Technology Co., Ltd. and East China Normal University
Priority to: CN202110606169.1A

Classifications

    • G06F16/334 — Information retrieval of unstructured textual data; query processing; query execution
    • G06F16/367 — Creation of semantic tools, e.g. ontology or thesauri; ontology
    • G06F16/9536 — Retrieval from the web; search customisation based on social or collaborative filtering
    • G06F40/30 — Handling natural language data; semantic analysis


Abstract

The invention discloses a partner prediction method based on a temporal partner knowledge graph, which comprises the following steps: 1. constructing a temporal partner knowledge graph; 2. using a semantic encoder based on a pre-trained language model to mine the semantic information of each entity in the temporal partner knowledge graph as its initialization representation; 3. calculating, according to a temporal graph attention mechanism, the normalized attention coefficient of each temporal triple in the neighborhood of each entity with respect to the central entity, and weighting and aggregating the neighborhood information according to these coefficients to update the entity embedding representation; 4. given a triple, decoding the triple embedding matrix with a decoder based on a multi-scale convolution kernel strategy to obtain the score of the triple; 5. given two authors who have never collaborated, obtaining their embedded representations from the embedding encoder and their score from the decoder, and predicting that they will collaborate with each other if the score is greater than a threshold. The method can greatly improve prediction performance.

Description

Partner prediction method and prediction system based on temporal partner knowledge graph
Technical Field
The invention belongs to the technical field of social network analysis, relates to a link prediction method among nodes in a graph, and particularly relates to a partner prediction method and a partner prediction system based on a temporal partner knowledge graph.
Background
In recent decades, due to the rapid development of modern internet and its related technologies, various types of interactions between people have been increasingly frequent, thereby promoting researchers' interest and enthusiasm for increasingly large online social networks. The application scenarios of social network analysis are very rich, such as public opinion analysis and control in a public opinion network, instant user recommendation in an online social system, influence analysis, and the like. Partner networks have also received much attention in recent years as an important subnetwork of social networks. The partner network prediction task is an important task of a partner network, and can provide powerful suggestions for project and paper cooperation among academic researchers, so that suitable partners are recommended, and academic thought exchange among the researchers is increased.
For partner prediction problems, conventional work is often based on homogeneous networks, i.e., networks with only one type of node and one type of relationship, such as co-author networks, friendship networks, and so on. Conventional co-author prediction models often ignore or underutilize the information carried by other relationship types, such as the venues where papers are published, paper topics, and the institutions to which authors belong. This information is very important for co-author prediction. For example, authors who belong to the same institution are more likely to collaborate in the future.
After the knowledge graph (Knowledge Graph), a multi-relation directed graph, was proposed by Google in 2012 and used in semantic search, knowledge graphs re-entered researchers' field of vision. Combining knowledge graphs, which contain a large amount of human prior knowledge, with deep learning has become one of the important ideas for further improving the effect of artificial intelligence, and knowledge graphs have achieved great success in multiple fields such as semantic analysis, question-answering systems, and recommendation systems. A knowledge graph is composed of a large number of triples in the form of (head entity, relation, tail entity), the entities being nodes and the relations being directed edges between the nodes. The partner prediction problem is expressed on the knowledge graph as binary classification of triples, i.e., judging whether a triple is true; in particular, the types of the head entity and the tail entity are fixed as authors, and the relation is the cooperation relationship.
A knowledge graph (KG), as a multi-relation graph, allows a powerful representation of an author node to be obtained through KG embedding learning. KG embedding refers to mapping the entities and relations in a knowledge graph into a continuous embedding space. Existing KG embedding learning models are mainly divided into models for static KGs, such as the translation-based model TransE, the convolutional-neural-network-based ConvE, and the graph-neural-network-based KBGAT, which do not consider time information; and models for temporal KGs, such as TAE, which uses chronological order as a constraint to improve the quality of the embedded representation, and HyTE, which associates each timestamp with a corresponding hyperplane. However, these models still consider each triple independently and cannot model the temporal interaction between entities. Temporal interaction means that, because the relations between the neighbor nodes and the central entity carry time attributes, the neighbors of each central entity interact with one another in time order.
Disclosure of Invention
The invention aims to overcome the defects of the prior art and provides a partner prediction method based on a temporal partner knowledge graph. Due to the complexity of the problem, a deep learning model with an encoder-decoder structure is proposed, wherein the encoder is divided into two parts: a semantic encoder for capturing entity semantic information, and an embedding encoder that models the neighborhood of a central entity through a temporal graph attention mechanism and aggregates the temporal-spatial neighborhood information; the decoder scores triples based on multi-scale convolution kernels. The method comprises the following specific steps:
the model training of the method of the invention comprises the following steps:
step (1): constructing a temporal partner knowledge graph: cleaning the collected academic relations, then creating a temporal partner knowledge graph which is composed of temporal triples in the form of head entities, relations and tail entities and has time attributes, and storing the temporal partner knowledge graph in a Neo4j graph database;
step (2): training a semantic encoder based on a pre-training language model to mine semantic information of an entity in a knowledge graph of a temporal partner as initialization representation of the semantic information;
Step (3): if in step (2) the training rounds of the semantic encoder have not reached a set value, or the training loss has not reached an early-stop condition, calculate, according to a temporal graph attention mechanism, the normalized attention coefficient of each temporal triple in the neighborhood of each entity with respect to the central entity, then weight and aggregate the neighborhood information according to these coefficients and apply a skip connection to update the entity embedding representation. This whole process (calculating triple embeddings, calculating normalized attention coefficients, and weighting and aggregating neighborhood information with a skip connection to update the entity embedding representation) is a single temporal graph attention layer. The embedded representations of the entities are enhanced by training an embedding encoder stacked with at least one temporal graph attention layer; one training round refers to completing the entity-embedding update once with the complete data and updating the parameters of the encoder;
Step (4): after the training rounds of the semantic encoder reach the set value or the training loss reaches the early-stop condition in step (2), given a triple, the head entity, the tail entity, and the relation of the triple form a triple embedding matrix, and a decoder based on the multi-scale convolution kernel strategy is trained to decode the embedding matrix to obtain the score of the triple as its confidence;
the model test of the method comprises the following steps:
Step (5): after the training rounds of the decoder reach a set value or the training loss reaches an early-stop condition in step (4), given two authors who are in the temporal partner knowledge graph but have never cooperated, the embedded representations of the two author entities and of the cooperation relation are obtained from the trained embedding encoder and input into the trained decoder; the decoder scores these entity and relation embeddings to obtain a confidence for the two author entities, and if the confidence is greater than a set threshold, it is predicted that cooperation will occur between the two authors. (In step (4), scoring all triples in the data set once and updating the parameters of the decoder constitutes one training round.)
In the invention, the partner prediction method based on the temporal partner knowledge graph is used for predicting future partners by using the temporal knowledge graph as an information carrier. The temporal partner knowledge graph is a multi-relation directed graph composed of triples in the form of (head entities, relations, tail entities) and with time attributes, wherein the entities are nodes in the knowledge graph, and the relations are directed edges between the entities in the knowledge graph. Future collaborators, i.e. two never collaborated authors, will co-publish the same paper in the future.
In the invention, the specific steps of the step (1) comprise:
step (1.1): the academic relationship is obtained by collecting public data of a thesis website, and refers to a set of multi-element relationships such as cooperation relationships among authors, conference journals published by the thesis and the like; the data cleaning is to remove authors with too few occurrences; and the filtering threshold value of the occurrence times can be freely selected, and an entity set and a relationship set are obtained by screening. And constructing a triple according to the multiple relations including the cooperative relation, the affiliated institution, the subject of the paper and the like in the academic relation, wherein the triple has a time attribute, and the time of creating the triple, such as the publication time of the paper and the like, is added to the triple.
Step (1.2): In order to enable each entity to obtain information from more distant neighbors (i.e., neighbors for which the shortest reachable path between the two entities is longer), an auxiliary directed edge is artificially constructed between each entity and its multi-hop neighbors, facilitating the flow of knowledge in the temporal partner knowledge graph.
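A minimal sketch of one way such auxiliary edges could be generated (the BFS construction and the 2-hop limit are assumptions for illustration, not the patent's exact procedure):

```python
from collections import deque

def auxiliary_edges(adj, max_hops=2):
    """For each node, add an auxiliary edge to every node whose
    shortest-path distance lies in [2, max_hops]."""
    extra = set()
    for src in adj:
        dist = {src: 0}
        q = deque([src])
        while q:  # breadth-first search out to max_hops
            u = q.popleft()
            if dist[u] == max_hops:
                continue
            for v in adj.get(u, ()):
                if v not in dist:
                    dist[v] = dist[u] + 1
                    q.append(v)
        extra.update((src, v) for v, d in dist.items() if d >= 2)
    return extra

# A1 - A2 - A6: A6 is a 2-hop neighbor of A1.
adj = {"A1": ["A2"], "A2": ["A1", "A6"], "A6": ["A2"]}
aux = auxiliary_edges(adj, max_hops=2)
# aux contains the auxiliary edges ("A1", "A6") and ("A6", "A1")
```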
In the invention, the semantic encoder in step (2) is a semantic encoder for capturing entity semantic information. The specific steps are as follows:
step (2.1): in the step (1), each entity e in the temporal partner knowledge graph is composed of a word sequence s, i.e. e = s = [ w ] 1 ,...w l ,...w L ]Wherein w is l Refers to the ith word in the word sequence and L refers to the total number of words in the word sequence. The word sequence s is fed into a pre-trained language model BERT to obtain word embedding for each word [ w 1 ,...w l ,...w L ]=BERT([w 1 ,...w l ,...w L ]) Wherein
Figure BDA0003093654140000031
Represents the word w l Word embedding of d BERT Representing the dimension of word embedding.
Step (2.2): for the word embedding sequence obtained by each entity e, an average strategy is adopted to convert the word embedding sequence into entity embedding, namely
Figure BDA0003093654140000032
Embedding e dimension reduction, i.e. e, into obtained entity using a full-connected layer pair init =FC s (e) In which FC s Denotes a fully connected layer, e init I.e. the initialization embedding of the entity.
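Steps (2.1)-(2.2) can be sketched in plain Python; the BERT call is replaced by a deterministic stub and the fully connected layer FC_s by a random weight matrix, both of which are assumptions for illustration only:

```python
import random

random.seed(0)
D_BERT, D = 4, 2  # toy dimensions; in practice d_BERT = 768 for BERT-base

def fake_bert(words):
    """Stand-in for BERT: deterministic pseudo-random word embeddings."""
    embs = []
    for w in words:
        rng = random.Random(w)  # seed by the word for determinism
        embs.append([rng.uniform(-1, 1) for _ in range(D_BERT)])
    return embs

# hypothetical FC_s weight matrix, shape D x D_BERT
W_fc = [[random.uniform(-1, 1) for _ in range(D_BERT)] for _ in range(D)]

def encode_entity(name):
    words = name.split()
    word_embs = fake_bert(words)                            # step (2.1)
    e = [sum(col) / len(words) for col in zip(*word_embs)]  # average, step (2.2)
    return [sum(w * x for w, x in zip(row, e)) for row in W_fc]  # e_init = FC_s(e)

e_init = encode_entity("East China Normal University")
# e_init is the D-dimensional initialization embedding of the entity
```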
In the invention, the embedding encoder in step (3) refers to an embedding encoder that models the neighborhood of the central entity through a temporal graph attention mechanism, performs temporal interaction, and aggregates the temporal-spatial neighborhood information. The specific steps are as follows:
step (3.1): the normalized attention coefficients of the neighbors of each central entity t are calculated by the temporal graph attention layer. For each central entity t, representing a triplet with the central entity t as the tail entity as the set V = [ V = 1 ,...,v y ,...,v Y ]Wherein v is y =[h i ,r j ,t]And has a time attribute time a Wherein h is i Represents the head entity, r j Representing a relationship and t representing a tail entity, time a Representing the time attribute, Y represents the number of triples in the set V, and theseTriple time by time attribute a And (5) sorting in an ascending order. For each triple v y Applying a linear transformation
Figure BDA0003093654140000033
Where | | | represents splicing operation, W 1 In order to linearly transform the weight matrix,
Figure BDA0003093654140000034
and e t Respectively represent head entity h i Relation r of j And an embedded representation of the tail entity t, v y I.e. the triplet v y The corresponding embedded representation. The resulting embedded set is denoted as V = [ V ] 1 ,...,v y ,...,v Y ]。
The normalized attention coefficient of the triple formed by each neighbor is computed by a bidirectional LSTM network (Bi-LSTM). The Bi-LSTM comprises two LSTM networks, a forward layer and a backward layer. With $\overrightarrow{\mathbf{h}}_{y-1}$ and $\overleftarrow{\mathbf{h}}_{y+1}$ denoting the hidden states of the forward layer and backward layer, the hidden states at time step y are calculated as

$$\overrightarrow{\mathbf{h}}_y = \overrightarrow{\mathrm{LSTM}}(\mathbf{v}_y, \overrightarrow{\mathbf{h}}_{y-1}), \qquad \overleftarrow{\mathbf{h}}_y = \overleftarrow{\mathrm{LSTM}}(\mathbf{v}_y, \overleftarrow{\mathbf{h}}_{y+1}),$$

where $\mathbf{v}_y$ is the embedded representation of the triple. The hidden state of the Bi-LSTM at time step y is the concatenation $\mathbf{h}_y = [\overrightarrow{\mathbf{h}}_y \,\Vert\, \overleftarrow{\mathbf{h}}_y]$. A fully connected layer $\mathrm{FC}_a(\cdot)$ then yields the attention coefficient of the triple $v_y$,

$$c_y = \mathrm{FC}_a(\mathbf{h}_y) = \sigma(W_2\,\mathbf{h}_y + b),$$

where the weight matrix $W_2$ and the bias b are trainable parameters and σ is the LeakyReLU activation function. The normalized attention coefficient is calculated by the softmax function,

$$\alpha_y = \mathrm{softmax}(c_y) = \frac{\exp(c_y)}{\sum_{y'=1}^{Y}\exp(c_{y'})}.$$
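The softmax normalization of the attention coefficients can be sketched as follows (the coefficients c_y below are made-up numbers; the Bi-LSTM that would produce them is omitted):

```python
import math

def softmax(scores):
    m = max(scores)                       # subtract max for numerical stability
    exps = [math.exp(c - m) for c in scores]
    z = sum(exps)
    return [e / z for e in exps]

c = [0.5, 1.2, -0.3]         # hypothetical attention coefficients c_y
alpha = softmax(c)           # normalized coefficients alpha_y, which sum to 1
```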
step (3.2): the weighted aggregate triples update the central entity representation. The central entity t obtains a new embedded representation by weighting the triplet representations in the aggregate set V,
Figure BDA00030936541400000410
σ is the activation function. Through the training process of the multi-head attention mechanism stable attention mechanism,
Figure BDA00030936541400000411
wherein
Figure BDA00030936541400000412
And
Figure BDA00030936541400000413
calculated from the kth independent attention head, K represents the number of attention heads. If the current temporal attention layer is the last layer, the physical embedding of the multi-headed attention mechanism output will not be stitched but averaged,
Figure BDA00030936541400000414
In order to prevent gradient vanishing and the loss of entity semantic information as the network deepens, a skip connection is adopted in each temporal graph attention layer, expressed as

$$\mathbf{e}''_t = W_3\,\mathbf{e}_{\mathrm{init}} + \mathbf{e}'_t,$$

where $W_3$ is a trainable weight matrix, $\mathbf{e}''_t$ is the entity embedding output by the graph attention layer, $\mathbf{e}_{\mathrm{init}}$ is the initialization embedding of the entity output by the semantic encoder in step (2), and $\mathbf{e}'_t$ is the output of the multi-head attention mechanism.
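A single-head sketch of the weighted aggregation and skip connection (the toy dimensions, LeakyReLU as σ, and the identity-like W_3 are illustrative assumptions):

```python
def leaky_relu(x, slope=0.01):
    return x if x > 0 else slope * x

def aggregate(triple_embs, alphas, e_init, w3):
    """e'_t = sigma(sum_y alpha_y v_y), then e''_t = W3 e_init + e'_t."""
    d = len(triple_embs[0])
    e_prime = [leaky_relu(sum(a * v[i] for a, v in zip(alphas, triple_embs)))
               for i in range(d)]
    # skip connection with the semantic initialization embedding
    return [sum(w3[i][j] * e_init[j] for j in range(d)) + e_prime[i]
            for i in range(d)]

V = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]   # embedded triples v_y
alphas = [0.2, 0.3, 0.5]                    # normalized attention coefficients
e_init = [0.1, 0.1]                         # initialization embedding of t
W3 = [[1.0, 0.0], [0.0, 1.0]]               # identity, for illustration only
e_new = aggregate(V, alphas, e_init, W3)    # updated embedding e''_t
```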
Step (3.3): The embedded representation of the relation is updated. A new embedded representation of the relation is learned by a linear transformation, $\mathbf{r}' = W_4\,\mathbf{r}$, where $\mathbf{r}'$ and $\mathbf{e}''_t$ have the same dimension, $W_4$ is a linear transformation weight matrix, $\mathbf{r}$ is the embedded representation before the relation update, and $\mathbf{e}''_t$ is the entity embedding output by the attention layer.
Step (3.4): The encoder is trained. The semantic encoder and the embedding encoder are trained jointly with the loss function

$$L_{\mathrm{Encoder}} = \sum_{(h,r,t)\in\Delta}\;\sum_{(h',r,t')\in\Delta'}\big[d_{(h,r,t)} - d_{(h',r,t')} + \gamma\big]_+,$$

where Δ denotes the set of positive samples, Δ' the set of negative samples, γ a safety margin distance, and $[x]_+ = \max(x, 0)$. The distance is $d_{(h,r,t)} = \Vert\mathbf{e}_h + \mathbf{e}_r - \mathbf{e}_t\Vert_{l_1}$, where $\mathbf{e}_h$, $\mathbf{e}_r$, $\mathbf{e}_t$ respectively denote the embedded representations of h, r, t output by the embedding encoder and $l_1$ denotes the $l_1$ norm.
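A hedged sketch of this margin ranking loss with toy embeddings, assuming a TransE-style l1 distance d(h,r,t) = ||e_h + e_r − e_t||_1:

```python
def l1_distance(e_h, e_r, e_t):
    """d_(h,r,t) = || e_h + e_r - e_t ||_1"""
    return sum(abs(h + r - t) for h, r, t in zip(e_h, e_r, e_t))

def encoder_loss(pos_triples, neg_triples, gamma=1.0):
    """Sum over positive/negative pairs of [d_pos - d_neg + gamma]_+."""
    loss = 0.0
    for eh, er, et in pos_triples:
        d_pos = l1_distance(eh, er, et)
        for eh2, er2, et2 in neg_triples:
            loss += max(d_pos - l1_distance(eh2, er2, et2) + gamma, 0.0)
    return loss

pos = [([0.0, 0.0], [1.0, 1.0], [1.0, 1.0])]   # d = 0: a well-placed triple
neg = [([0.0, 0.0], [1.0, 1.0], [3.0, 3.0])]   # d = 4: a corrupted triple
loss = encoder_loss(pos, neg, gamma=1.0)       # max(0 - 4 + 1, 0) = 0.0
```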
Step (3.5): and calculating a loss function value of the encoder and updating the parameters of the encoder.
In the present invention, the decoder in step (4) is a decoder based on multi-scale convolution kernels for triple scoring. The specific steps are as follows:
step (4.1): the convolution is performed using a set of multi-scale convolution kernels.
Given a triple (h, r, t), let $\mathbf{e}_h$, $\mathbf{e}_r$, $\mathbf{e}_t$ be the embedded representations of h, r, t output by the embedding encoder. The head entity, relation, and tail entity composing the triple are represented as an embedding matrix,

$$A = [\mathbf{e}_h;\, \mathbf{e}_r;\, \mathbf{e}_t] \in \mathbb{R}^{3\times d},$$

where d is the dimension of the entity embeddings and relation embeddings. Convolution kernels $\omega_1, \omega_2, \omega_3$ of three different sizes are convolved with the triple embedding matrix to obtain the feature matrices $\tau_1, \tau_2, \tau_3$, where $\tau_i = \omega_i * A$ is the feature matrix generated by convolution kernel $\omega_i$ and $c_i$ is the dimension of $\tau_i$ after flattening. The generated feature matrices are spliced,

$$\mathbf{p} = [\tau_1 \,\Vert\, \tau_2 \,\Vert\, \tau_3],$$

where $c = c_1 + c_2 + c_3$ is the dimension of $\mathbf{p}$.
Step (4.2): Multiple (at least one) sets of multi-scale convolution kernels are used to stabilize the feature extraction of the convolution process. The M sets of multi-scale convolution kernels are denoted $\Omega' = [\Omega_1, \ldots, \Omega_M]$, and the M output feature vectors are spliced,

$$P = [\mathbf{p}_1 \,\Vert\, \cdots \,\Vert\, \mathbf{p}_M],$$

where Mc is the dimension of the spliced vector P.
Step (4.3): The triple is scored according to P by dot multiplication with a weight vector $\mathbf{w}_d$, $f(h, r, t) = P \cdot \mathbf{w}_d$. The scoring function of the final decoder is

$$f(h,r,t) = \Big(\big\Vert_{m=1}^{M}\,\sigma(\Omega_m * A + b)\Big)\cdot \mathbf{w}_d,$$

where $*$ denotes the convolution operation, b the bias, $\Omega_m$ the m-th set of multi-scale convolution kernels, and A the embedding matrix of the triple (h, r, t).
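A plain-Python sketch of the multi-scale convolution and dot-product scoring (the kernel shapes, weight values, and the toy dimension d = 3 are all illustrative assumptions, not the patent's trained parameters):

```python
def conv_valid(A, kernel):
    """Valid 2-D convolution of a kernel over matrix A, flattened."""
    kr, kc = len(kernel), len(kernel[0])
    rows, cols = len(A), len(A[0])
    out = []
    for i in range(rows - kr + 1):
        for j in range(cols - kc + 1):
            out.append(sum(kernel[a][b] * A[i + a][j + b]
                           for a in range(kr) for b in range(kc)))
    return out

def score(A, kernels, w_d):
    """f(h,r,t): concatenate the features of all kernels, dot with w_d."""
    p = [x for k in kernels for x in conv_valid(A, k)]
    assert len(p) == len(w_d)
    return sum(x * w for x, w in zip(p, w_d))

# embedding matrix A = [e_h; e_r; e_t] with d = 3
A = [[1.0, 0.0, 1.0],   # e_h
     [0.0, 1.0, 0.0],   # e_r
     [1.0, 1.0, 1.0]]   # e_t
kernels = [[[1.0], [1.0], [1.0]],   # a 3x1 kernel -> 3 features
           [[1.0, -1.0]] * 3]      # a 3x2 kernel -> 2 features
w_d = [0.1, 0.2, 0.3, 0.4, 0.5]     # hypothetical weight vector, dim 5
s = score(A, kernels, w_d)          # confidence score of the triple
```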
Step (4.4): The decoder is trained. The loss function of the decoder training is

$$L_{\mathrm{Decoder}} = \sum_{(h,r,t)\in\Delta\cup\Delta'} \log\big(1+\exp\big(-y_{(h,r,t)}\, f(h,r,t)\big)\big) + \frac{\lambda}{2}\big\Vert\mathbf{w}_d\big\Vert_2^2,$$

where $y_{(h,r,t)} = 1$ for positive samples and $y_{(h,r,t)} = -1$ for negative samples, the second term is the $l_2$ regularization of the weight vector $\mathbf{w}_d$, Δ denotes the positive sample set, and Δ' the negative sample set.
In the invention, the step (5) of the model test comprises the following specific steps:
step (5.1): two author entities which never cooperate are selected, corresponding embedding and embedding of cooperation relation are obtained according to a trained embedding encoder, an embedding matrix is formed and input into a trained decoder, and scores of the embedding matrix are obtained. And judging whether the result is true according to a set threshold value.
Based on the method, the invention also provides a partner prediction system based on the temporal partner knowledge graph, which comprises the following steps: a memory and a processor; the memory has stored thereon a computer program which, when executed by the processor, implements the prediction method described above.
Compared with the prior art, the invention has the following beneficial effects: a partner prediction method based on a temporal partner knowledge graph is provided, and the prediction performance can be greatly improved by considering time information and effective semantic capture; by modeling the temporal interaction between the neighbors of the central entity with the temporal graph attention mechanism and aggregating multi-hop temporal-spatial neighborhood information, the information utilization rate can be maximized, and the method can be applied very simply and conveniently to other types of knowledge graphs.
The method first takes the multivariate relations into account, then considers and models the semantic information and the time information, and finally can be conveniently applied to other types of temporal knowledge graphs.
Drawings
FIG. 1 is a flow chart of a temporal partner knowledge graph-based partner prediction method of the present invention.
FIG. 2 is an example of a partner knowledge graph provided by embodiments of the present invention.
FIG. 3 is a model diagram of temporal attention mechanism in the temporal partner knowledge graph-based partner prediction method of the present invention under the embodiment of FIG. 2.
FIG. 4 is a model diagram of an encoder in the temporal partner knowledge-graph based partner prediction method of the present invention under the embodiment of FIG. 2.
FIG. 5 is a model diagram of a decoder in the temporal partner-knowledge-graph-based partner prediction method of the present invention under the embodiment of FIG. 2.
FIG. 6 shows the results of the invention operating on a real-world temporal partner knowledge graph.
Detailed Description
The invention is described in further detail in connection with the following specific examples and the accompanying drawings so that those skilled in the art can better understand the invention. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.
As shown in fig. 1, the partner prediction method based on temporal partner knowledge graph provided by the present invention includes the following steps:
(1) Constructing a knowledge graph of the temporal collaborators, cleaning the collected academic relation, then creating a knowledge graph of the temporal collaborators, which consists of triples with time attributes in the form of head entities, relations and tail entities, and storing the knowledge graph in a Neo4j graph database;
(2) Training a semantic encoder based on a pre-training language model to mine semantic information of an entity in a knowledge graph of a temporal partner as initialization representation of the semantic information;
(3) Calculating a normalized attention coefficient of each temporal triple in the neighborhood of each entity relative to a central entity according to a temporal graph attention mechanism, weighting and aggregating neighborhood information according to the coefficient to update entity embedded representation, wherein the process is a single temporal graph attention layer, and the embedded representation of the entities is enhanced by training an embedded encoder stacked with a plurality of temporal graph attention layers;
(4) Giving a triple, forming a triple embedding matrix by a head entity and a tail entity and a relation of the triple, and training a decoder based on a multi-scale convolution kernel strategy to decode on the embedding matrix to obtain a fraction of the triple as a confidence coefficient;
(5) Given two authors who are already in the temporal partner knowledge graph but never collaborated, their embedded representations are obtained from the trained embedded encoder and input to the trained decoder for their confidence level, and if greater than a set threshold, they are predicted to collaborate with each other.
Examples
Referring to FIG. 2, assume FIG. 2 is a temporal partner knowledge graph with 6 author entities A_1, ..., A_6, 2 paper entities P_1, P_2, an organization entity C_1 to which authors belong, and a conference entity K_1, where the solid edges are true relationships and the dashed edges are artificially added auxiliary edges. Take A_1 as the central entity: A_1's relationship with A_4 is closer than its past relationship with A_5, because A_1 and A_4 have a more frequent and more recent cooperation relationship. With A_1 as the central entity, the temporal interaction of A_1 and A_4 affects the relationship between A_1 and A_5, while the cooperation of A_1 and A_3 also promotes the cooperation of A_1 and A_4. Thus there is temporal interaction among A_1's neighbors, and the generation of new partnerships between A_1 and other authors also depends on the neighborhood information of A_1 and those authors. The partner prediction method based on the temporal partner knowledge graph can model A_1's temporal interactions well through the temporal graph attention mechanism and can effectively aggregate A_1's neighborhood information through the embedding encoder.
The specific steps of the step (1) comprise:
Step (1.1): According to the academic relations obtained from public data of paper websites, the data is cleaned by removing authors with too few occurrences (the filtering threshold can be chosen freely), and an entity set and a relation set are obtained by screening. Triples are constructed from the multiple relations contained in the academic relations, such as the cooperation relation (Co-Author), the affiliated organization (research of), and the conference where a paper is published (publication paper on), and the time at which each triple was created, such as the publication time of the paper, is attached to the triple.
Step (1.2): An auxiliary edge is artificially constructed between the central entity A_1 and its multi-hop neighbor A_6.
As shown in fig. 3, the 7 triples with the central entity A_1 as tail entity are sorted by time attribute and fed into the bidirectional LSTM network; at each step the hidden state of the bidirectional LSTM is fed into a fully connected layer to obtain the attention coefficient of each triple, and the normalized attention coefficients corresponding to the 7 triples are obtained through the softmax function.
As shown in fig. 4, the triples with entity A_1 as tail entity are fed into the encoder. First, each triple is sent to the embedding encoder for embedding initialization; for ease of model visualization the dimension is assumed to be 2 (in practice it can be set to 50, etc.). Then the initialized embeddings of the head entity, tail entity, and relation of each triple are spliced and fed into 2 different attention heads (in practice 8 or another value can be set) to compute a new embedding of the central entity A_1; the outputs are spliced, and a skip connection with A_1's initialization embedding is applied to obtain A_1's new representation, while the relation embedding is learned by a linear transformation to the same dimension as A_1's representation. This constitutes the first temporal graph attention layer. Then the computation of the second temporal graph attention layer is carried out to obtain a further new representation of A_1. Finally, the loss is calculated from the new representations of the head and tail entities and relations of the triples to train the encoder.
The specific steps of the step (3) comprise:
and (3.1) calculating the normalized attention coefficient of the neighbors of each central entity through the temporal graph attention layer. For each central entity t, the triples with the central entity t as the tail entity are represented as the set V = [v_1, ..., v_y, ..., v_Y], where v_y = [h_i, r_j, t] carries a time attribute time_a; h_i denotes the head entity, r_j the relation, t the tail entity, time_a the time attribute, and Y the number of triples in the set V. These triples are sorted in ascending order by the attribute time_a. A linear transformation is applied to each triple v_y,

v_y = W_1 [ e_{h_i} \| e_{r_j} \| e_t ],

where \| denotes the splicing (concatenation) operation, W_1 is a linear transformation weight matrix, and e_{h_i}, e_{r_j} and e_t denote the embedded representations of the head entity h_i, the relation r_j and the tail entity t, respectively; v_y is thus the embedding corresponding to the triple v_y. The resulting embedding set is denoted V = [v_1, ..., v_y, ..., v_Y].

The normalized attention coefficient of the triple formed by each neighbor is computed by the bidirectional LSTM network Bi-LSTM, which comprises two LSTM networks, a forward layer and a backward layer. Specifically, let \overrightarrow{h}_{y-1} and \overleftarrow{h}_{y-1} denote the hidden states of the forward layer and the backward layer at time step y-1; the hidden states of the forward layer and the backward layer at time step y are calculated as

\overrightarrow{h}_y = LSTM_f(v_y, \overrightarrow{h}_{y-1}) and \overleftarrow{h}_y = LSTM_b(v_y, \overleftarrow{h}_{y-1}).

The hidden state of the Bi-LSTM at time step y is then

h_y = [ \overrightarrow{h}_y \| \overleftarrow{h}_y ].

The attention coefficient of the triple v_y is obtained through a fully connected layer,

c_y = FC_a(h_y) = \sigma(W_2 h_y + b),

where the weight matrix W_2 and the bias b are trainable parameters and \sigma is the activation function LeakyReLU. The normalized attention coefficient is calculated by the softmax function,

\alpha_y = softmax(c_y) = exp(c_y) / \sum_{y'=1}^{Y} exp(c_{y'}).
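As a concrete illustration of step (3.1), the following minimal sketch computes normalized attention coefficients for a time-sorted neighborhood with a small hand-rolled Bi-LSTM. All dimensions, the random weights and the bare-bones LSTM cell are assumptions for demonstration only, not the trained parameters of the invention:

```python
import numpy as np

rng = np.random.default_rng(0)

def lstm_step(x, h, c, W, U, b):
    """One LSTM step; gates stacked as [input, forget, cell, output]."""
    z = W @ x + U @ h + b
    d = h.shape[0]
    i = 1 / (1 + np.exp(-z[:d]))          # input gate
    f = 1 / (1 + np.exp(-z[d:2*d]))       # forget gate
    g = np.tanh(z[2*d:3*d])               # candidate cell state
    o = 1 / (1 + np.exp(-z[3*d:]))        # output gate
    c_new = f * c + i * g
    return o * np.tanh(c_new), c_new

def leaky_relu(x, slope=0.01):
    return np.where(x > 0, x, slope * x)

d, hid, Y = 4, 3, 7                        # embedding dim, LSTM hidden dim, |V|
W1 = rng.normal(size=(d, 3 * d))           # linear transform over [e_h || e_r || e_t]
triples = [rng.normal(size=3 * d) for _ in range(Y)]   # time-sorted neighborhood
V = [W1 @ t for t in triples]              # v_y = W1 [e_h || e_r || e_t]

# forward and backward LSTM parameters (same shapes, independent weights)
params = [(rng.normal(size=(4 * hid, d)), rng.normal(size=(4 * hid, hid)),
           np.zeros(4 * hid)) for _ in range(2)]
hf, cf = np.zeros(hid), np.zeros(hid)
fwd = []
for v in V:                                # forward layer: ascending time order
    hf, cf = lstm_step(v, hf, cf, *params[0])
    fwd.append(hf)
hb, cb = np.zeros(hid), np.zeros(hid)
bwd = []
for v in reversed(V):                      # backward layer: descending time order
    hb, cb = lstm_step(v, hb, cb, *params[1])
    bwd.append(hb)
bwd = bwd[::-1]

W2 = rng.normal(size=2 * hid)              # fully connected scoring layer
b = 0.0
scores = np.array([leaky_relu(W2 @ np.concatenate([f_, b_]) + b)
                   for f_, b_ in zip(fwd, bwd)])
alpha = np.exp(scores) / np.exp(scores).sum()   # softmax normalization
```

The resulting `alpha` is a distribution over the Y = 7 neighboring triples, matching the softmax normalization above.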
and (3.2) updating the central entity representation by weighted aggregation of triples. The central entity t obtains a new embedded representation by weighting the triple representations in the aggregated set V,

e'_t = \sigma( \sum_{y=1}^{Y} \alpha_y v_y ).

The training process of the attention mechanism is stabilized through a multi-head attention mechanism,

e'_t = \|_{k=1}^{K} \sigma( \sum_{y=1}^{Y} \alpha_y^k v_y^k ),

where \alpha_y^k and v_y^k are calculated by the k-th independent attention head and K denotes the number of attention heads. If the current temporal graph attention layer is the last layer, the entity embeddings output by the multi-head attention mechanism are not spliced but averaged,

e'_t = \sigma( (1/K) \sum_{k=1}^{K} \sum_{y=1}^{Y} \alpha_y^k v_y^k ).

In order to prevent gradient vanishing and the loss of entity semantic information caused by deepening the network, a skip connection is adopted in each temporal graph attention layer, expressed as

e_t^f = W_3 e_{init} + e'_t,

where W_3 is a trainable weight matrix, e_t^f is the entity embedding output by the graph attention layer, e_{init} is the initialized embedding of the entity output by the semantic encoder in step (2), and e'_t is the output of the multi-head attention mechanism.
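Step (3.2) — multi-head weighted aggregation with splicing on hidden layers, averaging on the last layer, and the skip connection e_t^f = W_3 e_init + e'_t — can be sketched as follows. The toy dimensions and random weights are assumptions, and tanh stands in for the unspecified activation σ:

```python
import numpy as np

rng = np.random.default_rng(1)
d, Y, K = 4, 7, 2                  # embedding dim, neighbors, attention heads

def aggregate(alpha, V, last_layer=False):
    """Weighted aggregation over K heads: splice on hidden layers, average on the last."""
    if last_layer:
        # average the per-head weighted sums, then apply the activation
        return np.tanh(np.mean([V[k].T @ alpha[k] for k in range(K)], axis=0))
    # sigma(sum_y alpha_y^k v_y^k) per head, then splice
    heads = [np.tanh(V[k].T @ alpha[k]) for k in range(K)]
    return np.concatenate(heads)

V = rng.normal(size=(K, Y, d))               # per-head triple embeddings v_y^k
raw = rng.normal(size=(K, Y))
alpha = np.exp(raw) / np.exp(raw).sum(axis=1, keepdims=True)  # per-head softmax

e_prime = aggregate(alpha, V)                  # hidden layer: shape (K*d,)
e_last = aggregate(alpha, V, last_layer=True)  # last layer: shape (d,)

# skip connection: e_t^f = W3 e_init + e'_t
e_init = rng.normal(size=d)
W3 = rng.normal(size=(d, d))
e_f = W3 @ e_init + e_last
```

Note that splicing multiplies the output dimension by K, which is why the last layer averages instead: the final entity embedding must keep the dimension d.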
Step (3.3) updates the embedded representation of the relation. A new embedded representation of the relation is learned by a linear transformation, r' = W_4 r, where r' and e_t^f have the same dimension, W_4 is a linear transformation weight matrix, and r is the embedded representation before the relation update.

And (3.4) training the loss function of the encoder. The semantic encoder and the embedding encoder are jointly trained with the loss function

L_{Encoder} = \sum_{(h,r,t) \in \Delta} \sum_{(h',r,t') \in \Delta'} [ d_{(h,r,t)} - d_{(h',r,t')} + \gamma ]_+,

where \Delta denotes the positive sample set, \Delta' the negative sample set, and \gamma the safety margin distance, with the distance

d_{(h,r,t)} = \| e_h + e_r - e_t \|_1,

where e_h, e_r, e_t denote the embedded representations of h, r, t output by the embedding encoder and \| \cdot \|_1 denotes the l_1 norm (l_1 regularization).
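The joint margin loss of step (3.4) can be sketched as below, assuming the translation-style distance d_(h,r,t) = ||e_h + e_r - e_t||_1 described above; random toy embeddings stand in for encoder outputs:

```python
import numpy as np

def l1_distance(e_h, e_r, e_t):
    """d_(h,r,t) = || e_h + e_r - e_t ||_1 (translation-based distance)."""
    return np.abs(e_h + e_r - e_t).sum()

def encoder_loss(pos, neg, gamma=1.0):
    """Margin ranking loss: sum over positive/negative pairs of [d_pos - d_neg + gamma]_+."""
    total = 0.0
    for (h, r, t) in pos:
        for (h2, r2, t2) in neg:
            total += max(l1_distance(h, r, t) - l1_distance(h2, r2, t2) + gamma, 0.0)
    return total

rng = np.random.default_rng(2)
pos = [tuple(rng.normal(size=4) for _ in range(3)) for _ in range(2)]  # Delta
neg = [tuple(rng.normal(size=4) for _ in range(3)) for _ in range(2)]  # Delta'
loss = encoder_loss(pos, neg)
```

Minimizing this loss pushes positive triples at least γ closer (in l_1 distance) than negative ones.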
Example: as shown in FIG. 5, assume that the current triple is (A_5, Co-Author, A_1, 2013) and that the embedding dimension of the entities and relations output by the embedding encoder is 7. Assuming there is only one set of multi-scale convolution kernels, the triple embedding matrix passes through three convolution kernels of different sizes, \omega_1, \omega_2, \omega_3, to obtain three feature maps

\tau_i = ReLU(\omega_i * A + b_i), i = 1, 2, 3,

where ReLU is the activation function and b_1, b_2, b_3 are different biases. The feature maps are spliced into a vector p, which is dot-multiplied with the weight matrix w_d to obtain the score, i.e. the confidence, of the triple.
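The FIG. 5 example can be sketched as follows. Only the embedding dimension 7 and the use of three differently sized kernels are fixed above, so the concrete kernel shapes (3x1, 3x2, 3x3) and the random weights are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(3)
d = 7
A = rng.normal(size=(3, d))        # embedding matrix [e_h; e_r; e_t]

def conv_valid(A, w, b):
    """Valid 2D cross-correlation of A with kernel w plus bias b, then ReLU."""
    kh, kw = w.shape
    H, W = A.shape
    out = np.empty((H - kh + 1, W - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = (A[i:i+kh, j:j+kw] * w).sum() + b
    return np.maximum(out, 0.0)    # ReLU

# one group of multi-scale kernels; the 3xk sizes are assumed for illustration
kernels = [rng.normal(size=(3, k)) for k in (1, 2, 3)]
biases = [0.1, 0.2, 0.3]
taus = [conv_valid(A, w, b).ravel() for w, b in zip(kernels, biases)]
p = np.concatenate(taus)           # p = [tau_1 || tau_2 || tau_3], c = c_1+c_2+c_3

w_d = rng.normal(size=p.size)
score = float(p @ w_d)             # triple score, i.e. confidence
```

With these kernel shapes the feature maps have 7, 6 and 5 entries, so c = 18.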
In the invention, the specific steps of the step (4) comprise:

and (4.1) carrying out convolution with a multi-scale convolution kernel set. Given a triple (h, r, t), let e_h, e_r, e_t denote the embedded representations of h, r, t output by the embedding encoder. The triple is represented as an embedding matrix

A = [ e_h ; e_r ; e_t ] \in R^{3 \times d},

where d denotes the dimension of the entity and relation embeddings. Convolution kernels of three different sizes, \omega_1, \omega_2, \omega_3, are used to convolve the triple embedding matrix to obtain the feature matrices \tau_1, \tau_2, \tau_3, where

\tau_i = ReLU(\omega_i * A + b_i)

denotes the feature matrix generated by the convolution kernel \omega_i and c_i denotes the dimension of the feature matrix \tau_i. The generated feature matrices are spliced,

p = [ \tau_1 \| \tau_2 \| \tau_3 ] \in R^c, c = c_1 + c_2 + c_3,

where c is the dimension of p.

Step (4.2) convolves with multiple (at least one) sets of multi-scale convolution kernels to stabilize the process of feature extraction. The multiple sets of multi-scale convolution kernels are denoted as \Omega' = [\Omega_1, ..., \Omega_M], and the M output feature vectors are spliced,

P = [ p_1 \| ... \| p_M ] \in R^{Mc},

where Mc is the dimension of the spliced vector P.

And (4.3) scoring the triple according to P. A dot product with the weight matrix w_d is performed, f(h, r, t) = P \cdot w_d. The scoring function of the final decoder is

f(h, r, t) = ( \|_{m=1}^{M} ReLU(\Omega_m * A + b) ) \cdot w_d,

where * denotes the convolution operation, b denotes the bias, \Omega_m is the m-th multi-scale convolution kernel set, and A is the embedding matrix of the triple (h, r, t).
And (4.4) training the decoder. The loss function of the decoder training is

L_{Decoder} = \sum_{(h,r,t) \in \Delta \cup \Delta'} log( 1 + exp( l_{(h,r,t)} \cdot f(h,r,t) ) ) + (\lambda/2) \| w_d \|_2^2,

where l_{(h,r,t)} = -1 for (h,r,t) \in \Delta and l_{(h,r,t)} = 1 for (h,r,t) \in \Delta', \| w_d \|_2^2 is the l_2 regularization of the weight vector w_d, \Delta denotes the positive sample set, and \Delta' the negative sample set.
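A minimal sketch of the decoder loss of step (4.4), assuming a soft-margin form with labels l = -1 for positive and l = +1 for negative triples plus l_2 regularization of w_d; λ and the toy scores are illustrative:

```python
import numpy as np

def decoder_loss(scores_pos, scores_neg, w_d, lam=0.01):
    """Soft-margin loss: log(1 + exp(l * f)) with l = -1 for positives and
    l = +1 for negatives, plus (lam/2) * ||w_d||_2^2 regularization."""
    pos = np.log1p(np.exp(-np.asarray(scores_pos, dtype=float)))  # l = -1
    neg = np.log1p(np.exp(np.asarray(scores_neg, dtype=float)))   # l = +1
    return float(pos.sum() + neg.sum() + 0.5 * lam * (w_d ** 2).sum())

rng = np.random.default_rng(4)
w_d = rng.normal(size=18)
loss = decoder_loss([2.0, 1.5], [-1.0, -0.5], w_d)
```

High scores for positive triples and low scores for negative triples drive the loss toward its regularization floor.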
As shown in fig. 6, the method achieves excellent results on a real-world academic relationship network.
The protection of the present invention is not limited to the above embodiments. Variations and advantages that may occur to those skilled in the art are intended to be included within the invention without departing from the spirit and scope of the inventive concept, and the scope of the invention is to be determined by the appended claims.

Claims (8)

1. A partner prediction method based on a temporal partner knowledge graph is characterized by comprising the following steps:
step (1): constructing a temporal partner knowledge graph, performing data cleaning on the acquired academic relation, then creating a temporal partner knowledge graph consisting of temporal triples with time attributes in the form of head entities, relations and tail entities, and storing the temporal partner knowledge graph in a Neo4j graph database;
step (2): training a semantic encoder based on a pre-training language model to mine semantic information of an entity in a temporal partner knowledge graph as an initialization representation of the semantic information;
step (3): if the training rounds of the semantic encoder in step (2) have not reached the set value or the training loss has not met the early-stopping condition, calculating the normalized attention coefficient, relative to the central entity, of the temporal triples in the neighborhood of each entity according to a temporal graph attention mechanism, and weighting and aggregating the neighborhood information according to these coefficients to update the entity embedded representation; the whole process of calculating the triple embeddings, calculating the normalized attention coefficients, weighting and aggregating the neighborhood information, and adding a skip connection to update the entity embedded representation constitutes a single temporal graph attention layer, and the embedded representation of the entities is enhanced by training an embedding encoder stacked with at least one temporal graph attention layer;

step (4): after the training rounds of the semantic encoder in step (2) reach the set value or the training loss meets the early-stopping condition, given a triple, forming a triple embedding matrix from its head entity, tail entity and relation, and training a decoder based on the multi-scale convolution kernel strategy to decode the embedding matrix to obtain the score of the triple as its confidence;

step (5): after the training rounds of the decoder in step (4) reach the set value or the training loss meets the early-stopping condition, given two authors who are already in the temporal partner knowledge graph but have never cooperated, obtaining their embedded representations from the trained embedding encoder and inputting them into the trained decoder to obtain a confidence; if the confidence is larger than a set threshold value, cooperation between the two authors is predicted.
2. The temporal partner knowledge graph-based partner prediction method of claim 1, wherein a temporal partner knowledge graph is used as the information carrier for future partner prediction, the temporal partner knowledge graph being a multi-relational directed graph composed of triples in the form of head entity, relation, tail entity with time attributes, the entities being nodes in the temporal partner knowledge graph and the relations being directed edges between the entities in the temporal partner knowledge graph; future partners are two authors who have never cooperated but will co-publish the same paper in the future.
3. The temporal partner knowledge graph-based partner prediction method according to claim 1, wherein the specific steps of step (1) comprise:
step (1.1): the academic relations refer to a set of multivariate relations, including the cooperation relations among authors and the conferences and journals in which papers are published, obtained from the public data of paper websites; data cleaning refers to removing authors with too few occurrences and screening to obtain the entity set and the relation set, where the filtering threshold on the number of occurrences can be freely chosen; triples are constructed from the multivariate relations contained in the academic relations, such as cooperation relations, affiliated institutions and the subject of the paper, and the time at which each triple was created, such as the publication time of the paper, is attached to the triple;
step (1.2): in order to enable each entity to obtain information from more distant neighbors, auxiliary directed edges are artificially constructed between each entity and its multi-hop neighbors, so that knowledge flows in the temporal partner knowledge graph; "more distant" means that the length of the shortest reachable path between the two entities is greater than one hop.
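The auxiliary-edge construction of step (1.2) can be sketched as a breadth-first search that links each entity to every neighbor reachable in 2 to k hops; the toy author/paper graph below is an illustrative assumption:

```python
from collections import deque

def add_auxiliary_edges(edges, max_hops=2):
    """Add a directed auxiliary edge from each node to every neighbor reachable
    in 2..max_hops steps, so information can flow past immediate neighbors."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
    aux = set()
    for src in adj:
        dist = {src: 0}
        q = deque([src])
        while q:
            u = q.popleft()
            if dist[u] == max_hops:        # do not expand beyond the hop limit
                continue
            for v in adj.get(u, ()):
                if v not in dist:
                    dist[v] = dist[u] + 1
                    q.append(v)
                    if dist[v] >= 2:       # 1-hop edges already exist
                        aux.add((src, v))
    return aux

# toy graph: authors A1, A2 linked through papers P1, P2
edges = [("A1", "P1"), ("P1", "A2"), ("A2", "P2")]
aux_edges = add_auxiliary_edges(edges, max_hops=2)
```

Here the auxiliary edges connect A1 directly to its 2-hop neighbor A2 and P1 to P2.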
4. The temporal partner knowledge graph-based partner prediction method according to claim 1, wherein the semantic encoder of step (2) comprises the following specific steps:
step (2.1): each entity e in the temporal partner knowledge graph is described by a word sequence s, i.e. e = s = [w_1, ..., w_l, ..., w_L], where w_l refers to the l-th word in the word sequence and L refers to the total number of words in the word sequence; the word sequence s is fed into the pre-trained language model BERT to obtain the word embedding of each word, [w_1, ..., w_l, ..., w_L] = BERT([w_1, ..., w_l, ..., w_L]), where w_l \in R^{d_{BERT}} represents the word embedding of the word w_l and d_{BERT} represents the dimension of the word embeddings;

step (2.2): for the word embedding sequence obtained for each entity e, an averaging strategy is adopted to convert it into the entity embedding, i.e.

e = (1/L) \sum_{l=1}^{L} w_l;

a fully connected layer is then used to reduce the dimension of the obtained entity embedding e, i.e. e_{init} = FC_s(e), where FC_s denotes the fully connected layer and e_{init} is the initialized embedding of the entity.
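The averaging-and-projection of steps (2.1)-(2.2) can be sketched as follows; random vectors stand in for the BERT word embeddings, and the toy dimensions are assumptions (a real run would use d_BERT = 768 and an actual pre-trained model):

```python
import numpy as np

rng = np.random.default_rng(5)
d_bert, d = 8, 4                   # toy BERT hidden size, target embedding dim

def encode_entity(word_embeddings, W, b):
    """Average the word embeddings of the entity's word sequence, then project
    with a fully connected layer: e_init = FC_s(mean(w_1 .. w_L))."""
    e = word_embeddings.mean(axis=0)       # averaging strategy over the sequence
    return W @ e + b                       # dimensionality reduction

# stand-in for BERT output on a 3-word entity name
words = rng.normal(size=(3, d_bert))
W_fc = rng.normal(size=(d, d_bert))
e_init = encode_entity(words, W_fc, np.zeros(d))
```

`e_init` is the initialized embedding handed to the temporal graph attention layers of step (3).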
5. The temporal partner knowledge graph-based partner prediction method according to claim 1, wherein the step (3) of embedding the encoder comprises the specific steps of:
step (3.1): calculating the normalized attention coefficient of the neighbors of each central entity t through the temporal graph attention layer; for each central entity t, the triples with the central entity t as the tail entity are represented as the set V = [v_1, ..., v_y, ..., v_Y], where v_y = [h_i, r_j, t] carries a time attribute time_a, h_i denotes the head entity, r_j the relation, t the tail entity, time_a the time attribute, and Y the number of triples in the set V; these triples are sorted in ascending order by the time attribute time_a; a linear transformation is applied to each triple v_y,

v_y = W_1 [ e_{h_i} \| e_{r_j} \| e_t ],

where \| denotes the splicing operation, W_1 is a linear transformation weight matrix, and e_{h_i}, e_{r_j} and e_t denote the embedded representations of the head entity h_i, the relation r_j and the tail entity t respectively, v_y being the embedded representation of the triple; the resulting embedding set is denoted V = [v_1, ..., v_y, ..., v_Y];

the normalized attention coefficient of the triple formed by each neighbor is calculated by the bidirectional LSTM network Bi-LSTM, which comprises two LSTM networks, a forward layer and a backward layer; specifically, \overrightarrow{h}_{y-1} and \overleftarrow{h}_{y-1} denote the hidden states of the forward layer and the backward layer at time step y-1, and the hidden states of the forward layer and the backward layer at time step y are calculated as

\overrightarrow{h}_y = LSTM_f(v_y, \overrightarrow{h}_{y-1}) and \overleftarrow{h}_y = LSTM_b(v_y, \overleftarrow{h}_{y-1}),

where v_y is the embedded representation of the triple; the hidden state of the Bi-LSTM at time step y is h_y = [ \overrightarrow{h}_y \| \overleftarrow{h}_y ]; the attention coefficient of the triple v_y is then obtained through a fully connected layer FC_a(),

c_y = FC_a(h_y) = \sigma(W_2 h_y + b),

where the weight matrix W_2 and the bias b are trainable parameters and \sigma is the activation function LeakyReLU; the normalized attention coefficient is calculated by the softmax function,

\alpha_y = softmax(c_y) = exp(c_y) / \sum_{y'=1}^{Y} exp(c_{y'});
step (3.2): updating the central entity representation by weighted aggregation of triples; the central entity t obtains a new embedded representation by weighting the triple representations in the aggregated set V,

e'_t = \sigma( \sum_{y=1}^{Y} \alpha_y v_y ),

where \sigma is an activation function; the training process of the attention mechanism is stabilized through a multi-head attention mechanism,

e'_t = \|_{k=1}^{K} \sigma( \sum_{y=1}^{Y} \alpha_y^k v_y^k ),

where \alpha_y^k and v_y^k are calculated by the k-th independent attention head and K denotes the number of attention heads; if the current temporal graph attention layer is the last layer, the entity embeddings output by the multi-head attention mechanism are not spliced but averaged,

e'_t = \sigma( (1/K) \sum_{k=1}^{K} \sum_{y=1}^{Y} \alpha_y^k v_y^k );

in order to prevent gradient vanishing and the loss of entity semantic information caused by deepening the network, a skip connection is adopted in each temporal graph attention layer, expressed as

e_t^f = W_3 e_{init} + e'_t,

where W_3 is a trainable weight matrix, e_t^f is the entity embedding output by the graph attention layer, e_{init} is the initialized embedding of the entity output by the semantic encoder in step (2), and e'_t is the output of the multi-head attention mechanism;

step (3.3): updating the embedded representation of the relation; a new embedded representation of the relation is learned by a linear transformation, r' = W_4 r, where r' and e_t^f have the same dimension, W_4 is a linear transformation weight matrix, and r is the embedded representation before the relation update;

step (3.4): jointly training the semantic encoder and the embedding encoder with the loss function

L_{Encoder} = \sum_{(h,r,t) \in \Delta} \sum_{(h',r,t') \in \Delta'} [ d_{(h,r,t)} - d_{(h',r,t')} + \gamma ]_+,

where \Delta denotes the positive sample set, \Delta' the negative sample set, \gamma the safety margin distance, [x]_+ = Max[x, 0], and

d_{(h,r,t)} = \| e_h + e_r - e_t \|_1,

where e_h, e_r, e_t denote the embedded representations of h, r, t output by the embedding encoder and \| \cdot \|_1 denotes the l_1 norm;

step (3.5): calculating the loss function value of the encoder and updating the parameters of the encoder.
6. The temporal partner knowledge graph-based partner prediction method according to claim 1, wherein the decoder of step (4) comprises the following specific steps:
step (4.1): performing convolution with a multi-scale convolution kernel set; given a triple (h, r, t), e_h, e_r, e_t denote the embedded representations of h, r, t output by the embedding encoder; the head entity, tail entity and relation of the triple are represented as an embedding matrix,

A = [ e_h ; e_r ; e_t ] \in R^{3 \times d},

where d denotes the dimensions of the entity and relation embeddings; convolution kernels of three different sizes, \omega_1, \omega_2, \omega_3, are used to convolve the triple embedding matrix to obtain the feature matrices \tau_1, \tau_2, \tau_3, where

\tau_i = ReLU(\omega_i * A + b_i)

denotes the feature matrix generated by the convolution kernel \omega_i and c_i denotes the dimension of the feature matrix \tau_i; the generated feature matrices are spliced,

p = [ \tau_1 \| \tau_2 \| \tau_3 ] \in R^c, c = c_1 + c_2 + c_3,

where c is the dimension of p;

step (4.2): convolving with at least one set of multi-scale convolution kernels to stabilize the process of feature extraction; the M sets of multi-scale convolution kernels are denoted as \Omega' = [\Omega_1, ..., \Omega_M], and the M output feature vectors are spliced,

P = [ p_1 \| ... \| p_M ] \in R^{Mc},

where Mc is the dimension of the spliced vector P;

step (4.3): scoring the triple according to P; a dot product with the weight matrix w_d is performed, f(h, r, t) = P \cdot w_d; the scoring function of the final decoder is

f(h, r, t) = ( \|_{m=1}^{M} ReLU(\Omega_m * A + b) ) \cdot w_d,

where * denotes the convolution operation, b denotes the bias, \Omega_m is the m-th multi-scale convolution kernel set, and A is the embedding matrix of the triple (h, r, t);

step (4.4): training the decoder; the loss function of the decoder training is

L_{Decoder} = \sum_{(h,r,t) \in \Delta \cup \Delta'} log( 1 + exp( l_{(h,r,t)} \cdot f(h,r,t) ) ) + (\lambda/2) \| w_d \|_2^2,

where l_{(h,r,t)} = -1 for (h,r,t) \in \Delta and l_{(h,r,t)} = 1 for (h,r,t) \in \Delta', \| w_d \|_2^2 is the l_2 regularization of the weight vector w_d, \Delta denotes the positive sample set, and \Delta' the negative sample set.
7. The temporal partner knowledge graph-based partner prediction method according to claim 1, wherein the step (5) of model testing comprises the following specific steps:
step (5.1): selecting two author entities that have never cooperated, obtaining their corresponding embeddings and the embedding of the cooperation relation from the trained embedding encoder, forming an embedding matrix, inputting the embedding matrix into the trained decoder, and obtaining its score; whether future cooperation is predicted is judged according to a set threshold value.
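The test-time prediction of step (5.1) can be sketched as below; the dot-product stand-in for the trained decoder, the random embeddings and the zero threshold are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(6)
d = 4

def predict_cooperation(e_a, e_b, e_rel, score_fn, threshold=0.0):
    """Form the (head, relation, tail) embedding matrix for two authors and the
    cooperation relation, score it with the trained decoder, and predict
    cooperation when the confidence exceeds the threshold."""
    A = np.stack([e_a, e_rel, e_b])        # 3 x d embedding matrix
    return score_fn(A) > threshold

# toy stand-in for the trained decoder: flatten and dot with a weight vector
w = rng.normal(size=3 * d)
score_fn = lambda A: float(A.ravel() @ w)

e_a, e_b, e_rel = rng.normal(size=(3, d))  # stand-ins for encoder outputs
will_cooperate = predict_cooperation(e_a, e_b, e_rel, score_fn)
```

In the full system `score_fn` would be the trained multi-scale convolution decoder of step (4).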
8. A temporal partner knowledge graph-based partner prediction system, comprising: a memory and a processor;
the memory has stored thereon a computer program which, when executed by the processor, implements the prediction method as claimed in any one of claims 1-7.
CN202110606169.1A 2021-05-31 2021-05-31 Partner prediction method and prediction system based on temporal partner knowledge graph Pending CN115481215A (en)


Publications (1)

Publication Number Publication Date
CN115481215A true CN115481215A (en) 2022-12-16


Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116431836A (en) * 2023-06-12 2023-07-14 湖北大学 Knowledge graph embedding method and system based on multi-scale dynamic convolution network model



Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination