CN115481215A - Partner prediction method and prediction system based on temporal partner knowledge graph - Google Patents


Info

Publication number: CN115481215A
Application number: CN202110606169.1A
Authority: CN (China)
Prior art keywords: temporal, entity, embedding, partner, embedded
Legal status: Pending (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Other languages: Chinese (zh)
Inventors: 金栋栋, 程鹏, 林学民, 陈雷
Assignees: Junshuo Shanghai Information Technology Co., Ltd.; East China Normal University (the listed assignees may be inaccurate; Google has not performed a legal analysis)
Application filed by Junshuo Shanghai Information Technology Co., Ltd. and East China Normal University
Priority to: CN202110606169.1A

Classifications

    • G06F16/334 — Information retrieval of unstructured textual data; query processing; query execution
    • G06F16/367 — Creation of semantic tools, e.g. ontology or thesauri; ontology
    • G06F16/9536 — Retrieval from the web; search customisation based on social or collaborative filtering
    • G06F40/30 — Handling natural language data; semantic analysis


Abstract

The invention discloses a partner prediction method based on a temporal partner knowledge graph, which comprises the following steps: 1. constructing a temporal partner knowledge graph; 2. using a semantic encoder based on a pre-trained language model to mine the semantic information of each entity in the temporal partner knowledge graph as its initialization representation; 3. calculating, according to a temporal graph attention mechanism, the normalized attention coefficient of each temporal triple in the neighborhood of each entity with respect to the central entity, and weighting and aggregating the neighborhood information according to these coefficients to update the entity embedding representation; 4. given a triple, decoding the triple embedding matrix with a decoder based on a multi-scale convolution kernel strategy to obtain the score of the triple; 5. given two authors who have never collaborated, obtaining their embedded representations from the embedding encoder and their score from the decoder, and predicting that they will collaborate with each other if the score is greater than a threshold. The method can greatly improve prediction performance.

Description

Partner prediction method and prediction system based on temporal partner knowledge graph
Technical Field
The invention belongs to the technical field of social network analysis, relates to a link prediction method among nodes in a graph, and particularly relates to a partner prediction method and a partner prediction system based on a temporal partner knowledge graph.
Background
In recent decades, due to the rapid development of modern internet and its related technologies, various types of interactions between people have been increasingly frequent, thereby promoting researchers' interest and enthusiasm for increasingly large online social networks. The application scenarios of social network analysis are very rich, such as public opinion analysis and control in a public opinion network, instant user recommendation in an online social system, influence analysis, and the like. Partner networks have also received much attention in recent years as an important subnetwork of social networks. The partner network prediction task is an important task of a partner network, and can provide powerful suggestions for project and paper cooperation among academic researchers, so that suitable partners are recommended, and academic thought exchange among the researchers is increased.
For partner prediction problems, conventional work is often based on homogeneous networks, i.e., networks with only one type of node and one type of relationship, such as co-author networks, friendship networks, and so on. Conventional co-author prediction models often ignore or underutilize the information carried by other relationship types, such as the venues where papers are published, paper topics, and the institutions to which authors belong. This information is very important for co-author prediction. For example, authors who belong to the same institution are more likely to collaborate in the future.
After the knowledge graph (Knowledge Graph), a multi-relation directed graph, was proposed by Google in 2012 and used in semantic search, knowledge graphs re-entered researchers' field of vision. Combining knowledge graphs, which contain a large amount of human prior knowledge, with deep learning has become one of the important ideas for further improving the effect of artificial intelligence, and knowledge graphs have achieved great success in multiple fields such as semantic analysis, question-answering systems, and recommendation systems. A knowledge graph is composed of a large number of triples in the form of (head entity, relation, tail entity), the entities being nodes and the relations being directed edges between the nodes. The partner prediction problem is expressed on the knowledge graph as binary classification of triples, i.e., judging whether a triple is true; in particular, the types of the head entity and the tail entity are fixed as authors, and the relation is the cooperation relationship.
A knowledge graph (KG), as a multi-relation graph, allows a powerful representation of an author node to be obtained through KG embedding learning. KG embedding refers to mapping the entities and relations in a knowledge graph into a continuous embedding space. Existing KG embedding learning models are mainly divided into models for static KGs, such as the translation-based model TransE, the convolutional-neural-network-based ConvE, and the graph-neural-network-based KBGAT, which do not consider time information; and models for temporal KGs, such as TAE, which uses chronological order as a constraint to improve the quality of the embedded representation, and HyTE, which associates each timestamp with a corresponding hyperplane. However, these models still consider each triple independently and cannot model the temporal interaction between entities. Temporal interaction means that, because the relations between the neighbor nodes and the central entity carry time attributes, the neighbors of each central entity interact with one another in time order.
Disclosure of Invention
The invention aims to overcome the defects of the prior art and provides a partner prediction method based on a temporal partner knowledge graph. Due to the complexity of the problem, a deep learning model with an encoder-decoder structure is proposed, wherein the encoder is divided into two parts: a semantic encoder for capturing entity semantic information, and an embedding encoder that models the neighborhood of a central entity through a temporal graph attention mechanism and aggregates the temporal-spatial neighborhood information; the decoder scores triples based on multi-scale convolution kernels. The method comprises the following specific steps:
the model training of the method of the invention comprises the following steps:
step (1): constructing a temporal partner knowledge graph: cleaning the collected academic relations, then creating a temporal partner knowledge graph which is composed of temporal triples in the form of head entities, relations and tail entities and has time attributes, and storing the temporal partner knowledge graph in a Neo4j graph database;
step (2): training a semantic encoder based on a pre-training language model to mine semantic information of an entity in a knowledge graph of a temporal partner as initialization representation of the semantic information;
Step (3): if in step (2) the training rounds of the semantic encoder have not reached a set value, or the training loss has not reached an early-stop condition, calculate, according to a temporal graph attention mechanism, the normalized attention coefficient of each temporal triple in the neighborhood of each entity with respect to the central entity, then weight and aggregate the neighborhood information according to these coefficients and apply a skip connection to update the entity embedding representation. This whole process (calculating triple embeddings, calculating normalized attention coefficients, and weighting and aggregating neighborhood information with a skip connection to update the entity embedding representation) is a single temporal graph attention layer. The embedded representations of the entities are enhanced by training an embedding encoder stacked with at least one temporal graph attention layer; one training round refers to completing the entity-embedding update once with the complete data and updating the parameters of the encoder;
Step (4): after the training rounds of the semantic encoder reach the set value or the training loss reaches the early-stop condition in step (2), given a triple, the head entity, the tail entity, and the relation of the triple form a triple embedding matrix, and a decoder based on the multi-scale convolution kernel strategy is trained to decode the embedding matrix to obtain the score of the triple as its confidence;
the model test of the method comprises the following steps:
Step (5): after the training rounds of the decoder reach a set value or the training loss reaches an early-stop condition in step (4), given two authors who are in the temporal partner knowledge graph but have never cooperated, the embedded representations of the two author entities and of the cooperation relation are obtained from the trained embedding encoder and input into the trained decoder; the decoder scores these entity and relation embeddings to obtain a confidence for the two author entities, and if the confidence is greater than a set threshold, it is predicted that cooperation will occur between the two authors. (In step (4), scoring all triples in the data set once and updating the parameters of the decoder constitutes one training round.)
In the invention, the partner prediction method based on the temporal partner knowledge graph is used for predicting future partners by using the temporal knowledge graph as an information carrier. The temporal partner knowledge graph is a multi-relation directed graph composed of triples in the form of (head entities, relations, tail entities) and with time attributes, wherein the entities are nodes in the knowledge graph, and the relations are directed edges between the entities in the knowledge graph. Future collaborators, i.e. two never collaborated authors, will co-publish the same paper in the future.
In the invention, the specific steps of the step (1) comprise:
step (1.1): the academic relationship is obtained by collecting public data of a thesis website, and refers to a set of multi-element relationships such as cooperation relationships among authors, conference journals published by the thesis and the like; the data cleaning is to remove authors with too few occurrences; and the filtering threshold value of the occurrence times can be freely selected, and an entity set and a relationship set are obtained by screening. And constructing a triple according to the multiple relations including the cooperative relation, the affiliated institution, the subject of the paper and the like in the academic relation, wherein the triple has a time attribute, and the time of creating the triple, such as the publication time of the paper and the like, is added to the triple.
Step (1.2): In order to enable each entity to obtain information from more distant neighbors (i.e., neighbors for which the shortest reachable path between the two entities is longer), an auxiliary directed edge is artificially constructed between each entity and its multi-hop neighbors, facilitating the flow of knowledge in the temporal partner knowledge graph.
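A minimal sketch of one way such auxiliary edges could be generated (the BFS construction and the 2-hop limit are assumptions for illustration, not the patent's exact procedure):

```python
from collections import deque

def auxiliary_edges(adj, max_hops=2):
    """For each node, add an auxiliary edge to every node whose
    shortest-path distance lies in [2, max_hops]."""
    extra = set()
    for src in adj:
        dist = {src: 0}
        q = deque([src])
        while q:  # breadth-first search out to max_hops
            u = q.popleft()
            if dist[u] == max_hops:
                continue
            for v in adj.get(u, ()):
                if v not in dist:
                    dist[v] = dist[u] + 1
                    q.append(v)
        extra.update((src, v) for v, d in dist.items() if d >= 2)
    return extra

# A1 - A2 - A6: A6 is a 2-hop neighbor of A1.
adj = {"A1": ["A2"], "A2": ["A1", "A6"], "A6": ["A2"]}
aux = auxiliary_edges(adj, max_hops=2)
# aux contains the auxiliary edges ("A1", "A6") and ("A6", "A1")
```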
In the invention, the semantic encoder in step (2) is a semantic encoder for capturing entity semantic information. The specific steps are as follows:
step (2.1): in the step (1), each entity e in the temporal partner knowledge graph is composed of a word sequence s, i.e. e = s = [ w ] 1 ,...w l ,...w L ]Wherein w is l Refers to the ith word in the word sequence and L refers to the total number of words in the word sequence. The word sequence s is fed into a pre-trained language model BERT to obtain word embedding for each word [ w 1 ,...w l ,...w L ]=BERT([w 1 ,...w l ,...w L ]) Wherein
Figure BDA0003093654140000031
Represents the word w l Word embedding of d BERT Representing the dimension of word embedding.
Step (2.2): for the word embedding sequence obtained by each entity e, an average strategy is adopted to convert the word embedding sequence into entity embedding, namely
Figure BDA0003093654140000032
Embedding e dimension reduction, i.e. e, into obtained entity using a full-connected layer pair init =FC s (e) In which FC s Denotes a fully connected layer, e init I.e. the initialization embedding of the entity.
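Steps (2.1)-(2.2) can be sketched in plain Python; the BERT call is replaced by a deterministic stub and the fully connected layer FC_s by a random weight matrix, both of which are assumptions for illustration only:

```python
import random

random.seed(0)
D_BERT, D = 4, 2  # toy dimensions; in practice d_BERT = 768 for BERT-base

def fake_bert(words):
    """Stand-in for BERT: deterministic pseudo-random word embeddings."""
    embs = []
    for w in words:
        rng = random.Random(w)  # seed by the word for determinism
        embs.append([rng.uniform(-1, 1) for _ in range(D_BERT)])
    return embs

# hypothetical FC_s weight matrix, shape D x D_BERT
W_fc = [[random.uniform(-1, 1) for _ in range(D_BERT)] for _ in range(D)]

def encode_entity(name):
    words = name.split()
    word_embs = fake_bert(words)                            # step (2.1)
    e = [sum(col) / len(words) for col in zip(*word_embs)]  # average, step (2.2)
    return [sum(w * x for w, x in zip(row, e)) for row in W_fc]  # e_init = FC_s(e)

e_init = encode_entity("East China Normal University")
# e_init is the D-dimensional initialization embedding of the entity
```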
In the invention, the embedding encoder in step (3) refers to an embedding encoder that models the neighborhood of the central entity through a temporal graph attention mechanism, performs temporal interaction, and aggregates the temporal-spatial neighborhood information. The specific steps are as follows:
step (3.1): the normalized attention coefficients of the neighbors of each central entity t are calculated by the temporal graph attention layer. For each central entity t, representing a triplet with the central entity t as the tail entity as the set V = [ V = 1 ,...,v y ,...,v Y ]Wherein v is y =[h i ,r j ,t]And has a time attribute time a Wherein h is i Represents the head entity, r j Representing a relationship and t representing a tail entity, time a Representing the time attribute, Y represents the number of triples in the set V, and theseTriple time by time attribute a And (5) sorting in an ascending order. For each triple v y Applying a linear transformation
Figure BDA0003093654140000033
Where | | | represents splicing operation, W 1 In order to linearly transform the weight matrix,
Figure BDA0003093654140000034
and e t Respectively represent head entity h i Relation r of j And an embedded representation of the tail entity t, v y I.e. the triplet v y The corresponding embedded representation. The resulting embedded set is denoted as V = [ V ] 1 ,...,v y ,...,v Y ]。
The normalized attention coefficient of the triple formed by each neighbor is computed by a bidirectional LSTM network (Bi-LSTM). The Bi-LSTM comprises two LSTM networks, a forward layer and a backward layer. With $\overrightarrow{\mathbf{h}}_{y-1}$ and $\overleftarrow{\mathbf{h}}_{y+1}$ denoting the hidden states of the forward layer and backward layer, the hidden states at time step y are calculated as

$$\overrightarrow{\mathbf{h}}_y = \overrightarrow{\mathrm{LSTM}}(\mathbf{v}_y, \overrightarrow{\mathbf{h}}_{y-1}), \qquad \overleftarrow{\mathbf{h}}_y = \overleftarrow{\mathrm{LSTM}}(\mathbf{v}_y, \overleftarrow{\mathbf{h}}_{y+1}),$$

where $\mathbf{v}_y$ is the embedded representation of the triple. The hidden state of the Bi-LSTM at time step y is the concatenation $\mathbf{h}_y = [\overrightarrow{\mathbf{h}}_y \,\Vert\, \overleftarrow{\mathbf{h}}_y]$. A fully connected layer $\mathrm{FC}_a(\cdot)$ then yields the attention coefficient of the triple $v_y$,

$$c_y = \mathrm{FC}_a(\mathbf{h}_y) = \sigma(W_2\,\mathbf{h}_y + b),$$

where the weight matrix $W_2$ and the bias b are trainable parameters and σ is the LeakyReLU activation function. The normalized attention coefficient is calculated by the softmax function,

$$\alpha_y = \mathrm{softmax}(c_y) = \frac{\exp(c_y)}{\sum_{y'=1}^{Y}\exp(c_{y'})}.$$
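The softmax normalization of the attention coefficients can be sketched as follows (the coefficients c_y below are made-up numbers; the Bi-LSTM that would produce them is omitted):

```python
import math

def softmax(scores):
    m = max(scores)                       # subtract max for numerical stability
    exps = [math.exp(c - m) for c in scores]
    z = sum(exps)
    return [e / z for e in exps]

c = [0.5, 1.2, -0.3]         # hypothetical attention coefficients c_y
alpha = softmax(c)           # normalized coefficients alpha_y, which sum to 1
```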
step (3.2): the weighted aggregate triples update the central entity representation. The central entity t obtains a new embedded representation by weighting the triplet representations in the aggregate set V,
Figure BDA00030936541400000410
σ is the activation function. Through the training process of the multi-head attention mechanism stable attention mechanism,
Figure BDA00030936541400000411
wherein
Figure BDA00030936541400000412
And
Figure BDA00030936541400000413
calculated from the kth independent attention head, K represents the number of attention heads. If the current temporal attention layer is the last layer, the physical embedding of the multi-headed attention mechanism output will not be stitched but averaged,
Figure BDA00030936541400000414
In order to prevent gradient vanishing and the loss of entity semantic information as the network deepens, a skip connection is adopted in each temporal graph attention layer, expressed as

$$\mathbf{e}''_t = W_3\,\mathbf{e}_{\mathrm{init}} + \mathbf{e}'_t,$$

where $W_3$ is a trainable weight matrix, $\mathbf{e}''_t$ is the entity embedding output by the graph attention layer, $\mathbf{e}_{\mathrm{init}}$ is the initialization embedding of the entity output by the semantic encoder in step (2), and $\mathbf{e}'_t$ is the output of the multi-head attention mechanism.
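A single-head sketch of the weighted aggregation and skip connection (the toy dimensions, LeakyReLU as σ, and the identity-like W_3 are illustrative assumptions):

```python
def leaky_relu(x, slope=0.01):
    return x if x > 0 else slope * x

def aggregate(triple_embs, alphas, e_init, w3):
    """e'_t = sigma(sum_y alpha_y v_y), then e''_t = W3 e_init + e'_t."""
    d = len(triple_embs[0])
    e_prime = [leaky_relu(sum(a * v[i] for a, v in zip(alphas, triple_embs)))
               for i in range(d)]
    # skip connection with the semantic initialization embedding
    return [sum(w3[i][j] * e_init[j] for j in range(d)) + e_prime[i]
            for i in range(d)]

V = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]   # embedded triples v_y
alphas = [0.2, 0.3, 0.5]                    # normalized attention coefficients
e_init = [0.1, 0.1]                         # initialization embedding of t
W3 = [[1.0, 0.0], [0.0, 1.0]]               # identity, for illustration only
e_new = aggregate(V, alphas, e_init, W3)    # updated embedding e''_t
```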
Step (3.3): The embedded representation of the relation is updated. A new embedded representation of the relation is learned by a linear transformation, $\mathbf{r}' = W_4\,\mathbf{r}$, where $\mathbf{r}'$ and $\mathbf{e}''_t$ have the same dimension, $W_4$ is a linear transformation weight matrix, $\mathbf{r}$ is the embedded representation before the relation update, and $\mathbf{e}''_t$ is the entity embedding output by the attention layer.
Step (3.4): The encoder is trained. The semantic encoder and the embedding encoder are trained jointly with the loss function

$$L_{\mathrm{Encoder}} = \sum_{(h,r,t)\in\Delta}\;\sum_{(h',r,t')\in\Delta'}\big[d_{(h,r,t)} - d_{(h',r,t')} + \gamma\big]_+,$$

where Δ denotes the set of positive samples, Δ' the set of negative samples, γ a safety margin distance, and $[x]_+ = \max(x, 0)$. The distance is $d_{(h,r,t)} = \Vert\mathbf{e}_h + \mathbf{e}_r - \mathbf{e}_t\Vert_{l_1}$, where $\mathbf{e}_h$, $\mathbf{e}_r$, $\mathbf{e}_t$ respectively denote the embedded representations of h, r, t output by the embedding encoder and $l_1$ denotes the $l_1$ norm.
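A hedged sketch of this margin ranking loss with toy embeddings, assuming a TransE-style l1 distance d(h,r,t) = ||e_h + e_r − e_t||_1:

```python
def l1_distance(e_h, e_r, e_t):
    """d_(h,r,t) = || e_h + e_r - e_t ||_1"""
    return sum(abs(h + r - t) for h, r, t in zip(e_h, e_r, e_t))

def encoder_loss(pos_triples, neg_triples, gamma=1.0):
    """Sum over positive/negative pairs of [d_pos - d_neg + gamma]_+."""
    loss = 0.0
    for eh, er, et in pos_triples:
        d_pos = l1_distance(eh, er, et)
        for eh2, er2, et2 in neg_triples:
            loss += max(d_pos - l1_distance(eh2, er2, et2) + gamma, 0.0)
    return loss

pos = [([0.0, 0.0], [1.0, 1.0], [1.0, 1.0])]   # d = 0: a well-placed triple
neg = [([0.0, 0.0], [1.0, 1.0], [3.0, 3.0])]   # d = 4: a corrupted triple
loss = encoder_loss(pos, neg, gamma=1.0)       # max(0 - 4 + 1, 0) = 0.0
```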
Step (3.5): and calculating a loss function value of the encoder and updating the parameters of the encoder.
In the present invention, the decoder in step (4) is a decoder based on multi-scale convolution kernels for triple scoring. The specific steps are as follows:
step (4.1): the convolution is performed using a set of multi-scale convolution kernels.
Given a triple (h, r, t), let $\mathbf{e}_h$, $\mathbf{e}_r$, $\mathbf{e}_t$ be the embedded representations of h, r, t output by the embedding encoder. The head entity, relation, and tail entity composing the triple are represented as an embedding matrix,

$$A = [\mathbf{e}_h;\, \mathbf{e}_r;\, \mathbf{e}_t] \in \mathbb{R}^{3\times d},$$

where d is the dimension of the entity embeddings and relation embeddings. Convolution kernels $\omega_1, \omega_2, \omega_3$ of three different sizes are convolved with the triple embedding matrix to obtain the feature matrices $\tau_1, \tau_2, \tau_3$, where $\tau_i = \omega_i * A$ is the feature matrix generated by convolution kernel $\omega_i$ and $c_i$ is the dimension of $\tau_i$ after flattening. The generated feature matrices are spliced,

$$\mathbf{p} = [\tau_1 \,\Vert\, \tau_2 \,\Vert\, \tau_3],$$

where $c = c_1 + c_2 + c_3$ is the dimension of $\mathbf{p}$.
Step (4.2): Multiple (at least one) sets of multi-scale convolution kernels are used to stabilize the feature extraction of the convolution process. The M sets of multi-scale convolution kernels are denoted $\Omega' = [\Omega_1, \ldots, \Omega_M]$, and the M output feature vectors are spliced,

$$P = [\mathbf{p}_1 \,\Vert\, \cdots \,\Vert\, \mathbf{p}_M],$$

where Mc is the dimension of the spliced vector P.
Step (4.3): The triple is scored according to P by dot multiplication with a weight vector $\mathbf{w}_d$, $f(h, r, t) = P \cdot \mathbf{w}_d$. The scoring function of the final decoder is

$$f(h,r,t) = \Big(\big\Vert_{m=1}^{M}\,\sigma(\Omega_m * A + b)\Big)\cdot \mathbf{w}_d,$$

where $*$ denotes the convolution operation, b the bias, $\Omega_m$ the m-th set of multi-scale convolution kernels, and A the embedding matrix of the triple (h, r, t).
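A plain-Python sketch of the multi-scale convolution and dot-product scoring (the kernel shapes, weight values, and the toy dimension d = 3 are all illustrative assumptions, not the patent's trained parameters):

```python
def conv_valid(A, kernel):
    """Valid 2-D convolution of a kernel over matrix A, flattened."""
    kr, kc = len(kernel), len(kernel[0])
    rows, cols = len(A), len(A[0])
    out = []
    for i in range(rows - kr + 1):
        for j in range(cols - kc + 1):
            out.append(sum(kernel[a][b] * A[i + a][j + b]
                           for a in range(kr) for b in range(kc)))
    return out

def score(A, kernels, w_d):
    """f(h,r,t): concatenate the features of all kernels, dot with w_d."""
    p = [x for k in kernels for x in conv_valid(A, k)]
    assert len(p) == len(w_d)
    return sum(x * w for x, w in zip(p, w_d))

# embedding matrix A = [e_h; e_r; e_t] with d = 3
A = [[1.0, 0.0, 1.0],   # e_h
     [0.0, 1.0, 0.0],   # e_r
     [1.0, 1.0, 1.0]]   # e_t
kernels = [[[1.0], [1.0], [1.0]],   # a 3x1 kernel -> 3 features
           [[1.0, -1.0]] * 3]      # a 3x2 kernel -> 2 features
w_d = [0.1, 0.2, 0.3, 0.4, 0.5]     # hypothetical weight vector, dim 5
s = score(A, kernels, w_d)          # confidence score of the triple
```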
Step (4.4): The decoder is trained. The loss function of the decoder training is

$$L_{\mathrm{Decoder}} = \sum_{(h,r,t)\in\Delta\cup\Delta'} \log\big(1+\exp\big(-y_{(h,r,t)}\, f(h,r,t)\big)\big) + \frac{\lambda}{2}\big\Vert\mathbf{w}_d\big\Vert_2^2,$$

where $y_{(h,r,t)} = 1$ for positive samples and $y_{(h,r,t)} = -1$ for negative samples, the second term is the $l_2$ regularization of the weight vector $\mathbf{w}_d$, Δ denotes the positive sample set, and Δ' the negative sample set.
In the invention, the step (5) of the model test comprises the following specific steps:
step (5.1): two author entities which never cooperate are selected, corresponding embedding and embedding of cooperation relation are obtained according to a trained embedding encoder, an embedding matrix is formed and input into a trained decoder, and scores of the embedding matrix are obtained. And judging whether the result is true according to a set threshold value.
Based on the method, the invention also provides a partner prediction system based on the temporal partner knowledge graph, which comprises the following steps: a memory and a processor; the memory has stored thereon a computer program which, when executed by the processor, implements the prediction method described above.
Compared with the prior art, the invention has the following beneficial effects: a partner prediction method based on a temporal partner knowledge graph is provided, and the prediction performance can be greatly improved by considering time information and effective semantic capture; by modeling the temporal interaction between the neighbors of the central entity with the temporal graph attention mechanism and aggregating multi-hop temporal-spatial neighborhood information, the information utilization rate can be maximized, and the method can be applied very simply and conveniently to other types of knowledge graphs.
The method first takes the multivariate relations into account, then considers and models the semantic information and the time information, and finally can be conveniently applied to other types of temporal knowledge graphs.
Drawings
FIG. 1 is a flow chart of a temporal partner knowledge graph-based partner prediction method of the present invention.
FIG. 2 is an example of a partner knowledge graph provided by embodiments of the present invention.
FIG. 3 is a model diagram of temporal attention mechanism in the temporal partner knowledge graph-based partner prediction method of the present invention under the embodiment of FIG. 2.
FIG. 4 is a model diagram of an encoder in the temporal partner knowledge-graph based partner prediction method of the present invention under the embodiment of FIG. 2.
FIG. 5 is a model diagram of a decoder in the temporal partner-knowledge-graph-based partner prediction method of the present invention under the embodiment of FIG. 2.
FIG. 6 shows the results of the invention operating on a real-world temporal partner knowledge graph.
Detailed Description
The invention is described in further detail in connection with the following specific examples and the accompanying drawings so that those skilled in the art can better understand the invention. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.
As shown in fig. 1, the partner prediction method based on temporal partner knowledge graph provided by the present invention includes the following steps:
(1) Constructing a knowledge graph of the temporal collaborators, cleaning the collected academic relation, then creating a knowledge graph of the temporal collaborators, which consists of triples with time attributes in the form of head entities, relations and tail entities, and storing the knowledge graph in a Neo4j graph database;
(2) Training a semantic encoder based on a pre-training language model to mine semantic information of an entity in a knowledge graph of a temporal partner as initialization representation of the semantic information;
(3) Calculating a normalized attention coefficient of each temporal triple in the neighborhood of each entity relative to a central entity according to a temporal graph attention mechanism, weighting and aggregating neighborhood information according to the coefficient to update entity embedded representation, wherein the process is a single temporal graph attention layer, and the embedded representation of the entities is enhanced by training an embedded encoder stacked with a plurality of temporal graph attention layers;
(4) Giving a triple, forming a triple embedding matrix by a head entity and a tail entity and a relation of the triple, and training a decoder based on a multi-scale convolution kernel strategy to decode on the embedding matrix to obtain a fraction of the triple as a confidence coefficient;
(5) Given two authors who are already in the temporal partner knowledge graph but never collaborated, their embedded representations are obtained from the trained embedded encoder and input to the trained decoder for their confidence level, and if greater than a set threshold, they are predicted to collaborate with each other.
Examples
Referring to FIG. 2, assume FIG. 2 is a temporal partner knowledge graph with 6 author entities A_1, ..., A_6, 2 paper entities P_1, P_2, an organization entity C_1 to which authors belong, and a conference entity K_1, where the solid edges are true relationships and the dashed edges are artificially added auxiliary edges. Take A_1 as the central entity: A_1's relationship with A_4 is closer than its past relationship with A_5, because A_1 and A_4 have a more frequent and more recent cooperation relationship. With A_1 as the central entity, the temporal interaction of A_1 and A_4 affects the relationship between A_1 and A_5, while the cooperation of A_1 and A_3 also promotes the cooperation of A_1 and A_4. Thus there is temporal interaction among A_1's neighbors, and the generation of new partnerships between A_1 and other authors also depends on the neighborhood information of A_1 and those authors. The partner prediction method based on the temporal partner knowledge graph can model A_1's temporal interactions well through the temporal graph attention mechanism and can effectively aggregate A_1's neighborhood information through the embedding encoder.
The specific steps of the step (1) comprise:
Step (1.1): According to the academic relations obtained from public data of paper websites, the data is cleaned by removing authors with too few occurrences (the filtering threshold can be chosen freely), and an entity set and a relation set are obtained by screening. Triples are constructed from the multiple relations contained in the academic relations, such as the cooperation relation (Co-Author), the affiliated organization (research of), and the conference where a paper is published (publication paper on), and the time at which each triple was created, such as the publication time of the paper, is attached to the triple.
Step (1.2): An auxiliary edge is artificially constructed between the central entity A_1 and its multi-hop neighbor A_6.
As shown in fig. 3, the 7 triples with the central entity A_1 as tail entity are sorted by time attribute and fed into the bidirectional LSTM network; at each step the hidden state of the bidirectional LSTM is fed into a fully connected layer to obtain the attention coefficient of each triple, and the normalized attention coefficients corresponding to the 7 triples are obtained through the softmax function.
As shown in fig. 4, the triples with entity A_1 as tail entity are fed into the encoder. First, each triple is sent to the embedding encoder for embedding initialization; for ease of model visualization the dimension is assumed to be 2 (in practice it can be set to 50, etc.). Then the initialized embeddings of the head entity, tail entity, and relation of each triple are spliced and fed into 2 different attention heads (in practice 8 or another value can be set) to compute a new embedding of the central entity A_1; the outputs are spliced, and a skip connection with A_1's initialization embedding is applied to obtain A_1's new representation, while the relation embedding is learned by a linear transformation to the same dimension as A_1's representation. This constitutes the first temporal graph attention layer. Then the computation of the second temporal graph attention layer is carried out to obtain a further new representation of A_1. Finally, the loss is calculated from the new representations of the head and tail entities and relations of the triples to train the encoder.
The specific steps of the step (3) comprise:
and (3.1) calculating the normalized attention coefficient of the neighbors of each central entity through the temporal graph attention layer. For each central entity t, the triples with the central entity t as the tail entity are represented as the set V = [v_1, ..., v_y, ..., v_Y], where v_y = [h_i, r_j, t] carries a time attribute time_a; h_i denotes the head entity, r_j the relation, t the tail entity, time_a the time attribute, and Y the number of triples in the set V. These triples are sorted in ascending order by the attribute time_a. A linear transformation is applied to each triple v_y,

v_y = W_1 [ e_{h_i} \| e_{r_j} \| e_t ],

where \| denotes the splicing (concatenation) operation, W_1 is a linear transformation weight matrix, and e_{h_i}, e_{r_j} and e_t denote the embedded representations of the head entity h_i, the relation r_j and the tail entity t, respectively; v_y is thus the embedding corresponding to the triple v_y. The resulting embedding set is denoted V = [v_1, ..., v_y, ..., v_Y].

The normalized attention coefficient of the triple formed by each neighbor is computed by the bidirectional LSTM network Bi-LSTM, which comprises two LSTM networks, a forward layer and a backward layer. Specifically, let \overrightarrow{h}_{y-1} and \overleftarrow{h}_{y-1} denote the hidden states of the forward layer and the backward layer at time step y-1; the hidden states of the forward layer and the backward layer at time step y are calculated as

\overrightarrow{h}_y = LSTM_f(v_y, \overrightarrow{h}_{y-1}) and \overleftarrow{h}_y = LSTM_b(v_y, \overleftarrow{h}_{y-1}).

The hidden state of the Bi-LSTM at time step y is then

h_y = [ \overrightarrow{h}_y \| \overleftarrow{h}_y ].

The attention coefficient of the triple v_y is obtained through a fully connected layer,

c_y = FC_a(h_y) = \sigma(W_2 h_y + b),

where the weight matrix W_2 and the bias b are trainable parameters and \sigma is the activation function LeakyReLU. The normalized attention coefficient is calculated by the softmax function,

\alpha_y = softmax(c_y) = exp(c_y) / \sum_{y'=1}^{Y} exp(c_{y'}).
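As a concrete illustration of step (3.1), the following minimal sketch computes normalized attention coefficients for a time-sorted neighborhood with a small hand-rolled Bi-LSTM. All dimensions, the random weights and the bare-bones LSTM cell are assumptions for demonstration only, not the trained parameters of the invention:

```python
import numpy as np

rng = np.random.default_rng(0)

def lstm_step(x, h, c, W, U, b):
    """One LSTM step; gates stacked as [input, forget, cell, output]."""
    z = W @ x + U @ h + b
    d = h.shape[0]
    i = 1 / (1 + np.exp(-z[:d]))          # input gate
    f = 1 / (1 + np.exp(-z[d:2*d]))       # forget gate
    g = np.tanh(z[2*d:3*d])               # candidate cell state
    o = 1 / (1 + np.exp(-z[3*d:]))        # output gate
    c_new = f * c + i * g
    return o * np.tanh(c_new), c_new

def leaky_relu(x, slope=0.01):
    return np.where(x > 0, x, slope * x)

d, hid, Y = 4, 3, 7                        # embedding dim, LSTM hidden dim, |V|
W1 = rng.normal(size=(d, 3 * d))           # linear transform over [e_h || e_r || e_t]
triples = [rng.normal(size=3 * d) for _ in range(Y)]   # time-sorted neighborhood
V = [W1 @ t for t in triples]              # v_y = W1 [e_h || e_r || e_t]

# forward and backward LSTM parameters (same shapes, independent weights)
params = [(rng.normal(size=(4 * hid, d)), rng.normal(size=(4 * hid, hid)),
           np.zeros(4 * hid)) for _ in range(2)]
hf, cf = np.zeros(hid), np.zeros(hid)
fwd = []
for v in V:                                # forward layer: ascending time order
    hf, cf = lstm_step(v, hf, cf, *params[0])
    fwd.append(hf)
hb, cb = np.zeros(hid), np.zeros(hid)
bwd = []
for v in reversed(V):                      # backward layer: descending time order
    hb, cb = lstm_step(v, hb, cb, *params[1])
    bwd.append(hb)
bwd = bwd[::-1]

W2 = rng.normal(size=2 * hid)              # fully connected scoring layer
b = 0.0
scores = np.array([leaky_relu(W2 @ np.concatenate([f_, b_]) + b)
                   for f_, b_ in zip(fwd, bwd)])
alpha = np.exp(scores) / np.exp(scores).sum()   # softmax normalization
```

The resulting `alpha` is a distribution over the Y = 7 neighboring triples, matching the softmax normalization above.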
and (3.2) updating the central entity representation by weighted aggregation of triples. The central entity t obtains a new embedded representation by weighting the triple representations in the aggregated set V,

e'_t = \sigma( \sum_{y=1}^{Y} \alpha_y v_y ).

The training process of the attention mechanism is stabilized through a multi-head attention mechanism,

e'_t = \|_{k=1}^{K} \sigma( \sum_{y=1}^{Y} \alpha_y^k v_y^k ),

where \alpha_y^k and v_y^k are calculated by the k-th independent attention head and K denotes the number of attention heads. If the current temporal graph attention layer is the last layer, the entity embeddings output by the multi-head attention mechanism are not spliced but averaged,

e'_t = \sigma( (1/K) \sum_{k=1}^{K} \sum_{y=1}^{Y} \alpha_y^k v_y^k ).

In order to prevent gradient vanishing and the loss of entity semantic information caused by deepening the network, a skip connection is adopted in each temporal graph attention layer, expressed as

e_t^f = W_3 e_{init} + e'_t,

where W_3 is a trainable weight matrix, e_t^f is the entity embedding output by the graph attention layer, e_{init} is the initialized embedding of the entity output by the semantic encoder in step (2), and e'_t is the output of the multi-head attention mechanism.
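Step (3.2) — multi-head weighted aggregation with splicing on hidden layers, averaging on the last layer, and the skip connection e_t^f = W_3 e_init + e'_t — can be sketched as follows. The toy dimensions and random weights are assumptions, and tanh stands in for the unspecified activation σ:

```python
import numpy as np

rng = np.random.default_rng(1)
d, Y, K = 4, 7, 2                  # embedding dim, neighbors, attention heads

def aggregate(alpha, V, last_layer=False):
    """Weighted aggregation over K heads: splice on hidden layers, average on the last."""
    if last_layer:
        # average the per-head weighted sums, then apply the activation
        return np.tanh(np.mean([V[k].T @ alpha[k] for k in range(K)], axis=0))
    # sigma(sum_y alpha_y^k v_y^k) per head, then splice
    heads = [np.tanh(V[k].T @ alpha[k]) for k in range(K)]
    return np.concatenate(heads)

V = rng.normal(size=(K, Y, d))               # per-head triple embeddings v_y^k
raw = rng.normal(size=(K, Y))
alpha = np.exp(raw) / np.exp(raw).sum(axis=1, keepdims=True)  # per-head softmax

e_prime = aggregate(alpha, V)                  # hidden layer: shape (K*d,)
e_last = aggregate(alpha, V, last_layer=True)  # last layer: shape (d,)

# skip connection: e_t^f = W3 e_init + e'_t
e_init = rng.normal(size=d)
W3 = rng.normal(size=(d, d))
e_f = W3 @ e_init + e_last
```

Note that splicing multiplies the output dimension by K, which is why the last layer averages instead: the final entity embedding must keep the dimension d.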
Step (3.3) updates the embedded representation of the relation. A new embedded representation of the relation is learned by a linear transformation, r' = W_4 r, where r' and e_t^f have the same dimension, W_4 is a linear transformation weight matrix, and r is the embedded representation before the relation update.

And (3.4) training the loss function of the encoder. The semantic encoder and the embedding encoder are jointly trained with the loss function

L_{Encoder} = \sum_{(h,r,t) \in \Delta} \sum_{(h',r,t') \in \Delta'} [ d_{(h,r,t)} - d_{(h',r,t')} + \gamma ]_+,

where \Delta denotes the positive sample set, \Delta' the negative sample set, and \gamma the safety margin distance, with the distance

d_{(h,r,t)} = \| e_h + e_r - e_t \|_1,

where e_h, e_r, e_t denote the embedded representations of h, r, t output by the embedding encoder and \| \cdot \|_1 denotes the l_1 norm (l_1 regularization).
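The joint margin loss of step (3.4) can be sketched as below, assuming the translation-style distance d_(h,r,t) = ||e_h + e_r - e_t||_1 described above; random toy embeddings stand in for encoder outputs:

```python
import numpy as np

def l1_distance(e_h, e_r, e_t):
    """d_(h,r,t) = || e_h + e_r - e_t ||_1 (translation-based distance)."""
    return np.abs(e_h + e_r - e_t).sum()

def encoder_loss(pos, neg, gamma=1.0):
    """Margin ranking loss: sum over positive/negative pairs of [d_pos - d_neg + gamma]_+."""
    total = 0.0
    for (h, r, t) in pos:
        for (h2, r2, t2) in neg:
            total += max(l1_distance(h, r, t) - l1_distance(h2, r2, t2) + gamma, 0.0)
    return total

rng = np.random.default_rng(2)
pos = [tuple(rng.normal(size=4) for _ in range(3)) for _ in range(2)]  # Delta
neg = [tuple(rng.normal(size=4) for _ in range(3)) for _ in range(2)]  # Delta'
loss = encoder_loss(pos, neg)
```

Minimizing this loss pushes positive triples at least γ closer (in l_1 distance) than negative ones.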
Example: as shown in FIG. 5, assume that the current triple is (A_5, Co-Author, A_1, 2013) and that the embedding dimension of the entities and relations output by the embedding encoder is 7. Assuming there is only one set of multi-scale convolution kernels, the triple embedding matrix passes through three convolution kernels of different sizes, \omega_1, \omega_2, \omega_3, to obtain three feature maps

\tau_i = ReLU(\omega_i * A + b_i), i = 1, 2, 3,

where ReLU is the activation function and b_1, b_2, b_3 are different biases. The feature maps are spliced into a vector p, which is dot-multiplied with the weight matrix w_d to obtain the score, i.e. the confidence, of the triple.
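The FIG. 5 example can be sketched as follows. Only the embedding dimension 7 and the use of three differently sized kernels are fixed above, so the concrete kernel shapes (3x1, 3x2, 3x3) and the random weights are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(3)
d = 7
A = rng.normal(size=(3, d))        # embedding matrix [e_h; e_r; e_t]

def conv_valid(A, w, b):
    """Valid 2D cross-correlation of A with kernel w plus bias b, then ReLU."""
    kh, kw = w.shape
    H, W = A.shape
    out = np.empty((H - kh + 1, W - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = (A[i:i+kh, j:j+kw] * w).sum() + b
    return np.maximum(out, 0.0)    # ReLU

# one group of multi-scale kernels; the 3xk sizes are assumed for illustration
kernels = [rng.normal(size=(3, k)) for k in (1, 2, 3)]
biases = [0.1, 0.2, 0.3]
taus = [conv_valid(A, w, b).ravel() for w, b in zip(kernels, biases)]
p = np.concatenate(taus)           # p = [tau_1 || tau_2 || tau_3], c = c_1+c_2+c_3

w_d = rng.normal(size=p.size)
score = float(p @ w_d)             # triple score, i.e. confidence
```

With these kernel shapes the feature maps have 7, 6 and 5 entries, so c = 18.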
In the invention, the specific steps of the step (4) comprise:

and (4.1) carrying out convolution with a multi-scale convolution kernel set. Given a triple (h, r, t), let e_h, e_r, e_t denote the embedded representations of h, r, t output by the embedding encoder. The triple is represented as an embedding matrix

A = [ e_h ; e_r ; e_t ] \in R^{3 \times d},

where d denotes the dimension of the entity and relation embeddings. Convolution kernels of three different sizes, \omega_1, \omega_2, \omega_3, are used to convolve the triple embedding matrix to obtain the feature matrices \tau_1, \tau_2, \tau_3, where

\tau_i = ReLU(\omega_i * A + b_i)

denotes the feature matrix generated by the convolution kernel \omega_i and c_i denotes the dimension of the feature matrix \tau_i. The generated feature matrices are spliced,

p = [ \tau_1 \| \tau_2 \| \tau_3 ] \in R^c, c = c_1 + c_2 + c_3,

where c is the dimension of p.

Step (4.2) convolves with multiple (at least one) sets of multi-scale convolution kernels to stabilize the process of feature extraction. The multiple sets of multi-scale convolution kernels are denoted as \Omega' = [\Omega_1, ..., \Omega_M], and the M output feature vectors are spliced,

P = [ p_1 \| ... \| p_M ] \in R^{Mc},

where Mc is the dimension of the spliced vector P.

And (4.3) scoring the triple according to P. A dot product with the weight matrix w_d is performed, f(h, r, t) = P \cdot w_d. The scoring function of the final decoder is

f(h, r, t) = ( \|_{m=1}^{M} ReLU(\Omega_m * A + b) ) \cdot w_d,

where * denotes the convolution operation, b denotes the bias, \Omega_m is the m-th multi-scale convolution kernel set, and A is the embedding matrix of the triple (h, r, t).
And (4.4) training the decoder. The loss function of the decoder training is

L_{Decoder} = \sum_{(h,r,t) \in \Delta \cup \Delta'} log( 1 + exp( l_{(h,r,t)} \cdot f(h,r,t) ) ) + (\lambda/2) \| w_d \|_2^2,

where l_{(h,r,t)} = -1 for (h,r,t) \in \Delta and l_{(h,r,t)} = 1 for (h,r,t) \in \Delta', \| w_d \|_2^2 is the l_2 regularization of the weight vector w_d, \Delta denotes the positive sample set, and \Delta' the negative sample set.
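A minimal sketch of the decoder loss of step (4.4), assuming a soft-margin form with labels l = -1 for positive and l = +1 for negative triples plus l_2 regularization of w_d; λ and the toy scores are illustrative:

```python
import numpy as np

def decoder_loss(scores_pos, scores_neg, w_d, lam=0.01):
    """Soft-margin loss: log(1 + exp(l * f)) with l = -1 for positives and
    l = +1 for negatives, plus (lam/2) * ||w_d||_2^2 regularization."""
    pos = np.log1p(np.exp(-np.asarray(scores_pos, dtype=float)))  # l = -1
    neg = np.log1p(np.exp(np.asarray(scores_neg, dtype=float)))   # l = +1
    return float(pos.sum() + neg.sum() + 0.5 * lam * (w_d ** 2).sum())

rng = np.random.default_rng(4)
w_d = rng.normal(size=18)
loss = decoder_loss([2.0, 1.5], [-1.0, -0.5], w_d)
```

High scores for positive triples and low scores for negative triples drive the loss toward its regularization floor.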
As shown in fig. 6, the method achieves excellent results on a real-world academic relationship network.
The protection of the present invention is not limited to the above embodiments. Variations and advantages that may occur to those skilled in the art are intended to be included within the invention without departing from the spirit and scope of the inventive concept, and the scope of the invention is to be determined by the appended claims.

Claims (8)

1. A partner prediction method based on a temporal partner knowledge graph is characterized by comprising the following steps:
step (1): constructing a temporal partner knowledge graph, performing data cleaning on the acquired academic relation, then creating a temporal partner knowledge graph consisting of temporal triples with time attributes in the form of head entities, relations and tail entities, and storing the temporal partner knowledge graph in a Neo4j graph database;
step (2): training a semantic encoder based on a pre-training language model to mine semantic information of an entity in a temporal partner knowledge graph as an initialization representation of the semantic information;
step (3): if the training rounds of the semantic encoder in step (2) have not reached the set value or the training loss has not met the early-stopping condition, calculating the normalized attention coefficient, relative to the central entity, of the temporal triples in the neighborhood of each entity according to a temporal graph attention mechanism, and weighting and aggregating the neighborhood information according to these coefficients to update the entity embedded representation; the whole process of calculating the triple embeddings, calculating the normalized attention coefficients, weighting and aggregating the neighborhood information, and adding a skip connection to update the entity embedded representation constitutes a single temporal graph attention layer, and the embedded representation of the entities is enhanced by training an embedding encoder stacked with at least one temporal graph attention layer;

step (4): after the training rounds of the semantic encoder in step (2) reach the set value or the training loss meets the early-stopping condition, given a triple, forming a triple embedding matrix from its head entity, tail entity and relation, and training a decoder based on the multi-scale convolution kernel strategy to decode the embedding matrix to obtain the score of the triple as its confidence;

step (5): after the training rounds of the decoder in step (4) reach the set value or the training loss meets the early-stopping condition, given two authors who are already in the temporal partner knowledge graph but have never cooperated, obtaining their embedded representations from the trained embedding encoder and inputting them into the trained decoder to obtain a confidence; if the confidence is larger than a set threshold value, cooperation between the two authors is predicted.
2. The temporal partner knowledge graph-based partner prediction method of claim 1, wherein a temporal partner knowledge graph is used as the information carrier for future partner prediction, the temporal partner knowledge graph being a multi-relational directed graph composed of triples in the form of head entity, relation, tail entity with time attributes, the entities being nodes in the temporal partner knowledge graph and the relations being directed edges between the entities in the temporal partner knowledge graph; future partners are two authors who have never cooperated but will co-publish the same paper in the future.
3. The temporal partner knowledge graph-based partner prediction method according to claim 1, wherein the specific steps of step (1) comprise:
step (1.1): the academic relations refer to a set of multivariate relations, including the cooperation relations among authors and the conferences and journals in which papers are published, obtained from the public data of paper websites; data cleaning refers to removing authors with too few occurrences and screening to obtain the entity set and the relation set, where the filtering threshold on the number of occurrences can be freely chosen; triples are constructed from the multivariate relations contained in the academic relations, such as cooperation relations, affiliated institutions and the subject of the paper, and the time at which each triple was created, such as the publication time of the paper, is attached to the triple;
step (1.2): in order to enable each entity to obtain information from more distant neighbors, auxiliary directed edges are artificially constructed between each entity and its multi-hop neighbors, so that knowledge flows in the temporal partner knowledge graph; "more distant" means that the length of the shortest reachable path between the two entities is greater than one hop.
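The auxiliary-edge construction of step (1.2) can be sketched as a breadth-first search that links each entity to every neighbor reachable in 2 to k hops; the toy author/paper graph below is an illustrative assumption:

```python
from collections import deque

def add_auxiliary_edges(edges, max_hops=2):
    """Add a directed auxiliary edge from each node to every neighbor reachable
    in 2..max_hops steps, so information can flow past immediate neighbors."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
    aux = set()
    for src in adj:
        dist = {src: 0}
        q = deque([src])
        while q:
            u = q.popleft()
            if dist[u] == max_hops:        # do not expand beyond the hop limit
                continue
            for v in adj.get(u, ()):
                if v not in dist:
                    dist[v] = dist[u] + 1
                    q.append(v)
                    if dist[v] >= 2:       # 1-hop edges already exist
                        aux.add((src, v))
    return aux

# toy graph: authors A1, A2 linked through papers P1, P2
edges = [("A1", "P1"), ("P1", "A2"), ("A2", "P2")]
aux_edges = add_auxiliary_edges(edges, max_hops=2)
```

Here the auxiliary edges connect A1 directly to its 2-hop neighbor A2 and P1 to P2.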
4. The temporal partner knowledge graph-based partner prediction method according to claim 1, wherein the semantic encoder of step (2) comprises the following specific steps:
step (2.1): each entity e in the temporal partner knowledge graph is described by a word sequence s, i.e. e = s = [w_1, ..., w_l, ..., w_L], where w_l refers to the l-th word in the word sequence and L refers to the total number of words in the word sequence; the word sequence s is fed into the pre-trained language model BERT to obtain the word embedding of each word, [w_1, ..., w_l, ..., w_L] = BERT([w_1, ..., w_l, ..., w_L]), where w_l \in R^{d_{BERT}} represents the word embedding of the word w_l and d_{BERT} represents the dimension of the word embeddings;

step (2.2): for the word embedding sequence obtained for each entity e, an averaging strategy is adopted to convert it into the entity embedding, i.e.

e = (1/L) \sum_{l=1}^{L} w_l;

a fully connected layer is then used to reduce the dimension of the obtained entity embedding e, i.e. e_{init} = FC_s(e), where FC_s denotes the fully connected layer and e_{init} is the initialized embedding of the entity.
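The averaging-and-projection of steps (2.1)-(2.2) can be sketched as follows; random vectors stand in for the BERT word embeddings, and the toy dimensions are assumptions (a real run would use d_BERT = 768 and an actual pre-trained model):

```python
import numpy as np

rng = np.random.default_rng(5)
d_bert, d = 8, 4                   # toy BERT hidden size, target embedding dim

def encode_entity(word_embeddings, W, b):
    """Average the word embeddings of the entity's word sequence, then project
    with a fully connected layer: e_init = FC_s(mean(w_1 .. w_L))."""
    e = word_embeddings.mean(axis=0)       # averaging strategy over the sequence
    return W @ e + b                       # dimensionality reduction

# stand-in for BERT output on a 3-word entity name
words = rng.normal(size=(3, d_bert))
W_fc = rng.normal(size=(d, d_bert))
e_init = encode_entity(words, W_fc, np.zeros(d))
```

`e_init` is the initialized embedding handed to the temporal graph attention layers of step (3).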
5. The temporal partner knowledge graph-based partner prediction method according to claim 1, wherein the step (3) of embedding the encoder comprises the specific steps of:
step (3.1): calculating the normalized attention coefficient of the neighbors of each central entity t through the temporal graph attention layer; for each central entity t, the triples with the central entity t as the tail entity are represented as the set V = [v_1, ..., v_y, ..., v_Y], where v_y = [h_i, r_j, t] carries a time attribute time_a, h_i denotes the head entity, r_j the relation, t the tail entity, time_a the time attribute, and Y the number of triples in the set V; these triples are sorted in ascending order by the time attribute time_a; a linear transformation is applied to each triple v_y,

v_y = W_1 [ e_{h_i} \| e_{r_j} \| e_t ],

where \| denotes the splicing operation, W_1 is a linear transformation weight matrix, and e_{h_i}, e_{r_j} and e_t denote the embedded representations of the head entity h_i, the relation r_j and the tail entity t respectively, v_y being the embedded representation of the triple; the resulting embedding set is denoted V = [v_1, ..., v_y, ..., v_Y];

the normalized attention coefficient of the triple formed by each neighbor is calculated by the bidirectional LSTM network Bi-LSTM, which comprises two LSTM networks, a forward layer and a backward layer; specifically, \overrightarrow{h}_{y-1} and \overleftarrow{h}_{y-1} denote the hidden states of the forward layer and the backward layer at time step y-1, and the hidden states of the forward layer and the backward layer at time step y are calculated as

\overrightarrow{h}_y = LSTM_f(v_y, \overrightarrow{h}_{y-1}) and \overleftarrow{h}_y = LSTM_b(v_y, \overleftarrow{h}_{y-1}),

where v_y is the embedded representation of the triple; the hidden state of the Bi-LSTM at time step y is h_y = [ \overrightarrow{h}_y \| \overleftarrow{h}_y ]; the attention coefficient of the triple v_y is then obtained through a fully connected layer FC_a(),

c_y = FC_a(h_y) = \sigma(W_2 h_y + b),

where the weight matrix W_2 and the bias b are trainable parameters and \sigma is the activation function LeakyReLU; the normalized attention coefficient is calculated by the softmax function,

\alpha_y = softmax(c_y) = exp(c_y) / \sum_{y'=1}^{Y} exp(c_{y'});
step (3.2): updating the central entity representation by weighted aggregation of triples; the central entity t obtains a new embedded representation by weighting the triple representations in the aggregated set V,

e'_t = \sigma( \sum_{y=1}^{Y} \alpha_y v_y ),

where \sigma is an activation function; the training process of the attention mechanism is stabilized through a multi-head attention mechanism,

e'_t = \|_{k=1}^{K} \sigma( \sum_{y=1}^{Y} \alpha_y^k v_y^k ),

where \alpha_y^k and v_y^k are calculated by the k-th independent attention head and K denotes the number of attention heads; if the current temporal graph attention layer is the last layer, the entity embeddings output by the multi-head attention mechanism are not spliced but averaged,

e'_t = \sigma( (1/K) \sum_{k=1}^{K} \sum_{y=1}^{Y} \alpha_y^k v_y^k );

in order to prevent gradient vanishing and the loss of entity semantic information caused by deepening the network, a skip connection is adopted in each temporal graph attention layer, expressed as

e_t^f = W_3 e_{init} + e'_t,

where W_3 is a trainable weight matrix, e_t^f is the entity embedding output by the graph attention layer, e_{init} is the initialized embedding of the entity output by the semantic encoder in step (2), and e'_t is the output of the multi-head attention mechanism;

step (3.3): updating the embedded representation of the relation; a new embedded representation of the relation is learned by a linear transformation, r' = W_4 r, where r' and e_t^f have the same dimension, W_4 is a linear transformation weight matrix, and r is the embedded representation before the relation update;

step (3.4): jointly training the semantic encoder and the embedding encoder with the loss function

L_{Encoder} = \sum_{(h,r,t) \in \Delta} \sum_{(h',r,t') \in \Delta'} [ d_{(h,r,t)} - d_{(h',r,t')} + \gamma ]_+,

where \Delta denotes the positive sample set, \Delta' the negative sample set, \gamma the safety margin distance, [x]_+ = Max[x, 0], and

d_{(h,r,t)} = \| e_h + e_r - e_t \|_1,

where e_h, e_r, e_t denote the embedded representations of h, r, t output by the embedding encoder and \| \cdot \|_1 denotes the l_1 norm;

step (3.5): calculating the loss function value of the encoder and updating the parameters of the encoder.
6. The temporal partner knowledge graph-based partner prediction method according to claim 1, wherein the decoder of step (4) comprises the following specific steps:
step (4.1): performing convolution with a multi-scale convolution kernel set; given a triple (h, r, t), e_h, e_r, e_t denote the embedded representations of h, r, t output by the embedding encoder; the head entity, tail entity and relation of the triple are represented as an embedding matrix,

A = [ e_h ; e_r ; e_t ] \in R^{3 \times d},

where d denotes the dimensions of the entity and relation embeddings; convolution kernels of three different sizes, \omega_1, \omega_2, \omega_3, are used to convolve the triple embedding matrix to obtain the feature matrices \tau_1, \tau_2, \tau_3, where

\tau_i = ReLU(\omega_i * A + b_i)

denotes the feature matrix generated by the convolution kernel \omega_i and c_i denotes the dimension of the feature matrix \tau_i; the generated feature matrices are spliced,

p = [ \tau_1 \| \tau_2 \| \tau_3 ] \in R^c, c = c_1 + c_2 + c_3,

where c is the dimension of p;

step (4.2): convolving with at least one set of multi-scale convolution kernels to stabilize the process of feature extraction; the M sets of multi-scale convolution kernels are denoted as \Omega' = [\Omega_1, ..., \Omega_M], and the M output feature vectors are spliced,

P = [ p_1 \| ... \| p_M ] \in R^{Mc},

where Mc is the dimension of the spliced vector P;

step (4.3): scoring the triple according to P; a dot product with the weight matrix w_d is performed, f(h, r, t) = P \cdot w_d; the scoring function of the final decoder is

f(h, r, t) = ( \|_{m=1}^{M} ReLU(\Omega_m * A + b) ) \cdot w_d,

where * denotes the convolution operation, b denotes the bias, \Omega_m is the m-th multi-scale convolution kernel set, and A is the embedding matrix of the triple (h, r, t);

step (4.4): training the decoder; the loss function of the decoder training is

L_{Decoder} = \sum_{(h,r,t) \in \Delta \cup \Delta'} log( 1 + exp( l_{(h,r,t)} \cdot f(h,r,t) ) ) + (\lambda/2) \| w_d \|_2^2,

where l_{(h,r,t)} = -1 for (h,r,t) \in \Delta and l_{(h,r,t)} = 1 for (h,r,t) \in \Delta', \| w_d \|_2^2 is the l_2 regularization of the weight vector w_d, \Delta denotes the positive sample set, and \Delta' the negative sample set.
7. The temporal partner knowledge graph-based partner prediction method according to claim 1, wherein the step (5) of model testing comprises the following specific steps:
step (5.1): selecting two author entities that have never cooperated, obtaining their corresponding embeddings and the embedding of the cooperation relation from the trained embedding encoder, forming an embedding matrix, inputting the embedding matrix into the trained decoder, and obtaining its score; whether future cooperation is predicted is judged according to a set threshold value.
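The test-time prediction of step (5.1) can be sketched as below; the dot-product stand-in for the trained decoder, the random embeddings and the zero threshold are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(6)
d = 4

def predict_cooperation(e_a, e_b, e_rel, score_fn, threshold=0.0):
    """Form the (head, relation, tail) embedding matrix for two authors and the
    cooperation relation, score it with the trained decoder, and predict
    cooperation when the confidence exceeds the threshold."""
    A = np.stack([e_a, e_rel, e_b])        # 3 x d embedding matrix
    return score_fn(A) > threshold

# toy stand-in for the trained decoder: flatten and dot with a weight vector
w = rng.normal(size=3 * d)
score_fn = lambda A: float(A.ravel() @ w)

e_a, e_b, e_rel = rng.normal(size=(3, d))  # stand-ins for encoder outputs
will_cooperate = predict_cooperation(e_a, e_b, e_rel, score_fn)
```

In the full system `score_fn` would be the trained multi-scale convolution decoder of step (4).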
8. A temporal partner knowledge graph-based partner prediction system, comprising: a memory and a processor;
the memory has stored thereon a computer program which, when executed by the processor, implements the prediction method as claimed in any one of claims 1-7.
CN202110606169.1A 2021-05-31 2021-05-31 Partner prediction method and prediction system based on temporal partner knowledge graph Pending CN115481215A (en)


Publications (1)

Publication Number Publication Date
CN115481215A true CN115481215A (en) 2022-12-16


Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116431836A (en) * 2023-06-12 2023-07-14 湖北大学 Knowledge graph embedding method and system based on multi-scale dynamic convolution network model



Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination