CN115329146A - Link prediction method in time series network, electronic device and storage medium - Google Patents


Info

Publication number
CN115329146A
Authority
CN
China
Prior art keywords
node
source node
vector
distance
adaptive
Prior art date
Legal status
Pending
Application number
CN202210959140.6A
Other languages
Chinese (zh)
Inventor
陈洪辉
潘志强
蔡飞
舒振
郑建明
邵太华
郭昱普
宋城宇
张鑫
刘登峰
刘诗贤
Current Assignee
National University of Defense Technology
Original Assignee
National University of Defense Technology
Priority date
Filing date
Publication date
Application filed by National University of Defense Technology filed Critical National University of Defense Technology
Priority to CN202210959140.6A
Publication of CN115329146A


Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90Details of database functions independent of the retrieved data types
    • G06F16/901Indexing; Data structures therefor; Storage structures
    • G06F16/9024Graphs; Linked lists


Abstract

The invention provides a link prediction method in a time-series network, an electronic device, and a storage medium. The method comprises the following steps: respectively extracting time-sequence adaptive walks of a source node and a target node in the time-series network, wherein at least one of the source node and the target node is an invisible node; respectively obtaining distance metric vectors of the source node and the target node in an embedding space according to their time-sequence adaptive walks, so as to calculate a first distance between the source node and the target node in the embedding space; respectively obtaining structure perception vectors of the source node and the target node on a dynamic graph structure according to their time-sequence adaptive walks, so as to calculate a second distance between the source node and the target node on the dynamic graph structure; and predicting, according to the first distance and the second distance, the probability that the source node and the target node form a link at a target timestamp. The method can effectively improve the accuracy of inductive link prediction in time-series networks.

Description

Link prediction method in time series network, electronic device and storage medium
Technical Field
The invention belongs to the technical field of computers, and particularly relates to a link prediction method in a time sequence network, electronic equipment and a storage medium.
Background
Inductive link prediction in time-series networks aims to predict future links associated with nodes that are not present in the historical timestamps. Existing inductive link prediction methods focus primarily on learning node representations from node/edge attributes and the dynamic evolution of the network, or on generating predictions by measuring distances between nodes in the time-series network. However, the former approach has a limited range of application, because in many real-world scenarios node/edge attributes are unavailable, so node representations cannot be learned from attribute information. Recently, time-ordered anonymous walks have been proposed to make inductive link predictions by measuring distances between nodes. However, this approach depends heavily on common neighbors between nodes and has two main drawbacks, especially in sparse time-series networks: on the one hand, it can only explicitly model connectivity between the nodes extracted on walks, ignoring node pairs that are tightly connected but never sampled; on the other hand, randomly sampling a node's neighbors for the walk, or simply selecting the most recent neighbor, cannot accurately locate the common neighbors between nodes. These shortcomings reduce the accuracy of inductive link prediction in time-series networks.
Disclosure of Invention
The invention aims to provide a link prediction method in a time sequence network, an electronic device and a storage medium, which can improve the accuracy of inductive link prediction in the time sequence network.
The invention relates to a link prediction method in a time sequence network, which comprises the following steps:
respectively extracting time-sequence adaptive walks of a source node and a target node in the time-series network; wherein the neighbor nodes in the time-sequence adaptive walks are visible nodes, a visible node being a node that appears in a preset training set; at least one of the source node and the target node is an invisible node, an invisible node being a node that does not appear in the training set;
acquiring a first distance measurement vector of the source node in an embedding space according to the time sequence self-adaptive walking of the source node, and acquiring a second distance measurement vector of the target node in the embedding space according to the time sequence self-adaptive walking of the target node;
calculating a first distance between the source node and the target node in an embedding space according to the first distance metric vector and the second distance metric vector;
acquiring a first structure perception vector of the source node on a dynamic graph structure according to the time sequence self-adaptive walking of the source node, and acquiring a second structure perception vector of the target node on the dynamic graph structure according to the time sequence self-adaptive walking of the target node;
calculating a second distance between the source node and the target node on a dynamic graph structure according to the first structure perception vector and the second structure perception vector;
and predicting the probability of forming a link at a target timestamp by the source node and the target node according to the first distance and the second distance.
Optionally, the respectively extracting timing adaptive walks of the source node and the target node in the timing network includes:
respectively acquiring a first embedded vector of the source node and a second embedded vector of the target node;
calculating an embedding distance between the source node and the target node according to the first embedding vector and the second embedding vector;
if the embedding distance is larger than a preset distance threshold, sampling the neighbor node nearest to the source node to extract the time-sequence adaptive walk of the source node, and sampling the neighbor node nearest to the target node to extract the time-sequence adaptive walk of the target node;
and if the embedding distance is smaller than the distance threshold, randomly sampling the neighbor nodes of the source node to extract the time-sequence adaptive walk of the source node, and randomly sampling the neighbor nodes of the target node to extract the time-sequence adaptive walk of the target node.
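The threshold switch in the two sampling branches above can be sketched as a small sampler. This is a minimal illustration under assumptions, not the patented implementation; the function name and the `(node, timestamp)` neighbor representation are hypothetical:

```python
import random

def sample_neighbor(neighbors, embed_dist, d_threshold, rng=random):
    """Pick the next hop of a time-sequence adaptive walk.

    neighbors   -- list of (node_id, timestamp) historical interactions
    embed_dist  -- embedding distance between the source and target node
    d_threshold -- preset distance threshold from the claim above

    A large embedding distance selects the most recent (nearest) neighbor;
    otherwise a neighbor is sampled uniformly over the full history.
    """
    if not neighbors:
        return None
    if embed_dist > d_threshold:
        return max(neighbors, key=lambda nt: nt[1])  # nearest neighbor sampling
    return rng.choice(neighbors)                     # random neighbor sampling
```

Either branch returns one historical neighbor; repeating the call step by step yields a full walk.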
Optionally, the obtaining a first embedded vector of the source node includes:
generating an initial embedding vector of the source node;
if the source node is an invisible node, taking the initial embedded vector of the source node as a first embedded vector of the source node;
if the source node is a visible node, extracting the time sequence random walk of the source node; generating an initial embedding vector of the node in the time-series random walk; and determining a first embedding vector of the source node according to the initial embedding vector of the node in the time sequence random walk.
Optionally, the obtaining a first distance metric vector of the source node in an embedding space according to the time-sequence adaptive walk of the source node includes:
respectively generating initial embedded vectors of the source node and neighbor nodes in the time sequence self-adaptive walking of the source node;
calculating the weight of the neighbor node of the source node according to the initial embedded vectors of the source node and the neighbor node of the source node;
and calculating a first distance metric vector of the source node in an embedding space according to the initial embedding vector of the source node, the initial embedding vector of the neighbor node of the source node and the weight.
Optionally, the calculation formula of the weight of a neighbor node of the source node is:

e_l^τ = (W_q v_s)^T (W_k v_l^τ);

α_l^τ = exp(e_l^τ) / Σ_{i=1}^{m} exp(e_i^τ);

wherein the source node has M time-sequence adaptive walks, each with m neighbor nodes; α_l^τ is the weight of the l-th neighbor node in the τ-th time-sequence adaptive walk of the source node, 1 ≤ τ ≤ M, 1 ≤ l ≤ m; v_l^τ is the initial embedding vector of the l-th neighbor node in the τ-th time-sequence adaptive walk of the source node; v_s is the initial embedding vector of the source node; and W_q and W_k are training parameters.
The calculation formula of the first distance metric vector of the source node in the embedding space is:

c_s^τ = Σ_{l=1}^{m} α_l^τ v_l^τ;

g_s^e = σ( W_e [ v_s ‖ (1/M) Σ_{τ=1}^{M} c_s^τ ] );

wherein g_s^e is the first distance metric vector of the source node in the embedding space, W_e is a learning parameter, ‖ denotes concatenation, and σ is an activation function.

The calculation formula of the first distance is:

d_st^e = ‖ g_s^e − g_t^e ‖_2;

wherein d_st^e is the first distance, and g_t^e is the second distance metric vector of the target node in the embedding space.
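The attention-weighted aggregation over a walk's neighbors and the Euclidean first distance can be sketched as follows. Plain dot-product scores stand in for the trained scoring parameters, and all function names are hypothetical:

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def walk_metric_vector(v_node, neighbor_embeds):
    """Weight one walk's neighbor embeddings by attention against the node's
    own embedding (plain dot products here, no trained parameters), then sum."""
    scores = [sum(a * b for a, b in zip(v_node, v_l)) for v_l in neighbor_embeds]
    alphas = softmax(scores)
    dim = len(v_node)
    return [sum(alphas[l] * neighbor_embeds[l][k] for l in range(len(neighbor_embeds)))
            for k in range(dim)]

def first_distance(g_s, g_t):
    """Euclidean distance between the two distance metric vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(g_s, g_t)))
```

Averaging `walk_metric_vector` over the M walks of a node gives its embedding-space representation; `first_distance` then compares source and target.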
Optionally, the obtaining a first structure perception vector of the source node on the dynamic graph structure according to the time-sequence adaptive walk of the source node includes:
generating anonymous distance coding vectors of the neighbor nodes of the source node;
generating distance perception vectors of the neighbor nodes of the source node according to their anonymous distance coding vectors;
generating time coding vectors of the neighbor nodes of the source node according to the target timestamp and the timestamp at which each neighbor node forms a link with the previous node in the time-sequence adaptive walk of the source node;
generating time-sequence distance perception vectors of the neighbor nodes of the source node according to their distance perception vectors and time coding vectors;
generating time-sequence adaptive walk state vectors of the source node according to the time-sequence distance perception vectors of the neighbor nodes of the source node;
and generating a first structure perception vector of the source node on the dynamic graph structure according to the time-sequence adaptive walk state vectors of the source node.
Optionally, the generation formula of the distance perception vector of a neighbor node of the source node is:

s_i = MLP(a_i) = W_2 σ(W_1 a_i);

wherein the source node has M time-sequence adaptive walks, each with m neighbor nodes; s_i is the distance perception vector of the i-th neighbor node of the source node, 1 ≤ i ≤ M × m; a_i is the anonymous distance coding vector of the i-th neighbor node of the source node; W_1 and W_2 are training parameters; and σ is an activation function.
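The anonymous distance coding vector a_i fed to the MLP above can be built, for example, in the style of causal anonymous walks: each neighbor is described by how often it occupies each position across the node's M sampled walks, which is identity-free yet distance-aware. This positional-count variant is an illustrative assumption, not necessarily the patent's exact encoding:

```python
def anonymous_distance_encoding(node, walks):
    """Count how often `node` appears at each step position over the
    sampled walks; position 0 is the walk's start node."""
    length = len(walks[0])
    encoding = [0] * length
    for walk in walks:
        for pos, v in enumerate(walk):
            if v == node:
                encoding[pos] += 1
    return encoding
```

Two nodes get similar encodings when they sit at similar distances from the walk's start, without ever exposing node identities.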
The generation formula of the time coding vector of a neighbor node of the source node is:

T_i = [cos(ω_1 Δt_i), sin(ω_1 Δt_i), …, cos(ω_d Δt_i), sin(ω_d Δt_i)];

Δt_i = t_st − t_i;

wherein T_i is the time coding vector of the i-th neighbor node of the source node, t_st is the target timestamp, t_i is the timestamp at which the i-th neighbor node forms a link with the previous node in the time-sequence adaptive walk of the source node, and ω_1, ω_2, …, ω_d are learning parameters.
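Assuming the standard cos/sin functional time encoding used by temporal graph models (the exact functional form in the original figures is not reproduced in this text), the vector T_i can be computed as:

```python
import math

def time_encoding(t_target, t_i, omegas):
    """Encode the time gap (t_target - t_i) with learnable frequencies
    omega_1..omega_d; returns a 2*d-dimensional cos/sin feature vector."""
    dt = t_target - t_i
    features = []
    for w in omegas:
        features.append(math.cos(w * dt))
        features.append(math.sin(w * dt))
    return features
```

A zero time gap always encodes to alternating ones and zeros, so recency is expressed purely through the gap, not absolute timestamps.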
The generation formula of the time-sequence distance perception vector of a neighbor node of the source node is:

h_i = [s_i, T_i];

wherein h_i is the time-sequence distance perception vector of the i-th neighbor node of the source node, obtained by concatenating s_i and T_i.
The generation formula of the time-sequence adaptive walk state vector of the source node is:

hf_{j,l} = LSTM_f( hf_{j,l−1}, h_{j,l} );

hb_{j,l} = LSTM_b( hb_{j,l+1}, h_{j,l} );

u_j = [ hf_{j,m}, hb_{j,m} ];

wherein u_j is the j-th time-sequence adaptive walk state vector of the source node, 1 ≤ j ≤ M; hf_{j,m} is the hidden state vector of the m-th neighbor node of the j-th time-sequence adaptive walk of the source node in the forward pass; hb_{j,m} is the hidden state vector of the m-th neighbor node of the j-th time-sequence adaptive walk of the source node in the backward pass; and h_{j,m} is the time-sequence distance perception vector of the m-th neighbor node of the j-th time-sequence adaptive walk of the source node.
The generation formula of the first structure perception vector of the source node on the dynamic graph structure is:

e_j = w_g^T tanh( W_g u_j );

β_j = exp(e_j) / Σ_{k=1}^{M} exp(e_k);

g_s^g = Σ_{j=1}^{M} β_j u_j;

wherein g_s^g is the first structure perception vector of the source node on the dynamic graph structure, and w_g and W_g are learning parameters.

The calculation formula of the second distance is:

d_st^g = ‖ W_3 g_s^g − W_4 g_t^g ‖_2;

wherein d_st^g is the second distance, g_t^g is the second structure perception vector of the target node on the dynamic graph structure, and W_3 and W_4 are learning parameters.
Optionally, the formula for calculating the probability is:

z_st = σ( W_6 σ( W_5 [ d_st^e ‖ d_st^g ] ) );

wherein z_st is the probability, d_st^e is the first distance, d_st^g is the second distance, W_5 and W_6 are training parameters, and σ is an activation function (a sigmoid at the output, so that z_st lies in (0, 1)).
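A minimal, hypothetical sketch of this scoring head: the first and second distances are combined by a small two-layer network with a sigmoid output so that the result is a valid probability. The weight shapes and parameter names here are assumptions:

```python
import math

def sigmoid(x):
    """Logistic sigmoid, maps any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def link_probability(d_em, d_graph, w5, b5, w6, b6):
    """Two-layer head over [first distance, second distance].

    w5 -- list of (weight_em, weight_graph) rows for the hidden layer
    b5 -- hidden-layer biases; w6/b6 -- output-layer weights and bias
    """
    hidden = [sigmoid(w_em * d_em + w_g * d_graph + b)
              for (w_em, w_g), b in zip(w5, b5)]
    logit = sum(w * h for w, h in zip(w6, hidden)) + b6
    return sigmoid(logit)
```

In training, the parameters would be fitted so that node pairs with small distances in both channels receive probabilities close to 1.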
The invention also provides an electronic device, which comprises a memory, a processor and a computer program stored on the memory and capable of running on the processor, wherein the processor executes the program to realize the link prediction method in the time sequence network.
The present invention also provides a non-transitory computer-readable storage medium storing computer instructions for causing the computer to execute the above-described link prediction method in a time series network.
The method has the advantages that time-sequence adaptive walks of a source node and a target node in a time-series network are respectively extracted, at least one of the source node and the target node being an invisible node. Distance metric vectors of the source node and the target node in an embedding space are respectively obtained according to their time-sequence adaptive walks, so as to calculate a first distance between the two nodes in the embedding space. Structure perception vectors of the source node and the target node on a dynamic graph structure are respectively obtained according to their time-sequence adaptive walks, so as to calculate a second distance between the two nodes on the dynamic graph structure. The probability that the source node and the target node form a link at a target timestamp is then predicted according to the first distance and the second distance, which effectively improves the accuracy of inductive link prediction in time-series networks.
Drawings
In order to more clearly illustrate the technical solutions in the embodiments of the present application, the drawings needed to be used in the description of the embodiments are briefly introduced below, and it is obvious that the drawings in the following description are only some embodiments of the present application, and it is obvious for those skilled in the art to obtain other drawings based on these drawings without creative efforts.
Fig. 1 is a flowchart illustrating a link prediction method in a time series network according to an embodiment of the present disclosure.
Fig. 2 is a block diagram of a DEAL model in the link prediction method in the time series network according to the embodiment of the present application.
Fig. 3a is a schematic diagram of a timing network according to an embodiment of the present application.
Fig. 3b is a schematic diagram of a timing adaptive walk of a source node according to an embodiment of the present application.
Fig. 3c is a schematic diagram of a timing adaptive walk of a target node according to an embodiment of the present disclosure.
Fig. 4a to 4b are performance comparison graphs of the DEAL model and the baseline models on the AP index under different data-sparsity scenarios.
Fig. 5a to 5c are graphs comparing the performance of the DEAL model using different neighbor sampling methods on the AP index.
Fig. 6a to 6d are graphs comparing the performance of the DEAL model in the AP index under different hyper-parameters.
Fig. 7 is a schematic structural diagram of an electronic device according to an embodiment of the present application.
Detailed Description
The technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are only a part of the embodiments of the present application, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
As shown in fig. 1, an embodiment of the present invention provides a method for predicting a link in a time-series network. It should be noted that graph networks provide an efficient way to study complex systems, treating elements as nodes and correlations between nodes as edges. In practical scenarios such as social media and web search, network data usually evolves over time, that is, nodes and edges continuously evolve to form a time-series network. As shown in fig. 3a, new nodes (i.e., invisible nodes, such as nodes v_t, v_5, v_6) and new edges (e.g., the edge between nodes v_t and v_5) continuously appear in the time-series network. The edges can be divided into transductive links connecting visible nodes and inductive links related to invisible nodes; that is, an inductive link can be a link connecting a visible node and an invisible node, or a link connecting two invisible nodes.
The method adopts a DistancE-Aware Learning (DEAL) model to predict inductive links in the time-series network. The purpose of inductive link prediction is to predict the occurrence of future links associated with new nodes (invisible nodes) that do not occur during the training phase. Let G = (V, ε) denote a time-series network, in which both V and ε vary over time; each edge (v_i, v_j, t_ij) ∈ ε is a timing edge indicating that nodes v_i and v_j are connected at timestamp t_ij. Let V_s denote the visible nodes occurring during the training phase and V_u the invisible nodes that appear after the training phase. Inductive link prediction can then be expressed as predicting the probability of the links {(v_i, v_j, t_ij) | v_i ∈ V_u or v_j ∈ V_u}.
As shown in fig. 1, an embodiment of the present invention provides a method for predicting a link in a time series network, including steps 101 to 106, which are specifically as follows:
step 101, respectively extracting time sequence self-adaptive walks of a source node and a target node in a time sequence network; the neighbor nodes in the time sequence self-adaptive walk are visible nodes, the visible nodes are nodes appearing in a preset training set, at least one of the source nodes and the target nodes is invisible nodes, and the invisible nodes are nodes not appearing in the training set.
The neighbor node in the time sequence adaptive walk of the source node is the neighbor node of the source node, that is, the other nodes except the source node in the time sequence adaptive walk of the source node. The neighbor node in the time sequence adaptive walk of the target node is the neighbor node of the target node, that is, the other nodes except the target node in the time sequence adaptive walk of the target node.
As shown in fig. 2, an adaptive sampling module is arranged in the DEAL model, and time-sequence adaptive walk of a source node and a target node is extracted by dynamically combining a method of randomly sampling neighbor nodes and selecting nearest neighbor nodes, so as to improve the probability of containing common neighbors of the source node and the target node.
Specifically, the respectively extracting timing adaptive walks of the source node and the target node in the timing network in step 101 includes:
respectively acquiring a first embedded vector of the source node and a second embedded vector of the target node;
calculating an embedding distance between the source node and the target node according to the first embedding vector and the second embedding vector;
if the embedding distance is larger than a preset distance threshold, sampling the neighbor node nearest to the source node to extract the time-sequence adaptive walk of the source node, and sampling the neighbor node nearest to the target node to extract the time-sequence adaptive walk of the target node;
and if the embedding distance is smaller than the distance threshold, randomly sampling the neighbor nodes of the source node to extract the time-sequence adaptive walk of the source node, and randomly sampling the neighbor nodes of the target node to extract the time-sequence adaptive walk of the target node.
The source node v_s may be either a visible node or an invisible node. When the source node v_s is an invisible node, its first embedded vector is its initial embedded vector; when the source node v_s is a visible node, its first embedded vector is a trained embedded vector.
Specifically, the obtaining a first embedded vector of the source node includes:
generating an initial embedding vector of the source node;
if the source node is an invisible node, taking the initial embedded vector of the source node as a first embedded vector of the source node;
if the source node is a visible node, extracting the time sequence random walk of the source node; generating an initial embedding vector of the node in the time-series random walk; and determining a first embedding vector of the source node according to the initial embedding vector of the node in the time sequence random walk.
Each node in the time-series network has a unique identification code (ID); an initial embedding vector of a node can be generated from its identification code by embedding-layer initialization, with embedding dimension d. Thus, the initial embedding vector v_s of the source node is generated from the identification code of the source node.

If the source node v_s is an invisible node, its first embedded vector, denoted v̂_s, is its initial embedding vector v_s.

If the source node v_s is a visible node, its initial embedding vector is trained to obtain the first embedded vector v̂_s. It should be noted that the embedded vector of each visible node in the training set may also be pre-trained before prediction, so that at prediction time the trained embedding vector of the source node v_s can be used directly as the first embedded vector v̂_s.
In particular, the nodes of the links are collected by backtracking the timestamps of the edges in the training set, forming each time-sequence random walk of the source node v_s, namely:

w = { (v_1, v_2, …, v_n) | (v_{i−1}, v_i, t_i) ∈ ε, t_2 > t_3 > … > t_n };

wherein v_1 is the starting point of the time-sequence random walk, i.e., v_s; n is the length of the time-sequence random walk, i.e., the number of walk steps; t_2 > t_3 > … > t_n indicates that the nodes on a time-sequence random walk are arranged in reverse time order; and (v_{i−1}, v_i, t_i) ∈ ε is a timing edge indicating that nodes v_{i−1} and v_i form a link at timestamp t_i.
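The timestamp-backtracking walk described above can be sketched as follows; the adjacency representation and the function name are illustrative assumptions:

```python
import random

def temporal_random_walk(edges, start, n, seed=0):
    """Sample one time-sequence random walk of at most n nodes.

    edges -- list of undirected timing edges (u, v, t)
    Each hop must use an edge strictly earlier than the previous one,
    so timestamps decrease along the walk (t_2 > t_3 > ... > t_n).
    """
    rng = random.Random(seed)
    adjacency = {}
    for u, v, t in edges:
        adjacency.setdefault(u, []).append((v, t))
        adjacency.setdefault(v, []).append((u, t))
    walk, current, t_prev = [start], start, float("inf")
    while len(walk) < n:
        candidates = [(v, t) for v, t in adjacency.get(current, []) if t < t_prev]
        if not candidates:
            break  # no earlier edge to backtrack through
        current, t_prev = rng.choice(candidates)
        walk.append(current)
    return walk
```

Calling this M times (with different seeds) yields the M walks used for aggregation.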
The source node v_s has M time-sequence random walks, namely W_s = {w_1, w_2, …, w_M}.

Firstly, the initial embedded vector of each node in the M time-sequence random walks is generated from its identification code; then the embedded vector of the source node v_s is updated according to the initial embedded vectors of all nodes in the M time-sequence random walks, namely:

v̄_l = MEAN( { v_l^τ | 1 ≤ τ ≤ M } );

v̂_s = MEAN( v̄_1, v̄_2, …, v̄_{n−1} );

wherein v_l^τ is the initial embedded vector of the l-th neighbor node of the τ-th time-sequence random walk of the source node v_s, a neighbor node being any node on the walk other than the source node v_s, 1 ≤ τ ≤ M, 1 ≤ l ≤ n−1; v̄_l is the aggregate representation of all the l-th neighbor nodes in the M time-sequence random walks; and v̂_s is the final updated embedded vector of the source node v_s, i.e., its first embedded vector. That is, when the source node v_s is a visible node, its first embedded vector is v̂_s. MEAN(·) is the average-pooling aggregation function; mean pooling is employed here to avoid introducing additional trainable parameters, since the subsequent focus is on measuring the distance between node embeddings.
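The parameter-free aggregation described above (mean pooling across the M walks per position, then across positions) can be sketched as:

```python
def mean_pool(vectors):
    """Element-wise average of a list of equal-length vectors."""
    dim = len(vectors[0])
    return [sum(v[k] for v in vectors) / len(vectors) for k in range(dim)]

def aggregate_first_embedding(walk_embeddings):
    """walk_embeddings[tau][l] is the initial embedding of the l-th neighbor
    on the tau-th walk; pool over the M walks per position, then over the
    n-1 positions, yielding the updated (first) embedded vector."""
    num_positions = len(walk_embeddings[0])
    per_position = [mean_pool([walk[l] for walk in walk_embeddings])
                    for l in range(num_positions)]
    return mean_pool(per_position)
```

Because both pooling steps are plain averages, no trainable parameters are introduced, matching the design choice stated above.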
Additionally, the obtaining a second embedded vector of the target node includes:
generating an initial embedded vector of the target node;
if the target node is an invisible node, taking the initial embedded vector of the target node as the second embedded vector of the target node;
if the target node is a visible node, extracting the time-sequence random walks of the target node, generating the initial embedded vectors of the nodes in the time-sequence random walks, and determining the second embedded vector of the target node according to the initial embedded vectors of the nodes in the time-sequence random walks.

Likewise, the target node v_t can be a visible node or an invisible node, but at least one of the source node v_s and the target node v_t is an invisible node. When the target node v_t is an invisible node, its second embedded vector v̂_t is its initial embedding vector v_t; when the target node v_t is a visible node, its second embedded vector v̂_t is the trained embedded vector. The second embedded vector v̂_t of the target node v_t is obtained in the same way as the first embedded vector v̂_s of the source node v_s, which is not described in detail herein.
After the first embedded vector v̂_s and the second embedded vector v̂_t are obtained, the embedding distance d′_st between the source node v_s and the target node v_t is measured after L2 normalization, namely:

d′_st = ‖ v̂_s / ‖v̂_s‖_2 − v̂_t / ‖v̂_t‖_2 ‖_2;
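A sketch of this L2-normalized embedding distance, assuming normalize-then-Euclidean (so the distance depends only on direction, not magnitude):

```python
import math

def l2_normalize(v):
    """Scale a vector to unit L2 norm; zero vectors are returned unchanged."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm else list(v)

def embedding_distance(v_s, v_t):
    """Euclidean distance between the L2-normalized embeddings."""
    a, b = l2_normalize(v_s), l2_normalize(v_t)
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
```

Vectors pointing in the same direction get distance 0 regardless of scale, which keeps the threshold comparison d′_st > d_threshold independent of embedding magnitude.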
it should be noted that the time-series random walk in the present embodiment only propagates the information of the visible node (i.e., the neighbor node appearing in the training set), and ignores the information of the invisible node.
If the embedding distance d′_st between the source node v_s and the target node v_t is large, the source node v_s and the target node v_t are likely to be invisible nodes, most of whose neighbor nodes are also new. In this case, the nearest (most recent) neighbor nodes are sampled to improve the probability of capturing common neighbors between the source node v_s and the target node v_t. If the embedding distance d′_st is small, their historical neighbor nodes tend to be similar visible nodes, so the neighbor nodes of the source node v_s and the target node v_t are sampled over the full time scale to avoid missing their early common neighbors.
Therefore, in extracting the timing adaptive walk, the neighbor node v i Sampling probability of (2)
Figure BDA0003791216970000091
Comprises the following steps:
Figure BDA0003791216970000092
wherein the content of the first and second substances,
Figure BDA0003791216970000093
adaptively walking the last step of historical interactive nodes, wherein lambda is a hyperparameter determining the recent sampling strength, when lambda is equal to 0, the recent sampling is degenerated into uniform sampling, and d threshold Is a trade-off parameter that controls the threshold for selecting random neighbor node samples or nearest neighbor node samples.
According to the sampling probability p(v_i) of each neighbor node v_i, the adaptive walks of the source node v_s and the target node v_t are selected. The source node v_s has M timing adaptive walks W_s = {W_s^1, ..., W_s^M}, and the target node v_t has M timing adaptive walks W_t = {W_t^1, ..., W_t^M}.
As shown in FIG. 3b, the three timing adaptive walks of the source node v_s are v_s → v_2 → v_1, v_s → v_4 → v_2 and v_s → v_3 → v_t; as shown in FIG. 3c, the three timing adaptive walks of the target node v_t are v_t → v_3 → v_s, v_t → v_4 → v_2 and v_t → v_5 → v_6.
Then, as shown in FIG. 2, the DEAL model further has a dual-channel distance measurement module for measuring, between the source node v_s and the target node v_t, a distance based on the embedding space (i.e., a first distance) and a distance based on the dynamic graph structure (i.e., a second distance).
Step 102, obtaining a first distance measurement vector of the source node in the embedding space according to the time sequence self-adaptive walking of the source node, and obtaining a second distance measurement vector of the target node in the embedding space according to the time sequence self-adaptive walking of the target node.
After the timing adaptive walks of the source node and the target node are obtained, information is propagated from the visible neighbor nodes to the source node and the target node, and the corresponding distance metric representation in the embedding space, i.e., the distance metric vector, is learned.
Specifically, the obtaining a first distance metric vector of the source node in an embedding space according to the time sequence adaptive walk of the source node in step 102 includes:
respectively generating initial embedded vectors of the source node and neighbor nodes in the time sequence self-adaptive walking of the source node;
calculating the weight of the neighbor node of the source node according to the initial embedded vectors of the source node and the neighbor node of the source node;
and calculating a first distance metric vector of the source node in an embedding space according to the initial embedding vector of the source node, the initial embedding vector of the neighbor node of the source node and the weight.
The source node v_s has M timing adaptive walks W_s = {W_s^1, ..., W_s^M}, and each timing adaptive walk has m neighbor nodes. N_s^l denotes the set of the l-th neighbor nodes of the source node v_s across the M timing adaptive walks, where 1 ≤ l ≤ m.
An embedding layer is used to generate a d-dimensional embedding vector for each node (including the source node v_s and its neighbor nodes in the timing adaptive walks). Then, the weight of each neighbor node in the timing adaptive walks is learned, namely:
w_{τ,l} = LeakyReLU(a^T [W v_s, W v_{τ,l}])

where w_{τ,l} is the weight of the l-th neighbor node in the τ-th timing adaptive walk of the source node v_s (1 ≤ τ ≤ M, 1 ≤ l ≤ m), v_{τ,l} is the initial embedding vector of the l-th neighbor node in the τ-th timing adaptive walk of the source node v_s, v_s also denotes the initial embedding vector of the source node, W and a are trainable parameters, LeakyReLU is an activation function, and [ , ] denotes the concatenation operation.
The attention weight is then normalized using the softmax function, namely:

α_{τ,l} = exp(w_{τ,l}) / Σ_{i=1}^{M} exp(w_{i,l})

where α_{τ,l} is the final weight of the l-th neighbor node in the τ-th timing adaptive walk of the source node v_s, w_{i,l} is the weight of the l-th neighbor node in the i-th timing adaptive walk (1 ≤ i ≤ M), and the normalization runs over N_s^l, the set of the l-th neighbor nodes of the source node v_s in the M timing adaptive walks.
Then, the embeddings of all the l-th neighbor nodes of the source node v_s (i.e., the set N_s^l) are combined according to the attention scores, namely:

g_s^l = σ(Σ_{τ=1}^{M} α_{τ,l} W' v_{τ,l})

where W' is a learnable parameter, v_{τ,l} is the initial embedding vector of the l-th neighbor node in the τ-th timing adaptive walk of the source node v_s (1 ≤ τ ≤ M, 1 ≤ l ≤ m), and σ is the ReLU activation function.
Then, average pooling is used to combine the embedding of the source node v_s with the aggregated representations of the neighbor nodes at the different positions, generating the final representation of the source node v_s, i.e., the first distance metric vector h_s of the source node v_s in the embedding space:

h_s = Mean(v_s, g_s^1, ..., g_s^m)

where Mean(·) is the average-pooling aggregation function.
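The attention-and-pooling steps above can be sketched end to end as follows; the trainable parameters W, a and W' are replaced here by random stand-ins, and the dimensions are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)
d, M, m = 4, 3, 2                      # embed dim, number of walks, walk length
v_s = rng.normal(size=d)               # source-node initial embedding
V = rng.normal(size=(M, m, d))         # V[tau, l] = l-th neighbor in tau-th walk

W  = rng.normal(size=(d, d))           # trainable projection (random here)
a  = rng.normal(size=2 * d)            # trainable attention vector (random here)
Wp = rng.normal(size=(d, d))           # trainable aggregation matrix (random here)

def leaky_relu(x, s=0.01): return np.where(x > 0, x, s * x)
def relu(x): return np.maximum(x, 0)

# Attention scores per position l, softmax-normalized across the M walks.
scores = np.array([[leaky_relu(a @ np.concatenate([W @ v_s, W @ V[t, l]]))
                    for l in range(m)] for t in range(M)])          # (M, m)
alpha = np.exp(scores) / np.exp(scores).sum(axis=0, keepdims=True)

# Aggregate the l-th neighbors of all walks, then mean-pool with v_s itself.
g = relu(np.einsum('tl,tld->ld', alpha, V @ Wp.T))                  # (m, d)
h_s = np.vstack([v_s, g]).mean(axis=0)   # first distance-metric vector

assert h_s.shape == (d,)
assert np.allclose(alpha.sum(axis=0), 1.0)
```

Note that the softmax runs over the M walks at a fixed position l, matching the normalization over the set N_s^l described above.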
in addition, the obtaining a second distance metric vector of the target node in the embedding space according to the time-sequence adaptive walk of the target node in step 102 includes:
respectively generating initial embedded vectors of the target node and neighbor nodes in the time sequence self-adaptive walking of the target node;
calculating the weight of the neighbor node of the target node according to the initial embedded vectors of the target node and the neighbor node of the target node;
and calculating a second distance metric vector of the target node in an embedding space according to the initial embedding vector of the target node, the initial embedding vector of the neighbor node of the target node and the weight.
The target node v_t has M timing adaptive walks W_t = {W_t^1, ..., W_t^M}. The second distance metric vector h_t of the target node v_t is calculated in the same way as the first distance metric vector h_s of the source node v_s, so the details are not repeated here.
It should be noted that, during prediction, the source node v_s or the target node v_t may be an invisible node whose embedded vector cannot be well trained. However, benefiting from multi-hop information propagation, this embodiment can link the invisible node v_s or v_t to visible nodes, thereby obtaining a valuable representation for the invisible node and ensuring the accuracy of the subsequent prediction.
Step 103, calculating a first distance between the source node and the target node in an embedding space according to the first distance metric vector and the second distance metric vector.
After obtaining the first distance metric vector h_s of the source node v_s and the second distance metric vector h_t of the target node v_t, the two vectors are multiplied element-wise to obtain the first distance d^e_st between the source node v_s and the target node v_t in the embedding space, namely:

d^e_st = h_s ⊙ h_t

where ⊙ denotes element-wise multiplication.
And 104, acquiring a first structure perception vector of the source node on the dynamic graph structure according to the timing adaptive walks of the source node, and acquiring a second structure perception vector of the target node on the dynamic graph structure according to the timing adaptive walks of the target node.
In addition to measuring the first distance d^e_st between the source node v_s and the target node v_t in the embedding space, structure perception vectors of the source node v_s and the target node v_t are also generated for measuring their distance on the dynamic graph structure.
Specifically, the obtaining a first structure perception vector of the source node on the dynamic graph structure according to the timing adaptive walks of the source node in step 104 includes:
generating a distance coding vector of a neighbor node of the source node;
generating a distance sensing vector of a neighbor node of the source node according to the distance coding vector of the neighbor node of the source node;
generating a time coding vector of a neighbor node of the source node according to the target timestamp and a timestamp of a link formed by the neighbor node and a previous node in the time sequence self-adaptive walking of the source node;
generating a time sequence distance perception vector of the neighbor node of the source node according to the distance perception vector and the time coding vector of the neighbor node of the source node;
generating a time sequence self-adaptive walking state vector of the source node according to the time sequence distance perception vector of the neighbor node of the source node;
and generating a first structure perception vector of the source node on the structure of the dynamic graph according to the time sequence self-adaptive wandering state vector of the source node.
The M timing adaptive walks of the source node v_s are W_s = {W_s^1, ..., W_s^M}, and the M timing adaptive walks of the target node v_t are W_t = {W_t^1, ..., W_t^M}. For each neighbor node v_i in these walks, a distance encoding vector a_i is generated by anonymous distance encoding. The purpose of the anonymous distance encoding is to generate a vector that measures the distance between the source node v_s and the target node v_t on the graph structure.
Each timing adaptive walk is a node sequence traced back in time, such as v_s → v_4 → v_2 in FIG. 3b. Thus, for each neighbor node v_i appearing in W_s ∪ W_t, an anonymous distance encoding vector a_i can be generated, namely:

a_i = [c(v_i, W_s), c(v_i, W_t)]

where c(v_i, W) counts the occurrences of the neighbor node v_i at each position of the walks in W (W being W_s or W_t), and can be expressed as:

c(v_i, W)[j] = |{W ∈ W : W[j] = v_i}|
where W[j] denotes the j-th node of the walk W. c(v_i, W_s) represents the distances of the neighbor node v_i at its different positions to the source node v_s, and c(v_i, W_t) represents its distances to the target node v_t. As shown in FIGS. 3a to 3b, the occurrence counts of the neighbor node v_4 at the different positions of the timing adaptive walks of the source node v_s are c(v_4, W_s) = (0, 1, 0)^T, and the occurrence counts of v_4 at the different positions of the timing adaptive walks of the target node v_t are likewise c(v_4, W_t) = (0, 1, 0)^T.
The anonymous distance encoding splices c(v_i, W_s) and c(v_i, W_t) together, generating for each neighbor node v_i ∈ W_s ∪ W_t a vector a_i that measures, with the neighbor node v_i as an intermediate node, the distance between the source node v_s and the target node v_t. The distance encoding process is anonymous, i.e., it does not require node identifiers (IDs), so the method is suitable for inductive scenarios on a time-series network.
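Using the example walks of FIGS. 3b and 3c, the anonymous position-count encoding can be sketched as (position 0 is the source/target node itself; node names are illustrative):

```python
def position_counts(node, walks, length):
    """c(v_i, W): occurrences of `node` at each position of the walks in W."""
    counts = [0] * length
    for walk in walks:
        for j, v in enumerate(walk):
            if v == node:
                counts[j] += 1
    return counts

# Walks from FIGS. 3b/3c.
W_s = [["v_s", "v_2", "v_1"], ["v_s", "v_4", "v_2"], ["v_s", "v_3", "v_t"]]
W_t = [["v_t", "v_3", "v_s"], ["v_t", "v_4", "v_2"], ["v_t", "v_5", "v_6"]]

def anonymous_encoding(node):
    # a_i = [c(v_i, W_s), c(v_i, W_t)] -- no node IDs enter the encoding itself.
    return position_counts(node, W_s, 3) + position_counts(node, W_t, 3)

# v_4 appears once at position 1 in both walk sets: (0,1,0) and (0,1,0).
assert anonymous_encoding("v_4") == [0, 1, 0, 0, 1, 0]
```

Because the encoding depends only on where a node occurs in the walks, not on which node it is, it transfers to nodes never seen in training, which is what makes it inductive.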
Then, the anonymous distance encoding vector a_i of each neighbor node v_i of the source node v_s is input into a multilayer perceptron (MLP) to obtain its distance perception vector s_i, namely:

s_i = MLP(a_i) = W_2(σ(W_1 a_i))
where W_1 and W_2 are trainable parameters in the MLP, and σ denotes the ReLU activation function.
To account for the temporal dynamics of the dynamic graph structure, the time intervals between timing edges are encoded using random Fourier features; according to Bochner's theorem, any positive definite kernel can be approximated in this way. The time encoding can be represented as:

T(Δt) = sqrt(1/d) [cos(ω_1 Δt), sin(ω_1 Δt), ..., cos(ω_d Δt), sin(ω_d Δt)]

where ω_1, ω_2, ..., ω_d are learnable parameters. Given the target timestamp t_st in (v_s, v_t, t_st) and the timestamp t_i of the link (v_{i−1}, v_i, t_i) formed by the neighbor node v_i and the previous node v_{i−1} in the timing adaptive walk, the time encoding vector T_i can be obtained, namely:

T_i = T(t_st − t_i)
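A minimal sketch of this random-Fourier-feature time encoding, assuming the interleaved cos/sin form given above (the frequencies would be learnable in the model; here they are fixed):

```python
import numpy as np

def time_encoding(delta_t, omega):
    """Random-Fourier-feature time encoding T(delta_t) (illustrative form).

    omega holds the d learnable frequencies; the output interleaves
    cos/sin pairs and is scaled by sqrt(1/d).
    """
    omega = np.asarray(omega, dtype=float)
    d = omega.size
    feats = np.empty(2 * d)
    feats[0::2] = np.cos(omega * delta_t)   # cosine components
    feats[1::2] = np.sin(omega * delta_t)   # sine components
    return np.sqrt(1.0 / d) * feats

T = time_encoding(0.0, omega=[0.1, 1.0, 10.0])
# At delta_t = 0, every cosine is 1 and every sine is 0.
assert np.allclose(T[0::2], np.sqrt(1/3))
assert np.allclose(T[1::2], 0.0)
```

Spreading the frequencies over several scales lets the encoding distinguish both short and long time intervals between interactions.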
Then, the distance perception vector s_i and the time encoding vector T_i of the neighbor node v_i are concatenated to obtain its timing distance perception vector h_i, namely:

h_i = [s_i, T_i]
After the timing distance perception vector h_i of each neighbor node v_i in each timing adaptive walk is obtained, each timing adaptive walk is modeled with a BiLSTM, and the last state of the BiLSTM is taken as the state vector of the timing adaptive walk W_s^j of the source node v_s, namely:

h_{j,l}^fwd = LSTM_fwd(h_{j,l−1}^fwd, h_{j,l})
h_{j,l}^bwd = LSTM_bwd(h_{j,l+1}^bwd, h_{j,l})
ĥ_j = [h_{j,m}^fwd, h_{j,m}^bwd]
where ĥ_j is the state vector of the j-th timing adaptive walk W_s^j of the source node v_s (1 ≤ j ≤ M), h_{j,m}^fwd is the hidden state vector at the m-th neighbor node of the walk W_s^j in the forward pass, h_{j,m}^bwd is the hidden state vector at the m-th neighbor node in the backward pass, and h_{j,m} is the timing distance perception vector of the m-th neighbor node of the walk W_s^j.
To incorporate the timing distance information of the different timing adaptive walks, a self-attention network is employed to dynamically determine the importance of the different timing adaptive walks, thereby generating the final representation of the source node v_s:

H_s = [ĥ_1; ĥ_2; ...; ĥ_M]
Ĥ_s = softmax((H_s W_Q)(H_s W_K)^T / sqrt(d)) H_s W_V

where H_s consists of the M timing adaptive walk state vectors of the source node v_s, and W_Q, W_K and W_V are learnable parameters in the self-attention network.
Then, average pooling is adopted to obtain the first structure perception vector h_s^g of the source node v_s on the dynamic graph structure, namely:

h_s^g = Mean(Ĥ_s)
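The self-attention-plus-pooling step can be sketched as standard scaled dot-product attention over the M walk state vectors (W_Q, W_K, W_V are replaced by random stand-ins here; dimensions are assumptions):

```python
import numpy as np

rng = np.random.default_rng(1)
M, d = 3, 4                     # number of walks, state-vector dimension
H = rng.normal(size=(M, d))     # the M timing-adaptive walk state vectors

W_q, W_k, W_v = (rng.normal(size=(d, d)) for _ in range(3))  # learnable (random here)

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

# Scaled dot-product self-attention across the M walks,
# then average pooling down to one structure-perception vector.
A = softmax((H @ W_q) @ (H @ W_k).T / np.sqrt(d))   # (M, M) walk importances
h_struct = (A @ (H @ W_v)).mean(axis=0)             # first structure-perception vector

assert h_struct.shape == (d,)
assert np.allclose(A.sum(axis=1), 1.0)
```

Each row of A sums to one, so the attention output for each walk is a convex mixture of all walk states before the final mean-pooling.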
in addition, the obtaining a second structure sensing vector of the target node on the dynamic graph structure according to the time sequence adaptive walk of the target node includes:
generating anonymous distance coding vectors of neighbor nodes of the target node;
generating distance perception vectors of neighbor nodes of the target node according to the anonymous distance coding vectors of the neighbor nodes of the target node;
generating a time coding vector of a neighbor node of the target node according to the target timestamp and a timestamp of a link formed by the neighbor node and a previous node in the time sequence self-adaptive walking of the target node;
generating a time sequence distance perception vector of the neighbor node of the target node according to the distance perception vector and the time coding vector of the neighbor node of the target node;
generating a time sequence self-adaptive walking state vector of the target node according to the time sequence distance perception vector of the neighbor node of the target node;
and generating a second structure perception vector of the target node on the dynamic graph structure according to the time sequence self-adaptive wandering state vector of the target node.
The second structure perception vector h_t^g of the target node v_t on the dynamic graph structure is generated in the same way as the first structure perception vector h_s^g of the source node v_s, so the details are not repeated here.
And 105, calculating a second distance between the source node and the target node on the dynamic graph structure according to the first structure perception vector and the second structure perception vector.
The first structure perception vector h_s^g of the source node v_s and the second structure perception vector h_t^g of the target node v_t are concatenated and input into a multilayer perceptron to generate a prediction score measuring the distance between the source node v_s and the target node v_t on the dynamic graph structure, i.e., the second distance d^g_st:

d^g_st = MLP([h_s^g, h_t^g]) = W_2(σ(W_1 [h_s^g, h_t^g]))

where W_1 and W_2 are learnable parameters in the MLP, and σ denotes the ReLU activation function.
And 106, predicting the probability of forming a link at the target timestamp by the source node and the target node according to the first distance and the second distance.
As shown in FIG. 2, the first distance d^e_st and the second distance d^g_st are input into a multilayer perceptron (MLP) to generate the final prediction score measuring the probability z_st that the source node v_s and the target node v_t form a link at the target timestamp t_st (a future timestamp), namely:

z_st = MLP([d^e_st, d^g_st]) = W_2(σ(W_1 [d^e_st, d^g_st]))

where W_1 and W_2 are trainable parameters, and σ is the ReLU activation function.
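As an illustrative sketch of this two-channel fusion (treating both distance channels as d-dimensional vectors and using random stand-ins for the trainable MLP weights — both assumptions of ours):

```python
import numpy as np

rng = np.random.default_rng(2)
d = 4
d_embed  = rng.normal(size=d)   # first distance (embedding-space channel)
d_struct = rng.normal(size=d)   # second distance (graph-structure channel)

W1 = rng.normal(size=(8, 2 * d))   # trainable MLP weights (random stand-ins)
W2 = rng.normal(size=(1, 8))

def relu(x): return np.maximum(x, 0)

# z_st = MLP([d_embed, d_struct]): concatenate the two channels and score.
z_st = float((W2 @ relu(W1 @ np.concatenate([d_embed, d_struct])))[0])
prob = 1.0 / (1.0 + np.exp(-z_st))   # sigmoid maps the score to a probability

assert 0.0 < prob < 1.0
```

Concatenating (rather than summing) the two channels lets the MLP weight the embedding-space and graph-structure evidence independently.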
To train the DEAL model, the trainable parameters are learned with the cross-entropy function as the optimization objective:

L = Σ_{(v_s, v_t, t_st) ∈ ε} [ −log σ(z_st) − Q · E_{v_n ∼ P_n(v)} log σ(−z_sn) ]

where (v_s, v_t, t_st) ∈ ε is a timing edge observed in the training set, and σ is the sigmoid function. v_n denotes a negative sample, and z_sn is the prediction score of the negative sample edge (v_s, v_n, t_st), i.e., the edge obtained by replacing the target node v_t in (v_s, v_t, t_st) with v_n. Q is the number of negative samples, P_n(v) is the negative sample distribution over the node space, and the number of negative samples is set to 1. Finally, the proposed DEAL model is trained using the Back-Propagation Through Time (BPTT) algorithm.
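A minimal sketch of this negative-sampling cross-entropy for a single observed edge (function and argument names are ours):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def link_loss(z_pos, z_negs):
    """Negative-sampling cross-entropy for one observed edge.

    z_pos: score of the observed edge (v_s, v_t, t_st);
    z_negs: scores of the Q negative edges (v_s, v_n, t_st).
    """
    return -np.log(sigmoid(z_pos)) - sum(np.log(sigmoid(-z)) for z in z_negs)

# A confidently-right model (high positive score, low negative score)
# incurs a much smaller loss than a confidently-wrong one.
good = link_loss(5.0, [-5.0])
bad  = link_loss(-5.0, [5.0])
assert good < bad
```

With Q = 1, as in the training setup above, `z_negs` holds a single corrupted-edge score per observed edge.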
To test the effectiveness of the DEAL model in the present application, the performance of the DEAL model and the benchmark models is evaluated on three attribute-free sparse time-series network datasets: MathOverflow, AskUbuntu and StackOverflow. The three datasets come from the Math Overflow, Ask Ubuntu and Stack Overflow websites, respectively, and constitute dynamic interaction graphs. Specifically, the timing edges (v_s, v_t, t_st) ∈ ε in each dynamic network cover three interaction types: at timestamp t_st, user v_s answers a question of user v_t, comments on a question of user v_t, or comments on an answer of user v_t.
Data from the last 90 days of MathOverflow, 30 days of AskUbuntu, and 3 days of StackOverflow were used to perform the experiments. The time series links are divided into 70%, 15% and 15% in time sequence on all three data sets for training, verification and testing. In addition, model evaluation is performed in the validation set and the test set using inductive links associated with invisible nodes that are not present in the training set. Table 1 summarizes the statistics of the three datasets after preprocessing.
TABLE 1 (presented as an image in the original document)
The benchmark models compared with the DEAL model are as follows:

1. GraphSAGE generalizes to unseen data by sampling visible nodes and propagating information to learn node representations, where the embedding vectors of visible nodes can be learned during training.

2. GAT uses the same settings as GraphSAGE in the experiments, except that an attention mechanism is used to determine the weights of different neighbors during message passing.

3. GraphSAGE+T considers the temporal dynamics of the network on the basis of GraphSAGE by concatenating the node embedding and the time encoding vector.

4. GAT+T combines the node embedding obtained by GAT with the time encoding vector by concatenation.

5. TGAT proposes a temporal graph attention layer to aggregate temporal and topological information from neighbor nodes, and further designs a time encoding function based on Bochner's theorem to achieve continuous time encoding.

6. TGN develops a temporal graph network that learns dynamic node embeddings through memory networks and graph-based operations, and further introduces an advanced training strategy to learn efficiently from data sequences.

7. CAW-N proposes a random anonymous walk method that generates a relative identity embedding between two nodes by measuring their distance on the dynamic graph structure for inductive link prediction.
The hyperparameters of the DEAL model in the present application are tuned on the validation set to determine the best choice on the different datasets. Specifically, the vector dimension d is adjusted in {60, 80, 100, 120, 140}, the scaling parameter λ is searched in {1e-7, 1e-6, 1e-5, 1e-4}, the trade-off threshold d_threshold is adjusted in {0.1, 0.15, 0.2, 0.25, 0.3}, the number of walks M is adjusted in {2, 4, 8, 16, 32}, and the walk length m is set to 2. In addition, an Adam optimizer with a learning rate of 1e-4 is used to train the learnable parameters in the DEAL model, with the batch size set to 64. Accuracy (ACC), AUC and AP are used to evaluate the performance of the DEAL model and the benchmark models on inductive link prediction.
The performance of the DEAL model and the reference model in the present application is shown in table 2. The best performing benchmark model and the best performing model in each column are underlined and bolded, respectively.
TABLE 2 (presented as an image in the original document)
For the benchmark models, the embedding-based methods GraphSAGE and GAT are first compared with their corresponding time-series versions, GraphSAGE+T and GAT+T. GraphSAGE+T consistently outperforms GraphSAGE on AskUbuntu and StackOverflow while falling behind on MathOverflow, and GAT+T outperforms GAT on AskUbuntu but is worse than GAT in most cases on MathOverflow and StackOverflow. This indicates that simply combining temporal encoding with node embedding does not always preserve prediction performance across scenarios. Furthermore, the performance of TGAT and TGN is unsatisfactory because they depend heavily on edge attribute information, which leads to poor performance in the experimental setting where attribute information is unavailable. In contrast, CAW-N, which learns relative identity vectors of nodes through anonymous distance encoding, performs best among the benchmark models on all metrics across the three datasets.
For the DEAL model in the present application, it can be observed from Table 2 that, on all three datasets, the DEAL model consistently outperforms the competitive benchmark models on the accuracy, AUC and AP indices, verifying the validity of the DEAL model. It is also worth noting that the performance improvement is more pronounced on small-scale datasets than on large ones: the accuracy, AUC and AP of DEAL on MathOverflow improve by 8.30%, 10.53% and 7.04% respectively, the improvement rates on AskUbuntu are 3.88%, 4.93% and 5.02%, and those on StackOverflow are 0.11%, 1.30% and 1.52%. This shows that the algorithm not only beats competitive benchmark models in sparse-data scenarios, but also demonstrates more obvious effectiveness in practical application scenarios with limited data.
To clarify the contribution of the components in the DEAL model to inductive link prediction performance, the proposed DEAL model is compared with three variants: (1) w/o AdaSampler, which replaces the adaptive neighbor sampling in the DEAL model with random sampling; (2) w/o Embedding, which removes the embedding-space-based distance prediction; and (3) w/o Structure, which removes the dynamic-graph-structure-based distance prediction. Table 3 presents the results of DEAL and its variants on the accuracy, AUC and AP indices. First, it can be observed that the DEAL model is superior to the variant w/o AdaSampler under all conditions on the three datasets, which indicates that, with the pre-trained visible-node embeddings, the adaptive sampling module in the present application can dynamically select a suitable sampling strategy between recent sampling and random sampling, capture the common neighbors of the source node and the target node, and improve prediction performance.
TABLE 3 (presented as an image in the original document)
Furthermore, comparing w/o Embedding and w/o Structure with the DEAL model shows that both the embedding-space-based distance and the dynamic-graph-structure-based distance contribute to accurate inductive link prediction on a time-series network. The dynamic-graph-structure-based distance has a larger impact than the embedding-space-based distance, since w/o Embedding always achieves better performance than w/o Structure on the three datasets shown in Table 3. This may be because, in a data-sparse scenario, visible-node embeddings are difficult to learn, so the benefit of explicit distance encoding on the dynamic graph structure is obvious. The performance gap between w/o Embedding and w/o Structure is especially apparent on large datasets. For example, the accuracy, AUC and AP of w/o Embedding on MathOverflow increase by 14.82%, 8.66% and 12.06% respectively, while the corresponding increases on StackOverflow are 22.34%, 10.97% and 14.16%. This shows that the advantage of the dynamic-graph-structure-based distance measurement over the embedding-space-based distance measurement is more obvious on large datasets than on small ones.
To test the sensitivity of the DEAL model to data sparsity, timing links in the training set are randomly deleted, keeping a proportion γ of the original edges, where γ is adjusted in {20%, 40%, 60%, 80%, 100%}. In addition, the best benchmark model CAW-N and the variant model w/o AdaSampler are also included: comparing DEAL with w/o AdaSampler, and w/o AdaSampler with CAW-N, verifies the effectiveness of the adaptive sampling module and the dual-channel distance measurement module under different data sparsity scenarios. The experiments use AskUbuntu and StackOverflow, considering that the small size of the MathOverflow dataset can cause unstable training when a large number of timing links are removed from the training set. The results on the AP index are shown in FIGS. 4a and 4b; the results on the accuracy and AUC indices show similar phenomena.
As can be seen from FIGS. 4a and 4b, first, as the training data increases, the performance of all models on StackOverflow generally improves, except that w/o AdaSampler shows a flat trend at larger scales. In contrast, on AskUbuntu, as the training-data percentage increases, the performance of the DEAL model continues to improve, while the performance of CAW-N and w/o AdaSampler first improves, peaking at around 80%, and then begins to decline. The performance decline of CAW-N and w/o AdaSampler may be because, without the adaptive sampling method of the present application, the common neighbors of the source node and the target node cannot be accurately detected on a small dataset, so the performance is not ideal.
Furthermore, comparing the DEAL model with w/o AdaSampler shows that the DEAL model performs slightly better than w/o AdaSampler at small training-data scales on both datasets. However, when the proportion reaches 80%, the performance gap between the DEAL model and w/o AdaSampler becomes obvious, which shows that adaptive sampling plays a relatively more important role in scenarios with abundant training data. In addition, comparing w/o AdaSampler with CAW-N on StackOverflow, the improvement at small training-data scales is particularly significant, indicating that the performance contribution of the embedding-based distance measurement is relatively larger on a sparse large dataset.
To verify that the adaptive sampling method in the present application is superior to other neighbor sampling methods, the DEAL model is compared with several DEAL variant models: (1) DEAL-Random, which replaces the adaptive sampling in the DEAL model with random sampling; (2) DEAL-Recent, which selects the most recent neighbors using recent sampling, with probability proportional to exp(t_i − t_{i−1}), where t_i and t_{i−1} are respectively the timestamps of a candidate neighbor and of the previous node in the walk during sampling; (3) DEAL-Early, which, contrary to DEAL-Recent, tends to select early nodes in the historical interactions as neighbor nodes, with probability proportional to exp(t_{i−1} − t_i).
Furthermore, for the DEAL model and the variant models DEAL-Recent and DEAL-Early, the scaling parameter λ is adjusted in {1e-7, 1e-6, 1e-5, 1e-4} to observe the effect of λ on model performance. FIGS. 5a, 5b and 5c show the results of DEAL and the three variant models on the AP index on the three datasets.
From FIGS. 5a, 5b and 5c, it can first be observed that as the scaling parameter λ increases, the performance of DEAL-Early continues to decline on all three datasets, since the early neighbors it selects do not account for the temporal dynamics of the timing network. For DEAL-Recent, its performance first increases and then generally decreases as λ increases. This indicates that when λ is small, DEAL-Recent helps capture the most recent neighbor nodes of the source node and the target node, considers the dynamic characteristics of the network to a certain extent, and achieves better performance than DEAL-Random and DEAL-Early. However, after the predicted performance peaks are reached at λ = 1e-5 on MathOverflow and λ = 1e-6 on AskUbuntu and StackOverflow, the time-scale limitation of the sampled neighbors causes the model performance to decrease.
In addition, it can be seen that the performance of DEAL on three data sets is generally superior to that of the variant model, and the effectiveness of adaptive sampling in selecting valuable neighbors for distance measurement between nodes in the application is verified. Specifically, the adaptive sampling module measures the distance between a source node and a target node in a pre-trained embedding space through L2 normalization, dynamically selects a neighbor sampling method from random sampling or recent sampling, and extracts time sequence adaptive walk. That is, adaptive sampling can effectively take into account the dynamics of the network while guaranteeing a time scale.
The hyper-parameters in the DEAL model are tuned to test its sensitivity: the number of pre-training iterations in {6, 8, 10, 12, 14}, the sampling threshold d_threshold in {0.1, 0.15, 0.2, 0.25, 0.3}, the number of walks M in {2, 4, 8, 16, 32}, and the vector dimension d in {60, 80, 100, 120, 140}. The AP results on the three data sets are shown in Figs. 6a to 6d.
Number of pre-training iterations: it can be observed from Fig. 6a that when the number of pre-training iterations increases from 6 to 14, the performance of the DEAL model on StackOverflow generally increases, while the performance on MathOverflow and AskUbuntu decreases slightly. This difference may be due to the different sizes of the three data sets, indicating that pre-training on a large data set requires a relatively larger number of iterations.
Sampling threshold: as can be seen in Fig. 6b, as the adaptive sampling threshold d_threshold increases from 0.1 to 0.3, the performance of DEAL on AskUbuntu and StackOverflow fluctuates with a small overall upward trend, while the performance on MathOverflow first rises and then falls after reaching a peak at a threshold of 0.2. This may be because, on the small data set MathOverflow, node embeddings cannot be learned as well as on the large data sets AskUbuntu and StackOverflow, so the node embeddings are distributed relatively evenly in the embedding space. Therefore, a small threshold is appropriate for the MathOverflow data set.
Number of walks: as can be seen from Fig. 6c, the performance of the DEAL model generally improves on all three data sets as the number of walks M increases. On MathOverflow and AskUbuntu, the rate of improvement keeps decreasing as the number of walks grows; in particular, performance remains stable after the number of walks reaches 8. In contrast, on the StackOverflow data set, performance still improves rapidly at larger numbers of walks. This may be because, on large-scale data sets, relatively more time-sequence adaptive walks are needed to help measure the distance between the source node and the target node in the embedding space and on the dynamic graph structure.
Vector dimension: as can be seen in Fig. 6d, as the vector dimension d increases from 60 to 140, the performance of DEAL on the large data sets AskUbuntu and StackOverflow improves significantly, levelling off after d = 100. This is because a larger vector dimension increases the capacity of the DEAL model to represent distance measurements. In contrast, on MathOverflow, performance first increases from d = 60 to d = 100 and then declines continuously. This is probably because, on small data sets, an overly large vector dimension leads to over-fitting and reduces the generalization ability of the DEAL model on the test set.
Therefore, the adaptive sampling method in the present application uses the pre-trained embeddings of visible nodes to sample neighbor nodes dynamically, improving the probability of capturing common neighbors of the source node and the target node. The dual-channel distance measurement module measures the distance between the source node and the target node simultaneously in the embedding space and on the dynamic graph structure, and uses it to predict future links. Extensive experiments on three time series network data sets show that the accuracy, AUC and AP indexes of the algorithm are all significantly improved.
In summary, in the embodiments of the present application, time-sequence adaptive walks of a source node and a target node in a time series network are extracted respectively, where at least one of the source node and the target node is an invisible node. Distance metric vectors of the source node and the target node in an embedding space are obtained from their time-sequence adaptive walks, so as to calculate a first distance between them in the embedding space; structure perception vectors of the source node and the target node on the dynamic graph structure are likewise obtained from their time-sequence adaptive walks, so as to calculate a second distance between them on the dynamic graph structure. The probability of the source node and the target node forming a link at a target timestamp is then predicted from the first distance and the second distance, which effectively improves the accuracy of inductive link prediction in the time series network.
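The flow summarized above can be sketched end to end as follows. The three stages are passed in as callables because the patent's concrete formulas are not reproduced here; all names, the choice of Euclidean distance, and the combiner interface are illustrative assumptions.

```python
import numpy as np

def predict_link(src_walks, dst_walks, embed_fn, struct_fn, combine_fn):
    # Stage 1: distance-metric vectors of both nodes in the embedding space.
    g_s, g_t = embed_fn(src_walks), embed_fn(dst_walks)
    # Stage 2: structure-perception vectors of both nodes on the dynamic graph.
    h_s, h_t = struct_fn(src_walks), struct_fn(dst_walks)
    d1 = float(np.linalg.norm(g_s - g_t))  # first distance (embedding space)
    d2 = float(np.linalg.norm(h_s - h_t))  # second distance (graph structure)
    # Stage 3: turn the two distances into a link probability.
    return combine_fn(d1, d2)
```

Identical inputs yield zero distances in both channels and hence the combiner's maximum probability, which is the intended behaviour: smaller distances mean a more likely future link.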
Fig. 7 is a schematic diagram illustrating a specific hardware structure of an electronic device according to this embodiment, where the electronic device may include: a processor 1010, a memory 1020, an input/output interface 1030, a communication interface 1040, and a bus 1050. Wherein the processor 1010, memory 1020, input/output interface 1030, and communication interface 1040 are communicatively coupled to each other within the device via bus 1050.
The processor 1010 may be implemented by a general-purpose CPU (Central Processing Unit), a microprocessor, an Application Specific Integrated Circuit (ASIC), or one or more integrated circuits, and is configured to execute related programs to implement the technical solutions provided in the embodiments of the present disclosure.
The Memory 1020 may be implemented in the form of a ROM (Read Only Memory), a RAM (Random Access Memory), a static storage device, a dynamic storage device, or the like. The memory 1020 may store an operating system and other application programs, and when the technical solution provided by the embodiments of the present specification is implemented by software or firmware, the relevant program codes are stored in the memory 1020 and called to be executed by the processor 1010.
The input/output interface 1030 is used for connecting an input/output module to input and output information. The i/o module may be configured as a component in a device (not shown) or may be external to the device to provide a corresponding function. The input devices may include a keyboard, a mouse, a touch screen, a microphone, various sensors, etc., and the output devices may include a display, a speaker, a vibrator, an indicator light, etc.
The communication interface 1040 is used for connecting a communication module (not shown in the drawings) to implement communication interaction between the present apparatus and other apparatuses. The communication module can realize communication in a wired mode (for example, USB, network cable, etc.), and can also realize communication in a wireless mode (for example, mobile network, WIFI, bluetooth, etc.).
The bus 1050 includes a path to transfer information between various components of the device, such as the processor 1010, memory 1020, input/output interface 1030, and communication interface 1040.
It should be noted that although the above-mentioned device only shows the processor 1010, the memory 1020, the input/output interface 1030, the communication interface 1040 and the bus 1050, in a specific implementation, the device may also include other components necessary for normal operation. In addition, those skilled in the art will appreciate that the above-described apparatus may also include only those components necessary to implement the embodiments of the present description, and not necessarily all of the components shown in the figures.
Embodiments of the present invention provide a non-transitory computer readable storage medium, in which a plurality of instructions are stored, and the instructions can be loaded by a processor to perform steps of any one of the methods for link prediction in a time series network provided by the embodiments of the present invention.
The non-transitory computer readable media of the present embodiments, including permanent and non-permanent, removable and non-removable media, may implement information storage by any method or technology. The information may be computer readable instructions, data structures, program modules, or other data. Examples of computer storage media include, but are not limited to, phase change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read only memory (ROM), electrically erasable programmable read only memory (EEPROM), flash memory or other memory technology, compact disc read only memory (CD-ROM), digital versatile discs (DVD) or other optical storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information that can be accessed by a computing device.
Those of ordinary skill in the art will understand that the discussion of any embodiment above is meant to be exemplary only, and is not intended to imply that the scope of the disclosure, including the claims, is limited to these examples; within the idea of the invention, features in the above embodiments or in different embodiments may also be combined, steps may be implemented in any order, and many other variations of the different aspects of the invention as described above exist, which are not provided in detail for the sake of brevity.
In addition, well known power/ground connections to Integrated Circuit (IC) chips and other components may or may not be shown within the provided figures for simplicity of illustration and discussion, and so as not to obscure the invention. Furthermore, devices may be shown in block diagram form in order to avoid obscuring the invention, and also in view of the fact that specifics with respect to implementation of such block diagram devices are highly dependent upon the platform within which the present invention is to be implemented (i.e., specifics should be well within purview of one skilled in the art). Where specific details (e.g., circuits) are set forth in order to describe example embodiments of the invention, it should be apparent to one skilled in the art that the invention can be practiced without, or with variation of, these specific details. Accordingly, the description is to be regarded as illustrative instead of restrictive.
While the present invention has been described in conjunction with specific embodiments thereof, many alternatives, modifications, and variations of these embodiments will be apparent to those of ordinary skill in the art in light of the foregoing description. For example, other memory architectures, such as Dynamic RAM (DRAM), may use the discussed embodiments.
The embodiments of the invention are intended to embrace all such alternatives, modifications and variations as fall within the broad scope of the appended claims. Therefore, any omissions, modifications, substitutions, improvements and the like made without departing from the spirit and principles of the invention are intended to be included within the scope of the invention.

Claims (10)

1. A method for link prediction in a time series network, comprising:
respectively extracting time sequence self-adaptive walks of a source node and a target node in a time series network; wherein neighbor nodes in the time sequence self-adaptive walks are visible nodes, the visible nodes being nodes appearing in a preset training set, and at least one of the source node and the target node is an invisible node, the invisible nodes being nodes not appearing in the training set;
acquiring a first distance measurement vector of the source node in an embedding space according to the time sequence self-adaptive walking of the source node, and acquiring a second distance measurement vector of the target node in the embedding space according to the time sequence self-adaptive walking of the target node;
calculating a first distance between the source node and the target node in an embedding space according to the first distance metric vector and the second distance metric vector;
acquiring a first structure perception vector of the source node on a dynamic graph structure according to the time sequence self-adaptive walking of the source node, and acquiring a second structure perception vector of the target node on the dynamic graph structure according to the time sequence self-adaptive walking of the target node;
calculating a second distance between the source node and the target node on the structure of the dynamic graph according to the first structure perception vector and the second structure perception vector;
and predicting the probability of forming a link at a target timestamp by the source node and the target node according to the first distance and the second distance.
2. The method of link prediction in a time series network as claimed in claim 1, wherein said respectively extracting time sequence self-adaptive walks of a source node and a target node in the time series network comprises:
respectively acquiring a first embedded vector of the source node and a second embedded vector of the target node;
calculating an embedding distance between the source node and the target node according to the first embedding vector and the second embedding vector;
if the embedding distance is larger than a preset distance threshold, sampling the neighbor node nearest to the source node to extract the time sequence self-adaptive walk of the source node, and sampling the neighbor node nearest to the target node to extract the time sequence self-adaptive walk of the target node;
and if the embedding distance is smaller than the distance threshold, randomly sampling neighbor nodes of the source node to extract the time sequence self-adaptive walk of the source node, and randomly sampling neighbor nodes of the target node to extract the time sequence self-adaptive walk of the target node.
3. The method of claim 2, wherein said obtaining a first embedded vector for said source node comprises:
generating an initial embedding vector of the source node;
if the source node is an invisible node, taking the initial embedded vector of the source node as a first embedded vector of the source node;
if the source node is a visible node, extracting the time sequence random walk of the source node; generating an initial embedding vector of the node in the time-series random walk; and determining a first embedding vector of the source node according to the initial embedding vector of the node in the time sequence random walk.
4. The method of claim 1, wherein said obtaining a first distance metric vector of said source node in embedding space based on said source node's timing adaptive walk comprises:
respectively generating initial embedded vectors of the source node and neighbor nodes in the time sequence self-adaptive walking of the source node;
calculating the weight of the neighbor node of the source node according to the initial embedded vectors of the source node and the neighbor node of the source node;
and calculating a first distance metric vector of the source node in an embedding space according to the initial embedding vector of the source node, the initial embedding vector of the neighbor node of the source node and the weight.
5. The method of claim 4, wherein the weights of the neighbor nodes of the source node are calculated by the formulas:
Figure FDA0003791216960000021
Figure FDA0003791216960000022
wherein the source node has M time sequence self-adaptive walks, each walk having m neighbor nodes;
Figure FDA0003791216960000023
is the weight of the l-th neighbor node in the τ-th time sequence self-adaptive walk of the source node, 1 ≤ τ ≤ M, 1 ≤ l ≤ m;
Figure FDA0003791216960000024
is the initial embedding vector of the l-th neighbor node in the τ-th time sequence self-adaptive walk of the source node, and v_s is the initial embedding vector of the source node;
Figure FDA0003791216960000025
and
Figure FDA0003791216960000026
are training parameters;
Figure FDA0003791216960000027
is the i-th neighbor node in the τ-th time sequence self-adaptive walk of the source node, 1 ≤ i ≤ m;
Figure FDA0003791216960000028
is the set formed by the i-th neighbor nodes of the time sequence self-adaptive walks of the source node;
the calculation formula of the first distance metric vector of the source node in the embedding space is as follows:
Figure FDA0003791216960000029
Figure FDA00037912169600000210
wherein
Figure FDA00037912169600000211
is the first distance metric vector of the source node in the embedding space,
Figure FDA00037912169600000212
is a learning parameter, and σ is an activation function;
the calculation formula of the first distance is as follows:
Figure FDA00037912169600000213
wherein
Figure FDA00037912169600000214
is the first distance, and
Figure FDA00037912169600000215
is the second distance metric vector of the target node in the embedding space.
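Since the formulas of claim 5 appear only as images in this publication, the following is a hedged, attention-style sketch of what the claim text describes: scoring each neighbor of a walk against the source embedding, softmaxing the scores into weights, aggregating through an activation, and taking a distance between the resulting vectors. Every functional detail (the concatenation-based scoring, tanh as the activation σ, Euclidean distance) is an assumption, not the patented equations.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def distance_metric_vector(v_s, neighbor_embs, w_a, W_out):
    # Score each neighbor embedding against the source embedding with a
    # trainable vector w_a, softmax the scores into neighbor weights,
    # aggregate, and pass through an activation (tanh here, by assumption).
    scores = np.array([w_a @ np.concatenate([v_s, v_l]) for v_l in neighbor_embs])
    alphas = softmax(scores)                                 # neighbor weights
    agg = (alphas[:, None] * np.asarray(neighbor_embs)).sum(axis=0)
    return np.tanh(W_out @ agg)

def first_distance(g_s, g_t):
    # Distance between the two distance-metric vectors in the embedding space.
    return float(np.linalg.norm(g_s - g_t))
```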
6. The method of claim 1, wherein the obtaining a first structure-aware vector of the source node on a dynamic graph structure according to the time-adaptive walking of the source node comprises:
generating anonymous distance coding vectors of neighbor nodes of the source node;
generating a distance sensing vector of a neighbor node of the source node according to the anonymous distance coding vector of the neighbor node of the source node;
generating a time coding vector of a neighbor node of the source node according to the target timestamp and a timestamp of a link formed by the neighbor node and a previous node in the time sequence self-adaptive walking of the source node;
generating a time sequence distance perception vector of the neighbor node of the source node according to the distance perception vector and the time coding vector of the neighbor node of the source node;
generating a time sequence self-adaptive walk state vector of the source node according to the time sequence distance perception vectors of the neighbor nodes of the source node;
and generating a first structure perception vector of the source node on the dynamic graph structure according to the time sequence self-adaptive walk state vector of the source node.
7. The method of claim 6, wherein the distance perception vector of a neighbor node of the source node is generated by the formula:
s_i = MLP(a_i) = W_2(σ(W_1 a_i));
wherein the source node has M time sequence self-adaptive walks, each walk having m neighbor nodes; s_i is the distance perception vector of the i-th neighbor node of the source node, 1 ≤ i ≤ m; a_i is the anonymous distance coding vector of the i-th neighbor node of the source node; W_1 and W_2 are training parameters; and σ is an activation function;
the generation formula of the time coding vector of a neighbor node of the source node is as follows:
Figure FDA0003791216960000031
Figure FDA0003791216960000032
wherein T_i is the time coding vector of the i-th neighbor node of the source node, t_st is the target timestamp, t_i is the timestamp at which the i-th neighbor node in the time sequence self-adaptive walk of the source node forms a link with the previous node, and ω_1, ω_2, …, ω_d are learning parameters;
the generation formula of the time sequence distance perception vector of a neighbor node of the source node is as follows:
h_i = [s_i, T_i];
wherein h_i is the time sequence distance perception vector of the i-th neighbor node of the source node;
the generation formula of the time sequence self-adaptive walk state vector of the source node is as follows:
Figure FDA0003791216960000033
Figure FDA0003791216960000034
Figure FDA0003791216960000035
wherein
Figure FDA0003791216960000036
is the j-th time sequence self-adaptive walk state vector of the source node, 1 ≤ j ≤ M;
Figure FDA0003791216960000037
is the hidden state vector in the forward pass for the m-th neighbor node of the j-th time sequence self-adaptive walk of the source node;
Figure FDA0003791216960000041
is the hidden state vector in the backward pass for the m-th neighbor node of the j-th time sequence self-adaptive walk of the source node; and h_{j,m} is the time sequence distance perception vector of the m-th neighbor node of the j-th time sequence self-adaptive walk of the source node;
the generation formula of the first structure perception vector of the source node on the dynamic graph structure is as follows:
Figure FDA0003791216960000042
Figure FDA0003791216960000043
Figure FDA0003791216960000044
wherein
Figure FDA0003791216960000045
is the first structure perception vector of the source node on the dynamic graph structure, and
Figure FDA0003791216960000046
and
Figure FDA0003791216960000047
are learning parameters;
the calculation formula of the second distance is as follows:
Figure FDA0003791216960000048
wherein
Figure FDA0003791216960000049
is the second distance,
Figure FDA00037912169600000410
is the second structure perception vector of the target node on the dynamic graph structure, and W_3 and W_4 are learning parameters.
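The time coding in claim 7 depends on the gap between the target timestamp t_st and each neighbor's link timestamp t_i, with learnable frequencies ω_1..ω_d; the concrete formula is an image in this publication. A common choice for such encodings, and the assumption made in this sketch, is a Bochner-style cosine feature map, followed by the concatenation h_i = [s_i, T_i] stated in the claim:

```python
import numpy as np

def time_encoding(t_target, t_i, omegas):
    # Cosine feature map over the time gap (t_st - t_i) with learnable
    # frequencies omegas = (ω_1, ..., ω_d); the 1/sqrt(d) scaling is an
    # assumption, not taken from the patent.
    dt = t_target - t_i
    return np.cos(omegas * dt) / np.sqrt(len(omegas))

def timing_distance_vector(s_i, T_i):
    # h_i = [s_i, T_i]: concatenation of the distance perception vector and
    # the time coding vector, as stated in the claim.
    return np.concatenate([s_i, T_i])
```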
8. The method of claim 1, wherein the probability is calculated by the formula:
Figure FDA00037912169600000411
wherein z_st is the probability,
Figure FDA00037912169600000412
is the first distance,
Figure FDA00037912169600000413
is the second distance, W_5 and W_6 are training parameters, and σ is an activation function.
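The probability formula of claim 8 is also an image in this publication; the claim text only names the two distances, the training parameters W_5 and W_6, and an activation σ. The sketch below therefore assumes a weighted sum of the two distances, negated (so that smaller distances give higher probability) and passed through a sigmoid:

```python
import math

def link_probability(d_embed, d_struct, w5=1.0, w6=1.0):
    # Sigmoid of the negated weighted distance sum: z_st in (0, 1), with
    # z_st -> 1 as both distances shrink.  The exact patented form may differ.
    return 1.0 / (1.0 + math.exp(w5 * d_embed + w6 * d_struct))
```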
9. An electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the processor executes the program to implement a method of link prediction in a time series network as claimed in any one of claims 1 to 8.
10. A non-transitory computer-readable storage medium storing computer instructions for causing a computer to perform the method of link prediction in a time series network according to any one of claims 1 to 8.
CN202210959140.6A 2022-08-10 2022-08-10 Link prediction method in time series network, electronic device and storage medium Pending CN115329146A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210959140.6A CN115329146A (en) 2022-08-10 2022-08-10 Link prediction method in time series network, electronic device and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210959140.6A CN115329146A (en) 2022-08-10 2022-08-10 Link prediction method in time series network, electronic device and storage medium

Publications (1)

Publication Number Publication Date
CN115329146A true CN115329146A (en) 2022-11-11

Family

ID=83922664

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210959140.6A Pending CN115329146A (en) 2022-08-10 2022-08-10 Link prediction method in time series network, electronic device and storage medium

Country Status (1)

Country Link
CN (1) CN115329146A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115550240A (en) * 2022-11-24 2022-12-30 云账户技术(天津)有限公司 Network routing method, system, electronic device and readable storage medium
CN115550240B (en) * 2022-11-24 2023-03-10 云账户技术(天津)有限公司 Network routing method, system, electronic device and readable storage medium

Similar Documents

Publication Publication Date Title
CN113222700B (en) Session-based recommendation method and device
Li et al. A deep learning approach to link prediction in dynamic networks
US10387768B2 (en) Enhanced restricted boltzmann machine with prognosibility regularization for prognostics and health assessment
US20230394368A1 (en) Collecting observations for machine learning
CN110659723B (en) Data processing method and device based on artificial intelligence, medium and electronic equipment
US20170116530A1 (en) Generating prediction models in accordance with any specific data sets
Xie et al. A multimodal variational encoder-decoder framework for micro-video popularity prediction
KR102203253B1 (en) Rating augmentation and item recommendation method and system based on generative adversarial networks
CN114297036A (en) Data processing method and device, electronic equipment and readable storage medium
Li et al. Lrbm: A restricted boltzmann machine based approach for representation learning on linked data
CN110633859B (en) Hydrologic sequence prediction method integrated by two-stage decomposition
CN116151485B (en) Method and system for predicting inverse facts and evaluating effects
CN115329146A (en) Link prediction method in time series network, electronic device and storage medium
CN115471271A (en) Method and device for attributing advertisements, computer equipment and readable storage medium
CN115296984A (en) Method, device, equipment and storage medium for detecting abnormal network nodes
CN115062779A (en) Event prediction method and device based on dynamic knowledge graph
CN111161238A (en) Image quality evaluation method and device, electronic device, and storage medium
Ardimansyah et al. Preprocessing matrix factorization for solving data sparsity on memory-based collaborative filtering
US20210216845A1 (en) Synthetic clickstream testing using a neural network
Yang et al. Gated graph convolutional network based on spatio-temporal semi-variogram for link prediction in dynamic complex network
KR102192461B1 (en) Apparatus and method for learning neural network capable of modeling uncerrainty
CN116361643A (en) Model training method for realizing object recommendation, object recommendation method and related device
CN117010480A (en) Model training method, device, equipment, storage medium and program product
Valderrama et al. Integrating machine learning with pharmacokinetic models: Benefits of scientific machine learning in adding neural networks components to existing PK models
CN115482500A (en) Crowd counting method and device based on confidence probability

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination