CN115017368A - Dynamic graph representation learning method based on self-supervision learning


Info

Publication number: CN115017368A (application CN202210455958.4A)
Authority: CN (China)
Prior art keywords: time, node, space, graph, subgraph
Legal status: Pending
Other languages: Chinese (zh)
Inventors: 鲍鹏, 李家年
Original and current assignee: Beijing Jiaotong University
Application filed by Beijing Jiaotong University; priority date and filing date 2022-04-28; priority to CN202210455958.4A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/901 Indexing; Data structures therefor; Storage structures
    • G06F16/9024 Graphs; Linked lists
    • G06F16/95 Retrieval from the web
    • G06F16/953 Querying, e.g. by the use of web search engines
    • G06F16/9537 Spatial or temporal dependent retrieval, e.g. spatiotemporal queries

Abstract

The invention relates to a dynamic graph representation learning method based on self-supervised learning, which comprises the following steps: sampling the dynamic graph to obtain a spatio-temporal subgraph, a non-time-sequence subgraph and a non-spatial subgraph, respectively; using a spatio-temporal weight encoder to obtain node-level and graph-level representations of the three subgraphs; designing spatio-temporal contrastive learning and defining a total loss function from the temporal and spatial perspectives; and completing the training of the spatio-temporal weight encoder through the total loss function, finally feeding all nodes of the dynamic graph into the trained encoder to obtain node representations, which can be applied to tasks such as classification, recommendation and link prediction in dynamic graph application scenarios. The invention fuses the temporal and spatial information of the dynamic graph and enhances the performance of dynamic graph representation methods; it offers good interpretability, verifiability, characterization performance and transferability. The invention can also be applied to real-world e-commerce networks to predict the commodities a user may purchase.

Description

Dynamic graph representation learning method based on self-supervision learning
Technical Field
The invention belongs to the field of graph representation learning, and particularly relates to a dynamic graph representation learning method based on self-supervised learning.
Background
With the explosive growth of graph-structured data, more and more researchers are engaged in graph representation learning. However, most existing supervised or semi-supervised graph representation learning methods rely heavily on data labels. On the one hand, collecting data labels costs considerable manpower and material resources. On the other hand, models trained with data labels generalize poorly and are less robust. These problems have hindered current research, and in recent years many studies on self-supervised graph representation learning have emerged to address them.
The idea of self-supervised graph representation learning is to design a pretext task that relieves the model's dependency on data labels and thereby improves its generalization ability and robustness. For example, the DGI model, the first self-supervised graph representation learning approach, generates node representations by maximizing the mutual information between node representations and a global graph representation. MVGRL is a multi-view self-supervised graph representation model whose goal is to maximize the agreement between node representations in the original graph and those in a diffusion graph. The Sub-Con model maximizes the mutual information between node representations and subgraph representations to learn node representations. Unlike previous work, the GRACE model focuses primarily on node-level consistency: it generates a new view through a perturbation function and learns node representations by maximizing the agreement between representations of the same node under different views.
However, existing self-supervised graph representation learning models ignore the evolution process of the graph. Specifically, they discard temporal information on the graph to simplify computation, which prevents them from capturing the graph's evolution and introduces errors. Our investigation finds that research on learning dynamic graph representations in a self-supervised manner is still very scarce.
The method disclosed by the invention integrates the idea of self-supervised learning, i.e., it studies a dynamic graph representation learning method based on self-supervised learning. The overall idea is to define a contrastive learning task that takes both time and space into account; by completing this task, the model captures the temporal and spatial characteristics of the dynamic graph. Specifically, the method first defines three subgraphs: a spatio-temporal subgraph, a non-time-sequence subgraph and a non-spatial subgraph. Furthermore, the invention includes a spatio-temporal weight encoder that aggregates the temporal and spatial information of the subgraphs. Finally, the method is applied to an e-commerce scenario to predict the commodities a user may purchase.
Disclosure of Invention
The invention provides a dynamic graph representation learning method based on self-supervised learning. First, the method takes an arbitrary node in the graph as the center node and performs a spatio-temporal random walk and an ordinary random walk; the sampled subgraphs are defined as the spatio-temporal subgraph and the non-time-sequence subgraph, respectively. In addition, a spatio-temporal random walk is performed with any other node as the center node, and the sampled subgraph is called the non-spatial subgraph. Second, a spatio-temporal weight encoder captures the temporal and spatial information in the graph to obtain node characterizations and subgraph characterizations of the corresponding nodes. Finally, node representations are learned by maximizing the mutual information between node characterizations and subgraph characterizations through contrastive learning. The specific technical scheme is as follows:
A dynamic graph representation learning method based on self-supervised learning comprises the following steps:
S1, taking an arbitrary node in the graph as the center node and sampling to obtain a spatio-temporal subgraph, a non-time-sequence subgraph and a non-spatial subgraph, respectively;
S2, capturing the temporal and spatial information in the graph with a spatio-temporal weight encoder to obtain node characterizations of the corresponding nodes and graph-level characterizations of the subgraphs;
S3, learning node representations by maximizing, through contrastive learning, the mutual information between node characterizations and the graph-level characterizations of the subgraphs, and defining a total loss function from the temporal and spatial perspectives;
S4, completing the training of the spatio-temporal weight encoder through the total loss function, and finally feeding all nodes of the dynamic graph into the trained spatio-temporal weight encoder to obtain the node representations of the dynamic graph.
On the basis of the above technical solution, the specific steps of step S1 are as follows. Given a dynamic graph G = (V, E, τ), where V represents the set of nodes, E is the set of edges, an edge e = (v, w, t) ∈ E_t represents the edge generated by node v and node w at time t, and τ: E → ℝ⁺ is a function mapping each edge to a timestamp, the invention aims to obtain from the dynamic graph G a spatio-temporal subgraph G_v centered on node v.
For the spatio-temporal subgraph G_v, sampling is mainly divided into the following two steps: initial edge selection and spatio-temporal random walk.
In the initial edge selection step, sampling is performed with node v as the center node. Sampling can be either unbiased or biased.
For unbiased sampling, define the set of all edges around node v as Ψ_v; the initial edge is sampled from a uniform distribution. The probability that an edge e = (v, w, t) ∈ Ψ_v is selected is defined as:

P(e) = \frac{1}{|\Psi_v|}

For biased sampling, it is assumed that the edges around node v are selected with different probabilities: biased sampling assigns each edge a sampling probability that depends on when it occurred. Specifically, the probability that an edge e = (v, w, t) ∈ Ψ_v is selected is defined as:

P(e) = \frac{\exp(\tau(e) - t_{\min})}{\sum_{e' \in \Psi_v} \exp(\tau(e') - t_{\min})}

where e' ∈ Ψ_v denotes any edge connected to the center node v, τ(e') is the time corresponding to edge e', τ(e) is the time corresponding to edge e, and t_min is the minimum time associated with any edge in the dynamic graph. Biased sampling thus encourages the sampling of later-occurring edges.
The main task of the spatio-temporal random walk step is to select the next node, which is chosen from the spatio-temporal neighbor set Γ. The set of spatio-temporal neighbors is defined as:

\Gamma_t(v) = \{(w, t') \mid e = (v, w, t') \in E \wedge t' \geq t\}

where Γ_t(v) represents the set of spatio-temporal neighbors of node v at time t, w represents a node adjacent to node v, and t' represents the time on the edge between node w and node v.
Similarly, the invention has two spatio-temporal random walk strategies, defined as the biased spatio-temporal random walk and the unbiased spatio-temporal random walk.
For the unbiased spatio-temporal random walk, given an arbitrary edge e = (u, v, t), the probability that a spatio-temporal neighbor w ∈ Γ_t(v) of node v at time t is selected is:

P(w) = \frac{1}{|\Gamma_t(v)|}
for biased space-time random walk, the space-time neighbor w epsilon Γ of the node v at the time t t (v) The probability of being selected is:
Figure BDA0003620467090000051
wherein w ' represents any space-time neighbor of the node v at the time t, δ (w) is the time corresponding to the node w, and δ (w ') is the time corresponding to the node w '.
Furthermore, we borrow the idea of the ordinary random walk, i.e., sampling without taking the time factor into account. A random walk with node v as the center node samples the non-time-sequence subgraph \tilde{G}_v. A spatio-temporal random walk with node o as the center node (where node o is a node different from node v) yields the non-spatial subgraph G_o.
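The three subgraphs can be drawn with a single walk routine, as sketched below in Python; the adjacency-dict representation, the walk length, and the helper names are illustrative assumptions (toggling the time constraint yields the non-time-sequence variant):

```python
import math
import random

def spatiotemporal_walk(adj, start, t_start, length, respect_time=True, biased=True):
    """Walk from `start` beginning at time t_start.

    adj: dict mapping node -> list of (neighbor, t) pairs.
    respect_time=False degenerates into an ordinary random walk,
    which is how the non-time-sequence subgraph is sampled.
    """
    walk, cur, t = [start], start, t_start
    for _ in range(length):
        if respect_time:
            # Gamma_t(cur): neighbors reachable without violating time order
            nbrs = [(w, tw) for (w, tw) in adj.get(cur, []) if tw >= t]
        else:
            nbrs = list(adj.get(cur, []))
        if not nbrs:
            break
        if biased and respect_time:
            # P(w) proportional to exp(delta(w) - t); shift for numerical safety
            m = max(tw for (_, tw) in nbrs)
            weights = [math.exp(tw - m) for (_, tw) in nbrs]
            cur, t = random.choices(nbrs, weights=weights, k=1)[0]
        else:
            cur, t = random.choice(nbrs)
        walk.append(cur)
    return walk

# spatio-temporal subgraph around v, non-time-sequence subgraph around v,
# and non-spatial subgraph around a different node o:
# g_v     = spatiotemporal_walk(adj, v, t0, length=20)
# g_tilde = spatiotemporal_walk(adj, v, t0, length=20, respect_time=False)
# g_o     = spatiotemporal_walk(adj, o, t0, length=20)
```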
On the basis of the above technical solution, the specific steps of step S2 are as follows: characterizations of all nodes are obtained with a spatio-temporal weight encoder.
Specifically, after the spatio-temporal subgraph G_v is obtained, the spatio-temporal weight encoder outputs a representation of each node in it. The main goal of the spatio-temporal weight encoder is to avoid the so-called representation staleness problem. The spatio-temporal weight encoder comprises a temporal aggregator and a spatial aggregator.
For the temporal aggregator, when an event involving the node itself occurs, the temporal characterization z_v(t) of node v at time t is expressed as:

z_v(t) = \mathrm{LSTM}(f_{(v,w)}(t),\ h_v(t^-))

where LSTM denotes the LSTM model, f_{(v,w)}(t) represents the interaction feature between nodes v and w at time t, and h_v(t^-) is the characterization of node v from its most recent update.
The spatial aggregator incorporates the characterization information of spatio-temporal neighbors, which are assigned different weights according to the times at which they occur. Specifically, the spatial characterization \tilde{z}_v(t) of node v at time t is defined as:

\tilde{z}_v(t) = \sum_{w \in \mathcal{N}(v)} W_1 \big( z_w(t) \,\|\, \phi(t - t_w) \big)

where w ∈ N(v) denotes a node w connected to node v, φ(t − t_w) denotes the time encoding of the interval between the current time t and the time t_w, z_w(t) is the temporal characterization of node w at time t, ∥ denotes concatenation, and W_1 is a learnable parameter.
Finally, we aggregate the temporal characterization z_v(t) and the spatial characterization \tilde{z}_v(t) to obtain the node characterization h_v(t) of node v at time t:

h_v(t) = W_2 \big( z_v(t) \,\|\, \tilde{z}_v(t) \big)

where W_2 is a learnable parameter.
Notably, the graph-level characterization s_v of the spatio-temporal subgraph is obtained by a readout function. Similarly, the graph-level characterization \tilde{s}_v of the non-time-sequence subgraph \tilde{G}_v and the graph-level characterization s_o of the non-spatial subgraph G_o are obtained through the spatio-temporal weight encoder.
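A compact PyTorch sketch of the spatio-temporal weight encoder is given below; the tensor dimensions, the cosine form of the time encoding φ, and the mean readout are assumptions rather than choices fixed by the patent:

```python
import torch
import torch.nn as nn

class SpatioTemporalWeightEncoder(nn.Module):
    def __init__(self, feat_dim, hid_dim):
        super().__init__()
        self.lstm = nn.LSTMCell(feat_dim, hid_dim)   # temporal aggregator
        self.time_enc = nn.Linear(1, hid_dim)        # phi: learned time encoding
        self.W1 = nn.Linear(2 * hid_dim, hid_dim)    # spatial aggregator weight
        self.W2 = nn.Linear(2 * hid_dim, hid_dim)    # fuses z_v(t) and z~_v(t)

    def temporal(self, f_vw, h_prev, c_prev):
        # z_v(t) = LSTM(f_(v,w)(t), h_v(t^-)): update on an event at node v
        return self.lstm(f_vw, (h_prev, c_prev))

    def spatial(self, z_nbrs, dt_nbrs):
        # z~_v(t) = sum_w W1([z_w(t) || phi(t - t_w)])
        phi = torch.cos(self.time_enc(dt_nbrs.unsqueeze(-1)))
        return self.W1(torch.cat([z_nbrs, phi], dim=-1)).sum(dim=0)

    def forward(self, z_v, z_nbrs, dt_nbrs):
        # h_v(t) = W2([z_v(t) || z~_v(t)])
        z_sp = self.spatial(z_nbrs, dt_nbrs)
        return self.W2(torch.cat([z_v, z_sp], dim=-1))

# a simple mean readout gives the graph-level characterization of a subgraph:
# s_v = torch.stack(node_characterizations).mean(dim=0)
```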
The specific steps of step S3 are as follows: the spatio-temporal weight encoder is trained by contrasting positive and negative samples. The contrastive learning objective is to distinguish the spatio-temporal subgraph, the non-time-sequence subgraph and the non-spatial subgraph. For positive samples, the node characterization of the spatio-temporal subgraph is paired with the graph-level characterization of the spatio-temporal subgraph.
Second, the invention pairs the graph-level characterization \tilde{s}_v of the non-time-sequence subgraph \tilde{G}_v with the node characterization h_v of the spatio-temporal subgraph G_v to form negative samples, called time negative examples. The loss function for time negative examples is defined as:

\mathcal{L}_{time} = \mathbb{E}\big[ \max\big(0,\ \sigma(h_v^\top \tilde{s}_v) - \sigma(h_v^\top s_v) + \varphi\big) \big]

where σ denotes the sigmoid function, φ is the margin value, \mathbb{E} denotes the expectation, s_v is the graph-level characterization of the spatio-temporal subgraph, h_v is the node characterization of node v in the spatio-temporal subgraph, and \tilde{s}_v is the graph-level characterization of the non-time-sequence subgraph.
Furthermore, the graph-level characterization s_o of the non-spatial subgraph G_o and the node characterization h_v of the spatio-temporal subgraph G_v form spatial negative examples. The loss function for spatial negative examples is defined as:

\mathcal{L}_{space} = \mathbb{E}\big[ \max\big(0,\ \sigma(h_v^\top s_o) - \sigma(h_v^\top s_v) + \varphi\big) \big]

where σ is the sigmoid function, φ is the margin value, \mathbb{E} denotes the expectation, s_v is the graph-level characterization of the spatio-temporal subgraph, h_v is the node characterization of node v in the spatio-temporal subgraph, and s_o is the graph-level characterization of the non-spatial subgraph.
The total loss function of the spatio-temporal weight encoder is therefore:

\mathcal{L} = \lambda \mathcal{L}_{time} + (1 - \lambda)\, \mathcal{L}_{space}

where λ is the balance parameter of the total loss function.
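The two margin losses and the total loss translate directly into code; the sketch below assumes a dot-product score between node and graph-level characterizations, which the patent does not specify:

```python
import torch
import torch.nn.functional as F

def margin_loss(h_v, s_pos, s_neg, phi=0.5):
    """E[max(0, sigma(h·s_neg) - sigma(h·s_pos) + phi)], averaged over nodes."""
    pos = torch.sigmoid((h_v * s_pos).sum(-1))
    neg = torch.sigmoid((h_v * s_neg).sum(-1))
    return F.relu(neg - pos + phi).mean()

def total_loss(h_v, s_v, s_tilde_v, s_o, lam=0.5, phi=0.5):
    l_time  = margin_loss(h_v, s_v, s_tilde_v, phi)  # time negative examples
    l_space = margin_loss(h_v, s_v, s_o, phi)        # spatial negative examples
    return lam * l_time + (1.0 - lam) * l_space
```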
The invention has the beneficial effects that:
In order to mine the temporal and spatial information contained in a dynamic graph more efficiently, this application fuses the temporal and spatial information of the graph during dynamic graph representation learning and provides a dynamic graph representation learning method based on self-supervised learning. The method has good interpretability and excellent characterization performance, with the following advantages:
(1) High interpretability: the invention samples the dynamic graph to obtain a spatio-temporal subgraph, a non-time-sequence subgraph and a non-spatial subgraph, designs a contrastive learning task based on the three subgraphs, and thereby captures temporal and spatial information. The scheme has good interpretability.
(2) Strong transferability: the spatio-temporal subgraph sampling method is relatively independent, and the generated subgraphs can serve as auxiliary graphs for other dynamic graph representation learning methods.
(3) Excellent characterization performance: the method captures the temporal and spatial information of the dynamic graph, and experiments show that it greatly improves performance on tasks such as classification, recommendation and prediction in graph-model application scenarios.
Drawings
The invention has the following drawings:
FIG. 1 is a schematic diagram of the dynamic graph representation learning framework based on self-supervised learning;
FIG. 2 is a schematic diagram of the spatio-temporal weight encoder.
Detailed Description
The present invention will be described in further detail with reference to the accompanying drawings.
The invention provides a dynamic graph representation learning method based on self-supervised learning (hereinafter referred to as the method), as shown in FIG. 1.
The invention mainly takes the e-commerce field as its background: commodities and users are nodes in the graph, and interactions between users and commodities are edges. A specific interaction can be understood as one of the following operations: browsing, adding to favorites, or purchasing. Finally, the invention obtains user and commodity characterizations and, based on them, predicts the commodities a user may buy.
First, the method takes an arbitrary node in the graph as the center node and performs a spatio-temporal random walk and an ordinary random walk; the sampled subgraphs are defined as the spatio-temporal subgraph and the non-time-sequence subgraph, respectively. In addition, a spatio-temporal random walk is performed with any other node as the center node, and the sampled subgraph is called the non-spatial subgraph. Second, a spatio-temporal weight encoder captures the temporal and spatial information in the graph to obtain node characterizations and subgraph characterizations of the corresponding nodes. Finally, node representations are learned by maximizing the mutual information between node characterizations and subgraph characterizations through contrastive learning. The specific technical scheme is as follows:
(1) Generating subgraphs
As shown in FIG. 1, given a dynamic graph G = (V, E, τ), where V represents the set of nodes, E is the set of edges, an edge e = (v, w, t) ∈ E_t represents the edge generated by node v and node w at time t, and τ: E → ℝ⁺ is a function mapping each edge to a timestamp, the invention aims to obtain from the dynamic graph G a spatio-temporal subgraph G_v centered on node v.
For the spatio-temporal subgraph G_v, sampling is mainly divided into the following two steps: initial edge selection and spatio-temporal random walk.
In the initial edge selection step, sampling is performed with node v as the center node. This can be done in two ways, one called unbiased sampling and the other called biased sampling.
For unbiased sampling, define the set of all edges around node v as Ψ_v; the initial edge is sampled from a uniform distribution. The probability that an edge e = (v, w, t) ∈ Ψ_v is selected is defined as:

P(e) = \frac{1}{|\Psi_v|}

For biased sampling, it is assumed that the edges around node v are selected with different probabilities: biased sampling assigns each edge a sampling probability that depends on when it occurred. Specifically, the probability that an edge e = (v, w, t) ∈ Ψ_v is selected is defined as:

P(e) = \frac{\exp(\tau(e) - t_{\min})}{\sum_{e' \in \Psi_v} \exp(\tau(e') - t_{\min})}

where e' ∈ Ψ_v denotes any edge connected to the center node v, τ(e') is the time corresponding to edge e', τ(e) is the time corresponding to edge e, and t_min is the minimum time associated with any edge in the dynamic graph. Biased sampling thus encourages the sampling of later-occurring edges.
The main task of the spatio-temporal random walk step is to select the next node, which is chosen from the spatio-temporal neighbor set Γ. The set of spatio-temporal neighbors is defined as:

\Gamma_t(v) = \{(w, t') \mid e = (v, w, t') \in E \wedge t' \geq t\}

where Γ_t(v) represents the set of spatio-temporal neighbors of node v at time t, w represents a node adjacent to node v, and t' represents the time on the edge between node w and node v.
Similarly, the invention has two spatio-temporal random walk strategies, defined as the biased spatio-temporal random walk and the unbiased spatio-temporal random walk.
For the unbiased spatio-temporal random walk, given an arbitrary edge e = (u, v, t), the probability that a spatio-temporal neighbor w ∈ Γ_t(v) of node v at time t is selected is:

P(w) = \frac{1}{|\Gamma_t(v)|}
for biased space-time random walk, the space-time neighbor w epsilon Γ of the node v at the time t t (v) The probability of being selected is:
Figure BDA0003620467090000106
wherein w ' represents any space-time neighbor of the node v at the time t, δ (w) is the time corresponding to the node w, and δ (w ') is the time corresponding to the node w '.
Furthermore, we borrow the idea of the ordinary random walk, i.e., sampling without taking the time factor into account. A random walk with node v as the center node samples the non-time-sequence subgraph \tilde{G}_v. A spatio-temporal random walk with node o as the center node (where node o is a node different from node v) yields the non-spatial subgraph G_o.
(2) Spatio-temporal weight encoder
In order to capture the temporal and spatial characterizations of nodes, the invention designs a spatio-temporal weight encoder, as shown in FIG. 2. Specifically, after G_v is obtained, the spatio-temporal weight encoder outputs a representation of each node in the spatio-temporal subgraph. The main goal of the spatio-temporal weight encoder is to avoid the so-called representation staleness problem. It contains two components, called the temporal aggregator and the spatial aggregator.
For the temporal aggregator, when an event involving the node itself occurs, the temporal characterization z_v(t) of node v at time t is expressed as:

z_v(t) = \mathrm{LSTM}(f_{(v,w)}(t),\ h_v(t^-))

where LSTM denotes the LSTM model, f_{(v,w)}(t) represents the interaction feature between nodes v and w at time t, and h_v(t^-) is the characterization of node v from its most recent update.
The spatial aggregator incorporates the characterization information of spatio-temporal neighbors, which are assigned different weights according to the times at which they occur. Specifically, the spatial characterization \tilde{z}_v(t) of node v at time t is defined as:

\tilde{z}_v(t) = \sum_{w \in \mathcal{N}(v)} W_1 \big( z_w(t) \,\|\, \phi(t - t_w) \big)

where w ∈ N(v) denotes a node w connected to node v, φ(t − t_w) denotes the time encoding of the interval between the current time t and the time t_w, z_w(t) is the temporal characterization of node w at time t, ∥ denotes concatenation, and W_1 is a learnable parameter.
Finally, we aggregate the temporal characterization z_v(t) and the spatial characterization \tilde{z}_v(t) to obtain the characterization h_v(t) of node v at time t:

h_v(t) = W_2 \big( z_v(t) \,\|\, \tilde{z}_v(t) \big)

where W_2 is a learnable parameter. Notably, the graph-level characterization s_v of the spatio-temporal subgraph is obtained by a readout function.
Similarly, the graph-level characterization \tilde{s}_v of the non-time-sequence subgraph \tilde{G}_v and the graph-level characterization s_o of the non-spatial subgraph G_o can be obtained through the spatio-temporal weight encoder.
(3) Spatio-temporal contrastive learning
The spatio-temporal weight encoder is trained by contrasting positive and negative samples. The contrastive learning objective is to distinguish the spatio-temporal subgraph, the non-time-sequence subgraph and the non-spatial subgraph. For positive samples, the invention pairs the center-node characterization of the spatio-temporal subgraph with the graph-level characterization of the spatio-temporal subgraph.
Second, the invention pairs the graph-level characterization \tilde{s}_v of the non-time-sequence subgraph \tilde{G}_v with the center-node characterization h_v of the spatio-temporal subgraph G_v to form time negative examples. The loss function for time negative examples is defined as:

\mathcal{L}_{time} = \mathbb{E}\big[ \max\big(0,\ \sigma(h_v^\top \tilde{s}_v) - \sigma(h_v^\top s_v) + \varphi\big) \big]

where σ denotes the sigmoid function, φ is the margin value, \mathbb{E} denotes the expectation, s_v is the graph-level characterization of the spatio-temporal subgraph, h_v is the characterization of node v in the spatio-temporal subgraph, and \tilde{s}_v is the graph-level characterization of the non-time-sequence subgraph.
Furthermore, the graph-level characterization s_o of the non-spatial subgraph G_o and the center-node characterization h_v of the spatio-temporal subgraph G_v form spatial negative examples. The loss function for spatial negative examples is defined as:

\mathcal{L}_{space} = \mathbb{E}\big[ \max\big(0,\ \sigma(h_v^\top s_o) - \sigma(h_v^\top s_v) + \varphi\big) \big]

where σ is the sigmoid function, φ is the margin value, \mathbb{E} denotes the expectation, s_v is the graph-level characterization of the spatio-temporal subgraph, h_v is the characterization of node v in the spatio-temporal subgraph, and s_o is the graph-level characterization of the non-spatial subgraph. The total loss function of the spatio-temporal weight encoder is therefore:

\mathcal{L} = \lambda \mathcal{L}_{time} + (1 - \lambda)\, \mathcal{L}_{space}

where λ is the balance parameter of the total loss function.
S4, the training of the spatio-temporal weight encoder is completed through the total loss function, and finally all nodes in the dynamic graph are fed into the trained spatio-temporal weight encoder to obtain the node representations of the dynamic graph. By applying the method to the e-commerce scenario, the commodities a user may purchase can be predicted.
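As an illustration of this e-commerce application, one common way (assumed here, not specified by the patent) to turn the learned characterizations into purchase predictions is to rank candidate commodities by the inner product between user and commodity representations:

```python
import torch

def recommend(h_user, h_items, item_ids, k=10):
    """Rank candidate commodities for one user by inner-product score."""
    scores = h_items @ h_user                  # (num_items,)
    top = torch.topk(scores, k=min(k, len(item_ids))).indices
    return [item_ids[i] for i in top.tolist()]
```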
1. In order to depict the temporal and spatial information of a dynamic graph and model the original graph more efficiently, the invention uses the idea of contrastive learning and samples subgraphs of the dynamic graph as auxiliary graphs, thereby enriching the temporal and spatial information available during dynamic graph representation learning; the method has good interpretability.
2. Aiming at the dynamically changing nodes and edges in subgraphs, the invention provides a spatio-temporal weight encoder to capture the temporal and spatial characteristics of the dynamic graph.
3. The spatio-temporal subgraph sampling method is relatively independent, and the generated subgraphs can serve as auxiliary graphs for other dynamic graph representation learning methods; it therefore has strong transferability.
4. The method captures the temporal and spatial information of the dynamic graph, and experiments show that it can be applied to e-commerce scenarios to predict the commodities users purchase.
The examples given herein are intended solely to illustrate the invention and are not to be construed as limiting its embodiments; those skilled in the art may make other variations and modifications based on the above teachings, and such obvious variations and modifications fall within the scope of the appended claims.
Matters not described in detail in this specification are within the common knowledge of those skilled in the art.

Claims (7)

1. A dynamic graph representation learning method based on self-supervised learning, characterized by comprising the following steps:
S1, taking an arbitrary node in the graph as the center node and sampling to obtain a spatio-temporal subgraph, a non-time-sequence subgraph and a non-spatial subgraph, respectively;
S2, capturing the temporal and spatial information in the graph with a spatio-temporal weight encoder to obtain node characterizations of the corresponding nodes and graph-level characterizations of the subgraphs;
S3, learning node representations by maximizing, through contrastive learning, the mutual information between node characterizations and the graph-level characterizations of the subgraphs, and defining a total loss function from the temporal and spatial perspectives;
S4, completing the training of the spatio-temporal weight encoder through the total loss function, and finally feeding all nodes of the dynamic graph into the trained spatio-temporal weight encoder to obtain the node representations of the dynamic graph.
2. The dynamic graph representation learning method based on self-supervised learning as claimed in claim 1, wherein the specific steps of step S1 are as follows: given a dynamic graph G = (V, E, τ), where V represents the set of nodes, E is the set of edges, an edge e = (v, w, t) ∈ E_t in the graph represents the edge generated by node v and node w at time t, and τ: E → ℝ⁺ is a function mapping each edge to a timestamp;
the spatio-temporal subgraph G_v is sampled in the following two steps: initial edge selection and spatio-temporal random walk.
3. The dynamic graph representation learning method based on self-supervised learning as claimed in claim 2, wherein: in the initial edge selection step, sampling is performed with node v as the center node; the sampling comprises unbiased sampling and biased sampling;
for unbiased sampling, the set of all edges around node v is defined as Ψ_v, and the initial edge is sampled from a uniform distribution; the probability that an edge e = (v, w, t) ∈ Ψ_v is selected is defined as:

P(e) = \frac{1}{|\Psi_v|}

for biased sampling, the edges around node v are selected with different probabilities; biased sampling assigns each edge a sampling probability according to the time of the edge; specifically, the probability that an edge e = (v, w, t) ∈ Ψ_v is selected is defined as:

P(e) = \frac{\exp(\tau(e) - t_{\min})}{\sum_{e' \in \Psi_v} \exp(\tau(e') - t_{\min})}

where e' ∈ Ψ_v denotes any edge connected to the center node v, τ(e') is the time corresponding to edge e', τ(e) is the time corresponding to edge e, and t_min is the minimum time associated with any edge in the dynamic graph.
4. The dynamic graph representation learning method based on self-supervised learning as claimed in claim 2, wherein: the main task of the spatio-temporal random walk step is to select the next node; the next node is selected from the spatio-temporal neighbor set Γ; the set of spatio-temporal neighbors is defined as:

\Gamma_t(v) = \{(w, t') \mid e = (v, w, t') \in E \wedge t' \geq t\}

where Γ_t(v) represents the spatio-temporal neighbor set of node v at time t, w represents a node adjacent to node v, and t' represents the time on the edge between node w and node v;
the spatio-temporal random walk comprises the biased spatio-temporal random walk and the unbiased spatio-temporal random walk;
for the unbiased spatio-temporal random walk, given an arbitrary edge e = (u, v, t), the probability that a spatio-temporal neighbor w ∈ Γ_t(v) of node v at time t is selected is:

P(w) = \frac{1}{|\Gamma_t(v)|}

for the biased spatio-temporal random walk, the probability that a spatio-temporal neighbor w ∈ Γ_t(v) of node v at time t is selected is:

P(w) = \frac{\exp(\delta(w) - t)}{\sum_{w' \in \Gamma_t(v)} \exp(\delta(w') - t)}

where w' denotes any spatio-temporal neighbor of node v at time t, δ(w) is the time corresponding to node w, and δ(w') is the time corresponding to node w'.
5. The dynamic graph representation learning method based on self-supervised learning as claimed in claim 2, wherein: the idea of the ordinary random walk is introduced, i.e., sampling is performed without considering the time factor; a random walk with node v as the center node samples the non-time-sequence subgraph \tilde{G}_v; and a spatio-temporal random walk with node o as the center node yields the non-spatial subgraph G_o.
6. The dynamic graph representation learning method based on self-supervised learning as claimed in claim 1, wherein the specific steps of step S2 are as follows: the spatio-temporal weight encoder comprises a temporal aggregator and a spatial aggregator;
for the temporal aggregator, when an event involving the node itself occurs, the temporal characterization z_v(t) of node v at time t is expressed as:

z_v(t) = \mathrm{LSTM}(f_{(v,w)}(t),\ h_v(t^-))

where LSTM denotes the LSTM model, f_{(v,w)}(t) represents the interaction feature between nodes v and w at time t, and h_v(t^-) is the characterization of node v from its most recent update;
the spatial aggregator is used to merge the characterization information of spatio-temporal neighbors and assigns different weights according to the times at which neighboring nodes occur; the spatial characterization \tilde{z}_v(t) of node v at time t is defined as:

\tilde{z}_v(t) = \sum_{w \in \mathcal{N}(v)} W_1 \big( z_w(t) \,\|\, \phi(t - t_w) \big)

where w ∈ N(v) denotes a node w connected to node v, φ(t − t_w) denotes the time encoding of the interval between the current time t and the time t_w, z_w(t) is the temporal characterization of node w at time t, ∥ denotes concatenation, and W_1 is a learnable parameter;
finally, the temporal characterization z_v(t) and the spatial characterization \tilde{z}_v(t) are aggregated to obtain the node characterization h_v(t) of node v at time t:

h_v(t) = W_2 \big( z_v(t) \,\|\, \tilde{z}_v(t) \big)

where W_2 is a learnable parameter;
the graph-level characterization \tilde{s}_v of the non-time-sequence subgraph \tilde{G}_v and the graph-level characterization s_o of the non-spatial subgraph G_o are obtained through the spatio-temporal weight encoder; the graph-level characterization s_v of the spatio-temporal subgraph is obtained by the readout function.
7. The dynamic graph representation learning method based on self-supervised learning as claimed in claim 1, wherein the specific steps of step S3 are as follows: the spatio-temporal weight encoder is trained by contrasting positive and negative samples; the contrastive learning objective is to distinguish the spatio-temporal subgraph, the non-time-sequence subgraph and the non-spatial subgraph; for positive samples, the node characterization of the spatio-temporal subgraph is paired with the graph-level characterization of the spatio-temporal subgraph;
the graph-level characterization \tilde{s}_v of the non-time-sequence subgraph \tilde{G}_v and the node characterization h_v of the spatio-temporal subgraph G_v form time negative examples, whose loss function is defined as:

\mathcal{L}_{time} = \mathbb{E}\big[ \max\big(0,\ \sigma(h_v^\top \tilde{s}_v) - \sigma(h_v^\top s_v) + \varphi\big) \big]

where σ denotes the sigmoid function, φ is the margin value, \mathbb{E} denotes the expectation, s_v is the graph-level characterization of the spatio-temporal subgraph, h_v is the node characterization of node v in the spatio-temporal subgraph, and \tilde{s}_v is the graph-level characterization of the non-time-sequence subgraph;
the graph-level characterization s_o of the non-spatial subgraph G_o and the node characterization h_v of the spatio-temporal subgraph G_v form spatial negative examples; the loss function for spatial negative examples is defined as:

\mathcal{L}_{space} = \mathbb{E}\big[ \max\big(0,\ \sigma(h_v^\top s_o) - \sigma(h_v^\top s_v) + \varphi\big) \big]

where σ is the sigmoid function, φ is the margin value, \mathbb{E} denotes the expectation, s_v is the graph-level characterization of the spatio-temporal subgraph, h_v is the node characterization of node v in the spatio-temporal subgraph, and s_o is the graph-level characterization of the non-spatial subgraph;
the total loss function of the spatio-temporal weight encoder is:

\mathcal{L} = \lambda \mathcal{L}_{time} + (1 - \lambda)\, \mathcal{L}_{space}

where λ is the balance parameter of the total loss function.
CN202210455958.4A (priority date 2022-04-28; filing date 2022-04-28) - Dynamic graph representation learning method based on self-supervision learning - Pending - CN115017368A (en)

Priority Applications (1)

CN202210455958.4A (priority date 2022-04-28; filing date 2022-04-28) - Dynamic graph representation learning method based on self-supervision learning

Publications (1)

CN115017368A, published 2022-09-06

Family

ID=83067000

Country Status (1)

CN: CN115017368A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116561688A * 2023-05-09 2023-08-08 Zhejiang University Emerging technology identification method based on dynamic graph anomaly detection
CN116561688B * 2023-05-09 2024-03-22 Zhejiang University Emerging technology identification method based on dynamic graph anomaly detection


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination