CN114124729A

CN114124729A - Dynamic heterogeneous network representation method based on meta-path

Info

Publication number: CN114124729A
Application number: CN202111393567.6A
Authority: CN
Inventors: 谭洪胜; 刘群; 袁铭; 王国胤
Original assignee: Chongqing University of Posts and Telecommunications
Current assignee: Chongqing University of Posts and Telecommunications
Priority date: 2021-11-23
Filing date: 2021-11-23
Publication date: 2022-03-01

Abstract

The invention belongs to the field of graph network representation learning, and in particular relates to a dynamic heterogeneous network representation method based on meta-paths. The method comprises: constructing a dynamic heterogeneous network, and sampling different time-weighted meta-path sequences from the network according to time-weighted meta-paths; preprocessing the vectors of the network nodes, and then aggregating the information of the network node sequence of each meta-path through a GRU (gated recurrent unit); encoding the time of the node sequences with a relative time encoding technique; obtaining deep feature information of the node sequences with a Bi-GRU (bidirectional GRU), and aggregating the temporal features and the structural features; and interacting the feature information of different sequences at different times through a Bi-GRU with an attention mechanism to obtain the final representation of each node. The method adapts to node learning tasks on dynamic heterogeneous networks and to the dynamic evolution of the network, allows downstream tasks to classify and cluster the nodes, and effectively improves the ability to learn representations of graph network nodes.

Description

Dynamic heterogeneous network representation method based on meta-path

Technical Field

The invention belongs to the field of graph network representation learning, and particularly relates to a dynamic heterogeneous network representation method based on a meta path.

Background

Network data is generally unstructured, which makes it difficult to mine information in a network directly with machine learning models. Network representation learning maps high-dimensional, sparse graph data into a low-dimensional space while retaining the structural information of the network, yielding low-dimensional, dense vector representations.

At present, research on static network representation learning and on dynamic homogeneous network representation learning is relatively mature, but research on dynamic heterogeneous networks, which are closer to real networks, is still at an early stage and needs further study.

Network representation learning represents the nodes of a network as low-dimensional vectors while preserving the network structure and semantic information, so as to support subsequent graph mining tasks such as link prediction, node classification and clustering. Many existing heterogeneous network representation learning methods are designed for static heterogeneous networks. They ignore the fact that the spatial information of a network (its topology and attributes) changes over time, and simply compress the spaces (structures) corresponding to different times together. Because a network changes constantly, two nodes that were unrelated one second may be linked the next, and each new link between nodes changes the topology (space) of the network. A purely static treatment therefore does not match the way real networks evolve. The other class of methods, dynamic homogeneous network representation learning, does not consider the differences among network nodes and links, so applying it directly to a dynamic heterogeneous network inevitably loses some semantic information.

Disclosure of Invention

In order to enable downstream tasks such as node classification, clustering and visualization, and to effectively improve the ability to learn representations of the nodes of a dynamic heterogeneous graph network, the invention provides a dynamic heterogeneous network representation learning method based on meta-paths. A DHNR model is constructed to perform representation learning on the network nodes; the DHNR model comprises a GRU and a Bi-GRU with an attention mechanism, and the process by which the DHNR model obtains the node representation vectors specifically comprises the following steps:

S1: the time at which a link between network nodes is established is taken as the link weight and retained in the network, so as to construct a dynamic heterogeneous network;

S2: different time-weighted meta-path sequences are sampled from the network according to the time-weighted meta-paths;

S3: the network is preprocessed to obtain an initial vector for each node; each time-weighted meta-path sequence and its vectors are taken as input, and the information of the network node sequence of each meta-path is aggregated through a GRU (gated recurrent unit);

S4: a relative time encoding technique is used as a time encoder to encode the time of the node sequences;

S5: deep feature information of the node sequences is obtained with a Bi-GRU, and the temporal features and structural features are aggregated;

S6: the feature information of different sequences at different times is interacted through a Bi-GRU with an attention mechanism to obtain the final representation of each node.

Further, the dynamic heterogeneous network is represented as G = (V, E, T), where V represents the node set, E represents the link set, and T represents the set of times on the links. The mapping function between nodes and node types in the dynamic heterogeneous network is φ: V → A, where A represents the node type set, and the mapping function between links and link types is ψ: E → R, where R represents the link type set. In the dynamic heterogeneous network, |A| + |R| > 2, where |A| is the number of node types in the node type set and |R| is the number of link types in the link type set. A time link in the dynamic heterogeneous network is represented as (i, j, t) ∈ E, which indicates that node i is connected to node j at time t.

Further, sampling different time-weighted meta-path sequences from the network according to the time-weighted meta-paths comprises the following steps:

constructing time-weighted meta-paths, that is, for a meta-path of a given type, obtaining the corresponding time-weighted meta-path under the guidance of a time attribute value;

obtaining the set of time-weighted meta-path sequences, that is, obtaining the time-weighted meta-paths of each type of meta-path under the different time attribute values in T, so that the set contains one time-weighted meta-path for each meta-path type and each time attribute value;

wherein k is the number of meta-path types; T is the set of time attribute values, and |T| is the number of time attribute values in the set; the last element of the set is the time-weighted meta-path of the k-th type guided by the |T|-th time attribute value.

Further, preprocessing the vectors of the network nodes comprises:

setting a transformation matrix for each type of node, so that nodes of every type are projected through their transformation matrix into the same feature space, which contains the features of the feature spaces of all node types. For a node n_i of node type φ_i, the projection applies the transformation matrix of type φ_i to the original feature x_i of n_i, giving the projected feature of n_i.

Further, aggregating the information of the network node sequence of each meta-path through the GRU comprises the following process:

the information of the different time-weighted meta-paths of a node is aggregated to form a preliminary vector representation. For a target node n₀ whose time-weighted meta-path sequence has m links, the GRU performs a recursion over the steps 0 < a ≤ m+1, wherein:

n_i is the (m+1-a)-hop neighbor of node n₀ on the time-weighted meta-path sequence, φ(n_i) is the node type of n_i, and the projected feature of n_i is the input of the GRU at step a;

the recursion refers to the hidden state of the (m+1)-hop neighbor of the target node in the sequence;

the output of step a is the hidden state of node n_i at layer a, produced by the GRU;

A_m+1 is the type of the (m+1)-hop neighbor of the target node, and A_m+1-a is the type of the (m+1-a)-hop neighbor;

at each step, the GRU aggregates the hidden information output by hidden layer a-1 with the projected feature of the node n_i input to the GRU.

Further, the hidden information output by hidden layer a-1 is aggregated with the projected feature of the node n_i input to the GRU through the gates of the GRU unit, wherein:

A_z, A_r, A_h and B_z, B_r, B_h are the model parameters of the GRU;

z_i and r_i are the update gate and the forget gate of the GRU, and both take values in [0, 1];

the gates are applied by element-wise multiplication;

the candidate state is the new state obtained after forgetting is applied to the input vector and the state vector of the previous step;

σ is an activation function;

the output is the final state produced by one GRU unit.

Further, the time of the node sequences is encoded, that is, a set of fixed sine functions is defined as time offsets to encode the time of each sequence, wherein Base(t), Base(t, 2i) and Base(t, 2i+1) are the time offset functions, the encoder output is the time vector at time t, T_Linear is a linear fine-tuning function, and d is the vector dimension.

Further, aggregating the temporal features and the structural features comprises feeding the features of the node into a bidirectional GRU and averaging its hidden states, wherein:

the result is the feature obtained by aggregating the temporal feature and the structural feature;

the input is the feature of the node: when i = st it represents the time feature of the node, and when i = RT(t) it represents the semantic feature of the node;

Mean() is a function that averages an array, here averaging the hidden states of the bidirectional GRU;

the remaining term denotes the bidirectional GRU.

Further, using the Bi-GRU with an attention mechanism to interact the feature information of different sequences at different times comprises the following steps:

interacting the feature information at different times through the Bi-GRU to simulate the evolution of the network;

introducing an attention mechanism to capture the important feature vectors, in which the attention weight α_i of node n_i is calculated;

using the calculated attention values to aggregate all state features of node n₀ into the final vector representation of the node;

wherein O = {o₁, o₂, ..., o_k|T|} is the set of state vectors output by the Bi-GRU, and o_k|T| is the (k × |T|)-th state vector; LeakyReLU is the activation function; a ∈ R^(1×2d') is the attention parameter; and d' is the representation dimension of the vector, whose specific value is determined by experiment.

Furthermore, after the structural features of the sequences are obtained through the GRU, the semantic and temporal features of the network nodes are obtained through the Bi-GRU with an attention mechanism. In the representation training of the network, the cross-entropy between the true values and the predicted values of all labeled nodes is used as the loss function to be minimized, and the model is optimized by a gradient descent algorithm, wherein l denotes a label; C is the classifier parameter; y_l is the index set of the node labels; Y^l is the label value of a node; and u_l is the node feature vector from which the label value is predicted.

The invention provides a representation learning method for dynamic heterogeneous networks that addresses both the time information contained in such networks and the rich semantic information carried by different types of nodes and links. The temporal information of the network is captured by a bidirectional gated recurrent unit with an attention mechanism. The method generates representations of dynamic heterogeneous network nodes with which downstream tasks can classify and cluster the nodes, and effectively improves the ability to learn representations of dynamic heterogeneous network nodes.

Drawings

FIG. 1 is a flow chart of a meta-path based dynamic heterogeneous network representation learning method according to the present invention;

FIG. 2 is a diagram of an embodiment of a meta-path based dynamic heterogeneous network representation learning method of the present invention;

FIG. 3 is a model framework diagram of a dynamic heterogeneous network representation learning method based on meta-paths according to the present invention;

FIG. 4 is a schematic diagram of GRU aggregation according to the present invention.

Detailed Description

The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.

The invention provides a dynamic heterogeneous network representation method based on meta-paths, in which a DHNR model is constructed to perform representation learning on the network nodes. The DHNR model comprises a GRU and a Bi-GRU with an attention mechanism, and the process by which the DHNR model obtains the node representation vectors specifically comprises the following steps:

S3: the vectors of the network nodes of the input sequence are preprocessed to obtain initialization vectors, and the information of the network node sequence of each meta-path is aggregated through a GRU;

The invention provides a DHNR model that first divides the neighborhood structure of a node into different subspace structures according to time and samples sequences of all time-weighted meta-paths for each node, then integrates the neighborhood information on all time-weighted meta-path sequences of the node through a gated recurrent unit (GRU), and finally learns the spatio-temporal context information of the fused node sequences with a bidirectional gated recurrent unit (Bi-GRU) with an attention mechanism to obtain the final representation vector of each node. In this embodiment, as shown in FIG. 1, FIG. 2 and FIG. 3, the structural neighborhood of each node is divided by time into time-weighted meta-path sequences at different times; the neighbor information in each time-weighted meta-path sequence is aggregated by a gated recurrent unit (GRU); and the results are filtered using a bidirectional gated recurrent unit (Bi-GRU) with an attention mechanism.

In constructing the dynamic heterogeneous network, the time at which nodes establish a link is taken as the link weight and retained in the network. The dynamic heterogeneous network is represented as G = (V, E, T), where V represents the node set, E represents the link set, and T represents the set of times on the links.

The mapping function between nodes and node types in the network is φ: V → A, and the mapping function between links and link types is ψ: E → R, where A and R represent the node type set and the link type set. In a dynamic heterogeneous network, |A| + |R| > 2, and each time link (i, j, t) ∈ E indicates that node i is connected to node j at time t.
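The definition above can be illustrated with a small data structure. This is a minimal sketch, not the patent's implementation: the class name `DynamicHeteroNetwork`, its methods, and the author/paper/venue example are all hypothetical, and each link additionally stores its link type so that ψ can be read off directly.

```python
from dataclasses import dataclass, field


@dataclass
class DynamicHeteroNetwork:
    """Hypothetical container for G = (V, E, T) with type mappings phi and psi."""
    node_type: dict = field(default_factory=dict)  # phi: node -> node type
    links: list = field(default_factory=list)      # entries (i, j, t, link_type)

    def add_node(self, node, ntype):
        self.node_type[node] = ntype

    def add_link(self, i, j, t, ltype):
        # The link-establishment time t is kept as the link weight (step S1).
        self.links.append((i, j, t, ltype))

    def node_types(self):
        return set(self.node_type.values())             # the set A

    def link_types(self):
        return {ltype for _, _, _, ltype in self.links}  # the set R

    def times(self):
        return sorted({t for _, _, t, _ in self.links})  # the set T

    def is_heterogeneous(self):
        # Condition |A| + |R| > 2 from the definition.
        return len(self.node_types()) + len(self.link_types()) > 2


g = DynamicHeteroNetwork()
g.add_node("a1", "author")
g.add_node("p1", "paper")
g.add_node("v1", "venue")
g.add_link("a1", "p1", 2019, "writes")
g.add_link("p1", "v1", 2020, "published_in")
print(g.is_heterogeneous(), g.times())
```

Keeping the timestamp on every link, rather than splitting the network into snapshots, is what lets the later sampling step select links by time attribute value.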

For a node n₀ of a given node type, the time-weighted meta-path sequence of n₀ at time attribute value t₀ is obtained under the guidance of a time-weighted meta-path. Using several meta-paths P₁, P₂, ..., P_k of different types, and setting |T| different time attribute values for each meta-path, different types of time-weighted meta-paths are constructed, and a set of time-weighted meta-path sequences is finally obtained for each node.
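One possible reading of this sampling step is sketched below. It rests on assumptions the text does not confirm: a meta-path is taken to be a list of node types, "guided by a time attribute value t" is taken to mean that only links stamped with exactly t may be followed, and one neighbor is chosen at random per hop. The function names and the toy data are hypothetical.

```python
import random


def sample_sequence(node_type, links, start, meta_path, t, rng):
    """Walk from `start` along `meta_path`, using only links stamped with time t."""
    if node_type[start] != meta_path[0]:
        return None
    seq = [start]
    for wanted in meta_path[1:]:
        cur = seq[-1]
        # Links are treated as undirected; keep neighbors of the wanted type.
        candidates = sorted(
            {j for i, j, ts in links if i == cur and ts == t and node_type[j] == wanted}
            | {i for i, j, ts in links if j == cur and ts == t and node_type[i] == wanted}
        )
        if not candidates:
            return None  # the meta-path cannot be completed at this time value
        seq.append(rng.choice(candidates))
    return seq


def sample_all(node_type, links, start, meta_paths, times, seed=0):
    """Return {(meta-path index, time value): sequence} for one target node."""
    rng = random.Random(seed)
    result = {}
    for k, meta_path in enumerate(meta_paths):
        for t in times:
            seq = sample_sequence(node_type, links, start, meta_path, t, rng)
            if seq is not None:
                result[(k, t)] = seq
    return result


node_type = {"a1": "A", "a2": "A", "p1": "P", "p2": "P"}
links = [("a1", "p1", 1), ("a2", "p1", 1), ("a1", "p2", 2)]
seqs = sample_all(node_type, links, "a1", [["A", "P", "A"]], times=[1, 2])
print(seqs)
```

With k meta-path types and |T| time values, a node receives at most k × |T| sequences, which matches the size of the state-vector set used later by the attention step.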

Nodes of different types have different feature spaces. For each type of node (for example, nodes of type φ_i), a type-specific transformation matrix is designed, so that the features of the different types of nodes are projected into the same feature space. The type-specific transformation matrix depends only on the node type, and the transformation applies it to the original feature x_i of node n_i to give the projected feature of n_i.
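The type-specific projection can be sketched as one matrix per node type. The names `W` and `project`, the random initialization, and the dimensions are assumptions for illustration; in training these matrices would be learned parameters.

```python
import numpy as np

rng = np.random.default_rng(0)

d_shared = 4                          # dimension of the common feature space (assumed)
raw_dim = {"author": 3, "paper": 5}   # each node type has its own raw feature size

# One transformation matrix per node type (random stand-ins for learned weights).
W = {ntype: rng.normal(size=(d_shared, dim)) for ntype, dim in raw_dim.items()}


def project(x, ntype):
    """Map a raw feature vector into the shared space using its type's matrix."""
    return W[ntype] @ x


h_author = project(rng.normal(size=3), "author")
h_paper = project(rng.normal(size=5), "paper")
print(h_author.shape, h_paper.shape)
```

After projection, vectors of every node type have the same length, so a single GRU can consume a sequence that mixes node types.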

The information of the different time-weighted meta-paths of each node is aggregated separately to form a preliminary vector representation. As shown in FIG. 4, all neighbor information and the node's own information in a time-weighted meta-path sequence of n₀ are aggregated, and the GRU aggregation module performs a basic recursion over the time-weighted meta-path sequence for 0 < a ≤ m+1, in which each step outputs the hidden state of a node n_i through the GRU, and node n_i is the (m+1-a)-hop neighbor of node n₀ on the time-weighted meta-path sequence.

In the GRU formulas, A_z, A_r, A_h and B_z, B_r, B_h are parameters; z_i and r_i are the update gate and the forget gate, both taking values in [0, 1]; and the gates are applied by element-wise multiplication.

After m+1 propagation steps over the structural neighborhood of the time-weighted meta-path sequence, the state vector output for node n₀ on the time-weighted meta-path is obtained, where d' denotes the dimension of the state vector.
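The recursion can be sketched with a standard GRU cell. The equations used here are the common textbook form, which is an assumption: the text does not preserve the patent's own formulas, so the pairing of A_* with the input and B_* with the previous state, the tanh candidate, the zero initial state, and the helper names are all illustrative.

```python
import numpy as np


def sigmoid(v):
    return 1.0 / (1.0 + np.exp(-v))


def gru_step(x, h_prev, p):
    """One GRU unit: gates z (update) and r (forget), then the blended state."""
    z = sigmoid(p["A_z"] @ x + p["B_z"] @ h_prev)
    r = sigmoid(p["A_r"] @ x + p["B_r"] @ h_prev)
    h_cand = np.tanh(p["A_h"] @ x + p["B_h"] @ (r * h_prev))  # candidate state
    return (1.0 - z) * h_prev + z * h_cand


def aggregate_sequence(features, p, d_hidden):
    """`features` is ordered from the target node n0 outward.

    The recursion starts at the farthest neighbor (a = 1) and ends at n0
    itself (a = m + 1), so the list is consumed in reverse.
    """
    h = np.zeros(d_hidden)  # assumed initial hidden state
    for x in reversed(features):
        h = gru_step(x, h, p)
    return h


rng = np.random.default_rng(0)
d_in, d_hidden = 4, 6
params = {}
for gate in ("z", "r", "h"):
    params["A_" + gate] = 0.1 * rng.normal(size=(d_hidden, d_in))
    params["B_" + gate] = 0.1 * rng.normal(size=(d_hidden, d_hidden))

# A sequence with m = 2 links has m + 1 = 3 nodes: n0 and two neighbors.
sequence = [rng.normal(size=d_in) for _ in range(3)]
h0 = aggregate_sequence(sequence, params, d_hidden)
print(h0.shape)
```

Processing the farthest neighbor first means the last update is driven by the target node's own feature, so the final state is the representation of n₀ on that sequence.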

When the relative time encoding technique is used as the time encoder to encode the time of the node sequences, a set of fixed sine functions is defined as time offsets to encode the time of each sequence, where t is the time attribute value of the time-weighted meta-path sequence, and the time feature of the node is obtained by applying the T_Linear fine-tuning transformation.
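A sketch of such a time encoder follows. The exact Base functions are not preserved in this text, so the sketch assumes the sinusoidal form familiar from Transformer positional encodings (sine at even indices, cosine at odd indices, base 10000), and models T_Linear as a single affine map with random stand-in weights. Both choices are assumptions.

```python
import numpy as np


def base(t, d):
    """Fixed sinusoidal encoding of time value t into d dimensions (d even)."""
    assert d % 2 == 0
    i = np.arange(d // 2)
    freq = 1.0 / (10000.0 ** (2 * i / d))
    enc = np.empty(d)
    enc[0::2] = np.sin(t * freq)  # Base(t, 2i)
    enc[1::2] = np.cos(t * freq)  # Base(t, 2i + 1)
    return enc


def relative_time_encoding(t, W, b):
    """RT(t): the fixed encoding followed by a linear fine-tuning layer."""
    return W @ base(t, W.shape[1]) + b


rng = np.random.default_rng(0)
d = 8
W_t = 0.1 * rng.normal(size=(d, d))  # stand-in for learned T_Linear weights
b_t = np.zeros(d)

rt = relative_time_encoding(3.0, W_t, b_t)
print(rt.shape)
```

Because the Base functions are fixed, only the T_Linear layer is trained, and time values not seen during training still receive a well-defined encoding.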

The temporal features and the structural semantics are then aggregated into the aforementioned node feature vector. The input is either the temporal feature or the semantic structural feature of the node. These features are taken as the input of a bidirectional gated recurrent unit, and a Mean function then averages all state features of node n₀ to obtain the time-encoded vector representation of the node on a time-weighted meta-path sequence.
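The fusion step can be sketched as a two-element sequence (structural feature, time feature) passed through a bidirectional GRU whose hidden states are averaged. The GRU equations are again the textbook form, and averaging the forward and backward states together, which keeps the output at dimension d', is an assumption about how the two directions are combined.

```python
import numpy as np


def sigmoid(v):
    return 1.0 / (1.0 + np.exp(-v))


def gru_step(x, h_prev, p):
    z = sigmoid(p["A_z"] @ x + p["B_z"] @ h_prev)
    r = sigmoid(p["A_r"] @ x + p["B_r"] @ h_prev)
    h_cand = np.tanh(p["A_h"] @ x + p["B_h"] @ (r * h_prev))
    return (1.0 - z) * h_prev + z * h_cand


def init_params(rng, d_in, d_hid):
    p = {}
    for gate in ("z", "r", "h"):
        p["A_" + gate] = 0.1 * rng.normal(size=(d_hid, d_in))
        p["B_" + gate] = 0.1 * rng.normal(size=(d_hid, d_hid))
    return p


def bi_gru_mean(inputs, p_fwd, p_bwd, d_hid):
    """Run a forward and a backward GRU over `inputs`, then average all states."""
    states = []
    h = np.zeros(d_hid)
    for x in inputs:              # forward direction
        h = gru_step(x, h, p_fwd)
        states.append(h)
    h = np.zeros(d_hid)
    for x in reversed(inputs):    # backward direction
        h = gru_step(x, h, p_bwd)
        states.append(h)
    return np.mean(states, axis=0)


rng = np.random.default_rng(0)
d = 6
structural = rng.normal(size=d)   # GRU-aggregated state of n0 on one sequence
temporal = rng.normal(size=d)     # RT(t) for the same sequence
fused = bi_gru_mean([structural, temporal],
                    init_params(rng, d, d), init_params(rng, d, d), d)
print(fused.shape)
```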

The feature information of different sequences at different times is interacted through a Bi-GRU with an attention mechanism to obtain the final representation of the node, which specifically comprises the following steps:

S61: the feature information at different times is interacted through the Bi-GRU to simulate the evolution of the network, where {o₁, o₂, ..., o_k|T|} is the set of state vectors output by the Bi-GRU;

S62: an attention mechanism is introduced to capture the important feature vectors by calculating an attention weight α_i; the higher α_i is, the more important o_i is. LeakyReLU is the activation function, and a ∈ R^(1×2d') is the attention parameter;

S63: the calculated attention values are used to aggregate all state features of node n₀ into the final vector representation of the node, u₀ ∈ R^d';

S64: the cross-entropy between the true values and the predicted values of all labeled nodes is minimized as the loss function, where C is the classifier parameter, y_l is the index set of the node labels, and Y^l and u_l are the label value of a node and the node feature vector from which the label value is predicted. The model is optimized by a gradient descent algorithm.
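Steps S62 to S64 can be sketched as attention pooling followed by a cross-entropy loss. This is a simplified form under stated assumptions: each state vector o_i is scored directly by a parameter vector of matching length (the patent's parameter has width 2d', which suggests a concatenated input that this text does not preserve), the weights are normalized with a softmax, the LeakyReLU slope is 0.2, and the classifier is a linear map C followed by a softmax.

```python
import numpy as np


def leaky_relu(v, slope=0.2):
    return np.where(v > 0, v, slope * v)


def attention_pool(O, a):
    """O: (k|T|, d) state vectors; a: (d,) attention parameter (simplified)."""
    scores = leaky_relu(O @ a)
    scores = scores - scores.max()               # numerical stability
    alpha = np.exp(scores) / np.exp(scores).sum()
    return alpha, alpha @ O                      # weights and pooled vector u0


def cross_entropy(U, Y, C):
    """U: (n, d) node vectors; Y: (n, classes) one-hot labels; C: (classes, d)."""
    logits = U @ C.T
    logits = logits - logits.max(axis=1, keepdims=True)
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -(Y * log_probs).sum()


rng = np.random.default_rng(0)
d, n_states = 6, 4                   # n_states plays the role of k * |T|
O = rng.normal(size=(n_states, d))
a = rng.normal(size=d)
alpha, u0 = attention_pool(O, a)

U = rng.normal(size=(3, d))          # vectors of three labeled nodes
Y = np.eye(2)[[0, 1, 0]]             # their one-hot labels
C = rng.normal(size=(2, d))
loss = cross_entropy(U, Y, C)
print(alpha.sum(), u0.shape, loss)
```

In training, the gradient of this loss with respect to C, the attention parameter, and the GRU parameters would drive the gradient descent update that the text mentions.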

Although embodiments of the present invention have been shown and described, it will be appreciated by those skilled in the art that changes, modifications, substitutions and alterations can be made in these embodiments without departing from the principles and spirit of the invention, the scope of which is defined in the appended claims and their equivalents.

Claims

1. A dynamic heterogeneous network representation method based on meta-path is characterized in that a DHNR model is constructed to carry out representation learning on network nodes, the DHNR model comprises GRUs and Bi-GRUs with attention mechanisms, and the process of obtaining node representation vectors by the DHNR model specifically comprises the following steps:

2. The meta-path-based dynamic heterogeneous network representation method of claim 1, wherein the dynamic heterogeneous network is represented as G = (V, E, T), where V represents the node set, E represents the link set, and T represents the set of times on the links, and the mapping function between nodes and node types in the dynamic heterogeneous network is φ: V → A.

3. The meta-path based dynamic heterogeneous network representation method of claim 1, wherein sampling different time weighted meta-path sequences from the network according to the time weighted meta-path comprises the steps of:

4. The meta-path based dynamic heterogeneous network representation method of claim 1, wherein preprocessing the vector of the network node comprises:

wherein x_i is the original feature of node n_i, and the transformation matrix of node type φ_i maps it to the projected feature of n_i.

5. A meta-path based dynamic heterogeneous network representation method according to claim 1, wherein aggregating information of the sequence of network nodes of each meta-path through GRUs comprises the following procedures:

wherein each step outputs, through the GRU, the hidden state of node n_i at the a-th layer, with 0 < a ≤ m+1, where m is the number of links in the sequence; A_m+1 is the type of the (m+1)-hop neighbor of the target node; A_m+1-a is the type of the (m+1-a)-hop neighbor; and the GRU aggregates the hidden information output by hidden layer a-1 with the projected feature of the node n_i input to the GRU.

6. The meta-path based dynamic heterogeneous network representation method according to claim 5, wherein the hidden information output by hidden layer a-1 is aggregated with the projected feature of the node n_i input to the GRU, in which the gates are applied by element-wise multiplication; the candidate state is the new state obtained after forgetting is applied to the input vector and the state vector of the previous step; σ is an activation function; and the output is the final state produced by one GRU unit.

7. The meta-path based dynamic heterogeneous network representation method according to claim 1, wherein the time of the node sequences is encoded, that is, a fixed set of sine functions is defined as time offsets to encode the time of each sequence, wherein Base(t), Base(t, 2i) and Base(t, 2i+1) are the time offset functions.

8. The meta-path based dynamic heterogeneous network representation method of claim 1, wherein aggregating the temporal features and the structural features comprises processing the features with a bidirectional GRU.

9. The meta-path-based dynamic heterogeneous network representation method of claim 1, wherein using the Bi-GRU with an attention mechanism to interact the feature information of different sequences at different times comprises the following steps:

wherein O = {o₁, o₂, ..., o_k|T|} is the set of state vectors output by the Bi-GRU, and o_k|T| is the (k × |T|)-th state vector; LeakyReLU is the activation function; a ∈ R^(1×2d') is the attention parameter; and d' is the representation dimension of the vector, whose specific value is determined by experiment.

10. The meta-path-based dynamic heterogeneous network representation method according to claim 1, wherein after the structural features of the sequences are obtained through the GRU, the semantic and temporal features of the network nodes are obtained through the Bi-GRU with an attention mechanism; in the representation training of the network, the cross-entropy between the true values and the predicted values of all labeled nodes is used as the loss function to be minimized, and the model is optimized by a gradient descent algorithm.