CN111949892A

CN111949892A - A Multi-Relation Aware Temporal Interaction Network Prediction Method

Info

Publication number: CN111949892A
Application number: CN202010797094.5A
Authority: CN
Inventors: 陈岭; 余珊珊
Original assignee: Zhejiang University ZJU
Current assignee: Zhejiang University ZJU
Priority date: 2020-08-10
Filing date: 2020-08-10
Publication date: 2020-11-17
Anticipated expiration: 2040-08-10
Also published as: CN111949892B

Abstract

The invention discloses a multi-relation-aware temporal interaction network prediction method, comprising: (1) taking each interaction in the temporal interaction network as a sample; (2) processing the interactions in order of occurrence time and, based on historical interaction information, mining the nodes that have a historical interaction, common interaction, or interaction sequence similarity relationship with the interacting nodes, so as to construct for each interacting node a local relationship graph as of just before the current interaction; (3) predicting the representation of the item before the current interaction from the user's representation after its last interaction and the user's neighbor-based representation obtained through hierarchical multi-relation-aware aggregation; (4) updating each interacting node's representation from its representation after the last interaction, the time interval between the last interaction and the current interaction, and its neighbor-based representation; (5) after training the temporal interaction network prediction model, using the parameter-tuned model to predict the items that users may interact with.

Description

A Multi-Relation Aware Temporal Interaction Network Prediction Method

Technical Field

The invention relates to the field of temporal interaction network prediction, and in particular to a multi-relation-aware temporal interaction network prediction method.

Background

In many real-world domains, such as e-commerce (customers buying goods), education platforms (students taking MOOC courses), and social networking platforms (users posting in communities), users interact with different items at different times, and these user-item interactions form a temporal interaction network. Compared with a static interaction network, a temporal interaction network additionally records when each interaction occurs. Temporal interaction network prediction means predicting, before an interaction occurs, which item a user will interact with; it is important for tasks such as product recommendation, course recommendation, and community recommendation.

Existing prediction methods for temporal interaction networks fall into two categories: methods not based on graph structure and methods based on graph structure. Methods not based on graph structure represent user-item interactions in other forms, such as matrices or sequences, and can be divided into methods based on latent factor models and methods based on sequence models. Methods based on latent factor models add time information to a traditional latent factor model in order to model changes in user interests and item attributes, and use the resulting user and item representations for prediction. However, this line of work does not consider the order in which users and items interact. Temporal interaction networks often contain rich sequence information, and many sequence-model-based prediction methods have been proposed to exploit it. These methods, however, all use a static item representation as input when updating the user's representation, ignoring the item's current state. In addition, most of them consider only the dynamics of user interests and ignore the dynamics of item attributes.

To mine richer information from user-item interactions, many prediction methods based on graph structure have been proposed. Traditional graph-based methods treat time periods as nodes in the graph, but the graph is still essentially static and cannot model the dynamics of user and item attributes well. To address this, many prediction methods based on temporal interaction network embedding have been proposed. These methods embed the temporal interaction network to obtain user and item representations for prediction. Depending on whether neighbor information is aggregated during embedding, they can be divided into methods that do not consider neighbor information and methods that do. Methods that do not consider neighbor information model the attribute changes of the interacting nodes but ignore the influence of neighbors. Existing methods that do consider neighbor information treat only nodes with a historical interaction relationship as neighbors, ignoring the other relationship types present in the historical interaction information (common interaction relationships, interaction sequence similarity relationships, and so on).

Summary of the Invention

In view of the above, the present invention provides a multi-relation-aware temporal interaction network prediction method that improves the accuracy of temporal interaction network prediction by making effective use of neighbor information.

The technical solution of the present invention is as follows:

A multi-relation-aware temporal interaction network prediction method, comprising the following steps:

(1) Take each interaction (u_i, v_j, t) between user u_i and item v_j at time t as a sample to construct a training data set, and divide the training data set into batches;

(2) For the interaction (u_i, v_j, t), mine from historical interaction information the nodes that have a historical interaction relationship, a common interaction relationship, or an interaction sequence similarity relationship with the interacting nodes, and construct for each of the interacting nodes u_i and v_j a local relationship graph as of just before the current interaction;

(3) From the local relationship graphs of u_i and v_j, obtain the neighbor-based representation of user u_i and the neighbor-based representation of item v_j through hierarchical multi-relation-aware aggregation;

(4) From the representation of user u_i after its last interaction and the neighbor-based representation of user u_i, use a fully connected layer to compute the predicted representation of item v_j before the current interaction;

(5) From the representations of user u_i and item v_j after their last interactions, the time intervals between each node's last interaction and the current interaction, and the neighbor-based representations of u_i and v_j, use two recurrent neural network layers to compute the representations of user u_i and item v_j after the current interaction, respectively;

(6) Compute the overall loss from the error between the predicted representation and the true representation of item v_j before the current interaction, the regularization loss of user u_i, and the regularization loss of item v_j; adjust the network parameters of the temporal interaction network prediction model according to the loss over all samples in the batch, until all batches have taken part in model training, where the temporal interaction network prediction model comprises all the fully connected layers and recurrent neural network layers used in steps (2) to (6);

(7) Use the parameter-tuned temporal interaction network prediction model to predict the items that users may interact with.

The present invention mines multiple relationships between nodes from historical interaction information, constructs for each interacting node a local relationship graph as of just before the current interaction, and uses hierarchical multi-relation-aware aggregation to account for the interaction influence that neighbor nodes propagate along different relationship types. Compared with existing methods, its advantages are:

1) Nodes that have a historical interaction relationship, a common interaction relationship, or an interaction sequence similarity relationship with the interacting nodes are mined from historical interaction information, and a local relationship graph as of just before the current interaction is built for each interacting node. The neighbor-based representations obtained through hierarchical multi-relation-aware aggregation are used both to predict the item's representation and to update the interacting nodes' representations. Because multiple relationships between nodes are considered, neighbor information is used effectively, which improves the accuracy of temporal interaction network prediction;

2) A graph neural network with an attention layer is introduced. It assigns each neighbor node a weight according to the interaction influence the neighbor propagates and the relationship type between the nodes, and aggregates hierarchically the interaction influence propagated along different relationship types.

Description of the Drawings

To explain the embodiments of the present invention or the technical solutions of the prior art more clearly, the drawings used in describing them are briefly introduced below. The drawings described below are merely some embodiments of the present invention; a person of ordinary skill in the art can obtain other drawings from them without creative effort.

FIG. 1 is an overall flowchart of the multi-relation-aware temporal interaction network prediction method provided by an embodiment;

FIG. 2 is an overall framework diagram of the multi-relation-aware temporal interaction network prediction method provided by an embodiment;

FIG. 3 is a schematic diagram of the hierarchical multi-relation-aware aggregation provided by an embodiment.

Detailed Description

To make the objectives, technical solutions, and advantages of the present invention clearer, the invention is described in further detail below with reference to the drawings and embodiments. The specific embodiments described here only explain the invention and do not limit its scope of protection.

FIG. 1 is an overall flowchart of the multi-relation-aware temporal interaction network prediction method provided by an embodiment, and FIG. 2 is its overall framework diagram. As shown in FIG. 1 and FIG. 2, the method provided by the embodiment includes the following steps:

Step 1: Input the temporal interaction network, which consists of N interactions sorted by time, with i as the interaction index. Each interaction s is taken as a sample to obtain the training data set, where s = (u, v, t) denotes an interaction between user u and item v at time t, with u, v, and t drawn from the user set, the item set, and the set of interaction times, respectively. The training data set is divided into batches using the t-n-Batch algorithm, and the total number of batches is C.

In this embodiment, the t-n-Batch algorithm is used to batch the training data set so that the interactions within one batch can be processed in parallel, while processing the batches in index order preserves the temporal dependencies between interactions.

The t-n-Batch algorithm batches the training data set as follows:

First, N empty batches are initialized; the training data set is then traversed and each interaction is assigned to a batch. Let lastU and lastV record, for each user and each item, the largest index of any batch containing it. Taking the interaction (u_i, v_j, t) as an example, lastU[u_i] is the largest batch index for user u_i: the batch with index lastU[u_i] contains an interaction involving u_i, and no batch with a larger index does. Likewise, lastV[v_j] is the largest batch index for item v_j, and idxN is the largest index among the batches containing any neighbor node of user u_i or item v_j. Because each node may appear at most once in a batch, and the i-th and (i+1)-th interactions of a node must be assigned to batches B_k and B_l with k < l, the interaction (u_i, v_j, t) is assigned to the batch with index max(lastU[u_i], lastV[v_j], idxN) + 1. After all interactions have been assigned, the remaining empty batches are removed, leaving C batches in total.
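The batching rule above can be sketched in a few lines of Python. This is a minimal illustration, not the patent's implementation: the function name `tn_batch`, the `("u", id)` / `("v", id)` node keys, and the `neighbors_of` callback are assumptions introduced here, since the text does not say how neighbor nodes are looked up. A dictionary of batches stands in for "initialize N empty batches, then drop the empty ones".

```python
from collections import defaultdict

def tn_batch(interactions, neighbors_of=lambda node: ()):
    """Assign time-ordered (user, item, t) interactions to batches.

    `neighbors_of` is a hypothetical callback returning the neighbor nodes
    of a node as ("u", id) or ("v", id) keys.
    """
    last = defaultdict(int)       # node -> largest batch index so far (0 = none)
    batches = defaultdict(list)   # batch index -> interactions
    for u, v, t in interactions:
        user, item = ("u", u), ("v", v)
        # idxN: largest batch index among neighbors of the user and the item
        idx_n = max(
            (last[n] for node in (user, item) for n in neighbors_of(node)),
            default=0,
        )
        k = max(last[user], last[item], idx_n) + 1
        batches[k].append((u, v, t))
        last[user] = last[item] = k
    # Only non-empty batches exist; C is the length of the returned list.
    return [batches[k] for k in sorted(batches)]
```

Within each returned batch no user or item appears twice, so the interactions of a batch can be processed in parallel, and iterating over the list in order respects every node's interaction order.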

Step 2: Select batches of training samples from the training data set sequentially by index k, where k ∈ {1, 2, …, C}. For each training sample in the batch, perform steps 3-7.

Step 3: For the interaction (u_i, v_j, t), mine from historical interaction information the nodes that have a historical interaction relationship, a common interaction relationship, or an interaction sequence similarity relationship with the interacting nodes, and construct for each of the interacting nodes u_i and v_j a local relationship graph as of just before the current interaction.

In this embodiment, taking node n_i as an example, the local relationship graph consists of the node set, edge set, relation type set, and relation attribute set associated with node n_i. An edge e is defined as a triple indicating that a relationship exists between node n_i and node n_j. The relation type r is one of three types: historical interaction relationship, common interaction relationship, and interaction sequence similarity relationship. The relation attribute is q = (t, w), where t is the time attribute and w is the weight attribute.

The relationships are derived as follows:

1) Historical interaction relationship

If two nodes have interacted in the past, a historical interaction relationship exists between them. Its time attribute t is the time of the two nodes' most recent interaction, and its weight attribute w is the number of times they have interacted.

2) Common interaction relationship

If two nodes have both interacted with the same node within a time period T, a common interaction relationship exists between them. Its time attribute t is the time of the two nodes' most recent common interaction, where the time of a common interaction is the most recent of the times at which the two nodes interacted with the same node; its weight attribute w is the number of historical common interactions.

3) Interaction sequence similarity relationship

Treat the set of all interaction sequences as a "document", each interaction sequence as a "sentence", and each node in an interaction sequence as a "word". Embedding the user interaction sequences and the item interaction sequences separately with the Doc2Vec model yields an interaction-sequence-based representation for each user and for each item.

Because users and items keep interacting, the Doc2Vec model is updated by incremental training to obtain new interaction-sequence-based representations of users and items. Given the interaction-sequence-based representations of two nodes of the same type (two users or two items) n_i and n_j, their cosine similarity is computed as the dot product (·) of the two representations divided by the product of their norms.

A threshold μ is set, and an interaction sequence similarity relationship exists between two nodes only when their cosine similarity cosSim is greater than μ. The time attribute t of the relationship is the time of the most recent interaction in the two nodes' interaction sequences, and the weight attribute w is the cosine similarity.
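The thresholding rule for the interaction sequence similarity relationship can be illustrated as follows. This is a sketch under stated assumptions: the embeddings are taken as given (in the patent they come from an incrementally trained Doc2Vec model, which is not reproduced here), and the names `cos_sim` and `similarity_edges` are hypothetical. Only the weight attribute w is produced; the time attribute t would come from the interaction sequences themselves.

```python
import math

def cos_sim(a, b):
    """Cosine similarity: dot product divided by the product of the norms."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def similarity_edges(embeddings, mu):
    """Return {(n_i, n_j): w} for same-type node pairs with cosSim > mu.

    `embeddings` maps a node id to its sequence-based vector (assumed given).
    """
    nodes = sorted(embeddings)
    edges = {}
    for i, a in enumerate(nodes):
        for b in nodes[i + 1:]:
            w = cos_sim(embeddings[a], embeddings[b])
            if w > mu:          # strict inequality, as in the text
                edges[(a, b)] = w
    return edges
```

For example, with two users whose vectors point in the same direction and a third that is orthogonal to both, only the first pair receives an edge for any μ between 0 and 1.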

Once the nodes that have a historical interaction relationship, a common interaction relationship, or an interaction sequence similarity relationship with the interacting nodes have been mined by the above derivation, the local relationship graphs of the interacting nodes u_i and v_j as of just before the current interaction can be constructed.
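The derivation of the first two relationship types can be sketched as below. Several details are assumptions made for illustration only: nodes are keyed as `("u", id)` or `("v", id)`, the window for the common interaction relationship is read as "both interactions fall inside [now - T, now]", and the weight counts the other node's in-window interactions with shared counterparts. The patent text leaves these choices open, and the function names are hypothetical.

```python
def historical_edges(history, node):
    """history: time-ordered (user, item, t) samples before the current interaction.
    Returns {neighbor: (t_last, count)} for the historical interaction relation."""
    edges = {}
    for u, v, t in history:
        pair = {("u", u), ("v", v)}
        if node in pair:
            (other,) = pair - {node}
            _, count = edges.get(other, (None, 0))
            edges[other] = (t, count + 1)    # latest time, running count
    return edges

def common_edges(history, node, now, T):
    """Returns {neighbor: (t_last, count)} for the common interaction relation,
    assuming both interactions must lie in the window [now - T, now]."""
    side = 0 if node[0] == "u" else 1        # position of `node` in a sample
    recent = [(("u", u), ("v", v), t) for u, v, t in history if now - T <= t <= now]
    mine = {}                                # counterpart -> latest time for `node`
    for rec in recent:
        if rec[side] == node:
            mine[rec[1 - side]] = rec[2]
    edges = {}
    for rec in recent:
        other, shared, t = rec[side], rec[1 - side], rec[2]
        if other != node and shared in mine:
            t_last, count = edges.get(other, (float("-inf"), 0))
            # time of a common interaction: the later of the two interactions
            edges[other] = (max(t_last, t, mine[shared]), count + 1)
    return edges
```

Together with the similarity edges, these dictionaries give the edge set, relation types, and relation attributes q = (t, w) of a node's local relationship graph.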

Step 4: From the local relationship graphs of u_i and v_j, obtain the neighbor-based representation of user u_i and the neighbor-based representation of item v_j through hierarchical multi-relation-aware aggregation.

In this embodiment, hierarchical multi-relation-aware aggregation comprises two levels of aggregation: intra-relation aggregation and inter-relation aggregation. FIG. 3 is a schematic diagram of hierarchical multi-relation-aware aggregation.

To simplify computation, a neighbor node's representation after its last interaction is taken as the interaction influence it propagates. Taking node n_i as an example, construct the local relationship graph of this node as of just before the current interaction. If node n_i is a user, corresponding to user u_j, its representation after the last interaction is that user's representation; if node n_i is an item, corresponding to item v_j, it is that item's representation. To simplify notation, the representation of node n_i after its last interaction is written with a single symbol regardless of node type. When node n_i interacts, the inputs are the interaction influences propagated by those neighbor nodes in its local relationship graph that interacted during the interval between n_i's last interaction and the current one, that is, those neighbors' representations after their interactions, where M is the number of neighbor nodes that interacted. Hierarchical multi-relation-aware aggregation then proceeds as follows:

The first level is intra-relation aggregation. It aggregates the interaction influence that neighbor nodes propagate along a single relationship type, assigning each neighbor node a weight, to obtain the node's neighbor representation for that relationship type. To distinguish the relationship types between nodes, three multi-head attention mechanisms with K heads each and with different parameters perform intra-relation aggregation for the historical interaction relationship, the common interaction relationship, and the interaction sequence similarity relationship, respectively. This yields three neighbor representations of node n_i: one based on the historical interaction relationship, one based on the common interaction relationship, and one based on the interaction sequence similarity relationship.

For a neighbor node n_j of a given node n_i, the multi-head attention mechanism takes an input associated with n_j, from which the input to the attention mechanism of the k-th head is computed using the k-th head's input parameter matrix; this matrix is the same for all relation types. From this per-head input of the neighbor node, the attention coefficient of the k-th head is then computed.

In the attention coefficient, the attention weight matrix of the k-th head differs across relation types; ^T denotes matrix transpose and ‖ denotes vector concatenation. The coefficient also includes a weight associated with the relation attribute q, computed as shown in formula (4).

The relation attribute q = (t, w) is fed into a fully connected layer to obtain this weight. If the relation type r is the historical interaction relationship, t is the time of the most recent interaction between node n_i and neighbor node n_j, and w is the number of times the two nodes have interacted. If r is the common interaction relationship, t is the time of the two nodes' most recent common interaction and w is the number of historical common interactions. If r is the interaction sequence similarity relationship, t is the time of the most recent interaction in the two nodes' interaction sequences and w is the cosine similarity. In formula (4), W_feat is the parameter matrix of the fully connected layer and b_feat is its bias; the layer is shared by all relation types and all heads.

The weights are normalized over all neighbor nodes of the given node n_i whose relation type is the historical interaction relationship, which gives the normalized attention coefficient of the k-th head for neighbor node n_j. The normalization runs over the set of neighbor nodes of n_i whose relation type is the historical interaction relationship. Based on the above, the hidden vector of the k-th head is computed.

The hidden vectors obtained for the given node n_i from the K heads are averaged to give the neighbor representation of node n_i based on the historical interaction relationship, as shown in formula (7).
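Because formulas (2) to (7) did not survive in this text, the following is only an illustrative sketch of intra-relation aggregation for one relation type, written in the style of a graph attention layer. The scoring rule (a LeakyReLU of a weight vector applied to the concatenated projections, scaled by the relation-attribute weight) and every name in the code are assumptions, not the patent's formulas. What is taken from the text is the overall shape: a per-head input projection, an attention coefficient that uses concatenation and a weight derived from q, normalization over the neighbors of one relation type, and a mean over the K heads.

```python
import math

def matvec(W, x):
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

def softmax(scores):
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def intra_relation_aggregate(h_i, neighbors, heads):
    """neighbors: list of (h_j, q_weight) for one relation type.
    heads: list of (W_in, a), one pair per attention head.

    Assumed scoring: score_j = q_weight * LeakyReLU(a . [W_in h_i || W_in h_j]).
    """
    head_outputs = []
    for W_in, a in heads:
        z_i = matvec(W_in, h_i)
        z = [matvec(W_in, h_j) for h_j, _ in neighbors]
        scores = []
        for z_j, (_, q_w) in zip(z, neighbors):
            e = sum(ai * xi for ai, xi in zip(a, z_i + z_j))   # concat, then dot
            scores.append(q_w * (e if e > 0 else 0.2 * e))
        alpha = softmax(scores)                                 # normalize over neighbors
        head_outputs.append(
            [sum(al * z_j[d] for al, z_j in zip(alpha, z)) for d in range(len(z_i))]
        )
    K = len(head_outputs)
    # Mean over the K heads, as the text states for formula (7).
    return [sum(h[d] for h in head_outputs) / K for d in range(len(head_outputs[0]))]
```

Running the same function three times, each with its own attention parameters and neighbor set, would give the three relation-specific neighbor representations.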

Replacing the parameters specific to the historical interaction relationship with those of the common interaction relationship and applying formulas (2) to (7) gives the neighbor representation based on the common interaction relationship. Likewise, replacing them with the parameters of the interaction sequence similarity relationship and applying formulas (2) to (7) gives the neighbor representation based on the interaction sequence similarity relationship. The relation-specific parameters are the relation attribute q, the neighbor node set of each relation type, and the attention weight matrix.

The second level is inter-relation aggregation. Because the interaction influence propagated along different relationship types differs in importance for a given node, a self-attention mechanism assigns each relationship type a weight. Given node n_i, intra-relation aggregation produces its neighbor representations for the different relationship types, and these are aggregated by the self-attention mechanism to obtain the neighbor-based representation of node n_i.

The neighbor representations of node n_i based on the historical interaction relationship, the common interaction relationship, and the interaction sequence similarity relationship are concatenated to form the input H of the self-attention mechanism. The query matrix Q, key matrix K, and value matrix V of the self-attention mechanism are computed as follows:

Q = HW_Q (8)

K = HW_K (9)

V = HW_V (10)

where W_Q, W_K, and W_V are the query weight matrix, key weight matrix, and value weight matrix, respectively. The output Z of the self-attention mechanism is given by formula (11), which includes a scaling factor, with d_k = d_v.
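Formula (11) itself is missing from this text. The sketch below assumes it is the standard scaled dot-product attention, Z = softmax(QK^T / sqrt(d_k)) V, which matches the query, key, and value matrices of formulas (8) to (10) and the mention of a scaling factor; that identification is an assumption. H is taken here to have one row per relation type, and the helper names are hypothetical.

```python
import math

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def softmax(row):
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    s = sum(exps)
    return [e / s for e in exps]

def self_attention(H, W_Q, W_K, W_V):
    """Scaled dot-product self-attention over the rows of H (one row per relation type)."""
    Q, K, V = matmul(H, W_Q), matmul(H, W_K), matmul(H, W_V)   # formulas (8)-(10)
    d_k = len(K[0])
    K_T = [list(col) for col in zip(*K)]
    scores = [[s / math.sqrt(d_k) for s in row] for row in matmul(Q, K_T)]
    return matmul([softmax(row) for row in scores], V)         # assumed form of (11)
```

Each row of Z is a weighted mix of the three relation-specific representations, so the attention weights play the role of per-relation importance described above.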

The output Z of the attention mechanism is fed into a fully connected layer to obtain the neighbor-based representation of node n_i, as shown in formula (12), where W_out is the parameter matrix of the fully connected layer and b_out is its bias.

Applying the above hierarchical multi-relation-aware aggregation to the local relationship graphs of u_i and v_j yields the neighbor-based representation of user u_i and the neighbor-based representation of item v_j.

Step 5: From the representation of user u_i after its last interaction and the neighbor-based representation of user u_i, use a fully connected layer to compute the predicted representation of item v_j before the current interaction.

In this embodiment, the computation is given by formula (13), where W₁ and W₂ are the parameter matrices of the fully connected layer and b is its bias.
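Since formula (13) is not reproduced in this text, the following sketch assumes the simplest form consistent with two parameter matrices and one bias: the predicted item representation is W₁ applied to the user's last representation, plus W₂ applied to the user's neighbor-based representation, plus b, with no activation. The function and argument names are hypothetical.

```python
def matvec(W, x):
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

def predict_item_repr(u_prev, u_nbr, W1, W2, b):
    """Assumed form of the prediction layer: W1 @ u_prev + W2 @ u_nbr + b."""
    return [p + q + c for p, q, c in zip(matvec(W1, u_prev), matvec(W2, u_nbr), b)]
```

This is equivalent to a single fully connected layer applied to the concatenation of the two input vectors.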

Step 6: From the representations of user u_i and item v_j after their last interactions, the time intervals between each node's last interaction and the current interaction, and the neighbor-based representations of u_i and v_j, compute the representations of user u_i and item v_j after the current interaction.

As shown in FIG. 2, two recurrent neural network layers, RNN_U and RNN_V, compute the representations of user u_i and item v_j after the current interaction, respectively. The inputs of RNN_U are the representation of user u_i after its last interaction, the representation of item v_j after its last interaction, the neighbor-based representation of user u_i, and the time interval between the user's last interaction and the current interaction. The inputs of RNN_V are the representation of item v_j after its last interaction, the representation of user u_i after its last interaction, the neighbor-based representation of item v_j, and the time interval between the item's last interaction and the current interaction.

RNN_U and RNN_V each have their own network parameters. The two time intervals are each passed through a fully connected layer to obtain their representations, and this fully connected layer is shared across time intervals. All users share RNN_U to update user representations, and all items share RNN_V to update item representations. The hidden states of RNN_U and RNN_V serve as the user and item representations, respectively.
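The update formulas for RNN_U and RNN_V are missing from this text, so the sketch below only illustrates the data flow described above. It assumes a plain Elman-style cell (a tanh of one weight matrix applied to the concatenated inputs, no bias) and a bias-free linear layer `w_time` for the time interval; both choices and all names are assumptions.

```python
import math

def rnn_step(W, parts):
    """One assumed recurrent step: tanh(W @ concat(parts))."""
    x = [xi for part in parts for xi in part]
    return [math.tanh(sum(w * xi for w, xi in zip(row, x))) for row in W]

def update_representations(u_prev, v_prev, u_nbr, v_nbr, dt_u, dt_v, W_U, W_V, w_time):
    # Shared fully connected layer mapping a scalar time interval to a vector.
    f_u = [w * dt_u for w in w_time]
    f_v = [w * dt_v for w in w_time]
    # RNN_U: user state, partner item state, user neighbor representation, time.
    u_new = rnn_step(W_U, [u_prev, v_prev, u_nbr, f_u])
    # RNN_V: item state, partner user state, item neighbor representation, time.
    v_new = rnn_step(W_V, [v_prev, u_prev, v_nbr, f_v])
    return u_new, v_new
```

Note that the two updates are mutually recursive in the sense that each uses the other node's previous state, while W_U is shared by all users and W_V by all items.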

Step 7: Compute the overall loss from the error between the predicted and the true representation of item v_j before the current interaction, together with the regularization losses of user u_i and item v_j.

The representation of item v_j after its last interaction is taken as its true representation before the current interaction. The prediction loss is the mean squared error between the predicted and the true representation of item v_j, which is minimized during training. The overall loss is calculated as follows:

Here, the first term is the prediction loss and the last two terms are regularization terms that keep the user and item representations from changing too much; λ_U and λ_I are scale parameters, and ‖·‖₂ denotes the L2 distance.
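A per-interaction loss with this three-term structure can be sketched as follows. The function name `interaction_loss`, the argument names, and the toy values are hypothetical. Using the squared L2 distance for all three terms is also an assumption: the text only states that the first term is a mean squared error and that ‖·‖₂ denotes the L2 distance.

```python
def squared_l2(a, b):
    """Squared L2 distance between two equally sized vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def interaction_loss(v_pred, u_prev, u_new, v_prev, v_new,
                     lambda_u=1.0, lambda_i=1.0):
    """Prediction loss plus two regularization terms (assumed squared-L2 form).

    v_prev doubles as the true item representation before the current
    interaction, as described in Step 7.
    """
    prediction = squared_l2(v_pred, v_prev)
    user_reg = lambda_u * squared_l2(u_new, u_prev)  # penalize large user drift
    item_reg = lambda_i * squared_l2(v_new, v_prev)  # penalize large item drift
    return prediction + user_reg + item_reg

loss = interaction_loss(
    v_pred=[1.0, 0.0],
    u_prev=[0.0, 0.0], u_new=[0.0, 2.0],
    v_prev=[0.0, 0.0], v_new=[0.0, 0.0],
    lambda_u=0.5, lambda_i=0.5,
)
```

In this toy call the prediction term contributes 1.0, the user regularizer 0.5 × 4.0, and the item regularizer 0.0.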

Step 8: Adjust the network parameters of the whole model according to the loss of all samples in the batch.

The loss of all samples in the batch is calculated from the loss of each sample, where M is the number of samples in the batch. In the present invention, the network parameters of the whole model are adjusted according to this batch loss.

Step 9: Repeat steps 2-8 until all batches of the training data set have participated in model training.

Step 10: If the specified number of training iterations has been reached, training ends; otherwise, return to step 2.

Step 11: Use the parameter-tuned temporal interaction network prediction model to predict the items that a user may interact with.

Based on the user and item representations obtained after training, and taking user u_i as an example: given the representation of user u_i after its last interaction and the neighbor-based representation of user u_i, compute the predicted representation of the item involved in the interaction, as shown in formula (13). Then compute the L2 distance between the predicted item representation and the true representation of every item; the top-K items with the smallest L2 distance are the items that the user may interact with.

The specific embodiments described above explain the technical solutions and beneficial effects of the present invention in detail. It should be understood that they are only the most preferred embodiments of the present invention and are not intended to limit it. Any modifications, additions, and equivalent substitutions made within the scope of the principles of the present invention shall fall within its scope of protection.

Claims

1. A multi-relation-aware temporal interaction network prediction method, characterized by comprising the following steps:

(1) taking each interaction (u_i, v_j, t) between user u_i and item v_j occurring at time t as a sample, constructing a training data set, and dividing the training data set into batches;

(2) for the interaction (u_i, v_j, t), mining, based on historical interaction information, the nodes having historical interaction relationships, common interaction relationships, and interaction sequence similarity relationships with the interaction nodes, and constructing local relationship graphs for the interaction nodes u_i and v_j before the current interaction;

(3) according to the two local relationship graphs, obtaining the neighbor-based representation of user u_i and the neighbor-based representation of item v_j through hierarchical multi-relation-aware aggregation;

(4) according to the representation of user u_i after its last interaction and the neighbor-based representation of user u_i, calculating the predicted representation of item v_j before the current interaction using a fully connected layer;

(5) according to the representations of user u_i and item v_j after their last interactions, the time intervals between the last interaction and the current interaction, and the neighbor-based representations, calculating the representations of user u_i and item v_j after the current interaction using two recurrent neural network layers, respectively;

(6) calculating the overall loss from the error between the predicted representation and the true representation of item v_j before the current interaction, the regularization loss of user u_i, and the regularization loss of item v_j; and adjusting the network parameters of the temporal interaction network prediction model according to the loss of all samples in the batch, until all batches have participated in model training, wherein the temporal interaction network prediction model comprises all fully connected layers and recurrent neural network layers used in steps (2) to (6);

(7) predicting the items that a user may interact with by using the parameter-tuned temporal interaction network prediction model.

2. The method of claim 1, wherein the training data set is batched using a t-n-Batch algorithm.

3. The multi-relation-aware temporal interaction network prediction method of claim 1, wherein the specific process of step (2) is as follows:

the local relationship graph of node n_i consists of the node set, the edge set, the relation type set, and the relation attribute set related to node n_i; an edge e is defined as a triple indicating that a relationship exists between node n_i and node n_j; the relation type is one of three types, namely the historical interaction relationship, the common interaction relationship, and the interaction sequence similarity relationship; and the relation attribute is q = (t, w), where t is a time attribute and w is a weight attribute;

the specific method of multi-relation derivation is as follows:

1) historical interaction relationship

If two nodes have interacted in the past, a historical interaction relationship exists between them; its time attribute t is the time of the two nodes' last interaction, and its weight attribute w is the number of their historical interactions;

2) common interaction relationship

If two nodes have interacted with the same node within a time period T, a common interaction relationship exists between them; its time attribute t is the time of the two nodes' last common interaction, where the time of a common interaction is, among the times at which the two nodes interacted with the same node, the one closest to the current time, and its weight attribute w is the number of their historical common interactions;

3) interaction sequence similarity relationship

All interaction sequences together are regarded as a "document", each interaction sequence as a "sentence", and each node in an interaction sequence as a "word"; after embedding the user interaction sequences and the item interaction sequences separately with a Doc2Vec model, an interaction-sequence-based representation is obtained for each user and for each item;

as interactions between users and items continue to occur, the Doc2Vec model is updated by incremental training to obtain new interaction-sequence-based representations of users and items; given the interaction-sequence-based representations of two nodes n_i and n_j of the same type, the cosine similarity between the two is calculated as follows:

wherein · denotes the dot product;

a threshold μ is set, and an interaction sequence similarity relationship exists between the two nodes only when the cosine similarity is greater than μ; the time attribute t of the interaction sequence similarity relationship is the time of the most recent interaction in the two nodes' interaction sequences, and the weight attribute w is the cosine similarity;

after the nodes having historical interaction relationships, common interaction relationships, and interaction sequence similarity relationships with the interaction nodes have been mined by the above multi-relation derivation method, the local relationship graphs of the interaction nodes u_i and v_j before the current interaction can be constructed.
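Two of the derivation rules in claim 3 can be illustrated with the sketch below: the (t, w) attributes of the historical interaction relationship, and the threshold test on cosine similarity. It is a simplified illustration only. The function names and toy data are hypothetical, the interaction-sequence embeddings are passed in directly as vectors rather than produced by a Doc2Vec model, and the common interaction relationship is not covered.

```python
import math

def historical_relations(interactions):
    """Map each (user, item) pair to (t, w): last interaction time and count."""
    relations = {}
    for user, item, t in interactions:
        last_t, count = relations.get((user, item), (t, 0))
        relations[(user, item)] = (max(last_t, t), count + 1)
    return relations

def cosine_similarity(a, b):
    """Dot product divided by the product of the vector norms."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def has_sequence_similarity(rep_i, rep_j, mu):
    """True only when the cosine similarity strictly exceeds the threshold mu."""
    return cosine_similarity(rep_i, rep_j) > mu

# Toy interaction log of (user, item, time) triples.
interactions = [("u1", "v1", 10), ("u2", "v1", 12), ("u1", "v1", 30), ("u1", "v2", 35)]
hist = historical_relations(interactions)

# Toy stand-ins for interaction-sequence-based representations of two users.
rep_a, rep_b = [1.0, 0.0], [1.0, 1.0]
related_loose = has_sequence_similarity(rep_a, rep_b, mu=0.5)
related_strict = has_sequence_similarity(rep_a, rep_b, mu=0.8)
```

Here `hist[("u1", "v1")]` holds the last interaction time 30 and the count 2, and the two toy vectors have a cosine similarity of about 0.707, so they are related under the looser threshold only.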

4. The multi-relation-aware temporal interaction network prediction method of claim 1, wherein the specific process of step (3) is as follows:

if node n_i is a user, corresponding to user u_j, the representation of the node after its last interaction is that user's representation after the user's last interaction; if node n_i is an item, corresponding to item v_j, it is that item's representation after the item's last interaction; in both cases it is referred to as the representation of node n_i after its last interaction;

when node n_i interacts, given the interaction influence propagated by the neighbor nodes in its local relationship graph that interacted during the time interval between the node's last interaction and the current interaction, namely the representations of those neighbor nodes after their interactions, where M is the number of neighbor nodes that interacted, the specific process of hierarchical multi-relation-aware aggregation is as follows:

the first layer is intra-relation aggregation: the interaction influence propagated by neighbor nodes through the same relation type is aggregated, with a corresponding weight assigned to each neighbor node, to obtain the node's neighbor representation based on that specific relation type; the process is as follows:

for a neighbor node n_j of a given node n_i, given the input of the multi-head attention mechanism for that neighbor node, the input of the k-th attention head is calculated as follows:

wherein the input parameter matrix of the k-th head is the same for different relation types; according to the input of the neighbor node, the attention coefficient of the k-th head is calculated as follows:

wherein the attention weight matrix of the k-th head differs for different relation types, T denotes the matrix transpose, ‖ denotes the vector concatenation operation, and the weight associated with the relation attribute q is calculated as shown in formula (4);

the relation attribute q = (t, w) is input into a fully connected layer to obtain the output value; if the relation type r is the historical interaction relationship, the attribute t is the time of the last interaction between node n_i and neighbor node n_j, and the attribute w is the number of historical interactions between the two nodes; if the relation type r is the common interaction relationship, t is the time of the two nodes' last common interaction and w is the number of their historical common interactions; if the relation type r is the interaction sequence similarity relationship, t is the time of the most recent interaction in the two nodes' interaction sequences and w is the cosine similarity; the calculation formula is as follows:

wherein W_feat is the parameter matrix of the fully connected layer and b_feat is its bias; the fully connected layer is shared by the different heads of the different relation types;

the weights of all neighbor nodes of the given node n_i whose relation type is the historical interaction relationship are normalized to obtain the normalized attention coefficient of the k-th head for neighbor node n_j:

wherein the normalization runs over the set of neighbor nodes of the given node n_i whose relation type is the historical interaction relationship; based on the above calculation, the hidden vector of the k-th head is calculated as follows:

the hidden vectors obtained by the K heads for the given node n_i are averaged to obtain the neighbor representation of node n_i based on the historical interaction relationship, as shown in formula (7):

the relevant parameters for the historical interaction relationship are then replaced with those for the common interaction relationship, and the neighbor representation based on the common interaction relationship is obtained using formulas (2) to (7); similarly, the relevant parameters are replaced with those for the interaction sequence similarity relationship, and the neighbor representation based on the interaction sequence similarity relationship is obtained using formulas (2) to (7); the relevant parameters comprise the relation attribute q, the neighbor node set of each relation type, and the attention weight matrix;

the second layer is inter-relation aggregation: because the interaction influence propagated through different relation types differs in importance to a given node, a self-attention mechanism is used to assign corresponding weights to the different relation types; the specific process is as follows:

for node n_i, the neighbor representation based on the historical interaction relationship, the neighbor representation based on the common interaction relationship, and the neighbor representation based on the interaction sequence similarity relationship are concatenated to obtain the input H of the self-attention mechanism; the query matrix Q, the key matrix K, and the value matrix V of the self-attention mechanism are calculated as follows:

Q = HW_Q (8)

K = HW_K (9)

V = HW_V (10)

wherein W_Q, W_K, and W_V are the query weight matrix, the key weight matrix, and the value weight matrix, respectively; the output Z of the self-attention mechanism is given by formula (11):

wherein formula (11) contains a scale factor, and d_k = d_v;

the output Z of the self-attention mechanism is input into a fully connected layer to obtain the neighbor-based representation of node n_i, as shown in formula (12):

wherein W_out is the parameter matrix of the fully connected layer and b_out is its bias;

according to the local relationship graphs of u_i and v_j, the neighbor-based representation of user u_i and the neighbor-based representation of item v_j are obtained through hierarchical multi-relation-aware aggregation.
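The inter-relation aggregation of claim 4 can be sketched as follows. Formulas (8) to (10) are implemented as written; for the attention output, the sketch assumes the standard scaled dot-product form, softmax(QKᵀ/√d_k)·V, which may differ from formula (11). The function name `relation_self_attention`, the identity weight matrices, and the toy input are hypothetical, and the final fully connected layer of formula (12) is omitted.

```python
import math

def matmul(A, B):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

def softmax(row):
    """Numerically stable softmax over one row of scores."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]

def relation_self_attention(H, W_Q, W_K, W_V):
    """Q = H W_Q, K = H W_K, V = H W_V, then assumed scaled dot-product attention."""
    Q, K, V = matmul(H, W_Q), matmul(H, W_K), matmul(H, W_V)
    d_k = len(K[0])
    scores = matmul(Q, transpose(K))
    weights = [softmax([s / math.sqrt(d_k) for s in row]) for row in scores]
    return matmul(weights, V), weights

# One row per relation type: historical, common, sequence similarity (toy values).
H = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
identity = [[1.0, 0.0], [0.0, 1.0]]
Z, weights = relation_self_attention(H, identity, identity, identity)
```

Each row of `weights` shows how strongly one relation type attends to the three relation-specific neighbor representations, and each row of `Z` is the correspondingly reweighted mixture.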

5. The multi-relation-aware temporal interaction network prediction method of claim 1, wherein the specific process of step (4) is as follows:

according to the representation of user u_i after its last interaction and the neighbor-based representation of user u_i, the predicted representation of item v_j before the current interaction is calculated as shown in formula (13):

wherein W₁ and W₂ are the parameter matrices of the fully connected layer, and b is the bias of the fully connected layer.

6. The multi-relation-aware temporal interaction network prediction method of claim 1, wherein the specific process of step (5) is as follows:

two recurrent neural network layers, RNN_U and RNN_V, are used to calculate the representations of user u_i and item v_j after the current interaction, respectively; the inputs of RNN_U are the representation of user u_i after its last interaction, the representation of item v_j after its last interaction, the neighbor-based representation of user u_i, and the time interval between the user's last interaction and the current interaction; the inputs of RNN_V are the representation of item v_j after its last interaction, the representation of user u_i after its last interaction, the neighbor-based representation of item v_j, and the time interval between the item's last interaction and the current interaction; the specific calculation formulas of RNN_U and RNN_V are as follows:

wherein RNN_U and RNN_V each have their own network parameters, and the two time intervals are each converted into a representation by a fully connected layer that is shared across different time intervals; all users share RNN_U to update user representations, all items share RNN_V to update item representations, and the hidden states of RNN_U and RNN_V serve as the representations of the user and the item, respectively.

7. The method of claim 1, wherein in step (6), the overall loss is calculated as follows:

wherein the first term is the prediction loss and the last two terms are regularization terms that keep the representations of users and items from changing too much; λ_U and λ_I are scale parameters, and ‖·‖₂ denotes the L2 distance.

8. The multi-relation-aware temporal interaction network prediction method of claim 1, wherein the specific process of step (7) is as follows:

given the representation of user u_i after its last interaction and the neighbor-based representation of user u_i, the predicted representation of the item involved in the interaction is calculated; then the L2 distance between the predicted item representation and the true representation of every item is calculated, and the top-K items with the smallest L2 distance are the items that the user may interact with.