CN110781401A - A Top-n Item Recommendation Method Based on Collaborative Autoregressive Flow - Google Patents


Info

Publication number
CN110781401A
Authority
CN
China
Prior art keywords
user
item
items
users
matrix
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201911079406.2A
Other languages
Chinese (zh)
Inventor
莫玉华
钟婷
周帆
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
University of Electronic Science and Technology of China
Original Assignee
University of Electronic Science and Technology of China
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by University of Electronic Science and Technology of China
Priority to CN201911079406.2A
Publication of CN110781401A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/953 Querying, e.g. by the use of web search engines
    • G06F16/9535 Search customisation based on user profiles and personalisation

Landscapes

  • Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention provides an item recommendation method based on Collaborative Autoregressive Flow (CAF), which extends autoregressive-flow-based generative models to collaborative filtering for modeling implicit feedback. Through a series of invertible transformations, CAF converts a simple initial density into a more complex density that matches the true distribution of the data, and mines the latent information in the items a user has visited and in the attribute features of users and items. It can therefore learn more expressive latent representations of users and items, and stably and effectively recommend the items each user is most likely to visit next. In addition, the invention adds a collaborative autoregressive flow to the model to learn the true distribution of the latent variables, which reduces the error of traditional variational models (such as the VAE) in session-based recommendation.

Description

A Top-n Item Recommendation Method Based on Collaborative Autoregressive Flow

Technical Field

The invention belongs to the field of neural networks within machine learning. It is a deep-learning method that uses autoregressive flows to mine the latent information in the items a user has visited and in the attribute features of users and items, learns latent representations of users and items, and on that basis uses collaborative filtering to recommend to each user the n items the user is most likely to visit.

Background Art

With the rapid development of Internet technology, daily life increasingly depends on the network, and massive amounts of user-item interaction data have emerged. Using the attribute features of these data, we study how users interact with items, analyze user preferences, and present users with items they may like but have not yet visited. This promotes personalized item recommendation and helps alleviate Internet information overload.

Traditional collaborative filtering methods apply matrix factorization to the user-item rating matrix to predict user preferences. They ignore the attribute information of users and items and can model only linear user-item relationships, not complex ones, so recommendation performance is poor. Another class of methods combines traditional Bayesian inference with uncertainty representations to learn the complex interactions between sparse implicit user-item feedback and auxiliary information. These methods usually assume that the posterior distribution of the real data is Gaussian, but real-world data does not necessarily follow that form. The assumption makes the model too inflexible to match the true posterior distribution and uncertainty, which easily leads to recommendation errors.

To address these problems, the invention proposes a deep-learning method that models users and items with two separate variational autoencoders and adds a collaborative autoregressive flow to the step that encodes the user and item embedding vectors into latent representations. This addresses the low recommendation accuracy and low recommendation efficiency of traditional item recommendation methods, which stem from their inability to model nonlinear user-item interactions and from insufficiently flexible models.

Summary of the Invention

The purpose of the invention is to address the low recommendation accuracy and poor recommendation performance of traditional item recommendation methods. It proposes a representation method based on collaborative autoregressive flow that accurately and efficiently predicts the items a user is most likely to visit, resolving the poor recommendation quality of existing models.

The idea of the invention is to represent users and items as embedding vectors according to which items each user has visited, and then to learn from these embedding vectors with two variational autoencoders (VAEs) to obtain latent representations of users and items. In addition, while obtaining these latent representations the invention adds a variational inference step: a collaborative autoregressive flow flexibly models the nonlinear user-item interactions and learns the true distribution of the user and item data. This largely avoids the error of using a variational autoencoder directly and yields better item recommendations.

Based on the above idea, the invention provides an item recommendation method implemented with a collaborative autoregressive flow, comprising the following steps:

S1, data preprocessing: according to the users' historical item visits in the raw data, divide the data set into a training set, a validation set, and a test set; build an item label matrix R from the user-item visits in the training set; then embed every user and every item in the training set to obtain the user embedding matrix U and the item embedding matrix V;

S2, optimize the model parameters to obtain the optimal item recommendation model: feed the user embedding matrix U and the item embedding matrix V into two different variational autoencoders and introduce a collaborative autoregressive flow into the encoding process to obtain the final latent-variable representation matrices of users and items (Figure BDA0002263463090000021 and Figure BDA0002263463090000022). Combine these final latent-variable representations with the corresponding collaborative information to obtain the final latent representation matrices U′ and V′ of users and items. Feed the vector product of the final latent representations of users and items into a classifier to obtain the first loss, feed the final latent-variable representations into a decoder to obtain the second loss, add the two to form the total loss, and minimize the total loss to obtain the item recommendation model;

S3, recommend items to users: use the trained item recommendation model to make Top-n item recommendations for each user.

In the above training method for the item recommendation model, step S1 processes the raw data set, divides it into training, validation, and test sets, and embeds all users and items in the training set. It comprises the following sub-steps:

S11, divide the data set: for each user, randomly select 70% of the items the user has visited as the training set and 20% as the test set; the remaining 10% form the validation set;
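
The per-user split in S11 can be sketched as follows. This is a minimal illustration, not the patent's implementation: the function name `split_user_items`, the fixed seed, and the rounding rule for users whose item count is not a multiple of 10 are all assumptions.

```python
import random

def split_user_items(user_items, seed=0):
    """Split each user's visited items into 70% train, 20% test, 10% validation.

    The rounding behaviour for small item lists is an assumption; the source
    only states the 70/20/10 proportions.
    """
    rng = random.Random(seed)
    train, test, valid = {}, {}, {}
    for user, items in user_items.items():
        shuffled = list(items)
        rng.shuffle(shuffled)                      # random selection per user
        n_train = int(round(0.7 * len(shuffled)))
        n_test = int(round(0.2 * len(shuffled)))
        train[user] = shuffled[:n_train]
        test[user] = shuffled[n_train:n_train + n_test]
        valid[user] = shuffled[n_train + n_test:]  # whatever remains
    return train, test, valid

# Hypothetical interaction history: user id -> list of visited item ids.
history = {"u1": list(range(10)), "u2": list(range(100, 120))}
train, test, valid = split_user_items(history)
```

Splitting inside each user's history, rather than across the whole data set, keeps every user represented in all three subsets.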

S12, build the user embedding matrix U, the item embedding matrix V, and the item label matrix R from the user-item visits in the training set, as follows:

S121, construct the user embedding matrix U: represent all users in the training set, U_s = {u_1, …, u_i, …, u_M}, as embedding vectors, and initialize a user embedding matrix (Figure BDA0002263463090000032) from the users' item preferences, where M is the number of users, N is the number of items, and u_ij is an element of the matrix: u_ij = 1 if user u_i has visited item v_j, and u_ij = 0 otherwise. The i-th row of U is the embedding vector u_i of user u_i;

S122, construct the item embedding matrix V: represent all items in the training set, V_s = {v_1, …, v_j, …, v_N}, as embedding vectors. The item embedding matrix is the transpose of the user embedding matrix, V = U^T, and the j-th row of V is the embedding vector v_j of item v_j;

S123, construct the item label matrix R: build an item label matrix (Figure BDA0002263463090000033) from the user-item visits in the training set, where r_ij = u_ij.
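
Steps S121 to S123 can be sketched in a few lines. The function name `build_matrices` and the toy identifiers are hypothetical; only the rules u_ij = 1 for a visited item, V = U^T, and r_ij = u_ij come from the text.

```python
def build_matrices(train, users, items):
    """Build U (M x N), V = U^T (N x M) and R (M x N) from training visits."""
    item_index = {item: j for j, item in enumerate(items)}
    U = [[0] * len(items) for _ in users]
    for i, user in enumerate(users):
        for item in train.get(user, []):
            U[i][item_index[item]] = 1      # u_ij = 1 if user i visited item j
    V = [list(col) for col in zip(*U)]      # item matrix is the transpose of U
    R = [row[:] for row in U]               # label matrix: r_ij = u_ij
    return U, V, R

# Toy example with two users and three items (illustrative data only).
U, V, R = build_matrices({"u1": ["a", "c"], "u2": ["b"]}, ["u1", "u2"], ["a", "b", "c"])
```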

In the above training method, step S2 feeds the user embedding matrix U and the item embedding matrix V obtained in step S1 into two variational autoencoders augmented with the collaborative autoregressive flow, obtains the final latent-variable representation matrices of users and items (Figure BDA0002263463090000035), and from them the final latent representation matrices U′ and V′. It then constructs the first and second losses and minimizes their sum to obtain the optimal item recommendation model. This step comprises the following sub-steps:

S21, encode the user embedding vector u_i and the item embedding vector v_j with two separate variational autoencoders to obtain the initial latent-variable representations of user u_i and item v_j (Figure BDA0002263463090000036 and Figure BDA0002263463090000037); both are d-dimensional vectors;

This step uses an encoder (a multilayer perceptron) to encode the user and item embedding vectors. Each layer is computed as follows:

First layer: Figure BDA0002263463090000038

Second layer: Figure BDA0002263463090000039

……

Layer t: Figure BDA0002263463090000041

In the above formulas, t is the number of layers of the multilayer perceptron; in the invention t = 3. The outputs of the last layer (Figure BDA0002263463090000044 and Figure BDA0002263463090000045) are used to compute the initial latent-variable representations of the user and the item (Figure BDA0002263463090000046 and Figure BDA0002263463090000047), both of which follow Gaussian distributions. The distribution of the user's initial latent-variable representation (Figure BDA0002263463090000048) has mean (Figure BDA0002263463090000049) and variance (Figure BDA00022634630900000410), which gives (Figure BDA00022634630900000411). For the item's initial latent-variable representation (Figure BDA00022634630900000412), the mean is (Figure BDA00022634630900000413) and the variance is (Figure BDA00022634630900000414); the item's mean and variance give (Figure BDA00022634630900000415).

In the per-layer computation of the encoder, (Figure BDA00022634630900000436) is a nonlinear activation function, commonly sigmoid or tanh; the invention uses sigmoid. (Figure BDA00022634630900000416) and (Figure BDA00022634630900000417) are the hidden states obtained by passing the user embedding vector and the item embedding vector through the t-th layer of the network. The matrix (Figure BDA00022634630900000418) and the bias vector (Figure BDA00022634630900000419) are the trainable parameters of the user encoder, where the superscript t marks the parameters of the t-th layer, and (Figure BDA00022634630900000420, Figure BDA00022634630900000421) are the parameters learned to obtain the distribution of the user's initial latent-variable representation (Figure BDA00022634630900000422). Likewise, the matrix terms (Figure BDA00022634630900000423, Figure BDA00022634630900000424) and the bias vector (Figure BDA00022634630900000425) are the trainable parameters of the item encoder, again with the superscript t marking the t-th layer, and (Figure BDA00022634630900000426) are the parameters learned to obtain the distribution of the item's initial latent-variable representation (Figure BDA00022634630900000427). (Figure BDA00022634630900000428) and (Figure BDA00022634630900000429) are standard normal variables used for random sampling;
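
Because the encoder formulas survive only as image placeholders, the following is a hedged sketch of what a three-layer sigmoid encoder with Gaussian sampling might look like. Everything here is an assumption for illustration: the helper names, the layer sizes, the use of a log-variance head, and the sampling rule z_0 = mean + standard deviation × ε with ε drawn from a standard normal (the usual VAE reparameterization).

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def linear(W, b, x):
    """Affine map W x + b for a list-of-lists matrix W."""
    return [sum(w * xi for w, xi in zip(row, x)) + bi for row, bi in zip(W, b)]

def init_layer(rng, n_out, n_in):
    """Small random weights and zero biases (illustrative initialisation)."""
    W = [[rng.gauss(0.0, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    return W, [0.0] * n_out

def encode(x, layers, mu_head, logvar_head, rng):
    """Pass x through sigmoid layers, then sample z0 from N(mu, sigma^2)."""
    h = x
    for W, b in layers:                       # t = 3 hidden layers in this sketch
        h = [sigmoid(a) for a in linear(W, b, h)]
    mu = linear(*mu_head, h)
    logvar = linear(*logvar_head, h)          # log-variance head is an assumption
    eps = [rng.gauss(0.0, 1.0) for _ in mu]   # standard normal noise
    z0 = [m + math.exp(0.5 * lv) * e for m, lv, e in zip(mu, logvar, eps)]
    return z0, mu, logvar

rng = random.Random(0)
n_items, hidden, d = 6, 4, 2                  # hypothetical sizes
layers = [init_layer(rng, hidden, n_items),
          init_layer(rng, hidden, hidden),
          init_layer(rng, hidden, hidden)]
mu_head, logvar_head = init_layer(rng, d, hidden), init_layer(rng, d, hidden)
u_i = [1, 0, 1, 0, 0, 1]                      # one row of U (a user's visits)
z0, mu, logvar = encode(u_i, layers, mu_head, logvar_head, rng)
```

The item encoder would have the same shape with its own parameters and a row of V as input.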

S22, define K invertible autoregressive flows and pass the user's initial latent-variable representation (Figure BDA00022634630900000430) and the item's initial latent-variable representation (Figure BDA00022634630900000431) from the previous step through these K flows. The invertible transformations yield the final latent-variable representations of users and items (Figure BDA00022634630900000433), whose distributions are closer to the true latent-layer data distribution. The distribution is learned as follows (the learning process is the same for users and items, so the subscripts u_i and v_j are omitted here, and z_0 and z_K denote the initial and final latent-variable representations):

Figure BDA0002263463090000051

In the above formula, (Figure BDA0002263463090000052) is a one-dimensional probability density conditioned on the preceding i−1 dimensions of z_K, and d is the dimension of z_K;
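
The formula itself is available only as an image placeholder. Assuming it is the standard autoregressive chain-rule factorization that the description implies (a product of d one-dimensional conditionals), it would read as follows; the index notation z_{K,i} for the i-th dimension of z_K is introduced here for illustration:

```latex
p(z_K) = \prod_{i=1}^{d} p\left(z_{K,i} \mid z_{K,1:i-1}\right)
```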

Obtaining the final latent-variable representations of users and items comprises the following sub-steps:

S221, compute the target distribution p(z_k): to obtain the final latent-variable representation, first compute the d dimensions of z_k, where k indexes the k-th flow. The i-th dimension of z_k is conditioned on the first i−1 dimensions of z_{k−1}. Starting from (Figure BDA0002263463090000053), whose distribution is p(z_{k−1}), compute the d dimensions of z_k and thereby obtain the distribution p(z_k) of z_k,

Figure BDA0002263463090000054

In the above formula, M(·) and S(·) are the neural networks that compute the mean and the variance, respectively. Through S22 we obtain a target distribution p(z_k). Because generating it relies only on the known probability density p(z_{k−1}), the computation parallelizes well on GPUs and is very fast. This step follows the idea of the Inverse Autoregressive Flow (IAF), a special neural network that outputs all values of the mean and standard deviation at once, which facilitates sampling μ_{1:i−1} and σ_{1:i−1};

S222, compute the target distribution (Figure BDA0002263463090000056): for the invertible process of density estimation on p(z_{k−1}), the invention uses the idea of the Masked Autoregressive Flow (MAF) to evaluate the target distribution,

Figure BDA0002263463090000058

Figure BDA0002263463090000059

In the above formulas, M′(·) and S′(·) are special neural networks that compute the mean and the variance using a masked autoregressive flow; in the model they act as the invertible computation;
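
To make the two directions of an autoregressive flow step concrete, here is a minimal sketch under stated assumptions. It uses the standard affine IAF form, in which dimension i of the output is a shift plus a scale times dimension i of the input, with shift and scale computed from the first i−1 input dimensions. The functions `mean_net` and `scale_net` are hypothetical closed-form stand-ins for the networks M(·) and S(·), and the standard normal base density is also an assumption; the patent's exact formulas are not recoverable from the placeholders.

```python
import math

def mean_net(prefix):
    """Hypothetical stand-in for M(.): shift computed from earlier dimensions."""
    return 0.5 * sum(prefix)

def scale_net(prefix):
    """Hypothetical stand-in for S(.): strictly positive scale."""
    return math.exp(0.3 * math.tanh(sum(prefix)))

def iaf_forward(z_prev):
    """One flow step z_{k-1} -> z_k; each dimension depends only on the input,
    so a real implementation can compute all dimensions in parallel."""
    z_next, log_det = [], 0.0
    for i, zi in enumerate(z_prev):
        mu, sigma = mean_net(z_prev[:i]), scale_net(z_prev[:i])
        z_next.append(mu + sigma * zi)
        log_det += math.log(sigma)            # log |det Jacobian| = sum of log scales
    return z_next, log_det

def iaf_inverse(z_next):
    """Recover z_{k-1} from z_k; inherently sequential, since dimension i
    needs the already-recovered dimensions 1..i-1."""
    z_prev = []
    for zi in z_next:
        mu, sigma = mean_net(z_prev), scale_net(z_prev)
        z_prev.append((zi - mu) / sigma)
    return z_prev

def std_normal_logpdf(z):
    return sum(-0.5 * (x * x + math.log(2.0 * math.pi)) for x in z)

z0 = [0.3, -1.2, 0.7]
z1, log_det = iaf_forward(z0)
log_p_z1 = std_normal_logpdf(z0) - log_det    # change-of-variables density of z1
recovered = iaf_inverse(z1)
```

In an IAF the forward pass is cheap and the inverse is sequential; a MAF conditions on the output instead and so makes the opposite trade-off, which is why pairing the two is attractive for sampling and density evaluation.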

S223, steps S221 and S222 produce two estimates of the same target distribution under different conditions, p(z_k) and (Figure BDA00022634630900000510), which carry different biases in modeling the autoregressive flow. To use both output distributions to stabilize training, compute the KL divergence between them,

Figure BDA0002263463090000061

In the above formula, the cross-entropy between the two target distributions p(z_k) and (Figure BDA0002263463090000062) is computed as follows:

Figure BDA0002263463090000063

Next, the entropy (Figure BDA0002263463090000064) is computed as follows:

Figure BDA0002263463090000065
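
The decomposition used in S223, a KL divergence expressed as a cross-entropy minus an entropy, can be checked numerically. The sketch below uses small discrete distributions purely for illustration; the patent works with continuous densities, and the direction of the divergence (which distribution plays the role of p) is an assumption here.

```python
import math

def entropy(p):
    """H(p) = -sum p log p."""
    return -sum(pi * math.log(pi) for pi in p if pi > 0)

def cross_entropy(p, q):
    """H(p, q) = -sum p log q."""
    return -sum(pi * math.log(qi) for pi, qi in zip(p, q) if pi > 0)

def kl_divergence(p, q):
    """KL(p || q) = sum p log(p / q)."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

p = [0.5, 0.3, 0.2]   # illustrative stand-in for one output distribution
q = [0.4, 0.4, 0.2]   # illustrative stand-in for the other
kl = kl_divergence(p, q)
gap = cross_entropy(p, q) - entropy(p)   # equals kl up to rounding
```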

S224, repeat steps S221 to S223 K times to obtain the final latent-variable representations of the user and the item (Figure BDA0002263463090000066 and Figure BDA0002263463090000067);

S23, combine the final latent-variable representations of the user and the item obtained in S22 (Figure BDA0002263463090000068 and Figure BDA0002263463090000069) with the collaborative information of users and items (Figure BDA00022634630900000610) to obtain the final latent representations u_i′ and v_j′ of the user and the item,

Figure BDA00022634630900000612

In the above formula, u_c and v_c are both drawn from a Gaussian distribution and represent the collaborative information of users and items, respectively. These two vectors are optimized along with the model and eventually represent the collaborative information of users and items well;

S24, repeat steps S21 to S23 until the final latent representations of all users and items have been obtained, then concatenate them to form the final latent representation matrices U′ and V′ of users and items;

S25, construct two loss functions L_1 and L_2, add them to form the total loss L, and minimize this total loss to complete training and obtain the item recommendation model. L_1 is the cross-entropy loss between the predicted items and the label items, and L_2 is the reconstruction loss that brings the learned latent-variable distribution closer to the true distribution. This step comprises the following sub-steps:

S251, obtain the first loss L_1: feed the vector product of the final latent representation matrices U′ and V′ obtained in S24 into a classifier built from a multilayer perceptron. The classifier outputs the matrix of probabilities that each user visits each item, which is treated as the users' predicted rating matrix R′ over all items. The cross-entropy loss is then computed between the item label matrix R from S1 and the predicted rating matrix R′,

Figure BDA0002263463090000071

In the above formula, r′_ij is the probability that user u_i visits item v_j, L_1 is the first loss, M is the total number of users, and N is the total number of items;
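
As a rough illustration of the first loss, the sketch below computes a binary cross-entropy between a 0/1 label matrix R and a predicted probability matrix R′. The exact formula is an image placeholder in the source, so the binary form, the plain sum over all M × N entries (rather than an average), and the clipping constant `eps` are assumptions.

```python
import math

def cross_entropy_loss(R, R_pred, eps=1e-12):
    """Binary cross-entropy summed over all user-item pairs."""
    total = 0.0
    for row, row_pred in zip(R, R_pred):
        for r, p in zip(row, row_pred):
            p = min(max(p, eps), 1.0 - eps)   # avoid log(0)
            total -= r * math.log(p) + (1 - r) * math.log(1.0 - p)
    return total

# One user, two items: the first was visited, the second was not.
loss = cross_entropy_loss([[1, 0]], [[0.9, 0.2]])
```

Confident correct predictions push the loss toward zero, while a confident wrong prediction is penalized heavily.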

S252, obtain the second loss L_2: feed the final latent-variable representations of the user and the item obtained in S23 (Figure BDA0002263463090000072 and Figure BDA0002263463090000073) into an invertible decoder, construct the joint distribution of the user and item latent variables, and compute its relative entropy with respect to the joint distribution of the input user and item data to obtain the second loss L_2 (for convenience, the user and item subscripts i and j are dropped in the formula for the second loss);

Figure BDA0002263463090000074

The user posterior q(z_u|u) approximates the true distribution p(u, z_u) of the latent variable z_u, and q(z_u|u) is defined as (Figure BDA0002263463090000075). Similarly, the item posterior q(z_v|v) approximates the true distribution p(v, z_v) of the latent variable z_v, and q(z_v|v) is defined as (Figure BDA0002263463090000076). Here p(u, z_u) and p(v, z_v) are the true distributions of the user and item input data, z_u and z_v are the latent variables of the collaborative autoregressive flow model, u and v are the model inputs, and θ and φ are the parameters of the probability distributions. In the formula above, (Figure BDA0002263463090000077) and (Figure BDA0002263463090000081) are the reconstruction losses, (Figure BDA0002263463090000082) and (Figure BDA0002263463090000083) are constant terms, and the remaining four terms represent the autoregressive flow. Optimizing the second loss L_2 in effect already minimizes (Figure BDA0002263463090000084) from S223. For the detailed derivation, see [van den Oord, A., Li, Y., Babuschkin, I., Simonyan, K., Vinyals, O., Kavukcuoglu, K., van den Driessche, "Parallel wavenet: Fast high-fidelity speech synthesis"];

S253, add the first and second losses to form the total loss L = L_1 + L_2, and minimize it to complete training and obtain the item recommendation model.

In the above method, step S3 uses the item recommendation model trained in S2 to make personalized item recommendations for users, and comprises the following steps:

S31, use the model trained in S2 to obtain the predicted rating matrix R′, then set to 0 the scores at the index positions of the items each user visited in the training set. For example, if user u_i visited item v_j in the training set, set the corresponding element r′_ij of R′ to 0. Then sort each row of the resulting matrix by score in descending order to obtain the matrix R″;

S32, from the matrix R″ obtained in S31, select the n highest-scoring items in each user's row as the final recommendations for that user.
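
Steps S31 and S32 amount to masking already-visited items and taking the n best remaining scores per user. A minimal sketch follows; the function name `recommend_top_n` and the toy scores are illustrative, not taken from the patent.

```python
def recommend_top_n(R_pred, U_train, n):
    """Return, for each user, the indices of the n highest-scoring unseen items."""
    recommendations = []
    for scores, visited in zip(R_pred, U_train):
        # S31: zero out items the user already visited in the training set.
        masked = [0.0 if seen else s for s, seen in zip(scores, visited)]
        # S31: sort item indices by score, highest first.
        ranked = sorted(range(len(masked)), key=lambda j: masked[j], reverse=True)
        # S32: keep the top n.
        recommendations.append(ranked[:n])
    return recommendations

# One user, four items; item 0 was visited during training.
recs = recommend_top_n([[0.9, 0.8, 0.1, 0.6]], [[1, 0, 0, 0]], n=2)
```

Here item 0 has the highest raw score but is excluded because the user has already visited it, so items 1 and 3 are recommended.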

The item recommendation method provided by the invention (Collaboration Autoregressive Filtering, CAF) extends flow-based generative models to collaborative filtering for modeling implicit feedback. By combining the auxiliary information and the collaborative information of users and items, it learns better latent representations of users and items, and it learns the true distribution of the latent variables through the autoregressive flow, which substantially improves recommendation performance.

Compared with the prior art, the item recommendation method provided by the invention has the following beneficial effects:

1. The invention uses two variational autoencoders to learn latent-variable representations of users and items separately, and then uses collaborative filtering to recommend items to users. Many traditional methods learn user preferences from the user-item matrix alone and rarely combine the three aspects of user, item, and user-item interaction, so they cannot fully capture user-item interaction information.

2、本发明在模型编码过程中加入了协同自回归流去学习隐含变量的真实分布,减小了传统的变分模型(比如VAE)在项目推荐问题中的误差;这是因为传统的VAE需要假设一个先验分布,而这个先验分布将带来很大的误差;正则化流则不用假设先验分布,使得学习到的分布更接近于隐含变量的真实分布。2. The present invention adds a collaborative autoregressive flow in the model coding process to learn the true distribution of latent variables, reducing the error of traditional variational models (such as VAE) in item recommendation problems; this is because traditional VAE A prior distribution needs to be assumed, and this prior distribution will bring a large error; the regularization flow does not need to assume a prior distribution, so that the learned distribution is closer to the true distribution of the hidden variables.

3、本发明中的自回归流采用了可逆自回归流(Inverse Autoregressive Flow,IAF)和掩模自回归流(Masked Autoregressive Flow,MAF),能够有效的促进变分推断和数据采样的效率,缩小了隐含变量简单分布和具有复杂分布的实际数据之间的差距。3. The autoregressive flow in the present invention adopts the reversible autoregressive flow (Inverse Autoregressive Flow, IAF) and the masked autoregressive flow (Masked Autoregressive Flow, MAF), which can effectively promote the efficiency of variational inference and data sampling, and reduce the The gap between a simple distribution of latent variables and actual data with a complex distribution is identified.

Description of the Drawings

Figure 1 shows the overall structure of the item recommendation model implemented with collaborative autoregressive flow.

Figure 2 shows the process of variational inference using autoregressive flows.

Figure 3 visualizes the final latent representations of users and items.

Figure 4 visualizes how the initial latent variables of items change as they pass through K autoregressive flows on different datasets.

Figure 5 shows how the prediction results of the item recommendation model (CAF) on different datasets vary with the autoregressive flow parameter K: (a) MovieLens dataset, (b) CiteULike dataset, (c) LastFM dataset. K is the number of invertible functions (autoregressive flows).

Figure 6 shows how the prediction results of the item recommendation model (CAF) on different datasets vary with the number of training epochs.

Explanation of Terms

Variational inference: put simply, variational inference infers a required distribution p from existing data. When p is hard to express and cannot be solved for directly, one looks instead for a distribution q that is easy to express and solve; when the gap between q and p is small, q can replace p as its approximation. The procedure uses a variational autoencoder (VAE) to derive an evidence lower bound (ELBO), and the approximate distribution q is learned by maximizing that bound.
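For reference, the evidence lower bound mentioned above can be written in its standard textbook form. The symbols x (observed data), z (latent variable), q(z | x) (approximate posterior), and p(z) (prior) are the usual VAE notation and are assumptions here, not notation taken from this patent.

```latex
\log p(x) \;\ge\; \mathbb{E}_{q(z \mid x)}\bigl[\log p(x \mid z)\bigr]
\;-\; \mathrm{KL}\bigl(q(z \mid x)\,\|\,p(z)\bigr)
```

Maximizing the right-hand side over q tightens the bound, which is the sense in which q is learned "by maximizing the evidence lower bound".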

Collaborative filtering: put simply, collaborative filtering uses the preferences of a group with similar interests and shared experience to recommend information a user is likely to be interested in. Through a cooperative mechanism, individuals respond to information to some degree (for example by rating it), and these responses are recorded to filter information and help others screen it. Responses are not limited to items of particular interest; records of items of particular disinterest are equally important. Collaborative filtering can be further divided into rating and social filtering. It later became an important part of e-commerce: based on a customer's past purchases and the purchases of customers with similar behavior, the system recommends items the customer is likely to like, that is, it provides personalized recommendations of information and products based on community preferences. Beyond recommendation, mathematical methods have been developed in recent years that let the system compute the strength of preferences automatically and separate the useful from the useless, giving the filtered content a firmer basis. The result may not be perfectly accurate, but the added preference-strength ratings have widened the concept's applicability: besides e-commerce, it is used in information retrieval, online personal media libraries, personal bookshelves, and other fields.

Detailed Description of the Embodiments

The present invention is further described below with reference to the accompanying drawings.

Embodiment: training the item recommendation model implemented with collaborative autoregressive flow

This embodiment uses three real datasets, MovieLens, CiteULike, and LastFM (available from https://grouplens.org/datasets/movielens/1M/, http://www.citeulike.org/, and http://www.lastfm.com/, respectively), as the objects of study to explain in detail the item recommendation model training method provided by the present invention.

As shown in Figure 1, the item recommendation model (CAF) implemented with collaborative autoregressive flow in this embodiment consists mainly of an encoder, an autoregressive flow layer, a decoder, and a classifier. The data are first preprocessed: for the MovieLens dataset, records in which users rated movies from 1 to 5 are kept and all other records are deleted; for the CiteULike dataset, users who accessed fewer than ten articles are excluded; for the LastFM dataset, all data are kept.

Table 1: Dataset statistics

[Table 1 is an image in the original publication: Figure BDA0002263463090000101]

The statistics of the datasets used in this embodiment are shown in Table 1. Each of the three experimental datasets is divided into a training set, a validation set, and a test set: for each user, 70% of the visited items are randomly selected as the training set, 10% as the validation set, and the remaining 20% as the test set.
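The per-user 70/10/20 split described above can be sketched as follows. The function name `split_user_items`, the fixed seed, and the rounding rule for users whose item count is not a multiple of ten are assumptions made for illustration; the patent does not specify them.

```python
import random
from typing import Dict, List, Tuple

UserItems = Dict[int, List[int]]


def split_user_items(
    user_items: UserItems,
    train_frac: float = 0.7,
    valid_frac: float = 0.1,
    seed: int = 0,
) -> Tuple[UserItems, UserItems, UserItems]:
    """Randomly split each user's visited items into train/validation/test parts."""
    rng = random.Random(seed)  # fixed seed (assumed) so the split is reproducible
    train: UserItems = {}
    valid: UserItems = {}
    test: UserItems = {}
    for user, items in user_items.items():
        shuffled = list(items)
        rng.shuffle(shuffled)
        n_train = round(train_frac * len(shuffled))
        n_valid = round(valid_frac * len(shuffled))
        train[user] = shuffled[:n_train]
        valid[user] = shuffled[n_train:n_train + n_valid]
        # Whatever remains (about 20%) becomes the test part.
        test[user] = shuffled[n_train + n_valid:]
    return train, valid, test


# Hypothetical visit histories for two users.
history = {0: list(range(10)), 1: list(range(100, 120))}
train, valid, test = split_user_items(history)
```

Splitting per user rather than over the whole interaction log keeps every user represented in all three parts, which matches the "for each user" wording of the embodiment.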

An item label matrix R is built from the users' item visits, and every user and every item in the input data is embedded to obtain the user embedding matrix U and the item embedding matrix V.

Once the training datasets are built, this embodiment trains on the training-set data according to step S2 to obtain three item recommendation models, one for each dataset. The validation set is used mainly for hyperparameter tuning, and the test set is used for the evaluation discussed below.

As shown in Figure 1, the user embedding matrix U and the item embedding matrix V are first fed into step S2. The multilayer perceptron encoders of S21 produce the initial latent variables of each user and each item. As shown in Figure 2, to obtain the final latent variable representations of users and items, the encoder outputs of S21 are passed to step S22, where K inverse autoregressive flow transformations are applied to them. Then, according to S23, the resulting final latent variables are combined with the Gaussian-distributed collaborative information of users and items to obtain the latent representations of users and items. As the model is updated, these latent representations are continually refined, yielding the final latent representations u_i′ and v_j′. Step S2 is repeated until the model converges, giving the latent representations U′ and V′ of all users and all items.
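The chain of K flow transformations in S22 can be illustrated with a small pure-Python sketch. It uses the generic inverse-autoregressive-flow form z_k[i] = mu_i + sigma_i * z_{k-1}[i], where mu_i and sigma_i depend only on dimensions before i; this form, the linear parameterization through the hypothetical weight matrices `W_MU` and `W_S`, and all function names are assumptions for illustration, not the patent's own formulas.

```python
import math
from typing import List, Sequence, Tuple

Vector = List[float]
Matrix = List[List[float]]


def _shift_and_log_scale(z: Vector, w_mu: Matrix, w_s: Matrix, i: int) -> Tuple[float, float]:
    """Shift and log-scale for dimension i, computed from dimensions 0..i-1 only."""
    mu = sum(w_mu[i][j] * z[j] for j in range(i))
    log_sigma = sum(w_s[i][j] * z[j] for j in range(i))
    return mu, log_sigma


def iaf_step(z: Vector, w_mu: Matrix, w_s: Matrix) -> Tuple[Vector, float]:
    """Apply one flow and return the new vector with log|det Jacobian|."""
    z_next, log_det = [], 0.0
    for i in range(len(z)):
        mu, log_sigma = _shift_and_log_scale(z, w_mu, w_s, i)
        z_next.append(mu + math.exp(log_sigma) * z[i])
        log_det += log_sigma  # triangular Jacobian with diagonal entries sigma_i
    return z_next, log_det


def iaf_inverse(z_next: Vector, w_mu: Matrix, w_s: Matrix) -> Vector:
    """Invert one flow, recovering the input one dimension at a time."""
    z: Vector = []
    for i in range(len(z_next)):
        mu, log_sigma = _shift_and_log_scale(z, w_mu, w_s, i)
        z.append((z_next[i] - mu) * math.exp(-log_sigma))
    return z


def apply_flows(z0: Vector, flows: Sequence[Tuple[Matrix, Matrix]]) -> Tuple[Vector, float]:
    """Pass z0 through all flows, accumulating the log-determinants."""
    z, total = list(z0), 0.0
    for w_mu, w_s in flows:
        z, log_det = iaf_step(z, w_mu, w_s)
        total += log_det
    return z, total


# Hypothetical strictly lower-triangular weights for d = 3 and K = 2 flows.
W_MU = [[0.0, 0.0, 0.0], [0.5, 0.0, 0.0], [0.1, -0.3, 0.0]]
W_S = [[0.0, 0.0, 0.0], [0.2, 0.0, 0.0], [-0.1, 0.4, 0.0]]
flows = [(W_MU, W_S), (W_MU, W_S)]
z0 = [0.3, -1.2, 0.8]
z_k, total_log_det = apply_flows(z0, flows)
```

The accumulated log-determinant is what lets the density of the transformed variable be evaluated from the simple Gaussian density of z0, which is why each transformation must be invertible.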

As shown in the decoder and classifier parts of Figure 1, the vector product of the final latent representation matrices U′ and V′ obtained in S23 is fed into the classifier of step S25, which yields the first loss: the cross-entropy between the item label matrix R and the predicted rating matrix R′. The final latent variables of users and items obtained in S23 are then fed into the invertible decoder of S25 to build the joint distribution of the user and item latent variables, and the second loss is computed according to L_2. The first loss and the second loss are combined into the total loss L, and L is minimized to complete training, giving the item recommendation model.
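The first loss can be sketched as a binary cross-entropy between R and a predicted matrix R′ built from the product of the final latent representations. The sigmoid applied to each dot product, the averaging over all entries, and the function names are assumptions for illustration; the second (decoder) loss is not specified in this passage, so it appears only as a placeholder number.

```python
import math
from typing import List

Matrix = List[List[float]]


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def predict_scores(u_final: Matrix, v_final: Matrix) -> Matrix:
    """R'[i][j] = sigmoid(<u_i', v_j'>): assumed form of the classifier output."""
    return [
        [sigmoid(sum(a * b for a, b in zip(u, v))) for v in v_final]
        for u in u_final
    ]


def cross_entropy_loss(labels: Matrix, scores: Matrix, eps: float = 1e-12) -> float:
    """Mean binary cross-entropy between the label matrix R and the scores R'."""
    total, count = 0.0, 0
    for label_row, score_row in zip(labels, scores):
        for r, p in zip(label_row, score_row):
            p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
            total -= r * math.log(p) + (1.0 - r) * math.log(1.0 - p)
            count += 1
    return total / count


# Hypothetical final latent representations: 2 users, 3 items, d = 2.
u_final = [[1.0, 0.5], [-0.5, 1.0]]
v_final = [[2.0, 0.0], [0.0, 2.0], [-1.0, -1.0]]
labels = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]

first_loss = cross_entropy_loss(labels, predict_scores(u_final, v_final))
second_loss = 0.0  # placeholder for the decoder loss, not modeled in this sketch
total_loss = first_loss + second_loss
```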

Application Example

For the test sets of the three experimental datasets, the item recommendation models trained in the embodiment perform item recommendation in the following steps:

S1′: Using the trained CAF model, obtain the user rating matrix R′, set to 0 the values at the index positions of the items each user visited in the training set, and sort the resulting rating matrix by score in descending order to obtain an ordered rating matrix R″.

S2′: Obtain the predicted items for each user. This step uses the rating-matrix method: based on the rating matrix R″ obtained in step S1′, items are recommended to each user by selecting the top n items by score in R″ (n = 5, 10, 20, 50) as the final recommendation. If these n items include an item that is a true label for the user in the test data, the prediction is counted as correct. In practice, n items are recommended to the user each time, and the recommendation succeeds if the user is interested in any of them. In this way, recommended items are obtained while the accuracy and efficiency of item recommendation are further improved.

The prediction performance of the item recommendation model (CAF) on the test sets is shown in bold in Table 2.

Table 2: Item recommendation results on the three datasets

[Table 2 is an image in the original publication: Figure BDA0002263463090000121]

To further illustrate the prediction performance of the item recommendation method implemented with collaborative autoregressive flow, this application example trains item recommendation models with seven baseline methods (BPR, CDL, CVAE, CVAE-B, MVAE, VAE-AR, CLVAE) on the three experimental datasets and then uses these seven models to recommend to users in the test sets the n items they are most likely to visit next. The results of the seven models are shown in Table 2. The metric R@n is the ratio of correctly predicted items to the user's true items in the test set, that is, recall; P@n is the ratio of correctly predicted items to predicted items in the test set, that is, precision.
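The two metrics defined above can be computed for a single user as follows. The function name `recall_precision_at_n` and the toy lists are illustrative assumptions; how the per-user values are aggregated over the test set is not stated in this passage.

```python
from typing import Iterable, List, Tuple


def recall_precision_at_n(
    recommended: List[int],
    relevant: Iterable[int],
    n: int,
) -> Tuple[float, float]:
    """Return (R@n, P@n) for one user's ranked recommendation list."""
    top = recommended[:n]
    truth = set(relevant)
    hits = len(set(top) & truth)  # correctly predicted items
    recall = hits / len(truth) if truth else 0.0
    precision = hits / len(top) if top else 0.0
    return recall, precision


# Hypothetical ranked list and test-set items for one user.
ranked_items = [2, 3, 1, 0]
test_items = [3, 5]
r_at_2, p_at_2 = recall_precision_at_n(ranked_items, test_items, n=2)
r_at_4, p_at_4 = recall_precision_at_n(ranked_items, test_items, n=4)
```

Increasing n can only keep or raise recall, while precision usually falls, which is why tables of this kind report both metrics at several cutoffs.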

The other methods in the table are as follows:

BPR: a widely used matrix factorization method that optimizes latent factors from implicit feedback with a pairwise ranking objective via stochastic gradient descent. See: S. Rendle, C. Freudenthaler, Z. Gantner, and L. Schmidt-Thieme. 2009. BPR: Bayesian personalized ranking from implicit feedback. In UAI.

CDL: a joint Bayesian model that learns auxiliary information and extracts latent features by combining a stacked denoising autoencoder with collaborative filtering. See: P. Wang, J. Guo, Y. Lan, J. Xu, S. Wan, and X. Cheng. 2015. Learning hierarchical representation model for next basket recommendation. In SIGIR.

CVAE: the first collaborative item recommendation method based on a variational autoencoder; it incorporates item content information into matrix factorization. (CVAE-B in Table 2 is the variant that improves CVAE by combining it with the BPR model.) See: X. Li and J. She. 2017. Collaborative variational autoencoder for recommender systems. In KDD.

MVAE: uses a multinomial conditional likelihood as the prior distribution and does not include auxiliary information for recommendation. See: D. Liang, R. G. Krishnan, M. D. Hoffman, and T. Jebara. 2018. Variational autoencoders for collaborative filtering. In WWW.

VAE-AR: models users' implicit feedback and item auxiliary information with a variational autoencoder and uses a generative adversarial network to extract latent variable representations influenced by the auxiliary information. See: W. Lee, K. Song, and I.-C. Moon. 2017. Augmented variational autoencoders for collaborative filtering with auxiliary information. In CIKM.

CLVAE: a recommendation method based on a conditional variational autoencoder that extends CVAE with a hierarchical variational autoencoder structure. See: W. Lee, K. Song, and I.-C. Moon. 2017. Augmented variational autoencoders for collaborative filtering with auxiliary information. In CIKM.

As the results in Table 2 show, the prediction accuracy of the item recommendation method implemented with collaborative autoregressive flow is higher across the board than that of the existing methods compared.

To explain why the collaborative autoregressive flow improves the prediction results, the latent variables learned by the method of this embodiment are compared visually with those of a general method, as shown in Figure 3. Figure 3 shows that the CAF model clusters better, which is why CAF predicts better than the other methods.

To illustrate that autoregressive flows can approximate the true posterior of the data, this application example provides a visual comparison of the item latent variables of the three datasets for different numbers of autoregressive flows, as shown in Figure 4. In theory, more invertible transformations can approximate more complex distributions, but for our model small values are sufficient. On the MovieLens dataset, the clearest representation is obtained at K = 7, after which the item latent variable representation becomes distorted again. On the CiteULike and LastFM datasets, five autoregressive flow transformations are enough to obtain the best visualization. This visualization matters because visual representations are inherently more interpretable, and the quality of the visualization can be used to guide the training of recommender systems based on autoregressive flows.

To illustrate the effect of the number of autoregressive flows on the prediction results, this application example studies experimentally how K (the number of autoregressive flows) affects recommendation performance; the results (R@10) are shown in Figure 5. Figure 5(a) shows that on the MovieLens dataset the prediction accuracy rises as the number of flows increases and begins to fall once K > 7. Figures 5(b) and 5(c) correspond to the CiteULike and LastFM datasets: accuracy rises while K < 5 and begins to fall when K > 5. The results in Figure 5 are consistent with the visualizations in Figure 4 and also indicate that the more separable the latent variables are, the better CAF's recommendation performance.

To illustrate the effect of the number of training epochs on the prediction results, the CAF model in this application example is trained for 40 epochs on each of the three datasets, using the training sets of MovieLens, CiteULike, and LastFM according to the method given in the embodiment. The trained CAF model then makes predictions on the test sets of the three datasets, recommending to users the items they are most likely to visit next; the results are shown in Figure 6. The figure shows that the best result on MovieLens is obtained at 25 training epochs, while for CiteULike and LastFM the best result is obtained after 30 epochs.

In summary, the present invention implements item recommendation with collaborative autoregressive flow. The collaborative autoregressive flow algorithm addresses the biased inference problem by using Bayesian inference for probabilistic recommendation together with autoregressive flows for flexible posterior approximation. It effectively addresses the lack of user-item information in datasets and mitigates the intractable posterior estimation problem inherent in existing Bayesian recommendation methods. The added autoregressive flows reduce the error introduced by earlier variational inference approaches (such as VAE), so the model can learn the true distributions of users and items and capture the complex interactions between them, which helps greatly in improving prediction accuracy.

Those of ordinary skill in the art will appreciate that the embodiments described here are intended to help the reader understand the principles of the present invention, and that the scope of protection of the present invention is not limited to these specific statements and embodiments. Based on the technical teachings disclosed herein, those of ordinary skill in the art can make various other specific variations and combinations that do not depart from the essence of the present invention, and such variations and combinations remain within its scope of protection.

Claims (4)

1. A Top-n item recommendation method based on collaborative autoregressive flow, characterized by comprising the following steps:
S1, data preprocessing: according to the users' historical item visits in the raw data, divide the dataset into a training set, a validation set, and a test set; build an item label matrix R from the users' item visits in the training set; embed every user and every item in the training set to obtain the user embedding matrix U and the item embedding matrix V;
S2, optimizing the model parameters to obtain the optimal item recommendation model: feed the user embedding matrix U and the item embedding matrix V into separate variational autoencoders, introducing a collaborative autoregressive flow into the encoding process to obtain the final latent variable representation matrices of users and items; combine the final latent variable representations of users and items with their corresponding collaborative information to obtain the final latent representation matrices U′ and V′ of users and items; feed the vector product of the final latent representations of users and items into a classifier to obtain a first loss; feed the final latent variable representations of users and items into a decoder to obtain a second loss; add the first loss and the second loss to form the total loss; and minimize the total loss to obtain the item recommendation model;
S3, recommending items to users: use the trained item recommendation model to make Top-n item recommendations to users.
2. The Top-n item recommendation method based on collaborative autoregressive flow according to claim 1, characterized in that step S1 comprises the following sub-steps:
S11, dividing the dataset: according to the users' item visits in the raw dataset, randomly select 70% of the items each user has visited as the training set, 20% as the test set, and the remaining 10% as the validation set;
S12, building the user embedding matrix U, the item embedding matrix V, and the item label matrix R from the users' item visits in the training set, specifically comprising the following steps:
S121, building the user embedding matrix U: represent all users U_s = {u_1, …, u_i, …, u_M} in the training set as embedding vectors, and initialize a user embedding matrix U = [u_ij] of size M × N according to the users' preferences for items, where M is the number of users, N is the number of items, and u_ij is an element of the user embedding matrix: u_ij = 1 if user u_i has visited item v_j, and u_ij = 0 otherwise; the i-th row of U is the embedding vector u_i of user u_i;
S122, building the item embedding matrix V: represent all items V_s = {v_1, …, v_j, …, v_N} in the training set as embedding vectors; the item embedding matrix is the transpose of the user embedding matrix, that is, V = U^T; the j-th row of V is the embedding vector v_j of item v_j;
S123, building the item label matrix R: build an item label matrix R = [r_ij] from the users' item visits in the training set, where r_ij = u_ij.
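Sub-steps S121 to S123 amount to building one binary visit matrix and reusing it. The sketch below assumes the training interactions arrive as (user index, item index) pairs; the function name `build_matrices` and the toy data are hypothetical.

```python
from typing import Iterable, List, Tuple

Matrix = List[List[int]]


def build_matrices(
    interactions: Iterable[Tuple[int, int]],
    num_users: int,
    num_items: int,
) -> Tuple[Matrix, Matrix, Matrix]:
    """Build U (M x N), V = U^T (N x M), and the label matrix R with r_ij = u_ij."""
    # S121: u_ij = 1 if user i visited item j in the training set, else 0.
    u = [[0] * num_items for _ in range(num_users)]
    for user, item in interactions:
        u[user][item] = 1
    # S122: the item embedding matrix is the transpose of U.
    v = [list(column) for column in zip(*u)]
    # S123: the label matrix copies U entry by entry.
    r = [row[:] for row in u]
    return u, v, r


# Hypothetical training interactions for M = 2 users and N = 3 items.
pairs = [(0, 1), (1, 0), (1, 2)]
U, V, R = build_matrices(pairs, num_users=2, num_items=3)
```

Row i of `U` is the embedding vector of user u_i and row j of `V` is the embedding vector of item v_j, as stated in the claim.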
3. The Top-n item recommendation method based on collaborative autoregressive flow according to claim 1, characterized in that step S2 comprises the following sub-steps:
S21: two variational autoencoders encode the user embedding vector u_i and the item embedding vector v_j, respectively, yielding the initial latent variable representations of user u_i and item v_j; both initial latent variable representations are d-dimensional vectors;
This step uses an encoder, namely a multilayer perceptron, to encode the user and item embedding vectors. The computation proceeds layer by layer, from the first layer through the second layer up to the t-th layer [the per-layer formulas are images in the original publication], where t is the number of layers of the multilayer perceptron; in the present invention t = 3. The last-layer outputs for users and items are used to compute the initial latent variable representations of users and items, respectively. Both initial latent variable representations follow Gaussian distributions: the distribution of the user initial latent variable representation has a mean and a variance computed from the user encoder's last layer, and the mean and variance of the item initial latent variable representation are obtained correspondingly from the item encoder. In the computation of each encoder layer, the activation function is nonlinear; common choices are sigmoid and tanh, and the present invention uses sigmoid. The hidden state representations are those obtained by passing the user embedding vector and the item embedding vector through the t-th neural network layer. The weight matrix and bias vector of each layer are the training parameters of the user encoder, where the superscript t indicates parameters of the t-th layer, and a further pair of parameters must be learned to compute the distribution of the user initial latent variable representation. Likewise, the weight matrix and bias vector of each layer are the training parameters of the item encoder, with the superscript t indicating the t-th layer, together with the parameters that must be learned to compute the distribution of the item initial latent variable representation. The noise variables used for random sampling follow the standard normal distribution;
S22, define K invertible autoregressive flows, and feed the user's initial latent variable representation obtained in the previous step
Figure FDA00022634630800000312
and the item's initial latent variable representation
Figure FDA00022634630800000313
into these K autoregressive flows for invertible transformation, learning the final latent variable representations of the user and the item
Figure FDA00022634630800000314
and
Figure FDA00022634630800000315
so that their distributions are closer to the true latent data distribution. Since the learning process is the same for the user and the item, the subscripts u_i and v_j are omitted here, and the unified symbols z_0 and z_K denote the initial and the final latent variable representation, respectively. The process of learning the distributions of
Figure FDA00022634630800000316
and
Figure FDA00022634630800000317
is as follows:
In the above formula,
Figure FDA00022634630800000319
is a one-dimensional probability density conditioned on the preceding i-1 dimensions of z_K, and d denotes the dimension of z_K;
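Read together with this explanation, the density has the usual autoregressive product form. As a sketch only, and assuming that a superscript i denotes the i-th dimension of z_K (this notation is an assumption, not taken from the claim), it can be written as:

```latex
p(z_K) = \prod_{i=1}^{d} p\left(z_K^{i} \mid z_K^{1:i-1}\right)
```

Each factor is a one-dimensional density, so the joint density over d dimensions is the product of d conditionals.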
The process of finding the final latent variable representations of the user and the item includes the following sub-steps:
S221, find the target distribution p(z_k): in obtaining the final latent variable representation, we first obtain the d dimensions of z_k, where k indicates that the representation has passed through the k-th flow. The i-th dimension of z_k is conditioned on the first i-1 dimensions of z_{k-1}. Given
Figure FDA00022634630800000320
whose distribution is p(z_{k-1}), the d dimensions of z_k are calculated to obtain the distribution p(z_k) of z_k,
Figure FDA00022634630800000322
In the above formula, M(·) and S(·) are the neural networks that compute the mean and the variance, respectively. Through S22 we obtain a target distribution p(z_k). Because generating the target distribution relies only on the known probability density p(z_{k-1}), the computation parallelizes well on a GPU and is therefore very fast. This process mainly uses the idea of Inverse Autoregressive Flow (IAF), a special neural network that outputs all values of the mean and standard deviation simultaneously, which facilitates sampling with μ_{1:i-1} and σ_{1:i-1};
S222, find the target distribution
Figure FDA0002263463080000041
For the inverse process of density estimation of p(z_{k-1}), the present invention uses the idea of Masked Autoregressive Flow (MAF) to evaluate the target distribution
Figure FDA0002263463080000042
Figure FDA0002263463080000043
Figure FDA0002263463080000044
In the above formulas, M′(·) and S′(·) are special neural networks that compute the mean and the variance using masked autoregressive flow, and they serve as the inverse computation in the model;
S223, steps S221 and S222 yield two estimates of the same target distribution under different conditioning, p(z_k) and
Figure FDA0002263463080000045
However, the two carry different biases in modelling the autoregressive flow. To use both output distributions to stabilize the training process, we compute the KL divergence between the two output distributions,
Figure FDA0002263463080000046
In the above formula, the cross entropy of the two target distributions p(z_k) and
Figure FDA0002263463080000047
is calculated as follows:
Next, the calculation of the entropy
Figure FDA0002263463080000049
is described:
Figure FDA00022634630800000410
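The flow computations of S221 and S222 can be illustrated with a minimal sketch. It assumes the affine form commonly used for inverse autoregressive flow, z_k = mean + std * z_{k-1}, where the mean and std of dimension i depend only on the first i-1 dimensions of z_{k-1}. The strictly lower-triangular linear maps below are hypothetical stand-ins for the networks M(·) and S(·); the sequential inverse is the exact inverse of this sketch and does not reproduce the separate networks M′(·) and S′(·) of the claim.

```python
import numpy as np

def make_conditioner(d, rng, scale=0.1):
    """Hypothetical stand-in for M(.) and S(.).

    Strictly lower-triangular weights make output i depend only on inputs 1..i-1.
    """
    mask = np.tril(np.ones((d, d)), k=-1)
    w_mu = rng.normal(0.0, scale, (d, d)) * mask
    w_s = rng.normal(0.0, scale, (d, d)) * mask
    return w_mu, w_s

def iaf_step(z_prev, w_mu, w_s):
    """One flow: all d dimensions of z_k are computed in parallel from z_{k-1}."""
    mu = w_mu @ z_prev
    log_sigma = w_s @ z_prev
    z_next = mu + np.exp(log_sigma) * z_prev
    # Change of variables: log p(z_k) = log p(z_{k-1}) - sum_i log sigma_i
    return z_next, -np.sum(log_sigma)

def invert_step(z_next, w_mu, w_s):
    """Inverse of iaf_step, recovered one dimension at a time (sequential)."""
    z_prev = np.zeros_like(z_next)
    for i in range(z_next.shape[0]):
        mu_i = w_mu[i] @ z_prev          # only already recovered entries (< i) contribute
        log_sigma_i = w_s[i] @ z_prev
        z_prev[i] = (z_next[i] - mu_i) * np.exp(-log_sigma_i)
    return z_prev

def run_flows(z0, flows):
    """Apply K flows in order; return z_K and the accumulated log-density change."""
    z, delta = z0, 0.0
    for w_mu, w_s in flows:
        z, step = iaf_step(z, w_mu, w_s)
        delta += step
    return z, delta

rng = np.random.default_rng(0)
d, K = 5, 3
flows = [make_conditioner(d, rng) for _ in range(K)]
z0 = rng.standard_normal(d)
zK, log_density_change = run_flows(z0, flows)

# Undo the K flows in reverse order to check invertibility.
z_back = zK
for w_mu, w_s in reversed(flows):
    z_back = invert_step(z_back, w_mu, w_s)
```

The forward direction needs one matrix product per flow, which is why it parallelizes well, whereas the inverse must loop over the d dimensions.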
S224, repeat steps S221 to S223 K times to obtain the final latent variable representations of the user and the item
Figure FDA00022634630800000411
and
S23, combine the final latent variable representations of the user and the item obtained in S22
Figure FDA0002263463080000052
with the collaborative information of the user and the item
Figure FDA0002263463080000055
to obtain the final latent representations u_i′ and v_j′ of the user and the item,
Figure FDA0002263463080000056
In the above formula, u_c and v_c are both drawn from a Gaussian distribution and represent the collaborative information of the user and the item, respectively. As the model is optimized, these two vectors are continuously optimized as well, and they eventually represent the collaborative information of users and items well;
S24, repeat steps S21 to S23 until the final latent representations of all users and items are obtained, and concatenate them to obtain the final latent representation matrices U′ and V′ of the users and items;
S25, construct two loss functions L_1 and L_2, add them together to form the final total loss L, and minimize this total loss to complete the training of the model and obtain the item recommendation model. Here L_1 is the cross-entropy loss between the predicted items and the label items, and L_2 is a reconstruction loss that brings the learned latent variable distribution closer to the true distribution. This step specifically includes the following sub-steps:
S251, obtain the first loss L_1: input the vector product of the final latent representation matrices U′ and V′ of the users and items obtained in S24 into a classifier composed of a multi-layer perceptron, which outputs the matrix of probabilities that each user visits each item; this matrix is regarded as the user rating matrix R′ over all items. Then compute the cross-entropy loss between the item label matrix R obtained in S1 and the predicted rating matrix R′,
Figure FDA0002263463080000057
In the above formula, r′_ij denotes the probability that user u_i visits item v_j; L_1 is taken as the first loss, M is the total number of users, and N is the total number of items;
S252, obtain the second loss L_2: input the final latent variable representations of the user and the item obtained in S23
Figure FDA0002263463080000059
into an invertible decoder to construct the joint distribution of the user and item latent variables, and compute its relative entropy with respect to the joint distribution of the input user and item data. For ease of presentation, the user and item subscripts i and j are dropped in the formula for the second loss, giving the second loss L_2:
Figure FDA0002263463080000061
The true distribution p(u, z_u) of the latent variable z_u is approximated by the user posterior distribution q(z_u|u), which is defined as
Figure FDA0002263463080000062
Similarly, the true distribution p(v, z_v) of the latent variable z_v is approximated by the item posterior distribution q(z_v|v), which is defined as
Figure FDA0002263463080000068
Here p(u, z_u) and p(v, z_v) denote the true distributions of the user and item input data, z_u and z_v denote the latent variables in the collaborative autoregressive flow model, u and v denote the model input data, and θ and φ denote the parameters of the probability distributions. In the above formula,
Figure FDA0002263463080000063
and
Figure FDA0002263463080000064
represent the reconstruction loss,
Figure FDA0002263463080000065
and
Figure FDA0002263463080000066
are constant terms, and the remaining four terms represent the autoregressive flow. Optimizing the second loss L_2 in effect already accomplishes the minimization of the KL divergence in S223; for the detailed derivation, refer to the following paper: [van den Oord, A., Li, Y., Babuschkin, I., Simonyan, K., Vinyals, O., Kavukcuoglu, K., van den Driessche, "Parallel wavenet: Fast high-fidelity speech synthesis"];
S253, add the first loss and the second loss together to form the final total loss L = L_1 + L_2, and minimize this total loss function to complete the training of the model and obtain the item recommendation model.
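The first loss of S251 and its place in the total loss of S253 can be illustrated with a minimal sketch. It assumes a binary cross-entropy summed over all M x N user-item pairs; whether the claimed formula includes the (1 - r) term or any normalization is not stated in the text, so that form is an assumption. A plain sigmoid over U′V′ᵀ stands in for the multi-layer perceptron classifier, and the second loss is a placeholder value, not a computation.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def first_loss(R, R_pred, eps=1e-12):
    """Binary cross-entropy between the label matrix R and the predicted matrix R'."""
    R_pred = np.clip(R_pred, eps, 1.0 - eps)        # avoid log(0)
    return -np.sum(R * np.log(R_pred) + (1.0 - R) * np.log(1.0 - R_pred))

rng = np.random.default_rng(0)
M, N, d = 4, 6, 3                                    # users, items, latent size
U = rng.standard_normal((M, d))                      # stands for U'
V = rng.standard_normal((N, d))                      # stands for V'
R = (rng.random((M, N)) < 0.3).astype(float)         # toy 0/1 label matrix

# Assumption: sigmoid(U' V'^T) replaces the multi-layer perceptron classifier.
R_pred = sigmoid(U @ V.T)

L1 = first_loss(R, R_pred)
L2 = 0.0          # placeholder: the second loss of S252 is not computed here
L = L1 + L2       # total loss of S253
```

In training, L would be minimized with respect to the encoder, flow, and classifier parameters.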
4. The Top-n item recommendation method based on collaborative autoregressive flow according to claim 1, characterized in that step S3 comprises the following sub-steps:
S31, using the model trained in S2, obtain the user rating matrix R′, then set to 0 the ratings at the index positions of the items that the user has visited in the training set; for example, if user u_i has visited item v_j in the training set, the corresponding element r′_ij of the rating matrix R′ is set to 0. Each row of the new rating matrix is then sorted by rating from high to low to obtain the matrix R″;
S32, from the rating matrix R″ obtained in S31, select the n highest-rated items in R″ as the final recommendations for the user.
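Steps S31 and S32 can be illustrated with a minimal sketch. The function name, the boolean mask of training-set visits, and the toy matrices are hypothetical; the sketch returns item indices per user rather than the sorted matrix R″ itself.

```python
import numpy as np

def recommend_top_n(R_pred, train_mask, n):
    """Return, for each user, the indices of the n highest-rated unseen items."""
    # S31: set the rating of every item visited in the training set to 0.
    scores = np.where(train_mask, 0.0, R_pred)
    # S31: sort each row from high to low (stable sort keeps ties in index order).
    order = np.argsort(-scores, axis=1, kind="stable")
    # S32: keep the first n items of each row.
    return order[:, :n]

R_pred = np.array([[0.9, 0.2, 0.7, 0.4],
                   [0.1, 0.8, 0.3, 0.6]])
train_mask = np.array([[True, False, False, False],
                       [False, True, False, False]])
top2 = recommend_top_n(R_pred, train_mask, n=2)
```

Because the predicted ratings are probabilities, zeroing the visited entries pushes those items to the end of each sorted row.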
CN201911079406.2A 2019-11-07 2019-11-07 A Top-n Item Recommendation Method Based on Collaborative Autoregressive Flow Pending CN110781401A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911079406.2A CN110781401A (en) 2019-11-07 2019-11-07 A Top-n Item Recommendation Method Based on Collaborative Autoregressive Flow

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201911079406.2A CN110781401A (en) 2019-11-07 2019-11-07 A Top-n Item Recommendation Method Based on Collaborative Autoregressive Flow

Publications (1)

Publication Number Publication Date
CN110781401A true CN110781401A (en) 2020-02-11

Family

ID=69389888

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911079406.2A Pending CN110781401A (en) 2019-11-07 2019-11-07 A Top-n Item Recommendation Method Based on Collaborative Autoregressive Flow

Country Status (1)

Country Link
CN (1) CN110781401A (en)

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111310048A (en) * 2020-02-25 2020-06-19 西安电子科技大学 News recommendation method based on multilayer perceptron
CN111552881A (en) * 2020-05-09 2020-08-18 苏州市职业大学 Hierarchical Variational Attention Based Sequence Recommendation Method
CN111708937A (en) * 2020-05-27 2020-09-25 西安理工大学 Cross-domain recommendation method based on label transfer
CN112085158A (en) * 2020-07-21 2020-12-15 西安工程大学 Book recommendation method based on stack noise reduction self-encoder
CN112435751A (en) * 2020-11-10 2021-03-02 中国船舶重工集团公司第七一六研究所 Peritoneal dialysis mode auxiliary recommendation system based on variation inference and deep learning
CN114065039A (en) * 2021-11-17 2022-02-18 重庆邮电大学 Mean value pooling operation-based self-encoder recommendation method and system
CN114373537A (en) * 2021-12-06 2022-04-19 云南联合视觉科技有限公司 Diagnosis and treatment scheme recommendation method and device

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20040225509A1 (en) * 2003-05-07 2004-11-11 Olivier Andre Use of financial transaction network(s) information to generate personalized recommendations
CN108320187A (en) * 2018-02-02 2018-07-24 合肥工业大学 A kind of recommendation method based on depth social networks
US20190130281A1 (en) * 2017-10-31 2019-05-02 Microsoft Technology Licensing, Llc Next career move prediction with contextual long short-term memory networks
CN109979429A (en) * 2019-05-29 2019-07-05 南京硅基智能科技有限公司 A kind of method and system of TTS
CN110162709A (en) * 2019-05-24 2019-08-23 中森云链(成都)科技有限责任公司 A kind of personalized arrangement method of the robust of combination antithesis confrontation generation network
CN110232480A (en) * 2019-03-01 2019-09-13 电子科技大学 The item recommendation method and model training method realized using the regularization stream of variation

Non-Patent Citations (8)

* Cited by examiner, † Cited by third party
Title
F ZHOU 等: "Recommendation via Collaborative Autoregressive Flows", 《NEURAL NETWORKS》 *
FAN ZHOU 等: "Variational Session-based Recommendation Using Normalizing Flows", 《THE WORLD WIDE WEB CONFERENCE》 *
常标: "面向在线媒体的信息流动模式分析及流行度预测方法研究", 《中国优秀博硕士学位论文全文数据库(博士)信息科技辑》 *
李宗阳: "基于深度学习的用户行为过程预测方法研究与实现", 《中国优秀博硕士学位论文全文数据库(硕士)信息科技辑》 *
李鹏: "基于高斯混合模型的变分自动编码器", 《中国优秀博硕士学位论文全文数据库(硕士)信息科技辑》 *
胡杰 等: "无线网络中的业务行为及业务容量——概念、模型及发展", 《中国电子科学研究院学报》 *
莫玉华: "基于流和生成网络的推荐系统研究", 《中国优秀博硕士学位论文全文数据库(硕士)信息科技辑》 *
陈辉 等: "自回归预测多级矢量量化线谱频率编码技术", 《西安科技大学学报》 *

Cited By (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111310048A (en) * 2020-02-25 2020-06-19 西安电子科技大学 News recommendation method based on multilayer perceptron
CN111310048B (en) * 2020-02-25 2023-06-20 西安电子科技大学 News recommendation method based on multi-layer perceptron
CN111552881A (en) * 2020-05-09 2020-08-18 苏州市职业大学 Hierarchical Variational Attention Based Sequence Recommendation Method
CN111552881B (en) * 2020-05-09 2024-01-30 苏州市职业大学 Sequence recommendation method based on hierarchical variation attention
CN111708937A (en) * 2020-05-27 2020-09-25 西安理工大学 Cross-domain recommendation method based on label transfer
CN111708937B (en) * 2020-05-27 2022-12-16 北京阅视无限科技有限公司 Cross-domain recommendation method based on label migration
CN112085158A (en) * 2020-07-21 2020-12-15 西安工程大学 Book recommendation method based on stack noise reduction self-encoder
CN112435751A (en) * 2020-11-10 2021-03-02 中国船舶重工集团公司第七一六研究所 Peritoneal dialysis mode auxiliary recommendation system based on variation inference and deep learning
CN114065039A (en) * 2021-11-17 2022-02-18 重庆邮电大学 Mean value pooling operation-based self-encoder recommendation method and system
CN114373537A (en) * 2021-12-06 2022-04-19 云南联合视觉科技有限公司 Diagnosis and treatment scheme recommendation method and device

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
WD01 Invention patent application deemed withdrawn after publication
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20200211