CN110555132A

CN110555132A - Denoising autoencoder recommendation method based on attention model

Info

Publication number: CN110555132A
Application number: CN201910742757.0A
Authority: CN
Inventors: 张延华; 王倩雯; 付琼霄; 李萌; 李庆; 陈冰容
Original assignee: Beijing University of Technology
Current assignee: Beijing University of Technology
Priority date: 2019-08-13
Filing date: 2019-08-13
Publication date: 2019-12-10

Abstract

A movie recommendation method based on an attention-model denoising autoencoder, belonging to the technical field of movie recommendation. Among existing recommendation algorithms, autoencoder-based models are widely used because they are fast to compute and easy to implement. However, when the rating matrix is sparse, recommendation accuracy drops sharply, and existing models consider neither auxiliary information nor the differing degrees of attention a user pays to the items in their viewing record. To address these problems, the method combines an attention model with a denoising autoencoder: the attention model learns the user's preferences, which are incorporated into the denoising autoencoder, whose parameters are updated iteratively to predict the user's complete ratings. The method significantly improves the accuracy of the predicted ratings.

Description

An Attention Model-Based Denoising Autoencoder Recommendation Method

Technical Field

The method belongs to the technical field of movie recommendation, and specifically concerns a movie recommendation method that combines an attention model with a denoising autoencoder.

Background

Movies have become an indispensable form of entertainment in daily life. Faced with a huge movie library, however, helping users quickly find movies they like has become the main problem a recommender system must solve.

Traditional movie recommendation methods often recommend according to how frequently a movie is searched for, but users like different kinds of movies, so satisfaction with such systems is low. In recent years, deep learning has achieved great success in fields such as computer vision and translation. It is now also widely applied to recommender systems: because deep learning can capture complex relationships in the data, deep-learning-based recommendation algorithms greatly outperform ordinary algorithms.

Among existing deep learning recommender systems, autoencoder-based recommendation algorithms are the most widely used. However, most of them consider neither auxiliary information nor the fact that users pay different degrees of attention to the items in their viewing records, so recommendation accuracy remains unsatisfactory. To address these shortcomings, the present invention uses an attention model together with an autoencoder.

Summary of the Invention

The technical problem solved by the present invention is that an ordinary denoising autoencoder has low accuracy and ignores the influence on the recommender system of the user's viewing record and of auxiliary information, namely the user's occupation and gender and the movie's category.

To address these problems, the present invention provides a denoising autoencoder recommendation method based on an attention model, with the following steps:

Step 1. Obtain user information, movie information, and the users' ratings of movies from an online public dataset.

Step 2. Process the movie title information with a convolutional neural network.

Step 3. Process the remaining user and movie information and convert it into a user vector and a movie vector, respectively.

Step 4. According to the user ratings, select the ten highest-rated movies of each user as the user preference matrix.

Step 5. Feed the user preference matrix into the attention model to compute the user's preference feature vector.

Step 6. Feed the user ratings into the denoising autoencoder, and add the user auxiliary information vector and the user preference feature vector to the hidden layer of the denoising autoencoder. After 100 iterations, the complete user ratings are obtained.

Step 7. Recommend movies according to the predicted ratings.
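Step 7 is not spelled out further in the text, so the following is only a minimal sketch of one plausible reading: rank the movies a user has not yet rated by predicted rating and return the top few. The function name `recommend`, the exclusion of already-rated movies, the tie-breaking rule, and the example scores are all illustrative assumptions rather than part of the disclosed method.

```python
def recommend(predicted, rated_ids, n=5):
    """Return the n unrated movie IDs with the highest predicted rating.

    `predicted` maps movie ID -> predicted rating; `rated_ids` is the set of
    movies the user has already rated (excluding them is an assumption).
    """
    candidates = [(score, movie_id) for movie_id, score in predicted.items()
                  if movie_id not in rated_ids]
    # Highest score first; ties fall back to the smaller movie ID (assumption).
    candidates.sort(key=lambda pair: (-pair[0], pair[1]))
    return [movie_id for _, movie_id in candidates[:n]]


# Made-up predicted ratings, for illustration only.
predicted = {1193: 4.8, 661: 3.1, 914: 3.0, 2355: 4.5, 1287: 4.5, 594: 2.2}
picks = recommend(predicted, rated_ids={1193, 661, 914}, n=2)
```

Here `picks` holds the two unrated movies with the highest predicted score; the rated movies 1193, 661, and 914 are skipped even though 1193 has the highest score overall.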

The invention uses the attention model to extract the user's preference features and incorporates them into the denoising autoencoder, which makes the user features more accurate and greatly improves the performance of the recommendation algorithm.

Description of the Drawings

Figure 1 is a diagram of the user auxiliary information extraction model.

Figure 2 is a diagram of the movie auxiliary information extraction model.

Figure 3 shows the overall model of the attention-model-based denoising autoencoder recommendation method of the present invention.

Detailed Description

The present invention is described in further detail below with reference to the accompanying drawings.

The present invention provides a denoising autoencoder recommendation method based on an attention model, comprising the following steps:

Step 1. Obtain user information, movie information, and the users' ratings of movies from an online public dataset. The dataset contains 6,040 users, 3,952 movies, and 1,000,209 ratings; ratings are integers from 1 to 5. The user information, movie information, and rating data take the forms shown in the tables below.

User ID | Gender | Age | Occupation | Zip-code
1 | F | 1 | 10 | 48067
2 | M | 56 | 16 | 70072
3 | M | 25 | 15 | 55117
… | … | … | … | …

Table 1. User information

Table 2. Movie information

User ID | Movie ID | Rating | Timestamp
1 | 1193 | 5 | 978300760
1 | 661 | 3 | 978302109
1 | 914 | 3 | 978301968
… | … | … | …

Table 3. Rating data
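Records shaped like Tables 1 and 3 can be loaded with a few lines of Python. The sketch below assumes the `::`-separated text layout used by the public MovieLens 1M files; the names `User`, `Rating`, `parse_user`, and `parse_rating` are illustrative and do not come from the original text.

```python
from typing import NamedTuple


class User(NamedTuple):
    user_id: int
    gender: str      # "F" or "M"
    age: int         # age-category code, e.g. 1, 25, 56
    occupation: int  # occupation-category code
    zip_code: str


class Rating(NamedTuple):
    user_id: int
    movie_id: int
    rating: int      # integer from 1 to 5
    timestamp: int


def parse_user(line: str) -> User:
    # Assumed layout: UserID::Gender::Age::Occupation::Zip-code
    uid, gender, age, occ, zip_code = line.strip().split("::")
    return User(int(uid), gender, int(age), int(occ), zip_code)


def parse_rating(line: str) -> Rating:
    # Assumed layout: UserID::MovieID::Rating::Timestamp
    uid, mid, rating, ts = line.strip().split("::")
    return Rating(int(uid), int(mid), int(rating), int(ts))


first_user = parse_user("1::F::1::10::48067")
first_rating = parse_rating("1::1193::5::978300760")
```

The zip code is kept as a string so that leading zeros survive.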

Step 2. Process movie titles with a convolutional neural network. Irrelevant information such as the release year is removed from each title, and each word is converted into a 32-dimensional vector. Every title is set to a length of 15 words; shorter titles are padded with empty vectors. The resulting title matrix is fed into the convolutional neural network to extract a title feature vector.
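The preprocessing in Step 2 (before the convolutional network itself) can be sketched as follows. This is a minimal illustration under several assumptions: the year appears as a trailing "(1995)"-style suffix, the "empty vector" is all zeros, and previously unseen words receive small random vectors. The helper names `tokenize_title` and `title_matrix` are hypothetical.

```python
import random
import re

TITLE_LEN = 15   # words per title (from Step 2)
EMBED_DIM = 32   # dimensions per word vector (from Step 2)


def tokenize_title(title):
    # Drop a trailing "(1995)"-style year, then keep lowercase word tokens.
    no_year = re.sub(r"\(\d{4}\)\s*$", "", title)
    return re.findall(r"[a-z0-9']+", no_year.lower())


def title_matrix(title, vocab):
    # Look up each word; unseen words get a small random vector (assumption).
    rows = []
    for word in tokenize_title(title)[:TITLE_LEN]:
        if word not in vocab:
            vocab[word] = [random.gauss(0.0, 0.01) for _ in range(EMBED_DIM)]
        rows.append(vocab[word])
    # Pad short titles with all-zero "empty" vectors up to TITLE_LEN rows.
    while len(rows) < TITLE_LEN:
        rows.append([0.0] * EMBED_DIM)
    return rows


vocab = {}
matrix = title_matrix("Toy Story (1995)", vocab)
```

The result is a 15×32 matrix whose first two rows hold the vectors for "toy" and "story" and whose remaining rows are padding; a convolutional network would take this matrix as input.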

Step 3. As shown in Figures 1 and 2, process the other auxiliary information of users and movies, such as user gender, age, and occupation, and movie category. The two gender values F and M are converted into two 16-dimensional vectors; the seven age categories are converted into seven 16-dimensional vectors; the 21 occupation categories are converted into 21 16-dimensional vectors; and the 18 movie categories are converted into 18 32-dimensional vectors. To keep every movie's category matrix the same size, the category length is set to 18, and shorter category lists are padded with empty vectors.

The user ID, occupation, and age information are fed into a fully connected layer to obtain a 200-dimensional user information vector; likewise, the movie ID, category, and title are fed into a fully connected layer to obtain a 200-dimensional movie information vector.
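The user side of Step 3 can be sketched as an embedding lookup followed by one fully connected layer. The table sizes (2, 7, and 21 rows of 16 dimensions) and the 200-dimensional output come from the text; everything else is an assumption made for illustration: the user-ID embedding is omitted because its size is not stated, the age codes follow the MovieLens 1M convention, the layer uses tanh, and the embedding tables are filled with Normal(0, 0.01) values.

```python
import math
import random

random.seed(0)


def random_matrix(rows, cols):
    # Normal(0, 0.01) entries (assumed here; the text states this
    # distribution only for the attention and autoencoder weights).
    return [[random.gauss(0.0, 0.01) for _ in range(cols)] for _ in range(rows)]


# Embedding tables sized as in Step 3: one 16-dimensional row per category.
gender_table = random_matrix(2, 16)
age_table = random_matrix(7, 16)
occupation_table = random_matrix(21, 16)

GENDER_INDEX = {"F": 0, "M": 1}
# MovieLens 1M age codes mapped to the seven age categories (assumed ordering).
AGE_INDEX = {1: 0, 18: 1, 25: 2, 35: 3, 45: 4, 50: 5, 56: 6}

fc_weights = random_matrix(200, 48)  # 48 = 3 embeddings x 16 dimensions
fc_bias = [0.0] * 200


def user_vector(gender, age, occupation):
    # Concatenate the three embeddings, then apply one fully connected layer.
    features = (gender_table[GENDER_INDEX[gender]]
                + age_table[AGE_INDEX[age]]
                + occupation_table[occupation])
    return [math.tanh(sum(w * x for w, x in zip(row, features)) + b)
            for row, b in zip(fc_weights, fc_bias)]


vec = user_vector("F", 1, 10)  # user 1 from Table 1
```

The movie side would follow the same pattern, with the 18×32 category matrix and the title feature vector from Step 2 as inputs.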

Step 4. Every user has rated at least 20 movies. From these, the ten movies with the highest ratings are selected, and the corresponding movie information vectors are assembled into a 10×200 user preference movie matrix.
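The selection in Step 4 amounts to a per-user sort. The sketch below is one way to write it; the text does not say how ties between equally rated movies are broken, so preferring the most recent timestamp is an assumption, as is the function name `top_rated_movies`.

```python
from collections import defaultdict


def top_rated_movies(ratings, k=10):
    """Map each user ID to the IDs of their k highest-rated movies.

    `ratings` is an iterable of (user_id, movie_id, rating, timestamp) tuples.
    Ties are broken by the most recent timestamp, which is an assumption.
    """
    by_user = defaultdict(list)
    for user_id, movie_id, rating, timestamp in ratings:
        by_user[user_id].append((rating, timestamp, movie_id))
    return {
        user_id: [movie_id for _, _, movie_id in sorted(rows, reverse=True)[:k]]
        for user_id, rows in by_user.items()
    }


# The three rows shown in Table 3, with k reduced to 2 for the demo.
sample = [
    (1, 1193, 5, 978300760),
    (1, 661, 3, 978302109),
    (1, 914, 3, 978301968),
]
top = top_rated_movies(sample, k=2)
```

Stacking the 200-dimensional movie information vectors of the selected IDs, in order, gives the 10×200 preference matrix when `k=10`.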

Step 5. As shown in Figure 3, the user preference movie matrix is fed into the attention network, which computes:

$$I'_i = f_{\tanh}(W' I_i + b') \tag{1}$$

$$A_i = f_{\mathrm{softmax}}(I'_i W^A) \tag{2}$$

$$I_\rho = \sum_i A_i I_i \tag{3}$$

where $I_i$ is the information vector of the $i$-th of the user's ten highest-rated movies; $W'$ is the movie weight matrix, a random matrix drawn from a normal distribution with mean 0 and standard deviation 0.01; $b'$ is the movie bias matrix; $f_{\tanh}$ is the tanh function; $I'_i$ is the user's movie feature vector; $W^A$ is the movie feature weight matrix, likewise a random matrix drawn from a normal distribution with mean 0 and standard deviation 0.01; $f_{\mathrm{softmax}}$ is the softmax function; $A_i$ is the attention weight coefficient of the $i$-th movie vector; and $I_\rho$ is the final user preference vector.
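Equations (1) to (3) can be traced in a short Python sketch. The text does not give the shapes of $W'$ and $W^A$, so the code assumes that $W'$ maps each 200-dimensional movie vector to a hidden vector of hypothetical size `att_dim`, that $W^A$ is a vector reducing it to one scalar score per movie, and that the softmax runs across the ten movies. The movie vectors here are random placeholders.

```python
import math
import random


def matvec(matrix, vector):
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def softmax(scores):
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(movies, w_prime, b_prime, w_a):
    """Return (attention weights A, preference vector I_rho) for one user."""
    scores = []
    for movie in movies:
        # Eq. (1): I'_i = tanh(W' I_i + b')
        hidden = [math.tanh(z + b) for z, b in zip(matvec(w_prime, movie), b_prime)]
        # Score inside Eq. (2): I'_i W^A, assumed to be a scalar per movie
        scores.append(sum(h * w for h, w in zip(hidden, w_a)))
    weights = softmax(scores)  # Eq. (2), normalized across the movies
    dim = len(movies[0])
    # Eq. (3): I_rho = sum_i A_i I_i
    pooled = [sum(a * movie[j] for a, movie in zip(weights, movies)) for j in range(dim)]
    return weights, pooled


random.seed(0)
dim, att_dim = 200, 64  # att_dim is a hypothetical attention size
movies = [[random.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(10)]
w_prime = [[random.gauss(0.0, 0.01) for _ in range(dim)] for _ in range(att_dim)]
b_prime = [0.0] * att_dim
w_a = [random.gauss(0.0, 0.01) for _ in range(att_dim)]
weights, preference = attention_pool(movies, w_prime, b_prime, w_a)
```

Because the weights sum to 1, `preference` is a weighted average of the ten movie vectors and keeps their 200 dimensions.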

Step 6. The user ratings are fed into the denoising autoencoder, and the user auxiliary information vector and the user preference feature vector are added to the hidden layer of the denoising autoencoder so that they are trained jointly. The formulas are as follows:

$$X' = X + N \tag{4}$$

$$h = f_{\tanh}(W X' + b) \tag{5}$$

$$h' = h + W^s S + I_\rho \tag{6}$$

$$Y = f_{\tanh}(W^z h' + b^z) \tag{7}$$

where $X$ is the user rating vector; $N$ is the noise vector; $X'$ is the noisy rating vector; $W$ is the weight matrix applied to the noisy ratings, a random matrix drawn from a normal distribution with mean 0 and standard deviation 0.01; $b$ is the bias matrix for the noisy rating vector; $h$ is the hidden layer vector; $S$ is the user auxiliary information vector; $W^s$ is the weight matrix of the auxiliary information vector, a random matrix drawn from a normal distribution with mean 0 and standard deviation 0.01; $h'$ is the processed hidden layer vector; $W^z$ and $b^z$ are the weight matrix and bias matrix applied to the processed hidden layer; and $Y$ is the final complete rating vector.
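A forward pass through equations (4) to (7) can be sketched as follows. Equation (6) adds $I_\rho$ directly to $h$, so the hidden size has to equal the dimension of $I_\rho$; 200 is used for both. Several details are assumptions that the text leaves open: the noise is Gaussian, ratings are rescaled to [0, 1] with 0 for unrated movies (the tanh output lies between -1 and 1), $W^z$ is filled like $W$, and the number of movies is shrunk to 50 to keep the demo fast. All input vectors are random placeholders.

```python
import math
import random


def matvec(matrix, vector):
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def random_matrix(rows, cols):
    # Normal(0, 0.01) entries, as stated for W and W^s (assumed for W^z)
    return [[random.gauss(0.0, 0.01) for _ in range(cols)] for _ in range(rows)]


def dae_forward(x, noise, s, i_rho, params):
    w, b, w_s, w_z, b_z = params
    # Eq. (4): X' = X + N
    x_noisy = [xi + ni for xi, ni in zip(x, noise)]
    # Eq. (5): h = tanh(W X' + b)
    h = [math.tanh(z + bi) for z, bi in zip(matvec(w, x_noisy), b)]
    # Eq. (6): h' = h + W^s S + I_rho
    side = matvec(w_s, s)
    h_prime = [hi + si + pi for hi, si, pi in zip(h, side, i_rho)]
    # Eq. (7): Y = tanh(W^z h' + b^z)
    return [math.tanh(z + bi) for z, bi in zip(matvec(w_z, h_prime), b_z)]


random.seed(0)
n_movies, hidden, side_dim = 50, 200, 200  # n_movies is shrunk for the demo
params = (
    random_matrix(hidden, n_movies),   # W
    [0.0] * hidden,                    # b
    random_matrix(hidden, side_dim),   # W^s
    random_matrix(n_movies, hidden),   # W^z
    [0.0] * n_movies,                  # b^z
)
x = [random.choice([0.0, 0.2, 0.4, 0.6, 0.8, 1.0]) for _ in range(n_movies)]
noise = [random.gauss(0.0, 0.1) for _ in range(n_movies)]  # Gaussian noise (assumption)
s = [random.gauss(0.0, 1.0) for _ in range(side_dim)]
i_rho = [random.gauss(0.0, 1.0) for _ in range(hidden)]
y = dae_forward(x, noise, s, i_rho, params)
```

Training would repeat this pass while an optimizer adjusts the parameters; only the forward computation is shown here.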

The cost function is:

$$L = \|X - Y\|^2 \tag{8}$$

$L$ is the cost function, i.e., the squared error between the input ratings and the output ratings.
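Equation (8) is a plain sum of squared differences, as in the first function below. The second function is a variant that counts only the movies a user actually rated; that masking is a common practice with sparse rating vectors but is an assumption here, not something the text states. The example values are made up.

```python
def squared_error(x, y):
    # Eq. (8): L = ||X - Y||^2, summed over every movie
    return sum((xi - yi) ** 2 for xi, yi in zip(x, y))


def observed_squared_error(x, y, observed):
    # Variant (assumption): only movies the user actually rated contribute
    return sum((xi - yi) ** 2 for xi, yi, seen in zip(x, y, observed) if seen)


x = [5.0, 0.0, 3.0]   # input ratings, 0.0 standing in for "not rated"
y = [4.0, 2.0, 3.5]   # reconstructed ratings
full = squared_error(x, y)
masked = observed_squared_error(x, y, [True, False, True])
```

In this example the unrated second movie contributes 4.0 to `full` but nothing to `masked`, which shows how strongly unrated entries can dominate the unmasked cost.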

After at least 100 iterations, the user's complete ratings are obtained. The candidate parameter optimization methods are Adam and RMSProp, and the parameters are selected from the following ranges: learning rate between 0.00001 and 0.01, training steps between 100 and 500, and hidden-layer nodes between 100 and 500. The parameter settings are varied repeatedly while observing the squared error between the input and output ratings. The squared error reaches its minimum when the optimization method is Adam, the learning rate is 0.00001, the number of training steps is 300, and the number of hidden nodes is 200.
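The parameter sweep described above can be organized as an exhaustive grid search. The sketch below is illustrative only: the grid points are assumed values inside the stated ranges, and `toy_evaluate` is a stand-in objective constructed so that its minimum coincides with the reported setting. It is not a measurement and does not train any model; a real run would replace it with a function that trains the autoencoder and returns the squared error.

```python
import itertools


def grid_search(evaluate, grid):
    """Return (best_loss, best_setting) over every combination in `grid`."""
    names = list(grid)
    best = None
    for values in itertools.product(*(grid[name] for name in names)):
        setting = dict(zip(names, values))
        loss = evaluate(**setting)
        if best is None or loss < best[0]:
            best = (loss, setting)
    return best


# Assumed grid points inside the ranges given in the text.
grid = {
    "optimizer": ["adam", "rmsprop"],
    "learning_rate": [0.00001, 0.0001, 0.001, 0.01],
    "steps": [100, 200, 300, 400, 500],
    "hidden_nodes": [100, 200, 300, 400, 500],
}


def toy_evaluate(optimizer, learning_rate, steps, hidden_nodes):
    # Stand-in for "train the model, return the squared error"; NOT real results.
    penalty = 0.0 if optimizer == "adam" else 1.0
    return penalty + learning_rate + abs(steps - 300) + abs(hidden_nodes - 200)


best_loss, best_setting = grid_search(toy_evaluate, grid)
```

With this grid the search evaluates 2 × 4 × 5 × 5 = 200 combinations, which is the cost to weigh against finer grids when each evaluation is a full training run.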

Experimental Simulation

The present invention is evaluated on the MovieLens 1M dataset, which contains 6,040 users, 3,952 movies, and about 1 million ratings (1,000,209, as stated in Step 1), with at least 20 ratings per user. A portion of the rating data is randomly sampled as the training set. The resulting accuracy is higher than that of ordinary recommendation algorithms. By combining the attention model with the denoising autoencoder, the invention solves the cold-start problem common in recommender systems and improves recommendation performance.

Claims

1. A denoising autoencoder recommendation method based on an attention model, characterized by comprising the following steps:

Step 1, obtaining user information, movie information, and the users' ratings of movies from an online public dataset;

Step 2, processing the movie title information using a convolutional neural network;

Step 3, processing the user and movie information and converting it into a user vector and a movie vector, respectively;

Step 4, selecting, according to the user ratings, the ten highest-rated movies of each user as a user preference matrix;

Step 5, inputting the user preference matrix into an attention model and computing the user's preference feature vector;

Step 6, inputting the user ratings into a denoising autoencoder, adding a user auxiliary information vector and the user preference feature vector to the hidden layer of the denoising autoencoder, and obtaining the complete user ratings after at least 100 iterations;

Step 7, recommending movies according to the predicted ratings.

2. The attention-model-based denoising autoencoder recommendation method according to claim 1, wherein step 3 specifically comprises: converting the gender values F and M into two 16-dimensional vectors; converting the seven age categories into seven 16-dimensional vectors; converting the 21 occupation categories into 21 16-dimensional vectors; and converting the 18 movie categories into 18 32-dimensional vectors, wherein, to ensure that every movie's category matrix has the same size, the movie category length is set to 18 and shorter category lists are padded with empty vectors;

inputting the user ID, occupation, and age information into a fully connected layer to obtain a 200-dimensional user information vector; and inputting the movie ID, category, and title into a fully connected layer to obtain a 200-dimensional movie information vector.

3. The attention-model-based denoising autoencoder recommendation method according to claim 1, wherein step 5 specifically comprises inputting the user preference movie matrix into the attention network to obtain the user preference vector, according to the following formulas:

$$I'_i = f_{\tanh}(W' I_i + b') \tag{1}$$

$$A_i = f_{\mathrm{softmax}}(I'_i W^A) \tag{2}$$

$$I_\rho = \sum_i A_i I_i \tag{3}$$

wherein $I_i$ is the information vector of the $i$-th of the user's ten highest-rated movies, $W'$ is the movie weight matrix, $b'$ is the movie bias matrix, $f_{\tanh}$ is the tanh function, $I'_i$ is the user's movie feature vector, $W^A$ is the movie feature weight matrix, $f_{\mathrm{softmax}}$ is the softmax function, $A_i$ is the attention weight coefficient of the $i$-th movie vector, and $I_\rho$ is the final user preference vector.

4. The attention-model-based denoising autoencoder recommendation method according to claim 1, wherein step 6 specifically comprises inputting the user ratings into the denoising autoencoder and adding the user auxiliary information vector and the user preference feature vector to the hidden layer of the denoising autoencoder for joint iteration, according to the following formulas:

$$X' = X + N \tag{4}$$

$$h = f_{\tanh}(W X' + b) \tag{5}$$

$$h' = h + W^s S + I_\rho \tag{6}$$

$$Y = f_{\tanh}(W^z h' + b^z) \tag{7}$$

wherein $X$ is the user rating vector, $N$ is the noise vector, $X'$ is the noisy rating vector, $W$ is the weight matrix applied to the noisy ratings, $b$ is the bias matrix for the noisy rating vector, $h$ is the hidden layer vector, $S$ is the user auxiliary information vector, $W^s$ is the weight matrix of the auxiliary information vector, $h'$ is the processed hidden layer vector, $W^z$ and $b^z$ are the weight matrix and bias matrix applied to the processed hidden layer, and $Y$ is the final complete rating vector;

the cost function is:

$$L = \|X - Y\|^2 \tag{8}$$

wherein $L$ is the cost function, namely the squared error between the input ratings and the output ratings;

and the complete ratings of the user are obtained after at least 100 iterations.