CN110362755A

CN110362755A - A recommendation method based on a hybrid algorithm of item collaborative filtering and association rules

Info

Publication number: CN110362755A
Application number: CN201910667120.XA
Authority: CN
Inventors: 李涛; 李�昊
Original assignee: Nanjing University of Posts and Telecommunications
Current assignee: Nanjing University of Posts and Telecommunications
Priority date: 2019-07-23
Filing date: 2019-07-23
Publication date: 2019-10-22

Abstract

The invention belongs to the technical field of recommendation algorithms and discloses a recommendation method based on a hybrid algorithm of item collaborative filtering and association rules. A historical record data set of user behavior is constructed to obtain the association rules among items; a user-item rating matrix is then constructed to calculate item similarity; an item-content attribute matrix is constructed to calculate the similarity based on item attributes; finally, item ratings are predicted from the similarities and a corresponding recommendation list is generated. The hybrid recommendation method of the invention can effectively address the data sparsity problem encountered in collaborative filtering recommendation algorithms and improves the similarity calculation, thereby improving recommendation quality.

Description

A Recommendation Method Based on a Hybrid Algorithm of Item Collaborative Filtering and Association Rules

Technical Field

The invention belongs to the technical field of recommendation algorithms, and in particular relates to a recommendation method based on a hybrid algorithm of item collaborative filtering and association rules.

Background

With the rapid development of the Internet industry, information is transmitted ever faster, and its variety and volume keep growing. Faced with increasingly complex information, users find it harder to locate the content they like, and searching takes longer and longer. To address this problem, personalized recommendation technology, that is, recommendation algorithms, has developed rapidly. A recommendation algorithm infers user preferences by analyzing the user's historical behavior data, and then proactively recommends content that may interest the user.

Traditional recommendation algorithms include content-based, collaborative-filtering-based, and social-network-based recommendation algorithms. The most widely used at present are those based on collaborative filtering, which divide into user-based and item-based collaborative filtering. Item-based collaborative filtering is currently the most widely used recommendation algorithm in industry; it recommends to users items similar to those they liked before. It calculates the similarity between items by analyzing user behavior records: item A and item B are considered highly similar because most users who like item A also like item B. However, similarity calculated this way does not mean that the two items are similar in their attributes. In addition, collaborative filtering depends heavily on historical data, and its recommendation quality degrades severely when the data is sparse. The present invention improves recommendation quality by combining item ratings predicted with association rules and an item attribute similarity with the item-based collaborative filtering algorithm.

Summary of the Invention

The purpose of the present invention is to provide a recommendation method based on a hybrid algorithm of item collaborative filtering and association rules that solves the data sparsity problem encountered in collaborative filtering recommendation algorithms and improves the similarity calculation, thereby improving recommendation quality. To achieve this purpose, the present invention provides the following technical solution: a recommendation method based on a hybrid algorithm of item collaborative filtering and association rules, comprising the following steps:

S1. Construct a historical record data set and obtain the association rules of items;

S2. Construct a user-item rating matrix and calculate the item similarity;

S3. Construct an item-content attribute matrix and calculate the similarity based on item attributes;

S4. Predict item ratings according to the similarities.

Further, step S1 includes the following specific steps:

S1-1. Construct a historical record data set from each of the user's behavior records; each row of the data set is the set of items that received positive feedback (for example, a purchase) in one behavior record of the user;

S1-2. Find the association rules of the second-order frequent itemsets according to the configured minimum support and minimum confidence.

Further, step S2 includes the following specific steps:

S2-1. Construct a user-item rating matrix from the users' historical records and rating records; each row of the matrix holds one user's ratings of the items, and each column corresponds to one item;

S2-2. Use the association rules from step S1 to predict ratings for the items associated with those on which a user has given positive feedback, and fill these ratings into the user-item rating matrix to obtain a new user-item rating matrix;

S2-3. Calculate the similarity between different items from the filled user-item rating matrix using the cosine similarity formula.

Further, step S3 includes the following specific steps:

S3-1. Construct the item-content attribute matrix for all items; each row of the matrix records whether one item has each content attribute;

S3-2. Calculate the similarity between different items from the item-content attribute matrix using cosine similarity.

Further, step S4 includes the following specific steps:

S4-1. The similarity between item i and item j is:

$$\mathrm{sim}(i,j) = \alpha \frac{|N(i) \cap N(j)|}{\sqrt{|N(i)|\,|N(j)|}} + (1-\alpha) \frac{V(i) \cdot V(j)}{\|V(i)\|\,\|V(j)\|}$$

where sim(i,j) is the similarity between item i and item j, with values in [0,1]; α is the weight, with values in [0.1, 0.9]; |N(i)| is the number of users who like item i, and |N(i)∩N(j)| is the number of users who like both item i and item j; V(i) is the attribute vector of item i;

S4-2. The predicted rating of item i is calculated as:

P_ui denotes user u's predicted rating of item i, S(i,K) is the set of the K items most similar to item i, and r_uj denotes user u's actual rating of item j;

S4-3. Predict a user's rating of an item according to the predicted rating formula.

Compared with the prior art, the beneficial effects of the present invention are:

1. It solves the problem that relying solely on a sparse user-item rating matrix degrades recommendation quality.

2. It improves the accuracy of similarity-based item rating prediction and thereby the recommendation quality.

Description of the Drawings

FIG. 1 is a schematic flowchart of the recommendation method based on a hybrid algorithm of item collaborative filtering and association rules in the present invention;

FIG. 2 compares the MAE of the ratings predicted by the hybrid recommendation algorithm of the present invention with that of the traditional collaborative filtering algorithm.

Detailed Description

The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings. The described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments obtained by those of ordinary skill in the art on the basis of these embodiments without creative effort fall within the protection scope of the present invention.

In addition, the technical solutions of the various embodiments may be combined with one another, provided that the combination can be implemented by those of ordinary skill in the art. When a combination of technical solutions is contradictory or cannot be implemented, that combination should be considered not to exist and falls outside the protection scope claimed by the present invention.

As shown in FIG. 1, step S1 constructs a historical record data set and obtains the association rules of items, and specifically includes the following steps:

1. Construct a historical record data set in which each row is the list of items on which a user has given positive feedback: [ItemID1, ItemID2, ..., ItemIDn].

2. Use the Apriori association rule algorithm to find the association rules of the second-order frequent itemsets that satisfy a minimum support of 0.2 and a minimum confidence of 0.6. The Apriori algorithm consists of two main steps: first, find all frequent itemsets in the data set, that is, the itemsets whose frequency is greater than or equal to the minimum support; then, from the frequent itemsets, find the association rules that satisfy the minimum confidence.

The support is calculated as:

$$\mathrm{Support}(X \rightarrow Y) = \frac{N(X,Y)}{N_{all}}$$

Support(X→Y) is the support, that is, the probability that item set X and item set Y appear together. N(X,Y) is the number of records in which item set X and item set Y appear together, and N_all is the total number of historical records.

The confidence is calculated as:

$$\mathrm{Conf}(X \rightarrow Y) = \frac{N(X,Y)}{N(X)}$$

Conf(X→Y) is the confidence, that is, the probability that item set Y appears given that item set X appears; N(X) is the number of records in which item set X appears.

3. Traverse the historical record data set: first find the first-order frequent itemsets, then the second-order frequent itemsets, and from the second-order frequent itemsets find the association rules whose confidence is greater than or equal to the minimum confidence. The resulting association rules form a data set of triples (ItemIDx, ItemIDy, Confidence), where ItemIDx and ItemIDy are the IDs of items x and y, and Confidence is the confidence of the rule.
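The three steps above can be sketched in a few lines of Python. This is only an illustrative sketch, not the patent's implementation: the function name `mine_pair_rules`, the dictionary output format, and the toy `history` data are assumptions, and only rules between single items (second-order itemsets) are mined.

```python
from collections import Counter
from itertools import combinations


def mine_pair_rules(transactions, min_support=0.2, min_confidence=0.6):
    """Return {(x, y): confidence} for rules x -> y from frequent 2-itemsets."""
    n_all = len(transactions)
    baskets = [set(t) for t in transactions]

    # First-order pass: count single items and keep the frequent ones.
    item_counts = Counter(item for basket in baskets for item in basket)
    frequent = {i for i, c in item_counts.items() if c / n_all >= min_support}

    # Second-order pass: count pairs built only from frequent items.
    pair_counts = Counter()
    for basket in baskets:
        pair_counts.update(combinations(sorted(basket & frequent), 2))

    rules = {}
    for (x, y), count in pair_counts.items():
        if count / n_all < min_support:
            continue
        # A frequent pair can yield a rule in either direction.
        for a, b in ((x, y), (y, x)):
            confidence = count / item_counts[a]
            if confidence >= min_confidence:
                rules[(a, b)] = confidence
    return rules


# Toy history: each row is one record of positively rated items (made-up data).
history = [
    ["A", "B", "C"],
    ["A", "B"],
    ["A", "C"],
    ["B", "D"],
    ["A", "B", "D"],
]
rules = mine_pair_rules(history)
```

With the thresholds from the text (0.2 and 0.6), the pair {A, B} has support 3/5 and yields rules in both directions, whereas A→C is dropped because its confidence is only 2/4.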

S2. Construct a user-item rating matrix and calculate the item similarity, including the following specific steps:

1. From the users' historical behavior records, construct the user-item rating matrix R:

$$R = \begin{bmatrix} r_{11} & r_{12} & \cdots & r_{1n} \\ r_{21} & r_{22} & \cdots & r_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ r_{m1} & r_{m2} & \cdots & r_{mn} \end{bmatrix}$$

m is the number of users and n is the number of items. r_ui (1≤u≤m, 1≤i≤n) is user u's rating of item i; if user u has not rated item i, r_ui is empty.

2. Use the association rule data set from step S1 to predict ratings. Traverse each row of the user-item rating matrix. For each user, if the association rule data set contains an item on which the user has given positive feedback, and the user has not given positive feedback on the item associated with it by the rule, predict the user's rating of that associated item. The predicted rating formula is:

P_ui is user u's predicted rating of item i, S(u) is the set of items that user u has already rated, conf_ji is the confidence of the rule from item j to item i, and r_uj is user u's rating of item j.

3. Fill the user-item rating matrix with the predicted ratings. For each item that a user has not yet rated, use the predicted rating above as that user's rating of the item.

4. Calculate the item similarity using the filled user-item rating matrix. The similarity is calculated as:

$$\mathrm{sim}(i,j) = \frac{|N(i) \cap N(j)|}{\sqrt{|N(i)|\,|N(j)|}}$$

sim(i,j) is the similarity between item i and item j, with values in [0,1]. |N(i)| is the number of users who like item i, and |N(i)∩N(j)| is the number of users who like both item i and item j.

5. For each item, calculate its similarity to every other item, finally obtaining the item similarity matrix S:

$$S = \begin{bmatrix} s_{11} & s_{12} & \cdots & s_{1n} \\ s_{21} & s_{22} & \cdots & s_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ s_{n1} & s_{n2} & \cdots & s_{nn} \end{bmatrix}$$

s_ij (1≤i,j≤n) is the similarity between item i and item j.
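Steps 2 to 5 of S2 can be illustrated with the following Python sketch. The formula for the rule-based prediction is not reproduced in this text, so the sketch assumes a confidence-weighted mean of the user's existing ratings; that choice, the function names, and the nested-dictionary layout are all assumptions. "Users who like item i" is also read here as "users with a real or filled rating for item i".

```python
import math


def fill_with_rules(ratings, rules):
    """Return a copy of `ratings` with rule-predicted scores for unrated items.

    ratings: {user: {item: rating}}; rules: {(j, i): confidence of j -> i}.
    Assumption: P_ui is the confidence-weighted mean of r_uj over rated items j.
    """
    filled = {u: dict(items) for u, items in ratings.items()}
    for u, rated in ratings.items():
        num, den = {}, {}
        for (j, i), conf in rules.items():
            if j in rated and i not in rated:
                num[i] = num.get(i, 0.0) + conf * rated[j]
                den[i] = den.get(i, 0.0) + conf
        for i in num:
            filled[u][i] = num[i] / den[i]
    return filled


def item_similarity(filled):
    """sim(i, j) = |N(i) & N(j)| / sqrt(|N(i)| * |N(j)|) on the filled matrix."""
    users_of = {}
    for u, items in filled.items():
        for i in items:
            users_of.setdefault(i, set()).add(u)
    sim = {}
    for i, users_i in users_of.items():
        for j, users_j in users_of.items():
            if i != j:
                common = len(users_i & users_j)
                sim[(i, j)] = common / math.sqrt(len(users_i) * len(users_j))
    return sim


# Made-up toy data: u1 has not rated B, but rules A -> B and C -> B apply.
ratings = {
    "u1": {"A": 4.0, "C": 2.0},
    "u2": {"A": 5.0, "B": 3.0},
    "u3": {"B": 4.0},
}
rules = {("A", "B"): 0.75, ("C", "B"): 0.25}
filled = fill_with_rules(ratings, rules)
sim = item_similarity(filled)
```

In this toy case the gap for (u1, B) is filled with (0.75·4 + 0.25·2) / (0.75 + 0.25) = 3.5, after which item B shares two users with item A instead of one, which is the intended effect of densifying the matrix before computing similarity.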

S3. Construct an item-content attribute matrix and calculate the similarity based on item attributes. The specific steps are as follows:

1. Analyze each item and construct the item-content attribute matrix W:

$$W = \begin{bmatrix} w_{11} & w_{12} & \cdots & w_{1l} \\ w_{21} & w_{22} & \cdots & w_{2l} \\ \vdots & \vdots & \ddots & \vdots \\ w_{n1} & w_{n2} & \cdots & w_{nl} \end{bmatrix}$$

n is the number of items and l is the number of item attributes. w_ij (1≤i≤n, 1≤j≤l) denotes the j-th attribute of item i: if item i has this attribute, w_ij = 1; otherwise w_ij = 0.

2. Obtain the similarity between items by comparing their attributes. The similarity is calculated as:

$$\mathrm{sim}_c(i,j) = \frac{V(i) \cdot V(j)}{\|V(i)\|\,\|V(j)\|}$$

sim_c(i,j) is the similarity between item i and item j calculated from their content attributes, with values in [0,1]. V(i) is the attribute vector of item i, that is, the i-th row of W.

3. For each item, calculate its similarity to every other item, finally obtaining the item similarity matrix S_c:

$$S_c = \begin{bmatrix} s_{c11} & s_{c12} & \cdots & s_{c1n} \\ s_{c21} & s_{c22} & \cdots & s_{c2n} \\ \vdots & \vdots & \ddots & \vdots \\ s_{cn1} & s_{cn2} & \cdots & s_{cnn} \end{bmatrix}$$

s_cij (1≤i,j≤n) is the similarity between item i and item j calculated from their content attributes.
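As a small illustration of step S3, the cosine similarity between binary attribute vectors can be computed as below. The item IDs and attribute columns are made up for the example, and returning 0.0 for an item with no attributes is an assumption made here to avoid a division by zero.

```python
import math


def attribute_similarity(v_i, v_j):
    """Cosine similarity between two attribute vectors of equal length."""
    dot = sum(a * b for a, b in zip(v_i, v_j))
    norm = math.sqrt(sum(a * a for a in v_i)) * math.sqrt(sum(b * b for b in v_j))
    return dot / norm if norm else 0.0


# Rows of a hypothetical item-content matrix W; columns could be movie genres.
W = {
    "m1": [1, 1, 0, 0],
    "m2": [1, 0, 0, 0],
    "m3": [0, 0, 1, 1],
}

# Pairwise content-based similarity, the off-diagonal entries of S_c.
S_c = {(i, j): attribute_similarity(W[i], W[j]) for i in W for j in W if i != j}
```

Items m1 and m2 share one of m1's two attributes and so are fairly similar, while m1 and m3 share none and get similarity 0, regardless of how users rated them.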

S4. Predict item ratings according to the similarities, including the following specific steps:

1. Combine the similarity sim(i,j) with the content-attribute similarity sim_c(i,j) to obtain the improved item similarity: sim_new(i,j) = α·sim(i,j) + (1−α)·sim_c(i,j).

2. Use the improved item similarity to predict user u's rating of the target item i. The predicted rating formula is:

P_ui denotes user u's predicted rating of item i. S(i,K) is the set of the K items most similar to item i. r_uj denotes user u's actual rating of item j.

The predicted item ratings obtained through steps S1 to S4 are used to generate the corresponding recommendation list. For the target user, traverse all items, select those on which the user has not yet given positive feedback, and sort them by predicted rating in descending order to produce a candidate list. Finally, take the first n items of the candidate list as the recommendation list.
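The blending, prediction, and ranking described in S4 might look as follows in Python. The blending line follows the stated formula sim_new = α·sim + (1−α)·sim_c. The prediction formula itself is not reproduced in this text, so the sketch assumes the common similarity-weighted mean over the rated items in S(i,K); that assumption, the function names, and treating "no positive feedback yet" as "not yet rated" are illustrative choices only.

```python
def blend_similarity(sim_cf, sim_content, alpha=0.9):
    """sim_new(i, j) = alpha * sim(i, j) + (1 - alpha) * sim_c(i, j)."""
    pairs = set(sim_cf) | set(sim_content)
    return {
        p: alpha * sim_cf.get(p, 0.0) + (1 - alpha) * sim_content.get(p, 0.0)
        for p in pairs
    }


def predict_rating(user_ratings, item, sim_new, k=5):
    """Assumed form: similarity-weighted mean of r_uj over rated j in S(i, K)."""
    neighbours = sorted(
        ((s, j) for (i, j), s in sim_new.items() if i == item), reverse=True
    )[:k]
    rated = [(s, user_ratings[j]) for s, j in neighbours if j in user_ratings]
    weight = sum(s for s, _ in rated)
    if weight == 0:
        return None  # no rated neighbour, so no prediction is made
    return sum(s * r for s, r in rated) / weight


def recommend(user_ratings, all_items, sim_new, k=5, n=10):
    """Rank the user's unrated items by predicted rating and keep the top n."""
    scored = []
    for item in all_items:
        if item in user_ratings:
            continue
        p = predict_rating(user_ratings, item, sim_new, k)
        if p is not None:
            scored.append((p, item))
    scored.sort(reverse=True)
    return [item for _, item in scored[:n]]


# Made-up similarities for target item C against items A and B.
sim_cf = {("C", "A"): 0.5, ("C", "B"): 0.2}
sim_content = {("C", "A"): 1.0, ("C", "B"): 0.0}
sim_new = blend_similarity(sim_cf, sim_content, alpha=0.5)
user = {"A": 4.0, "B": 2.0}
top = recommend(user, ["A", "B", "C"], sim_new, k=2, n=1)
```

Here α = 0.5 is used only to keep the arithmetic easy to follow; the embodiment below reports results for α = 0.9.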

Recommendation quality is measured by the mean absolute error (MAE) between the predicted ratings and the actual ratings; the smaller the MAE, the higher the prediction accuracy. It is calculated as:

$$\mathrm{MAE}_u = \frac{1}{N_u} \sum_{i=1}^{N_u} \left| P_{ui} - r_{ui} \right|$$

MAE_u is the MAE of user u's predicted ratings over N_u items, N_u is the number of items whose ratings are to be predicted, P_ui is user u's predicted rating of item i, and r_ui is user u's actual rating of item i.

$$\mathrm{MAE} = \frac{1}{M} \sum_{u=1}^{M} \mathrm{MAE}_u$$

MAE is the MAE over all users, and M is the total number of users.
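A minimal sketch of this two-level MAE in Python is given below. The function names and the nested-dictionary layout are assumptions, and users with no item that has both a predicted and an actual rating are simply skipped, which the text does not specify.

```python
def user_mae(predicted, actual):
    """MAE_u over the items that have both a predicted and an actual rating."""
    items = [i for i in actual if i in predicted]
    return sum(abs(predicted[i] - actual[i]) for i in items) / len(items)


def overall_mae(predicted_by_user, actual_by_user):
    """Mean of the per-user MAE values (users without predictions are skipped)."""
    per_user = [
        user_mae(predicted_by_user[u], actual_by_user[u])
        for u in actual_by_user
        if any(i in predicted_by_user.get(u, {}) for i in actual_by_user[u])
    ]
    return sum(per_user) / len(per_user)


# Made-up predicted and held-out ratings for two users.
predicted = {"u1": {"A": 3.5, "B": 4.0}, "u2": {"A": 2.0}}
actual = {"u1": {"A": 4.0, "B": 3.0}, "u2": {"A": 2.5}}
mae = overall_mae(predicted, actual)
```

Note that this averages per-user errors rather than pooling all predictions, so a user with few test ratings counts as much as a user with many.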

Example 1

In this embodiment of the present invention, movielens-latest-small is used as the data set to verify the effect of the method. The data set contains a movie data set and a rating data set: the movie data set covers 9742 movies, and the rating data set holds 100836 ratings given by 610 users to these 9742 movies.

In this embodiment, the rating data set is first split into a training set and a test set at a ratio of 7:1. Several different algorithms are then trained on the training set, with α = 0.9 when calculating the similarity, and finally evaluated on the test set, and the resulting MAE values are compared. The results are shown in FIG. 2. The horizontal axis is the number K of most similar items selected when calculating the similarity, and the vertical axis is the MAE between predicted and actual ratings. itemCF is the MAE of the original item-based collaborative filtering algorithm; itemCF-content is the MAE obtained with the improved similarity; rule-itemCF is the MAE obtained by filling the user-item rating matrix with ratings predicted from association rules and then applying item-based collaborative filtering; rule-itemCF-content is the MAE of the recommendation algorithm proposed by the present invention.

The experimental results show that the recommendation algorithm proposed by the present invention achieves the smallest MAE, so recommendation quality is improved to a certain extent. The MAE is lowest when the 5 most similar items are used to predict the rating.

The above are only preferred embodiments of the present invention, and the protection scope of the present invention is not limited to them. Any equivalent modification or variation made by those of ordinary skill in the art on the basis of the disclosure of the present invention shall fall within the protection scope set out in the claims.
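The 7:1 split mentioned above could be produced as in the sketch below. The source does not say how the ratings are partitioned, so the seeded random shuffle, the function name `split_ratings`, and the synthetic rows are assumptions for illustration.

```python
import random


def split_ratings(rows, train_parts=7, test_parts=1, seed=0):
    """Shuffle rating rows and split them train:test = train_parts:test_parts."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)  # seeded so the split is repeatable
    cut = len(shuffled) * train_parts // (train_parts + test_parts)
    return shuffled[:cut], shuffled[cut:]


# Synthetic (user, item, rating) rows standing in for the rating data set.
rows = [(u, i, 3.0) for u in range(8) for i in range(10)]
train, test = split_ratings(rows)
```

With 80 synthetic rows this yields 70 training rows and 10 test rows; each algorithm being compared would then be fitted on `train` and scored on `test` for each value of K.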

Claims

1. A recommendation method based on a hybrid algorithm of item collaborative filtering and association rules, characterized by comprising the following steps:

S1. Construct a historical record data set and obtain the association rules of items;

S2. Construct a user-item rating matrix and calculate the item similarity;

S3. Construct an item-content attribute matrix, and calculate the similarity based on item attributes;

S4. Predict item ratings according to similarity.

2. A recommendation method based on a hybrid algorithm of item collaborative filtering and association rules according to claim 1, wherein the step S1 comprises the following specific steps:

S1-1. Construct a historical record data set from each of the user's behavior records; each row of the data set is the set of items that received positive feedback in one behavior record of the user;

S1-2. Find the association rules of the second-order frequent itemsets according to the configured minimum support and minimum confidence.

3. The recommendation method based on a hybrid algorithm of item collaborative filtering and association rules according to claim 1, wherein the step S2 comprises the following specific steps:

S2-1. Construct a user-item rating matrix from the users' historical records and rating records; each row of the matrix holds one user's ratings of the items, and each column corresponds to one item;

S2-2. Use the association rules from step S1 to predict ratings for the items associated with those on which a user has given positive feedback, and fill these ratings into the user-item rating matrix to obtain a new user-item rating matrix;

S2-3. Calculate the similarity between different items by using the cosine similarity formula according to the user-item rating matrix after filling the rating.

4. The recommendation method based on a hybrid algorithm of item collaborative filtering and association rules according to claim 1, wherein the step S3 comprises the following specific steps:

S3-1. Construct the item-content attribute matrix for all items; each row of the matrix records whether one item has each content attribute;

S3-2. According to the item-content attribute matrix, the cosine similarity is used to calculate the similarity between different items.

5. The recommendation method based on a hybrid algorithm of item collaborative filtering and association rules according to claim 1, wherein the step S4 comprises the following specific steps:

S4-1. The similarity between item i and item j is:

$$\mathrm{sim}(i,j) = \alpha \frac{|N(i) \cap N(j)|}{\sqrt{|N(i)|\,|N(j)|}} + (1-\alpha) \frac{V(i) \cdot V(j)}{\|V(i)\|\,\|V(j)\|}$$

where sim(i,j) is the similarity between item i and item j, with values in [0,1]; α is the weight, with values in [0.1, 0.9]; |N(i)| is the number of users who like item i; |N(i)∩N(j)| is the number of users who like both item i and item j; and V(i) is the attribute vector of item i;

S4-2. The predicted rating of item i is calculated as:

P_ui denotes user u's predicted rating of item i, S(i,K) is the set of the K items most similar to item i, and r_uj denotes user u's actual rating of item j;

S4-3. Predict the rating of a user for an item according to a predictive rating formula.