CN110362755A - Recommendation method using a hybrid algorithm based on item-based collaborative filtering and association rules - Google Patents


Info

Publication number
CN110362755A
Authority
CN
China
Prior art keywords
item
user
similarity
rating
matrix
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Withdrawn
Application number
CN201910667120.XA
Other languages
Chinese (zh)
Inventor
李涛
李�昊
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nanjing Post and Telecommunication University
Original Assignee
Nanjing Post and Telecommunication University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nanjing Post and Telecommunication University
Priority to CN201910667120.XA
Publication of CN110362755A
Legal status: Withdrawn


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/953 Querying, e.g. by the use of web search engines
    • G06F16/9536 Search customisation based on social or collaborative filtering

Landscapes

  • Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention belongs to the technical field of recommendation algorithms and discloses a recommendation method using a hybrid algorithm based on item-based collaborative filtering and association rules. A historical record data set of user behavior is constructed and the association rules between items are obtained from it; a user-item rating matrix is then constructed to calculate item similarity; an item-content attribute matrix is then constructed to calculate an attribute-based item similarity; finally, item ratings are predicted from the similarities and a corresponding recommendation list is generated. The hybrid recommendation method of the invention can effectively address the data sparsity problem encountered in collaborative filtering recommendation algorithms and improves the similarity calculation, thereby improving recommendation quality.

Description

Recommendation method using a hybrid algorithm based on item-based collaborative filtering and association rules
Technical field
The invention belongs to the technical field of recommendation algorithms, and in particular relates to a recommendation method using a hybrid algorithm based on item-based collaborative filtering and association rules.
Background
With the rapid development of the internet industry, information spreads ever faster and its variety and volume keep growing. Faced with this increasingly large and complex body of information, users find it harder to locate the content they like, and searching takes longer and longer. To solve this problem, personalized recommendation technology, that is, recommendation algorithms, has developed rapidly. A recommendation algorithm infers user preferences by analyzing users' historical behavior data and proactively recommends content that may interest them.
Traditional recommendation algorithms include content-based recommendation, collaborative-filtering-based recommendation, and social-network-based recommendation. The most widely used at present is collaborative filtering, which is further divided into user-based and item-based collaborative filtering. Item-based collaborative filtering is the recommendation algorithm most used in industry today; it recommends to users items similar to those they liked before. It calculates the similarity between items by analyzing users' behavior records: item A and item B are considered highly similar because most users who like item A also like item B. However, a similarity calculated this way does not mean that the two items are similar in their attributes. In addition, collaborative filtering algorithms depend heavily on historical data, and recommendation quality drops sharply when the data is sparse. The present invention predicts item ratings using association rules and calculates an attribute-based item similarity, and combines both with item-based collaborative filtering to improve recommendation quality.
Summary of the invention
The object of the invention is to provide a recommendation method using a hybrid algorithm based on item-based collaborative filtering and association rules, which addresses the data sparsity problem encountered in collaborative filtering recommendation algorithms and improves the similarity calculation, thereby improving recommendation quality. To achieve this object, the invention provides the following technical solution: a recommendation method using a hybrid algorithm based on item-based collaborative filtering and association rules, comprising the following steps:
S1: construct a historical record data set and obtain the association rules between items;
S2: construct a user-item rating matrix and calculate item similarity;
S3: construct an item-content attribute matrix and calculate an attribute-based item similarity;
S4: predict item ratings according to the similarities.
Further, step S1 comprises the following specific steps:
S1-1: construct the historical record data set from each of the user's behavior records; each row of the data set is the set of items that received positive feedback (for example, a purchase) in one behavior record of a user;
S1-2: find the association rules of the second-order frequent itemsets according to the set minimum support and minimum confidence.
Further, step S2 comprises the following specific steps:
S2-1: construct the user-item rating matrix from the users' historical records and rating records; each row of the matrix holds one user's ratings of the different items, and each column corresponds to one item;
S2-2: use the association rules from step S1 to predict a user's ratings of items that are associated with items the user has given positive feedback to, and fill these ratings into the user-item rating matrix to obtain a new user-item rating matrix;
S2-3: calculate the similarity between different items from the filled user-item rating matrix using the cosine similarity formula.
Further, step S3 comprises the following specific steps:
S3-1: construct the item-content attribute matrix for all items; each row of the matrix records which content attributes one item has and which it lacks;
S3-2: calculate the similarity between different items from the item-content attribute matrix using cosine similarity.
Further, step S4 comprises the following specific steps:
S4-1: the similarity of item i and item j is:
$$sim_{new}(i,j) = \alpha \, sim(i,j) + (1-\alpha) \, sim_c(i,j)$$
$$sim(i,j) = \frac{|N(i) \cap N(j)|}{\sqrt{|N(i)| \, |N(j)|}}$$
$$sim_c(i,j) = \frac{V(i) \cdot V(j)}{\|V(i)\| \, \|V(j)\|}$$
sim(i, j) denotes the similarity of item i and item j, with values in [0, 1]. α denotes the weight, with values in [0.1, 0.9].
N(i) denotes the users who like item i, with |N(i)| their number, and N(i) ∩ N(j) denotes the users who like both item i and item j.
V(i) denotes the attribute vector of item i;
S4-2: calculate the predicted rating of item i with the prediction formula, in which
P_ui denotes user u's predicted rating of item i, S(i, K) is the set of the K items most similar to item i, and r_uj denotes user u's true rating of item j;
S4-3: predict a given user's rating of a given item according to the prediction formula.
Compared with the prior art, the beneficial effects of the invention are:
1. It solves the problem that recommendation quality declines because of data sparsity when the user-item rating matrix is used alone.
2. It improves the accuracy of similarity-based item rating prediction and thus the recommendation quality.
Brief description of the drawings
Fig. 1 is a flow diagram of the recommendation method using the hybrid algorithm based on item-based collaborative filtering and association rules according to the invention;
Fig. 2 compares the MAE of the ratings predicted by the hybrid recommendation algorithm of the invention with that of traditional collaborative filtering algorithms.
Detailed description
The technical solutions in the embodiments of the invention are described clearly and completely below with reference to the drawings in the embodiments. Obviously, the described embodiments are only some, not all, of the embodiments of the invention. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of the invention without creative effort fall within the protection scope of the invention.
In addition, the technical solutions of the embodiments may be combined with one another, provided that a person of ordinary skill in the art can implement the combination. Where a combination of technical solutions is contradictory or cannot be implemented, the combination shall be deemed not to exist and falls outside the protection scope claimed by the invention.
As shown in Fig. 1, step S1 constructs the historical record data set and obtains the association rules between items, specifically as follows:
1. Construct the historical record data set. Each record is the list of items that received positive feedback from a user: [ItemID1, ItemID2, ..., ItemIDn].
2. Use the Apriori association rule algorithm to find the association rules of the second-order frequent itemsets that satisfy a minimum support of 0.2 and a minimum confidence of 0.6. The Apriori algorithm has two main steps: first find all frequent itemsets in the data set, that is, the itemsets whose frequency of occurrence is greater than or equal to the minimum support; then derive from the frequent itemsets the association rules that satisfy the minimum confidence.
The support is calculated as:
$$Support(X \rightarrow Y) = \frac{N(X, Y)}{N_{all}}$$
Support(X → Y) denotes the support, that is, the probability that itemset X and itemset Y occur together. N(X, Y) denotes the number of records in which itemset X and itemset Y occur together, and N_all denotes the total number of historical records.
The confidence is calculated as:
$$Conf(X \rightarrow Y) = \frac{N(X, Y)}{N(X)}$$
Conf(X → Y) denotes the confidence, that is, the probability that itemset Y occurs given that itemset X occurs; N(X) denotes the number of records in which itemset X occurs.
3. Traverse the historical record data set: first find the first-order frequent itemsets, then the second-order frequent itemsets, and from the second-order frequent itemsets find the association rules whose confidence is greater than or equal to the minimum confidence. The resulting association rules form a data set of triples (ItemIDx, ItemIDy, Confidence), where ItemIDx denotes the ID of item x and Confidence denotes the confidence of the rule.
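The rule-mining step above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the patent's implementation: the function name `mine_pair_rules` and the sample `history` are hypothetical, and because only second-order itemsets are needed, the sketch counts single items and pairs directly instead of running a general Apriori search.

```python
from collections import Counter
from itertools import combinations

def mine_pair_rules(transactions, min_support=0.2, min_confidence=0.6):
    """Return (x, y, confidence) rules derived from frequent 2-itemsets."""
    n_all = len(transactions)
    baskets = [set(t) for t in transactions]

    # Pass 1: first-order frequent itemsets (single items).
    item_counts = Counter(item for b in baskets for item in b)
    frequent_items = {i for i, c in item_counts.items() if c / n_all >= min_support}

    # Pass 2: count pairs built only from frequent single items.
    pair_counts = Counter()
    for b in baskets:
        for pair in combinations(sorted(b & frequent_items), 2):
            pair_counts[pair] += 1

    rules = []
    for (x, y), count in pair_counts.items():
        if count / n_all < min_support:
            continue  # not a frequent 2-itemset
        # Each frequent pair yields two candidate rules: x -> y and y -> x.
        for antecedent, consequent in ((x, y), (y, x)):
            confidence = count / item_counts[antecedent]
            if confidence >= min_confidence:
                rules.append((antecedent, consequent, confidence))
    return rules

# Hypothetical positive-feedback records, one list of item IDs per record.
history = [[1, 2, 3], [1, 2], [2, 3], [1, 2, 4], [4]]
rules = mine_pair_rules(history)
```

Each returned triple has the same shape as the (ItemIDx, ItemIDy, Confidence) records described above.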
S2: construct the user-item rating matrix and calculate item similarity, specifically as follows:
1. Construct the user-item rating matrix R from the users' historical behavior records:
$$R = \begin{pmatrix} r_{11} & r_{12} & \cdots & r_{1n} \\ r_{21} & r_{22} & \cdots & r_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ r_{m1} & r_{m2} & \cdots & r_{mn} \end{pmatrix}$$
m denotes the number of users and n the number of items. r_ui (1 ≤ u ≤ m, 1 ≤ i ≤ n) denotes user u's rating of item i; if user u has not rated item i, r_ui is empty.
2. Predict ratings using the association rule data set from step S1. Traverse each row of the user-item rating matrix. For each user, if an item that the user has given positive feedback to has an associated item in the association rule data set, and the user has not given positive feedback to that associated item, predict the user's rating of the associated item. The prediction formula uses the following quantities:
P_ui is user u's predicted rating of item i, S(u) denotes the set of items that user u has rated, conf_ji denotes the confidence of the rule from item j to item i, and r_uj denotes user u's rating of item j.
3. Fill the user-item rating matrix with the predicted ratings: for each item a user has not yet rated, the predicted rating above is used as that user's rating of the item.
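The prediction formula itself is not reproduced in this text. One form consistent with the quantities defined above, assuming a confidence-weighted average (the normalization is an assumption), is $P_{ui} = \sum_{j \in S(u)} conf_{ji} \, r_{uj} \,/\, \sum_{j \in S(u)} conf_{ji}$, where items j with no rule j → i contribute nothing. The sketch below applies that assumed formula to fill the matrix; `fill_with_rules` and the sample data are hypothetical names chosen for illustration.

```python
def fill_with_rules(ratings, rules):
    """ratings: {user: {item: rating}}; rules: (j, i, conf) triples meaning j -> i.

    Returns a new dict in which each user's unrated items that are reachable
    through a rule receive a confidence-weighted average of the user's ratings
    (assumed formula, see the paragraph above).
    """
    filled = {}
    for user, rated in ratings.items():
        numerator, denominator = {}, {}
        for j, i, conf in rules:
            # Predict only items the user has not rated, from items they have rated.
            if j in rated and i not in rated:
                numerator[i] = numerator.get(i, 0.0) + conf * rated[j]
                denominator[i] = denominator.get(i, 0.0) + conf
        predicted = {i: numerator[i] / denominator[i] for i in numerator}
        filled[user] = {**rated, **predicted}
    return filled

# Hypothetical ratings and rules.
ratings = {"u1": {1: 4.0, 3: 2.0}, "u2": {2: 5.0}}
rules = [(1, 2, 1.0), (2, 1, 0.75), (3, 2, 1.0)]
filled = fill_with_rules(ratings, rules)
```

Existing ratings are never overwritten; only empty cells that a rule can reach are filled, so some cells may stay empty.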
4. Calculate the item similarity from the filled user-item rating matrix. The similarity is calculated as:
$$sim(i,j) = \frac{|N(i) \cap N(j)|}{\sqrt{|N(i)| \, |N(j)|}}$$
sim(i, j) is the similarity of item i and item j, with values in [0, 1]. N(i) denotes the users who like item i, with |N(i)| their number, and N(i) ∩ N(j) denotes the users who like both item i and item j.
5. Calculate, for each item, its similarity to every other item, finally obtaining the item similarity matrix S:
$$S = \begin{pmatrix} s_{11} & s_{12} & \cdots & s_{1n} \\ s_{21} & s_{22} & \cdots & s_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ s_{n1} & s_{n2} & \cdots & s_{nn} \end{pmatrix}$$
s_ij (1 ≤ i, j ≤ n) denotes the similarity of item i and item j.
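A minimal sketch of this similarity computation follows. It assumes the co-occurrence form of cosine similarity, in which N(i) is the set of users with an entry for item i, and it treats every filled rating as a "like"; both choices and the name `item_similarity` are assumptions made for illustration.

```python
from collections import defaultdict
from itertools import combinations
from math import sqrt

def item_similarity(filled):
    """filled: {user: {item: rating}} -> {i: {j: sim(i, j)}} for co-rated pairs."""
    n = defaultdict(int)   # |N(i)|: number of users with an entry for item i
    co = defaultdict(int)  # |N(i) ∩ N(j)|: number of users with entries for both
    for rated in filled.values():
        for i in rated:
            n[i] += 1
        for i, j in combinations(sorted(rated), 2):
            co[(i, j)] += 1

    sim = defaultdict(dict)
    for (i, j), count in co.items():
        s = count / sqrt(n[i] * n[j])
        sim[i][j] = s
        sim[j][i] = s  # the measure is symmetric
    return sim

# Hypothetical filled rating matrix stored as nested dicts.
filled = {"u1": {1: 4.0, 2: 3.0, 3: 2.0}, "u2": {1: 5.0, 2: 5.0}, "u3": {2: 4.0}}
sim = item_similarity(filled)
```

Pairs that no user has in common are simply absent from the result, which corresponds to a similarity of 0.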
S3: construct the item-content attribute matrix and calculate the attribute-based item similarity, specifically as follows:
1. Analyze each item and construct the item-content attribute matrix W:
$$W = \begin{pmatrix} w_{11} & w_{12} & \cdots & w_{1l} \\ w_{21} & w_{22} & \cdots & w_{2l} \\ \vdots & \vdots & \ddots & \vdots \\ w_{n1} & w_{n2} & \cdots & w_{nl} \end{pmatrix}$$
n denotes the number of items and l the number of item attributes. w_ij (1 ≤ i ≤ n, 1 ≤ j ≤ l) denotes the j-th attribute of item i: w_ij = 1 if item i has the attribute, otherwise w_ij = 0.
2. Obtain the similarity between items by comparing their attributes. The similarity is calculated as:
$$sim_c(i,j) = \frac{V(i) \cdot V(j)}{\|V(i)\| \, \|V(j)\|}$$
sim_c(i, j) denotes the similarity of item i and item j calculated from content attributes, with values in [0, 1]. V(i) denotes the attribute vector of item i.
3. Calculate, for each item, its similarity to every other item, finally obtaining the item similarity matrix S_c:
$$S_c = \begin{pmatrix} Sc_{11} & Sc_{12} & \cdots & Sc_{1n} \\ Sc_{21} & Sc_{22} & \cdots & Sc_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ Sc_{n1} & Sc_{n2} & \cdots & Sc_{nn} \end{pmatrix}$$
Sc_ij (1 ≤ i, j ≤ n) denotes the similarity of item i and item j calculated from content attributes.
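The attribute-based similarity can be sketched as a plain cosine over the 0/1 rows of W. The function name `attribute_similarity` and the three sample items are hypothetical, and returning 0 for an item with no attributes is an assumption the text does not address.

```python
from math import sqrt

def attribute_similarity(w):
    """w: {item: [0/1 attribute flags]} -> {i: {j: sim_c(i, j)}}."""
    norms = {i: sqrt(sum(v * v for v in vec)) for i, vec in w.items()}
    sim_c = {i: {} for i in w}
    items = list(w)
    for pos, i in enumerate(items):
        for j in items[pos + 1:]:
            dot = sum(x * y for x, y in zip(w[i], w[j]))
            denom = norms[i] * norms[j]
            # Assumption: an item with an all-zero attribute vector gets similarity 0.
            s = dot / denom if denom else 0.0
            sim_c[i][j] = s
            sim_c[j][i] = s
    return sim_c

# Hypothetical attribute rows: item 1 has attributes 0 and 1, item 2 only 0, item 3 only 2.
w = {1: [1, 1, 0], 2: [1, 0, 0], 3: [0, 0, 1]}
sim_c = attribute_similarity(w)
```

Because the vectors are non-negative, the result always falls in [0, 1], matching the stated value range.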
S4: predict item ratings according to the similarities, specifically as follows:
1. Combine the similarity sim(i, j) with the content-attribute similarity sim_c(i, j) to obtain the improved item similarity: sim_new(i, j) = α·sim(i, j) + (1 - α)·sim_c(i, j).
2. Predict user u's rating of target item i using the improved item similarity. In the prediction formula, P_ui denotes user u's predicted rating of item i, S(i, K) is the set of the K items most similar to item i, and r_uj denotes user u's true rating of item j.
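The prediction formula is not reproduced in this text. One form consistent with the definitions above, assuming a similarity-weighted average over the K nearest items (the normalization is an assumption), is $P_{ui} = \sum_{j \in S(i,K)} sim_{new}(i,j) \, r_{uj} \,/\, \sum_{j \in S(i,K)} sim_{new}(i,j)$. The sketch below implements that assumed form; `predict_rating` is a hypothetical name, and skipping neighbors the user has not rated is a further assumption.

```python
def predict_rating(user_ratings, i, sim, sim_c, alpha=0.9, k=5):
    """Predict a rating for item i from the k items most similar to it.

    user_ratings: {item: rating} for one user.
    sim, sim_c:   {i: {j: similarity}} from behavior and from content attributes.
    Returns None when none of the k neighbors has a rating from this user.
    """
    def sim_new(j):
        # Blend of the two similarities, as in step 1 above.
        behavior = sim.get(i, {}).get(j, 0.0)
        content = sim_c.get(i, {}).get(j, 0.0)
        return alpha * behavior + (1 - alpha) * content

    candidates = [j for j in set(sim.get(i, {})) | set(sim_c.get(i, {})) if j != i]
    top_k = sorted(candidates, key=sim_new, reverse=True)[:k]  # S(i, K)

    rated = [j for j in top_k if j in user_ratings]
    denominator = sum(sim_new(j) for j in rated)
    if not denominator:
        return None
    return sum(sim_new(j) * user_ratings[j] for j in rated) / denominator

# Hypothetical similarities for target item 1.
sim = {1: {2: 0.8, 3: 0.4}}
sim_c = {1: {2: 0.5, 3: 1.0}}
p = predict_rating({2: 4.0, 3: 2.0}, 1, sim, sim_c, alpha=0.5, k=2)
```

With α = 0.5 the blended similarities are 0.65 for item 2 and 0.7 for item 3, so the prediction lies between the two ratings and slightly closer to the rating of item 3.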
The predicted item ratings are obtained through steps S1 to S4, and the corresponding recommendation list is then generated. For the target user, traverse the full item set, select the items that have not received positive feedback from the user, and sort them by predicted rating in descending order to produce the candidate list. Finally, take the first n items of the candidate list as the recommendation list.
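The list-generation step can be sketched as a filter followed by a descending sort. The name `recommend` and the sample values are hypothetical; dropping items whose prediction is missing is an assumption.

```python
def recommend(predicted, seen, n=10):
    """predicted: {item: predicted rating or None}; seen: items with positive feedback.

    Returns the n unseen items with the highest predicted ratings, best first.
    """
    candidates = [
        (item, score)
        for item, score in predicted.items()
        if item not in seen and score is not None
    ]
    candidates.sort(key=lambda pair: pair[1], reverse=True)
    return [item for item, _ in candidates[:n]]

# Hypothetical predictions for one user who already gave positive feedback to item 3.
predicted = {1: 4.5, 2: 3.0, 3: 4.9, 4: None, 5: 2.0}
seen = {3}
top_two = recommend(predicted, seen, n=2)
```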
Recommendation quality is measured by the mean absolute error (MAE) between the predicted ratings and the actual ratings; the smaller the MAE, the higher the prediction accuracy. It is calculated as:
$$MAE_u = \frac{1}{N_u} \sum_{i=1}^{N_u} |P_{ui} - r_{ui}|$$
MAE_u denotes the MAE of user u's predicted ratings over N_u items, N_u is the number of items whose ratings are predicted, P_ui is user u's predicted rating of item i, and r_ui is user u's actual rating of item i.
$$MAE = \frac{1}{M} \sum_{u=1}^{M} MAE_u$$
MAE denotes the MAE over all users, and M is the number of users.
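The two-level average can be sketched as follows. The function names and sample values are hypothetical, and skipping users who have no item with both a predicted and an actual rating is an assumption.

```python
def mae_per_user(predicted, actual):
    """MAE_u over the items that have both a predicted and an actual rating."""
    items = [i for i in predicted if i in actual]
    return sum(abs(predicted[i] - actual[i]) for i in items) / len(items)

def overall_mae(predicted_by_user, actual_by_user):
    """Average of the per-user MAE values over all users that can be scored."""
    per_user = [
        mae_per_user(p, actual_by_user[u])
        for u, p in predicted_by_user.items()
        if u in actual_by_user and any(i in actual_by_user[u] for i in p)
    ]
    return sum(per_user) / len(per_user)

# Hypothetical predictions and held-out actual ratings.
predicted_by_user = {"u1": {1: 4.0, 2: 3.0}, "u2": {1: 2.5}}
actual_by_user = {"u1": {1: 5.0, 2: 3.0}, "u2": {1: 2.0}}
mae = overall_mae(predicted_by_user, actual_by_user)
```

Averaging per user first gives every user equal weight regardless of how many ratings they have in the test set.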
Embodiment 1
In an embodiment of the invention, movielens-latest-small is used as the data set to verify the effect of the method. The data set contains a movie data set and a rating data set. The movie data set contains 9,742 movies, and the rating data set contains 100,836 ratings of these 9,742 movies by 610 users.
In this embodiment, the rating data set is first divided into a training set and a test set at a ratio of 7:1. Several different algorithms are then trained on the training set, with α = 0.9 when calculating the similarity. Finally, the algorithms are tested on the test set and the resulting MAE values are compared. The results are shown in Fig. 2, where the abscissa is the number K of most similar items selected when calculating the similarity and the ordinate is the MAE between the predicted and actual ratings. itemCF denotes the MAE of the original item-based collaborative filtering algorithm; itemCF-content denotes the MAE obtained with the improved similarity method; rule-itemCF denotes the MAE of item-based collaborative filtering after the user-item rating matrix is filled with ratings predicted by association rules; rule-itemCF-content denotes the MAE of the recommendation algorithm proposed by the invention.
The experimental results show that the proposed recommendation algorithm obtains the lowest MAE, so recommendation quality is improved to some extent. The MAE is lowest when the 5 most similar items are used to predict ratings.
The above is only a preferred embodiment of the invention, and the protection scope of the invention is not limited to it. Any equivalent modification or variation made by a person of ordinary skill in the art according to the disclosure of the invention shall fall within the protection scope set out in the claims.

Claims (5)

1. A recommendation method using a hybrid algorithm based on item-based collaborative filtering and association rules, characterized by comprising the following steps:
S1: construct a historical record data set and obtain the association rules between items;
S2: construct a user-item rating matrix and calculate item similarity;
S3: construct an item-content attribute matrix and calculate an attribute-based item similarity;
S4: predict item ratings according to the similarities.
2. The recommendation method using a hybrid algorithm based on item-based collaborative filtering and association rules according to claim 1, characterized in that step S1 comprises the following specific steps:
S1-1: construct the historical record data set from each of the user's behavior records; each row of the data set is the set of items that received positive feedback in one behavior record of a user;
S1-2: find the association rules of the second-order frequent itemsets according to the set minimum support and minimum confidence.
3. The recommendation method using a hybrid algorithm based on item-based collaborative filtering and association rules according to claim 1, characterized in that step S2 comprises the following specific steps:
S2-1: construct the user-item rating matrix from the users' historical records and rating records; each row of the matrix holds one user's ratings of the different items, and each column corresponds to one item;
S2-2: use the association rules from step S1 to predict a user's ratings of items that are associated with items the user has given positive feedback to, and fill these ratings into the user-item rating matrix to obtain a new user-item rating matrix;
S2-3: calculate the similarity between different items from the filled user-item rating matrix using the cosine similarity formula.
4. The recommendation method using a hybrid algorithm based on item-based collaborative filtering and association rules according to claim 1, characterized in that step S3 comprises the following specific steps:
S3-1: construct the item-content attribute matrix for all items; each row of the matrix records which content attributes one item has and which it lacks;
S3-2: calculate the similarity between different items from the item-content attribute matrix using cosine similarity.
5. The recommendation method using a hybrid algorithm based on item-based collaborative filtering and association rules according to claim 1, characterized in that step S4 comprises the following specific steps:
S4-1: the similarity of item i and item j is:
$$sim_{new}(i,j) = \alpha \, sim(i,j) + (1-\alpha) \, sim_c(i,j)$$
$$sim(i,j) = \frac{|N(i) \cap N(j)|}{\sqrt{|N(i)| \, |N(j)|}}$$
$$sim_c(i,j) = \frac{V(i) \cdot V(j)}{\|V(i)\| \, \|V(j)\|}$$
sim(i, j) denotes the similarity of item i and item j, with values in [0, 1]; α denotes the weight, with values in [0.1, 0.9]; N(i) denotes the users who like item i, with |N(i)| their number; N(i) ∩ N(j) denotes the users who like both item i and item j; V(i) denotes the attribute vector of item i;
S4-2: calculate the predicted rating of item i with the prediction formula, in which
P_ui denotes user u's predicted rating of item i, S(i, K) is the set of the K items most similar to item i, and r_uj denotes user u's true rating of item j;
S4-3: predict a given user's rating of a given item according to the prediction formula.
CN201910667120.XA 2019-07-23 2019-07-23 A kind of recommended method of the hybrid algorithm based on article collaborative filtering and correlation rule Withdrawn CN110362755A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910667120.XA CN110362755A (en) 2019-07-23 2019-07-23 A kind of recommended method of the hybrid algorithm based on article collaborative filtering and correlation rule

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910667120.XA CN110362755A (en) 2019-07-23 2019-07-23 A kind of recommended method of the hybrid algorithm based on article collaborative filtering and correlation rule

Publications (1)

Publication Number Publication Date
CN110362755A true CN110362755A (en) 2019-10-22

Family

ID=68219738

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910667120.XA Withdrawn CN110362755A (en) 2019-07-23 2019-07-23 A kind of recommended method of the hybrid algorithm based on article collaborative filtering and correlation rule

Country Status (1)

Country Link
CN (1) CN110362755A (en)

Cited By (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110825971A (en) * 2019-11-11 2020-02-21 辽宁师范大学 Article cold start recommendation algorithm integrating relationship mining and collaborative filtering
CN110825971B (en) * 2019-11-11 2023-04-14 辽宁师范大学 Article cold start recommendation algorithm integrating relationship mining and collaborative filtering
CN111369315A (en) * 2020-02-27 2020-07-03 拉扎斯网络科技(上海)有限公司 Resource object recommendation method and device, and data prediction model training method and device
CN112200601A (en) * 2020-09-11 2021-01-08 深圳市法本信息技术股份有限公司 Item recommendation method and device and readable storage medium
CN112200601B (en) * 2020-09-11 2024-05-14 深圳市法本信息技术股份有限公司 Item recommendation method, device and readable storage medium
CN113221014A (en) * 2021-06-09 2021-08-06 中国银行股份有限公司 Personalized recommendation method and system for application function
CN113822738A (en) * 2021-06-22 2021-12-21 昆明理工大学 Multi-dimensional agricultural product supply and demand bidirectional personalized recommendation method
CN113822738B (en) * 2021-06-22 2024-05-14 昆明理工大学 Multi-dimensional agricultural product supply and demand bidirectional personalized recommendation method
CN113688314A (en) * 2021-08-13 2021-11-23 今彩慧健康科技(苏州)有限公司 Physiotherapy store recommendation method and device
CN113688314B (en) * 2021-08-13 2024-03-19 今彩慧健康科技(苏州)有限公司 Physical therapy store recommending method and device
CN113722443A (en) * 2021-09-10 2021-11-30 焦点科技股份有限公司 Label recommendation method and system integrating text similarity and collaborative filtering
CN113722443B (en) * 2021-09-10 2024-04-19 焦点科技股份有限公司 Label recommendation method and system integrating text similarity and collaborative filtering

Similar Documents

Publication Publication Date Title
CN110362755A (en) A kind of recommended method of the hybrid algorithm based on article collaborative filtering and correlation rule
CN106802956B (en) Movie recommendation method based on weighted heterogeneous information network
CN105138653B (en) It is a kind of that method and its recommendation apparatus are recommended based on typical degree and the topic of difficulty
CN107679239B (en) Personalized community recommendation method based on user behaviors
CN107562947A (en) A kind of Mobile Space-time perceives the lower dynamic method for establishing model of recommendation service immediately
CN105183748B (en) A kind of combination forecasting method based on content and scoring
Anand et al. Folksonomy-based fuzzy user profiling for improved recommendations
CN112231583B (en) E-commerce recommendation method based on dynamic interest group identification and generation of confrontation network
CN106649658A (en) Recommendation system and method for improving user role undifferentiated treatment and data sparseness
CN110619559B (en) Method for accurately recommending commodities in electronic commerce based on big data information
CN115712780A (en) Information pushing method and device based on cloud computing and big data
Chen et al. A fuzzy matrix factor recommendation method with forgetting function and user features
Jiao et al. Research on personalized recommendation optimization of E-commerce system based on customer trade behaviour data
CN114840745A (en) Personalized recommendation method and system based on graph feature learning and deep semantic matching model
CN108875071B (en) Learning resource recommendation method based on multi-view interest
CN108491477B (en) Neural network recommendation method based on multi-dimensional cloud and user dynamic interest
Sharma et al. CCFRS–community based collaborative filtering recommender system
CN110825965A (en) Improved collaborative filtering recommendation method based on trust mechanism and time weighting
Wang Application of E-Commerce Recommendation Algorithm in Consumer Preference Prediction
CN110569374B (en) Movie recommendation method based on improved collaborative filtering algorithm
Shang et al. An Improved Tensor Decomposition Model for Recommendation System
CN110825971A (en) Article cold start recommendation algorithm integrating relationship mining and collaborative filtering
Sun et al. CROA: A Content-Based Recommendation Optimization Algorithm for Personalized Knowledge Services
Hwang et al. Integrating multiple linear regression and multicriteria collaborative filtering for better recommendation
Yang et al. Consumers’ Purchase Behavior Preference in E-Commerce Platform Based on Data Mining Algorithm

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WW01 Invention patent application withdrawn after publication

Application publication date: 20191022