CN110362755A - A recommendation method using a hybrid algorithm based on item-based collaborative filtering and association rules - Google Patents
A recommendation method using a hybrid algorithm based on item-based collaborative filtering and association rules
- Publication number
- CN110362755A (application number CN201910667120.XA)
- Authority
- CN
- China
- Prior art keywords
- article
- user
- similarity
- scoring
- matrix
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Withdrawn
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/95—Retrieval from the web
- G06F16/953—Querying, e.g. by the use of web search engines
- G06F16/9536—Search customisation based on social or collaborative filtering
Landscapes
- Engineering & Computer Science (AREA)
- Databases & Information Systems (AREA)
- Theoretical Computer Science (AREA)
- Data Mining & Analysis (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The invention belongs to the technical field of recommendation algorithms and discloses a recommendation method using a hybrid algorithm based on item-based collaborative filtering and association rules. A historical record data set of user behavior is built to obtain the association rules between items; a user-item rating matrix is then built to compute item similarity; an item-content attribute matrix is built next to compute a similarity based on item attributes; finally, item ratings are predicted from the similarities and a corresponding recommendation list is generated. The hybrid recommendation method of the invention can effectively address the data sparsity problem encountered in collaborative filtering recommendation algorithms and improves the similarity calculation, thereby improving recommendation quality.
Description
Technical field
The invention belongs to the technical field of recommendation algorithms, and in particular relates to a recommendation method using a hybrid algorithm based on item-based collaborative filtering and association rules.
Background art
With the rapid development of the Internet industry, information spreads ever faster, and its variety and volume keep growing. Faced with this increasingly complex flood of information, users find it harder to locate the content they like, and searching takes longer and longer. To solve this problem, personalized recommendation technology, that is, recommendation algorithms, has developed rapidly. A recommendation algorithm infers user preferences by analyzing the user's historical behavior data, and then proactively recommends content the user may be interested in.
Traditional recommendation algorithms include content-based recommendation algorithms, collaborative-filtering-based recommendation algorithms, and social-network-based recommendation algorithms. The most widely used at present are those based on collaborative filtering, which can be further divided into user-based collaborative filtering and item-based collaborative filtering. Item-based collaborative filtering is the recommendation algorithm most used in industry today: it recommends to users items similar to the ones they liked before. It computes the similarity between items by analyzing users' behavior records; item A and item B are considered highly similar because most users who like item A also like item B. However, a similarity computed this way does not mean that the two items are similar in their attributes. In addition, collaborative filtering algorithms depend heavily on historical data, and recommendation quality drops sharply when the data are sparse. The present invention predicts item ratings with association rules and computes item attribute similarity, and combines both with item-based collaborative filtering to improve recommendation quality.
Summary of the invention
The object of the present invention is to provide a recommendation method using a hybrid algorithm based on item-based collaborative filtering and association rules, which addresses the data sparsity problem encountered in collaborative filtering recommendation algorithms and improves the similarity calculation, thereby improving recommendation quality. To achieve this object, the invention provides the following technical solution: a recommendation method using a hybrid algorithm based on item-based collaborative filtering and association rules, comprising the following steps:
S1. Build a historical record data set and obtain the association rules between items;
S2. Build a user-item rating matrix and compute item similarity;
S3. Build an item-content attribute matrix and compute the similarity based on item attributes;
S4. Predict item ratings from the similarities.
Further, step S1 comprises the following specific steps:
S1-1. Build the historical record data set from each behavior record of the users; every row of the data set is the set of items that received positive feedback (for example, a purchase) in one behavior record of a user;
S1-2. Find the association rules of the second-order frequent itemsets (frequent 2-itemsets) according to the configured minimum support and minimum confidence.
Further, step S2 comprises the following specific steps:
S2-1. Build the user-item rating matrix from the users' historical records and rating records; each row of the matrix holds one user's ratings of the different items, and each column corresponds to one item;
S2-2. Use the association rules from step S1 to predict a user's ratings of the items associated with the items on which that user has given positive feedback, and fill these ratings into the user-item rating matrix to obtain a new user-item rating matrix;
S2-3. Compute the similarity between items from the filled user-item rating matrix using the cosine similarity formula.
Further, step S3 comprises the following specific steps:
S3-1. Build the item-content attribute matrix for all items; each row of the matrix records whether one item has each content attribute;
S3-2. Compute the similarity between items from the item-content attribute matrix using cosine similarity.
Further, step S4 comprises the following specific steps:
S4-1. The similarity of item i and item j is:
$$\mathrm{sim}(i,j) = \alpha\,\frac{|N(i)\cap N(j)|}{\sqrt{|N(i)|\,|N(j)|}} + (1-\alpha)\,\frac{V(i)\cdot V(j)}{\lVert V(i)\rVert\,\lVert V(j)\rVert}$$
where sim(i, j) is the similarity of item i and item j, with range [0, 1]; $\alpha$ is a weight with range [0.1, 0.9]; $|N(i)|$ is the number of users who like item i; $|N(i)\cap N(j)|$ is the number of users who like both item i and item j; and V(i) is the attribute vector of item i;
S4-2. The predicted rating of item i is given by the prediction formula, in which $P_{ui}$ is the predicted rating of user u for item i, S(i, K) is the set of the K items most similar to item i, and $r_{uj}$ is the actual rating of user u for item j;
S4-3. The rating of a given user for a given item is predicted with the prediction formula.
Compared with the prior art, the beneficial effects of the present invention are:
1. It addresses the decline in recommendation quality caused by data sparsity when the user-item rating matrix is used alone.
2. It improves the accuracy of similarity-based item rating prediction and thereby improves recommendation quality.
Description of the drawings
Fig. 1 is a flow diagram of the recommendation method using the hybrid algorithm based on item-based collaborative filtering and association rules in the present invention;
Fig. 2 compares the MAE of the ratings predicted by the hybrid recommendation algorithm of the present invention with that of traditional collaborative filtering algorithms.
Detailed description of the embodiments
The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the drawings. The described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments obtained by those of ordinary skill in the art based on the embodiments of the present invention without creative effort fall within the protection scope of the present invention.
Moreover, the technical solutions of the various embodiments may be combined with one another, provided that the combination can be implemented by those of ordinary skill in the art. When a combination of technical solutions is contradictory or cannot be implemented, that combination should be regarded as nonexistent and outside the protection scope claimed by the present invention.
As shown in Fig. 1, step S1 builds the historical record data set and obtains the association rules between items, specifically as follows:
1. Build the historical record data set. Each record is the list of items on which a user gave positive feedback:
[ItemID1, ItemID2, ..., ItemIDn].
2. Use the Apriori association rule algorithm to find the association rules of the second-order frequent itemsets that satisfy a minimum support of 0.2 and a minimum confidence of 0.6. The Apriori algorithm consists of two main steps: first find all frequent itemsets in the data set, that is, the itemsets whose frequency of occurrence is greater than or equal to the minimum support; then derive from the frequent itemsets the association rules that satisfy the minimum confidence.
The support is computed as:
$$\mathrm{support}(X \rightarrow Y) = \frac{N(X, Y)}{N_{all}}$$
where support(X → Y) is the support, that is, the probability that itemset X and itemset Y occur together; N(X, Y) is the number of records in which itemset X and itemset Y occur together; and $N_{all}$ is the total number of historical records.
The confidence is computed as:
$$\mathrm{conf}(X \rightarrow Y) = \frac{N(X, Y)}{N(X)}$$
where conf(X → Y) is the confidence, that is, the probability that itemset Y occurs given that itemset X occurs, and N(X) is the number of records in which itemset X occurs.
3. Traverse the historical record data set: first find the first-order frequent itemsets, then the second-order frequent itemsets, and from the second-order frequent itemsets derive the association rules whose confidence is greater than or equal to the minimum confidence. The resulting association rules form a data set of triples (ItemIDx, ItemIDy, Confidence), where ItemIDx is the ID of item x, ItemIDy is the ID of item y, and Confidence is the confidence of the rule.
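The mining of second-order rules described in step S1 can be sketched roughly as follows. This is an illustrative sketch, not the patent's implementation: the function name `mine_pair_rules` and the toy records are hypothetical, and it applies the Apriori pruning idea only up to pairs, taking support as the fraction of records that contain an itemset and confidence as the pair count divided by the antecedent count.

```python
from collections import Counter
from itertools import combinations


def mine_pair_rules(records, min_support=0.2, min_confidence=0.6):
    """Return (x, y, confidence) rules derived from frequent 2-itemsets."""
    n_all = len(records)
    baskets = [set(r) for r in records]

    # First-order frequent itemsets: single items meeting the minimum support.
    item_counts = Counter(item for b in baskets for item in b)
    frequent_items = {i for i, c in item_counts.items() if c / n_all >= min_support}

    # Second-order candidates are built only from frequent single items.
    pair_counts = Counter()
    for b in baskets:
        kept = sorted(b & frequent_items)
        pair_counts.update(combinations(kept, 2))

    rules = []
    for (x, y), count in pair_counts.items():
        if count / n_all < min_support:
            continue
        # A frequent pair can yield a rule in either direction.
        for antecedent, consequent in ((x, y), (y, x)):
            confidence = count / item_counts[antecedent]
            if confidence >= min_confidence:
                rules.append((antecedent, consequent, confidence))
    return sorted(rules)


# Hypothetical positive-feedback records, one list per behavior record.
records = [["A", "B"], ["A", "B", "C"], ["A", "C"], ["B"], ["A", "B"]]
rules = mine_pair_rules(records)
```

With these toy records, A and B co-occur in three of five records, so both A → B and B → A pass the 0.6 confidence threshold, while A → C does not.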
Step S2 builds the user-item rating matrix and computes item similarity, specifically as follows:
1. Build the user-item rating matrix R from the users' historical behavior records:
$$R = \begin{pmatrix} r_{11} & r_{12} & \cdots & r_{1n} \\ r_{21} & r_{22} & \cdots & r_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ r_{m1} & r_{m2} & \cdots & r_{mn} \end{pmatrix}$$
where m is the number of users and n is the number of items. $r_{ui}$ (1 ≤ u ≤ m, 1 ≤ i ≤ n) is the rating of user u for item i; if user u has not rated item i, $r_{ui}$ is empty.
2. Predict ratings using the association rule data set from step S1. Traverse each row of the user-item rating matrix. For each user, if an item on which the user gave positive feedback has an associated item in the association rule data set, and the user has not given positive feedback on that associated item, predict the user's rating of the associated item with the prediction formula, in which $P_{ui}$ is the predicted rating of user u for item i, S(u) is the set of items that user u has rated, $conf_{ji}$ is the confidence of the rule from item j to item i, and $r_{uj}$ is the rating of user u for item j.
3. Fill the user-item rating matrix with the predicted ratings. For each item a user has not yet rated, the predicted rating above is used as that user's rating of the item.
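One way to picture the filling step is the sketch below. It assumes, as a labeled guess and not the patent's confirmed formula, that the predicted rating is a confidence-weighted average of the user's existing ratings over the rules that point to the unrated item. The name `fill_with_rules` and the dictionary layout are hypothetical.

```python
def fill_with_rules(ratings, rules):
    """Return a copy of `ratings` with rule-based predictions added.

    ratings: {user: {item: rating}}; rules: iterable of (j, i, confidence),
    read as "item j implies item i with this confidence".
    """
    filled = {}
    for user, rated in ratings.items():
        weighted_sum = {}
        weight_total = {}
        for j, i, conf in rules:
            # A rule j -> i applies when the user rated j but not i.
            if j in rated and i not in rated:
                weighted_sum[i] = weighted_sum.get(i, 0.0) + conf * rated[j]
                weight_total[i] = weight_total.get(i, 0.0) + conf
        row = dict(rated)
        for i, total in weight_total.items():
            # ASSUMPTION: normalise by the summed confidences so the
            # prediction stays on the original rating scale.
            row[i] = weighted_sum[i] / total
        filled[user] = row
    return filled


# Hypothetical sparse ratings and rules.
ratings = {"u1": {"A": 4.0, "C": 2.0}, "u2": {"B": 5.0}}
rules = [("A", "B", 0.75), ("C", "B", 0.25), ("B", "A", 0.75)]
filled = fill_with_rules(ratings, rules)
```

Existing ratings are never overwritten; only empty cells reachable through at least one rule receive a value, which is what reduces the sparsity of the matrix.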
4. Compute item similarity from the filled user-item rating matrix. The similarity is computed as:
$$\mathrm{sim}(i,j) = \frac{|N(i) \cap N(j)|}{\sqrt{|N(i)|\,|N(j)|}}$$
where sim(i, j) is the similarity of item i and item j, with range [0, 1]; $|N(i)|$ is the number of users who like item i; and $|N(i) \cap N(j)|$ is the number of users who like both item i and item j.
5. For each item, compute its similarity to every other item, finally obtaining the item similarity matrix S:
$$S = \begin{pmatrix} s_{11} & s_{12} & \cdots & s_{1n} \\ s_{21} & s_{22} & \cdots & s_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ s_{n1} & s_{n2} & \cdots & s_{nn} \end{pmatrix}$$
where $s_{ij}$ (1 ≤ i, j ≤ n) is the similarity of item i and item j.
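A small sketch of the item similarity computation follows, using the co-occurrence form of cosine similarity: the number of users shared by two items divided by the square root of the product of their user counts. Treating "likes item i" as "has a rating, original or filled, for item i" is an assumption of this sketch, and `item_similarity` is a hypothetical helper name.

```python
from math import sqrt


def item_similarity(filled):
    """Return {(i, j): sim} for every ordered pair of distinct items."""
    # N(i): the set of users that have a rating for item i.
    users_of = {}
    for user, row in filled.items():
        for item in row:
            users_of.setdefault(item, set()).add(user)

    sim = {}
    items = sorted(users_of)
    for i in items:
        for j in items:
            if i == j:
                continue
            common = len(users_of[i] & users_of[j])
            sim[(i, j)] = common / sqrt(len(users_of[i]) * len(users_of[j]))
    return sim


# Hypothetical filled rating matrix stored as nested dictionaries.
filled = {
    "u1": {"A": 4.0, "B": 3.5},
    "u2": {"A": 5.0, "B": 5.0, "C": 1.0},
    "u3": {"A": 2.0},
}
sim = item_similarity(filled)
```

The result is symmetric and always lies in [0, 1], matching the stated range of sim(i, j).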
Step S3 builds the item-content attribute matrix and computes the similarity based on item attributes, specifically as follows:
1. Analyze each item and build the item-content attribute matrix W:
$$W = \begin{pmatrix} w_{11} & w_{12} & \cdots & w_{1l} \\ w_{21} & w_{22} & \cdots & w_{2l} \\ \vdots & \vdots & \ddots & \vdots \\ w_{n1} & w_{n2} & \cdots & w_{nl} \end{pmatrix}$$
where n is the number of items and l is the number of item attributes. $w_{ij}$ (1 ≤ i ≤ n, 1 ≤ j ≤ l) refers to the j-th attribute of item i: $w_{ij} = 1$ if item i has the attribute, and $w_{ij} = 0$ otherwise.
2. Obtain the similarity between items by comparing their attributes. The similarity is computed as:
$$\mathrm{sim}_c(i,j) = \frac{V(i) \cdot V(j)}{\lVert V(i)\rVert\,\lVert V(j)\rVert}$$
where $\mathrm{sim}_c(i,j)$ is the similarity of item i and item j computed from their content attributes, with range [0, 1], and V(i) is the attribute vector of item i.
3. For each item, compute its similarity to every other item, finally obtaining the item similarity matrix $S_c$:
$$S_c = \begin{pmatrix} Sc_{11} & Sc_{12} & \cdots & Sc_{1n} \\ Sc_{21} & Sc_{22} & \cdots & Sc_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ Sc_{n1} & Sc_{n2} & \cdots & Sc_{nn} \end{pmatrix}$$
where $Sc_{ij}$ (1 ≤ i, j ≤ n) is the similarity of item i and item j computed from their content attributes.
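The attribute-based similarity can be illustrated with plain cosine similarity over 0/1 attribute vectors. The sketch assumes each attribute vector V(i) is the row of W for item i; the helper name `attribute_similarity`, the toy matrix, and the choice to return 0 for an all-zero vector are assumptions made for illustration.

```python
from math import sqrt


def attribute_similarity(v_i, v_j):
    """Cosine similarity of two 0/1 attribute vectors of equal length."""
    dot = sum(a * b for a, b in zip(v_i, v_j))
    norm = sqrt(sum(a * a for a in v_i)) * sqrt(sum(b * b for b in v_j))
    # An item with no attributes has a zero vector; treat its similarity as 0.
    return dot / norm if norm else 0.0


# Hypothetical attribute matrix W: one row per item, one column per attribute.
W = {
    "A": [1, 1, 0, 0],
    "B": [1, 0, 0, 0],
    "C": [0, 0, 1, 1],
}
sim_c = {(i, j): attribute_similarity(W[i], W[j]) for i in W for j in W if i != j}
```

Because the vectors are non-negative, the cosine value cannot fall below 0, so the result stays in [0, 1]. Items A and C share no attribute and therefore get a similarity of 0 regardless of how users rated them.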
Step S4 predicts item ratings from the similarities, specifically as follows:
1. Combine the similarity sim(i, j) with the content-attribute similarity $\mathrm{sim}_c(i,j)$ to obtain the improved item similarity:
$$\mathrm{sim}_{new}(i,j) = \alpha\,\mathrm{sim}(i,j) + (1-\alpha)\,\mathrm{sim}_c(i,j)$$
2. Use the improved item similarity to predict the rating of user u for the target item i with the prediction formula, in which $P_{ui}$ is the predicted rating of user u for item i, S(i, K) is the set of the K items most similar to item i, and $r_{uj}$ is the actual rating of user u for item j.
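The blending and prediction steps can be sketched as below. The blend follows the weighted sum of the two similarities with weight α. The prediction is only an assumed form: a similarity-weighted average over those of the K nearest items that the user has actually rated. The function names `blend` and `predict` are hypothetical.

```python
def blend(sim, sim_c, alpha=0.9):
    """sim_new(i, j) = alpha * sim(i, j) + (1 - alpha) * sim_c(i, j)."""
    return {p: alpha * s + (1 - alpha) * sim_c.get(p, 0.0) for p, s in sim.items()}


def predict(user_ratings, target, sim_new, k=5):
    """Predict one user's rating of `target` from its K most similar items."""
    # S(i, K): the K items most similar to the target item.
    neighbours = sorted(
        ((s, j) for (i, j), s in sim_new.items() if i == target),
        reverse=True,
    )[:k]
    # ASSUMPTION: only neighbours the user has rated contribute, and the
    # weighted sum is normalised by the total similarity that was used.
    used = [(s, user_ratings[j]) for s, j in neighbours if j in user_ratings]
    total = sum(s for s, _ in used)
    if total == 0:
        return None  # no rated neighbour, so no prediction is made
    return sum(s * r for s, r in used) / total


# Hypothetical similarities for target item "A".
sim = {("A", "B"): 1.0, ("A", "C"): 0.0}
sim_c = {("A", "B"): 0.5, ("A", "C"): 0.5}
sim_new = blend(sim, sim_c, alpha=0.5)
p = predict({"B": 4.0, "C": 2.0}, "A", sim_new, k=2)
```

Here item C has no co-rating overlap with A, yet the attribute term still gives it a nonzero blended similarity, so it contributes to the prediction. That is the intended effect of mixing in the content-based similarity.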
Once the predicted item ratings are obtained through steps S1 to S4, the corresponding recommendation list is generated. For the target user, traverse the full item set, select the items on which the user has not given positive feedback, and sort them by predicted rating in descending order to produce the candidate list. Finally, take the first n items of the candidate list as the recommendation list.
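Generating the list amounts to filter, sort, and truncate, as in the following sketch. The helper name `recommend` is hypothetical, and skipping items that have no prediction and breaking ties by item ID are assumptions that the text does not specify.

```python
def recommend(all_items, seen, predicted, n=10):
    """Return the n unseen items with the highest predicted rating."""
    candidates = [
        (predicted[i], i)
        for i in all_items
        if i not in seen and predicted.get(i) is not None
    ]
    # Sort by predicted rating, highest first; ties are broken by item ID.
    candidates.sort(key=lambda pair: (-pair[0], pair[1]))
    return [item for _, item in candidates[:n]]


# Hypothetical inputs: the user already gave positive feedback on "A".
top = recommend(
    all_items=["A", "B", "C", "D", "E"],
    seen={"A"},
    predicted={"B": 3.5, "C": 4.2, "D": 4.2, "E": None},
    n=2,
)
```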
Recommendation quality is measured by the mean absolute error (MAE) between predicted and actual ratings; the smaller the MAE, the higher the prediction accuracy. It is computed as:
$$MAE_u = \frac{1}{N_u}\sum_{i=1}^{N_u} \left| P_{ui} - r_{ui} \right|$$
where $MAE_u$ is the MAE of the ratings predicted for user u over $N_u$ items, $N_u$ is the number of items whose ratings are to be predicted, $P_{ui}$ is the predicted rating of user u for item i, and $r_{ui}$ is the actual rating of user u for item i.
$$MAE = \frac{1}{M}\sum_{u=1}^{M} MAE_u$$
where MAE is the MAE over all users and M is the number of users.
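The two-level MAE, first per user and then averaged over users, can be computed as in this short sketch. The function names and the toy (predicted, actual) pairs are illustrative; users without any test pair are skipped, which is an assumption of the sketch.

```python
def mae_per_user(pairs):
    """MAE_u over a list of (predicted, actual) pairs for one user."""
    return sum(abs(p - r) for p, r in pairs) / len(pairs)


def overall_mae(pairs_by_user):
    """Average of MAE_u over all users that have at least one pair."""
    per_user = [mae_per_user(p) for p in pairs_by_user.values() if p]
    return sum(per_user) / len(per_user)


# Hypothetical test-set predictions.
pairs_by_user = {
    "u1": [(3.5, 4.0), (2.0, 1.0)],
    "u2": [(5.0, 4.5)],
}
mae = overall_mae(pairs_by_user)
```

Averaging per user first gives every user equal weight, so a few very active users do not dominate the overall score.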
Embodiment 1
In this embodiment of the invention, movielens-latest-small is used as the data set to verify the effect of the method. The data set contains a movie data set and a rating data set: the movie data set contains 9742 movies, and the rating data set contains 100836 ratings given by 610 users to these 9742 movies.
In this embodiment, the rating data set is first split into a training set and a test set at a ratio of 7:1. Several different algorithms are then trained on the training set, with α = 0.9 when computing similarity, and finally the algorithms are evaluated on the test set and the resulting MAE values are compared. The results are shown in Fig. 2, where the horizontal axis is the number K of most similar items selected when computing similarity and the vertical axis is the MAE between predicted and actual ratings. itemCF denotes the MAE of the original item-based collaborative filtering algorithm; itemCF-content denotes the MAE obtained with the improved similarity method; rule-itemCF denotes the MAE of item-based collaborative filtering after the user-item rating matrix is filled with ratings predicted by association rules; and rule-itemCF-content denotes the MAE of the recommendation algorithm proposed by the invention.
The experimental results show that the recommendation algorithm proposed here obtains the lowest MAE, so recommendation quality is improved to a certain degree. The MAE is lowest when the 5 most similar items are used to predict ratings.
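A 7:1 split of the rating rows could be done along the following lines. The text does not say how the split was drawn, so the random shuffle, the fixed seed, and the helper name `split_ratings` are all assumptions of this sketch; the rows are hypothetical (user, item, rating) tuples.

```python
import random


def split_ratings(rows, train_parts=7, test_parts=1, seed=0):
    """Shuffle rating rows and split them train:test = train_parts:test_parts."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    cut = len(shuffled) * train_parts // (train_parts + test_parts)
    return shuffled[:cut], shuffled[cut:]


# 16 toy ratings: 4 users, 4 items, all rated 3.0.
rows = [(u, i, 3.0) for u in range(4) for i in range(4)]
train, test = split_ratings(rows)
```

A fixed seed keeps the split reproducible, so the compared algorithms are trained and evaluated on identical data.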
The above are only preferred embodiments of the present invention, and the protection scope of the invention is not limited to them. Any equivalent modification or variation made by those of ordinary skill in the art according to the disclosure of the present invention shall fall within the protection scope set out in the claims.
Claims (5)
1. A recommendation method using a hybrid algorithm based on item-based collaborative filtering and association rules, characterized by comprising the following steps:
S1. Build a historical record data set and obtain the association rules between items;
S2. Build a user-item rating matrix and compute item similarity;
S3. Build an item-content attribute matrix and compute the similarity based on item attributes;
S4. Predict item ratings from the similarities.
2. The recommendation method using a hybrid algorithm based on item-based collaborative filtering and association rules according to claim 1, characterized in that step S1 comprises the following specific steps:
S1-1. Build the historical record data set from each behavior record of the users; every row of the data set is the set of items that received positive feedback in one behavior record of a user;
S1-2. Find the association rules of the second-order frequent itemsets according to the configured minimum support and minimum confidence.
3. The recommendation method using a hybrid algorithm based on item-based collaborative filtering and association rules according to claim 1, characterized in that step S2 comprises the following specific steps:
S2-1. Build the user-item rating matrix from the users' historical records and rating records; each row of the matrix holds one user's ratings of the different items, and each column corresponds to one item;
S2-2. Use the association rules from step S1 to predict a user's ratings of the items associated with the items on which that user has given positive feedback, and fill these ratings into the user-item rating matrix to obtain a new user-item rating matrix;
S2-3. Compute the similarity between items from the filled user-item rating matrix using the cosine similarity formula.
4. The recommendation method using a hybrid algorithm based on item-based collaborative filtering and association rules according to claim 1, characterized in that step S3 comprises the following specific steps:
S3-1. Build the item-content attribute matrix for all items; each row of the matrix records whether one item has each content attribute;
S3-2. Compute the similarity between items from the item-content attribute matrix using cosine similarity.
5. The recommendation method using a hybrid algorithm based on item-based collaborative filtering and association rules according to claim 1, characterized in that step S4 comprises the following specific steps:
S4-1. The similarity of item i and item j is:
$$\mathrm{sim}(i,j) = \alpha\,\frac{|N(i)\cap N(j)|}{\sqrt{|N(i)|\,|N(j)|}} + (1-\alpha)\,\frac{V(i)\cdot V(j)}{\lVert V(i)\rVert\,\lVert V(j)\rVert}$$
where sim(i, j) is the similarity of item i and item j, with range [0, 1]; $\alpha$ is a weight with range [0.1, 0.9]; $|N(i)|$ is the number of users who like item i; $|N(i)\cap N(j)|$ is the number of users who like both item i and item j; and V(i) is the attribute vector of item i;
S4-2. The predicted rating of item i is given by the prediction formula, in which $P_{ui}$ is the predicted rating of user u for item i, S(i, K) is the set of the K items most similar to item i, and $r_{uj}$ is the actual rating of user u for item j;
S4-3. The rating of a given user for a given item is predicted with the prediction formula.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910667120.XA CN110362755A (en) | 2019-07-23 | 2019-07-23 | A kind of recommended method of the hybrid algorithm based on article collaborative filtering and correlation rule |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910667120.XA CN110362755A (en) | 2019-07-23 | 2019-07-23 | A kind of recommended method of the hybrid algorithm based on article collaborative filtering and correlation rule |
Publications (1)
Publication Number | Publication Date |
---|---|
CN110362755A true CN110362755A (en) | 2019-10-22 |
Family
ID=68219738
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910667120.XA Withdrawn CN110362755A (en) | 2019-07-23 | 2019-07-23 | A kind of recommended method of the hybrid algorithm based on article collaborative filtering and correlation rule |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110362755A (en) |
Cited By (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110825971A (en) * | 2019-11-11 | 2020-02-21 | 辽宁师范大学 | Article cold start recommendation algorithm integrating relationship mining and collaborative filtering |
CN111369315A (en) * | 2020-02-27 | 2020-07-03 | 拉扎斯网络科技(上海)有限公司 | Resource object recommendation method and device, and data prediction model training method and device |
CN112200601A (en) * | 2020-09-11 | 2021-01-08 | 深圳市法本信息技术股份有限公司 | Item recommendation method and device and readable storage medium |
CN113221014A (en) * | 2021-06-09 | 2021-08-06 | 中国银行股份有限公司 | Personalized recommendation method and system for application function |
CN113688314A (en) * | 2021-08-13 | 2021-11-23 | 今彩慧健康科技(苏州)有限公司 | Physiotherapy store recommendation method and device |
CN113722443A (en) * | 2021-09-10 | 2021-11-30 | 焦点科技股份有限公司 | Label recommendation method and system integrating text similarity and collaborative filtering |
CN113822738A (en) * | 2021-06-22 | 2021-12-21 | 昆明理工大学 | Multi-dimensional agricultural product supply and demand bidirectional personalized recommendation method |
-
2019
- 2019-07-23 CN CN201910667120.XA patent/CN110362755A/en not_active Withdrawn
Cited By (12)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110825971A (en) * | 2019-11-11 | 2020-02-21 | 辽宁师范大学 | Article cold start recommendation algorithm integrating relationship mining and collaborative filtering |
CN110825971B (en) * | 2019-11-11 | 2023-04-14 | 辽宁师范大学 | Article cold start recommendation algorithm integrating relationship mining and collaborative filtering |
CN111369315A (en) * | 2020-02-27 | 2020-07-03 | 拉扎斯网络科技(上海)有限公司 | Resource object recommendation method and device, and data prediction model training method and device |
CN112200601A (en) * | 2020-09-11 | 2021-01-08 | 深圳市法本信息技术股份有限公司 | Item recommendation method and device and readable storage medium |
CN112200601B (en) * | 2020-09-11 | 2024-05-14 | 深圳市法本信息技术股份有限公司 | Item recommendation method, device and readable storage medium |
CN113221014A (en) * | 2021-06-09 | 2021-08-06 | 中国银行股份有限公司 | Personalized recommendation method and system for application function |
CN113822738A (en) * | 2021-06-22 | 2021-12-21 | 昆明理工大学 | Multi-dimensional agricultural product supply and demand bidirectional personalized recommendation method |
CN113822738B (en) * | 2021-06-22 | 2024-05-14 | 昆明理工大学 | Multi-dimensional agricultural product supply and demand bidirectional personalized recommendation method |
CN113688314A (en) * | 2021-08-13 | 2021-11-23 | 今彩慧健康科技(苏州)有限公司 | Physiotherapy store recommendation method and device |
CN113688314B (en) * | 2021-08-13 | 2024-03-19 | 今彩慧健康科技(苏州)有限公司 | Physical therapy store recommending method and device |
CN113722443A (en) * | 2021-09-10 | 2021-11-30 | 焦点科技股份有限公司 | Label recommendation method and system integrating text similarity and collaborative filtering |
CN113722443B (en) * | 2021-09-10 | 2024-04-19 | 焦点科技股份有限公司 | Label recommendation method and system integrating text similarity and collaborative filtering |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN110362755A (en) | A kind of recommended method of the hybrid algorithm based on article collaborative filtering and correlation rule | |
CN106802956B (en) | Movie recommendation method based on weighted heterogeneous information network | |
CN105138653B (en) | It is a kind of that method and its recommendation apparatus are recommended based on typical degree and the topic of difficulty | |
CN107679239B (en) | Personalized community recommendation method based on user behaviors | |
CN107562947A (en) | A kind of Mobile Space-time perceives the lower dynamic method for establishing model of recommendation service immediately | |
CN105183748B (en) | A kind of combination forecasting method based on content and scoring | |
Anand et al. | Folksonomy-based fuzzy user profiling for improved recommendations | |
CN112231583B (en) | E-commerce recommendation method based on dynamic interest group identification and generation of confrontation network | |
CN106649658A (en) | Recommendation system and method for improving user role undifferentiated treatment and data sparseness | |
CN110619559B (en) | Method for accurately recommending commodities in electronic commerce based on big data information | |
CN115712780A (en) | Information pushing method and device based on cloud computing and big data | |
Chen et al. | A fuzzy matrix factor recommendation method with forgetting function and user features | |
Jiao et al. | Research on personalized recommendation optimization of E-commerce system based on customer trade behaviour data | |
CN114840745A (en) | Personalized recommendation method and system based on graph feature learning and deep semantic matching model | |
CN108875071B (en) | Learning resource recommendation method based on multi-view interest | |
CN108491477B (en) | Neural network recommendation method based on multi-dimensional cloud and user dynamic interest | |
Sharma et al. | CCFRS–community based collaborative filtering recommender system | |
CN110825965A (en) | Improved collaborative filtering recommendation method based on trust mechanism and time weighting | |
Wang | Application of E-Commerce Recommendation Algorithm in Consumer Preference Prediction | |
CN110569374B (en) | Movie recommendation method based on improved collaborative filtering algorithm | |
Shang et al. | An Improved Tensor Decomposition Model for Recommendation System | |
CN110825971A (en) | Article cold start recommendation algorithm integrating relationship mining and collaborative filtering | |
Sun et al. | CROA: A Content-Based Recommendation Optimization Algorithm for Personalized Knowledge Services | |
Hwang et al. | Integrating multiple linear regression and multicriteria collaborative filtering for better recommendation | |
Yang et al. | Consumers’ Purchase Behavior Preference in E-Commerce Platform Based on Data Mining Algorithm |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |
WW01 | Invention patent application withdrawn after publication | Application publication date: 20191022 |