CN103632290A - Recommendation probability fusion based hybrid recommendation method - Google Patents

Info

Publication number
CN103632290A
Authority
CN
China
Prior art keywords
user
scoring
commodity
outcome
represent
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201310637512.4A
Other languages
Chinese (zh)
Other versions
CN103632290B (en)
Inventor
刘业政
姜元春
王锦坤
孙春华
魏婧
杜非
王佳佳
姬建睿
何建民
凌海峰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hefei University of Technology
Original Assignee
Hefei University of Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hefei University of Technology
Priority to CN201310637512.4A
Publication of CN103632290A
Application granted
Publication of CN103632290B
Legal status: Active

Landscapes

  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a hybrid recommendation method based on recommendation probability fusion. The method comprises the following steps: (1) representing the rating data of commodities with a two-dimensional table; (2) treating each item in the rated set in turn as unknown, using basic recommendation methods to obtain a prediction result for that item, and training a neural network on the ratings of the rated items and their corresponding prediction results to obtain a score forecast model (SFM); (3) using the basic recommendation methods to obtain prediction results for the unrated items, and feeding the set of these prediction results to the SFM to obtain a final predicted value for each unrated item; (4) sorting all of a user's unrated items in descending order of their final predicted values to obtain a sorted set of unrated items, and recommending the top N items of that set to the user. The method effectively reflects the actual state of user evaluations and improves the precision of personalized recommendation.

Description

A hybrid recommendation method based on recommendation probability fusion
Technical field
The invention belongs to the field of e-commerce and specifically relates to a hybrid recommendation method based on recommendation probability fusion.
Background
With the rapid development of e-commerce, information overload has become increasingly severe. How to satisfy users' individual needs from a massive set of commodities has become a key issue for improving user experience and user satisfaction. Personalized recommendation systems are an important means of meeting these needs. A personalized recommendation system builds a model of a user's interests and preferences from that user's online browsing or purchase data, and then recommends commodities that match the user's particular needs. Personalized commodity recommendation is widely used on e-commerce websites such as Amazon, Jingdong Mall (JD.com) and Taobao, where it has effectively increased the likelihood of purchase and improved users' experience of the websites' services.
Collaborative filtering is one of the earliest and most successful techniques applied to personalized recommendation. Its main idea is to build the user interest and preference model on the assumption that users with similar characteristics also have similar interests and preferences. Although existing research provides a theoretical basis and practical guidance for building personalized recommendation systems, it still has a number of shortcomings:
(1) Incomplete representation of recommendation information. Existing recommendation methods usually represent or predict a user's evaluation of a commodity as a single concrete value. For example, predicting that a user's evaluation of a commodity is 3 points amounts to asserting that the probability of the user rating the commodity 3 points is 100%. In practice, a user's evaluation of a particular commodity is usually uncertain; the evaluation of a commodity is typically something like "good", "fairly good" or "not bad". Representing this uncertainty as the probabilities with which the user gives each rating, for example probabilities of 30%, 40% and 20% for ratings of 3, 4 and 5 points respectively, reflects the actual state of the user's evaluation and has a positive effect on the precision of personalized recommendation. Existing methods represent the user's rating as a single value and ignore this uncertainty, so they cannot reflect how the user actually evaluates the commodity, which reduces the precision of the recommendation system.
(2) Fusion of recommendation information. User-based collaborative filtering finds neighbor users and recommends the items liked by users who share the target user's interests; its results emphasize the current interests of the small group of users similar to the target user. Item-based collaborative filtering finds neighbor commodities and recommends commodities similar to those the user chose in the past; its results emphasize preserving the user's historical interests. The two methods produce personalized recommendations from the user perspective and the item perspective respectively, and fusing their results makes comprehensive use of the recommendation information from both perspectives and improves the precision of personalized recommendation. Existing research lacks a unified framework for fusing recommendation information from different perspectives. For example, one collaborative filtering recommendation algorithm uses the predictions of item-based collaborative filtering as the input of user-based collaborative filtering; although it uses both methods, it does not fuse the user-based results with the item-based results.
Summary of the invention
To overcome the shortcomings of the prior art, the present invention proposes a hybrid recommendation method based on recommendation probability fusion. It provides a unified framework for fusing the recommendation results produced by different recommendation methods, effectively reflects the actual state of user evaluations, and improves the precision of personalized recommendation.
To achieve this objective, the invention adopts the following technical solution:
The hybrid recommendation method based on recommendation probability fusion is characterized in that it is carried out as follows:
Step 1: Represent the rating data of commodities with a two-dimensional table $T = \{U, I, f\}$.
In the table $T$, $U = \{U_1, \ldots, U_u, \ldots, U_{|U|}\}$ denotes the user set, $I = \{I_1, \ldots, I_i, \ldots, I_{|I|}\}$ denotes the commodity set, and $f$ denotes the users' ratings of commodities.
In the user set $U$, $|U|$ is the total number of users and $U_u$ denotes the $u$-th user. In the commodity set $I$, $|I|$ is the total number of commodities and $I_i$ denotes the $i$-th commodity. Suppose the rating scale on which user $U_u$ rates commodity $I_i$ is $S = \{S_1, \ldots, S_s, \ldots, S_{|S|}\}$, where each rating $S_s$ is an integer and $S_1 < \cdots < S_s < \cdots < S_{|S|}$; $S_1$ denotes the lowest rating of a commodity and $S_{|S|}$ the highest.
In the table $T$, the set of existing ratings of all users on rated items is denoted by [Figure BDA0000427982550000021], where [Figure BDA0000427982550000022]. $\hat{F}_u$ is the rated set of user $U_u$, with $\hat{F}_u = \{\hat{f}_{u1}, \ldots, \hat{f}_{ui}, \ldots, \hat{f}_{u|\hat{F}_u|}\}$; $\hat{f}_{ui}$ denotes the $i$-th existing rating of user $U_u$ and $|\hat{F}_u|$ denotes the total number of existing ratings of user $U_u$. Let [Figure BDA0000427982550000027] be the unrated set of user $U_u$, where [Figure BDA0000427982550000029] denotes the $i$-th unrated item of user $U_u$ and [Figure BDA00004279825500000210] denotes the total number of unrated items of user $U_u$. For an arbitrary item in the user's unrated set, let [Figure BDA00004279825500000212].
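As an illustration only, the table $T$ and a user's rated and unrated sets could be held in a nested dictionary. The user and commodity identifiers, the toy ratings, and the helper names `rated_items` and `unrated_items` are hypothetical and do not come from the patent.

```python
from typing import Dict, List

# Hypothetical toy data: ratings[user][commodity] is an integer on the scale S.
SCALE: List[int] = [1, 2, 3, 4, 5]            # S_1 < ... < S_|S|
USERS: List[str] = ["U1", "U2", "U3"]
ITEMS: List[str] = ["I1", "I2", "I3", "I4"]

ratings: Dict[str, Dict[str, int]] = {
    "U1": {"I1": 5, "I2": 3},
    "U2": {"I1": 4, "I3": 2, "I4": 5},
    "U3": {"I2": 1},
}


def rated_items(user: str) -> List[str]:
    """Commodities the user has already rated (the user's rated set)."""
    return [i for i in ITEMS if i in ratings.get(user, {})]


def unrated_items(user: str) -> List[str]:
    """Commodities the user has not rated yet (the recommendation candidates)."""
    return [i for i in ITEMS if i not in ratings.get(user, {})]


print(rated_items("U1"))    # ['I1', 'I2']
print(unrated_items("U1"))  # ['I3', 'I4']
```

A sparse layout like this stores only the ratings that exist, which suits rating tables in which most user and commodity pairs are empty.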
Step 2: Treat an arbitrary item $\hat{f}_{ui}$ in the user's rated set as unknown. Using the other items in the user's rated set $\hat{F}_u$, apply the user-based collaborative filtering method and the item-based collaborative filtering method to obtain, respectively, the neighbor-user prediction result $Pr_{ui}^{U}$ and the neighbor-item prediction result $Pr_{ui}^{I}$ of that item. Take the neighbor-user prediction result $Pr_{ui}^{U}$ and the neighbor-item prediction result $Pr_{ui}^{I}$ together as the rated-item prediction result $\hat{f}_{ui}^{0}$ of the item $\hat{f}_{ui}$. The prediction results of all items in the user's rated set $\hat{F}_u$ are denoted by the rated-item prediction result set [Figure BDA00004279825500000224].
Use the rated-item prediction result set [Figure BDA00004279825500000225] as the input of a neural network and the user's rated set [Figure BDA00004279825500000226] as the output of the neural network, and train the neural network to obtain the score forecast model SFM.
Step 3: Using all items in the user's rated set, apply the user-based collaborative filtering method and the item-based collaborative filtering method to obtain the preliminary prediction result [Figure BDA00004279825500000316] of an arbitrary item [Figure BDA0000427982550000033] in the user's unrated set [Figure BDA0000427982550000032]. The prediction results of all items in the user's unrated set [Figure BDA0000427982550000035] are denoted by the unrated-item prediction result set [Figure BDA0000427982550000036]. Use the unrated-item prediction result set [Figure BDA0000427982550000037] as the input of the score forecast model SFM, and use the SFM to obtain the set of final predicted values of the unrated items.
Step 4: Sort all unrated items of user $U_u$ in descending order of their predicted values in the set of final predicted values to obtain a sorted set of unrated items, and recommend the top $N$ items of the sorted set to user $U_u$ as the recommendation result.
The hybrid recommendation method based on recommendation probability fusion is further characterized in that:
The user-based collaborative filtering method in Step 2 is carried out as follows:
1) From the rated set [Figure BDA0000427982550000038] of user $U_u$ over all commodities and the rated sets of the other users over all commodities, use the constrained Pearson correlation measure to obtain the set of similarities between user $U_u$ and each of the other users; sort the other users in descending order of similarity to obtain the preliminary neighbor-user set $N_u$.
2) Denote the users corresponding to commodity $I_i$ in the rated set [Figure BDA0000427982550000039] as the rating-user set $R_i$.
3) Denote the first $k$ users in the intersection of the preliminary neighbor-user set $N_u$ and the rating-user set $R_i$ as the neighbor-user set $N_{ui}$.
4) Denote the ratings given to commodity $I_i$ by each user in the neighbor-user set $N_{ui}$ as the neighbor-user rating set $F_{ui}$.
5) Use formula (1) to obtain the user-based rating probability $Pr_{ui}^{U}(S_s)$:
$$Pr_{ui}^{U}(S_s) = Num^{U}(S_s) / k \qquad (1)$$
In formula (1), $Num^{U}(S_s)$ is the number of times the rating $S_s$ occurs in the neighbor-user rating set $F_{ui}$.
6) The ratings $S_s$ and the user-based rating probabilities $Pr_{ui}^{U}(S_s)$ together form the neighbor-user prediction result $Pr_{ui}^{U}$:
$$Pr_{ui}^{U} = \{(S_1, Pr_{ui}^{U}(S_1)), \ldots, (S_s, Pr_{ui}^{U}(S_s)), \ldots, (S_{|S|}, Pr_{ui}^{U}(S_{|S|}))\} \qquad (2)$$
In formula (2), $Pr_{ui}^{U}(S_s) \in [0, 1]$.
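Formulas (1) and (2) amount to a frequency count over the neighbors' ratings. The sketch below is a minimal illustration under that reading; the function name `rating_distribution` and the sample ratings are hypothetical.

```python
from collections import Counter
from typing import Dict, List


def rating_distribution(neighbor_ratings: List[int],
                        scale: List[int], k: int) -> Dict[int, float]:
    """Formula (1): Pr(S_s) = Num(S_s) / k for every level S_s on the scale.

    The returned mapping from rating level to probability plays the role of
    the set of (rating, probability) pairs in formula (2).
    """
    counts = Counter(neighbor_ratings)
    return {s: counts.get(s, 0) / k for s in scale}


# Five neighbors rated the commodity 4, 4, 5, 3 and 4.
dist = rating_distribution([4, 4, 5, 3, 4], scale=[1, 2, 3, 4, 5], k=5)
print(dist)  # {1: 0.0, 2: 0.0, 3: 0.2, 4: 0.6, 5: 0.2}
```

Because the divisor is $k$ rather than the number of neighbors actually found, the probabilities sum to less than 1 whenever the intersection in step 3) yields fewer than $k$ users.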
The item-based collaborative filtering method is carried out as follows:
1) From the set of ratings given by all users to commodity $I_i$ and the sets of ratings given to the other commodities, use the constrained Pearson correlation measure to obtain the set of similarities between commodity $I_i$ and each of the other commodities; sort the other commodities in descending order of similarity to obtain the preliminary neighbor-commodity set $N_i$.
2) Denote the commodities corresponding to user $U_u$ in the rated set as the rated-commodity set $R_u$.
3) Denote the first $k$ commodities in the intersection of the preliminary neighbor-commodity set $N_i$ and the rated-commodity set $R_u$ as the neighbor-commodity set $N_{iu}$.
4) Denote the ratings given by user $U_u$ to each commodity in the neighbor-commodity set $N_{iu}$ as the neighbor-commodity rating set $F_{iu}$.
5) Use formula (3) to obtain the item-based rating probability $Pr_{ui}^{I}(S_s)$:
$$Pr_{ui}^{I}(S_s) = Num^{I}(S_s) / k \qquad (3)$$
In formula (3), $Num^{I}(S_s)$ is the number of times the rating $S_s$ occurs in the neighbor-commodity rating set $F_{iu}$.
6) The ratings $S_s$ and the item-based rating probabilities $Pr_{ui}^{I}(S_s)$ together form the neighbor-item prediction result $Pr_{ui}^{I}$:
$$Pr_{ui}^{I} = \{(S_1, Pr_{ui}^{I}(S_1)), \ldots, (S_s, Pr_{ui}^{I}(S_s)), \ldots, (S_{|S|}, Pr_{ui}^{I}(S_{|S|}))\} \qquad (4)$$
In formula (4), $Pr_{ui}^{I}(S_s) \in [0, 1]$.
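Steps 1) to 3) of the item-based method can be sketched as follows. The patent names the constrained Pearson correlation measure but does not write out its formula; this sketch assumes the commonly cited form that centers ratings on the midpoint of the rating scale (3 on a 1 to 5 scale). The function names and data layout are hypothetical.

```python
import math
from typing import Dict, List


def constrained_pearson(a: Dict[str, int], b: Dict[str, int],
                        midpoint: float) -> float:
    """Correlation over co-rated keys, with ratings centered on the scale midpoint."""
    common = set(a) & set(b)
    num = sum((a[x] - midpoint) * (b[x] - midpoint) for x in common)
    den = math.sqrt(sum((a[x] - midpoint) ** 2 for x in common)
                    * sum((b[x] - midpoint) ** 2 for x in common))
    return num / den if den else 0.0


def item_neighbors(target: str, user: str,
                   by_item: Dict[str, Dict[str, int]],
                   by_user: Dict[str, Dict[str, int]],
                   k: int, midpoint: float = 3.0) -> List[str]:
    """Top-k commodities most similar to `target` among those `user` has rated."""
    others = [i for i in by_item if i != target]
    # Preliminary neighbor set N_i: other commodities by descending similarity.
    ranked = sorted(
        others,
        key=lambda i: constrained_pearson(by_item[target], by_item[i], midpoint),
        reverse=True,
    )
    rated = by_user[user]  # R_u: commodities already rated by the user
    # N_iu: first k members of N_i that also belong to R_u.
    return [i for i in ranked if i in rated][:k]
```

Here `by_item` maps each commodity to its `{user: rating}` entries and `by_user` maps each user to `{commodity: rating}`. The user-based method follows the same pattern with the two roles swapped.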
Compared with the prior art, the beneficial effects of the invention are as follows:
1. The invention expresses the recommendation results obtained separately by the user-based and item-based recommendation methods as recommendation probabilities and fuses them with a neural network. This overcomes the incomplete representation of recommendation information in conventional hybrid recommendation methods and provides a unified framework for fusing recommendation information from different perspectives; its recommendation precision is clearly better than that of the user-based recommendation method and of the item-based recommendation method.
2. The invention uses the set of pairs formed by each rating $S_s$ and its user-based rating probability, $Pr_{ui}^{U}$, to represent the recommendation information obtained by the user-based recommendation method, and the set of pairs formed by each rating $S_s$ and its item-based rating probability, $Pr_{ui}^{I}$, to represent the recommendation information obtained by the item-based recommendation method. Compared with representing a user's evaluation of a commodity as a single concrete value, pairs of rating and rating probability reflect the actual state of the user's evaluation more faithfully.
3. The invention uses the user-based and item-based collaborative filtering methods to obtain the rated-item prediction results, takes the rated-item prediction result set as input and the rated set as output, and uses the fitting capability of the neural network to obtain the score forecast model SFM, which ensures the robustness of the score forecast model.
4. The invention uses the user-based and item-based collaborative filtering methods to obtain preliminary prediction results for the unrated set, feeds the preliminary prediction results to the score forecast model SFM to obtain the set of final predicted values of the unrated items, and from these obtains the final recommendation result. Compared with conventional recommendation methods, the invention effectively fuses the recommendation results of different recommendation methods and improves the precision of personalized recommendation.
5. The proposed hybrid recommendation method fuses the results of the user-based and item-based collaborative filtering methods, which helps improve the diversity of the recommendation results and better matches users' actual preferences; it overcomes the shortcoming of prior-art approaches that use only user-based or only item-based collaborative filtering.
6. The invention can be used in personalized recommendation systems for physical products such as clothing and mobile phones, digital products such as films and music, and service products such as travel routes and holiday arrangements. It can be used on platforms such as web pages and apps on computers and mobile phones, and has a wide range of applications.
Brief description of the drawings
Fig. 1 is a flow chart of the method of the invention;
Fig. 2 shows the sensitivity experiment results for the similarity measures;
Fig. 3 shows the sensitivity experiment results for the number of neighbor users (commodities);
Fig. 4 shows the prediction accuracy experiment results;
Fig. 5 shows the sensitivity experiment results for the prediction training model.
Detailed description of the embodiments
The invention represents the rating data of commodities with a two-dimensional table, treats each item in the rated set in turn as unknown, uses basic recommendation methods to obtain the prediction result of that item, and trains a neural network on the ratings of the rated items and the set of their prediction results to obtain the score forecast model SFM. The set of prediction results obtained for the unrated items by the basic recommendation methods is then fed to the SFM to obtain the final predicted values of the unrated items. Finally, the method is compared with the basic algorithms on a standard data set. As shown in Fig. 1, the method of this embodiment comprises the following steps:
Step 1: Represent the rating data of commodities with a two-dimensional table $T = \{U, I, f\}$. Specifically:
As shown in Table 1, $U = \{U_1, \ldots, U_u, \ldots, U_{|U|}\}$ denotes the user set, $I = \{I_1, \ldots, I_i, \ldots, I_{|I|}\}$ denotes the commodity set, and $f$ denotes the users' ratings of commodities.
In the user set $U$, $|U|$ is the total number of users and $U_u$ denotes the $u$-th user. In the commodity set $I$, $|I|$ is the total number of commodities and $I_i$ denotes the $i$-th commodity. Suppose the rating scale on which user $U_u$ rates commodity $I_i$ is $S = \{S_1, \ldots, S_s, \ldots, S_{|S|}\}$, where each rating $S_s$ is an integer and $S_1 < \cdots < S_s < \cdots < S_{|S|}$; $S_1$ denotes the lowest rating of a commodity and $S_{|S|}$ the highest.
Figure BDA0000427982550000061
Table 1
In the table $T$, the existing ratings of all users on rated items form the set defined by [Figure BDA0000427982550000063]. $\hat{F}_u$ is the rated set of user $U_u$, with $\hat{F}_u = \{\hat{f}_{u1}, \ldots, \hat{f}_{ui}, \ldots, \hat{f}_{u|\hat{F}_u|}\}$; $\hat{f}_{ui}$ denotes the $i$-th existing rating of user $U_u$ and $|\hat{F}_u|$ denotes the total number of existing ratings of user $U_u$. The unrated set of user $U_u$ is [Figure BDA0000427982550000068], where [Figure BDA0000427982550000069] denotes the $i$-th unrated item of user $U_u$ and [Figure BDA00004279825500000610] denotes the total number of unrated items of user $U_u$. For an arbitrary item in the user's unrated set [Figure BDA00004279825500000611], let [Figure BDA00004279825500000612].
Step 2: Use the basic recommendation methods to obtain the prediction result of each rated item, and train a neural network on the ratings of the rated items and their prediction results to obtain the score forecast model SFM. Specifically:
1) Treat an arbitrary item $\hat{f}_{ui}$ in the user's rated set as unknown, and apply the user-based collaborative filtering method to the other items in the user's rated set $\hat{F}_u$ to obtain the neighbor-user prediction result $Pr_{ui}^{U}$ of that item. The other items are all items in the rated set $\hat{F}_u = \{\hat{f}_{u1}, \ldots, \hat{f}_{ui}, \ldots, \hat{f}_{u|\hat{F}_u|}\}$ except $\hat{f}_{ui}$.
1.1) From the rated set [Figure BDA00004279825500000619] of user $U_u$ over all commodities $I = \{I_1, \ldots, I_i, \ldots, I_{|I|}\}$ and the rated sets of the other users over all commodities, use the constrained Pearson correlation measure to obtain the set of similarities between user $U_u$ and each of the other users; sort the other users in descending order of similarity to obtain the preliminary neighbor-user set $N_u$. The other users are all users in the user set $U = \{U_1, \ldots, U_u, \ldots, U_{|U|}\}$ except $U_u$, and the rated sets of the other users are the rated sets in [Figure BDA00004279825500000620] of all users other than $U_u$. Three user similarity measures are available: cosine similarity, Pearson correlation and constrained Pearson correlation. To compare the effect of the different similarity measures on the prediction precision of the method, five groups of experiments were designed; the results are shown in Fig. 2, where the horizontal axis is the data set and the vertical axis is the prediction error. The curves show that, on the standard data set used here, the error of the constrained Pearson correlation measure is lower than that of both the cosine similarity measure and the Pearson correlation measure in every case, so it helps improve the prediction precision of the method.
1.2) Denote the users corresponding to commodity $I_i$ in the rated set [Figure BDA0000427982550000071] as the rating-user set $R_i$.
1.3) Denote the first $k$ users in the intersection of the preliminary neighbor-user set $N_u$ and the rating-user set $R_i$ as the neighbor-user set $N_{ui}$. The choice of $k$ is a key factor in the performance of user-based collaborative filtering. To verify the effect of the number of neighbor users on the prediction precision of the method, seven groups of experiments were designed; each group selects $k$ neighbor users, with $k = 10, 20, \ldots, 80$, and computes the prediction precision. The results are shown in Fig. 3, where the horizontal axis is the number of neighbor users and the vertical axis is the prediction error. When the number of neighbor users $k$ is small, the predicted probabilities are not accurate enough, which lowers the prediction accuracy; when $k$ is large, the similarity between the users is low, which also lowers the prediction accuracy of the algorithm. Therefore, for the standard data set used in this experiment, choosing the number of neighbor users in the range [30, 70] gives good prediction results; for other data sets, the best number of neighbor users depends on the data.
1.4) Denote the ratings given to commodity $I_i$ by each user in the neighbor-user set $N_{ui}$ as the neighbor-user rating set $F_{ui}$.
1.5) Use formula (1) to obtain the user-based rating probability $Pr_{ui}^{U}(S_s)$, that is, the probability, obtained by the user-based collaborative filtering method, that user $U_u$ gives commodity $I_i$ the rating $S_s$:
$$Pr_{ui}^{U}(S_s) = Num^{U}(S_s) / k \qquad (1)$$
In formula (1), $Num^{U}(S_s)$ is the number of times the rating $S_s$ occurs in the neighbor-user rating set $F_{ui}$.
1.6) The ratings $S_s$ and the user-based rating probabilities $Pr_{ui}^{U}(S_s)$ together form the neighbor-user prediction result $Pr_{ui}^{U}$:
$$Pr_{ui}^{U} = \{(S_1, Pr_{ui}^{U}(S_1)), \ldots, (S_s, Pr_{ui}^{U}(S_s)), \ldots, (S_{|S|}, Pr_{ui}^{U}(S_{|S|}))\} \qquad (2)$$
In formula (2), $Pr_{ui}^{U}(S_s) \in [0, 1]$.
2) Treat an arbitrary item $\hat{f}_{ui}$ in the user's rated set $\hat{F}_u$ as unknown, and apply the item-based collaborative filtering method to the other items in the user's rated set $\hat{F}_u$ to obtain the neighbor-item prediction result $Pr_{ui}^{I}$ of that item. The other items are all items in the rated set $\hat{F}_u$ except the $i$-th existing rating $\hat{f}_{ui}$.
2.1) From the set of ratings given by the users in $U$ to commodity $I_i$ and the sets of ratings given to the other commodities, use the constrained Pearson correlation measure to obtain the set of similarities between commodity $I_i$ and each of the other commodities; sort the other commodities in descending order of similarity to obtain the preliminary neighbor-commodity set $N_i$. The other commodities are all commodities in the commodity set $I = \{I_1, \ldots, I_i, \ldots, I_{|I|}\}$ except $I_i$.
2.2) Denote the commodities corresponding to user $U_u$ in the rated set as the rated-commodity set $R_u$.
2.3) Denote the first $k$ commodities in the intersection of the preliminary neighbor-commodity set $N_i$ and the rated-commodity set $R_u$ as the neighbor-commodity set $N_{iu}$. The choice of $k$ is a key factor in the performance of item-based collaborative filtering. To verify the effect of the number of neighbor items on the prediction precision of the method, seven groups of experiments were designed; each group selects $k$ neighbor items, with $k = 10, 20, \ldots, 80$, and computes the prediction precision. The results are shown in Fig. 3, where the horizontal axis is the number of neighbor items and the vertical axis is the prediction error. When the number of neighbor items $k$ is small, the predicted probabilities are not accurate enough, which lowers the prediction accuracy; when $k$ is large, the similarity between the items is low, which also lowers the prediction accuracy of the algorithm. Therefore, for the standard data set used in this experiment, choosing the number of neighbor items in the range [30, 70] gives good prediction results; for other data sets, the best number of neighbor items depends on the data.
2.4) Denote the ratings given by user $U_u$ to each commodity in the neighbor-commodity set $N_{iu}$ as the neighbor-commodity rating set $F_{iu}$.
2.5) Use formula (3) to obtain the item-based rating probability $Pr_{ui}^{I}(S_s)$, that is, the probability, obtained by the item-based collaborative filtering method, that user $U_u$ gives commodity $I_i$ the rating $S_s$:
$$Pr_{ui}^{I}(S_s) = Num^{I}(S_s) / k \qquad (3)$$
In formula (3), $Num^{I}(S_s)$ is the number of times the rating $S_s$ occurs in the neighbor-commodity rating set $F_{iu}$.
2.6) The ratings $S_s$ and the item-based rating probabilities $Pr_{ui}^{I}(S_s)$ together form the neighbor-item prediction result $Pr_{ui}^{I}$:
$$Pr_{ui}^{I} = \{(S_1, Pr_{ui}^{I}(S_1)), \ldots, (S_s, Pr_{ui}^{I}(S_s)), \ldots, (S_{|S|}, Pr_{ui}^{I}(S_{|S|}))\} \qquad (4)$$
In formula (4), $Pr_{ui}^{I}(S_s) \in [0, 1]$.
3) Take the neighbor-user prediction result $Pr_{ui}^{U}$ and the neighbor-item prediction result $Pr_{ui}^{I}$ together as the rated-item prediction result $\hat{f}_{ui}^{0}$ of the item $\hat{f}_{ui}$. Specifically, the rated-item prediction result of the item is:
$$\hat{f}_{ui}^{0} = \{(S_1, Pr_{ui}^{U}(S_1)), \ldots, (S_s, Pr_{ui}^{U}(S_s)), \ldots, (S_{|S|}, Pr_{ui}^{U}(S_{|S|})), (S_1, Pr_{ui}^{I}(S_1)), \ldots, (S_s, Pr_{ui}^{I}(S_s)), \ldots, (S_{|S|}, Pr_{ui}^{I}(S_{|S|}))\}$$
Perform steps 1) and 2) for every item in the rated set. The prediction results of all items in the rated set [Figure BDA00004279825500000813] are denoted by the rated-item prediction result set [Figure BDA00004279825500000814].
4) Use the rated-item prediction result set [Figure BDA00004279825500000815] as the input of the neural network and the rated set [Figure BDA00004279825500000816] as the output of the neural network, and train the neural network to obtain the score forecast model SFM. In this embodiment, the neural network is a radial basis function (RBF) neural network. RBF networks have a strong input-output mapping capability, have the best-approximation property, and converge quickly during learning.
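The fusion in steps 3) and 4) can be sketched as follows. The patent states only that a radial basis function network is used; the choice of centers, the Gaussian width, the delta-rule training of the output weights, and the names `to_features` and `RBFNet` are all assumptions made for illustration, not the patent's implementation.

```python
import math
import random
from typing import Dict, List, Sequence


def to_features(user_dist: Dict[int, float], item_dist: Dict[int, float],
                scale: Sequence[int]) -> List[float]:
    """Concatenate the user-based and item-based probabilities into 2|S| inputs."""
    return [user_dist[s] for s in scale] + [item_dist[s] for s in scale]


class RBFNet:
    """Gaussian RBF network whose output weights are trained with the delta rule."""

    def __init__(self, n_centers: int = 10, width: float = 0.5,
                 lr: float = 0.1, epochs: int = 500, seed: int = 0) -> None:
        self.n_centers, self.width = n_centers, width
        self.lr, self.epochs, self.seed = lr, epochs, seed
        self.centers: List[List[float]] = []
        self.weights: List[float] = []

    def _hidden(self, x: Sequence[float]) -> List[float]:
        # One Gaussian activation per center, plus a constant bias input.
        acts = [math.exp(-sum((a - c) ** 2 for a, c in zip(x, center))
                         / (2.0 * self.width ** 2))
                for center in self.centers]
        return acts + [1.0]

    def predict(self, x: Sequence[float]) -> float:
        return sum(w * h for w, h in zip(self.weights, self._hidden(x)))

    def fit(self, X: List[List[float]], y: Sequence[float]) -> "RBFNet":
        rng = random.Random(self.seed)
        # Assumption: centers are a random sample of the training inputs.
        self.centers = rng.sample(X, min(self.n_centers, len(X)))
        self.weights = [0.0] * (len(self.centers) + 1)
        for _ in range(self.epochs):
            for x, target in zip(X, y):
                h = self._hidden(x)
                err = target - sum(w * a for w, a in zip(self.weights, h))
                self.weights = [w + self.lr * err * a
                                for w, a in zip(self.weights, h)]
        return self


scale = [1, 2, 3, 4, 5]
high = {1: 0.0, 2: 0.0, 3: 0.0, 4: 0.0, 5: 1.0}
low = {1: 1.0, 2: 0.0, 3: 0.0, 4: 0.0, 5: 0.0}
X = [to_features(high, high, scale), to_features(low, low, scale)]
sfm = RBFNet(n_centers=2).fit(X, [5.0, 1.0])
print(round(sfm.predict(X[0]), 2), round(sfm.predict(X[1]), 2))
```

Each training input holds the $2|S|$ probabilities from formulas (2) and (4), and the target is the rating the user actually gave. The same `predict` call would then be applied to the feature vectors of unrated items.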
Step 3: Using all items [Figure BDA0000427982550000092] in the user's rated set [Figure BDA0000427982550000091], apply the user-based collaborative filtering method and the item-based collaborative filtering method to obtain the preliminary prediction result [Figure BDA0000427982550000095] of an arbitrary item [Figure BDA0000427982550000094] in the unrated set. The prediction results [Figure BDA0000427982550000097] of all items in the unrated set [Figure BDA0000427982550000096] are denoted by the unrated-item prediction result set [Figure BDA0000427982550000098]. Use the unrated-item prediction result set [Figure BDA0000427982550000099] as the input of the score forecast model SFM, and use the SFM to output the set of final predicted values of the unrated items.
Step 4: Sort all unrated items [Figure BDA00004279825500000910] of user $U_u$ in descending order of their predicted values in the set of final predicted values to obtain a sorted set of unrated items, and recommend the top $N$ items of the sorted set to user $U_u$ as the recommendation result. $N$ is the number of recommendations and can be set according to the specific recommendation scenario.
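Step 4 is a descending sort followed by a cut at $N$. A minimal sketch, in which the function name `top_n` and the commodity identifiers are hypothetical:

```python
from typing import Dict, List


def top_n(final_predictions: Dict[str, float], n: int) -> List[str]:
    """Sort unrated commodities by final predicted value, descending; keep the first n."""
    ranked = sorted(final_predictions, key=final_predictions.get, reverse=True)
    return ranked[:n]


print(top_n({"I3": 4.2, "I4": 3.1, "I7": 4.8}, n=2))  # ['I7', 'I3']
```

Python's sort is stable, so commodities with equal predicted values keep their original order; the patent does not specify a tie-breaking rule.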
The method of the invention was validated experimentally as follows:
1) Preparing the standard data set
The invention uses the MovieLens data set as the standard data set for verifying the effectiveness of the probability fusion recommendation method; MovieLens is a widely used data set for personalized recommendation. In the MovieLens data set, users rate films they have watched on a scale of 1 to 5 points; the data set contains 943 distinct users, 1682 films and 100000 ratings. The data are split into a training set and a test set in an 80%/20% ratio: 80000 ratings are selected at random as the training set, and the remaining 20000 ratings form the test set.
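A random 80%/20% split of rating records can be sketched as below. The record layout `(user_id, item_id, rating)`, the function name `split_ratings` and the fixed seed are assumptions; the patent does not describe how its splits were generated.

```python
import random
from typing import List, Tuple

Rating = Tuple[int, int, int]  # hypothetical record: (user_id, item_id, rating)


def split_ratings(ratings: List[Rating], train_fraction: float = 0.8,
                  seed: int = 0) -> Tuple[List[Rating], List[Rating]]:
    """Shuffle a copy of the ratings and cut it into training and test sets."""
    shuffled = ratings[:]
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


toy = [(u, i, 1 + (u + i) % 5) for u in range(5) for i in range(4)]
train, test = split_ratings(toy)
print(len(train), len(test))  # 16 4
```

With 100000 ratings and the default fraction, this yields 80000 training ratings and 20000 test ratings, matching the sizes stated above.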
2) Evaluation index
The mean absolute error (MAE) is adopted as the evaluation index of the present embodiment. MAE measures prediction accuracy as the deviation between the users' actual ratings in the test set and the final predicted values of the corresponding items; the smaller the MAE, the higher the recommendation quality. Let the set of actual user ratings be {p_1, ..., p_l, ..., p_n} and the corresponding set of predicted values be {q_1, ..., q_l, ..., q_n}; the mean absolute error is then defined by formula (5):
$$ MAE = \frac{\sum_{l=1}^{n} |q_l - p_l|}{n} \tag{5} $$
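Formula (5) translates directly into a few lines of Python. The function name and the sample values below are hypothetical and are used only to show the arithmetic.

```python
from typing import Sequence


def mean_absolute_error(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Average of |q_l - p_l| over all n test ratings, as in formula (5)."""
    if len(actual) != len(predicted) or len(actual) == 0:
        raise ValueError("actual and predicted must be non-empty and of equal length")
    return sum(abs(q - p) for p, q in zip(actual, predicted)) / len(actual)


# Hypothetical actual ratings p and final predicted values q.
p = [4, 3, 5, 2]
q = [3.5, 3.0, 4.0, 3.0]
mae = mean_absolute_error(p, q)  # (0.5 + 0.0 + 1.0 + 1.0) / 4 = 0.625
```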
3) Experiments on the standard data set
To verify the effectiveness of the proposed method, modeling and prediction were carried out on 5 data set groups of the MovieLens data set, and the prediction results were compared with the true ratings. The experimental results are shown in Figure 4, where the horizontal axis is the data set number and the vertical axis is the prediction error. Compared with the user-based collaborative filtering method and the item-based collaborative filtering method, the method of the invention has a lower prediction error on every data set and therefore achieves better prediction accuracy on each of them.
To verify the robustness of the proposed method, 7 groups of verification experiments were designed by varying the number of hidden-layer neurons and the hidden-layer learning function of the RBF neural network. The experimental results are shown in Figure 5, where the horizontal axis is the experiment number and the vertical axis is the prediction error. As Figure 5 shows, changing the number of hidden-layer neurons and the hidden-layer learning function causes some variation in the prediction error of the method of the invention. However, under reasonable settings of these two parameters, its prediction error remains lower than that of both the user-based method and the item-based method, so the invention outperforms the user-based and the item-based collaborative filtering recommendation methods.

Claims (3)

1. A hybrid recommendation method based on recommendation probability fusion, characterized in that it is carried out as follows:
Step 1: a two-dimensional table T = {U, I, f} is used to represent the rating data of commodities;
In said two-dimensional table T, U = {U_1, ..., U_u, ..., U_|U|} denotes the user set, I = {I_1, ..., I_i, ..., I_|I|} denotes the commodity set, and f denotes the users' ratings of the commodities;
In said user set U, |U| is the total number of users and U_u denotes the u-th user; in said commodity set I, |I| is the total number of commodities and I_i denotes the i-th commodity; suppose the rating scale S of user U_u for commodity I_i is {S_1, ..., S_s, ..., S_|S|}; in said rating scale S, each rating S_s is an integer and S_1 < ... < S_s < ... < S_|S|, where S_1 denotes the lowest rating of a commodity and S_|S| denotes the highest rating of a commodity;
In said two-dimensional table T, the existing ratings of all users are represented by the rated set, which consists of the rated set of each user U_u; the rated set of said user U_u contains the existing ratings of user U_u, indexed by i up to the total number of existing ratings of user U_u; likewise, the unrated set of said user U_u contains the unrated items of user U_u, indexed by i up to the total number of unrated items of user U_u; any item of said user's unrated set is referred to below as an arbitrary unrated item;
Step 2: suppose an arbitrary item in the user's rated set is unknown; using the other items in the user's rated set, the user-based collaborative filtering method and the item-based collaborative filtering method are applied to obtain, respectively, the neighbor-user prediction result and the neighbor-item prediction result of said arbitrary item; said neighbor-user prediction result and said neighbor-item prediction result together serve as the rated-item prediction result of said arbitrary item; the prediction results of all rated items in the user's rated set are represented by the rated-item prediction result set;
Said rated-item prediction result set is used as the input values of a neural network and said user's rated set is used as the output values of said neural network; said neural network is trained to obtain the score forecast model SFM;
Step 3: using all items in the user's rated set, said user-based collaborative filtering method and said item-based collaborative filtering method are applied to obtain the preliminary prediction result of an arbitrary item in said user's unrated set; the prediction results of all items in said user's unrated set are represented by the unrated-item prediction result set; said unrated-item prediction result set is used as the input values of said score forecast model SFM, and said score forecast model SFM is used to obtain the set of final predicted values of the unrated items;
Step 4: all unrated items of said user U_u are sorted in descending order of their predicted values in said set of final predicted values to obtain the sorted set of unrated items, and the top N items of said sorted set of unrated items are recommended to said user U_u as the recommendation result.
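The leave-one-out construction of training data in Step 2 can be sketched as below. This is a simplified illustration, not the patented implementation: the names `build_training_pairs` and `empirical_distribution` are hypothetical, the two collaborative filtering predictors are replaced by one placeholder that only looks at the user's own remaining ratings, and the neural network training itself is omitted.

```python
from typing import Callable, Dict, List, Tuple

Ratings = Dict[str, int]                             # item id -> rating of one user
Predictor = Callable[[Ratings, str], List[float]]    # returns one probability per rating level
LEVELS = [1, 2, 3, 4, 5]                             # assumed rating scale S


def build_training_pairs(rated: Ratings,
                         user_cf: Predictor,
                         item_cf: Predictor) -> List[Tuple[List[float], int]]:
    """Hide each rated item in turn and pair its predicted probabilities with the true rating."""
    pairs = []
    for item, true_rating in rated.items():
        # Treat `item` as unknown: predict it from the user's other rated items only.
        others = {k: v for k, v in rated.items() if k != item}
        features = user_cf(others, item) + item_cf(others, item)  # network input
        pairs.append((features, true_rating))                      # network target
    return pairs


def empirical_distribution(others: Ratings, _item: str) -> List[float]:
    """Placeholder predictor (assumption): share of each rating level among the other items."""
    k = len(others)
    if k == 0:
        return [0.0] * len(LEVELS)
    return [sum(1 for r in others.values() if r == s) / k for s in LEVELS]


rated_items = {"I1": 5, "I2": 3, "I4": 5}  # hypothetical rated set of one user
pairs = build_training_pairs(rated_items, empirical_distribution, empirical_distribution)
```

Each pair holds the concatenated probability vectors as the network input and the actual rating as the target; a trained model would then be applied to the corresponding vectors of the unrated items in Step 3.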
2. The hybrid recommendation method based on recommendation probability fusion according to claim 1, characterized in that the user-based collaborative filtering method in said step 2 is carried out as follows:
1) Using the rated set of said user U_u over all commodities and the rated sets of the other users over all commodities, the constrained Pearson correlation measure is applied to obtain the user similarity set between user U_u and the other users; the other users are sorted in descending order of their similarity in said user similarity set to obtain the preliminary neighbor user set N_u;
2) The users corresponding to said commodity I_i in the rated set are denoted as the rating user set R_i;
3) The first k users of the intersection of said preliminary neighbor user set N_u and the rating user set R_i are denoted as the neighbor user set N_ui;
4) The ratings of commodity I_i by each user in said neighbor user set N_ui are denoted as the neighbor user rating set F_ui;
5) Formula (1) is used to obtain the user-based rating probability Pr_ui^U(S_s):
$$ Pr_{ui}^{U}(S_s) = Num^{U}(S_s) / k \tag{1} $$
In formula (1), Num^U(S_s) is the number of times rating S_s occurs in said neighbor user rating set F_ui;
6) Said ratings S_s and said user-based rating probabilities Pr_ui^U(S_s) form the neighbor-user prediction result Pr_ui^U:
$$ Pr_{ui}^{U} = \{(S_1, Pr_{ui}^{U}(S_1)), \ldots, (S_s, Pr_{ui}^{U}(S_s)), \ldots, (S_{|S|}, Pr_{ui}^{U}(S_{|S|}))\} \tag{2} $$
In formula (2), $Pr_{ui}^{U}(S_s) \in [0,1]$.
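Formulas (1) and (2) amount to counting how often each rating level occurs among the k neighbor ratings. A minimal Python sketch follows; the function name and the sample neighbor ratings are hypothetical, and the sketch assumes that the list passed in already contains exactly the k ratings of the neighbor set.

```python
from collections import Counter
from typing import List, Sequence, Tuple


def rating_probabilities(neighbor_ratings: Sequence[int],
                         levels: Sequence[int]) -> List[Tuple[int, float]]:
    """Pair every rating level S_s with Num(S_s) / k, as in formulas (1) and (2)."""
    k = len(neighbor_ratings)
    if k == 0:
        raise ValueError("at least one neighbor rating is required")
    counts = Counter(neighbor_ratings)  # Num(S_s) for every level that occurs
    return [(s, counts[s] / k) for s in levels]


# Hypothetical neighbor user rating set F_ui with k = 5 on a 1-5 scale.
F_ui = [5, 4, 5, 3, 5]
Pr_ui_U = rating_probabilities(F_ui, [1, 2, 3, 4, 5])
```

The same function applies unchanged to the neighbor commodity rating set of the item-based method, since formulas (3) and (4) have the same form.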
3. The hybrid recommendation method based on recommendation probability fusion according to claim 1, characterized in that said item-based collaborative filtering method is carried out as follows:
1) Using the rated set of all users for said commodity I_i and the rated sets of all users for the other commodities, the constrained Pearson correlation measure is applied to obtain the commodity similarity set between said commodity I_i and the other commodities; the other commodities are sorted in descending order of their similarity in said commodity similarity set to obtain the preliminary neighbor commodity set N_i;
2) The commodities corresponding to user U_u in the rated set are denoted as the rated commodity set R_u;
3) The first k commodities of the intersection of said preliminary neighbor commodity set N_i and the rated commodity set R_u are denoted as the neighbor commodity set N_iu;
4) The ratings by user U_u of each commodity in said neighbor commodity set N_iu are denoted as the neighbor commodity rating set F_iu;
5) Formula (3) is used to obtain the item-based rating probability Pr_ui^I(S_s):
$$ Pr_{ui}^{I}(S_s) = Num^{I}(S_s) / k \tag{3} $$
In formula (3), Num^I(S_s) is the number of times rating S_s occurs in said neighbor commodity rating set F_iu;
6) Said ratings S_s and said item-based rating probabilities Pr_ui^I(S_s) form the neighbor-item prediction result Pr_ui^I:
$$ Pr_{ui}^{I} = \{(S_1, Pr_{ui}^{I}(S_1)), \ldots, (S_s, Pr_{ui}^{I}(S_s)), \ldots, (S_{|S|}, Pr_{ui}^{I}(S_{|S|}))\} \tag{4} $$
In formula (4), $Pr_{ui}^{I}(S_s) \in [0,1]$.
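Steps 1) to 3) of the item-based method can be sketched as follows. The claims name the constrained Pearson correlation measure without giving its formula, so the sketch assumes the commonly used form in which deviations are taken from the midpoint of the rating scale (3 on a 1-5 scale) rather than from the mean. All function names and sample ratings are hypothetical.

```python
import math
from typing import Dict, List, Set

ItemRatings = Dict[str, int]  # user id -> rating of one commodity


def constrained_pearson(a: ItemRatings, b: ItemRatings, midpoint: float = 3.0) -> float:
    """Correlation over co-rating users, measured around the scale midpoint (assumed form)."""
    common = a.keys() & b.keys()
    num = sum((a[x] - midpoint) * (b[x] - midpoint) for x in common)
    den = math.sqrt(sum((a[x] - midpoint) ** 2 for x in common)
                    * sum((b[x] - midpoint) ** 2 for x in common))
    return num / den if den else 0.0


def top_k_neighbors(target: ItemRatings, others: Dict[str, ItemRatings],
                    rated_by_user: Set[str], k: int) -> List[str]:
    """First k commodities, by descending similarity, that the user has also rated."""
    sims = {name: constrained_pearson(target, r)
            for name, r in others.items() if name in rated_by_user}
    return sorted(sims, key=lambda name: (-sims[name], name))[:k]


# Hypothetical ratings of four commodities by three users.
I_i = {"u1": 5, "u2": 1, "u3": 4}
others = {
    "I2": {"u1": 4, "u2": 2, "u3": 5},
    "I3": {"u1": 1, "u2": 5, "u3": 2},
    "I4": {"u1": 5, "u2": 1, "u3": 4},
}
R_u = {"I2", "I3"}  # commodities already rated by the target user
N_iu = top_k_neighbors(I_i, others, R_u, k=1)
```

Filtering by the rated commodity set before sorting gives the same result as sorting all commodities first and then taking the first k of the intersection, which is the order described in the claim.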
CN201310637512.4A 2013-12-02 2013-12-02 Hybrid recommendation method based on recommendation probability fusion Active CN103632290B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201310637512.4A CN103632290B (en) 2013-12-02 2013-12-02 Hybrid recommendation method based on recommendation probability fusion

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201310637512.4A CN103632290B (en) 2013-12-02 2013-12-02 Hybrid recommendation method based on recommendation probability fusion

Publications (2)

Publication Number Publication Date
CN103632290A true CN103632290A (en) 2014-03-12
CN103632290B CN103632290B (en) 2016-06-29

Family

ID=50213310

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201310637512.4A Active CN103632290B (en) 2013-12-02 2013-12-02 Hybrid recommendation method based on recommendation probability fusion

Country Status (1)

Country Link
CN (1) CN103632290B (en)

Also Published As

Publication number Publication date
CN103632290B (en) 2016-06-29

Legal Events

Date Code Title Description
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant