CN106202151A - Method for improving the diversity of a personalized recommendation system

Info

Publication number
CN106202151A
Authority
CN
China
Prior art keywords
user
article
score
rank
score value
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201610463223.0A
Other languages
Chinese (zh)
Inventor
李方敏
栾悉道
龙妍
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Changsha University
Original Assignee
Changsha University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Changsha University
Priority to CN201610463223.0A
Publication of CN106202151A
Legal status: Pending

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90: Details of database functions independent of the retrieved data types
    • G06F16/95: Retrieval from the web
    • G06F16/953: Querying, e.g. by the use of web search engines
    • G06F16/9535: Search customisation based on user profiles and personalisation

Abstract

The invention discloses a method for improving the diversity of a personalized recommendation system. Specifically, a user rating data set is first obtained; three conventional recommendation algorithms are then applied to the rating data set; the recommendation results are next subjected to threshold control and re-ranking; and the final result is presented to the user. Re-ranking the recommendation results changes the content of the final recommendation list and thereby improves the diversity of the system, while the threshold control applied during ranking keeps the recommended items, as far as possible, among those the user is predicted to like, which preserves the prediction accuracy of the recommendation system. Experimental results show that, compared with existing personalized recommendation methods, the proposed method substantially increases system diversity at the cost of only a small reduction in prediction accuracy, and thus allows the balance between accuracy and diversity to be controlled.

Description

Method for improving the diversity of a personalized recommendation system
Technical field
The invention belongs to the fields of the Internet, the mobile Internet and computer networks, and more particularly relates to a method for improving the diversity of a personalized recommendation system.
Background art
Internet retailing exhibits a long-tail effect: 80 percent of a merchant's sales come from 20 percent of its goods. If sales of a retailer's long-tail items could be raised, the retailer's turnover would grow many times over and users' individual preferences could be met. For this reason, personalized recommendation systems that combine computer network technology with big-data processing have in recent years attracted great attention in e-commerce and have been widely applied. A personalized recommendation system is an information filtering system based on a user's personal interests: it analyzes and mines information and data about the user to discover the user's interests and implicit needs, and then makes recommendations accordingly. According to long-tail theory, if the services or goods a merchant provides meet users' individual needs well, users' satisfaction with and trust in the merchant rise, which brings the merchant substantial profit. The personalized recommendation system is an important means for merchants to meet users' individual needs, and also an important means of addressing the long-tail phenomenon of Internet retailers.
At present, the most widely used methods for personalized recommendation systems are user-based collaborative filtering and item-based collaborative filtering.
In user-based collaborative filtering, users who have interacted with some of the same items are considered to belong to the same neighborhood. The underlying assumption, supported by statistical data, is that users who had similar preferences in the past will continue to have similar preferences in the future. If a user buys or rates a new item, that item is recommended to the user's neighbor users. The method is mainly concerned with how to make recommendations for large numbers of users and how to compute the similarity between users. Its drawback is that it does not consider the diversity of the recommendation system, so the long-tail items of Internet retailers remain unsold.
Item-based collaborative filtering is essentially the same as the user-based method described above, except that it computes the similarity between items rather than between users. This algorithm likewise fails to consider the diversity of the recommendation system, so the long-tail items of Internet retailers again remain unsold.
Summary of the invention
In view of the above defects of, or needs for improvement in, the prior art, the invention provides a method for improving the diversity of a personalized recommendation system. Its purpose is to solve the technical problem that existing methods do not consider diversity in personalized recommendation, which leaves the long-tail items of Internet retailers unsold.
To achieve the above object, according to one aspect of the invention, a method for improving the diversity of a personalized recommendation system is provided, comprising the following steps:
(1) obtaining a user rating data set from a website and storing it as text, the data set containing user IDs, the item IDs corresponding to each user ID, and the user's rating of each such item;
(2) applying a recommendation algorithm for personalized recommendation systems to the user rating data set obtained in step (1) to perform prediction and recommendation, thereby generating a corresponding recommendation list for each of the users in the data set, each list containing the user ID, the item IDs corresponding to that user ID, and the user's predicted rating of each such item;
(3) computing, from the user rating data set, the popularity of each item, the popularity being determined by the number of users who have rated the item and by the ratings those users gave; comparing each user's predicted rating of an item with a control threshold; and sorting by popularity the items whose predicted rating is greater than or equal to the control threshold, to obtain the final ranking;
(4) taking the top several results of the ranking and returning them to the user as the recommendation list.
Preferably, step (2) comprises the following sub-steps:
(2-1) computing the similarity between all users from the obtained user rating data set:
$$\mathrm{sim}(a,b) = \frac{\sum_{p \in P} R(a,p)\,R(b,p)}{\sqrt{\sum_{p \in P} R(a,p)^2}\,\sqrt{\sum_{p \in P} R(b,p)^2}}$$
where $\mathrm{sim}(a,b)$ denotes the similarity between users a and b, P denotes the set of all items, p denotes an item in P, and $R(a,p)$ and $R(b,p)$ denote the ratings of item p by user a and user b respectively;
(2-2) for each user, selecting the K users most similar to that user as the user's neighbor users, where K is an integer between 50 and 300;
(2-3) for each user, analyzing the ratings of the items rated by the user's K neighbor users in order to predict the items the user is most likely to rate highly, and recommending those items to the user.
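As a minimal sketch of the user-to-user similarity in sub-step (2-1), assuming it is the standard cosine similarity and that an item a user has not rated counts as 0, the computation might look like the following Python. The function name `cosine_similarity` and the dictionary layout are illustrative choices and do not come from the original text.

```python
import math


def cosine_similarity(ratings_a, ratings_b):
    """Cosine similarity between two users' rating dicts {item_id: rating}.

    Assumption: unrated items count as 0, so only co-rated items
    contribute to the numerator, while each norm runs over all of a
    user's own ratings.
    """
    common = set(ratings_a) & set(ratings_b)
    numerator = sum(ratings_a[p] * ratings_b[p] for p in common)
    norm_a = math.sqrt(sum(r * r for r in ratings_a.values()))
    norm_b = math.sqrt(sum(r * r for r in ratings_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # a user with no ratings is similar to nobody
    return numerator / (norm_a * norm_b)


# Hypothetical toy data: two users with identical tastes, one with none in common.
alice = {"item1": 5, "item2": 3}
bob = {"item1": 5, "item2": 3}
carol = {"item3": 4}
```

Identical rating vectors give a similarity close to 1, and users with no co-rated items give 0.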
Preferably, step (2-3) uses the following formula:
$$R^*(u,i) = \overline{R(u)} + k \sum_{v \in N(u)} \left(R(v,i) - \overline{R(v)}\right) \times \mathrm{sim}(u,v)$$
where
$$k = \frac{1}{\sum_{v \in N(u)} \left|\mathrm{sim}(u,v)\right|}$$
Substituting k into the formula above gives:
$$R^*(u,i) = \overline{R(u)} + \frac{\sum_{v \in N(u)} \left(R(v,i) - \overline{R(v)}\right) \times \mathrm{sim}(u,v)}{\sum_{v \in N(u)} \left|\mathrm{sim}(u,v)\right|}$$
where $R^*(u,i)$ denotes the predicted rating of item i by user u, $\overline{R(u)}$ is the average of the ratings user u has given to all of the items the user has rated, k is a normalization factor, and $N(u)$ denotes the set of all neighbor users of user u.
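The prediction formula can be illustrated with a short Python sketch. It is hedged in two ways: the names `predict_rating`, `ratings`, `neighbors` and `sim` are hypothetical, and the sum is restricted to neighbors who have actually rated the item, which is an assumption the formula itself does not spell out.

```python
def predict_rating(user, item, ratings, neighbors, sim):
    """Mean-centred, similarity-weighted prediction R*(u, i).

    ratings:   {user_id: {item_id: rating}}
    neighbors: list of neighbour user ids, i.e. N(u)
    sim:       {(u, v): similarity between u and v}
    """
    def mean(u):
        values = ratings[u].values()
        return sum(values) / len(values)

    # Assumption: only neighbours who rated the item can contribute.
    raters = [v for v in neighbors if item in ratings[v]]
    denom = sum(abs(sim[(user, v)]) for v in raters)
    if denom == 0:
        return mean(user)  # no usable neighbour: fall back to the user's mean
    num = sum((ratings[v][item] - mean(v)) * sim[(user, v)] for v in raters)
    return mean(user) + num / denom


# Hypothetical toy data: user "u" has mean 3.0, neighbour "v" has mean 4.5.
ratings = {"u": {"a": 4, "b": 2}, "v": {"a": 4, "c": 5}}
sim = {("u", "v"): 0.8}
prediction = predict_rating("u", "c", ratings, ["v"], sim)
```

Here the neighbor rated item "c" half a point above its own mean, so the prediction for "u" is half a point above the mean of "u".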
Preferably, step (3) comprises the following sub-steps:
(3-1) computing the popularity of each item from the user rating data set, according to the number of users in the data set who have rated the item and the ratings they gave;
(3-2) comparing each user's predicted rating of an item with the control threshold, and sorting by popularity the items whose predicted rating is greater than or equal to the control threshold, to obtain the final ranking.
Preferably, the popularity of an item in step (3-1) may be obtained as
$$\mathrm{rank}_{Popularity}(i) = |U(i)|, \qquad U(i) = \{u \in U \mid \exists\, R(u,i)\}$$
where $\mathrm{rank}_{Popularity}(i)$ denotes ranking by item popularity, and $|U(i)|$ is the number of users u in the set of all users U for whom a rating $R(u,i)$ of item i exists.
Preferably, the popularity of an item in step (3-1) may be obtained as
$$\mathrm{rank}_{ReversePrediction}(i) = R^*(u,i)$$
where $\mathrm{rank}_{ReversePrediction}(i)$ denotes ranking by predicted rating in reverse order; the predicted rating can serve as an indication of the item's popularity.
Preferably, the popularity of an item in step (3-1) may be obtained as
$$\mathrm{rank}_{AverageRating}(i) = \overline{R(i)}$$
where
$$\overline{R(i)} = \frac{1}{|U(i)|} \sum_{u \in U(i)} R(u,i)$$
and $\mathrm{rank}_{AverageRating}(i)$ denotes ranking by the item's average rating.
Preferably, the popularity of an item in step (3-1) may be obtained as
$$\mathrm{rank}_{AbsoluteLikeability}(i) = |U_H(i)|$$
where
$$U_H(i) = \{u \in U(i) \mid R(u,i) \ge T_H\}$$
$\mathrm{rank}_{AbsoluteLikeability}(i)$ denotes the absolute likeability of the item, and $|U_H(i)|$ is the number of users u whose rating of item i is greater than or equal to the threshold $T_H$.
Preferably, the popularity of an item in step (3-1) may be obtained as
$$\mathrm{rank}_{RelativeLikeability}(i) = \frac{|U_H(i)|}{|U(i)|}$$
where $\mathrm{rank}_{RelativeLikeability}(i)$ denotes the relative likeability of the item.
Preferably, the popularity of an item in step (3-1) may be obtained as
$$\mathrm{rank}_{RatingVariance}(i) = \frac{1}{|U(i)|} \sum_{u \in U(i)} \left(R(u,i) - \overline{R(i)}\right)^2$$
where $\mathrm{rank}_{RatingVariance}(i)$ denotes the degree to which users' ratings of the item deviate from the item's average rating.
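Five of the six criteria above depend only on the observed ratings, so they can be computed in a single pass over the data. The following Python is a sketch under assumptions: the function name `item_statistics`, the triple layout and the dictionary keys are hypothetical, and the default threshold of 3.5 is the value the embodiment gives for a 5-point scale. The reverse-prediction criterion is omitted because it needs predicted rather than observed ratings.

```python
from collections import defaultdict

T_H = 3.5  # "liked" threshold from the embodiment (5-point scale)


def item_statistics(ratings, t_h=T_H):
    """Per-item ranking criteria from (user_id, item_id, rating) triples."""
    by_item = defaultdict(list)
    for _user, item, rating in ratings:
        by_item[item].append(rating)

    stats = {}
    for item, values in by_item.items():
        n = len(values)                              # |U(i)|
        mean = sum(values) / n                       # average rating of item i
        liked = sum(1 for r in values if r >= t_h)   # |U_H(i)|
        variance = sum((r - mean) ** 2 for r in values) / n
        stats[item] = {
            "popularity": n,
            "average_rating": mean,
            "absolute_likeability": liked,
            "relative_likeability": liked / n,
            "rating_variance": variance,
        }
    return stats


# Hypothetical toy data: item "x" has three ratings, item "y" has one.
sample = [("u1", "x", 5), ("u2", "x", 3), ("u3", "x", 4), ("u1", "y", 2)]
stats = item_statistics(sample)
```

Each value in `stats` can then be used as the ranking key $\mathrm{rank}_x(i)$ for the corresponding criterion.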
Preferably, step (3-2) uses the following formula:
$$\mathrm{rank}_x(i, T_R) = \begin{cases} \mathrm{rank}_x(i), & R^*(u,i) \in [T_R, T_{max}] \\ \mathrm{rank}_{Standard}(i), & R^*(u,i) \in [T_H, T_R) \end{cases}$$
where $\mathrm{rank}_x(i, T_R)$ denotes the function that ranks item i using the control threshold $T_R$, $\mathrm{rank}_x(i)$ denotes the item popularity described above, $T_{max}$ denotes the upper limit of the rating scale (for example, 5 in a 5-point rating system), the control threshold satisfies $T_R \in [T_H, T_{max}]$, and $\mathrm{rank}_{Standard}(i)$ is the existing standard ranking method:
$$\mathrm{rank}_{Standard}(i) = R^*(u,i)^{-1}$$
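The piecewise ranking function can be sketched in Python as follows. This is an illustrative reading rather than a definitive implementation: it assumes that a lower rank value means an earlier position, that items at or above $T_R$ are placed ahead of items in $[T_H, T_R)$, and that items predicted below $T_H$ are not recommended. The names `rerank`, `predictions` and `rank_x`, and the default `top_n=5`, are hypothetical.

```python
def rerank(predictions, rank_x, t_r, t_h=3.5, top_n=5):
    """Threshold-controlled re-ranking of one user's candidate items.

    predictions: {item_id: predicted rating R*(u, i)}
    rank_x:      {item_id: value of the chosen criterion}; lower comes first
    t_r:         control threshold, expected to lie in [t_h, T_max]
    """
    head = [i for i, r in predictions.items() if r >= t_r]
    tail = [i for i, r in predictions.items() if t_h <= r < t_r]
    head.sort(key=lambda i: rank_x[i])             # e.g. least popular first
    tail.sort(key=lambda i: 1.0 / predictions[i])  # standard: highest prediction first
    return (head + tail)[:top_n]


# Hypothetical toy data: predicted ratings and popularity counts for one user.
predictions = {"a": 4.9, "b": 4.6, "c": 4.2, "d": 3.8, "e": 3.0}
popularity = {"a": 900, "b": 40, "c": 300, "d": 10, "e": 5}

balanced = rerank(predictions, popularity, t_r=4.5, top_n=3)
most_diverse = rerank(predictions, popularity, t_r=3.5, top_n=3)
standard = rerank(predictions, popularity, t_r=5.0, top_n=3)
```

With the threshold at its upper limit the result matches the standard ranking by predicted rating, and lowering it toward $T_H$ moves less popular items to the front.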
According to another aspect of the invention, a system for improving the diversity of a personalized recommendation system is provided, comprising:
a first module for obtaining a user rating data set from a website and storing it as text, the data set containing user IDs, the item IDs corresponding to each user ID, and the user's rating of each such item;
a second module for applying a recommendation algorithm for personalized recommendation systems to the user rating data set obtained by the first module to perform prediction and recommendation, thereby generating a corresponding recommendation list for each of the users in the data set, each list containing the user ID, the item IDs corresponding to that user ID, and the user's predicted rating of each such item;
a third module for computing, from the user rating data set, the popularity of each item, the popularity being determined by the number of users who have rated the item and by the ratings those users gave, for comparing each user's predicted rating of an item with a control threshold, and for sorting by popularity the items whose predicted rating is greater than or equal to the control threshold, to obtain the final ranking;
a fourth module for taking the top several results of the ranking and returning them to the user as the recommendation list.
In general, compared with the prior art, the above technical solutions conceived by the invention achieve the following beneficial effects:
1. Compared with the methods used in conventional personalized recommendation systems, the invention is designed specifically to improve diversity in personalized recommendation and achieves a marked improvement on the diversity metric, thereby alleviating the problem of unsold items caused by the long-tail effect of Internet retailers.
2. The invention is flexible to use: it can be conveniently added to an existing personalized recommendation system without changing that system's core architecture, and imposes no major change or burden on the original system.
3. Step (3) of the method safeguards both the accuracy and the diversity of the personalized recommendation system.
4. Through step (3), the overall diversity of the recommendation results can be increased substantially while sacrificing only a little recommendation accuracy.
Brief description of the drawings
Fig. 1 is a diagram of the processing structure of a personalized recommendation system to which the invention applies.
Fig. 2 is a schematic flowchart of recommendation in the user-based recommendation system of the invention.
Fig. 3 is a schematic diagram of neighbor-user selection in the invention.
Fig. 4 is a schematic diagram of the general idea of the threshold-based ranking algorithm of the invention.
Fig. 5 shows the experimental design of the invention.
Fig. 6 shows the effect of the ranking algorithms on accuracy and diversity, where (a) uses the popularity-based method, (b) the average-rating method, (c) the absolute-likeability method, (d) the relative-likeability method, (e) the rating-variance method, and (f) the reverse-predicted-rating method.
Fig. 7 shows the relationship between accuracy loss and diversity.
Fig. 8 shows the proportion of long-tail items among the recommended items.
Fig. 9 is a flowchart of the method of the invention for improving the diversity of a personalized recommendation system.
Detailed description of the embodiments
To make the objects, technical solutions and advantages of the invention clearer, the invention is described in further detail below with reference to the drawings and embodiments. It should be understood that the specific embodiments described here serve only to explain the invention and are not intended to limit it. In addition, the technical features involved in the embodiments described below may be combined with one another provided they do not conflict.
The overall idea of the invention is as follows: a user rating data set is first obtained; commonly used recommendation algorithms are then applied to the rating data set; the recommendation results are next subjected to threshold control and re-ranking; and the final result is presented to the user. Re-ranking the recommendation results changes the content of the final recommendation list and thereby improves the diversity of the system, while the threshold control applied during ranking keeps the recommended items, as far as possible, among those the user is predicted to like, which preserves the prediction accuracy of the recommendation system.
A typical personalized recommendation system has the processing structure shown in Fig. 1. Throughout the system, let U (users) be the set of all users participating in the personalized recommendation system and I (items) the set of all items in the system (for example books, films or music). The relation between each user u ∈ U and item i ∈ I can then be written $R(u,i)$, and the relation between $R(u,i)$ and U, I is:
$$R: U \times I \rightarrow \mathrm{Rating}$$
In practical terms, R is a rating that indicates how much user u prefers item i. A recommendation system comprises two steps: prediction and recommendation. The first step, prediction, uses the known rating data to predict the ratings users would give to items they have not yet rated.
As shown in Fig. 9, the method of the invention for improving the diversity of a personalized recommendation system comprises the following steps:
(1) obtaining a user rating data set from a website and storing it as text, the data set containing user IDs, the item IDs corresponding to each user ID, and the user's rating of each such item; specifically, the invention obtains the user rating data set with a web crawler;
(2) applying a recommendation algorithm for personalized recommendation systems to the user rating data set obtained in step (1) to perform prediction and recommendation, thereby generating a corresponding recommendation list for each of the users in the data set, each list containing the user ID, the item IDs corresponding to that user ID, and the user's predicted rating of each such item;
(3) computing, from the user rating data set, the popularity of each item, the popularity being determined by the number of users who have rated the item and by the ratings those users gave; comparing each user's predicted rating of an item with a control threshold; and sorting by popularity the items whose predicted rating is greater than or equal to the control threshold, to obtain the final ranking;
(4) taking the top several results of the ranking and returning them to the user as the recommendation list; in the invention, the top 5 results of the ranking are returned to the user as the recommendation list.
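Step (1) only states that the rating data is stored as text with a user ID, an item ID and a rating. As a minimal sketch, assuming one tab-separated record per line (the layout and the function name `load_ratings` are assumptions made for illustration), the data could be read like this:

```python
import io


def load_ratings(lines):
    """Parse 'user_id<TAB>item_id<TAB>rating' lines into a list of triples.

    The tab-separated layout is an assumption; the source only says the
    data set is stored as text.
    """
    ratings = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        user_id, item_id, rating = line.split("\t")[:3]
        ratings.append((user_id, item_id, float(rating)))
    return ratings


# Hypothetical sample standing in for a crawled ratings file.
sample_file = io.StringIO("u1\ti1\t4.0\nu1\ti2\t2.5\n\nu2\ti1\t5\n")
loaded = load_ratings(sample_file)
```

Any iterable of lines works as input, so the same function accepts an open file object.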
As shown in Fig. 2, step (2) of the method comprises the following sub-steps:
(2-1) computing the similarity between all users from the obtained user rating data set;
Specifically, this step uses the following formula:
$$\mathrm{sim}(a,b) = \frac{\sum_{p \in P} R(a,p)\,R(b,p)}{\sqrt{\sum_{p \in P} R(a,p)^2}\,\sqrt{\sum_{p \in P} R(b,p)^2}}$$
where $\mathrm{sim}(a,b)$ denotes the similarity between users a and b, P denotes the set of all items, p denotes an item in P, and $R(a,p)$ and $R(b,p)$ denote the ratings of item p by user a and user b respectively;
(2-2) for each user, selecting the K users most similar to that user as the user's neighbor users, as shown in Fig. 3; in this embodiment, K is an integer between 50 and 300;
(2-3) for each user, analyzing the ratings of the items rated by the user's K neighbor users in order to predict the items the user is most likely to rate highly, and recommending those items to the user;
Specifically, this step uses the following formula:
$$R^*(u,i) = \overline{R(u)} + k \sum_{v \in N(u)} \left(R(v,i) - \overline{R(v)}\right) \times \mathrm{sim}(u,v)$$
where $k = 1 / \sum_{v \in N(u)} \left|\mathrm{sim}(u,v)\right|$. Substituting k into the formula above gives:
$$R^*(u,i) = \overline{R(u)} + \frac{\sum_{v \in N(u)} \left(R(v,i) - \overline{R(v)}\right) \times \mathrm{sim}(u,v)}{\sum_{v \in N(u)} \left|\mathrm{sim}(u,v)\right|}$$
where $R^*(u,i)$ denotes the predicted rating of item i by user u, $\overline{R(u)}$ is the average of the ratings user u has given to all of the items the user has rated, k is a normalization factor, and $N(u)$ denotes the set of all neighbor users of user u;
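Neighbor selection in sub-step (2-2) amounts to a top-K query over a user's similarity scores. The sketch below uses `heapq.nlargest` from the Python standard library; the function name `top_k_neighbors` and the toy value of K are illustrative only, since the embodiment uses K between 50 and 300.

```python
import heapq


def top_k_neighbors(user, similarities, k):
    """Return the k users most similar to `user`, most similar first.

    similarities: {other_user_id: similarity to `user`}
    """
    # Exclude the user itself, which would otherwise always rank first.
    others = [(v, s) for v, s in similarities.items() if v != user]
    return [v for v, _ in heapq.nlargest(k, others, key=lambda vs: vs[1])]


# Hypothetical similarity scores of user "a" to every user, itself included.
sims_for_a = {"a": 1.0, "b": 0.9, "c": 0.2, "d": 0.7}
neighbors_of_a = top_k_neighbors("a", sims_for_a, k=2)
```

Using a heap avoids sorting the full user list when K is much smaller than the number of users.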
Step (3) of the method comprises the following sub-steps:
(3-1) computing the popularity of each item from the user rating data set, according to the number of users in the data set who have rated the item and the ratings they gave;
Specifically, the invention may compute the popularity of an item by any of the following six methods:
First,
$$\mathrm{rank}_{Popularity}(i) = |U(i)|, \qquad U(i) = \{u \in U \mid \exists\, R(u,i)\}$$
where $\mathrm{rank}_{Popularity}(i)$ denotes ranking by item popularity, and $|U(i)|$ is the number of users u in the set of all users U for whom a rating $R(u,i)$ of item i exists.
Second,
$$\mathrm{rank}_{ReversePrediction}(i) = R^*(u,i)$$
where $\mathrm{rank}_{ReversePrediction}(i)$ denotes ranking by predicted rating in reverse order; the predicted rating can serve as an indication of the item's popularity.
Third,
$$\mathrm{rank}_{AverageRating}(i) = \overline{R(i)}$$
where
$$\overline{R(i)} = \frac{1}{|U(i)|} \sum_{u \in U(i)} R(u,i)$$
and $\mathrm{rank}_{AverageRating}(i)$ denotes ranking by the item's average rating.
Fourth,
$$\mathrm{rank}_{AbsoluteLikeability}(i) = |U_H(i)|$$
where
$$U_H(i) = \{u \in U(i) \mid R(u,i) \ge T_H\}$$
$\mathrm{rank}_{AbsoluteLikeability}(i)$ denotes the absolute likeability of the item, and $|U_H(i)|$ is the number of users u whose rating of item i is greater than or equal to the threshold $T_H$. In this embodiment the threshold $T_H$ may be set freely; a smaller value corresponds to stricter user rating standards and a larger value to more lenient ones. In the invention its value is 3.5 (on a standard 5-point scale).
Fifth,
$$\mathrm{rank}_{RelativeLikeability}(i) = \frac{|U_H(i)|}{|U(i)|}$$
where $\mathrm{rank}_{RelativeLikeability}(i)$ denotes the relative likeability of the item.
Sixth,
$$\mathrm{rank}_{RatingVariance}(i) = \frac{1}{|U(i)|} \sum_{u \in U(i)} \left(R(u,i) - \overline{R(i)}\right)^2$$
where $\mathrm{rank}_{RatingVariance}(i)$ denotes the degree to which users' ratings of the item deviate from the item's average rating.
(3-2) comparing each user's predicted rating of an item with the control threshold, and sorting by popularity the items whose predicted rating is greater than or equal to the control threshold, to obtain the final ranking.
This step uses the following formula:
$$\mathrm{rank}_x(i, T_R) = \begin{cases} \mathrm{rank}_x(i), & R^*(u,i) \in [T_R, T_{max}] \\ \mathrm{rank}_{Standard}(i), & R^*(u,i) \in [T_H, T_R) \end{cases}$$
where $\mathrm{rank}_x(i, T_R)$ denotes the function that ranks item i using the control threshold $T_R$, $\mathrm{rank}_x(i)$ denotes the item popularity obtained in step (3-1), $T_{max}$ denotes the upper limit of the rating scale (for example, 5 in a 5-point rating system), the control threshold satisfies $T_R \in [T_H, T_{max}]$, and $\mathrm{rank}_{Standard}(i)$ is the existing standard ranking method:
$$\mathrm{rank}_{Standard}(i) = R^*(u,i)^{-1}$$
Once the control threshold $T_R$ is introduced, items whose predicted rating is at or above $T_R$ are sorted according to the original $\mathrm{rank}_x(i)$, while items whose predicted rating is below $T_R$ are sorted using $\mathrm{rank}_{Standard}(i)$. In addition, all items with a predicted rating at or above $T_R$ are placed ahead of all items with a predicted rating below $T_R$. Consequently, increasing $T_R$ moves the result closer to the standard ranking, so the selected items have higher accuracy and lower diversity; decreasing $T_R$ moves $\mathrm{rank}_x(i, T_R)$ closer to $\mathrm{rank}_x(i)$, which raises the diversity of the system at some cost in accuracy. The balance between accuracy and diversity can therefore be set by choosing different control thresholds $T_R$. The general idea of the threshold-based ranking algorithm is shown in Fig. 4.
As shown in Fig. 4(a), the standard ranking method sorts the candidate items directly by predicted rating: the higher the predicted rating, the higher the item ranks. The 5 items with the highest predicted ratings are then recommended to the user, and, to ensure that every recommended item matches the user's preferences, only items rated above $T_H$ are selected; the overall recommendation quality of the system is shown in the histogram at the side. As shown in Fig. 4(b), the ranking function $\mathrm{rank}_x(i)$ is applied, here the popularity-based ranking method, which yields a new recommendation list in which items of lower popularity but with predicted ratings above $T_H$ are recommended to the user. In this list the user sees some niche items, which lie in the long tail of the whole recommendation system. Although their popularity is low, the user is predicted to like them (predicted rating above $T_H$), which indicates that they are relevant to the user. Using this ranking method improves the diversity of the system but also reduces its accuracy; the overall recommendation quality is again shown in the histogram at the side. As shown in Fig. 4(c), adjusting the control threshold $T_R$ selects different items to recommend to the user, with the aim of improving the diversity of the system while reducing accuracy as little as possible.
To allow a fair and reasonable performance assessment, several quantitative evaluation metrics for personalized recommendation systems are defined below.
(1) Accuracy
In the rating data, ratings take values from 1 to 5, and a higher value means the user prefers the item more. Following the usual definition for recommendation systems, items rated 3.5 or higher (the threshold for highly rated items, denoted $T_H$) are treated as "highly ranked" items, and items rated below 3.5 as "not highly ranked" items. In addition, because users of a real recommendation system usually pay attention to only a few of the most relevant recommendations, a recommendation system typically presents the N top-ranked items. The N items recommended to user u are denoted:
$$L_N(u) = \{i_1, \ldots, i_N\}$$
where
$$R^*(u, i_k) \ge T_H, \quad k \in \{1, 2, \ldots, N\}$$
The accuracy of the recommendation system is therefore assessed as the proportion of recommended items that are truly highly ranked. The truly highly ranked items among those recommended to user u are denoted $\mathrm{correct}(L_N(u))$, and the resulting metric over the N most relevant recommended items is precision-in-top-N (the accuracy of the top N recommended items):
$$\text{precision-in-top-}N = \frac{\sum_{u \in U} \left|\mathrm{correct}(L_N(u))\right|}{\sum_{u \in U} \left|L_N(u)\right|}$$
where
$$\mathrm{correct}(L_N(u)) = \{i \in L_N(u) \mid R(u,i) \ge T_H\}$$
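A minimal Python sketch of precision-in-top-N follows. The names `precision_in_top_n`, `recommendations` and `actual` are hypothetical, and treating a recommended item with no known actual rating as incorrect is an assumption rather than something the text specifies.

```python
def precision_in_top_n(recommendations, actual, t_h=3.5):
    """Share of recommended items whose actual rating is at least t_h.

    recommendations: {user_id: [item_id, ...]}   i.e. L_N(u)
    actual:          {user_id: {item_id: rating}} held-out true ratings
    """
    correct = 0
    total = 0
    for user, items in recommendations.items():
        total += len(items)
        user_ratings = actual.get(user, {})
        # Assumption: an item with no known rating counts as incorrect.
        correct += sum(1 for i in items if user_ratings.get(i, 0) >= t_h)
    return correct / total if total else 0.0


# Hypothetical toy data: three of the four recommendations are rated >= 3.5.
recs = {"u1": ["a", "b"], "u2": ["a", "c"]}
truth = {"u1": {"a": 5, "b": 2}, "u2": {"a": 4, "c": 4.5}}
precision = precision_in_top_n(recs, truth)
```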
However, accuracy alone does not reliably surface the recommendations a user needs. A recommendation system must be not only accurate but also useful in practice. Another evaluation metric for personalized recommendation systems, the diversity of the recommendation system, is introduced next.
(2) Diversity
Diversity is used to assess the ability of a recommendation system to surface long-tail items. It can be assessed in different ways; here it is measured as the number of distinct items the recommendation system recommends across all users:
$$\text{diversity-in-top-}N = \left| \bigcup_{u \in U} L_N(u) \right|$$
where diversity-in-top-N denotes the diversity of the top N recommended items.
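Because diversity-in-top-N is the size of the union of all users' recommendation lists, it reduces to a set operation. The sketch below uses the hypothetical name `diversity_in_top_n` and assumes the lists are held in a dictionary keyed by user.

```python
def diversity_in_top_n(recommendations):
    """Number of distinct items recommended across all users.

    recommendations: {user_id: [item_id, ...]}   i.e. L_N(u)
    """
    distinct = set()
    for items in recommendations.values():
        distinct.update(items)
    return len(distinct)


# Hypothetical toy data: item "a" is recommended twice but counted once.
recs = {"u1": ["a", "b"], "u2": ["a", "c"]}
diversity = diversity_in_top_n(recs)
```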
When the diversity of the system is higher, more items are shown to users. In conventional recommendation, popular items account for a very large share of the recommendations, which makes diversity very low. A good personalized recommendation system needs relatively high diversity, so that more items get a chance to be recommended. Diversity is also a metric that product providers care about a great deal: a system with the highest possible diversity would recommend every item to at least one user.
Experimental example
To verify that the threshold-controlled ranking method for personalized recommendation systems can indeed raise the diversity of the recommendation system with minimal loss of accuracy, the feasibility of the scheme was tested through the following experimental steps:
(1) Obtain the MovieLens and Netflix raw data sets.
(2) Because the raw data contain information unrelated to the algorithm, and the userId and movieId values are too large for efficient computation, clean and re-index the raw data sets: remove redundant fields such as genre and timestamp, and remap the rating data so that the userId and movieId values count from 1.
(3) Split the source data into a training set (60% of the source data) and a test set (40% of the source data), and ensure that a user has at least 5 movie ratings in the test set.
(4) On the training set, apply the user-based recommendation algorithm, the item-based recommendation algorithm and the singular-value-decomposition-based recommendation algorithm, yielding six predicted-rating lists in total for MovieLens and Netflix.
(5) Apply the ranking-based method to the above recommendation lists to obtain the re-ranked lists, i.e. the recommendation lists presented to users.
(6) Evaluate the performance of the proposed solution using precision-in-top-N and diversity-in-top-N.
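The re-indexing described in step (2) can be sketched as follows. This is an illustration under assumptions: the function name `remap_ids` is hypothetical, each input row is taken to start with user ID, item ID and rating, and any further fields (such as a timestamp) are simply dropped.

```python
def remap_ids(rows):
    """Map raw user and item IDs to consecutive integers starting at 1.

    rows: iterable of tuples beginning with (user_id, item_id, rating);
          extra trailing fields such as a timestamp are discarded.
    """
    user_map, item_map = {}, {}
    remapped = []
    for user, item, rating, *_extra in rows:
        # setdefault assigns the next free index the first time an ID is seen.
        u = user_map.setdefault(user, len(user_map) + 1)
        i = item_map.setdefault(item, len(item_map) + 1)
        remapped.append((u, i, rating))
    return remapped, user_map, item_map


# Hypothetical raw rows with large IDs and a trailing timestamp field.
raw = [
    (1488844, 17770, 3.0, 1125964800),
    (822109, 17770, 5.0, 1115942400),
    (1488844, 30, 4.0, 1125964900),
]
clean, user_map, item_map = remap_ids(raw)
```

Keeping `user_map` and `item_map` makes it possible to translate recommendations back to the original IDs afterwards.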
The experimental design is shown in Fig. 5.
The experiments compare, on the MovieLens and Netflix data sets respectively, how six different ranking methods affect the recommendation results and how the prediction accuracy and diversity of the recommendation system change. The evaluation metrics are precision-in-top-N and diversity-in-top-N. The experiments use offline validation together with cross-validation. In the usual offline procedure, the data set is first divided into a training set and a test set, with 60% of the original rating data in the training set and 40% in the test set. A recommendation algorithm then predicts recommendation results from the rating data in the training set, the predictions are compared with the actual results in the test set, and evaluation metrics such as prediction accuracy and diversity are computed. Five training sets and five test sets were generated at random, and the mean of the five experimental results was taken as the final result, so that as much of the rating data as possible is covered and the reliability of the experimental data is ensured.
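The repeated random 60/40 split and the averaging of results could be sketched as below. The names `split_ratings` and `average_over_splits`, the use of fixed seeds, and the `evaluate` callback are all assumptions for illustration; the sketch also does not enforce a minimum number of test ratings per user.

```python
import random


def split_ratings(ratings, train_fraction=0.6, seed=0):
    """Randomly split rating triples into (train, test) lists."""
    rng = random.Random(seed)  # seeded so each split is reproducible
    shuffled = list(ratings)
    rng.shuffle(shuffled)
    cut = int(round(len(shuffled) * train_fraction))
    return shuffled[:cut], shuffled[cut:]


def average_over_splits(ratings, evaluate, repeats=5):
    """Average a metric over several random splits.

    evaluate: hypothetical callback taking (train, test) and returning a float.
    """
    scores = [evaluate(*split_ratings(ratings, seed=s)) for s in range(repeats)]
    return sum(scores) / len(scores)


# Hypothetical toy data: ten rating triples.
toy = [("u%d" % n, "i%d" % n, 4.0) for n in range(10)]
train, test = split_ratings(toy)
test_share = average_over_splits(toy, lambda tr, te: len(te) / (len(tr) + len(te)))
```

In a real experiment `evaluate` would train a recommender on `train` and return, for example, precision-in-top-N measured on `test`.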
Fig. 6 compares the prediction accuracy and diversity obtained by the six methods of step (3-1), and demonstrates the feasibility of the diversity solution design and of the control-threshold-based ranking method that the invention proposes for personalized recommendation systems. The data in the figure show the relationship between diversity and accuracy obtained when different recommendation algorithms are re-ranked under a control threshold on the Netflix data set. The control-threshold ranking method of the invention improves diversity to varying degrees, while the best prediction result depends on the chosen data set and the chosen recommendation method. The designer of a personalized recommendation system can therefore choose flexibly among the ranking methods, according to the application scenario and the data collected, to achieve the best recommendation effect.
Fig. 7 shows the relationship between accuracy loss and diversity gain for the six methods used in step (3-1) of the invention. The figure compares loss of accuracy against diversity gain across all of the control-threshold-based ranking algorithms. It can be seen that the proposed algorithm improves diversity by sacrificing some of the system's prediction accuracy, and that different combinations of recommendation algorithm and ranking method behave differently on different data sets. In this experiment, the popularity-based ranking method and the item relative-likeability ranking method each obtain a good diversity gain.
Fig. 8 evaluates the effect of the inventive method on the long-tail behaviour of the personalized recommendation system. The figure reports long-tail items as a percentage of the items recommended to all users. Because the diversity of a personalized recommendation system is assessed by counting the distinct items recommended, diversity can be raised merely by recommending a few new items to a small number of users, so the diversity metric alone cannot establish whether the proposed method really changes the long-tail distribution of items. The effect of the ranking algorithm on the long-tail distribution is therefore evaluated here. Following the "80/20 rule" of long-tail distributions, 20% of the items are defined as bestseller items and the remaining 80% as long-tail items. It can be seen that the proposed re-ranking algorithm significantly raises the percentage of long-tail items recommended. This shows that the proposed ranking algorithm improves not only the diversity metric but also the share of long-tail items, and thus improves the distribution of long-tail items overall.
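The long-tail percentage described above could be computed along the following lines. This is a sketch under stated assumptions: the text does not say how the 20% bestseller items are selected, so the example assumes they are the items with the most ratings in the training data, and `long_tail_share` and its parameters are hypothetical names.

```python
from collections import Counter


def long_tail_share(train, top_n, head_fraction=0.2):
    """Percentage of all recommendations that go to long-tail items.

    train: list of (user, item, rating) triples.
    top_n: {user: [recommended items]}.
    Assumption: the top `head_fraction` of items by number of ratings are
    the bestsellers; every other item, including unrated ones, is long-tail.
    """
    counts = Counter(item for _, item, _ in train)
    # Most-rated items first; ties broken by item ID for determinism.
    ordered = sorted(counts, key=lambda i: (-counts[i], i))
    head_size = int(len(ordered) * head_fraction)
    bestsellers = set(ordered[:head_size])
    recommended = [i for items in top_n.values() for i in items]
    if not recommended:
        return 0.0
    tail_hits = sum(i not in bestsellers for i in recommended)
    return 100.0 * tail_hits / len(recommended)
```

Unlike diversity-in-top-N, this measure counts every recommendation rather than every distinct item, so it is not inflated by recommending a handful of new items to a few users.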
Those skilled in the art will readily understand that the foregoing is merely a preferred embodiment of the present invention and is not intended to limit the invention. Any modification, equivalent substitution, or improvement made within the spirit and principles of the present invention shall fall within the scope of protection of the present invention.

Claims (10)

1. A method for improving the diversity of a personalized recommendation system, characterized by comprising the following steps:
(1) obtaining a user rating data set from a website and storing it as text, the user rating data set including user IDs, the item IDs corresponding to each user ID, and the rating given by that user to each item;
(2) applying a recommendation algorithm for personalized recommendation systems to the user rating data set obtained in step (1) to perform prediction and recommendation, thereby generating a corresponding recommendation list for each of the multiple users in the user rating data set, the recommendation list including the user ID, the item IDs corresponding to that user ID, and the user's predicted rating for each item;
(3) for the user rating data set, determining the popularity of its items, the popularity being determined by the number of users who have rated the item and by the ratings those users gave it; comparing each user's predicted rating for an item with a control threshold; and sorting by popularity the items whose predicted rating is greater than or equal to the control threshold, to obtain the final ranking result;
(4) taking the top several results of the ranking result and returning them to the user as the recommendation list.
2. The method according to claim 1, characterized in that step (2) specifically comprises the following sub-steps:
(2-1) computing the similarity between all users from the obtained user rating data set:
where \(sim(a,b)\) denotes the similarity between users a and b, P denotes the set of all items, p denotes an item in the set P, and \(R(a,p)\) and \(R(b,p)\) denote the ratings of user a and user b for item p, respectively;
(2-2) for each user, selecting the K users most similar to that user as the user's neighbor users, where K is an integer between 50 and 300;
(2-3) for each user, analyzing the ratings of the items rated by the user's K neighbor users to predict the items that the user is most likely to rate highly, and recommending those items to the user.
3. The method according to claim 2, characterized in that step (2-3) specifically uses the following formula:
where
which, substituted into the above formula, gives:
where \(R^*(u,i)\) denotes the predicted rating of user u for item i; the formula also uses the average rating of user u over all of that user's items; k is a normalization factor; and \(N(u)\) denotes the set of all neighbor users of user u.
4. The method according to claim 3, characterized in that step (3) specifically comprises the following sub-steps:
(3-1) for the user rating data set, determining the popularity of its items from the number of users in the data set who have rated each item and from the ratings those users gave it;
(3-2) comparing each user's predicted rating for an item with the control threshold, and sorting by popularity the items whose predicted rating is greater than or equal to the control threshold, to obtain the final ranking result.
5. The method according to claim 4, characterized in that the process of obtaining the popularity of an item in step (3-1) is expressed as
where \(rank_{Popularity}(i)\) denotes ranking by item popularity, computed from the number of users u in the set U of all users for whom a rating \(R(u,i)\) of item i exists.
6. The method according to claim 4, characterized in that the process of obtaining the popularity of an item in step (3-1) is expressed as
\(rank_{ReversePrediction}(i) = R^*(u,i)\)
where \(rank_{ReversePrediction}(i)\) denotes ranking by predicted rating in reverse order; this predicted rating can represent the popularity of the item.
7. The method according to claim 4, characterized in that the process of obtaining the popularity of an item in step (3-1) is expressed as
where
\(rank_{AverageRating}(i)\) denotes ranking by the average rating of the item.
8. The method according to claim 4, characterized in that the process of obtaining the popularity of an item in step (3-1) is expressed as
\(rank_{AbsoluteLikeability}(i) = |U_H(i)|\)
where \(U_H(i) = \{u \in U(i) \mid R(u,i) \ge T_H\}\);
\(rank_{AbsoluteLikeability}(i)\) denotes ranking by the absolute likeability of the item, and \(|U_H(i)|\) is the number of users u whose rating of item i is at least the threshold \(T_H\).
9. The method according to claim 4, characterized in that the process of obtaining the popularity of an item in step (3-1) is expressed as
\(rank_{RelativeLikeability}(i) = |U_H(i)| / |U(i)|\)
where \(rank_{RelativeLikeability}(i)\) denotes the relative likeability of the item,
or is expressed as
where \(rank_{RatingVariance}(i)\) denotes the degree to which users' ratings of the item deviate from the average rating of that item.
10. The method according to any one of claims 5 to 9, characterized in that step (3-2) specifically uses the following formula:
where \(rank_x(i, T_R)\) denotes the function that ranks item i using the control threshold \(T_R\), \(rank_x(i)\) denotes the item popularity defined above, \(T_{max}\) denotes the upper limit of the rating (for example, 5 in a 5-point rating system), the control threshold satisfies \(T_R \in [T_H, T_{max}]\), and \(rank_{Standard}(i)\) is the existing standard ranking method, with:
\(rank_{Standard}(i) = R^*(u,i)^{-1}\)
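The ranking criteria of claims 5, 8 and 9 and the control-threshold re-ranking of claim 10 can be illustrated with the following sketch. The re-ranking formula itself is not reproduced in this text, so the piecewise rule used here is an assumption: items whose predicted rating is at least \(T_R\) are ordered by \(rank_x(i)\), and the remaining items with predicted rating at least \(T_H\) are placed after them in the standard order \(R^*(u,i)^{-1}\). It is also assumed that lower rank values are recommended first. All function and parameter names are hypothetical.

```python
from collections import defaultdict


def ranking_criteria(train, t_h=3.5):
    """Per-item statistics from (user, item, rating) triples."""
    raters = defaultdict(int)  # |U(i)|: users who rated item i
    likers = defaultdict(int)  # |U_H(i)|: users who rated item i >= t_h
    for _, item, rating in train:
        raters[item] += 1
        if rating >= t_h:
            likers[item] += 1
    return {
        "popularity": dict(raters),
        "absolute_likeability": {i: likers[i] for i in raters},
        "relative_likeability": {i: likers[i] / raters[i] for i in raters},
    }


def rank_standard(predicted):
    """Standard ranking value: the inverse of the predicted rating."""
    return 1.0 / predicted


def rerank(predictions, rank_x, t_r, t_h=3.5, n=5):
    """Top-n items for one user under control threshold t_r.

    predictions: {item: predicted rating R*(u, i)} for the user.
    rank_x: {item: ranking value}, e.g. one dict from ranking_criteria;
            it must contain every item predicted at or above t_r.
    Assumed rule: items predicted >= t_r are ordered by rank_x; the other
    candidates (>= t_h) follow, ordered by rank_standard.
    """
    candidates = {i: r for i, r in predictions.items() if r >= t_h}
    above = [i for i, r in candidates.items() if r >= t_r]
    # Offset pushes the standard-ranked items behind every rank_x-ranked item.
    offset = max((rank_x[i] for i in above), default=0.0)

    def key(item):
        if candidates[item] >= t_r:
            return rank_x[item]
        return offset + rank_standard(candidates[item])

    return sorted(candidates, key=key)[:n]
```

With `t_r` equal to the upper rating limit, no item is re-ranked and the list reduces to the standard order; lowering `t_r` towards `t_h` lets the chosen criterion reorder more of the list, which is the accuracy-for-diversity trade-off discussed in the description.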
CN201610463223.0A 2016-06-23 2016-06-23 One is used for improving the multifarious method of personalized recommendation system Pending CN106202151A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201610463223.0A CN106202151A (en) 2016-06-23 2016-06-23 One is used for improving the multifarious method of personalized recommendation system

Publications (1)

Publication Number Publication Date
CN106202151A true CN106202151A (en) 2016-12-07

Family

ID=57461470

Country Status (1)

Country Link
CN (1) CN106202151A (en)

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107230104A (en) * 2017-05-26 2017-10-03 喻继军 Commodity diversity based on consuming character adaptively recommends method
CN107491813A (en) * 2017-08-29 2017-12-19 天津工业大学 A kind of long-tail group recommending method based on multiple-objection optimization
CN109408730A (en) * 2018-12-06 2019-03-01 重庆理工大学 A method of reducing Matthew effect in film recommender system
CN110825967A (en) * 2019-10-31 2020-02-21 中山大学 Recommendation list re-ranking method for improving diversity of recommendation system
CN111192657A (en) * 2018-11-15 2020-05-22 宁波方太厨具有限公司 Menu recommendation method based on user behavior heat
CN111191707A (en) * 2019-12-25 2020-05-22 浙江工商大学 LFM training sample construction method fusing time attenuation factors

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103995839A (en) * 2014-04-30 2014-08-20 兴天通讯技术(天津)有限公司 Commodity recommendation optimizing method and system based on collaborative filtering
CN104166732A (en) * 2014-08-29 2014-11-26 合肥工业大学 Project collaboration filtering recommendation method based on global scoring information
US20150178775A1 (en) * 2013-12-23 2015-06-25 Yahoo! Inc. Recommending search bid phrases for monetization of short text documents
CN105512183A (en) * 2015-11-24 2016-04-20 中国科学院重庆绿色智能技术研究院 Personalized recommendation method and system based on users' independent choice

Cited By (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107230104A (en) * 2017-05-26 2017-10-03 喻继军 Commodity diversity based on consuming character adaptively recommends method
CN107491813A (en) * 2017-08-29 2017-12-19 天津工业大学 A kind of long-tail group recommending method based on multiple-objection optimization
CN111192657A (en) * 2018-11-15 2020-05-22 宁波方太厨具有限公司 Menu recommendation method based on user behavior heat
CN109408730A (en) * 2018-12-06 2019-03-01 重庆理工大学 A method of reducing Matthew effect in film recommender system
CN109408730B (en) * 2018-12-06 2021-08-24 重庆理工大学 Method for reducing Martha effect in film recommendation system
CN110825967A (en) * 2019-10-31 2020-02-21 中山大学 Recommendation list re-ranking method for improving diversity of recommendation system
CN110825967B (en) * 2019-10-31 2023-04-07 中山大学 Recommendation list re-ranking method for improving diversity of recommendation system
CN111191707A (en) * 2019-12-25 2020-05-22 浙江工商大学 LFM training sample construction method fusing time attenuation factors
CN111191707B (en) * 2019-12-25 2023-06-06 浙江工商大学 LFM training sample construction method integrating time attenuation factors

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20161207
