CN108269172A - Collaborative filtering based on comprehensive similarity migration - Google Patents

Info

Publication number
CN108269172A
Authority
CN
China
Prior art keywords
user
similarity
scoring
platform
represent
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201810050004.9A
Other languages
Chinese (zh)
Other versions
CN108269172B (en)
Inventor
琚生根
孙界平
陈黎
夏欣
金玉
王婧研
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Sichuan University
Original Assignee
Sichuan University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Sichuan University
Priority to CN201810050004.9A
Publication of CN108269172A
Application granted
Publication of CN108269172B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/06 Buying, selling or leasing transactions
    • G06Q30/0601 Electronic shopping [e-shopping]
    • G06Q30/0631 Item recommendations
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 Machine learning

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • General Physics & Mathematics (AREA)
  • Accounting & Taxation (AREA)
  • Finance (AREA)
  • Software Systems (AREA)
  • Physics & Mathematics (AREA)
  • Marketing (AREA)
  • Evolutionary Computation (AREA)
  • Strategic Management (AREA)
  • Economics (AREA)
  • Artificial Intelligence (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Data Mining & Analysis (AREA)
  • General Business, Economics & Management (AREA)
  • Medical Informatics (AREA)
  • Development Economics (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a collaborative filtering algorithm based on comprehensive similarity migration. Compared with the prior art, its similarity computation uses both user rating information and user attribute information. It also accounts for the differences between users in how they translate satisfaction into ratings, measuring user rating similarity by the consistency of the users' rating distributions. This improves the accuracy of the similarity computation and therefore the quality of the migrated data. Experimental results show that the model alleviates the data sparsity problem more effectively than the other algorithms compared. Future work may combine item similarity or other knowledge, such as text information, when migrating auxiliary-domain data, which could improve the quality of the migrated data and thereby the recommendation accuracy.

Description

Collaborative filtering based on comprehensive similarity migration
Technical field
The present invention relates to the field of network information computing, and more particularly to a collaborative filtering algorithm based on comprehensive similarity migration.
Background technology
Network information is currently growing exponentially. On the one hand, network users have access to abundant information; on the other hand, they face information overload and find it difficult to extract the information that is useful to them from the mass of data. A recommender system can filter out, from massive data, the portion that interests a user according to that user's interests. Recommender systems are now widely used, for example on e-commerce and recommendation platforms such as Amazon, eBay, MovieLens, and GroupLens.
Collaborative filtering is one of the most widely used techniques in recommender systems. Its basic idea is to use a user's historical rating data to predict the user's interest in items the user has not rated, and to select the items with the highest predicted interest as the recommendation result. The key step in traditional collaborative filtering is computing the similarity between users or between items. As the data grows, however, user rating data becomes extremely sparse, and recommendation quality declines accordingly.
Several approaches currently address the data sparsity problem [1]. The first reduces the sparsity of the data set by filling in unrated items [2-4]. It suits scenarios in which items are updated infrequently and the number of items is much smaller than the number of users, but it depends on user behavior and therefore suffers from the cold-start problem. The second reduces sparsity through matrix factorization [5]. It applies singular value decomposition to the rating matrix and exploits the latent relationships between users and items, but its training cost is high and it does not adapt to changes in user interest. The third applies the idea of transfer learning, improving learning in the target domain through the overlap between domains [6-7]. It assists training in the target domain by finding the latent relationships between the target domain and an auxiliary domain, so it depends on how reliable those relationships are; unreliable relationships can cause negative transfer. Some researchers have also proposed using data from multiple domains to alleviate data sparsity in the target domain. For example, Jamali et al. [8-9] proposed HeteroMF, a context-dependent matrix factorization model. Its main idea is to use the entities common to multiple domains, together with the latent factors of those shared entities, to factorize several matrices jointly; the algorithm has many parameters to train and spends a great deal of time computing gradients. Li Bin et al. [10] proposed the rating-matrix generative model RMGM (Rating Matrix Generative Model). Its main idea is to find a shared, implicit cluster-level rating matrix and then use it to fill the missing values of the original matrix in the target domain; the method requires strongly correlated domains and lacks theoretical support. Li Chao et al. [11] proposed TSUCF (Transfer Similarity User-based Collaborative Filtering), a model based on user similarity transfer. Its main idea is to use cross-domain data to link the auxiliary domain with the target domain and thereby assist the target domain. The method uses only user rating information, measures rating similarity only by the number of commonly rated items, and does not account for user preferences.
Although the algorithms above use auxiliary-domain knowledge to improve recommendation accuracy, they still have shortcomings. First, models based on matrix factorization have many parameters to train. Second, the auxiliary domain is required to be strongly correlated with the target domain, so the models apply to few scenarios. Third, the computation of user rating similarity ignores the differences between users in how they translate satisfaction into ratings.
Data sparsity is one of the main bottlenecks of traditional collaborative filtering. Transfer learning typically uses the latent relationships between the target domain and an auxiliary domain to transfer knowledge from the auxiliary domain, thereby improving recommendation quality in the target domain. Existing similarity-migration models generally use only user rating information and, when computing rating similarity, ignore the differences in users' rating standards. In view of these problems, the present invention proposes a recommendation algorithm based on comprehensive similarity migration.
References:
[1] Pan W. A survey of transfer learning for collaborative recommendation with auxiliary data[J]. Neurocomputing, 2016, 177(C): 447-453.
[2] Lemire D, Maclachlan A. Slope one predictors for online rating-based collaborative filtering[C]//Proceedings of the 2005 SIAM International Conference on Data Mining. Society for Industrial and Applied Mathematics, 2005: 471-475.
[3] Wang P, Ye H W. A personalized recommendation algorithm combining slope one scheme and user based collaborative filtering[C]//Industrial and Information Systems, 2009. IIS'09. International Conference on. IEEE, 2009: 152-154.
[4] Sun Z, Luo N, Kuang W. One real-time personalized recommendation systems based on Slope One algorithm[C]//Fuzzy Systems and Knowledge Discovery (FSKD), 2011 Eighth International Conference on. IEEE, 2011, 3: 1826-1830.
[5] Sarwar B, Karypis G, Konstan J, et al. Application of dimensionality reduction in recommender system - a case study[R]. Minnesota Univ Minneapolis Dept of Computer Science, 2000.
[6] Li B, Yang Q, Xue X. Transfer learning for collaborative filtering via a rating-matrix generative model[C]//International Conference on Machine Learning, ICML 2009, Montreal, Quebec, Canada, June. DBLP, 2009: 617-624.
[7] Pan W, Xiang E W, Liu N N, et al. Transfer Learning in Collaborative Filtering for Sparsity Reduction[C]//AAAI. 2010, 10: 230-235.
[8] Jamali M, Lakshmanan L. HeteroMF: recommendation in heterogeneous information networks using context dependent factor models[C]//Proceedings of the 22nd international conference on World Wide Web. ACM, 2013: 643-654.
[9] Wu S, Liu Q, Wang L, et al. Contextual operation for recommender systems[J]. IEEE Transactions on Knowledge and Data Engineering, 2016, 28(8): 2000-2012.
[10] Li B, Yang Q, Xue X. Transfer learning for collaborative filtering via a rating-matrix generative model[C]//Proceedings of the 26th annual international conference on machine learning. ACM, 2009: 617-624.
[11] Li Chao, Zhou Tao, Huang Junming, et al. A cross-platform recommendation algorithm based on user similarity transfer[J]. Journal of Chinese Information Processing, 2016, 30(2): 90-98.
Summary of the invention
The purpose of the present invention is to solve the above problems by providing a collaborative filtering algorithm based on comprehensive similarity migration.
The present invention achieves this purpose through the following technical solution:
The present invention comprises the following steps:
(1) Recommendation algorithm based on comprehensive similarity migration: suppose there are two platforms e1 and e2. U1 denotes the users who have historical behavior information only on platform e1, U2 denotes the users who have historical behavior information only on platform e2, and Uc denotes the users who have historical behavior information on both e1 and e2, defined as cross users. In practice, the number of cross users is far smaller than the number of non-cross users. The cross users are used to establish similarity links between the non-cross users U1 and U2, which helps recommendation in the target domain;
(2) Similarity migration: the similarity between the non-cross users U1 and U2 cannot be computed directly. However, the similarity of U1 to the cross users Uc and the similarity of U2 to the cross users Uc can each be computed, so the cross users Uc can serve as a bridge for establishing the similarity between U1 and U2;
Similarity migration steps: first, find the set Uc of users common to platform 1 and platform 2; then compute the similarity between U1 and Uc and the similarity between U2 and Uc, each recorded as a vector; finally, take the inner product of the two vectors as the transfer similarity between U1 and U2;
Here U11 denotes non-cross user 1 on platform 1; U21 and U22 denote non-cross users on platform 2; Uc1, Uc2, and so on denote cross users; and S1, S2, and so on denote similarities. The similarity between U11 and U21 can be computed indirectly through Uc1, Uc2, and Uc3. In summary, the similarity between U1 and U2 is the inner product of their similarity vectors over the cross users;
(3) Similarity computation: before the similarity between the non-cross users U1 and U2 is computed, the similarity of U1 and of U2 to the cross users must be computed first, as follows:
1) User rating similarity
User rating similarity is measured from two aspects: rating-distribution consistency and confidence;
Rating-distribution consistency is determined by the distribution of the ratings that two users gave to the same items; the more consistent the distributions, the more similar the two users' interests. Let {ur1, ur2, ..., urn} and {vr1, vr2, ..., vrn} be the ratings given by user u and user v, respectively, to their common items. Each group is sorted in ascending order, so that the items appear in the order 1, 2, ..., n for user u and x1, x2, ..., xn for user v. The better 1, 2, ..., n matches x1, x2, ..., xn, the higher the consistency between the two users. The calculation formula is as follows;
Confidence is determined by the number of items that both users have rated. If that number is very small, consistent rating distributions still do not guarantee that the two users are similar. The calculation formula is as follows;
where Iu denotes the set of items rated by user u;
User rating similarity is computed as follows;
sim1(u, v) = dist(u, v) · conf(u, v) (1-4)
2) User attribute similarity
User attribute similarity is measured from user attributes. Users who share the same attributes are generally considered to have similar interests to some degree. The calculation formula is as follows;
where n denotes the number of attributes; sim(u, v, i) indicates whether the two users agree on the i-th attribute, taking the value 1 if they do and 0 otherwise; and di denotes the discrimination of the i-th attribute. If the users having a given attribute have rated all items, that attribute has no discrimination. The value of di is determined per data set;
3) Final similarity
In general, once a user has rated items, the user's rating information should be used as far as possible; when a user has not rated items, the user's attribute information should be used as far as possible. As the number of items rated by the user increases, the algorithm should transition smoothly to recommending from rating information. A sigmoid function is used here for smoothing, and the final user similarity is defined as follows:
Sim(u, v) = α · sim1(u, v) + (1 − α) · sim2(u, v) (1-6)
where Cuv denotes the set of items rated by both user u and user v. The formula shows that, as the number of items rated by the users increases, the similarity computation transitions smoothly to using rating information. This smooth transition can improve prediction accuracy in the cold-start state;
(4) Algorithm description:
A) User similarity algorithm: step 1, compute the user attribute similarity from the user attribute information; step 2, compute the user rating similarity from the user rating information; step 3, compute the final user similarity from the user attribute similarity and the user rating similarity;
B) Recommendation algorithm based on transfer learning: step 1, compute the similarity between U1 and Uc; step 2, compute the similarity between U2 and Uc; step 3, compute the migrated similarity; step 4, make recommendations using the migrated similarity in combination with the UCF algorithm.
The beneficial effects of the present invention are as follows:
The present invention is a collaborative filtering algorithm based on comprehensive similarity migration. Compared with the prior art, its similarity computation uses both user rating information and user attribute information. It also accounts for the differences between users in how they translate satisfaction into ratings, measuring user rating similarity by the consistency of the users' rating distributions. This improves the accuracy of the similarity computation and therefore the quality of the migrated data. Experimental results show that the model alleviates the data sparsity problem more effectively than the other algorithms compared. Future work may combine item similarity or other knowledge, such as text information, when migrating auxiliary-domain data, which could improve the quality of the migrated data and thereby the recommendation accuracy.
Description of the drawings
Fig. 1 is the user rating matrix diagram of the present invention;
Fig. 2 is a schematic diagram of similarity migration in the present invention;
Fig. 3 compares the RMSE values of the different algorithms for group A;
Fig. 4 compares the RMSE values of the different algorithms for group B;
Fig. 5 compares the RMSE values of the different algorithms for group C;
Fig. 6 compares the RMSE values of the different algorithms for group D;
Fig. 7 compares the RMSE values of the different algorithms for group E;
Fig. 8 compares the RMSE values of the different algorithms for N = 5;
Fig. 9 compares the RMSE values of the different algorithms for N = 10;
Fig. 10 compares the RMSE values of the different algorithms for N = 20;
Fig. 11 compares the RMSE values of the different algorithms for N = 30;
Fig. 12 compares the RMSE values of the different algorithms for N = 40.
Specific embodiment
The invention will be further described below in conjunction with the accompanying drawings:
Recommendation algorithm based on comprehensive similarity migration:
The present invention proposes a recommendation algorithm based on comprehensive similarity migration, which uses auxiliary-domain information to alleviate the data sparsity problem in the target domain.
The algorithm is illustrated below using two movie platforms as an example. Suppose there are two platforms e1 and e2. U1 denotes the users who have historical behavior information only on platform e1, U2 denotes the users who have historical behavior information only on platform e2, and Uc denotes the users who have historical behavior information on both e1 and e2, defined as cross users. The user behavior matrix is shown in Fig. 1.
In practice, the number of cross users is far smaller than the number of non-cross users.
A conventional recommendation algorithm uses the small proportion of cross users to make recommendations for the much larger proportion of non-cross users. This gives rise to cold-start and data sparsity problems and lowers recommendation quality.
The algorithm of the present invention uses the cross users to establish similarity links between the non-cross users U1 and U2, which helps recommendation in the target domain.
Similarity migration:
As shown in Fig. 1, the similarity between the non-cross users U1 and U2 cannot be computed directly. However, the similarity of U1 to the cross users Uc and the similarity of U2 to the cross users Uc can each be computed, so the cross users Uc can serve as a bridge for establishing the similarity between U1 and U2.
Similarity migration steps: first, find the set Uc of users common to platform 1 and platform 2; then compute the similarity between U1 and Uc and the similarity between U2 and Uc, each recorded as a vector; finally, take the inner product of the two vectors as the transfer similarity between U1 and U2.
Similarity migration is shown in Fig. 2.
Here U11 denotes non-cross user 1 on platform 1; U21 and U22 denote non-cross users on platform 2; Uc1, Uc2, and so on denote cross users; and S1, S2, and so on denote similarities. The similarity between U11 and U21 can be computed indirectly through Uc1, Uc2, and Uc3. In summary, the similarity between U1 and U2 is the inner product of their similarity vectors over the cross users.
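The migration step can be illustrated with a short Python sketch. It assumes that each non-cross user's similarities to the cross users are already available as a list ordered by the same cross users; the function name `transfer_similarity` and the sample values are illustrative and do not come from the patent text.

```python
def transfer_similarity(sim_to_cross_a, sim_to_cross_b):
    """Inner product of two similarity vectors indexed by the same cross users.

    sim_to_cross_a[k] is the similarity of a platform-1 user to cross user k,
    sim_to_cross_b[k] is the similarity of a platform-2 user to cross user k.
    """
    if len(sim_to_cross_a) != len(sim_to_cross_b):
        raise ValueError("both vectors must cover the same cross users")
    return sum(a * b for a, b in zip(sim_to_cross_a, sim_to_cross_b))


# Hypothetical similarities of U11 and U21 to three cross users Uc1..Uc3.
s_u11 = [0.8, 0.1, 0.5]
s_u21 = [0.6, 0.3, 0.0]
print(transfer_similarity(s_u11, s_u21))
```

Two users who are both close to the same cross users receive a large transferred similarity, while a cross user who is similar to only one of them contributes little to the sum.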
Similarity computation:
Based on the analysis above, before the similarity between the non-cross users U1 and U2 is computed, the similarity of U1 and of U2 to the cross users must be computed first, as follows:
1) User rating similarity
The present invention measures user rating similarity from two aspects: rating-distribution consistency and confidence.
Rating-distribution consistency is determined by the distribution of the ratings that two users gave to the same items. The more consistent the distributions, the more similar the two users' interests. Let {ur1, ur2, ..., urn} and {vr1, vr2, ..., vrn} be the ratings given by user u and user v, respectively, to their common items. Each group is sorted in ascending order, so that the items appear in the order 1, 2, ..., n for user u and x1, x2, ..., xn for user v. The better 1, 2, ..., n matches x1, x2, ..., xn, the higher the consistency between the two users. The calculation formula is as follows.
Confidence is determined by the number of items that both users have rated. If that number is very small, consistent rating distributions still do not guarantee that the two users are similar. The calculation formula is as follows.
where Iu denotes the set of items rated by user u.
User rating similarity is computed as follows.
sim1(u, v) = dist(u, v) · conf(u, v) (1-4)
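The text here does not reproduce the formulas for dist(u, v) and conf(u, v), so the following Python sketch substitutes assumed forms. Rating-distribution consistency is approximated by a rank-agreement score (one minus a normalized Spearman footrule distance over the common items), and confidence by the share of commonly rated items among all items rated by either user. Only the final product follows formula (1-4); the function names and both assumed forms are hypothetical and may differ from the patented definitions.

```python
def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


def rating_similarity(ratings_u, ratings_v):
    """sim1(u, v) = dist(u, v) * conf(u, v), with assumed dist and conf."""
    common = sorted(set(ratings_u) & set(ratings_v))
    n = len(common)
    if n == 0:
        return 0.0
    # Assumed confidence: common items over all items rated by either user.
    conf = n / len(set(ratings_u) | set(ratings_v))
    if n == 1:
        dist = 1.0
    else:
        ranks_u = average_ranks([ratings_u[i] for i in common])
        ranks_v = average_ranks([ratings_v[i] for i in common])
        footrule = sum(abs(a - b) for a, b in zip(ranks_u, ranks_v))
        # n * n // 2 is the largest possible footrule distance for n items.
        dist = 1.0 - footrule / (n * n // 2)
    return dist * conf


# A strict rater and a generous rater who order the same items identically.
strict = {"a": 1, "b": 3, "c": 5}
generous = {"a": 2, "b": 4, "c": 5, "d": 1}
print(rating_similarity(strict, generous))
```

Because only the ordering of the ratings is compared, a user who habitually rates low and one who habitually rates high can still be judged consistent, which is the point the text makes about differing rating standards.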
2) User attribute similarity
User attribute similarity is measured from user attributes. Users who share the same attributes are generally considered to have similar interests to some degree. The calculation formula is as follows.
where n denotes the number of attributes; sim(u, v, i) indicates whether the two users agree on the i-th attribute, taking the value 1 if they do and 0 otherwise; and di denotes the discrimination of the i-th attribute. If the users having a given attribute have rated all items, that attribute has no discrimination. The value of di is determined per data set.
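As a rough illustration of the attribute similarity, the Python sketch below assumes a normalized weighted sum, that is, the total discrimination di of the matching attributes divided by the total discrimination of all attributes. The normalization, the attribute names, and the weights are assumptions made for the example; the patent's own formula is not reproduced in this text.

```python
def attribute_similarity(attrs_u, attrs_v, discrimination):
    """Weighted share of attributes on which two users agree.

    discrimination maps attribute name -> d_i (assumed non-negative).
    sim(u, v, i) is 1 when both users have the same value for attribute i.
    """
    total = sum(discrimination.values())
    if total == 0:
        return 0.0
    matched = sum(
        d for name, d in discrimination.items()
        if name in attrs_u and name in attrs_v and attrs_u[name] == attrs_v[name]
    )
    return matched / total


# Hypothetical discrimination weights and user profiles.
weights = {"gender": 0.2, "age_group": 0.5, "occupation": 0.3}
user_u = {"gender": "F", "age_group": "25-34", "occupation": "engineer"}
user_v = {"gender": "F", "age_group": "35-44", "occupation": "engineer"}
print(attribute_similarity(user_u, user_v, weights))
```

An attribute with a discrimination of zero contributes nothing to the score, which matches the statement that an attribute without discrimination should not influence the similarity.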
3) Final similarity
In general, once a user has rated items, the user's rating information should be used as far as possible; when a user has not rated items, the user's attribute information should be used as far as possible. As the number of items rated by the user increases, the algorithm should transition smoothly to recommending from rating information. The present invention uses a sigmoid function for smoothing, and the final user similarity is defined as follows:
Sim(u, v) = α · sim1(u, v) + (1 − α) · sim2(u, v) (1-6)
where Cuv denotes the set of items rated by both user u and user v. The formula shows that, as the number of items rated by the users increases, the similarity computation transitions smoothly to using rating information. This smooth transition can improve prediction accuracy in the cold-start state.
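The blending of the two similarities can be sketched in Python as follows. The weighted sum follows formula (1-6), but the exact sigmoid-based definition of α is not reproduced in this text, so the sketch assumes α = 2 / (1 + e^(−k·|Cuv|)) − 1, which is 0 when the users share no rated items and approaches 1 as |Cuv| grows. The steepness parameter k and the function names are hypothetical.

```python
import math


def blend_weight(n_common, steepness=0.5):
    """Assumed sigmoid-based weight: 0 for no common items, tending to 1."""
    return 2.0 / (1.0 + math.exp(-steepness * n_common)) - 1.0


def final_similarity(sim_rating, sim_attr, n_common, steepness=0.5):
    """Sim(u, v) = alpha * sim1(u, v) + (1 - alpha) * sim2(u, v)."""
    alpha = blend_weight(n_common, steepness)
    return alpha * sim_rating + (1.0 - alpha) * sim_attr


# With no common items the result equals the attribute similarity;
# with many common items it approaches the rating similarity.
for n_common in (0, 2, 10, 50):
    print(n_common, final_similarity(0.9, 0.4, n_common))
```

Under this assumed weight, a brand-new user is compared purely on attributes, and the influence of ratings grows continuously as evidence accumulates, which is the cold-start behavior the paragraph above describes.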
Algorithm description:
1) User similarity algorithm: step 1, compute the user attribute similarity from the user attribute information; step 2, compute the user rating similarity from the user rating information; step 3, compute the final user similarity from the user attribute similarity and the user rating similarity.
2) Recommendation algorithm based on transfer learning: step 1, compute the similarity between U1 and Uc; step 2, compute the similarity between U2 and Uc; step 3, compute the migrated similarity; step 4, make recommendations using the migrated similarity in combination with the UCF algorithm.
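Step 4 combines the migrated similarity with user-based collaborative filtering (UCF). The text does not spell out the prediction rule, so the Python sketch below assumes the commonly used mean-centered weighted average over the N most similar neighbors who rated the item. The function name, the tuple layout, and the default N are illustrative assumptions.

```python
def predict_rating(target_mean, neighbours, n_neighbours=20):
    """Predict a rating from the N most similar neighbours.

    neighbours is a list of (similarity, neighbour_rating, neighbour_mean)
    tuples for users who rated the item; similarity is the migrated
    similarity between the target user and that neighbour.
    """
    top = sorted(neighbours, key=lambda t: t[0], reverse=True)[:n_neighbours]
    norm = sum(abs(sim) for sim, _, _ in top)
    if norm == 0:
        # No usable neighbour: fall back to the target user's mean rating.
        return target_mean
    offset = sum(sim * (rating - mean) for sim, rating, mean in top)
    return target_mean + offset / norm


# Hypothetical neighbours: (migrated similarity, their rating, their mean).
neighbours = [(0.8, 5.0, 4.0), (0.2, 2.0, 3.0)]
print(predict_rating(3.0, neighbours))
```

The parameter `n_neighbours` plays the role of the nearest-neighbor count N that the experiments vary.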
Experiment:
Experimental data:
The experiments use the data set of the MovieLens movie website. The data set is described below.
Table 1 MovieLens data description
The experimental data set is divided as follows.
Table 5 Data set division
Evaluation index:
To measure the prediction accuracy of the algorithm, the experiments use the root mean squared error (RMSE) to quantify the gap between the ratings predicted by the algorithm of the present invention and the users' true ratings. RMSE is computed as follows:
RMSE = sqrt( Σ_{(u,i)∈T} (r_ui − pre_ui)² / |T| )
where r_ui denotes the true rating of item i by user u, pre_ui denotes the predicted rating of item i by user u, T is the test set, and |T| is the size of the test set. The smaller the RMSE, the closer the predicted values are to the true values and the more accurate the predictions.
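The RMSE metric can be computed with a few lines of Python. The sketch assumes the test set is given as (true rating, predicted rating) pairs; the function name and the sample values are illustrative.

```python
import math


def rmse(pairs):
    """Root mean squared error over (true_rating, predicted_rating) pairs."""
    pairs = list(pairs)
    if not pairs:
        raise ValueError("the test set must not be empty")
    squared_error = sum((true - pred) ** 2 for true, pred in pairs)
    return math.sqrt(squared_error / len(pairs))


# Hypothetical test set of three (true, predicted) ratings.
test_set = [(4.0, 3.5), (2.0, 2.5), (5.0, 5.0)]
print(rmse(test_set))
```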
Algorithms compared:
1) UCF algorithm: recommendations can be made only by using the cross users.
2) Recommendation algorithm based on user similarity transfer (TSUCF): the rating information of the small proportion of cross users serves as a bridge that links the users of two different e-commerce platforms, so that recommendations can be made.
3) Algorithm of the present invention: it improves on the TSUCF algorithm in two ways. First, it makes full use of user attribute information. Second, it accounts for the differences in users' rating standards by measuring user rating similarity through the consistency of the rating distributions over commonly rated items.
Experimental results:
Because the number of nearest neighbors N can affect the results, the algorithms are compared with N set to 5, 10, 20, 30, and 40.
As Figs. 3-7 clearly show, the algorithm of the present invention achieves better recommendation results for every nearest-neighbor count.
To account for the influence of the number of cross users on the results, the algorithms are also compared with 95, 189, 283, 377, and 471 cross users.
As Figs. 8-12 clearly show, the algorithm of the present invention achieves the best recommendation results for every number of cross users.
The algorithm of the present invention uses user attribute similarity and user rating similarity to migrate auxiliary-domain data and thereby address the data sparsity problem in the target domain. Future work may combine item similarity or other knowledge, such as text information, when migrating auxiliary-domain data, which could improve the quality of the migrated data and thereby the recommendation accuracy.
The basic principles, main features, and advantages of the present invention have been shown and described above. Those skilled in the art should understand that the present invention is not limited to the above embodiments. The embodiments and the description merely illustrate the principles of the invention; various changes and improvements may be made without departing from its spirit and scope, and all such changes and improvements fall within the claimed scope of protection. The scope of protection claimed is defined by the appended claims and their equivalents.

Claims (1)

1. A collaborative filtering algorithm based on comprehensive similarity migration, characterized by comprising the following steps:
(1) Recommendation algorithm based on comprehensive similarity migration: suppose there are two platforms e1 and e2. U1 denotes the users who have historical behavior information only on platform e1, U2 denotes the users who have historical behavior information only on platform e2, and Uc denotes the users who have historical behavior information on both e1 and e2, defined as cross users. In practice, the number of cross users is far smaller than the number of non-cross users. The cross users are used to establish similarity links between the non-cross users U1 and U2, which helps recommendation in the target domain;
(2) Similarity migration: the similarity between the non-cross users U1 and U2 cannot be computed directly. However, the similarity of U1 to the cross users Uc and the similarity of U2 to the cross users Uc can each be computed, so the cross users Uc can serve as a bridge for establishing the similarity between U1 and U2;
Similarity migration steps: first, find the set Uc of users common to platform 1 and platform 2; then compute the similarity between U1 and Uc and the similarity between U2 and Uc, each recorded as a vector; finally, take the inner product of the two vectors as the transfer similarity between U1 and U2;
Here U11 denotes non-cross user 1 on platform 1; U21 and U22 denote non-cross users on platform 2; Uc1, Uc2, and so on denote cross users; and S1, S2, and so on denote similarities. The similarity between U11 and U21 can be computed indirectly through Uc1, Uc2, and Uc3. In summary, the similarity between U1 and U2 is the inner product of their similarity vectors over the cross users;
(3) Similarity computation: before the similarity between the non-cross users U1 and U2 is computed, the similarity of U1 and of U2 to the cross users must be computed first, as follows:
1) User rating similarity
User rating similarity is measured from two aspects: rating-distribution consistency and confidence;
Rating-distribution consistency is determined by the distribution of the ratings that two users gave to the same items; the more consistent the distributions, the more similar the two users' interests. Let {ur1, ur2, ..., urn} and {vr1, vr2, ..., vrn} be the ratings given by user u and user v, respectively, to their common items. Each group is sorted in ascending order, so that the items appear in the order 1, 2, ..., n for user u and x1, x2, ..., xn for user v. The better 1, 2, ..., n matches x1, x2, ..., xn, the higher the consistency between the two users. The calculation formula is as follows;
Confidence is determined by the number of items that both users have rated. If that number is very small, consistent rating distributions still do not guarantee that the two users are similar. The calculation formula is as follows;
where Iu denotes the set of items rated by user u;
User rating similarity is computed as follows;
sim1(u, v) = dist(u, v) · conf(u, v) (1-4)
2) User attribute similarity
User attribute similarity is measured from user attributes. Users who share the same attributes are generally considered to have similar interests to some degree. The calculation formula is as follows;
where n denotes the number of attributes; sim(u, v, i) indicates whether the two users agree on the i-th attribute, taking the value 1 if they do and 0 otherwise; and di denotes the discrimination of the i-th attribute. If the users having a given attribute have rated all items, that attribute has no discrimination. The value of di is determined per data set;
3) Final similarity
In general, once a user has rated some items, the user's rating information should be used as much as possible; if the user has not rated any items, the user's attribute information should be used as much as possible. As the number of items rated by the user grows, the algorithm should transition smoothly toward recommending from rating information. A sigmoid function is used here for smoothing, and the final user similarity is defined as follows:
sim(u, v) = α · sim_1(u, v) + (1 − α) · sim_2(u, v)   (1-6)
where C_uv denotes the set of items rated by both user u and user v. As the formula shows, the user similarity calculation shifts smoothly toward rating information as the number of items rated by the users increases; this smooth transition improves prediction accuracy in the cold-start state;
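The blending in formula (1-6) can be sketched in Python as follows. The exact expression for α is not reproduced in this text; the sketch assumes a sigmoid of |C_uv| rescaled so that α = 0 when the users share no rated items and α approaches 1 as |C_uv| grows. That particular form, and the function names, are assumptions made for illustration.

```python
import math


def alpha(n_common: int) -> float:
    """ASSUMED weight: sigmoid of |C_uv| rescaled from (0.5, 1) to (0, 1)."""
    return 2.0 / (1.0 + math.exp(-n_common)) - 1.0


def final_similarity(sim1_uv: float, sim2_uv: float, n_common: int) -> float:
    """Blend rating and attribute similarity as in formula (1-6)."""
    a = alpha(n_common)
    return a * sim1_uv + (1.0 - a) * sim2_uv


# With no common items only the attribute similarity is used; with many
# common items the result is dominated by the rating similarity.
for n in (0, 1, 3, 10):
    print(n, round(final_similarity(0.9, 0.2, n), 4))
```

A steeper or flatter transition could be obtained by scaling the sigmoid argument; the source does not state which scaling, if any, is used.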
(4) Algorithm description:
A) User similarity algorithm: in the first step, the user attribute similarity is calculated from the user attribute information; in the second step, the user rating similarity is calculated from the user rating information; in the third step, the final user similarity is calculated from the user attribute similarity and the user rating similarity;
B) Recommendation algorithm based on transfer learning: in the first step, the similarity between U_1 and U_c is calculated; in the second step, the similarity between U_2 and U_c is calculated; in the third step, the migration similarity is calculated; in the fourth step, recommendations are made using the migration similarity in combination with the UCF algorithm.
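The fourth step of algorithm B) can be illustrated with a minimal Python sketch of user-based collaborative filtering (UCF). The formula for the migration similarity is not reproduced in this text, so the sketch simply takes an already computed similarity table as input and does not attempt the migration step. The prediction rule used here (a similarity-weighted average of the ratings of the k most similar users) is one common UCF variant chosen as an assumption; the data layout and names are hypothetical.

```python
from typing import Dict, Optional

RatingTable = Dict[str, Dict[str, float]]      # user -> item -> rating
SimilarityTable = Dict[str, Dict[str, float]]  # user -> other user -> similarity


def predict(target: str, item: str, ratings: RatingTable,
            similarity: SimilarityTable, k: int = 20) -> Optional[float]:
    """Predict the target user's rating of item from the k most similar raters.

    Returns None when no positively similar user has rated the item.
    """
    neighbours = [
        (s, other) for other, s in similarity.get(target, {}).items()
        if s > 0 and item in ratings.get(other, {})
    ]
    neighbours.sort(key=lambda pair: pair[0], reverse=True)
    top = neighbours[:k]
    norm = sum(s for s, _ in top)
    if norm == 0:
        return None
    return sum(s * ratings[other][item] for s, other in top) / norm


# Hypothetical data: user "t" has no ratings of their own (cold start).
ratings = {"a": {"x": 4.0}, "b": {"x": 2.0}, "c": {"y": 5.0}}
similarity = {"t": {"a": 0.75, "b": 0.25, "c": 0.9}}
print(predict("t", "x", ratings, similarity))
```

Because the prediction depends only on the neighbours' ratings and the supplied similarities, it remains usable for a user with no ratings, which is the cold-start situation the method addresses.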
CN201810050004.9A 2018-01-18 2018-01-18 Collaborative filtering method based on comprehensive similarity migration Active CN108269172B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810050004.9A CN108269172B (en) 2018-01-18 2018-01-18 Collaborative filtering method based on comprehensive similarity migration


Publications (2)

Publication Number Publication Date
CN108269172A (en) 2018-07-10
CN108269172B (en) 2020-02-18

Family

ID=62776114

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810050004.9A Active CN108269172B (en) 2018-01-18 2018-01-18 Collaborative filtering method based on comprehensive similarity migration

Country Status (1)

Country Link
CN (1) CN108269172B (en)




Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant