CN108269172B - Collaborative filtering method based on comprehensive similarity migration - Google Patents

Collaborative filtering method based on comprehensive similarity migration

Info

Publication number
CN108269172B
Authority
CN
China
Prior art keywords
user
similarity
users
cross
attribute
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201810050004.9A
Other languages
Chinese (zh)
Other versions
CN108269172A (en)
Inventor
琚生根
孙界平
陈黎
夏欣
金玉
王婧研
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Sichuan University
Original Assignee
Sichuan University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Sichuan University
Priority to CN201810050004.9A
Publication of CN108269172A
Application granted
Publication of CN108269172B
Legal status: Active
Anticipated expiration

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06Q: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00: Commerce
    • G06Q30/06: Buying, selling or leasing transactions
    • G06Q30/0601: Electronic shopping [e-shopping]
    • G06Q30/0631: Item recommendations
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00: Machine learning

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • General Physics & Mathematics (AREA)
  • Accounting & Taxation (AREA)
  • Finance (AREA)
  • Software Systems (AREA)
  • Physics & Mathematics (AREA)
  • Marketing (AREA)
  • Evolutionary Computation (AREA)
  • Strategic Management (AREA)
  • Economics (AREA)
  • Artificial Intelligence (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Data Mining & Analysis (AREA)
  • General Business, Economics & Management (AREA)
  • Medical Informatics (AREA)
  • Development Economics (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

Compared with the prior art, the collaborative filtering method based on comprehensive similarity migration has the following advantages in similarity calculation: it uses both user score information and user attribute information, it takes into account the differences between users' scoring standards for satisfaction, and it measures user score similarity by the consistency of the users' score distributions. This improves the accuracy of the similarity calculation and the quality of the migrated data. Experimental results show that, compared with other methods, the model effectively alleviates the data sparsity problem. In future work, data in the auxiliary domain could also be migrated by jointly considering item similarity or other knowledge, such as text information, which may improve the quality of the migrated data and thereby the recommendation accuracy.

Description

Collaborative filtering method based on comprehensive similarity migration
Technical Field
The invention relates to the field of network information calculation, in particular to a collaborative filtering method based on comprehensive similarity migration.
Background
The amount of information on the network is growing exponentially. Network users can obtain rich information, but they also face information overload, which makes it difficult to mine the information that is useful to them from the mass of data. A recommendation system screens out the part of this mass data that interests a user, according to the user's interests. Recommendation systems are now widely used, for example on Amazon, eBay, MovieLens, GroupLens and other electronic commerce platforms.
Collaborative filtering is one of the most widely applied techniques in recommendation systems. Its basic idea is to predict a user's interest in unscored items from the user's historical scoring data and to select the items with the highest predicted interest as the recommendation results. In a traditional collaborative filtering algorithm the key step is to calculate the similarity between users or between items, but as the data grows the user scoring data becomes extremely sparse and the recommendation quality drops.
Several kinds of solutions to the data sparsity problem [1] currently exist. First, the sparsity of the data set can be reduced by filling in unscored items [2-4]; such algorithms suit scenarios in which items are updated infrequently and the number of items is far smaller than the number of users, but they depend on user behavior and therefore suffer from the cold-start problem. Second, the sparsity of the data set can be reduced through matrix decomposition [5]; such algorithms exploit the latent relationship between users and items by applying singular value decomposition to the scoring matrix, but training is expensive and the approach adapts poorly to changes in user interests. Third, learning in the target domain can be promoted with transfer learning, using the overlap between domains [6-7]; such algorithms use the auxiliary domain to assist training in the target domain by finding the latent relationship between the two, so they depend on the reliability of that relationship, and an unreliable relationship causes negative transfer.
Some scholars have proposed using multi-domain data to alleviate data sparsity in the target domain. Jamali et al. [8-9] proposed HeteroMF, a context-dependent matrix decomposition model whose main idea is to jointly decompose several matrices at once by using the entities common to multiple domains and sharing the feature factors of those entities; the algorithm must train many parameters and spends a great deal of time computing gradients. Li Bin et al. [10] proposed the rating-matrix generative model (RMGM), whose main idea is to find a shared scoring matrix at the latent cluster level and then use it to fill the empty values of the original matrix in the target domain; this method requires strongly related domains and lacks theoretical support. Li Chao et al. [11] proposed TSUCF (Transfer Similarity User-based Collaborative Filtering), a transfer model based on transferred similarity, whose main idea is to use cross-domain data to establish a relationship between the auxiliary domain and the target domain and thereby assist the target domain.
Although the above algorithms all use knowledge from the auxiliary domain to improve recommendation precision, they still have the following disadvantages: first, models based on matrix transformation have many training parameters; second, the auxiliary domain and the target domain must be strongly correlated, so the models apply to few scenarios; third, when user score similarity is calculated, the differences between users' scoring standards for satisfaction are ignored.
The data sparsity problem is one of the major bottlenecks of conventional collaborative filtering algorithms. Transfer learning generally refers to transferring knowledge from the auxiliary domain by using the latent relationship between the target domain and the auxiliary domain, so as to improve recommendation quality in the target domain. Existing similarity-based migration models generally use only user score information and ignore the differences between users' scoring standards when calculating score similarity. To address these problems, the invention provides a recommendation method based on comprehensive similarity migration.
References:
[1] Pan W. A survey of transfer learning for collaborative recommendation with auxiliary data[J]. Neurocomputing, 2016, 177(C): 447-453.
[2] Lemire D, Maclachlan A. Slope one predictors for online rating-based collaborative filtering[C]//Proceedings of the 2005 SIAM International Conference on Data Mining. Society for Industrial and Applied Mathematics, 2005: 471-475.
[3] Wang P, Ye H W. A personalized recommendation algorithm combining slope one scheme and user based collaborative filtering[C]//Industrial and Information Systems, 2009. IIS'09. International Conference on. IEEE, 2009: 152-154.
[4] Sun Z, Luo N, Kuang W. One real-time personalized recommendation systems based on Slope One algorithm[C]//Fuzzy Systems and Knowledge Discovery (FSKD), 2011 Eighth International Conference on. IEEE, 2011, 3: 1826-1830.
[5] Sarwar B, Karypis G, Konstan J, et al. Application of dimensionality reduction in recommender system - a case study[R]. Minnesota Univ Minneapolis Dept of Computer Science, 2000.
[6] Li B, Yang Q, Xue X. Transfer learning for collaborative filtering via a rating-matrix generative model[C]//International Conference on Machine Learning, ICML 2009, Montreal, Quebec, Canada, June. DBLP, 2009: 617-624.
[7] Pan W, Xiang E W, Liu N N, et al. Transfer Learning in Collaborative Filtering for Sparsity Reduction[C]//AAAI. 2010, 10: 230-235.
[8] Jamali M, Lakshmanan L. Heteromf: recommendation in heterogeneous information networks using context dependent factor models[C]//Proceedings of the 22nd international conference on World Wide Web. ACM, 2013: 643-654.
[9] Wu S, Liu Q, Wang L, et al. Contextual operation for recommender systems[J]. IEEE Transactions on Knowledge and Data Engineering, 2016, 28(8): 2000-2012.
[10] Li B, Yang Q, Xue X. Transfer learning for collaborative filtering via a rating-matrix generative model[C]//Proceedings of the 26th annual international conference on machine learning. ACM, 2009: 617-624.
[11] Li Chao, Zhou Tao, Huang Junming, et al. A cross-platform cross recommendation algorithm based on user similarity transfer[J]. Journal of Chinese Information Processing, 2016, 30(2): 90-98.
Disclosure of Invention
The present invention aims to solve the above problems and provide a collaborative filtering method based on comprehensive similarity migration.
The invention realizes the purpose through the following technical scheme:
the invention comprises the following steps:
(1) The recommendation method based on comprehensive similarity migration: suppose there are two platforms $e_1$ and $e_2$. $U_1$ denotes the users who have historical behavior information only on platform $e_1$, $U_2$ denotes the users who have historical behavior information only on platform $e_2$, and $U_c$ denotes the users who have historical behavior information on both $e_1$ and $e_2$, who are defined as cross users. In practice, the number of cross users is much smaller than the number of non-cross users. The cross users are used to establish a similarity relationship between the non-cross users $U_1$ and $U_2$, which helps the target domain make recommendations;
(2) Similarity migration: the similarity between a non-cross user in $U_1$ and a non-cross user in $U_2$ cannot be calculated directly. However, the similarity of each of them to the cross users $U_c$ can be calculated, so the cross users $U_c$ serve as a link for establishing the similarity between users in $U_1$ and users in $U_2$;
Similarity migration steps: first, find the set $U_c$ of users common to platform 1 and platform 2; then calculate the similarities between $U_1$ and $U_c$, recorded as the vector $\vec{S}_{U_1U_c}$, and the similarities between $U_2$ and $U_c$, recorded as the vector $\vec{S}_{U_2U_c}$; finally, calculate the inner product of $\vec{S}_{U_1U_c}$ and $\vec{S}_{U_2U_c}$, which is the transfer similarity of $U_1$ and $U_2$.
Here $U_{11}$ denotes non-cross user 1 in platform 1; $U_{21}$ and $U_{22}$ denote non-cross users 1 and 2 in platform 2; $U_{c1}$, $U_{c2}$, etc. denote cross users; and $S_1$, $S_2$, etc. denote similarities. The similarity between $U_{11}$ and $U_{21}$ can be calculated indirectly, using $U_{c1}$, $U_{c2}$ and $U_{c3}$ as the transition:
$$sim(U_{11},U_{21}) = \sum_{k=1}^{3} sim(U_{11},U_{ck})\,sim(U_{ck},U_{21})$$
In summary, the similarity calculation between $U_1$ and $U_2$ can be formalized as:
$$sim(U_1,U_2) = \vec{S}_{U_1U_c}\cdot\vec{S}_{U_2U_c} = \sum_{U_{ci}\in U_c} sim(U_1,U_{ci})\,sim(U_{ci},U_2)$$
(3) Similarity calculation: before the similarity between non-cross users $U_1$ and $U_2$ can be calculated, the similarity between each of $U_1$, $U_2$ and the cross users must be calculated, as follows:
1) User score similarity
The user score similarity is measured in two respects: score distribution consistency and credibility;
the score distribution consistency is determined by the distribution of the scores that two users give to the same items; the more consistent the score distributions, the more similar the interests of the two users. Let $\{ur_1, ur_2, \ldots, ur_n\}$ and $\{vr_1, vr_2, \ldots, vr_n\}$ be the sets of scores given to the common items by user u and user v respectively. Sorting the two groups of data in increasing order gives $\{ur_1, ur_2, \ldots, ur_n\}$ and $\{vr_{x_1}, vr_{x_2}, \ldots, vr_{x_n}\}$. The better $1, 2, \ldots, n$ matches $x_1, x_2, \ldots, x_n$, the higher the consistency of the two; the calculation formula is shown below;
[formula image: definition of dist(u, v)]
the credibility is determined by the number of identical items evaluated by both users; if that number is small, the two users are not necessarily similar even if their score distributions are consistent; the calculation formula is shown below;
[formula image: definition of conf(u, v)]
where $I_u$ denotes the set of items scored by user u;
the user score similarity is calculated as follows;
$$sim_1(u,v) = dist(u,v)\cdot conf(u,v) \tag{1-4}$$
2) User attribute similarity
The user attribute similarity is measured from the user attributes; users with the same attributes are generally considered to have similar interests to some extent; the calculation formula is shown below;
[formula image: definition of sim2(u, v)]
where n denotes the number of attributes; sim(u, v, i) indicates whether the two users have the same value for the i-th attribute (1 if so, 0 otherwise); and $d_i$ denotes the discrimination of the i-th attribute. If users with a given attribute score all items, that attribute has no discriminating power. The discrimination values are determined separately for each data set;
3) Final similarity
In general, once a user has scored items, the score information should be used as much as possible; when the user has not scored items, the user attribute information should be used as much as possible. As the number of items scored by the user grows, the method should transition smoothly to recommending from score information. A sigmoid function is used for smoothing, and the final user similarity is defined as follows:
$$sim(u,v) = \alpha\,sim_1(u,v) + (1-\alpha)\,sim_2(u,v) \tag{1-6}$$
where $\alpha$ is the sigmoid smoothing weight and $C_{uv}$ denotes the set of items evaluated by both user u and user v. The formula shows that, as the number of items evaluated by the users grows, the similarity calculation transitions smoothly to using score information; this smooth transition can improve prediction accuracy in the cold-start state;
(4) Description of the method:
A) The user similarity calculation method comprises three steps: first, calculate the user attribute similarity from the user attribute information; second, calculate the user score similarity from the user score information; third, calculate the final user similarity from the user attribute similarity and the user score similarity;
B) The recommendation method based on transfer learning comprises four steps: first, calculate the similarity $\vec{S}_{U_1U_c}$ between $U_1$ and $U_c$; second, calculate the similarity $\vec{S}_{U_2U_c}$ between $U_2$ and $U_c$; third, calculate the migration similarity $sim(U_1,U_2)$; fourth, make recommendations using the migration similarity $sim(U_1,U_2)$ in combination with the UCF (user-based collaborative filtering) method.
The invention has the beneficial effects that:
Compared with the prior art, the collaborative filtering method based on comprehensive similarity migration has the following advantages in similarity calculation: it uses both user score information and user attribute information, it takes into account the differences between users' scoring standards for satisfaction, and it measures user score similarity by the consistency of the users' score distributions. This improves the accuracy of the similarity calculation and the quality of the migrated data. Experimental results show that, compared with other methods, the model effectively alleviates the data sparsity problem. In future work, data in the auxiliary domain could also be migrated by jointly considering item similarity or other knowledge, such as text information, which may improve the quality of the migrated data and thereby the recommendation accuracy.
Drawings
FIG. 1 is a diagram of the user scoring matrix of the present invention;
FIG. 2 is a schematic diagram of the similarity migration of the present invention;
FIG. 3 is a comparison of the RMSE values of the different methods for group A;
FIG. 4 is a comparison of the RMSE values of the different methods for group B;
FIG. 5 is a comparison of the RMSE values of the different methods for group C;
FIG. 6 is a comparison of the RMSE values of the different methods for group D;
FIG. 7 is a comparison of the RMSE values of the different methods for group E;
FIG. 8 is a comparison of the RMSE values of the different methods for N = 5;
FIG. 9 is a comparison of the RMSE values of the different methods for N = 10;
FIG. 10 is a comparison of the RMSE values of the different methods for N = 20;
FIG. 11 is a comparison of the RMSE values of the different methods for N = 30;
FIG. 12 is a comparison of the RMSE values of the different methods for N = 40.
Detailed Description
The invention will be further described with reference to the accompanying drawings in which:
The recommendation method based on comprehensive similarity migration:
The invention provides a recommendation method based on comprehensive similarity migration, which uses auxiliary-domain information to alleviate the data sparsity problem in the target domain.
The method is described below using two movie platforms as an example. Suppose there are two platforms $e_1$ and $e_2$. $U_1$ denotes the users who have historical behavior information only on platform $e_1$, $U_2$ denotes the users who have historical behavior information only on platform $e_2$, and $U_c$ denotes the users who have historical behavior information on both $e_1$ and $e_2$, who are defined as cross users. The user behavior matrix is shown in FIG. 1.
In practice, the number of cross users is much smaller than the number of non-cross users.
In the traditional recommendation method, the small proportion of cross users is used to make recommendations for the large proportion of non-cross users, so cold-start and data sparsity problems arise and the recommendation quality is low.
The method of the invention uses the cross users to establish a similarity relationship between the non-cross users $U_1$ and $U_2$, thereby helping the target domain make recommendations.
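The split of users into the three groups described above can be illustrated with plain set operations. This is a minimal sketch, not the patent's implementation; the function name `partition_users` and the sample user ids are hypothetical.

```python
from typing import Set, Tuple


def partition_users(users_e1: Set[str],
                    users_e2: Set[str]) -> Tuple[Set[str], Set[str], Set[str]]:
    """Split the users of two platforms into (U1, U2, Uc).

    users_e1 / users_e2: ids of users with historical behavior on e1 / e2.
    """
    cross = users_e1 & users_e2      # Uc: active on both platforms
    only_e1 = users_e1 - cross       # U1: active only on e1
    only_e2 = users_e2 - cross       # U2: active only on e2
    return only_e1, only_e2, cross


# Hypothetical sample data: "cy" is the only cross user.
u1, u2, uc = partition_users({"ann", "bob", "cy"}, {"cy", "dee"})
print(sorted(u1), sorted(u2), sorted(uc))
```

In realistic data the third set is expected to be much smaller than the other two, which is the situation the method is designed for.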
Similarity migration:
As shown in FIG. 1, the similarity between a non-cross user in $U_1$ and a non-cross user in $U_2$ cannot be calculated directly. However, the similarity of each of them to the cross users $U_c$ can be calculated, so the cross users $U_c$ serve as a link for establishing the similarity between users in $U_1$ and users in $U_2$.
Similarity migration steps: first, find the set $U_c$ of users common to platform 1 and platform 2; then calculate the similarities between $U_1$ and $U_c$, recorded as the vector $\vec{S}_{U_1U_c}$, and the similarities between $U_2$ and $U_c$, recorded as the vector $\vec{S}_{U_2U_c}$; finally, calculate the inner product of $\vec{S}_{U_1U_c}$ and $\vec{S}_{U_2U_c}$, which is the transfer similarity of $U_1$ and $U_2$.
The similarity migration is shown in FIG. 2.
Here $U_{11}$ denotes non-cross user 1 in platform 1; $U_{21}$ and $U_{22}$ denote non-cross users 1 and 2 in platform 2; $U_{c1}$, $U_{c2}$, etc. denote cross users; and $S_1$, $S_2$, etc. denote similarities. The similarity between $U_{11}$ and $U_{21}$ can be calculated indirectly, using $U_{c1}$, $U_{c2}$ and $U_{c3}$ as the transition:
$$sim(U_{11},U_{21}) = \sum_{k=1}^{3} sim(U_{11},U_{ck})\,sim(U_{ck},U_{21})$$
In summary, the similarity calculation between $U_1$ and $U_2$ can be formalized as:
$$sim(U_1,U_2) = \vec{S}_{U_1U_c}\cdot\vec{S}_{U_2U_c} = \sum_{U_{ci}\in U_c} sim(U_1,U_{ci})\,sim(U_{ci},U_2)$$
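The inner-product step of similarity migration can be sketched in a few lines. The example assumes that each non-cross user's similarities to the cross users are stored in a dictionary keyed by cross-user id; the function name `transfer_similarity` and the numeric values are illustrative only.

```python
from typing import Dict


def transfer_similarity(sim_to_cross_a: Dict[str, float],
                        sim_to_cross_b: Dict[str, float]) -> float:
    """Inner product of two similarity vectors indexed by cross-user id.

    A cross user missing from either vector contributes nothing,
    which is equivalent to a similarity of 0 for that user.
    """
    shared = sim_to_cross_a.keys() & sim_to_cross_b.keys()
    return sum(sim_to_cross_a[c] * sim_to_cross_b[c] for c in shared)


# Hypothetical similarities of U11 (platform 1) and U21 (platform 2)
# to the three cross users Uc1, Uc2, Uc3.
s_u11 = {"Uc1": 0.5, "Uc2": 0.2, "Uc3": 0.0}
s_u21 = {"Uc1": 0.4, "Uc2": 0.5, "Uc3": 0.9}
print(transfer_similarity(s_u11, s_u21))
```

Because the result is a sum of products, a pair of users obtains a high migrated similarity only when both are similar to the same cross users.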
Similarity calculation:
Based on the above analysis, before the similarity between non-cross users $U_1$ and $U_2$ can be calculated, the similarity between each of $U_1$, $U_2$ and the cross users must be calculated, as follows:
1) User score similarity
The method measures user score similarity in two respects: score distribution consistency and credibility.
The score distribution consistency is determined by the distribution of the scores that two users give to the same items. The more consistent the score distributions, the more similar the interests of the two users. Let $\{ur_1, ur_2, \ldots, ur_n\}$ and $\{vr_1, vr_2, \ldots, vr_n\}$ be the sets of scores given to the common items by user u and user v respectively. Sorting the two groups of data in increasing order gives $\{ur_1, ur_2, \ldots, ur_n\}$ and $\{vr_{x_1}, vr_{x_2}, \ldots, vr_{x_n}\}$. The better $1, 2, \ldots, n$ matches $x_1, x_2, \ldots, x_n$, the higher the consistency of the two. The calculation formula is as follows.
[formula image: definition of dist(u, v)]
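The formula for dist(u, v) is not reproduced in this text, so the following sketch substitutes one possible measure of how well the two orderings match: Spearman's rank correlation, rescaled to the range [0, 1]. That choice is an assumption, as are the function name `rank_consistency` and the sample scores; ties are broken by item id rather than with averaged ranks.

```python
from typing import Dict


def rank_consistency(ratings_u: Dict[str, float],
                     ratings_v: Dict[str, float]) -> float:
    """Rank agreement of two users on their common items, in [0, 1].

    ASSUMPTION: rescaled Spearman correlation, standing in for the
    patent's dist(u, v), whose formula is not available here.
    """
    common = sorted(ratings_u.keys() & ratings_v.keys())
    n = len(common)
    if n < 2:
        return 0.0  # too few common items to compare orderings
    # Order the common items by each user's scores (stable sort).
    order_u = sorted(common, key=lambda item: ratings_u[item])
    order_v = sorted(common, key=lambda item: ratings_v[item])
    pos_v = {item: rank for rank, item in enumerate(order_v, start=1)}
    # x[k-1] is the rank in v's ordering of the item ranked k by u.
    x = [pos_v[item] for item in order_u]
    d2 = sum((k - xk) ** 2 for k, xk in enumerate(x, start=1))
    rho = 1.0 - 6.0 * d2 / (n * (n * n - 1))
    return (rho + 1.0) / 2.0


# A strict rater and a lenient rater who rank the films identically.
strict = {"m1": 1, "m2": 3, "m3": 4}
lenient = {"m1": 3, "m2": 4, "m3": 5}
print(rank_consistency(strict, lenient))
```

The two sample users never give the same score to any film, yet their orderings agree completely. A rank-based measure therefore treats them as consistent, which is the kind of scoring-standard difference the method aims to tolerate.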
The credibility is determined by the number of identical items evaluated by both users; if that number is small, the two users are not necessarily similar even if their score distributions are consistent. The calculation formula is as follows.
[formula image: definition of conf(u, v)]
Here $I_u$ denotes the set of items scored by user u.
The user score similarity is calculated as follows.
$$sim_1(u,v) = dist(u,v)\cdot conf(u,v) \tag{1-4}$$
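Since the conf(u, v) formula is not reproduced here, the sketch below assumes a Jaccard-style overlap of the two users' scored-item sets $I_u$ and $I_v$; the patent's actual definition may differ. Only the product in formula (1-4) is taken from the text. The function names and sample item ids are hypothetical.

```python
from typing import Set


def confidence(items_u: Set[str], items_v: Set[str]) -> float:
    """Credibility from the overlap of scored-item sets.

    ASSUMPTION: |Iu & Iv| / |Iu | Iv| (Jaccard overlap), used only
    as a stand-in for the patent's conf(u, v).
    """
    union = items_u | items_v
    if not union:
        return 0.0
    return len(items_u & items_v) / len(union)


def score_similarity(dist_uv: float, conf_uv: float) -> float:
    """Formula (1-4): sim1(u, v) = dist(u, v) * conf(u, v)."""
    return dist_uv * conf_uv


conf_uv = confidence({"m1", "m2", "m3"}, {"m2", "m3", "m4"})
print(conf_uv, score_similarity(1.0, conf_uv))
```

With this stand-in, two users whose orderings agree perfectly but who share few items still receive a low score similarity, matching the stated purpose of the credibility term.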
2) User attribute similarity
User attribute similarity is measured from the user attributes. Users with the same attributes are generally considered to have somewhat similar interests. The calculation formula is as follows.
[formula image: definition of sim2(u, v)]
Here n denotes the number of attributes; sim(u, v, i) indicates whether the two users have the same value for the i-th attribute (1 if so, 0 otherwise); and $d_i$ denotes the discrimination of the i-th attribute. If users with a given attribute score all items, that attribute has no discriminating power. The discrimination values are determined separately for each data set.
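The attribute formula is likewise missing from this text. The sketch below assumes a discrimination-weighted share of matching attributes, $\sum_i d_i\,sim(u,v,i) / \sum_i d_i$; the normalization, the attribute names and the weights are all illustrative assumptions.

```python
from typing import Dict


def attribute_similarity(attrs_u: Dict[str, str],
                         attrs_v: Dict[str, str],
                         discrimination: Dict[str, float]) -> float:
    """Weighted share of attributes on which two users agree.

    ASSUMPTION: sum(d_i * sim(u, v, i)) / sum(d_i); the patent's
    exact formula is not reproduced in the text.
    """
    total = sum(discrimination.values())
    if total == 0:
        return 0.0
    matched = sum(
        d for name, d in discrimination.items()
        # sim(u, v, i) is 1 when both users share the same value
        if name in attrs_u and attrs_u[name] == attrs_v.get(name)
    )
    return matched / total


# Hypothetical attributes and per-attribute discrimination values d_i.
weights = {"gender": 0.2, "age": 0.5, "occupation": 0.3}
user_a = {"gender": "F", "age": "25-34", "occupation": "student"}
user_b = {"gender": "F", "age": "35-44", "occupation": "student"}
print(attribute_similarity(user_a, user_b, weights))
```

An attribute with discrimination 0 contributes nothing in this form, which is consistent with the statement that an attribute whose users score every item does not distinguish users.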
3) Final similarity
In general, once a user has scored items, the score information should be used as much as possible; when the user has not scored items, the user attribute information should be used as much as possible. As the number of items scored by the user grows, the method should transition smoothly to recommending from score information. A sigmoid function is used for smoothing, and the final user similarity is defined as follows:
$$sim(u,v) = \alpha\,sim_1(u,v) + (1-\alpha)\,sim_2(u,v) \tag{1-6}$$
[formula image: definition of α]
Here $C_{uv}$ denotes the set of items evaluated by both user u and user v. As the number of items evaluated by the users grows, the similarity calculation transitions smoothly to using score information; this smooth transition can improve prediction accuracy in the cold-start state.
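The blending in formula (1-6) can be sketched as follows. The exact form of α is not reproduced in this text, so the example assumes a logistic function of the number of commonly evaluated items $|C_{uv}|$; the `midpoint` and `steepness` parameters are hypothetical tuning values, not taken from the patent.

```python
import math


def blend_weight(n_common: int,
                 midpoint: float = 5.0,
                 steepness: float = 1.0) -> float:
    """Sigmoid weight alpha as a function of |Cuv|.

    ASSUMPTION: a shifted logistic curve; midpoint and steepness
    are illustrative parameters only.
    """
    return 1.0 / (1.0 + math.exp(-steepness * (n_common - midpoint)))


def final_similarity(sim1: float, sim2: float, n_common: int) -> float:
    """Formula (1-6): alpha * sim1 + (1 - alpha) * sim2."""
    alpha = blend_weight(n_common)
    return alpha * sim1 + (1.0 - alpha) * sim2


# Few common items: the attribute similarity (sim2) dominates.
# Many common items: the score similarity (sim1) dominates.
for n in (0, 5, 20):
    print(n, round(final_similarity(0.8, 0.4, n), 3))
```

Under this assumption the weight moves continuously from near 0 to near 1, so a new user with no scores is matched almost entirely on attributes, which is the cold-start behavior described above.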
Description of the method:
1) The user similarity calculation method comprises three steps: first, calculate the user attribute similarity from the user attribute information; second, calculate the user score similarity from the user score information; third, calculate the final user similarity from the user attribute similarity and the user score similarity.
2) The recommendation method based on transfer learning comprises four steps: first, calculate the similarity $\vec{S}_{U_1U_c}$ between $U_1$ and $U_c$; second, calculate the similarity $\vec{S}_{U_2U_c}$ between $U_2$ and $U_c$; third, calculate the migration similarity $sim(U_1,U_2)$; fourth, make recommendations using the migration similarity $sim(U_1,U_2)$ in combination with the UCF (user-based collaborative filtering) method.
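The fourth step can be illustrated with a plain similarity-weighted average over the N nearest neighbors, which is one common form of user-based collaborative filtering. The text does not give the patent's prediction formula, so this form is an assumption, as are the function name `predict_score` and the sample data.

```python
from typing import Dict, Optional


def predict_score(migrated_sim: Dict[str, float],
                  ratings: Dict[str, Dict[str, float]],
                  item: str,
                  n_neighbors: int = 5) -> Optional[float]:
    """Predict a target user's score for `item`.

    migrated_sim: migrated similarity of the target user to each
        candidate neighbor on the other platform.
    ratings: neighbor id -> {item id -> score}.
    ASSUMPTION: similarity-weighted mean over the top-N neighbors.
    """
    candidates = [(s, v) for v, s in migrated_sim.items()
                  if s > 0 and item in ratings.get(v, {})]
    top = sorted(candidates, reverse=True)[:n_neighbors]
    weight = sum(s for s, _ in top)
    if weight == 0:
        return None  # no usable neighbor has scored the item
    return sum(s * ratings[v][item] for s, v in top) / weight


# Hypothetical migrated similarities and neighbor scores.
sims = {"v1": 0.6, "v2": 0.2, "v3": 0.9}
scores = {"v1": {"m1": 4.0}, "v2": {"m1": 2.0}, "v3": {"m2": 5.0}}
print(predict_score(sims, scores, "m1", n_neighbors=2))
```

Returning `None` when no neighbor has scored the item keeps the sketch from inventing a score; a real system would need a fallback such as a global or per-item mean.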
Experiment:
Experimental data:
The experiment used a data set from the MovieLens movie website. The data set is described as follows.
Table 1 MovieLens data description
[table image: MovieLens data description]
The partitioning of the experimental data set is shown below.
Table 5 Data set partitioning
[table image: data set partitioning]
Evaluation index:
To measure the prediction accuracy of the method, the experiment uses the root mean square error (RMSE) to quantify the difference between the predicted scores obtained by the method and the users' real scores.
The RMSE is calculated as follows:
$$RMSE = \sqrt{\frac{1}{|T|}\sum_{(u,i)\in T}\left(r_{ui} - pre_{ui}\right)^2}$$
where $r_{ui}$ denotes the real score of user u for item i, $pre_{ui}$ denotes the predicted score of user u for item i, T is the test set, and $|T|$ is the size of the test set. The smaller the RMSE, the closer the predicted values are to the actual values and the more accurate the prediction.
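The RMSE defined above is straightforward to compute. The sketch below assumes the test set is given as pairs of (real score, predicted score); the function name `rmse` and the sample values are illustrative.

```python
import math
from typing import Iterable, Tuple


def rmse(test_set: Iterable[Tuple[float, float]]) -> float:
    """Root mean square error over (real score, predicted score) pairs."""
    pairs = list(test_set)
    if not pairs:
        raise ValueError("test set is empty")
    squared_error = sum((real - pred) ** 2 for real, pred in pairs)
    return math.sqrt(squared_error / len(pairs))


# Two test scores, each predicted 2 points too low.
print(rmse([(3.0, 1.0), (5.0, 3.0)]))
```

Because the errors are squared before averaging, a few large misses raise the RMSE more than many small ones.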
Comparison methods:
1) The UCF method: recommendations can be made only with cross users.
2) The recommendation method based on user similarity transfer (TSUCF): the score information of the small proportion of cross users is used as a link to connect the users of two different e-commerce platforms, thereby producing recommendations.
3) The method of the invention: it improves on the TSUCF method in two ways. First, it makes full use of user attribute information; second, it accounts for the differences between users' scoring standards by measuring user score similarity through the consistency of the score distributions on common items.
Experimental results:
Because the number of nearest neighbors N affects the results, the methods were compared with the number of nearest neighbors set to 5, 10, 20, 30 and 40.
As FIGS. 3 to 7 show, the method of the present invention achieves a better recommendation effect for the different numbers of nearest neighbors.
Because the number of cross users also affects the results, the methods were compared with the number of cross users set to 95, 189, 283, 377 and 471.
As FIGS. 8 to 12 show, the method of the present invention achieves the best recommendation effect for the different numbers of cross users.
The method migrates data from the auxiliary domain by using user attribute similarity and user score similarity, so as to alleviate the data sparsity problem in the target domain. In future work, data in the auxiliary domain could also be migrated by jointly considering item similarity or other knowledge, such as text information, which may improve the quality of the migrated data and thereby the recommendation accuracy.
The foregoing shows and describes the general principles and features of the present invention, together with the advantages thereof. It will be understood by those skilled in the art that the present invention is not limited to the embodiments described above, which are described in the specification and illustrated only to illustrate the principle of the present invention, but that various changes and modifications may be made therein without departing from the spirit and scope of the present invention, which fall within the scope of the invention as claimed. The scope of the invention is defined by the appended claims and equivalents thereof.

Claims (1)

1. A collaborative filtering method based on comprehensive similarity migration, characterized by comprising the following steps:
(1) the recommendation method based on comprehensive similarity migration: there are two platforms $e_1$ and $e_2$; $U_1$ denotes the users who have historical behavior information only on platform $e_1$, $U_2$ denotes the users who have historical behavior information only on platform $e_2$, and $U_c$ denotes the users who have historical behavior information on both $e_1$ and $e_2$, who are defined as cross users; in practice, the number of cross users is much smaller than the number of non-cross users; the cross users are used to establish a similarity relationship between the non-cross users $U_1$ and $U_2$, which helps the target domain make recommendations;
(2) similarity migration: the similarity between a non-cross user in $U_1$ and a non-cross user in $U_2$ cannot be calculated directly; however, the similarity of each of them to the cross users $U_c$ can be calculated, so the cross users $U_c$ serve as a link for establishing the similarity between users in $U_1$ and users in $U_2$;
the similarity migration steps: first, find the set $U_c$ of users common to platform $e_1$ and platform $e_2$; then calculate the similarities between $U_1$ and $U_c$, recorded as the vector $\vec{S}_{U_1U_c}$, and the similarities between $U_2$ and $U_c$, recorded as the vector $\vec{S}_{U_2U_c}$; finally, calculate the inner product of $\vec{S}_{U_1U_c}$ and $\vec{S}_{U_2U_c}$, which is the transfer similarity of $U_1$ and $U_2$;
wherein $U_{11}$ denotes non-cross user 1 in platform $e_1$; $U_{21}$ and $U_{22}$ denote non-cross users 1 and 2 in platform $e_2$; $U_{c1}$, $U_{c2}$, etc. denote cross users; and $S_1$, $S_2$, etc. denote similarities; the similarity between $U_{11}$ and $U_{21}$ can be calculated indirectly, using $U_{c1}$, $U_{c2}$ and $U_{c3}$ as the transition:
$$sim(U_{11},U_{21}) = \sum_{k=1}^{3} sim(U_{11},U_{ck})\,sim(U_{ck},U_{21})$$
in summary, the similarity calculation between $U_1$ and $U_2$ can be formalized as:
$$sim(U_1,U_2) = \vec{S}_{U_1U_c}\cdot\vec{S}_{U_2U_c} = \sum_{U_{ci}\in U_c} sim(U_1,U_{ci})\,sim(U_{ci},U_2)$$
(3) and (3) similarity calculation: calculating non-cross user U1And U2Before the similarity, non-cross user U needs to be calculated1、U2The similarity with the cross users respectively is calculated as follows:
1) similarity of user scores
Measuring the similarity of user scores through two aspects of score distribution consistency and credibility;
The consistency of the score distribution refers to how consistently two users distribute their scores over the same items; the more consistent the score distributions, the more similar the interests of the two users. Let {ur_1, ur_2, ..., ur_n} and
Figure FDA0002324416620000017
be the score sets of user u and user v, respectively, on their common items, and sort both groups of data in increasing order, giving {ur_1, ur_2, ..., ur_n} and
Figure FDA0002324416620000021
The better 1, 2, ..., n matches x_1, x_2, ..., x_n, the higher the consistency of the two. The calculation formula is as follows:
Figure FDA0002324416620000022
The credibility is determined by the number of items that both users have evaluated; if that number is small, the two users are not considered similar even if their score distributions are consistent. The calculation formula is as follows:
Figure FDA0002324416620000023
wherein I_u denotes the set of items scored by user u;
The user score similarity is calculated as follows:
sim1(u,v) = dist(u,v) · conf(u,v) (1-4)
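The structure of formula (1-4) can be illustrated with the sketch below. The exact dist and conf formulas are not recoverable from this text, so both functions here are stand-ins chosen for illustration only: `dist` compares the rank order of the common items using a normalized Spearman footrule distance, and `conf` uses the Jaccard overlap of the two users' scored item sets. Only the final product sim1 = dist · conf follows the text directly; all names and data are hypothetical.

```python
from typing import Dict


def dist(scores_u: Dict[str, float], scores_v: Dict[str, float]) -> float:
    """Stand-in consistency measure: 1 minus the normalized Spearman footrule
    distance between the two users' rank orders of their common items."""
    common = sorted(scores_u.keys() & scores_v.keys())
    n = len(common)
    if n == 0:
        return 0.0
    if n == 1:
        return 1.0
    # sorted() is stable, so tied scores keep item-id order.
    order_u = sorted(common, key=lambda item: scores_u[item])
    order_v = sorted(common, key=lambda item: scores_v[item])
    pos_v = {item: p for p, item in enumerate(order_v)}
    footrule = sum(abs(p - pos_v[item]) for p, item in enumerate(order_u))
    max_footrule = (n * n) // 2  # largest possible footrule distance for n items
    return 1.0 - footrule / max_footrule


def conf(scores_u: Dict[str, float], scores_v: Dict[str, float]) -> float:
    """Stand-in credibility: Jaccard overlap of the scored item sets I_u and I_v."""
    union = scores_u.keys() | scores_v.keys()
    if not union:
        return 0.0
    return len(scores_u.keys() & scores_v.keys()) / len(union)


def sim1(scores_u: Dict[str, float], scores_v: Dict[str, float]) -> float:
    """User score similarity as the product in formula (1-4)."""
    return dist(scores_u, scores_v) * conf(scores_u, scores_v)


# Hypothetical scores: three common items ranked in the same order.
u = {"a": 1, "b": 3, "c": 5}
v = {"a": 2, "b": 4, "c": 5, "d": 1}
print(sim1(u, v))
```

The product form means a pair of users with perfectly matching rank order but few common items still receives a low score, which is the behavior the credibility term is described as providing.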
2) User attribute similarity
User attribute similarity is measured from the users' attributes; users with the same attributes have similar interests to some extent. The calculation formula is as follows:
wherein n denotes the number of attributes; sim(u, v, i) indicates whether the two users have the same value for the i-th attribute, taking the value 1 if they do and 0 otherwise; and d_i denotes the discrimination degree of the i-th attribute. If users with a given attribute value have scored all items, that attribute provides no discrimination; the values of the discrimination degrees are determined by the particular data set;
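A small sketch may help make the attribute similarity concrete. The formula itself is missing from this text, so the normalization below (dividing the weighted matches by the sum of the discrimination degrees d_i) is an assumption, as are the attribute names and weights. What follows the text is the per-attribute indicator sim(u, v, i), which is 1 when both users share the same value and 0 otherwise, weighted by d_i.

```python
from typing import Dict


def attribute_similarity(attrs_u: Dict[str, str],
                         attrs_v: Dict[str, str],
                         discrimination: Dict[str, float]) -> float:
    """Weighted share of attributes on which the two users agree.

    Assumption: the result is normalized by the sum of the weights d_i.
    """
    total = sum(discrimination.values())
    if total == 0:
        return 0.0
    matched = sum(
        d for name, d in discrimination.items()
        # sim(u, v, i) = 1 only if both users have the attribute and the values match
        if name in attrs_u and attrs_u[name] == attrs_v.get(name)
    )
    return matched / total


# Hypothetical discrimination degrees and user profiles.
d = {"gender": 0.5, "age_group": 1.0, "occupation": 0.5}
user_u = {"gender": "F", "age_group": "25-34", "occupation": "student"}
user_v = {"gender": "F", "age_group": "25-34", "occupation": "engineer"}

print(attribute_similarity(user_u, user_v, d))
```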
3) Final similarity
Once a user has scored items, the user's score information is used; when the user has not scored items, the user attribute information is used. As the number of items scored by the user increases, the method transitions smoothly to recommending with score information. The smoothing uses a sigmoid function, and the final user similarity is defined as follows:
sim(u,v) = α·sim1(u,v) + (1-α)·sim2(u,v) (1-6)
Figure FDA0002324416620000031
wherein C_uv denotes the set of items evaluated by both user u and user v. As the number of items evaluated by the users increases, the user similarity calculation transitions smoothly to using score information, and this smooth transition can improve prediction accuracy in the cold-start state.
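The blending in formula (1-6) can be sketched as follows. The exact expression for α survives here only as an image reference, so the logistic form below, with its hypothetical `midpoint` and `steepness` parameters, is an assumption; the text states only that α is obtained from a sigmoid function and depends on the common item set C_uv.

```python
import math


def alpha(num_common: int, midpoint: float = 5.0, steepness: float = 1.0) -> float:
    """Assumed sigmoid weight: near 0 with few common items, near 1 with many.

    midpoint and steepness are hypothetical tuning parameters.
    """
    return 1.0 / (1.0 + math.exp(-steepness * (num_common - midpoint)))


def final_similarity(sim1: float, sim2: float, num_common: int) -> float:
    """Formula (1-6): blend score similarity (sim1) and attribute similarity (sim2)."""
    a = alpha(num_common)
    return a * sim1 + (1.0 - a) * sim2


# With no common items the result is dominated by attribute similarity;
# with many common items it is dominated by score similarity.
for size_of_c_uv in (0, 5, 20):
    print(size_of_c_uv, final_similarity(0.8, 0.4, size_of_c_uv))
```

Under this assumed form the weight never reaches exactly 0 or 1, so a user with no common scores still receives a very small contribution from sim1; a hard switch at zero common items would need an explicit special case.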
CN201810050004.9A 2018-01-18 2018-01-18 Collaborative filtering method based on comprehensive similarity migration Active CN108269172B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810050004.9A CN108269172B (en) 2018-01-18 2018-01-18 Collaborative filtering method based on comprehensive similarity migration

Publications (2)

Publication Number Publication Date
CN108269172A CN108269172A (en) 2018-07-10
CN108269172B true CN108269172B (en) 2020-02-18

Family

ID=62776114

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810050004.9A Active CN108269172B (en) 2018-01-18 2018-01-18 Collaborative filtering method based on comprehensive similarity migration

Country Status (1)

Country Link
CN (1) CN108269172B (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109977304A (en) * 2019-03-14 2019-07-05 四川长虹电器股份有限公司 A kind of TV programme suggesting method based on the migration of point of interest similarity
CN110968675B (en) * 2019-12-05 2023-03-31 北京工业大学 Recommendation method and system based on multi-field semantic fusion
CN112532627B (en) * 2020-11-27 2022-03-29 平安科技(深圳)有限公司 Cold start recommendation method and device, computer equipment and storage medium

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR20130134046A (en) * 2012-05-30 2013-12-10 전북대학교산학협력단 Cosine similarity based expert recommendation technique using hybrid collaborative filtering
CN106021298A (en) * 2016-05-03 2016-10-12 广东工业大学 Asymmetrical weighing similarity based collaborative filtering recommendation method and system
CN106708953A (en) * 2016-11-28 2017-05-24 西安电子科技大学 Discrete particle swarm optimization based local community detection collaborative filtering recommendation method
CN107329994A (en) * 2017-06-08 2017-11-07 天津大学 A kind of improvement collaborative filtering recommending method based on user characteristics

Also Published As

Publication number Publication date
CN108269172A (en) 2018-07-10

Similar Documents

Publication Publication Date Title
CN112765486B (en) Knowledge graph fused attention mechanism movie recommendation method
CN111797321B (en) Personalized knowledge recommendation method and system for different scenes
Wu et al. Hierarchical personalized federated learning for user modeling
CN112364976B (en) User preference prediction method based on session recommendation system
CN104935963B (en) A kind of video recommendation method based on timing driving
CN108876537B (en) Mixed recommendation method for online marketplace system
US7949643B2 (en) Method and apparatus for rating user generated content in search results
CN109543840B (en) Dynamic recommendation system design method based on multidimensional classification reinforcement learning
CN108269172B (en) Collaborative filtering method based on comprehensive similarity migration
CN109711925A (en) Cross-domain recommending data processing method, cross-domain recommender system with multiple auxiliary domains
CN110033097B (en) Method and device for determining association relation between user and article based on multiple data fields
CN106294859A (en) A kind of item recommendation method decomposed based on attribute coupling matrix
WO2023284516A1 (en) Information recommendation method and apparatus based on knowledge graph, and device, medium, and product
CN111324807A (en) Collaborative filtering recommendation method based on trust degree
CN111125540A (en) Recommendation method integrating knowledge graph representation learning and bias matrix decomposition
WO2020119017A1 (en) System and method for achieving data asset sensing and pricing functions in big data background
CN115358809A (en) Multi-intention recommendation method and device based on graph comparison learning
CN115840853A (en) Course recommendation system based on knowledge graph and attention network
CN117556148B (en) Personalized cross-domain recommendation method based on network data driving
CN113836393B (en) Cold start recommendation method based on preference self-adaptive meta-learning
CN118071400A (en) Application method and system based on graph computing technology in information consumption field
Lei et al. Personalized Item Recommendation Algorithm for Outdoor Sports
CN116521996A (en) Multi-behavior recommendation method and system based on knowledge graph and graph convolution neural network
CN110851707B (en) Intelligent recommendation method for building material bidding platform
CN111460318B (en) Collaborative filtering recommendation method based on explicit and implicit trusts

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant