CN109840833B - Bayesian collaborative filtering recommendation method - Google Patents

Bayesian collaborative filtering recommendation method

Info

Publication number: CN109840833B
Application number: CN201910112719.7A
Other versions: CN109840833A (Chinese)
Authority: CN (China)
Prior art keywords: matrix, user, probability, distribution, representing
Legal status: Active
Inventors: 王邦军, 戴欣, 李凡长, 张莉
Original assignee: Suzhou University
Current assignee: Weihai Bohua Medical Equipment Co., Ltd.
Application filed by Suzhou University on 2019-02-13; priority to CN201910112719.7A (2019-02-13)
Publication of CN109840833A: 2019-06-04; grant of CN109840833B: 2020-11-10

Abstract

The invention discloses a Bayesian collaborative filtering recommendation method comprising the following steps. The input of the model is the scoring matrix of the collaborative filtering recommendation system, $R \in \mathbb{R}^{M \times N}$, which is decomposed into two latent matrices $U \in \mathbb{R}_{+}^{M \times K}$ and $V \in \mathbb{R}_{+}^{N \times K}$, where for the $M \times K$ matrix $U$, $U_{ik}$ represents the probability that user $i$ belongs to group $k$, $U_{ik} \in (0,1)$, and for the $N \times K$ matrix $V$, $V_{jk}$ represents the evidence that user group $k$ likes item $j$; the prediction score matrix is $R = UV^{\mathsf T}$. Since the data set $R$ is sparse, the observed entries are represented by the set $\Omega = \{(i,j) \mid R_{ij} \text{ is observed}\}$. A probabilistic approach is taken to this problem: a likelihood function is defined for the observed data and the latent matrices are treated as random variables. Each value of $R$ is assumed to come from the product of $U$ and $V$ with some added Gaussian noise $E_{ij} \sim \mathcal{N}(0, \tau^{-1})$. The beneficial effects of the invention are as follows: users' tastes are diverse and cannot be reflected as consistently as in a small data set; a large amount of data is missing in real data sets, and when evidence is insufficient and values are hard to predict, predicting them as medians or averages loses the significance of the recommendation.

Description

Bayesian collaborative filtering recommendation method
Technical Field
The invention relates to the field of the Internet, in particular to a Bayesian collaborative filtering recommendation method.
Background
With the emergence and popularization of the Internet, a large amount of data can be obtained easily, but this abundance makes it difficult for users to extract effective information when searching, which reduces the utilization rate of the information. A method that effectively solves this information-overload problem with a recommendation system is therefore very important: the recommendation system recommends content according to the user's requests, hobbies, and so on. At present, recommendation systems are widely applied in many fields such as movies, music, shopping, social networks, and books. Among them, the collaborative filtering recommendation algorithm is the most widely applied and one of the most effective personalized recommendation technologies. Collaborative filtering is mainly divided into two categories, memory-based methods and model-based methods:
1. Memory-based methods: these are mainly divided into user-based collaborative filtering and item-based collaborative filtering, and generally make predictions from the similar users (or similar items) of a target user (or target item).
2. Model-based methods: a model is used to predict the user's score, i.e. the model's parameters are first trained; once training is finished, a model-based recommendation system can predict a user's preference very quickly. Therefore, when there are large numbers of users and items, model-based recommendation scales well and predicts quickly. Model-based methods mainly include decision trees, rule-based models, Bayesian methods, and latent factor models.
Although memory-based methods provide reliable suggestions, for keyword information a user has never touched and for new users for whom nothing can yet be recommended, a large number of ratings is needed to make reliable predictions, and real-world data is often very sparse. For this reason, many scholars have studied model-based recommendation systems more intensively, proposed improved methods, and achieved some results. For example, an incremental recommendation algorithm based on probabilistic latent semantic analysis (PLSA) for automatic question recommendation has been used on question-and-answer websites, improving the insufficient real-time performance of the recommendation system; a recommendation algorithm based on the latent Dirichlet allocation (LDA) topic model has been used on blogs, improving the recommendation precision for users; and a recommendation algorithm combining a clustering algorithm with the SVD algorithm has been used in an e-commerce recommendation system, effectively alleviating the data-sparsity problem.
The traditional technology has the following technical problems:
in existing models, matrix factorization techniques achieve high accuracy on sparse matrices. In the classical matrix factorization model, the scores of the user set on the item set are expressed as a score matrix $R_{m \times n}$, where $r_{ij}$ represents user $u_i$'s rating of item $v_j$. The matrix is decomposed into two matrices, a $U_{m \times k}$ associated with the users and a $V_{k \times n}$ associated with the items, so that their product approximates the original matrix: $R_{m \times n} \approx U_{m \times k} \times V_{k \times n}$. The core idea is to connect users and items through implicit characteristics and to fill the missing entries with a dimension-reduction method, which has proved superior to the traditional nearest-neighbor techniques among recommendation algorithms. In this method, each item is not rigidly assigned to a category; instead, the weight of the item in each category is determined by counting user behavior: if the users who like a certain category all like a certain item, the weight of that item in that category will be higher. Applying classical matrix factorization to collaborative filtering handles the problem of excessively sparse data well.
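As a point of reference for this classical factorization (a minimal sketch for background only, not the method of the invention; the latent dimension, learning rate, and regularization are assumed values), the squared error over the observed entries can be minimized by stochastic gradient descent:

```python
import numpy as np

def classical_mf(R, mask, k=3, lr=0.01, reg=0.02, epochs=200, seed=0):
    """Factorize R (m x n) into U (m x k) and V (n x k) so that R ~= U @ V.T,
    fitting only the observed entries indicated by the 0/1 matrix `mask`."""
    rng = np.random.default_rng(seed)
    m, n = R.shape
    U, V = rng.random((m, k)), rng.random((n, k))
    rows, cols = np.nonzero(mask)
    for _ in range(epochs):
        for i, j in zip(rows, cols):
            err = R[i, j] - U[i] @ V[j]              # residual on one observed entry
            U[i] += lr * (err * V[j] - reg * U[i])   # gradient step with L2 penalty
            V[j] += lr * (err * U[i] - reg * V[j])
    return U, V
```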
Classical matrix factorization has two major drawbacks. One is that the factor matrices are not constrained to be non-negative, which makes the predictive meaning of each component difficult to understand, so the recommendation system has no interpretability. The other is that general matrix factorization is non-probabilistic: the solution simply minimizes the error between the original matrix and the approximating matrix, i.e.:

$$\min_{U,V}\ \left\| R_{m \times n} - U_{m \times k} V_{k \times n} \right\|_F^2 .$$

This approach is prone to overfitting and neglects uncertainty, resulting in poor recommendations.
Model-based recommendation systems also have inherent disadvantages: when items are new or rarely rated, a single-model approach can hardly provide sufficient evidence, which greatly affects recommendation quality.
Disclosure of Invention
The technical problem to be solved by the invention is to provide a Bayesian collaborative filtering recommendation method. In an unconstrained decomposition matrix, negative entries make the semantics of the prediction result difficult to explain; non-negative matrix factorization solves this problem well by constraining the elements to be non-negative, decomposing the data set into meaningful non-negative matrices. A Bayesian probabilistic approach is incorporated into non-negative matrix factorization; the difference is that the two smaller matrices are treated as random variables, prior distributions are placed on them, and the posterior distribution of their values is found from the observed data, which greatly reduces overfitting and saves convergence time. Therefore, using the hidden association relations among the scoring data, users, and items in the scoring matrix, a non-negative matrix factorization algorithm based on a variational Bayesian probability model is combined with improved naive Bayesian classification, and a hidden Bayesian probability model (HBPM) recommendation algorithm is proposed. Hidden user groups are obtained from variational Bayesian non-negative matrix factorization (BNMF) and an initial prediction of the missing values is made; on this basis, improved naive Bayes is used for further correction to generate the recommendation result. The method fully considers the multiple hidden relations between users, between users and items, and between items, alleviates the item cold-start problem, and improves the accuracy of the prediction results. Experimental results show that the algorithm significantly improves recommendation quality.
In order to solve the technical problem, the invention provides a Bayesian collaborative filtering recommendation method, which comprises the following steps:
the input of the model is the scoring matrix of the collaborative filtering recommendation system, $R \in \mathbb{R}^{M \times N}$, which is decomposed into two latent matrices $U \in \mathbb{R}_{+}^{M \times K}$ and $V \in \mathbb{R}_{+}^{N \times K}$, where for the $M \times K$ matrix $U$, $U_{ik}$ represents the probability that user $i$ belongs to group $k$, $U_{ik} \in (0,1)$; for the $N \times K$ matrix $V$, $V_{jk}$ represents the evidence that user group $k$ likes item $j$, i.e. the prediction score matrix is $R = UV^{\mathsf T}$; since the data set $R$ is sparse, the observed entries are represented by the set $\Omega = \{(i,j) \mid R_{ij} \text{ is observed}\}$; a probabilistic approach is taken to this problem: a likelihood function is defined for the observed data and the latent matrices are treated as random variables; each value of $R$ is assumed to come from the product of $U$ and $V$ with some added Gaussian noise $E$, namely:

$$R = UV^{\mathsf T} + E, \qquad E_{ij} \sim \mathcal{N}\left(0, \tau^{-1}\right),$$
where $U_i$, $V_j$ denote the $i$-th and $j$-th rows of $U$ and $V$, and $R_{ij}$ obeys a Gaussian distribution with precision $\tau$; the parameter set of the model is denoted $\theta = \{U, V, \tau\}$;
according to Bayes' theorem, given the observed dataset $D = \{R_{ij}\}_{(i,j)\in\Omega}$ and a prior, a posterior distribution is found for the parameter $\theta$:
P(θ|D)∝P(D|θ)P(θ),
the posterior P (θ | D) is usually not calculated accurately, but a good approximation can be obtained by choosing a suitable a priori; in order to make the decomposed matrix values have interpretable meanings, U, V are constrained to be non-negative; the users and the users, and the commodities are independent from each other, so the indexes are selected from U and V in advance, so that each element in U and V is assumed to be independent index distribution and the speed parameter
Figure BDA0001968828400000041
Can be constrained to be non-negative at the same time; namely:
Figure BDA0001968828400000042
for the precision $\tau$, a Gamma prior with hyperparameters $\alpha_\tau, \beta_\tau > 0$ is used, i.e.:

$$\tau \sim \operatorname{Gamma}\left(\alpha_\tau, \beta_\tau\right);$$
the posterior $P(\theta \mid D)$ is approximated by a distribution $q(\theta)$ in variational Bayes; following the mean-field principle, the variational distribution $q(\theta)$ is assumed to factorize fully, so all variables are independent in the approximate posterior, i.e.:

$$q(\theta) = q(\tau)\prod_{i=1}^{M}\prod_{k=1}^{K} q\left(U_{ik}\right)\prod_{j=1}^{N}\prod_{k=1}^{K} q\left(V_{jk}\right);$$
the following conditional distributions are obtained using Bayes' theorem:

$$p(\tau \mid D, U, V) = \operatorname{Gamma}\!\left(\tau \,\Big|\, \alpha_\tau + \tfrac{1}{2}\lvert\Omega\rvert,\; \beta_\tau + \tfrac{1}{2}\sum_{(i,j)\in\Omega}\left(R_{ij} - U_i V_j^{\mathsf T}\right)^2\right),$$

$$p\left(U_{ik} \mid D, \theta_{\setminus U_{ik}}\right) = \mathcal{TN}\left(U_{ik} \,\middle|\, \mu^U_{ik}, \left(\tau^U_{ik}\right)^{-1}\right), \qquad p\left(V_{jk} \mid D, \theta_{\setminus V_{jk}}\right) = \mathcal{TN}\left(V_{jk} \,\middle|\, \mu^V_{jk}, \left(\tau^V_{jk}\right)^{-1}\right),$$

where $\mathcal{TN}(x \mid \mu, \sigma^2)$ denotes the normal distribution with mean $\mu$ and variance $\sigma^2$ truncated to $x \ge 0$;
the approximating factors $q(\theta_i)$ obey the same families of distributions:

$$q(\tau) = \operatorname{Gamma}\left(\tau \mid \alpha^{*}, \beta^{*}\right), \qquad q\left(U_{ik}\right) = \mathcal{TN}\left(U_{ik} \mid \mu^U_{ik}, \left(\tau^U_{ik}\right)^{-1}\right), \qquad q\left(V_{jk}\right) = \mathcal{TN}\left(V_{jk} \mid \mu^V_{jk}, \left(\tau^V_{jk}\right)^{-1}\right);$$
by minimizing the KL divergence, the approximation $q(\theta)$ is fitted to the posterior $P(\theta \mid D)$:

$$\operatorname{KL}\left(q(\theta) \,\middle\|\, P(\theta \mid D)\right) = \mathbb{E}_{q}\left[\log q(\theta) - \log P(\theta \mid D)\right],$$

$$\log P(D) = \mathcal{L}(q) + \operatorname{KL}\left(q \,\middle\|\, P(\theta \mid D)\right), \qquad \mathcal{L}(q) = \mathbb{E}_{q}\left[\log P(D, \theta) - \log q(\theta)\right];$$
to minimize the KL divergence, only the evidence lower bound (ELBO) $\mathcal{L}(q)$ needs to be maximized, so that an approximate solution of the posterior $P(\theta \mid D)$ can be obtained; i.e. the optimal $i$-th factor $q^{*}(\theta_i)$ can be found (up to a certain constant $C$), the other $\theta_i$ are then updated in turn, and the mutual iteration finally stabilizes, so that the optimal update of the variational parameters is found; this algorithm guarantees maximization of the evidence lower bound (ELBO):

$$\log q^{*}\left(\theta_i\right) = \mathbb{E}_{q\left(\theta_{\setminus i}\right)}\left[\log P(D, \theta)\right] + C;$$
an automatic relevance determination (ARD) method is added: instead of selecting the correct $K$, only an upper bound is given, and the model automatically determines the number of factors to use; each parameter of the prior of the decomposition matrices is replaced by one shared by all entries in the same column, i.e. shared per factor, and a Gamma prior is placed on each $\lambda_k$; the prior distributions become:

$$U_{ik} \sim \operatorname{Exp}\left(\lambda_k\right), \qquad V_{jk} \sim \operatorname{Exp}\left(\lambda_k\right), \qquad \lambda_k \sim \operatorname{Gamma}\left(\alpha_0, \beta_0\right);$$
naive Bayes classification:
assume $D$ is a sample data set; each sample $X$ in $D$ has $n$ attributes $A_1, A_2, \ldots, A_n$ and is expressed as an $n$-dimensional feature vector $X = [x_1, x_2, \ldots, x_n]$; suppose the samples have $m$ classes (e.g. with a full score of 5 there are 5 classes), denoted $C_1, C_2, \ldots, C_m$;
according to Bayes' theorem,

$$p\left(C_i \mid X\right) = \frac{p\left(X \mid C_i\right) p\left(C_i\right)}{p(X)};$$
For a sample $X$ to be classified, the probability of each class $C_i$ in $D$ given that $X$ occurs can be derived; the posterior probabilities of the classes are compared and the class with the greatest probability is selected; since $p(X)$ is constant for all classes, the posterior probability $p(C_i \mid X)$ is maximal if and only if $p(X \mid C_i)\,p(C_i)$ is maximal; in order to reduce the overhead and obtain an effective estimate, the attributes are assumed to be mutually independent given the class, that is, only the following needs to be considered:

$$p\left(X \mid C_i\right) = \prod_{k=1}^{n} p\left(x_k \mid C_i\right),$$

$$c(X) = \arg\max_{C_i}\, p\left(C_i\right)\prod_{k=1}^{n} p\left(x_k \mid C_i\right);$$
suppose that $\lvert D_{C_i}\rvert$ denotes the number of samples of class $C_i$ in the training set $D$; the class prior probability can then be obtained from the collection of class samples:

$$p\left(C_i\right) = \frac{\lvert D_{C_i}\rvert}{\lvert D\rvert};$$
for discrete attributes, let $\lvert D_{C_i, x_k}\rvert$ denote the number of samples in $D_{C_i}$ whose value on attribute $A_k$ is $x_k$; the conditional probability is then:

$$p\left(x_k \mid C_i\right) = \frac{\lvert D_{C_i, x_k}\rvert}{\lvert D_{C_i}\rvert};$$
for continuous attributes, a probability density function can be considered, or the continuous attribute can be discretized;
corresponding influence factors are adopted for different users and attributes according to their importance, improving the weighted naive Bayes model:

$$c(X) = \arg\max_{C_i}\, p\left(C_i\right)\prod_{k=1}^{n} p\left(x_k \mid C_i\right)^{\rho_i\,\omega_k},$$
where $\rho_i$ represents the weight of user $u_i$ and $\omega_k$ represents the weight of attribute $A_k$; a larger weight value means a larger influence, and the weight values are calculated using the information entropy;
"hidden" in HBPM is embodied in a hidden user group K obtained from the U matrix in BNMF; multiplying the U matrix V matrix to obtain a prediction scoring matrix, and obtaining a part of hidden but reliable prediction scoring from the prediction scoring matrix; and finally, correcting by using improved naive Bayes in combination with the attributes to obtain a final prediction result.
A computer device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, the processor implementing the steps of any of the methods when executing the program.
A computer-readable storage medium, on which a computer program is stored which, when being executed by a processor, carries out the steps of any of the methods.
A processor for running a program, wherein the program when running performs any of the methods.
The invention has the beneficial effects that:
In reality, users' tastes are diverse and cannot be reflected as consistently as in a small data set. A large amount of data is missing in real data sets; when the evidence is insufficient and the values are difficult to predict, predicting them as medians or averages loses the significance of the recommendation. Tests have shown that for items in the data set that have never been scored by a user, or that very few users have scored, the matrix factorization method cannot find enough evidence to predict a preference or dislike, which leads to the item cold-start problem. Therefore, the potential and apparent relations between users and items are fully considered, the item attributes are combined with the scoring matrix, and the prediction scoring matrix is corrected to a certain extent, which saves time and improves the accuracy of the prediction results.
Drawings
FIG. 1 is a schematic diagram of a BNMF probability model in the Bayesian collaborative filtering recommendation method of the present invention.
FIG. 2 is a schematic diagram of an HBPM probability model in the Bayesian collaborative filtering recommendation method of the invention.
Detailed Description
The present invention is further described below in conjunction with the drawings and specific examples, so that those skilled in the art can better understand and practice it; the examples are not intended to limit the present invention.
Our model is mainly composed of two parts. The first part acquires hidden information through BNMF, and the second part combines the hidden information with the explicit information using an improved naive Bayes classifier.
Variational Bayesian nonnegative matrix factorization
The input of the model is the scoring matrix of the collaborative filtering recommendation system, $R \in \mathbb{R}^{M \times N}$, which is decomposed into two latent matrices $U \in \mathbb{R}_{+}^{M \times K}$ and $V \in \mathbb{R}_{+}^{N \times K}$, where for the $M \times K$ matrix $U$, $U_{ik}$ represents the probability that user $i$ belongs to group $k$, $U_{ik} \in (0,1)$; for the $N \times K$ matrix $V$, $V_{jk}$ represents the evidence that user group $k$ likes item $j$, i.e. the prediction score matrix is $R = UV^{\mathsf T}$. Since the data set $R$ is sparse, the observed entries can be represented by the set $\Omega = \{(i,j) \mid R_{ij} \text{ is observed}\}$. We take a probabilistic approach to this problem: we define a likelihood function for the observed data and treat the latent matrices as random variables. Each value of $R$ is assumed to come from the product of $U$ and $V$ with some added Gaussian noise $E$, namely:

$$R = UV^{\mathsf T} + E, \qquad E_{ij} \sim \mathcal{N}\left(0, \tau^{-1}\right),$$
where $U_i$, $V_j$ denote the $i$-th and $j$-th rows of $U$ and $V$, and $R_{ij}$ obeys a Gaussian distribution with precision $\tau$. The parameter set of our model is denoted $\theta = \{U, V, \tau\}$.
According to Bayes' theorem, given the observed dataset $D = \{R_{ij}\}_{(i,j)\in\Omega}$ and a prior, a posterior distribution is found for the parameter $\theta$:
P(θ|D)∝P(D|θ)P(θ),
the a posteriori P (θ | D) is usually not calculated accurately, but a good approximation can be obtained by choosing a suitable a priori. In order to make the decomposed matrix values interpretable, U, V are constrained to be non-negative. The users and the users, and the commodities are independent from each other, so the indexes are selected from U and V in advance, so that each element in U and V is assumed to be independent index distribution and the speed parameter
Figure BDA0001968828400000083
And can also be constrained to be non-negative. Namely:
Figure BDA0001968828400000084
for precision τ we use αττGamma distribution > 0, i.e.:
Figure BDA0001968828400000085
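A minimal sketch of sampling from this generative model (a sketch only; the hyperparameter values lam, alpha_tau, beta_tau and the matrix sizes are illustrative assumptions, not values fixed by the patent):

```python
import numpy as np

rng = np.random.default_rng(0)
M, N, K = 11, 15, 3                        # users, items, latent groups (sizes assumed)
lam, alpha_tau, beta_tau = 1.0, 1.0, 1.0   # assumed hyperparameters

U = rng.exponential(scale=1.0 / lam, size=(M, K))       # U_ik ~ Exp(lambda)
V = rng.exponential(scale=1.0 / lam, size=(N, K))       # V_jk ~ Exp(lambda)
tau = rng.gamma(shape=alpha_tau, scale=1.0 / beta_tau)  # tau ~ Gamma(alpha_tau, beta_tau)

E = rng.normal(0.0, tau ** -0.5, size=(M, N))           # E_ij ~ N(0, 1/tau)
R = U @ V.T + E                                         # R = U V^T + E
```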
The posterior $P(\theta \mid D)$ is approximated by a distribution $q(\theta)$ in variational Bayes. Following the mean-field principle, we assume that the variational distribution $q(\theta)$ factorizes fully, so all variables are independent in the approximate posterior, i.e.:

$$q(\theta) = q(\tau)\prod_{i=1}^{M}\prod_{k=1}^{K} q\left(U_{ik}\right)\prod_{j=1}^{N}\prod_{k=1}^{K} q\left(V_{jk}\right).$$
Using Bayes' theorem we obtain the following conditional distributions:

$$p(\tau \mid D, U, V) = \operatorname{Gamma}\!\left(\tau \,\Big|\, \alpha_\tau + \tfrac{1}{2}\lvert\Omega\rvert,\; \beta_\tau + \tfrac{1}{2}\sum_{(i,j)\in\Omega}\left(R_{ij} - U_i V_j^{\mathsf T}\right)^2\right),$$

$$p\left(U_{ik} \mid D, \theta_{\setminus U_{ik}}\right) = \mathcal{TN}\left(U_{ik} \,\middle|\, \mu^U_{ik}, \left(\tau^U_{ik}\right)^{-1}\right), \qquad p\left(V_{jk} \mid D, \theta_{\setminus V_{jk}}\right) = \mathcal{TN}\left(V_{jk} \,\middle|\, \mu^V_{jk}, \left(\tau^V_{jk}\right)^{-1}\right),$$

where $\mathcal{TN}(x \mid \mu, \sigma^2)$ denotes the normal distribution with mean $\mu$ and variance $\sigma^2$ truncated to $x \ge 0$.
The approximating factors $q(\theta_i)$ obey the same families of distributions:

$$q(\tau) = \operatorname{Gamma}\left(\tau \mid \alpha^{*}, \beta^{*}\right), \qquad q\left(U_{ik}\right) = \mathcal{TN}\left(U_{ik} \mid \mu^U_{ik}, \left(\tau^U_{ik}\right)^{-1}\right), \qquad q\left(V_{jk}\right) = \mathcal{TN}\left(V_{jk} \mid \mu^V_{jk}, \left(\tau^V_{jk}\right)^{-1}\right).$$
By minimizing the KL divergence, the approximation $q(\theta)$ is fitted to the posterior $P(\theta \mid D)$:

$$\operatorname{KL}\left(q(\theta) \,\middle\|\, P(\theta \mid D)\right) = \mathbb{E}_{q}\left[\log q(\theta) - \log P(\theta \mid D)\right],$$

$$\log P(D) = \mathcal{L}(q) + \operatorname{KL}\left(q \,\middle\|\, P(\theta \mid D)\right), \qquad \mathcal{L}(q) = \mathbb{E}_{q}\left[\log P(D, \theta) - \log q(\theta)\right].$$
To minimize the KL divergence, only the evidence lower bound (ELBO) $\mathcal{L}(q)$ needs to be maximized, so that an approximate solution of the posterior $P(\theta \mid D)$ can be obtained. That is, the optimal $i$-th factor $q^{*}(\theta_i)$ can be found (up to a certain constant $C$), the other $\theta_i$ are then updated in turn, and the mutual iteration finally stabilizes, so that the optimal update of the variational parameters is found; this algorithm guarantees maximization of the evidence lower bound (ELBO):

$$\log q^{*}\left(\theta_i\right) = \mathbb{E}_{q\left(\theta_{\setminus i}\right)}\left[\log P(D, \theta)\right] + C.$$
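The update cycle can be sketched as follows (a minimal illustration, assuming truncated-normal variational factors as in standard variational Bayesian NMF and a fixed rate lam rather than the ARD prior introduced below; hyperparameter values, helper names, and the SciPy dependency are our assumptions, not the patent's):

```python
import numpy as np
from scipy.stats import norm

def tn_moments(mu, sigma):
    """First and second moments of N(mu, sigma^2) truncated to [0, inf)."""
    alpha = -mu / sigma
    lam = norm.pdf(alpha) / np.clip(norm.sf(alpha), 1e-12, None)
    mean = mu + sigma * lam
    var = sigma ** 2 * (1.0 - lam * (lam - alpha))
    return mean, var + mean ** 2

def vb_nmf(R, mask, K=3, lam=1.0, a_tau=1.0, b_tau=1.0, iters=100, seed=0):
    """Coordinate-ascent variational inference for the model above.
    mask is a 0/1 matrix marking the observed entries Omega."""
    rng = np.random.default_rng(seed)
    M, N = R.shape
    EU = rng.exponential(1.0 / lam, (M, K)); EU2 = EU ** 2
    EV = rng.exponential(1.0 / lam, (N, K)); EV2 = EV ** 2
    n_obs = mask.sum()
    for _ in range(iters):
        # q(tau) = Gamma(a*, b*): b* needs E[(R_ij - U_i.V_j)^2] on observed entries
        resid = (R - EU @ EV.T) ** 2
        extra = EU2 @ EV2.T - (EU ** 2) @ (EV ** 2).T   # variance correction term
        E_tau = (a_tau + 0.5 * n_obs) / (b_tau + 0.5 * np.sum(mask * (resid + extra)))
        # q(U_ik) = TN(mu, 1/prec), updated one factor column at a time
        for k in range(K):
            prec = np.clip(E_tau * (mask @ EV2[:, k]), 1e-12, None)
            rest = EU @ EV.T - np.outer(EU[:, k], EV[:, k])   # sum over k' != k
            coef = E_tau * ((mask * (R - rest)) @ EV[:, k]) - lam
            EU[:, k], EU2[:, k] = tn_moments(coef / prec, prec ** -0.5)
        # q(V_jk): symmetric update with the roles of U and V exchanged
        for k in range(K):
            prec = np.clip(E_tau * (mask.T @ EU2[:, k]), 1e-12, None)
            rest = EV @ EU.T - np.outer(EV[:, k], EU[:, k])
            coef = E_tau * ((mask.T * (R.T - rest)) @ EU[:, k]) - lam
            EV[:, k], EV2[:, k] = tn_moments(coef / prec, prec ** -0.5)
    return EU, EV   # posterior means; prediction matrix is EU @ EV.T
```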
the selection of potential factors K in matrix decomposition also has great influence on the prediction of the result, and an automatic correlation determination (ARD) method is added, so that the model automatically determines the number of the factors to be used without selecting the correct K but giving an upper limit. Each parameter of the prior of the decomposition matrix is replaced by one shared by all items in the same column, namely each factor is shared, and the prior of the decomposition matrix is divided into a plurality of items at lambdakA gamma prior is placed on it. The prior distribution becomes:
Figure BDA0001968828400000101
The probability model of BNMF is shown in FIG. 1.
naive Bayes classification
Assume $D$ is a sample data set; each sample $X$ in $D$ has $n$ attributes $A_1, A_2, \ldots, A_n$ and is expressed as an $n$-dimensional feature vector $X = [x_1, x_2, \ldots, x_n]$. Suppose the samples have $m$ classes (e.g. with a full score of 5 there are 5 classes), denoted $C_1, C_2, \ldots, C_m$.
According to Bayes' theorem,

$$p\left(C_i \mid X\right) = \frac{p\left(X \mid C_i\right) p\left(C_i\right)}{p(X)}.$$
For a sample $X$ to be classified, the probability of each class $C_i$ in $D$ given that $X$ occurs can be derived; the posterior probabilities of the classes are compared and the class with the greatest probability is selected. Since $p(X)$ is constant for all classes, the posterior probability $p(C_i \mid X)$ is maximal if and only if $p(X \mid C_i)\,p(C_i)$ is maximal. In order to reduce the overhead and obtain an effective estimate, the attributes are assumed to be mutually independent given the class, that is, only the following needs to be considered:

$$p\left(X \mid C_i\right) = \prod_{k=1}^{n} p\left(x_k \mid C_i\right),$$

$$c(X) = \arg\max_{C_i}\, p\left(C_i\right)\prod_{k=1}^{n} p\left(x_k \mid C_i\right).$$
Suppose that $\lvert D_{C_i}\rvert$ denotes the number of samples of class $C_i$ in the training set $D$; the class prior probability can then be obtained from the collection of class samples:

$$p\left(C_i\right) = \frac{\lvert D_{C_i}\rvert}{\lvert D\rvert}.$$
For discrete attributes, let $\lvert D_{C_i, x_k}\rvert$ denote the number of samples in $D_{C_i}$ whose value on attribute $A_k$ is $x_k$; the conditional probability is then:

$$p\left(x_k \mid C_i\right) = \frac{\lvert D_{C_i, x_k}\rvert}{\lvert D_{C_i}\rvert}.$$
For continuous attributes, a probability density function can be considered, or the continuous attribute can be discretized.
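A minimal count-based sketch of these estimates (the Laplace smoothing term alpha and the rating alphabet size n_values=5 are our assumptions, added so the sketch handles unseen values):

```python
import numpy as np
from collections import Counter, defaultdict

def nb_train(X, y, n_classes):
    """Counts for p(C_i) = |D_Ci| / |D| and for the per-attribute
    conditionals p(x_k | C_i) = |D_Ci,xk| / |D_Ci|."""
    y = np.asarray(y)
    prior = np.array([((y == c).sum() + 1) / (len(y) + n_classes)   # smoothed class prior
                      for c in range(n_classes)])
    counts = [defaultdict(Counter) for _ in range(n_classes)]       # counts[c][k][value]
    for xi, yi in zip(X, y):
        for k, v in enumerate(xi):
            counts[yi][k][v] += 1
    return prior, counts

def nb_predict(x, prior, counts, n_classes, alpha=1.0, n_values=5):
    """argmax_c p(C_c) * prod_k p(x_k | C_c), computed in log space."""
    scores = np.log(prior)
    for c in range(n_classes):
        for k, v in enumerate(x):
            num = counts[c][k][v] + alpha                  # smoothed count of value v
            den = sum(counts[c][k].values()) + alpha * n_values
            scores[c] += np.log(num / den)
    return int(np.argmax(scores))
```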
If naive Bayes is used directly for classification prediction, the effect is not ideal. First, the original data is too sparse and reliable data is insufficient; second, the amount of computation is large and the memory requirement is high; third, the data is noisy and the method is not robust. Ordinary naive Bayes considers that all data and condition attributes have the same influence on the classification; in fact, the preferences of other people or of similar users are less important for the classification than the user's own preferences, and different user attributes and item attributes influence the classification differently. To differentiate the influence of different users and attributes on the classification, corresponding influence factors can be adopted for different users and attributes according to their importance, improving the weighted naive Bayes model:

$$c(X) = \arg\max_{C_i}\, p\left(C_i\right)\prod_{k=1}^{n} p\left(x_k \mid C_i\right)^{\rho_i\,\omega_k},$$
where $\rho_i$ represents the weight of user $u_i$ and $\omega_k$ represents the weight of attribute $A_k$. A larger weight value means a larger influence, and the weight values are calculated using the information entropy.
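The patent computes $\rho_i$ and $\omega_k$ from information entropy without spelling out the formula in the text above; purely as an assumption, one common choice derives a weight from the normalized Shannon entropy of an observed value distribution (more peaked, hence more informative, distributions receive larger weights) and plugs the weights into the classifier as exponents, reusing the counts from nb_train above:

```python
import numpy as np

def entropy_weight(values, n_values=5):
    """Hypothetical entropy-based weight in [0, 1]: 1 - H(p) / H_max for the
    empirical distribution of one attribute's (or one user's) observed values."""
    _, counts = np.unique(values, return_counts=True)
    p = counts / counts.sum()
    h = -(p * np.log2(p)).sum()          # Shannon entropy of the value distribution
    return 1.0 - h / np.log2(n_values)   # assumes n_values > 1

def weighted_nb_predict(x, prior, counts, n_classes, w, rho=1.0,
                        alpha=1.0, n_values=5):
    """argmax_c p(C_c) * prod_k p(x_k | C_c)^(rho * w_k), in log space."""
    scores = np.log(prior)
    for c in range(n_classes):
        for k, v in enumerate(x):
            num = counts[c][k][v] + alpha
            den = sum(counts[c][k].values()) + alpha * n_values
            scores[c] += rho * w[k] * np.log(num / den)
    return int(np.argmax(scores))
```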
Our model HBPM is shown in FIG. 2. The "hidden" in HBPM is embodied in the hidden user groups $K$ obtained from the $U$ matrix in BNMF; the $U$ and $V$ matrices are multiplied to obtain a prediction scoring matrix, from which a portion of hidden but reliable prediction scores is obtained. Finally, improved naive Bayes combined with the attributes is used for correction to obtain the final prediction result. The algorithm flow of HBPM is shown in Algorithm 1.
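Putting the two stages together, a hypothetical end-to-end flow could look as follows (function names reuse the sketches above; predict_from_attrs, the weak-evidence rule, and the 0.5 threshold are illustrative assumptions, not Algorithm 1 itself):

```python
import numpy as np

def hbpm_predict(R, mask, item_attrs, K_upper=10):
    """Sketch of the HBPM flow: BNMF yields hidden user groups and initial
    scores; entries with weak evidence are corrected via weighted naive Bayes."""
    EU, EV = vb_nmf(R, mask, K=K_upper)            # stage 1: variational BNMF
    R_hat = EU @ EV.T                              # initial prediction score matrix
    groups = EU.argmax(axis=1)                     # hidden user group of each user
    # stage 2: where the evidence is weak (unobserved entry predicted near the
    # midpoint 3), re-predict from the item attributes (illustrative rule only)
    weak = (mask == 0) & (np.abs(R_hat - 3.0) < 0.5)
    for i, j in zip(*np.nonzero(weak)):
        # predict_from_attrs is a hypothetical helper built on weighted_nb_predict
        R_hat[i, j] = predict_from_attrs(i, j, R, mask, item_attrs, groups)
    return R_hat
```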
A specific application scenario of the present invention is described below:
we present our model in a simple and intuitive way. As shown in table 1, is a small dataset rating matrix with 11 users and 15 items, M11 and N15. The numbers indicate the user's rating of the item, with higher ratings indicating a greater preference, and prime' indicating missing data that has not been scored. From the figure we can clearly see that the user group with the same preference is { U }1、U2、U3、U10}、{U4、U5、U6、U11}、{U7、U8、U9}. From the rating matrix, we can visually observe the user U5May like item I8But cannot directly observe the user U5To I4The attitude of (c). Table 2 shows the U matrix (with potential factor k being 3) decomposed by the BNMF method, which has certain interpretable significance, and each item U of the matrixikThe evidence that user i belongs to user group k is shown, and the user groups can be divided into 3 types through decomposed sub-matrixes: { U1、U2、U3、U10}、{U4、U5、U6、U11}、{U7、U8、U9I.e. using sublotsHidden factors in the array are clustered, and users with similar preferences are divided into a group.
TABLE 1 User rating matrix

(The rating matrix itself appears only as an image in the original publication.)
Table 3 shows the $V$ matrix obtained by the BNMF decomposition; each entry $V_{jk}$ of the matrix gives the evidence that the users in group $k$ like the respective item. It can be seen that the users in group 1 like items $\{I_1, I_2, I_3, I_4, I_5\}$, the users in group 2 like items $\{I_6, I_7, I_8, I_9, I_{10}\}$, and the users in group 3 like items $\{I_7, I_{11}, I_{12}, I_{13}, I_{14}, I_{15}\}$.
TABLE 2 U matrix after BNMF decomposition

        F1     F2     F3   cluster
U1    0.82   0.07   0.06      1
U2    0.80   0.09   0.17      1
U3    0.72   0.18   0.35      1
U4    0.11   0.84   0.08      2
U5    0.26   0.76   0.23      2
U6    0.10   0.82   0.18      2
U7    0.44   0.11   0.67      3
U8    0.10   0.11   0.72      3
U9    0.32   0.36   0.60      3
U10   0.75   0.12   0.35      1
U11   0.09   0.81   0.21      2
TABLE 3 V matrix after BNMF decomposition

       I1     I2     I3     I4     I5     I6     I7     I8     I9    I10    I11    I12    I13    I14    I15
G1   5.61   6.04   5.73   5.58   5.82   0.64   1.77   0.73   0.59   4.50   0.74   0.96   1.35   1.39   0.91
G2   2.31   0.73   0.71   2.26   0.75   5.78   4.80   5.58   5.86   4.90   2.27   0.70   4.21   2.30   2.31
G3   1.57   0.84   1.86   1.70   1.43   1.04   4.89   1.72   0.73   2.21   6.40   5.82   5.68   5.59   6.31
Compared with the prediction results of classical matrix factorization, this method not only constrains the data to be non-negative, making the prediction more reasonable, but also predicts most of the missing data well (known data are shown in bold). For example, in classical matrix factorization $U_9$ appears to like all of the items $\{I_1, I_2, I_3, I_4, I_5\}$, even though there is not enough evidence of such a preference, whereas the BNMF prediction of 3 points (the middle value) is relatively more reasonable. However, if items with insufficient evidence were simply assigned the mean or median value in order to improve the MAE, the recommendation would lose its meaning. On this basis, to predict $U_9$'s preferences for items $\{I_1, I_2, I_3, I_4, I_5\}$ more accurately, the potential connections between the items must be examined. By combining $U_9$'s ratings with the item attributes, the distinctions and connections between likes and dislikes among the scored items are extended to likes or dislikes of the attributes. Here, the improved naive Bayesian classification is used to predict the user's preferences for the items more accurately.
Our experiments were also performed on real data sets. We used MovieLens 100K and MovieLens 1M as the experimental datasets; both consist of users' ratings of items (movies), with scores being integers from 1 to 5, and each registered user has rated at least 20 movies. The data sets also provide a user profile including gender, age, occupation, and zip code, and a movie attribute file giving the type of each movie; a movie has at least one attribute and at most 6. In the experiments the user-item score file and the movie attribute file are used to evaluate the technique. The experimental data set is divided into two groups, 80% as the training set (used by HBPM) and 20% as the test set; each group of experiments is repeated 30 times, and the average of each performance index is used to assess the accuracy of the prediction results.
For prediction accuracy, the mean absolute error (MAE) and the root mean square error (RMSE) are used as the performance indexes, i.e. the accuracy is measured by the deviation between the user scores predicted by the model and the actual scores. Let the actual user score in the test set be $T_{ij}$, the corresponding predicted user score $P_{ij}$, and $n$ the size of the test set:

$$\mathrm{MAE} = \frac{1}{n}\sum_{(i,j)}\left|T_{ij} - P_{ij}\right|, \qquad \mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{(i,j)}\left(T_{ij} - P_{ij}\right)^{2}}.$$
The recommendation accuracy on the test set is also verified. When the predicted item score $P_{ij} \ge 4$, the item is considered recommendable to the user. $P_R = \{(i,j) \mid P_{ij} \ge 4\}$ denotes the set of predictions recommendable to the user, $T_R = \{(i,j) \mid T_{ij} \ge 4\}$ the set of user-preferred items in the test set, $P_d = \{(i,j) \mid P_{ij} \le 3\}$ the non-recommendable set among the predictions, and $T_d = \{(i,j) \mid T_{ij} \le 3\}$ the non-recommendable set in the test set:

$$\mathrm{Precision} = \frac{\lvert P_R \cap T_R\rvert}{\lvert P_R\rvert}, \qquad \mathrm{Recall} = \frac{\lvert P_R \cap T_R\rvert}{\lvert T_R\rvert}.$$
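A minimal sketch of these four indexes, assuming the standard set-intersection definitions of precision and recall over $P_R$ and $T_R$:

```python
import numpy as np

def evaluate(T, P):
    """MAE and RMSE over the n test ratings, plus recommendation precision and
    recall using the 'recommend if score >= 4' rule from the text."""
    T, P = np.asarray(T, dtype=float), np.asarray(P, dtype=float)
    mae = np.mean(np.abs(T - P))
    rmse = np.sqrt(np.mean((T - P) ** 2))
    rec_pred, rec_true = P >= 4, T >= 4            # P_R and T_R as boolean masks
    hits = np.sum(rec_pred & rec_true)             # |P_R intersect T_R|
    precision = hits / max(rec_pred.sum(), 1)      # |P_R ∩ T_R| / |P_R|
    recall = hits / max(rec_true.sum(), 1)         # |P_R ∩ T_R| / |T_R|
    return mae, rmse, precision, recall
```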
In Table 4 we compare our experiments with other methods. Different similarity measures have a large impact on the results of the K-nearest-neighbor recommendation system (KNN); the Pearson correlation coefficient is used here. It can be seen that, compared with KNN, which uses only the scoring information, and with the classical matrix factorization recommendation system (Classical MF), the BNMF algorithm is significantly improved in MAE and accuracy. The multi-level hybrid similarity recommendation system (usicf), the improved naive Bayes recommendation system (INB-CF), and the hybrid multi-label recommendation system (DRA-HMLF) all add user information or item information on top of the scoring information. usicf improves the similarity algorithm and uses the movie attributes to predict user interests; while more effective than KNN, it ignores the potential connections between users. The INB-CF algorithm combines the user and movie attributes with naive Bayesian classification, but because the data is too sparse and the evidence insufficient, its classification effect is poor. The DRA-HMLF algorithm combines a similarity algorithm with matrix factorization, but clusters only by user attributes and ignores user scoring behavior. Our method integrates variational Bayes to improve classical matrix factorization, so that the initial prediction of the scoring matrix is more stable and accurate, and integrates the item information into the parts that matrix factorization cannot predict accurately due to insufficient evidence, thereby improving the recommendation accuracy.
Table 4 Comparison of our experiments with other methods

(The comparison table appears only as an image in the original publication.)
The above-mentioned embodiments are merely preferred embodiments used to fully illustrate the present invention, and the protection scope of the present invention is not limited thereto. Equivalent substitutions or changes made by those skilled in the art on the basis of the present invention all fall within the protection scope of the present invention. The protection scope of the invention is defined by the claims.

Claims (4)

1. A Bayesian collaborative filtering recommendation method is characterized by comprising the following steps:
the input of the model is the scoring matrix of the collaborative filtering recommendation system, $R \in \mathbb{R}^{M \times N}$, which is decomposed into two latent matrices $U \in \mathbb{R}_{+}^{M \times K}$ and $V \in \mathbb{R}_{+}^{N \times K}$, wherein for the $M \times K$ matrix $U$, $U_{ik}$ represents the probability that user $i$ belongs to group $k$, $U_{ik} \in (0,1)$; for the $N \times K$ matrix $V$, $V_{jk}$ represents the evidence that user group $k$ likes item $j$, i.e. the prediction score matrix is $R = UV^{\mathsf T}$; since the data set $R$ is sparse, the observed entries are represented by the set $\Omega = \{(i,j) \mid R_{ij} \text{ is observed}\}$; a probabilistic approach is taken to this problem: a likelihood function is defined for the observed data and the latent matrices are treated as random variables; each value of $R$ is assumed to come from the product of $U$ and $V$ with some added Gaussian noise $E$, namely:

$$R = UV^{\mathsf T} + E, \qquad E_{ij} \sim \mathcal{N}\left(0, \tau^{-1}\right),$$
wherein $U_i$, $V_j$ denote the $i$-th and $j$-th rows of $U$ and $V$, and $R_{ij}$ obeys a Gaussian distribution with precision $\tau$; the parameter set of the model is denoted $\theta = \{U, V, \tau\}$;
according to Bayes' theorem, given the observed dataset $D = \{R_{ij}\}_{(i,j)\in\Omega}$ and a prior, a posterior distribution is found for the parameter $\theta$:
P(θ|D)∝P(D|θ)P(θ),
the posterior P (θ | D) is usually not calculated accurately, but a good approximation can be obtained by choosing a suitable a priori; in order to make the decomposed matrix values have interpretable meanings, U, V are constrained to be non-negative; the users and the users, and the commodities are independent from each other, so the indexes are selected from U and V in advance, so that each element in U and V is assumed to be independent index distribution and the speed parameter
Figure FDA0002625422780000015
Can be constrained to be non-negative at the same time; namely:
Figure FDA0002625422780000016
for the precision $\tau$, a Gamma prior with hyperparameters $\alpha_\tau, \beta_\tau > 0$ is used, i.e.:

$$\tau \sim \operatorname{Gamma}\left(\alpha_\tau, \beta_\tau\right);$$
the posterior $P(\theta \mid D)$ is approximated by a distribution $q(\theta)$ in variational Bayes; according to mean-field theory, the variational distribution $q(\theta)$ is assumed to factorize fully, so all variables are independent in the approximate posterior, i.e.:

$$q(\theta) = q(\tau)\prod_{i=1}^{M}\prod_{k=1}^{K} q\left(U_{ik}\right)\prod_{j=1}^{N}\prod_{k=1}^{K} q\left(V_{jk}\right);$$
the following conditional distributions are obtained using Bayes' theorem:

$$p(\tau \mid D, U, V) = \operatorname{Gamma}\!\left(\tau \,\Big|\, \alpha_\tau + \tfrac{1}{2}\lvert\Omega\rvert,\; \beta_\tau + \tfrac{1}{2}\sum_{(i,j)\in\Omega}\left(R_{ij} - U_i V_j^{\mathsf T}\right)^2\right),$$

$$p\left(U_{ik} \mid D, \theta_{\setminus U_{ik}}\right) = \mathcal{TN}\left(U_{ik} \,\middle|\, \mu^U_{ik}, \left(\tau^U_{ik}\right)^{-1}\right), \qquad p\left(V_{jk} \mid D, \theta_{\setminus V_{jk}}\right) = \mathcal{TN}\left(V_{jk} \,\middle|\, \mu^V_{jk}, \left(\tau^V_{jk}\right)^{-1}\right),$$

wherein $\mathcal{TN}(x \mid \mu, \sigma^2)$ denotes the normal distribution with mean $\mu$ and variance $\sigma^2$ truncated to $x \ge 0$;
the approximating factors $q(\theta_i)$ obey the same families of distributions:

$$q(\tau) = \operatorname{Gamma}\left(\tau \mid \alpha^{*}, \beta^{*}\right), \qquad q\left(U_{ik}\right) = \mathcal{TN}\left(U_{ik} \mid \mu^U_{ik}, \left(\tau^U_{ik}\right)^{-1}\right), \qquad q\left(V_{jk}\right) = \mathcal{TN}\left(V_{jk} \mid \mu^V_{jk}, \left(\tau^V_{jk}\right)^{-1}\right);$$
by minimizing the KL divergence, the approximation $q(\theta)$ is fitted to the posterior $P(\theta \mid D)$:

$$\operatorname{KL}\left(q(\theta) \,\middle\|\, P(\theta \mid D)\right) = \mathbb{E}_{q}\left[\log q(\theta) - \log P(\theta \mid D)\right],$$

$$\log P(D) = \mathcal{L}(q) + \operatorname{KL}\left(q \,\middle\|\, P(\theta \mid D)\right), \qquad \mathcal{L}(q) = \mathbb{E}_{q}\left[\log P(D, \theta) - \log q(\theta)\right];$$
to minimize the KL divergence, only the evidence lower bound $\mathcal{L}(q)$ needs to be maximized, so that an approximate solution of the posterior $P(\theta \mid D)$ can be obtained; i.e. the optimal $i$-th factor $q^{*}(\theta_i)$ can be found, the other $\theta_i$ are then updated in turn, and the mutual iteration finally stabilizes, so that the optimal update of the variational parameters is found; the algorithm guarantees maximization of the evidence lower bound:

$$\log q^{*}\left(\theta_i\right) = \mathbb{E}_{q\left(\theta_{\setminus i}\right)}\left[\log P(D, \theta)\right] + C,$$

wherein $C$ is a constant;
an automatic relevance determination method is added: instead of selecting the correct $K$, only an upper bound is given, and the model automatically determines the number of factors to use; each parameter of the prior of the decomposition matrices is replaced by one shared by all entries in the same column, i.e. shared per factor, and a Gamma prior is placed on each $\lambda_k$; the prior distributions become:

$$U_{ik} \sim \operatorname{Exp}\left(\lambda_k\right), \qquad V_{jk} \sim \operatorname{Exp}\left(\lambda_k\right), \qquad \lambda_k \sim \operatorname{Gamma}\left(\alpha_0, \beta_0\right);$$
naive Bayes classification:
assume $D$ is a sample data set; each sample $X$ in $D$ has $n$ attributes $A_1, A_2, \ldots, A_n$ and is expressed as an $n$-dimensional feature vector $X = [x_1, x_2, \ldots, x_n]$; suppose the samples have $m$ classes, denoted $C_1, C_2, \ldots, C_m$;
according to Bayes' theorem,

$$p\left(C_i \mid X\right) = \frac{p\left(X \mid C_i\right) p\left(C_i\right)}{p(X)};$$
For a sample $X$ to be classified, the probability of each class $C_i$ in $D$ given that $X$ occurs can be derived; the posterior probabilities of the classes are compared and the class with the greatest probability is selected; since $p(X)$ is constant for all classes, the posterior probability $p(C_i \mid X)$ is maximal if and only if $p(X \mid C_i)\,p(C_i)$ is maximal; in order to reduce the overhead and obtain an effective estimate, the attributes are assumed to be mutually independent given the class, that is, only the following needs to be considered:

$$p\left(X \mid C_i\right) = \prod_{k=1}^{n} p\left(x_k \mid C_i\right),$$

$$c(X) = \arg\max_{C_i}\, p\left(C_i\right)\prod_{k=1}^{n} p\left(x_k \mid C_i\right);$$
suppose that $\lvert D_{C_i}\rvert$ denotes the number of samples of class $C_i$ in the training set $D$; the class prior probability can then be obtained from the collection of class samples:

$$p\left(C_i\right) = \frac{\lvert D_{C_i}\rvert}{\lvert D\rvert};$$
for discrete attributes, let $\lvert D_{C_i, x_k}\rvert$ denote the number of samples in $D_{C_i}$ whose value on attribute $A_k$ is $x_k$; the conditional probability is then:

$$p\left(x_k \mid C_i\right) = \frac{\lvert D_{C_i, x_k}\rvert}{\lvert D_{C_i}\rvert};$$
for continuous attributes, a probability density function can be considered, or the continuous attribute can be discretized;
corresponding influence factors are adopted for different users and attributes according to their importance, improving the weighted naive Bayes model:

$$c(X) = \arg\max_{C_i}\, p\left(C_i\right)\prod_{k=1}^{n} p\left(x_k \mid C_i\right)^{\rho_i\,\omega_k},$$
wherein $\rho_i$ represents the weight of user $u_i$ and $\omega_k$ represents the weight of attribute $A_k$; a larger weight value means a larger influence, and the weight values are calculated using the information entropy;
"hidden" in HBPM is embodied in a hidden user group K obtained from the U matrix in BNMF; multiplying the U matrix V matrix to obtain a prediction scoring matrix, and obtaining a part of hidden but reliable prediction scoring from the prediction scoring matrix; and finally, correcting by using improved naive Bayes in combination with the attributes to obtain a final prediction result.
2. A computer device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the steps of the method of claim 1 are performed when the program is executed by the processor.
3. A computer-readable storage medium, on which a computer program is stored which, when being executed by a processor, carries out the steps of the method as claimed in claim 1.
4. A processor, characterized in that the processor is configured to run a program, wherein the program when running performs the method of claim 1.
Similar Documents

Publication Publication Date Title
CN109840833B (en) Bayesian collaborative filtering recommendation method
Bansal et al. Ask the gru: Multi-task learning for deep text recommendations
CN111797321B (en) Personalized knowledge recommendation method and system for different scenes
kumar Bokde et al. Role of matrix factorization model in collaborative filtering algorithm: A survey
US20020107853A1 (en) System and method for personalized search, information filtering, and for generating recommendations utilizing statistical latent class models
Ortega et al. Providing reliability in recommender systems through Bernoulli matrix factorization
EP2860672A2 (en) Scalable cross domain recommendation system
Deodhar et al. A framework for simultaneous co-clustering and learning from complex data
Zhang et al. Aggregated recommendation through random forests
Pessiot et al. Learning to Rank for Collaborative Filtering.
Bhavana et al. Block based singular value decomposition approach to matrix factorization for recommender systems
Zhang et al. Hybrid recommender system using semi-supervised clustering based on Gaussian mixture model
Duan et al. A hybrid intelligent service recommendation by latent semantics and explicit ratings
Liphoto et al. A survey on recommender systems
Chen et al. A fuzzy matrix factor recommendation method with forgetting function and user features
Xu et al. A hybrid approach to three-way conversational recommendation
Zheng et al. Hierarchical collaborative embedding for context-aware recommendations
Guimarães et al. Guard: A genetic unified approach for recommendation
Zhang et al. Probabilistic matrix factorization recommendation of self-attention mechanism convolutional neural networks with item auxiliary information
Rafailidis A Multi-Latent Transition model for evolving preferences in recommender systems
Wang et al. Multi‐feedback Pairwise Ranking via Adversarial Training for Recommender
Molina et al. Recommendation system for netflix
Rashidi et al. Entropy-based ranking approach for enhancing diversity in tag-based community recommendation
Feng et al. Forest-based deep recommender
Jun-Yao et al. Solutions to cold-start problems for latent factor models

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
TR01 Transfer of patent right

Effective date of registration: 2022-11-30
Address after: No. 532-1, Gushan Road, Economic and Technological District, Weihai City, Shandong Province, 264200
Patentee after: Weihai Bohua Medical Equipment Co., Ltd.
Address before: 215000 No. 8 Ji Xue Road, Xiangcheng District, Suzhou, Jiangsu
Patentee before: SOOCHOW University