CN110781401A - Top-n project recommendation method based on collaborative autoregressive flow - Google Patents
Top-n project recommendation method based on collaborative autoregressive flow
- Publication number: CN110781401A
- Application number: CN201911079406.2A
- Authority: CN (China)
- Prior art keywords: user, project, distribution, item, items
- Legal status: Pending (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/95—Retrieval from the web
- G06F16/953—Querying, e.g. by the use of web search engines
- G06F16/9535—Search customisation based on user profiles and personalisation
Abstract
The invention provides a project recommendation method realized by utilizing Collaborative Autoregressive Flows (CAF), which extends flow-based generative models to the collaborative filtering technique for modeling implicit feedback. Through a series of invertible transformations, CAF converts a simple initial density into a more complex density that accords with the real distribution of the data, and mines the latent information in the items accessed by users and in the attribute features of users and items, so that more expressive potential representations of users and items can be learned and the items most likely to be accessed next can be stably and effectively recommended to each user. In addition, the invention adds a collaborative autoregressive flow to the model to learn the true distribution of the latent variables, which reduces the error of traditional variational models (such as the VAE) in the recommendation problem.
Description
Technical Field
The invention belongs to the field of neural networks in machine learning and relates to a deep-learning-based method that mainly uses Autoregressive Flows to mine the latent information in the items users have accessed and in the attribute features of users and items, learns potential representations of users and items, and on this basis uses the Collaborative Filtering technique to recommend to each user the n items they are most likely to access.
Background
With the rapid development of Internet technology, people's lives increasingly depend on the network, and massive user-item interaction data is generated along the way. Using the attribute features of these data to study user-item interaction behavior, analyze user preferences, and present to users the items they probably like but have not yet accessed promotes personalized item recommendation and can effectively alleviate the problem of Internet information overload.
Conventional collaborative filtering predicts a user's preferences from the user-item scoring matrix through Matrix Factorization. Such methods do not consider the attribute information of users and items, can only simulate the linear relationship of user-item interactions, and cannot extract the complex relationships of those interactions, so the recommendation performance of the model is poor. Another class of methods combines traditional Bayesian inference techniques with uncertainty representations to learn the complex interaction between the sparse implicit feedback and the auxiliary information of users and items. Such methods typically require assuming that the posterior distribution of the real data is Gaussian; however, real-world data does not necessarily obey this form of distribution. This assumption makes the model not flexible enough to match the true posterior distribution and the uncertainty of the recommendation, easily leading to recommendation errors. A sketch of the matrix-factorization baseline follows.
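For concreteness, the matrix-factorization baseline criticized above can be sketched as follows; this is a minimal illustration (function name and hyperparameters are ours, not from the invention) that captures only the linear inner-product interaction the paragraph describes:

```python
import numpy as np

def matrix_factorization(R, k=16, lr=0.01, reg=0.1, epochs=50):
    """Factor the rating matrix R into user factors P and item factors Q;
    unseen preferences are predicted by the linear inner product P @ Q.T."""
    M, N = R.shape
    rng = np.random.default_rng(0)
    P = rng.normal(scale=0.1, size=(M, k))  # user latent factors
    Q = rng.normal(scale=0.1, size=(N, k))  # item latent factors
    rows, cols = R.nonzero()                # observed ratings only
    for _ in range(epochs):
        for i, j in zip(rows, cols):
            err = R[i, j] - P[i] @ Q[j]
            P[i] += lr * (err * Q[j] - reg * P[i])
            Q[j] += lr * (err * P[i] - reg * Q[j])
    return P @ Q.T
```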
To address these problems, the invention provides a deep-learning-based method that models users and items with two variational autoencoders and adds a collaborative autoregressive flow while encoding the embedded vector representations of users and items to obtain their potential representations. This overcomes the low recommendation accuracy and efficiency of traditional item recommendation methods, which cannot simulate the nonlinear interaction between users and items and are not flexible enough.
Disclosure of Invention
The invention aims to provide a representation method based on the collaborative autoregressive flow that addresses the technical shortcomings of traditional project recommendation methods, such as low recommendation accuracy and performance, so as to accurately and efficiently predict the items a user is most likely to access and to solve the poor recommendation effect of traditional models.
The idea of the invention is to express users and items as embedded vectors according to the items each user has accessed, and then learn these embedded vectors with two Variational Autoencoders (VAEs) to obtain the potential representations of users and items. In addition, while obtaining the potential representations, the method adds a variational inference step that uses the collaborative autoregressive flow to flexibly simulate the nonlinear interaction of users and items and learn the real distribution of the user and item data, thereby largely avoiding the error of using a variational autoencoder directly and achieving a better item recommendation effect.
Based on the above idea, the invention designs a project recommendation method realized with a collaborative autoregressive flow, which specifically comprises the following steps:
s1, preprocessing of data: according to the condition that a user historically visits a project in original data, dividing a data set into a training set, a verification set and a test set; then, a project label matrix R is constructed according to the condition that the user visits the project in the training set; embedding each user and each project in the training set to obtain a user embedded vector matrix U and a project embedded vector matrix V;
S2, optimizing the model parameters and acquiring the optimal project recommendation model: input the user embedded vector matrix U and the item embedded vector matrix V into two different variational autoencoders, introduce a collaborative autoregressive flow in the encoding process, and obtain the final latent representation matrices $Z_U$ and $Z_V$ of users and items; then combine the final latent representations of users and items with the corresponding collaborative information to obtain the final potential representation matrices U' and V' of users and items; input the vector product of the final potential representations of users and items into a classifier to obtain a first loss, input the final latent representations of users and items into a decoder to obtain a second loss, add the first and second losses to generate the final total loss, and minimize the total loss to obtain the project recommendation model;
s3, recommending items to the user: and utilizing the trained item recommendation model to recommend Top-n items to the user.
In the method for training the project recommendation model with the collaborative autoregressive flow, step S1 processes the original data set, divides it into training, verification, and test sets, and then embeds all users and items in the training set; it specifically comprises the following steps:
s11, dividing the data set: according to the condition that a user accesses items in an original data set, randomly selecting 70% of items accessed by each user history as a training set, selecting 20% as a test set, and using the remaining 10% as a verification set;
s12, constructing a user embedded vector matrix U, a project embedded vector matrix V and a project label matrix R according to the condition that users visit projects in the training set, and specifically comprising the following steps:
S121, constructing the user embedded vector matrix U: express all users $U_s = \{u_1, \ldots, u_i, \ldots, u_M\}$ in the training set as embedded vectors, initializing the user embedded vector matrix $U \in \mathbb{R}^{M \times N}$ according to the users' preferences for items, where M is the number of users and N is the number of items; $u_{ij}$ denotes an element of the matrix, with $u_{ij} = 1$ if user $u_i$ has accessed item $v_j$ and $u_{ij} = 0$ otherwise, and the i-th row of U is the embedded vector $u_i$ of user $u_i$;

S122, constructing the item embedded vector matrix V: express all items $V_s = \{v_1, \ldots, v_j, \ldots, v_N\}$ in the training set as embedded vectors; the item embedded vector matrix is the transpose of the user embedded vector matrix, i.e., $V = U^{T}$, and the j-th row of V is the embedded vector $v_j$ of item $v_j$;

S123, constructing the item label matrix R: construct the item label matrix $R \in \mathbb{R}^{M \times N}$ from the items users accessed in the training set, where $r_{ij} = u_{ij}$.
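A minimal sketch of steps S11 to S123, assuming `user_items` maps integer user ids to the sets of item indices each user accessed (all names are illustrative):

```python
import numpy as np

def split_interactions(user_items, seed=0):
    """S11: per-user random 70% / 20% / 10% split into train / test / validation."""
    rng = np.random.default_rng(seed)
    train, test, valid = {}, {}, {}
    for u, items in user_items.items():
        items = rng.permutation(list(items))
        n_tr, n_te = int(0.7 * len(items)), int(0.2 * len(items))
        train[u] = items[:n_tr]
        test[u] = items[n_tr:n_tr + n_te]
        valid[u] = items[n_tr + n_te:]
    return train, test, valid

def build_matrices(train, M, N):
    """S121-S123: binary user matrix U (M x N), item matrix V = U.T, labels R."""
    U = np.zeros((M, N), dtype=np.float32)
    for u, items in train.items():
        U[u, list(items)] = 1.0  # u_ij = 1 iff user u accessed item j
    V = U.T.copy()               # item embedded vector matrix (S122)
    R = U.copy()                 # item label matrix, r_ij = u_ij (S123)
    return U, V, R
```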
In the method for training the project recommendation model with the collaborative autoregressive flow, step S2 inputs the user embedded vector matrix U and the item embedded vector matrix V obtained in step S1 into two variational autoencoders with a collaborative autoregressive flow introduced, obtains the final latent representation matrices $Z_U$ and $Z_V$ of users and items, and then obtains the final potential representation matrices U' and V' of users and items. A first loss and a second loss are then constructed, and the sum of the two losses is minimized to obtain the optimal project recommendation model. The method comprises the following steps:
S21, use two variational autoencoders to encode the user embedded vector $u_i$ and the item embedded vector $v_j$ respectively, obtaining the initial latent representations $z^0_{u_i}$ of user $u_i$ and $z^0_{v_j}$ of item $v_j$; the initial latent representations are d-dimensional vectors;

this step encodes the embedded vectors of users and items with an encoder (a multilayer perceptron), each layer computing

$$h^t_u = f\big(W^t_u h^{t-1}_u + b^t_u\big), \qquad h^t_v = f\big(W^t_v h^{t-1}_v + b^t_v\big),$$

where t is the layer index of the multilayer perceptron (t = 3 in the invention). The last-layer outputs $h^t_u$ and $h^t_v$ are used to compute the initial latent representations $z^0_{u_i}$ and $z^0_{v_j}$ of users and items, both of which obey Gaussian distributions: the user initial latent representation has distribution mean $\mu_{u_i}$ and variance $\sigma^2_{u_i}$, so that

$$z^0_{u_i} = \mu_{u_i} + \sigma_{u_i} \odot \epsilon_u,$$

and the item initial latent representation, with mean $\mu_{v_j}$ and variance $\sigma^2_{v_j}$, is obtained analogously as $z^0_{v_j} = \mu_{v_j} + \sigma_{v_j} \odot \epsilon_v$. In each encoder layer, $f(\cdot)$ is a nonlinear activation function, commonly sigmoid or tanh (the invention selects sigmoid); $h^t_u$ and $h^t_v$ are the hidden state representations of the user and item embedded vectors after the t-th neural network layer; the matrix $W^t_u$ and bias vector $b^t_u$ are training parameters of the user encoder, where the superscript t indicates a parameter of the t-th layer, and $\mu_{u_i}$ and $\sigma_{u_i}$ are the parameters to be learned for the distribution of the user initial latent representation; likewise, the matrix $W^t_v$ and bias vector $b^t_v$ are training parameters of the item encoder, with the superscript t indicating the t-th layer, and $\mu_{v_j}$ and $\sigma_{v_j}$ are the parameters to be learned for the distribution of the item initial latent representation; $\epsilon_u$ and $\epsilon_v$ are standard normal variables used for random sampling;
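A minimal PyTorch sketch of the S21 encoder, assuming the 3-layer perceptron with sigmoid activations stated above; the hidden width and latent dimension d are illustrative choices, not values from the patent:

```python
import torch
import torch.nn as nn

class Encoder(nn.Module):
    """Maps an embedded vector to a Gaussian over the initial latent z0 (S21)."""
    def __init__(self, in_dim, hidden=600, d=64):
        super().__init__()
        self.body = nn.Sequential(
            nn.Linear(in_dim, hidden), nn.Sigmoid(),  # layer 1
            nn.Linear(hidden, hidden), nn.Sigmoid(),  # layer 2
            nn.Linear(hidden, hidden), nn.Sigmoid(),  # layer 3 (t = 3)
        )
        self.mu = nn.Linear(hidden, d)       # mean of the Gaussian
        self.logvar = nn.Linear(hidden, d)   # log-variance of the Gaussian

    def forward(self, x):
        h = self.body(x)
        mu, logvar = self.mu(h), self.logvar(h)
        eps = torch.randn_like(mu)                # standard normal sample
        z0 = mu + torch.exp(0.5 * logvar) * eps   # reparameterization trick
        return z0, mu, logvar
```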
S22, define K invertible autoregressive flows, input the user initial latent representation $z^0_{u_i}$ and the item initial latent representation $z^0_{v_j}$ obtained in the previous step into the K autoregressive flows for invertible transformation, and learn the final latent representations $z^K_{u_i}$ and $z^K_{v_j}$ of users and items, making their distributions closer to the true latent-layer data distribution. Since the learning processes of the user and item final latent representations are identical, the subscripts $u_i$ and $v_j$ are omitted below and the uniform symbols $z_0$ and $z_K$ denote the initial and final latent representations. The distributions of $z_K$ are learned as

$$p(z_K) = \prod_{i=1}^{d} p\big(z^i_K \mid z^{1:i-1}_K\big),$$

where $p(z^i_K \mid z^{1:i-1}_K)$ is a one-dimensional probability density conditioned on the previous i−1 dimensions of $z_K$, and d is the dimension of $z_K$;

the process of finding the final latent representations of users and items comprises the following sub-steps:
S221, obtaining the target distribution $p(z_k)$: to obtain the final latent representation, the d dimensions of $z_k$ (the subscript k denoting passage through the k-th flow) are computed first, the i-th dimension of $z_k$ being conditioned on the first i−1 dimensions of $z_{k-1}$; given that $z_{k-1}$ is distributed as $p(z_{k-1})$, the d dimensions of $z_k$ are computed as

$$z^i_k = \mu_i + \sigma_i \cdot z^i_{k-1}, \qquad \mu_i = M\big(z^{1:i-1}_{k-1}\big), \quad \sigma_i = S\big(z^{1:i-1}_{k-1}\big),$$

thereby obtaining the distribution $p(z_k)$ of $z_k$. In the formula, M(·) and S(·) are the neural networks computing the mean and the variance, respectively. A target distribution $p(z_k)$ is thus obtained through S22; since the process of generating the target distribution is based on $p(z_{k-1})$, it parallelizes very well on the GPU. The process mainly uses the idea of the Inverse Autoregressive Flow (IAF), a special neural network that can output all values of the mean and standard deviation at the same time, facilitating the sampling of $\mu_{1:i-1}$ and $\sigma_{1:i-1}$;
S222, obtaining the target distribution $q(z_k)$: for the invertible process of density estimation of $p(z_{k-1})$, the invention uses the idea of the Masked Autoregressive Flow (MAF) to estimate the target distribution,

$$z^i_{k-1} = \frac{z^i_k - M'\big(z^{1:i-1}_k\big)}{S'\big(z^{1:i-1}_k\big)},$$

where M'(·) and S'(·) are special neural networks that use masked autoregressive flow to compute the mean and variance, respectively, and act as the invertible computational process in the model;
S223, through steps S221 and S222, two versions $p(z_k)$ and $q(z_k)$ of the same target distribution are obtained under different conditioning, but they carry different biases from modeling the autoregressive flow; to stabilize the training process with the two output distributions, the KL divergence between them is computed,

$$\mathrm{KL}\big(p(z_k)\,\|\,q(z_k)\big) = H\big(p(z_k), q(z_k)\big) - H\big(p(z_k)\big),$$

where the cross entropy of the two target distributions $p(z_k)$ and $q(z_k)$ is computed as

$$H\big(p(z_k), q(z_k)\big) = -\mathbb{E}_{p(z_k)}\big[\log q(z_k)\big];$$
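A minimal sketch of one affine autoregressive step covering S221 to S223; simple masked linear maps stand in for the mean/variance networks M(·), S(·), M'(·), S'(·) (the patent does not specify their architecture), and the strictly lower-triangular mask enforces that dimension i depends only on dimensions 1..i−1:

```python
import torch
import torch.nn as nn

class AffineARFlow(nn.Module):
    """One affine autoregressive flow step (IAF forward, MAF-style inverse)."""
    def __init__(self, d):
        super().__init__()
        self.Wm = nn.Parameter(torch.randn(d, d) * 0.01)  # mean network weights
        self.Ws = nn.Parameter(torch.randn(d, d) * 0.01)  # scale network weights
        # strictly lower-triangular mask: output dim i sees only dims < i
        self.register_buffer("mask", torch.tril(torch.ones(d, d), -1))

    def mu_sigma(self, z):
        mu = z @ (self.Wm * self.mask).T
        sigma = torch.exp(z @ (self.Ws * self.mask).T)  # keep the scale positive
        return mu, sigma

    def forward(self, z_prev):
        """S221 (IAF direction): z_k = mu + sigma * z_{k-1}, parallel over dims."""
        mu, sigma = self.mu_sigma(z_prev)
        z_k = mu + sigma * z_prev
        log_det = torch.log(sigma).sum(-1)  # log |det dz_k / dz_{k-1}|
        return z_k, log_det

    @torch.no_grad()
    def inverse(self, z_k):
        """S222 (MAF direction): recover z_{k-1} dimension by dimension."""
        z_prev = torch.zeros_like(z_k)
        for i in range(z_k.shape[-1]):  # dim i only needs already-recovered dims < i
            mu, sigma = self.mu_sigma(z_prev)
            z_prev[..., i] = (z_k[..., i] - mu[..., i]) / sigma[..., i]
        return z_prev
```

Stacking K such steps and accumulating the log-determinants gives the density transformation of S22; the KL term of S223 compares the densities implied by the forward and inverse passes.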
S224, repeat steps S221 to S223 K times to obtain the final latent representations $z^K_{u_i}$ and $z^K_{v_j}$ of the user and the item;

S23, combine the final latent representations $z^K_{u_i}$ and $z^K_{v_j}$ of the user and the item obtained in S22 with the collaborative information $u_c$ and $v_c$ of users and items to obtain the final potential representations $u_i'$ and $v_j'$ of the user and the item (a sketch follows below); the two vectors $u_c$ and $v_c$ are continuously optimized along with the model and finally represent the collaborative information of users and items well;
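The combination formula of S23 appears only as an image in the original; purely for illustration, an element-wise additive combination is assumed in the sketch below (the actual operator may differ):

```python
# Assumed combination for S23 -- element-wise addition is a guess, not the
# patent's formula, which is not legible in this text.
u_final = z_u_K + u_c  # u_i': final user latent + learned collaborative vector
v_final = z_v_K + v_c  # v_j': final item latent + learned collaborative vector
```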
s24, repeating the steps S21 to S23 until the final potential representation of all the users and the projects is obtained, and connecting the final potential representation of all the users and the projects to obtain final potential representation matrixes U 'and V' of the users and the projects;
S25, construct two loss functions $L_1$ and $L_2$, add them to generate the final total loss L, minimize the total loss function, and finish the training of the model to obtain the project recommendation model; $L_1$ is the cross-entropy loss function between predicted items and labeled items, and $L_2$ is the reconstruction loss function that makes the learned latent-variable distribution closer to the real distribution. The method comprises the following steps:
S251, obtaining the first loss $L_1$: input the vector product of the final potential representation matrices U' and V' of users and items obtained in step S24 into a classifier consisting of a multilayer perceptron, which outputs the probability of each item being accessed by each user, i.e., the scoring matrix R' of all users over all items; the cross-entropy loss between the item label matrix R obtained in step S1 and the predicted scoring matrix R' is

$$L_1 = -\sum_{i=1}^{M} \sum_{j=1}^{N} \Big[ r_{ij} \log r'_{ij} + \big(1 - r_{ij}\big) \log\big(1 - r'_{ij}\big) \Big],$$

where $r'_{ij}$ is the probability of user $u_i$ accessing item $v_j$, $L_1$ is taken as the first loss, M is the total number of users, and N is the total number of items;
S252, obtaining the second loss $L_2$: input the final latent representations $z_u$ and $z_v$ of the user and the item obtained in S23 into a reversible decoder, construct the joint distribution of the user and item latent variables, and take the relative entropy with respect to the joint distribution of the input user and item data to obtain the second loss $L_2$ (for convenience of presentation, the subscripts i and j of users and items are dropped in the second-loss formula):

$$L_2 = \mathbb{E}_{q_\phi(z_u \mid u)}\big[\log q_\phi(z_u \mid u) - \log p_\theta(u, z_u)\big] + \mathbb{E}_{q_\phi(z_v \mid v)}\big[\log q_\phi(z_v \mid v) - \log p_\theta(v, z_v)\big].$$

The user posterior distribution $q(z_u \mid u)$, defined by the encoder and the K autoregressive flows, approximates the true distribution $p(u, z_u)$ of the latent variable $z_u$; similarly, the item posterior distribution $q(z_v \mid v)$ approximates the true distribution $p(v, z_v)$ of the latent variable $z_v$. Here $p(u, z_u)$ and $p(v, z_v)$ represent the true distributions of the user and item input data, $z_u$ and $z_v$ are the latent variables in the collaborative autoregressive flow model, u and v are the model input data, and θ and φ are the parameters of the probability distributions. Expanding the formula, the terms $-\mathbb{E}_{q}[\log p_\theta(u \mid z_u)]$ and $-\mathbb{E}_{q}[\log p_\theta(v \mid z_v)]$ represent the reconstruction loss, two terms are constants, and the remaining four terms represent the autoregressive flows. Optimizing the second loss $L_2$ in fact completes the minimization of the KL divergence of S223; for the specific derivation see the paper: [van den Oord, A., Li, Y., Babuschkin, I., Simonyan, K., Vinyals, O., Kavukcuoglu, K., van den Driessche, G., et al., "Parallel WaveNet: Fast High-Fidelity Speech Synthesis"];
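The posterior defined through the K flows obeys the standard change-of-variables identity (written out here for clarity; this is the general formula, not reproduced from the patent's image):

```latex
\log q\big(z_K \mid x\big) \;=\; \log q\big(z_0 \mid x\big) \;-\; \sum_{k=1}^{K} \log \left| \det \frac{\partial z_k}{\partial z_{k-1}} \right|
```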
S253, add the first loss and the second loss together to generate the final total loss $L = L_1 + L_2$, minimize the total loss function, and finish the training of the model to obtain the project recommendation model.
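A minimal sketch of the S25 training objective, assuming the model exposes the predicted scoring matrix, the two decoder reconstructions, and the per-sample KL terms carrying the flow log-determinants (all names are illustrative):

```python
import torch.nn.functional as F

def total_loss(R, R_pred, u, recon_u, v, recon_v, kl_u, kl_v):
    """S253: L = L1 + L2."""
    # L1 (S251): cross entropy between label matrix R and predicted scores R'
    L1 = F.binary_cross_entropy(R_pred, R, reduction="sum")
    # L2 (S252): reconstruction losses of both decoders plus the KL terms
    L2 = (F.binary_cross_entropy(recon_u, u, reduction="sum")
          + F.binary_cross_entropy(recon_v, v, reduction="sum")
          + kl_u.sum() + kl_v.sum())
    return L1 + L2

# One optimization step (model and optimizer wiring assumed):
#   loss = total_loss(...); optimizer.zero_grad(); loss.backward(); optimizer.step()
```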
In the method for training a project recommendation model with a collaborative autoregressive flow, step S3 performs personalized project recommendation to the user with the project recommendation model trained in step S2, and comprises the following steps:
S31, obtain the user scoring matrix R' from the model trained in S2 and set the scores at the index positions of items the user has accessed in the training set to 0; for example, if user $u_i$ accessed item $v_j$ in the training set, the corresponding element $r'_{ij}$ of the scoring matrix R' is set to 0; then sort each row of the new scoring matrix by score to obtain the matrix R'';
S32, according to the scoring matrix R'' obtained in S31, select the top-n items by score in R'' as the result finally recommended to the user.
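A minimal sketch of S31 and S32, assuming `R_pred` is the predicted scoring matrix and the binary training matrix U from S121 marks already-visited items:

```python
import numpy as np

def recommend_top_n(R_pred, train_mask, n=10):
    """S31: zero out items already visited in training; S32: take the top-n."""
    scores = R_pred.copy()
    scores[train_mask] = 0.0                    # r'_ij = 0 for visited items
    return np.argsort(-scores, axis=1)[:, :n]   # per-user item indices, best first

# usage: top_items = recommend_top_n(R_pred, U.astype(bool), n=10)
```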
The project recommendation method realized with the collaborative autoregressive flow (CAF) extends flow-based generative models to the collaborative filtering technique for modeling implicit feedback. By combining the auxiliary information and the collaborative information of users and items, it learns better potential representations of users and items, and the autoregressive flows learn the real distribution of the latent variables, greatly improving recommendation performance.
The invention provides a project recommendation method realized with a collaborative autoregressive flow. Compared with the prior art, the method has the following beneficial effects:
1. The method uses two variational autoencoders to learn the latent variable representations of users and items respectively, and then adopts collaborative filtering to recommend items to users. Most traditional methods learn user preferences from the user-item matrix alone and rarely combine the three aspects of users, items, and user-item interactions, so traditional models cannot comprehensively capture the interaction information of user-item pairs.
2. The invention adds a collaborative autoregressive flow in the model's encoding process to learn the real distribution of the latent variables, reducing the error of traditional variational models (such as the VAE) in the project recommendation problem; a traditional VAE must assume a prior distribution, which introduces a large error, whereas the normalizing flow assumes no prior distribution, so the learned distribution is closer to the true distribution of the underlying variables.
3. The autoregressive flows in the invention adopt the Inverse Autoregressive Flow (IAF) and the Masked Autoregressive Flow (MAF), which effectively improve the efficiency of variational inference and data sampling and reduce the gap between the simple distribution of latent variables and real data with a complex distribution.
Drawings
Fig. 1 is a diagram illustrating an overall structure of a project recommendation model implemented using a collaborative autoregressive stream.
FIG. 2 shows the process of variational inference using autoregressive flows.
FIG. 3 is a visualization of the final potential representations of users and items.
FIG. 4 visualizes how the initial latent representations of items change through the K autoregressive flows on different datasets.
FIG. 5 shows how the prediction results of the project recommendation model (CAF) on different datasets vary with the parameter K of the autoregressive flows; (a) corresponds to the MovieLens dataset, (b) to the CiteULike dataset, and (c) to the LastFM dataset. K can be understood as the number of invertible functions (autoregressive flows).
FIG. 6 is a schematic diagram of how the prediction results of the project recommendation model (CAF) vary with the dataset and the number of training rounds.
Interpretation of terms
Variational inference: in brief, a required distribution p must be inferred from existing data; when p is hard to express and cannot be solved directly, a variational inference method can be tried. That is, find a distribution q that is easy to express and solve; when the difference between q and p is small, q can serve as an approximate distribution of p in its place. The whole process uses a variational autoencoder (VAE) to derive an Evidence Lower Bound (ELBO), and the approximate distribution q is learned by maximizing this lower bound, as written out below.
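The evidence lower bound mentioned above has the standard form (written out for reference):

```latex
\log p(x) \;\ge\; \mathrm{ELBO}(q) \;=\; \mathbb{E}_{q(z \mid x)}\big[\log p(x \mid z)\big] \;-\; \mathrm{KL}\big(q(z \mid x) \,\|\, p(z)\big)
```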
Collaborative filtering: in brief, the preferences of a group with shared interests and common experience are used to recommend information of interest to a user. Through a cooperation mechanism, individuals give substantial responses (such as ratings) to information, and these responses are recorded to filter the information; responses are not limited to the especially interesting items, and records of especially uninteresting information are also important. Collaborative filtering can be further classified as rating-based or group-based filtering. Recommending items a customer may like based on the customer's past purchasing behavior and the purchasing behavior of a group of customers with similar behavior, i.e., providing personalized recommendation services for information, goods, and so on according to the preferences of the group, is an important part of electronic commerce. Beyond recommendation, mathematical operations developed in recent years allow systems to automatically calculate the intensity of preferences and filter content accordingly; the filtered content is thus better founded, and although perhaps not completely accurate, the addition of intensity evaluation has given the concept much wider application. Besides electronic commerce, it can be applied to information retrieval, online personal audio-visual libraries, personal bookshelves, and other fields.
Detailed Description
The invention is further described below with reference to the accompanying drawings.
Embodiment: project recommendation model training implemented with the collaborative autoregressive flow
This embodiment takes the three real datasets MovieLens, CiteULike and LastFM as research objects (the datasets can be obtained from https://grouplens.org/datasets/movielens/1M/, http://www.citeulike.org/, and http://www.lastfm.com/ respectively) and explains the project recommendation model training method provided by the invention in detail.
As shown in Fig. 1, the project recommendation model (CAF) implemented with the collaborative autoregressive flow provided in this embodiment mainly comprises an encoder, an autoregressive flow layer, a decoder, and a classifier. The model first preprocesses the data: for the MovieLens dataset, the records of movies rated 1 to 5 by users are retained and other data are deleted; users who accessed fewer than ten items are excluded from the CiteULike dataset; for the LastFM dataset, all data are retained.
TABLE 1 statistical information of data sets
The statistical information of the datasets in this embodiment is shown in Table 1. Training, verification, and test sets are divided on the three experimental datasets: for each user, 70% of the accessed items are randomly selected as the training set, 10% as the verification set, and the remaining 20% as the test set.
A project tag matrix R is constructed according to the condition that a user accesses a project, and each user and each project in input data are embedded to obtain a user embedded vector matrix U and a project embedded vector matrix V;
after the training data set is constructed, the embodiment obtains three project recommendation models by training the data in the training set according to step S2, where the verification set is mainly used for parameter adjustment, and the test set is used for the following test effect discussion.
As shown in FIG. 1, first, the user is askedThe embedded vector matrix U and the item embedded vector matrix V are input into step S2, and the implicit variables of each user and each item are first obtained according to the multi-layer perceptron network coding in S21
And
as shown in FIG. 2, to obtain the final implied variable representation for the user and project
And
output of encoder in S21
And
the data is carried into step S22, and K reversible autoregressive streams are transformed to obtain
And
then according to S23
And
and combining the user and project collaborative information which obeys Gaussian distribution to obtain the final potential representation of the user and the project, and continuously optimizing the potential representation of the user and the project along with the continuous updating of the model so as to obtain the final potential representation u of the user and the project
i' and v
j' repeat step S2 until the model converges, resulting in potential representations U ' and V ' for all users and all projects.
As shown in the decoder and classifier section of Fig. 1, the vector product of the final potential representation matrices U' and V' of users and items obtained at S23 is input into the classifier of step S25 to obtain the first loss, the cross entropy calculated from the item label matrix R and the predicted scoring matrix R'; then the final user and item latent variables $z^K_{u}$ and $z^K_{v}$ obtained in S23 are input into the reversible decoder of S25, the joint distribution of the user and item latent variables is constructed, and the second loss is computed according to $L_2$; the first and second losses are combined into the total loss L, L is minimized, and the training of the model is finished, yielding the project recommendation model.
Application example
For the test sets of the three experimental datasets, the project recommendation models obtained by the embodiment training are used to execute project recommendation according to the following steps:
S1', according to the trained CAF model, obtain the user scoring matrix R', set the values at the index positions of items accessed in each user's training set to 0, and sort the resulting scoring matrix by score to obtain the ordered scoring matrix R''.
S2 ', user prediction items are obtained, a scoring matrix method is adopted in the step, item recommendation is carried out on each user according to the user scoring matrix R' obtained in the step S1 ', items of n (n is 5,10,20 and 50) before scoring in the R' are selected as a result of final recommendation for the user, if the n items have real label items in test data, the prediction is considered to be correct, in practical application, the n items are recommended for the user every time, and if the user is interested in the n items, the recommendation is successful. Therefore, the accuracy and efficiency of item recommendation can be further improved while the recommended items are obtained.
The predicted effect on the test set using the item recommendation model (CAF) described above is shown in the bold section of table 2.
Table 2: project recommendation model results on three datasets
To further illustrate the prediction effect of the project recommendation method implemented with the collaborative autoregressive flow, this application example trains project recommendation models with seven baseline methods (BPR, CDL, CVAE, CVAE-B, MVAE, VAE-AR and CLVAE) on the three experimental datasets and then uses the seven models to recommend to each user in the test set the n items most likely to be accessed next; the prediction results of the seven models are shown in Table 2. The index R@n is the proportion of correctly predicted items among the true items in the test case, i.e., the recall; P@n is the proportion of correctly predicted items among the recommended items in the test set, i.e., the precision. A sketch of both indices follows.
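For reference, the two indices can be computed as in the sketch below, assuming `top_n` is the per-user recommendation matrix and `test_items` maps user ids to their true test items (names are illustrative):

```python
import numpy as np

def recall_precision_at_n(top_n, test_items):
    """R@n: hits / number of true items; P@n: hits / number of recommended items."""
    recalls, precisions = [], []
    for u, recommended in enumerate(top_n):
        truth = set(test_items.get(u, ()))
        if not truth:
            continue  # skip users with no test items
        hits = len(truth.intersection(int(i) for i in recommended))
        recalls.append(hits / len(truth))
        precisions.append(hits / len(recommended))
    return float(np.mean(recalls)), float(np.mean(precisions))
```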
The rest of the methods in the table are described below:
BPR: a widely used matrix factorization method that optimizes latent factors from implicit feedback with a pairwise ranking objective function via stochastic gradient descent. Reference: S. Rendle, C. Freudenthaler, Z. Gantner, and L. Schmidt-Thieme. 2009. BPR: Bayesian personalized ranking from implicit feedback. In UAI.
CDL: combines Bayesian model learning with auxiliary information and extracts latent features through stacked denoising autoencoders and collaborative filtering. Reference: P. Wang, J. Guo, Y. Lan, J. Xu, S. Wan, and X. Cheng. 2015. Learning hierarchical representation model for next basket recommendation. In SIGIR.
CVAE: the first collaborative item recommendation method based on the variational autoencoder, which incorporates the content information of items into matrix factorization (the variant that improves CVAE with the BPR model to improve its recommendation is CVAE-B in Table 2). Reference: X. Li and J. She. 2017. Collaborative variational autoencoder for recommender systems. In KDD.
MVAE: uses the multinomial conditional likelihood as the prior distribution and contains no side information for recommendation. Reference: D. Liang, R. G. Krishnan, M. D. Hoffman, and T. Jebara. 2018. Variational autoencoders for collaborative filtering. In WWW.
VAE-AR: models the implicit feedback information of users and the auxiliary information of items with a variational autoencoder and extracts latent variable representations influenced by the auxiliary information using a generative adversarial network. Reference: W. Lee, K. Song, and I.-C. Moon. 2017. Augmented variational autoencoders for collaborative filtering with auxiliary information. In CIKM.
CLVAE: a recommendation method based on the conditional variational autoencoder that extends CVAE with a hierarchical variational autoencoder structure. Reference: W. Lee, K. Song, and I.-C. Moon. 2017. Augmented variational autoencoders for collaborative filtering with auxiliary information. In CIKM.
As can be seen from the prediction results in Table 2, the project recommendation method realized with the collaborative autoregressive flow provided by the invention has higher prediction precision than the existing methods.
To illustrate why the collaborative autoregressive flow improves the prediction results, the latent variables of the provided embodiment and of a general method are compared visually, as shown in Fig. 3: the clustering effect of the CAF model is better, which is why the prediction effect of CAF is higher than that of other methods.
To show that the autoregressive flows can approximate the true posterior of the data, this application example visually compares the item latent variables of the three datasets under different numbers of autoregressive flows, as shown in Fig. 4. In theory, more invertible transformations can approximate more complex distributions, but smaller values are sufficient for our model. On the MovieLens dataset, the clearest representation is observed at K = 7, after which the latent representation of items warps again. On the CiteULike and LastFM datasets, 5 autoregressive flow transformations are sufficient to obtain the best visualization. Such visualization is important because a visual representation is inherently more interpretable, and the degree of visualization can be used to guide the training process of an autoregressive-flow-based recommendation system.
To illustrate the effect of the number of autoregressive flows on the prediction results, this application example investigates the influence of K (the number of autoregressive flows) on recommendation performance; the results (R@10) are shown in Fig. 5. As seen in Fig. 5(a), prediction accuracy increases with the number of flows until K > 7 and then starts to decrease; Figs. 5(b) and 5(c) correspond to the CiteULike and LastFM datasets, where accuracy rises for K < 5 and begins to fall for K > 5. The results of Fig. 5 are consistent with the visualization of Fig. 4 and also show that the more separable the latent variables, the better the recommendation performance of CAF.
To illustrate the influence of the number of training rounds on the prediction results, the CAF model of this application example is trained for 40 rounds on each of the datasets MovieLens, CiteULike and LastFM according to the method provided by the embodiment; the trained CAF model then predicts on the test set of each dataset, recommending to each user the items most likely to be accessed next. The prediction results are shown in Fig. 6: the best result on MovieLens is obtained at 25 training rounds, while for CiteULike and LastFM the best results come after 30 training rounds.
In summary, the collaborative autoregressive flow is used to achieve project recommendation. The algorithm addresses biased inference by combining the Bayesian inference of probabilistic recommendation with the flexible posterior approximation of autoregressive flows; it can effectively handle datasets lacking user-item information and reduce the posterior-estimation problem inherent in existing Bayesian recommendation methods. The added autoregressive flows resolve the error of previous variational inference (such as the VAE). The model can learn the real distributions of users and items and capture their complex interaction relationships, which greatly helps improve prediction accuracy.
It will be appreciated by those of ordinary skill in the art that the embodiments described herein are intended to help the reader understand the principles of the invention, which is not limited to the specifically recited embodiments and examples. Those skilled in the art can make various other specific changes and combinations based on the teachings of the present invention without departing from its spirit, and these changes and combinations remain within the scope of the invention.
Claims (4)
1. A Top-n project recommendation method based on collaborative autoregressive flow implementation is characterized by comprising the following steps:
s1, preprocessing of data: according to the condition that a user historically visits a project in original data, dividing a data set into a training set, a verification set and a test set; then, a project label matrix R is constructed according to the condition that the user visits the project in the training set; embedding each user and each project in the training set to obtain a user embedded vector matrix U and a project embedded vector matrix V;
S2, optimizing the model parameters and acquiring the optimal project recommendation model: input the user embedded vector matrix U and the item embedded vector matrix V into two different variational autoencoders, introduce a collaborative autoregressive flow in the encoding process, and obtain the final latent representation matrices $Z_U$ and $Z_V$ of users and items; then combine the final latent representations of users and items with the corresponding collaborative information to obtain the final potential representation matrices U' and V' of users and items; input the vector product of the final potential representations of users and items into a classifier to obtain a first loss, input the final latent representations of users and items into a decoder to obtain a second loss, add the first and second losses to generate the final total loss, and minimize the total loss to obtain the project recommendation model;
s3, recommending items to the user: and utilizing the trained item recommendation model to recommend Top-n items to the user.
2. The method for recommending Top-n project based on collaborative autoregressive flow implementation according to claim 1, wherein said step S1 comprises the following sub-steps:
s11, dividing the data set: according to the condition that a user accesses items in an original data set, randomly selecting 70% of items accessed by each user history as a training set, selecting 20% as a test set, and using the remaining 10% as a verification set;
s12, constructing a user embedded vector matrix U, a project embedded vector matrix V and a project label matrix R according to the condition that users visit projects in the training set, and specifically comprising the following steps:
S121, constructing the user embedded vector matrix U: express all users $U_s = \{u_1, \ldots, u_i, \ldots, u_M\}$ in the training set as embedded vectors, initializing the user embedded vector matrix $U \in \mathbb{R}^{M \times N}$ according to the users' preferences for items, where M is the number of users and N is the number of items; $u_{ij}$ denotes an element of the matrix, with $u_{ij} = 1$ if user $u_i$ has accessed item $v_j$ and $u_{ij} = 0$ otherwise, and the i-th row of U is the embedded vector $u_i$ of user $u_i$;

S122, constructing the item embedded vector matrix V: express all items $V_s = \{v_1, \ldots, v_j, \ldots, v_N\}$ in the training set as embedded vectors; the item embedded vector matrix is the transpose of the user embedded vector matrix, i.e., $V = U^{T}$, and the j-th row of V is the embedded vector $v_j$ of item $v_j$;

S123, constructing the item label matrix R: construct the item label matrix $R \in \mathbb{R}^{M \times N}$ from the items users accessed in the training set, where $r_{ij} = u_{ij}$.
3. The method for recommending Top-n project based on collaborative autoregressive flow according to claim 1, wherein said step S2 comprises the following sub-steps:
S21, use two variational autoencoders to encode the user embedded vector $u_i$ and the item embedded vector $v_j$ respectively, obtaining the initial latent representations $z^0_{u_i}$ of user $u_i$ and $z^0_{v_j}$ of item $v_j$; the initial latent representations are d-dimensional vectors;

this step encodes the embedded vectors of users and items with an encoder, i.e., a multilayer perceptron, each layer computing

$$h^t_u = f\big(W^t_u h^{t-1}_u + b^t_u\big), \qquad h^t_v = f\big(W^t_v h^{t-1}_v + b^t_v\big),$$

where t is the layer index of the multilayer perceptron (t = 3 in the invention). The last-layer outputs $h^t_u$ and $h^t_v$ are used to compute the initial latent representations $z^0_{u_i}$ and $z^0_{v_j}$ of users and items, both of which obey Gaussian distributions: the user initial latent representation has distribution mean $\mu_{u_i}$ and variance $\sigma^2_{u_i}$, so that

$$z^0_{u_i} = \mu_{u_i} + \sigma_{u_i} \odot \epsilon_u,$$

and the item initial latent representation, with mean $\mu_{v_j}$ and variance $\sigma^2_{v_j}$, is obtained analogously as $z^0_{v_j} = \mu_{v_j} + \sigma_{v_j} \odot \epsilon_v$. In each encoder layer, $f(\cdot)$ is a nonlinear activation function, commonly sigmoid or tanh (the invention selects sigmoid); $h^t_u$ and $h^t_v$ are the hidden state representations of the user and item embedded vectors after the t-th neural network layer; the matrix $W^t_u$ and bias vector $b^t_u$ are training parameters of the user encoder, where the superscript t indicates a parameter of the t-th layer, and $\mu_{u_i}$ and $\sigma_{u_i}$ are the parameters to be learned for the distribution of the user initial latent representation; likewise, the matrix $W^t_v$ and bias vector $b^t_v$ are training parameters of the item encoder, with the superscript t indicating the t-th layer, and $\mu_{v_j}$ and $\sigma_{v_j}$ are the parameters to be learned for the distribution of the item initial latent representation; $\epsilon_u$ and $\epsilon_v$ are standard normal variables used for random sampling;
S22, define K invertible autoregressive flows, input the user initial latent representation $z^0_{u_i}$ and the item initial latent representation $z^0_{v_j}$ obtained in the previous step into the K autoregressive flows for invertible transformation, and learn the final latent representations $z^K_{u_i}$ and $z^K_{v_j}$ of users and items, making their distributions closer to the true latent-layer data distribution. Since the learning processes of the user and item final latent representations are identical, the subscripts $u_i$ and $v_j$ are omitted and the uniform symbols $z_0$ and $z_K$ denote the initial and final latent representations. The distributions of $z_K$ are learned as

$$p(z_K) = \prod_{i=1}^{d} p\big(z^i_K \mid z^{1:i-1}_K\big),$$

where $p(z^i_K \mid z^{1:i-1}_K)$ is a one-dimensional probability density conditioned on the previous i−1 dimensions of $z_K$, and d is the dimension of $z_K$;

the process of finding the final latent representations of users and items comprises the following sub-steps:
S221, obtaining the target distribution $p(z_k)$: to obtain the final latent representation, the d dimensions of $z_k$ (the subscript k denoting passage through the k-th flow) are computed first, the i-th dimension of $z_k$ being conditioned on the first i−1 dimensions of $z_{k-1}$; given that $z_{k-1}$ is distributed as $p(z_{k-1})$, the d dimensions of $z_k$ are computed as

$$z^i_k = \mu_i + \sigma_i \cdot z^i_{k-1}, \qquad \mu_i = M\big(z^{1:i-1}_{k-1}\big), \quad \sigma_i = S\big(z^{1:i-1}_{k-1}\big),$$

thereby obtaining the distribution $p(z_k)$ of $z_k$. In the formula, M(·) and S(·) are the neural networks computing the mean and the variance, respectively. A target distribution $p(z_k)$ is thus obtained through S22; since the process of generating the target distribution is based on $p(z_{k-1})$, it parallelizes very well on the GPU. The process mainly uses the idea of the Inverse Autoregressive Flow (IAF), a special neural network that can output all values of the mean and standard deviation at the same time, facilitating the sampling of $\mu_{1:i-1}$ and $\sigma_{1:i-1}$;
S222, obtaining the target distribution $q(z_k)$: for the invertible process of density estimation of $p(z_{k-1})$, the invention uses the idea of the Masked Autoregressive Flow (MAF) to estimate the target distribution,

$$z^i_{k-1} = \frac{z^i_k - M'\big(z^{1:i-1}_k\big)}{S'\big(z^{1:i-1}_k\big)},$$

where M'(·) and S'(·) are special neural networks that use masked autoregressive flow to compute the mean and variance, respectively, and act as the invertible computational process in the model;
S223, through steps S221 and S222, two versions $p(z_k)$ and $q(z_k)$ of the same target distribution are obtained under different conditioning, but they carry different biases from modeling the autoregressive flow; to stabilize the training process with the two output distributions, the KL divergence between them is computed,

$$\mathrm{KL}\big(p(z_k)\,\|\,q(z_k)\big) = H\big(p(z_k), q(z_k)\big) - H\big(p(z_k)\big),$$

where the cross entropy of the two target distributions $p(z_k)$ and $q(z_k)$ is computed as

$$H\big(p(z_k), q(z_k)\big) = -\mathbb{E}_{p(z_k)}\big[\log q(z_k)\big];$$
S224, repeat steps S221 to S223 K times to obtain the final latent representations $z^K_{u_i}$ and $z^K_{v_j}$ of the user and the item;

S23, combine the final latent representations $z^K_{u_i}$ and $z^K_{v_j}$ of the user and the item obtained in S22 with the collaborative information $u_c$ and $v_c$ of users and items to obtain the final potential representations $u_i'$ and $v_j'$ of the user and the item; the two vectors $u_c$ and $v_c$ are continuously optimized along with the model and finally represent the collaborative information of users and items well;
s24, repeating the steps S21 to S23 until the final potential representation of all the users and the projects is obtained, and connecting the final potential representation of all the users and the projects to obtain final potential representation matrixes U 'and V' of the users and the projects;
S25, construct two loss functions $L_1$ and $L_2$, add them to generate the final total loss L, minimize the total loss function, and finish the training of the model to obtain the project recommendation model; $L_1$ is the cross-entropy loss function between predicted items and labeled items, and $L_2$ is the reconstruction loss function that makes the learned latent-variable distribution closer to the real distribution. The method comprises the following steps:
S251, obtaining the first loss $L_1$: input the vector product of the final potential representation matrices U' and V' of users and items obtained in step S24 into a classifier consisting of a multilayer perceptron, which outputs the probability of each item being accessed by each user, i.e., the scoring matrix R' of all users over all items; the cross-entropy loss between the item label matrix R obtained in step S1 and the predicted scoring matrix R' is

$$L_1 = -\sum_{i=1}^{M} \sum_{j=1}^{N} \Big[ r_{ij} \log r'_{ij} + \big(1 - r_{ij}\big) \log\big(1 - r'_{ij}\big) \Big],$$

where $r'_{ij}$ is the probability of user $u_i$ accessing item $v_j$, $L_1$ is taken as the first loss, M is the total number of users, and N is the total number of items;
S252, obtaining the second loss $L_2$: input the final latent representations $z_u$ and $z_v$ of the user and the item obtained in S23 into a reversible decoder, construct the joint distribution of the user and item latent variables, and take the relative entropy with respect to the joint distribution of the input user and item data (for convenience of expression, the subscripts i and j of users and items are dropped in the second-loss formula) to obtain the second loss $L_2$:

$$L_2 = \mathbb{E}_{q_\phi(z_u \mid u)}\big[\log q_\phi(z_u \mid u) - \log p_\theta(u, z_u)\big] + \mathbb{E}_{q_\phi(z_v \mid v)}\big[\log q_\phi(z_v \mid v) - \log p_\theta(v, z_v)\big].$$

The user posterior distribution $q(z_u \mid u)$, defined by the encoder and the K autoregressive flows, approximates the true distribution $p(u, z_u)$ of the latent variable $z_u$; similarly, the item posterior distribution $q(z_v \mid v)$ approximates the true distribution $p(v, z_v)$ of the latent variable $z_v$. Here $p(u, z_u)$ and $p(v, z_v)$ represent the true distributions of the user and item input data, $z_u$ and $z_v$ are the latent variables in the collaborative autoregressive flow model, u and v are the model input data, and θ and φ are the parameters of the probability distributions. Expanding the formula, the terms $-\mathbb{E}_{q}[\log p_\theta(u \mid z_u)]$ and $-\mathbb{E}_{q}[\log p_\theta(v \mid z_v)]$ represent the reconstruction loss, two terms are constants, and the remaining four terms represent the autoregressive flows. Optimizing the second loss $L_2$ in fact completes the minimization of the KL divergence of S223; for the specific derivation see the paper: [van den Oord, A., Li, Y., Babuschkin, I., Simonyan, K., Vinyals, O., Kavukcuoglu, K., van den Driessche, G., et al., "Parallel WaveNet: Fast High-Fidelity Speech Synthesis"];
S253, add the first loss and the second loss together to generate the final total loss $L = L_1 + L_2$, minimize the total loss function, and finish the training of the model to obtain the project recommendation model.
4. The method for recommending Top-n project based on collaborative autoregressive flow according to claim 1, wherein said step S3 comprises the following sub-steps:
S31, obtain the user scoring matrix R' from the model trained in S2 and set the scores at the index positions of items the user has accessed in the training set to 0; for example, if user $u_i$ accessed item $v_j$ in the training set, the corresponding element $r'_{ij}$ of the scoring matrix R' is set to 0; then sort each row of the new scoring matrix by score to obtain the matrix R'';
S32, according to the scoring matrix R'' obtained in S31, select the top-n items by score in R'' as the result finally recommended to the user.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201911079406.2A CN110781401A (en) | 2019-11-07 | 2019-11-07 | Top-n project recommendation method based on collaborative autoregressive flow |
Publications (1)
Publication Number | Publication Date |
---|---|
CN110781401A true CN110781401A (en) | 2020-02-11 |
Family
ID=69389888
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201911079406.2A Pending CN110781401A (en) | 2019-11-07 | 2019-11-07 | Top-n project recommendation method based on collaborative autoregressive flow |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110781401A (en) |
Patent Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20040225509A1 (en) * | 2003-05-07 | 2004-11-11 | Olivier Andre | Use of financial transaction network(s) information to generate personalized recommendations |
US20190130281A1 (en) * | 2017-10-31 | 2019-05-02 | Microsoft Technology Licensing, Llc | Next career move prediction with contextual long short-term memory networks |
CN108320187A (en) * | 2018-02-02 | 2018-07-24 | 合肥工业大学 | A kind of recommendation method based on depth social networks |
CN110232480A (en) * | 2019-03-01 | 2019-09-13 | 电子科技大学 | The item recommendation method and model training method realized using the regularization stream of variation |
CN110162709A (en) * | 2019-05-24 | 2019-08-23 | 中森云链(成都)科技有限责任公司 | A kind of personalized arrangement method of the robust of combination antithesis confrontation generation network |
CN109979429A (en) * | 2019-05-29 | 2019-07-05 | 南京硅基智能科技有限公司 | A kind of method and system of TTS |
Non-Patent Citations (8)
Title |
---|
F. ZHOU et al.: "Recommendation via Collaborative Autoregressive Flows", Neural Networks *
FAN ZHOU et al.: "Variational Session-based Recommendation Using Normalizing Flows", The World Wide Web Conference, May *
CHANG Biao: "Research on Information Flow Pattern Analysis and Popularity Prediction for Online Media", China Doctoral Dissertations Full-text Database, Information Science and Technology *
LI Zongyang: "Research and Implementation of User Behavior Process Prediction Methods Based on Deep Learning", China Master's Theses Full-text Database, Information Science and Technology *
LI Peng: "Variational Autoencoders Based on Gaussian Mixture Models", China Master's Theses Full-text Database, Information Science and Technology *
HU Jie et al.: "Service Behavior and Service Capacity in Wireless Networks: Concepts, Models and Development", Journal of China Academy of Electronics and Information Technology *
MO Yuhua: "Research on Recommender Systems Based on Flows and Generative Networks", China Master's Theses Full-text Database, Information Science and Technology *
CHEN Hui et al.: "Autoregressive-Prediction Multi-stage Vector Quantization Coding of Line Spectral Frequencies", Journal of Xi'an University of Science and Technology *
Cited By (10)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111310048A (en) * | 2020-02-25 | 2020-06-19 | 西安电子科技大学 | News recommendation method based on multilayer perceptron |
CN111310048B (en) * | 2020-02-25 | 2023-06-20 | 西安电子科技大学 | News recommending method based on multilayer perceptron |
CN111552881A (en) * | 2020-05-09 | 2020-08-18 | 苏州市职业大学 | Sequence recommendation method based on hierarchical variation attention |
CN111552881B (en) * | 2020-05-09 | 2024-01-30 | 苏州市职业大学 | Sequence recommendation method based on hierarchical variation attention |
CN111708937A (en) * | 2020-05-27 | 2020-09-25 | 西安理工大学 | Cross-domain recommendation method based on label migration |
CN111708937B (en) * | 2020-05-27 | 2022-12-16 | 北京阅视无限科技有限公司 | Cross-domain recommendation method based on label migration |
CN112085158A (en) * | 2020-07-21 | 2020-12-15 | 西安工程大学 | Book recommendation method based on stack noise reduction self-encoder |
CN112435751A (en) * | 2020-11-10 | 2021-03-02 | 中国船舶重工集团公司第七一六研究所 | Peritoneal dialysis mode auxiliary recommendation system based on variation inference and deep learning |
CN114065039A (en) * | 2021-11-17 | 2022-02-18 | 重庆邮电大学 | Mean value pooling operation-based self-encoder recommendation method and system |
CN114373537A (en) * | 2021-12-06 | 2022-04-19 | 云南联合视觉科技有限公司 | Diagnosis and treatment scheme recommendation method and device |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN110781401A (en) | Top-n project recommendation method based on collaborative autoregressive flow | |
Wu et al. | Session-based recommendation with graph neural networks | |
CN109299396B (en) | Convolutional neural network collaborative filtering recommendation method and system fusing attention model | |
CN111797321B (en) | Personalized knowledge recommendation method and system for different scenes | |
CN112529168B (en) | GCN-based attribute multilayer network representation learning method | |
CN111127146B (en) | Information recommendation method and system based on convolutional neural network and noise reduction self-encoder | |
CN110955826B (en) | Recommendation system based on improved cyclic neural network unit | |
CN108563755A (en) | A kind of personalized recommendation system and method based on bidirectional circulating neural network | |
WO2019118644A1 (en) | Systems and methods for collaborative filtering with variational autoencoders | |
CN112364976A (en) | User preference prediction method based on session recommendation system | |
CN111274398A (en) | Method and system for analyzing comment emotion of aspect-level user product | |
CN112328900A (en) | Deep learning recommendation method integrating scoring matrix and comment text | |
CN111859166A (en) | Article scoring prediction method based on improved graph convolution neural network | |
CN106157156A (en) | A kind of cooperation recommending system based on communities of users | |
Deodhar et al. | A framework for simultaneous co-clustering and learning from complex data | |
CN112699310A (en) | Cold start cross-domain hybrid recommendation method and system based on deep neural network | |
Alfarhood et al. | DeepHCF: a deep learning based hybrid collaborative filtering approach for recommendation systems | |
Jiang et al. | An intelligent recommendation approach for online advertising based on hybrid deep neural network and parallel computing | |
CN117252665B (en) | Service recommendation method and device, electronic equipment and storage medium | |
Liu et al. | TCD-CF: Triple cross-domain collaborative filtering recommendation | |
CN112381225A (en) | Recommendation system retraining method for optimizing future performance | |
CN112818256A (en) | Recommendation method based on neural collaborative filtering | |
CN111930926A (en) | Personalized recommendation algorithm combined with comment text mining | |
CN116932862A (en) | Cold start object recommendation method, cold start object recommendation device, computer equipment and storage medium | |
CN110659962B (en) | Commodity information output method and related device |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |
WD01 | Invention patent application deemed withdrawn after publication | Application publication date: 20200211 |